Tuesday, February 27, 2018

Welcome To Oracle Cloud

I do know it has been quite some time since I published a blog here.  Sorry about that.  Changes in life and work do come along.  I now must mention I work for a large company, and that the opinions expressed here are my own.  I have been working with cloud software from various vendors for some time now, and quite recently, Oracle has begun to make their presence known in this space. 

Just recently, I was asked to troubleshoot a client installation patching issue for their Oracle Cloud database.  Cloud supported databases in the Oracle world are referred to as "DBAAS" or "database as a service".  However, like any other database hosting installation, from time to time, patching and maintenance may be required.  From a special console website, the patching menu is presented with a drop down menu, where first instances and then eligible patches are listed.  It can be as simple as point and click to pre-check (verify) and then install the applicable patch. 

However, what if something goes wrong?  In this particular case, the point and click methodology didn't work!  Using conventional methods is not generally advised, as the "dbaas" implementation of Oracle's software is somewhat customized (enhanced?) both to assure consistency as well as monitoring and metering of the service for support and to charge based upon usage. 

In one particular case, a customer who had worked-around the standard patching methodology lost support, and in another, was charged an incorrect fee (much higher) due to this. 

I cannot go into specific detail at this time, due to the confidential nature of the customer's business and Oracle's requirements as well.  Eventually, we worked with Oracle support and received a custom rpm (unix source code) fix to make patching for dbaas functional in this specific case.  There were some things we noted that once again should convince readers to tread gingerly before doing any customizations or non-standard configuration activities on a cloud system:

1)  Root is a very powerful privilege.  You can invoke it, but at your own peril.  Oracle gives a default account of "opc" (standing for Oracle Public Cloud) from which you can sudo su to oracle or root.  In this case, some permissions for grid executables were set incorrectly, quite possibly by someone with root privilege for some other valid reason.

2) Patching outside of the interface is dangerous - While more mature administrators such as myself may prefer command line because of the extra options and ability to configure and control output and logging, you are probably better off to try using standard tooling first.  This was pointed out by Oracle, who, even after fixing the patching tools and running manually, had us execute via the console as well!

3) Think twice and think again - before any task, or before adding/removing any software/logs/configurations or making any changes to files like /etc/fstab /etc/hosts or /etc/oratab. 

We've already had instances where valid and necessary changes to these files caused the instrumentation to fail.  While I readily agree, as Oracle Cloud software matures, such small items should not do so, changing these items can cause problems which are very very difficult to solve!

This is sort of a rambling post.  But I am glad to get back up on the horse again.  I hope to write again soon.  Have a good one!  And happy patching!!!

Wednesday, June 8, 2016

Enabling X11 On Exadata and Oracle Enterprise Linux 6

A strange thing happened with the latest distributions of Oracle Enterprise Linux, causing an unforeseen issue, especially on engineered Oracle servers.  With an upgrade from version 5 to version 6 of the linux base kernel, for security reasons, X11 (or Xwindows) is no longer enabled and pre-installed, nor is it in the same location as prior versions.  

While these features are not required to run or maintain Oracle software, it is quite common for them to be used for product maintenance and installation, as most people have become quite dependent upon graphical user interfaces for new releases, patching, and installs.  So, this presents a dilemma.  Often, Oracle itself, a third party, or a systems engineer are tasks to install new linux versions.  They have a checklist (often supplied by Oracle) to assure all needed options, features and packages are included in the new build, and will receive verification that they are complete and functional.  It can be days/weeks/months before the next installation or maintenance is attempted, often in very tight windows.  If such activities are dependent upon X11 features, they will fail.  This happened to a client quite recently.  

The solution is to identify this issue, and pre-install the needed packages.  Then, any account which needs them need to be configured and hardened to use them safely (this is different than the prior configuration).  For one thing, executables are no longer in /usr/bin, nor /usr/X11/bin.  Oracle attempts to address support requests raised on their "My Oracle Support (MOS)" website with the following document:

Unable to run graphical tools (gui) (runInstaller, dbca and dbua) on Exadata - (Doc ID 1969308.1)

However, this document is not applicable to only Exadata, and is maddeningly incomplete in most circumstances.  Two solutions are offered, with the first being recommended, but quite unwieldy and somewhat convoluted.  It suggests basically downloading the entire linux package library to your machine, and then running a compare and merge for incomplete or missing components.  This is great, but often undesirable on a production or heavily used system, and has numerous possibly negative consequences.  

The second solution seemed simple and reasonable.  The only downside is that future versions COULD possibly cause it to be re-applied.  If you've been paying attention since the start of this discussion, that is quite unlikely, since linux and Oracle have decided to remove this feature and packages going forward.  I will therefore submit that solution 2 is the most practical.  

However, there are pitfalls, and downright incorrect/incomplete information in the instructions.  

First off the following packages are listed and supplied as a zip in the document:

  • xorg-x11-utils
  • xorg-x11-xauth
  • libXv
  • libXxf86dga
  • libXxf86misc
However, these have prerequisites of their own, and are incomplete.  The following packages are additionally required (in most cases, as in after a linux upgrade to support OEL for Oracle 12c):


On some systems, these prerequisites would be flagged and listed as requirements in detail.  Not so on a recent exadata installation.  This required considerable time to research.  Once the complete list of packages are added, then the instructions for implementation from the Oracle documentation can be added.  

There are other problems, as well.  If the "oracle" user, typically invoked to install products, is still pointing to the old set of executables for X, it won't find the new ones.  This will have to be addressed.  If this is a completely new build, sshd will need to be configured, stopped and then started to support X functionality.  Finally, this configuration will have to be done for every node in a cluster, and verified.  These things can take time, both in preparation and implementation.  It might be wise to first implement in other environments, and work out these issues, including command and control, politically, as this was also an issue for some clients.  

I am intentionally leaving some detail out of this document, under the assumption that the audience:

a) knows what X windows is.
b) knows how to use it to maintain oracle products.
c) have a certain familiarity with linux and services such as sshd.
d) can understand and work with linux package installation and configuration.

If any of the above is not the case, I suggest opening a support request with Oracle, as their document is insufficient in it's current form.  

Saturday, May 16, 2015


A troubling parallel processing default value keeps constantly cropping up in my various client RAC installations.  Called "parallel_force_local", the default value is false.  It sounds innocuous enough, until it causes some issues and it's daunting implications are soon realized.  Here is a description of the parameter and it's values from the Oracle 11gR2 documentation:


Parameter typeBoolean
Default valuefalse
Range of valuestrue | false
PARALLEL_FORCE_LOCAL controls parallel execution in an Oracle RAC environment. By default, the parallel server processes selected to execute a SQL statement can operate on any or all Oracle RAC nodes in the cluster. By setting PARALLEL_FORCE_LOCAL to true, the parallel server processes are restricted so that they can only operate on the same Oracle RAC node where the query coordinator resides (the node on which the SQL statement was executed on).

As stated above, a default value of "false" allows a SQL statement which "qualifies"* for parallel processing to unrestricted processing across the entire cluster.  The theory is that a SQL statement can be executed faster with cooperative processing across many nodes, each with multiple CPU and large amounts of memory.  Just like two or three people can complete a large jigsaw puzzle faster than one, more resources should get a task done faster.  However, considering the overhead involved in managing the vast resources of a RAC cluster, and maintaining status of such complex processing as a SQL statement, finding the right "blend" of resources to deliver consistently faster results can be a challenge.  Add in the fact that there are quite a few parallel processing parameters, the fact that they can be different for each node, and that they may not always be appropriate for a large query on any one of the nodes, you have to wonder if attempting such an activity by default is truly appropriate.  

I could go into the details of how the cache resource manager and block processing resources for parallel processing across nodes are not always efficient, but needless to say, it's been my experience, with overwhelming proof via demonstration, that only in the rarest cases will a query complete faster by being parallelized across nodes.  This is even true, in most cases, on an Exadata machine.  

Needless to say, since in most cases, this parameter will not achieve desired results, and can often result in deliterious performance of an entire cluster for a particularly complete query, I would always set this value to TRUE, rather than the default!

* SQL statements which "qualify" for parallel processing will either have a 'hint' of parallel processing, or an object with a parallel degree setting of 2 or higher.  It would be better to set parallel_force_local=false at the session level in those rare cases where it can be proven that a given query or user process (such as an ETL load) can benefit from this setting.

Your mileage may vary.  Objects in mirror are closer than they appear.  Always test for every case and condition.

Have a good day!

Tuesday, November 25, 2014

A Quick Tour Of The Oracle Database SYSAUX Tablespace

The Oracle database SYSAUX tablespace was introduced in Oracle 10g.  It was made necessary by the constant issues surrounding operational object increases in number, size and purpose.  A document presenting it's use and purpose was first presented by Oracle in support documents nearly 10 years from the date of this writing:

SYSAUX New Mandatory Tablespace in Oracle 10g and higher (Doc ID 243246.1)

Since this introduction, more than a little water has run under the bridge, and the size and purpose of this tablespace have subtly evolved.  While the original intent may have been to maintain objects in support of auxiliary products, it's become a requirement that the tablespace and many of it's original contents be maintained.   At the same time, some objects originally specified for SYSAUX (such as those related to auditting) can optionally be stored elsewhere.  These facts can sometimes make it difficult to know what exactly SHOULD be in SYSAUX, and whether or not they can safely be added or removed.  

There are many support documents related to these issues, and with good reason.  I am writing this because I've never seen the topic covered comprehensively.   Here are some of the better Oracle support documents about SYSAUX, and required reading before even starting to consider management or maintenance of the beast:

How to Reduce SYSAUX Tablespace Occupancy Due to Fragmented TABLEs and INDEXes (Doc ID 1563921.1)

Suggestions if Your SYSAUX Tablespace Grows Rapidly or Too Large (Doc ID 1292724.1)

Troubleshooting Issues with SYSAUX Usage by the Automatic Workload Repository (AWR) (Doc ID 1399365.1)

Fixing a Corrupted SYSAUX Tablespace (Doc ID 950128.1)

How to Relocate the SYSAUX Tablespace (Doc ID 301186.1)

High Storage Consumption for LOBs in SYSAUX Tablespace (Doc ID 396502.1)

These are a few of what may be hundreds of articles available to those licensed for support.  However, what if you are either not licensed, do not have access, or wish to research beyond what is presented by support?  Since there's not much out there, perhaps that's why you're reading this.  I will attempt to present the topic at a high enough level to hopefully get you pointed in the right direction.  

I believe SYSAUX was originally intended as a "pressure relief valve" for the rapid expansion of the SYSTEM tablespace as Oracle databases grew larger, and supporting software to manage it grew as well.  Like any other such vaguely-purposed mechanism, doubts grew as when, why and how it should be used.  In a couple of product cases (OLAP, XDB), those product's objects were moved INTO sysaux from a dedicated tablespace in prior releases.  While SYSAUX may have achieved the loosely stated purpose of controlling the growth of SYSTEM (and consolidation of other tablespaces), how was the growth of SYSAUX to be controlled?  It got to the point where Oracle created a view (sys.v$sysaux_occupants)  to display all the objects and products within the SYSAUX ecosystem:

(1b) SYSAUX occupants space usage (v$sysaux_occupants)
| Occupant Name        Schema Name               Space Usage
| -------------------- -------------------- ----------------
| SM/AWR               SYS                      735,289.3 MB
| SM/OTHER             SYS                        9,464.2 MB
| SM/OPTSTAT           SYS                        1,415.3 MB
| SM/ADVISOR           SYS                          733.6 MB
| SQL_MANAGEMENT_BASE  SYS                          461.3 MB
| EM                   SYSMAN                       385.9 MB
| XDB                  XDB                          109.5 MB
| LOGMNR               SYSTEM                       106.9 MB
| SDO                  MDSYS                         87.2 MB
| AO                   SYS                           48.4 MB
| XSOQHIST             SYS                           48.4 MB
| XSAMD                OLAPSYS                       32.1 MB
| ORDIM/ORDDATA        ORDDATA                       22.1 MB
| WM                   WMSYS                         14.4 MB
| EXPRESSION_FILTER    EXFSYS                         9.8 MB
| JOB_SCHEDULER        SYS                            7.4 MB
| EM_MONITORING_USER   DBSNMP                         6.9 MB
| PL/SCOPE             SYS                            5.6 MB
| SMON_SCN_TIME        SYS                            3.3 MB
| LOGSTDBY             SYSTEM                         2.8 MB
| STREAMS              SYS                            2.3 MB
| ORDIM                ORDSYS                         1.1 MB
| AUTO_TASK            SYS                            0.6 MB
| AUDIT_TABLES         SYS                            0.0 MB
| ORDIM/ORDPLUGINS     ORDPLUGINS                     0.0 MB
| STATSPACK            PERFSTAT                       0.0 MB
| TEXT                 CTXSYS                         0.0 MB
| TSM                  TSMSYS                         0.0 MB
| ULTRASEARCH          WKSYS                          0.0 MB
| ULTRASEARCH_DEMO_USE WK_TEST                        0.0 MB

| Others (Unaccounted space)                         28.1 MB

The view V$SYSAUX_OCCUPANTS has a column that will indicate IF the associated product can safely be moved to a different tablespace, and the Oracle-supplied package for doing so.  Keep in mind that some products MUST be in the SYSAUX tablespace, even if they can be moved operationally, in order for certain patches and upgrades to be successful.  This depends upon release and patch level, but should be covered in the readme for the particular patch or upgrade.  Also note that although Oracle will not support moving objects owned by user SYS, these packages do so and are authorized and supported by Oracle.  There are a great many service requests and errors reported from end users moving these objects on their own.  

Over time, several products evolved into or were created with objects in SYSAUX. One very good example is AWR.  Depending upon how large your tuning repository is defined, this can place space pressure on SYSAUX (see above!), and, according to Oracle, it's objects MUST remain there for compatibility and proper operation! Another famous occupant (perhaps by accident?!?) of SYSAUX is the AUD$ table. Originally, the table SYS.AUD$ was housed in the SYSTEM tablespace, but beginning with Oracle 11g, it became located in SYSAUX at database creation, but remained in SYSTEM during an upgrade (depending upon versions).  Later, Oracle devised a package called DBMS_AUDIT_MGMT to relocate this object, only after several support papers mistakenly directed customers to move the object by hand with occasionally disastrous results:

ORA-600[4412] After Moving The AUD$ Table To Another Tablespace (Doc ID 1059958.1)

There are great documents and blogs for properly moving AUD$ to it's own tablespace, and it's highly recommended beginning with Oracle 11gr2 and above.  In fact, any Oracle implementations that have it in SYSTEM or SYSAUX should move it immediately.  The best example is here:

SCRIPT: Basic example to manage AUD$ table in 11.2 with dbms_audit_mgmt (Doc ID 1362997.1)

I will post the code from the document for those who don't have access.  I have used this many times without issue:

-- Example re-locating AUD$ and setting up a purge job

set serveroutput on

prompt First Step: init cleanup (if not already)

dbms_output.put_line('Calling DBMS_AUDIT_MGMT.INIT_CLEANUP');
audit_trail_type => dbms_audit_mgmt.AUDIT_TRAIL_AUD_STD,
default_cleanup_interval => 24*7);
dbms_output.put_line('Cleanup for STD was already initialized');
end if;

prompt Relocate AUD$ to a dedicated tablespace AUDIT_DATA

audit_trail_type => dbms_audit_mgmt.AUDIT_TRAIL_AUD_STD,
audit_trail_location_value => 'AUDIT_DATA') ;

prompt set last archive timestamp to a week before now

last_archive_time => sysdate - 7);

prompt setup a purge job

(AUDIT_TRAIL_PURGE_NAME => 'Standard_Audit_Trail_PJ');
-- exception
-- when others then
-- null;

AUDIT_TRAIL_PURGE_NAME => 'Standard_Audit_Trail_PJ',

prompt call DBMS_AUDIT_MGMT.SET_LAST_ARCHIVE_TIMESTAMP regularly to advance
prompt the last archive ts, Audit Vault will do this for you automatically

prompt Optionally Schedule automatic advancement of the archive timestamp

create or replace procedure set_archive_retention
(retention in number default 7) as
last_archive_time => sysdate - retention);

-- exception
-- when others then
-- null;

   DBMS_SCHEDULER.create_job (
   job_name => 'advance_archive_timestamp',
   job_type => 'STORED_PROCEDURE',
   job_action => 'SET_ARCHIVE_RETENTION',
   number_of_arguments => 1,
   start_date => SYSDATE,
   repeat_interval => 'freq=daily' ,
   enabled => false,
   auto_drop => FALSE);
    (job_name =>'advance_archive_timestamp',
     argument_position =>1,
     -- one week, you can customize this line:
     argument_value => 7);

    DBMS_SCHEDULER.run_job (job_name => 'advance_archive_timestamp',
    use_current_session => FALSE);

-- end example


SQL> @audit_example
First Step: init cleanup (if not already)
Cleanup for STD was already initialized

PL/SQL procedure successfully completed.

Relocate AUD$ to a dedicated tablespace AUDIT_DATA

PL/SQL procedure successfully completed.

set last archive timestamp to a week before now

PL/SQL procedure successfully completed.

setup a purge job

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

the last archive ts, Audit Vault will do this for you
Optionally Schedule automatic advancement of the archive timestamp

Procedure created.

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.

PL/SQL procedure successfully completed.


Wednesday, July 23, 2014

How to Create an Oracle Database In Two Words: "CREATE DATABASE"

About five years ago, I wrote my first blog on this site, and titled it "DBA Minimalism".  Based upon the fact that I can count the hits on all my digits with room to spare, it was not well received.  Still, I thought it an interesting concept - how can I create an Oracle database in a minimum amount of time, in very few steps?

Since that time, a lot has changed in the Oracle infrastructure.  We now have ASM prevalent as a file system, managed systems are a big deal, OEM now has database administrative features including the ability to create a database, and Oracle's own tool for DB creation (dbca) has very much matured.

Still, there is an elegance to doing a complex task as simply as possible.  With many of these standard tools, you get many unnecessary add-on "features" in the form of schemas and applications (for example, scott/tiger, XDB and Oracle Spatial) that you may never use, need nor want.

While my original method will still work, a client recently tried it with an ASM file system and ran into a seeming roadblock.  Apparently, you cannot create a database in a single pass without an spfile on the ASM disks.  What this means is that you must first use my skeletal pfile and execute command "create spfile from pfile" while the database is in an unmounted state.  Then you can shutdown, startup (nomount) and issue the create database command with parameters for things like control files and undo tablespaces on the ASM diskgroups.

I don't really think Oracle planned this out very well.  You should not have had to make allowances for a different file system back end in order for a standard syntax like "create database" to work.  On the other hand, perhaps at the time, ASM was considered a nonstandard luxury option, and an add-on that required a little forward thinking.

I am still unsure about the whole move to ASM only databases that Oracle seems to be gravitating towards.   I suppose there's a benefit, certainly with engineered systems.  I do think we'll see the day when databases can be created, out of the box, with a default ASM setup.  This is not in 12c yet, as far as I can tell.  Now, 12c has a new feature (pluggable databases) which may make of creating new databases less and less of an issue.  With this feature, you can have a "multi-tenant" database topology where several sets of data housed in different DB instances in the past may now share overhead.

Time will tell how this feature works out.  I am doing research on it and may publish my findings in the future.

Friday, February 14, 2014

Oracle RAC Install Troubleshooting

Over the past year and a half, and especially in the past three months, I have been in complete RAC installation and upgrade mode.  Inevitably, when such projects come along, problems requiring detailed and exhaustively tedious troubleshooting will come up.  It's been my experience that the largest set of books, or even the internet, can contain the exact sequence of problems you may encounter.  It's not that they aren't there, just not in the condition and set of circumstances that meet your situation.

Just recently, a colleague asked for assistance doing the simplest step in the upgrade process from RAC to, the software installation.  The following message was received before the guts of the installation could begin:

"[INS-40418] The installer has detected that Oracle Clusterware is not running on local node.  oracle recommends that you upgrade Oracle Clusterware in a rolling manner."

Now, needless to say, the above was untrue.  We verified, using:

crsctl stat res -t -init

And also:

ps -ef | grep ora (which shows oracle owned processes at the linux level )
ps -ef | grep grid (which shows grid owned processes in linux)

So the clusterware and associated processes WERE running on the local node at the time the installation was attempted.  We understand this is necessary so the upgrade process can collect information for the configuration of the new version.  We had just done several installs, had a 100+ page detailed procedure which we developed to assist and assure things were done consistently, and yet this condition had never occurred!

Luckily, this is a testing platform where a delay at least does not impact customers or revenue.  Still, at times when even experienced professionals are exasperated and clueless, Oracle support must be contacted.  This assures that the latest information on the software is available, and verifies we're using appropriate methods of installation and configuration.  In order to get the issue resolved expediently, we entered an SR (Service Request) on the Oracle Support website (support.oracle.com) at level 2, business impact, but workarounds are available.  The support analyst was prompt, but asked for an awful lot of information about the environment in a short period of time.  Luckily, Oracle now has diagnostic tools such as rdalite and diagcollection.pl to collect the data quickly.  An additional piece of luck  -   this was only a two-node RAC. We have installations at this site with up to 26 nodes!  But still, deadlines are deadlines, and critical test by the QA team was dependent upon this environment.

After two days of inquisition by an Oracle analyst, including a live webcast and deep-dive into the entire RAC infrastructure, the problem was still unsolved, until, my colleague Jayanth was skimming through the original installation and configuration to collect more data for Oracle, and stumbled across the actual problem.  The host file will not be published to protect the client's configuration security.

You had to really look for it, and neither the DBA in charge, this writer, the Oracle analyst or the various diagnostic scripts that Oracle recommends caught it, but there are TWO ENTRIES for ip xxx.xx.xx.xx in the hosts file.  Even though the Oracle message from their installation program says nothing about it, that is the TRUE REASON why the installation could not precede.

All if this makes me wonder if diagnostic messages are ever of any use at all!  They are often wrong, or simple misdirection, causing you to take even longer to do a relatively simple task.   The good news is, after this issue was finally discovered, the upgrade and patching was done in only a few hours, which is a very short period of time in RACworld!

I've gone on a lot longer about this particular issue than I meant.   I hope it someday helps someone else.   My next post will involve more detail on how to engage support, and what to do while you're working with them, and what your backup plan should be.  Good day!

Monday, September 23, 2013

Dropping an Uncooperative Oracle Tablespace

We recently had a client database that was under some space pressure.  After several steps to reorganize data and segregate applications by tablespace for ease of administration and management, a couple of "apperently" empty tablespaces remained.  The next step in the process was to drop these tablespaces and datafiles to recover space.  This is usually a relatively trivial matter, and can often be done expediently.  

However, every now and then, unusual problems can arise.  In this one case, a problematic tablespace, with no segments listed in the DBA_SEGMENTS view or the "Tablespaces" tab of Oracle Enterprise Manager, failed to drop, as follows:

SQL> drop tablespace team_index01 including contents and datafiles;
drop tablespace team_index01 including contents and datafiles
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-02429: cannot drop index used for enforcement of unique/primary key

SQL> select owner, segment_name
  2  from dba_extents
  3  where tablespace_name = 'TEAM_INDEX01';

no rows selected

SQL> !oerr ora 2429
02429, 00000, "cannot drop index used for enforcement of unique/primary key"
// *Cause: user attempted to drop an index that is being used as the
//         enforcement mechanism for unique or primary key.
// *Action: drop the constraint instead of the index.

The above led me to conclude that there is still some logical, though likely not physical, connection to this tablespace and a current database object, so:

SQL> select constraint_name, index_owner, index_name
  2  from dba_constraints
  3  where index_name in
  4  (select index_name from dba_indexes
  5    where tablespace_name = 'TEAM_INDEX01');

------------------------------ ------------------------------

The following verifies the object above is designated to be on the TEAM_INDEX01 tablespace:

SQL> set long 4000
SQL> select dbms_metadata.get_ddl('INDEX','PK_NOTE_ED_PAYLOAD','MRA') from dual;



The following command corrects this (luckily it's a rather small index):

SQL> alter index  "MRA"."PK_NOTE_ED_PAYLOAD" rebuild tablespace mra_index01 online;

Index altered.

Then the reverification confirms:

SQL> select dbms_metadata.get_ddl('INDEX','PK_NOTE_ED_PAYLOAD','MRA') from dual;



So now the DROP command works:

SQL> drop tablespace team_index01 including contents and datafiles;

Tablespace dropped.

I am a bit mystified as to how this might have happened.  I am inclined to believe this object was recreated in this tablespace after having been moved earlier.  I'd be interested in hearing any logical explanations from a reader so inclined to respond.