Wednesday, November 3, 2021

Oracle Restore Points - Guaranteed/Clean/Normal - Which One To Use?

 A 'restore point' is a designated point of recovery for an Oracle database.  It can be a timestamp, or an SCN (system change number).

One might want to create a restore point to identify a significant point where data was in a certain condition.  Typically, this is before an upgrade, or application changes, or patching.  

Some other reasons to create one might be before a migration, or perhaps an OS change of some kind.  

While a database can be recovered to a point in time should any of the above activities cause a problem, it can be difficult to identify the correct moment before the trouble started after the fact.  

Using restore points, most often when database activity is paused, or ideally, when all connections to the database are stopped and suspended, assures a quicker valid recovery.  Usually much faster than a standard RMAN recovery!

There are three main types of recovery in the Oracle documentation.  Even an experienced DBA can be unsure of which one to use, since they've sprouted up at different periods and database versions.  These are, for a better word "standard" recovery points (the word standard is implied, and not included in the syntax), guaranteed recovery points. and "clean" recovery points.  Each one represents a different concept and presents different capabilities to aide in a quick recovery to a designated point in time.  Using each one for their specific purpose is the ideal way to assure a quick recovery, including the ability to repeat it, as necessary.  

GUARANTEED RESTORE POINTS:

Any of these options must be used with care. Any and all manipulations involving database recovery can have side-effects. One big issue is the use of redo.  Redo logs will be necessary to perform any of these options.  If the length of time between the creation of a restore point is extended, these logs will remain in the FRA (fast recovery area), and could cause space pressure.  The database could even stop until space is made available in such a case. THIS IS WHY GUARANTEED RESTORE POINTS SHOULD BE DROPPED within seven days of creation!  They should be used for a planned period of time.  They do not expire on their own.  

"NORMAL" RESTORE POINTS:

A restore point with out the guaranteed option will automatically expire once the backup retention period is expired.  "Normal" is not an Oracle syntax, but is a restore point without the "guarantee flashback database" is what is referred to here.  

CLEAN RESTORE POINTS:

The Oracle documentation defines a "clean restore point" as a point in time when the database is in closed or mounted mode.  This assures that no transactions are running before or during the creation of this restore point.  The limits on this type of restore point, as well as a normal one, is that they will automatically expire after the period defined by the spfile parameter DB_FLASHBACK_RETENTION_TARGET.

Complete documentation of this complex topic, including the implications of using it in a multitenant environment, is available using this link.

Friday, September 17, 2021

Renaming Oracle Pluggable Databases in Oracle 19c (Or: How seemingly trivial tasks can burn an entire day!)

Today's little thought provoking exercise is - what if you need to change the name of an existing PDB (pluggable database) in Oracle?  

The reason this is worthy of this write-up is as follows: back in the old days (I am thinking version 12c), this was a fairly complex operation.  It involved "unplugging" and "replugging" and exporting and importing keys and Geesh!!!  A colleague was about to start this operation when I caught on and notified him there is a much better way!

If you're not on at least 19c, get there as soon as you can!!!

So we had a little problem - we are pumping out PDBs for projects and suddenly realized we would soon run out of letters and numbers...not to mention that while it's not always good to add intelligence to a technical name, in the case of a database, one minor keystroke due to an unrecognizable sequence could be disastrous.  So we came up with a little standard.  First 3 letters are a mnemonic for the application, second 3 were environment purpose.  In our case these are:

1 - GLD for Gold environment.  Used as the golden image to build out the others.   No code changes.

2 - DEV for a development environment.  Where change rules, and all the fun happens (haha)!

3 - TST is a place for testing changes which pass muter in DEV.

4 - QAS the Quality Assurance System.  This is where project leaders make final integration tests.

5 - PRD is Production, where real end users and complex and costly transactions involving data occur.

So, in our case, the app is EBS and the environment is GLD.  

We have 8 letter positions to work with, so the seventh is a static P, standing for pluggable, and the eighth is a number designation so we can have more than one (needed during migrations, or for cloning for new environments, or so that parallel development an take place.

So now that I've laid out the reasons for a PDB rename, let's get into the mechanics.  I found the syntax in an Oracle support manual.  I cannot publish the full statement and all options verbatim here, because it is copyrighted, but here is a link You may need a valid Oracle support contract to view it.  

Here is what we actually used:

alter pluggable database rename global_name to EBSGLDP1;

It is really that simple.  But it's not.  Please note the slightly tricky global_name syntax.  

I have this theory.  There is code in the back of all complex software that prevents someone from executing a statement for the first time to have success, even with perfect syntax, and dedicated research.  Now, I do understand that all sorts of havoc can ensue were you to rename a database on the fly, but a novice is often prone, or even led to believe,  that it can be so easy. Likely from reading a google post like this one!  Not to "throw shade" on any other posts, but I do think renaming needs to be thought through.  

Even though I am not a novice, in a development environment, with a good backup, sometimes it is useful to try simple things first and learn from them.  

So here is the actual sequence of statements which must occur to rename a PDB in an open CDB (Container database), and have network end users be able to connect to it with the new name:


 $ sqlplus / as sysdba

Connected to:

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

Version 19.11.0.0.0

SQL> alter pluggable database ebsgld1 close; -- You must close the PDB using the OLD name.  

Pluggable database altered.

SQL> alter pluggable database ebsgld1 open restricted; -- Restricted mode is required for renaming

Pluggable database altered.


SQL>  alter pluggable database rename global_name to EBSGLDP1; -- Here is the actual rename!


Pluggable database altered.


 SQL> alter pluggable database ebsgldp1 close; -- Now it must be closed because it is "restricted". 


Pluggable database altered.


SQL> alter pluggable database ebsgldp1 open read write;  -- Now end users can connect but only locally.


Pluggable database altered.


SQL> alter session set container = ebsgldp1; -- now we need to connect to the PDB


Session altered.


SQL> alter system register; -- this is the first step in getting the PDB "registered" with the Oracle Listener.


System altered.


Now we have to move over to the network side of things.  The network listener configuration needs to be updated to reflect the new name.  


First we will check to see if the listener "sees" the database with it's new name.  The output below is abbreviated and masked for confidentiality, but I will highlight what we are looking for in red:


lsnrctl stat


LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 17-SEP-2021 13:48:16


Copyright (c) 1991, 2021, Oracle.  All rights reserved.


Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))

STATUS of the LISTENER

------------------------

Alias                     LISTENER

Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production

Start Date                16-SEP-2021 16:54:45

Uptime                    0 days 20 hr. 53 min. 31 sec

Trace Level               off

Security                  ON: Local OS Authentication

SNMP                      OFF

Listener Parameter File   /u01/app/19.0.0.0/grid/network/admin/listener.ora

Listener Log File         /u01/app/grid/diag/tnslsnr/yourhostname/listener/alert/log.xml

Listening Endpoints Summary...

  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))

  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=yourhostname)(PORT=1521)))

Services Summary...

Service "+APX" has 1 instance(s).

  Instance "+APX1", status READY, has 1 handler(s) for this service...

Service "+ASM" has 1 instance(s).

  Instance "+ASM1", status READY, has 1 handler(s) for this service...

Service "+ASM_DATA" has 1 instance(s).

  Instance "+ASM1", status READY, has 1 handler(s) for this service...

Service "+ASM_RECO" has 1 instance(s).

  Instance "+ASM1", status READY, has 1 handler(s) for this service...

Service "ebsgldp1" has 1 instance(s).

  Instance "DB0720", status READY, has 1 handler(s) for this service...

The command completed successfully


Now, each end device will need this name recorded, unless it's entered in DNS.  The connect string needs to look something like this:


EBSGLDP1 =

  (DESCRIPTION =

    (ADDRESS = (PROTOCOL = TCP)(HOST = yourhostname)(PORT = 1521))

    (CONNECT_DATA =

      (SERVER = DEDICATED)

      (SERVICE_NAME = EBSGLDP1)

    )

  )


The proof is in the pudding!  A sqlplus "loopback" connection shows if we can successfully connect:


 sqlplus system/password@ebsgldp1


SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 17 13:54:33 2021

Version 19.11.0.0.0


Copyright (c) 1982, 2020, Oracle.  All rights reserved.


Last Successful login time: Fri Sep 17 2021 13:46:21 -04:00


Connected to:

Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

Version 19.11.0.0.0


SQL> show con_name

CON_NAME

------------------------------

EBSGLDP1


-- Above validates we're in the right container, provided your account has proper privileges.  

If you do not, check with security to see which command you are permitted to execute.  

In summary, even though this seems simple enough, some things went wrong even though I've done it a few times.  First, the original name was very close to the new one, so some typos were made.  Secondly, we needed to get the restricted syntax right, and found it was REQUIRED to make the name change.  Finally, we needed to get the service name correct in the connect string.  

Also, as a colleague notes in a different blog, the directory names for database files will not be impacted by the name change.  If this matters to you, that would be a separate operation.  

Good luck to you.  I hope this helps!


Monday, July 26, 2021

Oracle OCI Port Troubleshooting

 Oracle Cloud Infrastructure, or OCI, can be a daunting environment for the most experienced technical professional.  While it is exceedingly quick to build out an entire infrastructure, and administrative tasks that used to be measure in months can now be done in days, finding our where or why something is not working can still have the twists and turns of a detective novel:


  

My subject for today is troubleshooting connectivity issues, and especially to an Oracle database.  Oracle typically runs on port 1521 for transactional connections.  In most on-premise environments, this port was 'open' to traffic on the network specifically and intentionally a long time ago.  However, as delivered, OCI locks down all ports unless explicitly open to traffic between each device.  Doing this up front used to be the standard some years back, but with a Cloud implementation, you want to be very careful about opening ports to all traffic due to security and audit concerns.  This has been flagged as an ongoing concern due to famous incidents of hacking and ransomware exploits - as described on this website.

I recently had two projects where, although this was identified, and the network admins claimed to have opened traffic on this port between all recognized and identified "instances" (usually virtualized servers), we were getting connection errors, such as, from an Oracle SOA Suite installation: 

'Unable to connect to the database.  Check if DB connection details entered are correct'.  

No error codes, no specific details as to what went wrong.  We verified everything was correct, and could connect from a client to the database over port 1521 without issue.  This seemingly proved the port was open, but......not between this specific port and target.  After much deliberation, and assurances from network teams that the ports were open, we did some more detective work.  

There are many ways to validate port connectivity, but the tool of choice for us is called netcat.  This free tool is available for both linux and Windows, and has a simple syntax.  It can also be abbreviated as ncat or just 'nc'.   

What is nice about this utility is that it is easily installed (root permission needed in linux, admin permission for Windows), and will be available to run in mere seconds. It might be a wise standard to include it with a hardened OS build, as it only connects if allowed by security audit rules, and reports back if unsuccessful.  Here is an installation tutorial.

I will also mention, should it not be accepted as permanent, it is easily uninstalled as well.  

Simple is as simple does.  Or not!!!!  In our use cases, this tool proved to our network administration team that port 1521 was not open system-wide, and they quickly took steps to correct this.  It's probably safest to open it only for specified source and targets, due to security concerns. 

Once installed, here is the syntax to validate a port as open, and the results:


1-michael.fontana@10.31.2.5's password:
Last failed login: Sat Jul 24 13:05:33 EDT 2021 from 10.31.2.84 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Wed Jul 21 16:38:01 2021 from 10.31.2.84
0(1-michael.fontana@testserver1 [~])$ sudo su -
Last login: Fri Jul 23 15:33:07 EDT 2021 on pts/0
## - You are now root - ##
0(root@testserver1 ~)$ nc -vz testserver5 1521
Connection to 10.31.2.105 1521 port [tcp/ncube-lm] succeeded!
0(root@testserver1 ~)$ nc -vz testserver3 1521
Connection to 10.31.2.103 1521 port [tcp/ncube-lm] succeeded!
0(root@testserver1 ~)$ nc -vz testserver4 1521
Connection to 10.31.2.104 1521 port [tcp/ncube-lm] succeeded!
0(root@testserver1 ~)$ #An example of a failure:
0(root@testserver1 ~)$ nc -v faketarget 1521
Ncat: Version 6.47 ( http://nmap.org/ncat )
Ncat: Connection timed out.

The command will hang if the port is not open.  You can use options like -w (wait) to return an error code after a few seconds if you want to use netcat for scripting or other purposes.  


Tuesday, June 8, 2021

Oracle 21C New Features?

 Oracle 21c New Features    

I just finished migrating to Oracle 19c.  It was a lot of work.

Now I find out that Oracle 21c is out?  Please click on the preceding link for detailed documentation on what's new in 21c.

My first thought (and maybe yours), is: "What's 21c got that I ain't got?".  My second thought is that these major releases are coming way too fast!  

At first glance, this appears to be true.  In fact, the length of time between Oracle 11g and Oracle 12c release dates was six years!  The clear and logical explanation for this is that Oracle has moved to a yearly release cycle.  What is less obvious is that what Oracle is calling new "versions" more or less correspond to what used to be called "patchsets".  For example, Oracle 11g was originally identified as version 11.1.0.6 in 2007.  By 2013, subversion identifiers were added as patchsets leading up to the terminal release of 11.2.0.4.  In between, these patchsets came out with new features (often announced at Oracle Openworld) every September.  

All of this was documented with the "shocking!" release of Oracle 18c, and the move to a yearly release cycle.  While it was explained as a logical, well-thought out change,  I couldn't help but recall someone explaining that Larry Ellison was an admitted triskaidekaphobic with a fear of the number 13.  Keep in mind that the newly modified release designations also identify the significance of the version.  For example, Oracle 20c is called an innovation release.  So is 21c.  So no compelling reason to upgrade to them.   If some feature(s) are of interest, you can download and evaluate them.  I cannot identify a single one that is a must have.  For more technical details on what significant changes are in 21c, I invite you to review Oracle VP William Hardie's excellent blog post on the subject.  

While there is no specific technical detail in this post, I hope you will find it enjoyable from a historical point of view.  I invite you to click on the imbedded links to get the technical scoop.  This blog was written to help myself (and others) find detailed information about Oracle releases corresponding to versions, as I am often asked to support or upgrade an older version.  We might need to know if a feature is/was available for a given release.  

This is my first blog post in a while.  I hope to do many more in the near future!

Wednesday, June 17, 2020

Oracle Enterprise Manager (OEM) Patching

Be forewarned - patching OEM has changed!

I assume you are reading this because it has become necessary to patch your installation of the product now known as "Cloud Control".  Past iterations of this product were called:

Oracle Enterprise Manager (OEM)
Oracle Monitoring System (OMS)
Oracle Grid Control (GC)

In fact, many of the commands in the software stack for this product still have these legacy names, and many of the documents associated with installation, administration, maintenance and use of OEM still use these names/acronyms. 

If you are about to patch, keep in mind, very similar to standard Oracle database software, that a utility named "opatch" will be used.  It comes with the software stack (still called "oms"), and is located in the standard location: $ORACLE_HOME/Opatch (which indicates the home directory for the product + the location of the "opatch" utility.  However, there is another patching utility, specifically for the OMS, called OMSPatcher, which is used as well. 

As of this writing, the proper document to be followed for applying the April 2020 OMS "PSU" patch are at the Oracle Support site (support.oracle.com).  It is also assumed you have access to support and a valid product license.  Please reference:


Patch 31058360: enterprise manager for oms plugins 13.3.1.0.200331


If your OMSPatcher software is out of date, you will have to update it by following procedures documented for patch 19999993 OMSPatcher v13.9.3.



This patch is applied the traditional way.  
Move the patch bundle zip to $ORACLE_HOME/OMSpatch
and then unzip it into the same directory (after backing up first).  

Patching Opatch (when required) is where it gets interesting!

 THE OMS MUST NOW BE DOWN TO PATCH OPATCH!

We may no longer use the standard method.  You must now unzip the patch,
and then invoke java (assuring the correct version is in your path, or calling
the version imbedded in your Oracle home):


- If needed, install the software with custom inventory via:

    java -jar /6880880/opatch_generic.jar -silent oracle_home=$ORACLE_HOME


-invPtrLoc $YOUR_INVENTORY






where inventory pointer location is the absolute path to the oraInst.loc file 

In my next post, I will detail patching specifics, including the necessary steps 
of backing up the OMS software and running a "prechceck".  

Sunday, April 19, 2020

Unpublished Syntax for Oracle "dbaas_diag_tool.pl" Script For Diagnostic Collection

Oracle cloud presents a lot of new tooling and scripts, with enhancements and refinements occurring almost daily.  I recently had to work a support request (SR) and was asked by the Oracle support engineer to do an environment collection using a tool called dbaas-diag-tool.  This tool is actually a perl script and will be located here on Oracle provisioned exa-Cloud linux servers:

/var/opt/oracle/misc/dbaas_diag_tool.pl

There was an Exadata compute node which rebooted during a patching event.  So we opened a support request for direction, as the patch failed.  The request from support was to run the tool and collect diagnostic information from a "four hour window around the event".  However, the syntax supplied only includes syntax for days of collection, not hourly.  So I tried to collect syntax documentation from various sources, without much info.  Here is what a syntax help request returns:

# /var/opt/oracle/misc/dbaas_diag_tool.pl --help
Usage: perl /var/opt/oracle/misc/dbaas_diag_tool.pl \
            [-help|-h|-?] [--envtype ] [--dbname ] [--operation collect] [--gettfalogs ] [--getdbaaslogsonly ]  [ --diagdir
 Example:
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl -h
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --envtype DBCS
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --healthcheck
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --envtype DBCS --operation collect
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --starttime YYYY-MM-DD
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --starttime YYYY-MM-DD --endtime YYYY-MM-DD
         perl /var/opt/oracle/misc/dbaas_diag_tool.pl --getdbaaslogsonly True|False
Note: Currently tool only supports envtype and operation
Note: TFA Logs are collected only if getTFALogs is set to True. By default it is False
Note: dbname is relevant only for envtype and ignored in other environments


Note that the syntax specifies "starttime" and "endtime" but does not give the format for time in hours, minutes or seconds?  Even the request from the engineer did not include the syntax for such granularity.  Left to ponder this, after trying help, above, I referenced various support documents, without success.  
Doc ID 2219712.1 seemed promising, but only contained information for collecting "TFA logs".  Another document (Doc ID 2094679.1)  seemed promising, but also did not include the needed syntax.   I even reviewed the perl script itself, however, it makes many calls to unpublished subroutines, so I didn't get anywhere.  

However, I did notice that the output log file from the process includes the date, in the same format, with an underscore and colon between each value, so I tried that.  Alas, no luck.  However, within the code was the following one-line hint:

 my ($date_out, $date_err) = $LOG->logexe("date \+\'\%Y\-\%m\-\%d \%H\-\%M\-\%S\'");

This syntax is what will work.  Notice double-quote delimiters are required around starttime and endtime because a space must be imbedded between day and hour:

/var/opt/oracle/misc/dbaas_diag_tool.pl --starttime "2020-04-18 10:00:00" --endtime "2020-04-18 14:00:00"

So, in the end, Oracle should probably update their documentation to include this syntax.  I hope this will help if you need to run this useful utility.  

Tuesday, February 27, 2018

Welcome To Oracle Cloud

I do know it has been quite some time since I published a blog here.  Sorry about that.  Changes in life and work do come along.  I now must mention I work for a large company, and that the opinions expressed here are my own.  I have been working with cloud software from various vendors for some time now, and quite recently, Oracle has begun to make their presence known in this space. 

Just recently, I was asked to troubleshoot a client installation patching issue for their Oracle Cloud database.  Cloud supported databases in the Oracle world are referred to as "DBAAS" or "database as a service".  However, like any other database hosting installation, from time to time, patching and maintenance may be required.  From a special console website, the patching menu is presented with a drop down menu, where first instances and then eligible patches are listed.  It can be as simple as point and click to pre-check (verify) and then install the applicable patch. 

However, what if something goes wrong?  In this particular case, the point and click methodology didn't work!  Using conventional methods is not generally advised, as the "dbaas" implementation of Oracle's software is somewhat customized (enhanced?) both to assure consistency as well as monitoring and metering of the service for support and to charge based upon usage. 

In one particular case, a customer who had worked-around the standard patching methodology lost support, and in another, was charged an incorrect fee (much higher) due to this. 

I cannot go into specific detail at this time, due to the confidential nature of the customer's business and Oracle's requirements as well.  Eventually, we worked with Oracle support and received a custom rpm (unix source code) fix to make patching for dbaas functional in this specific case.  There were some things we noted that once again should convince readers to tread gingerly before doing any customizations or non-standard configuration activities on a cloud system:

1)  Root is a very powerful privilege.  You can invoke it, but at your own peril.  Oracle gives a default account of "opc" (standing for Oracle Public Cloud) from which you can sudo su to oracle or root.  In this case, some permissions for grid executables were set incorrectly, quite possibly by someone with root privilege for some other valid reason.

2) Patching outside of the interface is dangerous - While more mature administrators such as myself may prefer command line because of the extra options and ability to configure and control output and logging, you are probably better off to try using standard tooling first.  This was pointed out by Oracle, who, even after fixing the patching tools and running manually, had us execute via the console as well!

3) Think twice and think again - before any task, or before adding/removing any software/logs/configurations or making any changes to files like /etc/fstab /etc/hosts or /etc/oratab. 

We've already had instances where valid and necessary changes to these files caused the instrumentation to fail.  While I readily agree, as Oracle Cloud software matures, such small items should not do so, changing these items can cause problems which are very very difficult to solve!

This is sort of a rambling post.  But I am glad to get back up on the horse again.  I hope to write again soon.  Have a good one!  And happy patching!!!