Oracle-Ninja.com Andy Colvin's Oracle Blog

9Jun/130

Reminder – “Getting Ready for Exadata” Seminar in London

Just a quick reminder - I'll be presenting my "Getting Ready for Exadata" seminar in London on July 1. We'll cover what you need to know before you purchase an Exadata and how to manage one of these systems on a daily basis. We'll have lots of time for Q&A, and most likely time for beers afterward.

This is being put on through Oracle Education, but I guarantee that there won't be anything "salesy" that day - just the facts. Link to register is here - http://goo.gl/iF8k2

27May/134

dbnodeupdate.sh on Exadata Compute Nodes

Rene Kundersma at Oracle just published a nifty new utility named dbnodeupdate.sh that will assist with the sometimes-cumbersome process of updating the compute nodes in an Exadata environment.  Starting last year with 11.2.3.1.0, Oracle introduced yum updates for the Exadata compute nodes.  Previously, each Exadata storage server patch came with a "minimal" or "convenience" pack that included a shell script that forced a bunch of new RPMs onto the compute node.  While that worked on vanilla installations, if users started installing many of their own packages, the installation could fail, leaving admins stuck in RPM dependency hell.

Introducing yum into the picture helped significantly, but it also introduced some challenges.  First, users had to either run a manual set of steps to configure yum, or trust Oracle's bootstrap scripts, which had some issues in the beginning.  After that, you had to decide whether to update directly from Oracle's Unbreakable Linux Network (not likely), use a local yum repository, or work directly from the ISOs linked in MOS note #888828.1 (my preference).  The nice thing about dbnodeupdate.sh is that it can do many of these tasks - and more.

The support note for dbnodeupdate (#1553103.1) includes several examples of what it can do.  It can run the bootstrap phase to update you directly from version 11.2.2.4.2 up to 11.2.3.2.1 (the latest version as of now, covering more than a year of patches) in 2 reboots, running only 3 commands.  It takes advantage of the free space within the compute node volume groups to take a backup of the root volume, providing an easy rollback method similar to that of the storage servers.  Also, it will disable CRS on reboot, and relink your Oracle homes when you're done. When it's finished relinking, it will enable CRS on startup.

Anyway, on with the demos!  I first tested this by updating a compute node from the original release of 11.2.3.2.1 (11.2.3.2.1.130109) to the one-off release, which contains patches for the NFS bug and a few other things.  To update the node, I just had to download the latest version of the 11.2.3.2.1 ISO file (patch #16432033) and run the dbnodeupdate.sh script:

[root@enkx3db01 patch_11.2.3.2.1]# ./dbnodeupdate.sh -u -l /u01/stage/patches/patch_11.2.3.2.1/p16432033_112321_Linux-x86-64.zip -s
  (*) 2013-05-27 20:50:03: Collecting system configuration details...
  (*) 2013-05-27 20:50:05: Checking free space in /u01
  (*) 2013-05-27 20:50:05: Unzipping /u01/stage/patches/patch_11.2.3.2.1/p16432033_112321_Linux-x86-64.zip to /u01/app/oracle/stage.824, this may take a while
 
Active Image version   : 11.2.3.2.1.130109
Active Kernel version  : 2.6.32-400.11.1.el5uek
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : n/a
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : upgrade
Upgrading to           : 11.2.3.2.1.130302
Baseurl                : file:///var/www/html/yum/unknown/EXADATA/dbserver/11.2/latest.824/x86_64/ (iso)
Iso file               : /u01/app/oracle/stage.824/112_latest_repo_130302.iso
Create a backup        : Yes
Shutdown stack         : Yes (Currently stack is down)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 270513205002)
Diagfile               : /var/log/cellos/dbnodeupdate.270513205002.diag
Server model           : SUN FIRE X4170 M3
dbnodeupdate.sh rel.   : 1.35 (always check MOS 1553103.1 for the latest release)
Automatic checks incl. : Issue 1.8 - Hotspare not reclaimed
                       : Issue 1.10 - Cell and Database image versions 11.2.2.2.2 or lower require workaround before patching
                       : Database servers with an ofa rpm earlier than 1.5.1-4.0.28 can encounter a file system corruption
                       : Issue 1.14 - Upgrade to 11.2.3.x failed due to Sas Exp. FW not upgrd. first to 5.7.0 on X4800 and X4800 M2
                       : Issue 1.15 - Filesystem checks not disabled on database servers
                       : Issue 1.16 - Verify the vm.min_free_kbytes kernel parameter on database servers to make sure 512MB is reserved
                       : Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12
Manual checks todo     : Issue 1.11 - Database Server upgrades to 11.2.2.3.0 or higher may hit network routing issues after the upgrade
 
Note                   : After upgrading and rebooting run 'dbnodeupdate.sh -c' to finish post steps
 
 
Continue ? [Y/n]
y
  (*) 2013-05-27 20:50:25: Verifying GI and DB's are shutdown
  (*) 2013-05-27 20:50:27: GI and DB already shutdown
  (*) 2013-05-27 20:50:27: Collecting console history for diag purposes
  (*) 2013-05-27 20:51:03: Performing backup to /dev/mapper/VGExaDb-LVDbSys2
  (*) 2013-05-27 20:54:59: Backup successful
  (*) 2013-05-27 20:55:00: Verifying and updating yum.conf (backup in /etc/yum.conf.270513_205002)
  (*) 2013-05-27 20:55:00: Disabling other repositories, generating Exadata repos
  (*) 2013-05-27 20:55:00: Generating /etc/yum.repos.d/Exadata-computenode.repo
  (*) 2013-05-27 20:55:00: Verifying baseurl
  (*) 2013-05-27 20:55:01: Disabling stack from starting
  (*) 2013-05-27 20:55:02: OSWatcher stopped successful
  (*) 2013-05-27 20:55:02: Emptying the yum cache
  (*) 2013-05-27 20:55:02: Removing rpm libcxgb3-static.x86_64 (if installed)
  (*) 2013-05-27 20:55:02: Removing rpm rpm-build.x86_64 (if installed)
  (*) 2013-05-27 20:55:02: Performing yum update. Node is expected to reboot when finished
  (*) 2013-05-27 20:56:35: All above steps finished.
  (*) 2013-05-27 20:56:35: system will reboot automatically for changes to take effect
  (*) 2013-05-27 20:56:35: After reboot run "./dbnodeupdate.sh -c" to complete the upgrade
[root@enkx3db01 patch_11.2.3.2.1]#
Remote broadcast message (Mon May 27 20:56:43 2013):
 
Exadata post install steps started.
It may take up to 2 minutes.
The db node will be rebooted upon successful completion.
 
Remote broadcast message (Mon May 27 20:56:54 2013):
 
Exadata post install steps completed.
Initiate reboot in 10 seconds to apply the changes.
 
Broadcast message from root (Mon May 27 20:57:04 2013):
 
The system is going down for reboot NOW!

When the node finished rebooting, I ran dbnodeupdate.sh with the -c switch, to force it to relink all homes for RDS:

[root@enkx3db01 patch_11.2.3.2.1]# ./dbnodeupdate.sh -c
  (*) 2013-05-27 21:04:55: Collecting system configuration details...
 
Active Image version   : 11.2.3.2.1.130302
Active Kernel version  : 2.6.32-400.21.1.el5uek
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : 11.2.3.2.1.130109
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : finish-post (perform post steps, relink enable/disable crs)
Relinking for release  : 11.2.3.2.1.130302
Shutdown stack         : No (Currently stack is down)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 270513210454)
Diagfile               : /var/log/cellos/dbnodeupdate.270513210454.diag
Server model           : SUN FIRE X4170 M3
Remote mounts exist    : Yes (dbnodeupdate.sh will try unmounting)
dbnodeupdate.sh rel.   : 1.35 (always check MOS 1553103.1 for the latest release)
Automatic checks incl. : Issue 1.8 - Hotspare not reclaimed
                       : Issue 1.10 - Cell and Database image versions 11.2.2.2.2 or lower require workaround before patching
                       : Database servers with an ofa rpm earlier than 1.5.1-4.0.28 can encounter a file system corruption
                       : Issue 1.14 - Upgrade to 11.2.3.x failed due to Sas Exp. FW not upgrd. first to 5.7.0 on X4800 and X4800 M2
                       : Issue 1.15 - Filesystem checks not disabled on database servers
                       : Issue 1.16 - Verify the vm.min_free_kbytes kernel parameter on database servers to make sure 512MB is reserved
                       : Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12
Manual checks todo     : Issue 1.11 - Database Server upgrades to 11.2.2.3.0 or higher may hit network routing issues after the upgrade
 
Continue ? [Y/n]
y
  (*) 2013-05-27 21:05:10: Verifying GI and DB's are shutdown
  (*) 2013-05-27 21:05:12: Collecting console history for diag purposes
  (*) 2013-05-27 21:05:53: No rpms to remove
  (*) 2013-05-27 21:05:54: Relinking all homes
  (*) 2013-05-27 21:05:54: Unlocking /u01/app/11.2.0.3/grid
  (*) 2013-05-27 21:06:00: Relinking /u01/app/11.2.0.3/grid as oracle
  (*) 2013-05-27 21:06:12: Relinking /u01/app/oracle/product/11.2.0.3/dbhome_1 as oracle
  (*) 2013-05-27 21:06:27: Executing /u01/app/11.2.0.3/grid/crs/install/rootcrs.pl -patch
  (*) 2013-05-27 21:07:27: Stack started
  (*) 2013-05-27 21:07:27: Enabling stack to start at reboot
  (*) 2013-05-27 21:07:28: Filesystem max mount count is not configured according to best practices. Correcting setting now.
  (*) 2013-05-27 21:07:28: Filesystem check interval is not configured according to best practices. Correcting setting now.
  (*) 2013-05-27 21:07:28: Kernel parameter vm.min_free_kbytes is not set to the recommended minimum value. Correcting setting now
  (*) 2013-05-27 21:07:29: All above steps finished.

Next, I tried it out with something a little more difficult - updating from 11.2.2.4.2 to 11.2.3.2.1 in one shot.  Previously, I'd had problems with this.  Running dbnodeupdate.sh would be helpful, in that it would give me a backup automatically, in case there were any issues.  I simply downloaded the 11.2.3.2.1 ISO file and the dbupdate-helper scripts (attached in the dbnodeupdate.sh MOS note).  First, run dbnodeupdate.sh with the -u (update) -l (ISO location) -p (bootstrap phase) and -x (helper scripts) options, along with -s (shut down CRS on the node).

[root@dm03db04 yum]# ./dbnodeupdate.sh -u -l /u01/stage/patches/patch_11.2.3.2.1.130109/yum/p16432033_112321_Linux-x86-64.zip -p 1 -x /u01/stage/patches/patch_11.2.3.2.1.130109/yum/dbupdate-helpers.zip -s
  (*) 2013-05-27 10:44:21: Collecting system configuration details...
  (*) 2013-05-27 10:44:22: Unzipping helpers in /u01/stage/patches/patch_11.2.3.2.1.130109/yum/dbupdate-helpers.zip to /opt/oracle.SupportTools, this may take a while
  (*) 2013-05-27 10:44:23: Checking free space in /u01
  (*) 2013-05-27 10:44:23: Unzipping /u01/stage/patches/patch_11.2.3.2.1.130109/yum/p16432033_112321_Linux-x86-64.zip to /u01/app/oracle/stage.13449, this may take a while
 
Active Image version   : 11.2.2.4.2.111221
Active Kernel version  : 2.6.18-238.12.2.0.2.el5
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : n/a
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : onetime  phase: 1
Upgrading to           : 11.2.3.2.1.130302
Baseurl                : file:///var/www/html/yum/unknown/EXADATA/dbserver/11.2/latest.13449/x86_64/ (iso)
Iso file               : /u01/app/oracle/stage.13449/112_latest_repo_130302.iso
Create a backup        : Yes
Shutdown stack         : Yes (Currently stack is up)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 270513104421)
Diagfile               : /var/log/cellos/dbnodeupdate.270513104421.diag
Server model           : SUN FIRE X4170 M2 SERVER
dbnodeupdate.sh rel.   : 1.35 (always check MOS 1553103.1 for the latest release)
 
Note                   : After completing this step continue with phase 2 of the one-time setup by running the following command:
                       : ./dbnodeupdate.sh -u -p 2
 
Continue ? [Y/n]
  (*) 2013-05-27 10:45:04: Verifying GI and DB's are shutdown
  (*) 2013-05-27 10:45:04: Shutting down GI and db
  (*) 2013-05-27 10:47:03: Collecting console history for diag purposes
  (*) 2013-05-27 10:47:24: Performing backup to /dev/mapper/VGExaDb-LVDbSys2
  (*) 2013-05-27 10:55:12: Backup successful
  (*) 2013-05-27 10:55:13: Disabling stack from starting
  (*) 2013-05-27 10:55:13: OSWatcher stopped successful
  (*) 2013-05-27 10:55:29: EM Agent (in /u01/app/oracle/product/agent12c/core/12.1.0.1.0) stopped successfully
  (*) 2013-05-27 10:55:29: Executing bootstrap.sh. Node is expected to reboot when finished
 
Remote broadcast message (Mon May 27 10:57:04 2013):
 
Exadata post install steps started.
It may take up to 2 minutes.
The db node will be rebooted upon successful completion.
 
Remote broadcast message (Mon May 27 10:57:25 2013):
 
Exadata post install steps completed.
Initiate reboot in 10 seconds to apply the changes.
 
Broadcast message from root (Mon May 27 10:57:35 2013):
 
The system is going down for reboot NOW!

Let's walk through this - first, the script checked for free space in /u01, looked to see if CRS was up and running, then spit out some information for me to confirm.  Because it's the first time that the script was run, there is no inactive LVM image version.  Also, it reminds me that the node will reboot, and gives me the command that I should run once the node comes back up (./dbnodeupdate.sh -u -p 2).  After I've confirmed that I want to apply the patch, the script shuts down CRS, takes a backup of the root (/) logical volume to /dev/mapper/VGExaDb-LVDbSys2, disables CRS and OSWatcher, and runs the bootstrap process.  The bootstrap process in this case installs the exadata-sun-computenode RPM, along with the latest kernel release.  Following this, the node reboots.
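If you want to see where the time goes in a run like this, the timestamped step lines are easy to pick apart. The following is only a rough sketch in Python, not anything that ships with dbnodeupdate.sh: the function names are made up for illustration, and it assumes each step lasts until the next message is printed, which is an approximation.

```python
import re
from datetime import datetime

# Step lines copied from the phase 1 output above.
LOG = """\
  (*) 2013-05-27 10:45:04: Shutting down GI and db
  (*) 2013-05-27 10:47:03: Collecting console history for diag purposes
  (*) 2013-05-27 10:47:24: Performing backup to /dev/mapper/VGExaDb-LVDbSys2
  (*) 2013-05-27 10:55:12: Backup successful
  (*) 2013-05-27 10:55:13: Disabling stack from starting
"""

# Matches lines such as "  (*) 2013-05-27 10:47:24: Performing backup ..."
STEP = re.compile(r"^\s*\(\*\) (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")


def parse_steps(text):
    """Return a list of (timestamp, message) tuples for every step line."""
    steps = []
    for line in text.splitlines():
        match = STEP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            steps.append((stamp, match.group(2)))
    return steps


def step_durations(steps):
    """Pair each message with the seconds elapsed until the next message.

    Assumption: a step runs until the next line is logged, so the last
    step has no duration and is left out.
    """
    return [
        (message, int((next_stamp - stamp).total_seconds()))
        for (stamp, message), (next_stamp, _) in zip(steps, steps[1:])
    ]


if __name__ == "__main__":
    for message, seconds in step_durations(parse_steps(LOG)):
        print(f"{seconds:4d}s  {message}")
```

On this node, almost all of the phase 1 time before the bootstrap is the root volume backup, with the GI and database shutdown a distant second.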

When the node comes back up, it's time to run the "phase 2" bootstrap:

[root@dm03db04 yum]# ./dbnodeupdate.sh -u -p 2
  (*) 2013-05-27 11:03:46: Collecting system configuration details...
 
Active Image version   : 11.2.3.2.1.130302
Active Kernel version  : 2.6.32-400.21.1.el5uek
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : 11.2.2.4.2.111221
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : onetime  phase: 2
Upgrading to           : 11.2.3.2.1.130302
Baseurl                : file:///var/www/html/yum/unknown/EXADATA/dbserver/11.2/latest.13449/x86_64/ (iso)
Iso file               : /u01/app/oracle/stage.13449/112_latest_repo_130302.iso
Create a backup        : No
Shutdown stack         : No (Currently stack is down)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 270513110345)
Diagfile               : /var/log/cellos/dbnodeupdate.270513110345.diag
Server model           : SUN FIRE X4170 M2 SERVER
dbnodeupdate.sh rel.   : 1.35 (always check MOS 1553103.1 for the latest release)
 
Note                   : After completing this step and after the systems reboots run 'dbnodeupdate.sh -c' to finish post steps
 
 
Continue ? [Y/n]
y
  (*) 2013-05-27 11:04:23: Verifying GI and DB's are shutdown
  (*) 2013-05-27 11:04:25: Collecting console history for diag purposes
  (*) 2013-05-27 11:04:26: OSWatcher stopped successful
  (*) 2013-05-27 11:04:30: Executing bootstrap2.sh. Node is expected to reboot when finished
 
Broadcast message from root (pts/0) (Mon May 27 11:07:26 2013):
 
The system is going down for reboot NOW!
  (*) 2013-05-27 11:07:26: All above steps finished.
  (*) 2013-05-27 11:07:26: After reboot run "./dbnodeupdate.sh -c" to complete the onetime

The node will reboot one more time. When it comes up, the final phase must be completed - relinking the homes and enabling CRS.

[root@dm03db04 yum]# ./dbnodeupdate.sh -c
  (*) 2013-05-27 11:39:31: Collecting system configuration details...
 
Active Image version   : 11.2.3.2.1.130302
Active Kernel version  : 2.6.32-400.21.1.el5uek
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : 11.2.2.4.2.111221
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : finish-post (perform post steps, relink enable/disable crs)
Relinking for release  : 11.2.3.2.1.130302
Shutdown stack         : No (Currently stack is down)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 270513113930)
Diagfile               : /var/log/cellos/dbnodeupdate.270513113930.diag
Server model           : SUN FIRE X4170 M2 SERVER
dbnodeupdate.sh rel.   : 1.35 (always check MOS 1553103.1 for the latest release)
Automatic checks incl. : Issue 1.8 - Hotspare not reclaimed
                       : Issue 1.10 - Cell and Database image versions 11.2.2.2.2 or lower require workaround before patching
                       : Database servers with an ofa rpm earlier than 1.5.1-4.0.28 can encounter a file system corruption
                       : Issue 1.14 - Upgrade to 11.2.3.x failed due to Sas Exp. FW not upgrd. first to 5.7.0 on X4800 and X4800 M2
                       : Issue 1.15 - Filesystem checks not disabled on database servers
                       : Issue 1.16 - Verify the vm.min_free_kbytes kernel parameter on database servers to make sure 512MB is reserved
                       : Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12
Manual checks todo     : Issue 1.11 - Database Server upgrades to 11.2.2.3.0 or higher may hit network routing issues after the upgrade
 
Continue ? [Y/n]
y
  (*) 2013-05-27 11:40:15: Verifying GI and DB's are shutdown
  (*) 2013-05-27 11:40:16: Collecting console history for diag purposes
  (*) 2013-05-27 11:40:34: No rpms to remove
  (*) 2013-05-27 11:40:57: EM Agent (in /u01/app/oracle/product/agent12c/core/12.1.0.1.0) stopped successfully
  (*) 2013-05-27 11:40:58: Relinking all homes
  (*) 2013-05-27 11:40:58: Unlocking /u01/app/11.2.0.3/grid
  (*) 2013-05-27 11:41:04: Relinking /u01/app/11.2.0.3/grid as oracle
  (*) 2013-05-27 11:42:36: Relinking /u01/app/oracle/product/11.2.0.2/dbhome_1 as oracle
  (*) 2013-05-27 11:44:09: Relinking /u01/app/oracle/product/11.2.0.3/dbhome_1 as oracle
  (*) 2013-05-27 11:45:48: Executing /u01/app/11.2.0.3/grid/crs/install/rootcrs.pl -patch
  (*) 2013-05-27 11:47:48: Sleeping another 60 seconds while stack is starting (1/3)
  (*) 2013-05-27 11:47:48: Stack started
  (*) 2013-05-27 11:47:48: Enabling stack to start at reboot
  (*) 2013-05-27 11:48:33: EM Agent (in /u01/app/oracle/product/agent12c/core/12.1.0.1.0) started successfully
  (*) 2013-05-27 11:48:35: Filesystem max mount count is not configured according to best practices. Correcting setting now.
  (*) 2013-05-27 11:48:35: Filesystem check interval is not configured according to best practices. Correcting setting now.
  (*) 2013-05-27 11:48:35: Kernel parameter vm.min_free_kbytes is not set to the recommended minimum value. Correcting setting now
  (*) 2013-05-27 11:48:43: Cleaned up iso
  (*) 2013-05-27 11:48:43: All above steps finished.
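To recap the one-shot upgrade, here are the three invocations from this walkthrough assembled in one place. This is a minimal Python sketch that only builds and prints the command lines; it does not run anything. The helper names (dbnodeupdate_cmd, one_shot_plan) are invented for this example, and the only switches used are the ones shown in this post (-u, -l, -p, -x, -s, -c). Check MOS note 1553103.1 for the options supported by your release of the script.

```python
import shlex


def dbnodeupdate_cmd(*, update=False, iso_zip=None, phase=None,
                     helpers_zip=None, shutdown_stack=False, complete=False):
    """Build a dbnodeupdate.sh command line from the switches used in this post."""
    args = ["./dbnodeupdate.sh"]
    if update:
        args.append("-u")                 # update
    if iso_zip:
        args += ["-l", iso_zip]           # ISO (zip) location
    if phase is not None:
        args += ["-p", str(phase)]        # bootstrap phase
    if helpers_zip:
        args += ["-x", helpers_zip]       # helper scripts
    if shutdown_stack:
        args.append("-s")                 # shut down CRS on the node
    if complete:
        args.append("-c")                 # post steps after the reboot
    return " ".join(shlex.quote(arg) for arg in args)


def one_shot_plan(iso_zip, helpers_zip):
    """Commands for 11.2.2.4.2 -> 11.2.3.2.1; the node reboots after the first two."""
    return [
        dbnodeupdate_cmd(update=True, iso_zip=iso_zip, phase=1,
                         helpers_zip=helpers_zip, shutdown_stack=True),
        dbnodeupdate_cmd(update=True, phase=2),
        dbnodeupdate_cmd(complete=True),
    ]


if __name__ == "__main__":
    # Paths as used on dm03db04 above; substitute your own staging area.
    stage = "/u01/stage/patches/patch_11.2.3.2.1.130109/yum"
    plan = one_shot_plan(f"{stage}/p16432033_112321_Linux-x86-64.zip",
                         f"{stage}/dbupdate-helpers.zip")
    for number, command in enumerate(plan, start=1):
        print(f"step {number}: {command}")
```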

Hopefully in the next week, I'll be able to play around with the rollback functionality, and will report back on that.

23May/135

Adding Windows DNS Records Via Command-Line

This is something that is almost completely off-topic, but something I've found myself doing quite a bit at Enkitec.  You see, we keep adding new hardware (Big Data Appliance in August, new Exadata X3-2 earlier this year), and that means that we need to add a bunch of DNS entries at once.  Even an eighth rack of Exadata needs 54 DNS entries, when you add up the forward and reverse records.

If you're like us, you have a Windows-based DNS server (whether you like it or not).  The DNS configuration interface in Windows Server 2008 is pretty nice, but it can be a pain to repeatedly click to add all of the entries needed.  What is really convenient is to be able to write out the commands and paste them into a command line window.  Since I use this blog as a place to keep things that I'll use again, I wanted to document this here.  Someday, you may find this useful, too.

First, you'll need to add your A records - these are the normal DNS entries (forward lookups).

dnscmd . /RecordAdd {domain} {hostname} {record type} {IP address}

Next, you'll need to add the PTR records - these are reverse lookups (translate an IP into a hostname).

dnscmd . /RecordAdd {reverse domain name} {last octet of IP} {record type} {fully qualified hostname}

Here's what this looks like when we put it into action:

C:\Users\acolvin>dnscmd . /RecordAdd enkitec.com  enkx3sw-pdua A 192.168.8.245
 
Add A Record for enkx3sw-pdua.enkitec.com at enkitec.com
Command completed successfully.
 
C:\Users\Administrator>dnscmd . /recordadd 8.168.192.in-addr.arpa. 245 PTR enkx3sw-pdua.enkitec.com
 
Add PTR Record for 245.8.168.192.in-addr.arpa. at 8.168.192.in-addr.arpa.
Command completed successfully.

I know that real men use Linux (and manlier men use BSD) for DNS, but unfortunately, we were unable to resist the pull of Active Directory (ugh).
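Since the whole point is to paste a batch of commands, the two dnscmd forms above can be generated from a host list. Here is a small Python sketch under a few assumptions: the reverse zone is a /24 named after the first three octets (as in the example above), the function name dnscmd_pair is made up, and the second host in the list is a hypothetical entry added purely for illustration.

```python
import ipaddress


def dnscmd_pair(zone, hostname, ip):
    """Return the (A record, PTR record) dnscmd commands for one host.

    Assumes a /24 reverse zone, e.g. 8.168.192.in-addr.arpa. for 192.168.8.x.
    """
    addr = ipaddress.IPv4Address(ip)      # raises ValueError on a bad address
    octets = str(addr).split(".")
    reverse_zone = ".".join(reversed(octets[:3])) + ".in-addr.arpa."
    forward = f"dnscmd . /RecordAdd {zone} {hostname} A {addr}"
    reverse = (f"dnscmd . /RecordAdd {reverse_zone} {octets[3]} "
               f"PTR {hostname}.{zone}")
    return forward, reverse


if __name__ == "__main__":
    hosts = [
        ("enkx3sw-pdua", "192.168.8.245"),   # from the example above
        ("enkx3sw-pdub", "192.168.8.246"),   # hypothetical second host
    ]
    for hostname, ip in hosts:
        for command in dnscmd_pair("enkitec.com", hostname, ip):
            print(command)
```

Each host produces two commands, one forward and one reverse, which is how the record count for a new rack adds up so quickly.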

7Mar/139

Exadata 11.2.3.2.1 NFS Issues – Ksplice Support for Exadata?

When the 11.2.3.2.1 release of the Exadata Storage Server software came out, I was a little excited.  There were numerous one-off patches for the previous release, 11.2.3.2.0, which was the first version to support the Exadata X3 and writeback flashcache, run UEK on the X#-2 systems, etc.  With that many large changes introduced in one version, we were likely to see some bugs in the .0 release.  Fortunately, Oracle was quick to fix many of those issues, but it resulted in several separate patches to update the cellsrv software.

I was working with a colleague last week, and we were ready to apply this patch to a customer's Exadata system.  Everything went off without a hitch - upgrading from 11.2.2.4.2 straight to 11.2.3.2.1.  We even applied the patch to the customer's quarter rack in rolling mode, which took under 6 hours to complete.  After everything was back up and running, we took an archive log backup using RMAN.  For this customer, we back everything up to NFS because it won't fit within the FRA, and they don't want to leave backups inside the production system.  We were greeted with a strange error when we tried to kick off the backup job in RMAN:

RMAN> run {
2>   ALLOCATE CHANNEL DISK1 DEVICE TYPE DISK;
3>   BACKUP DATABASE FORMAT '/mnt/nfs/actest_%U';
4>   RELEASE CHANNEL DISK1;
5> }
 
using target database control file instead of recovery catalog
allocated channel: DISK1
channel DISK1: SID=397 instance=ACTEST1 device type=DISK
 
Starting backup at 13-02-28 21:38
channel DISK1: starting full datafile backup set
channel DISK1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/actest/datafile/tanel_bigfile.325.808412931
input datafile file number=00006 name=+DATA/actest/datafile/ts_data.380.779860027
input datafile file number=00001 name=+DATA/actest/datafile/system.367.779029515
input datafile file number=00002 name=+DATA/actest/datafile/sysaux.368.779029555
input datafile file number=00003 name=+DATA/actest/datafile/undotbs1.369.779029595
input datafile file number=00004 name=+DATA/actest/datafile/undotbs2.371.779029649
input datafile file number=00005 name=+DATA/actest/datafile/users.372.779029687
channel DISK1: starting piece 1 at 13-02-28 21:38
released channel: DISK1
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on DISK1 channel at 02/28/2013 21:38:37
ORA-19504: failed to create file "/mnt/nfs/actest_1jo34pas_1_1"
ORA-27044: unable to write the header block of file
Linux-x86_64 Error: 12: Cannot allocate memory
Additional information: 3

It didn't matter what we were trying to back up, just that it was going to NFS.  This backup job had worked fine prior to the patch (we took a backup immediately preceding the maintenance window), but we had applied both a database bundle patch (this database was 11.2.0.2) and the latest storage server patch (11.2.3.2.1), which updates the Linux OS to OEL 5.8, as well as introduces the Oracle Unbreakable Enterprise Kernel into the mix.

We checked the mount options to make sure that everything was ok, and saw that it was:

[enkdb01:oracle:ACTEST1] /u01/app/oracle/product/11.2.0.3/dbhome_2/rdbms/lib 
> mount | grep "/mnt/nfs"
192.168.12.22:/export/nfs on /mnt/nfs type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,nfsvers=3,timeo=600,actimeo=0,addr=192.168.12.22)
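When comparing mount options across nodes, or before and after a patch, it can help to break the mount output into fields instead of eyeballing it. The snippet below is a minimal Python sketch that parses the line shown above; the function name is invented for this example, and it makes no claim about which options are correct for your environment.

```python
import re

# The mount line from the output above.
MOUNT_LINE = ("192.168.12.22:/export/nfs on /mnt/nfs type nfs "
              "(rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,nfsvers=3,"
              "timeo=600,actimeo=0,addr=192.168.12.22)")

MOUNT_RE = re.compile(
    r"^(?P<device>\S+) on (?P<mountpoint>\S+) "
    r"type (?P<fstype>\S+) \((?P<options>.*)\)$"
)


def parse_mount_line(line):
    """Split one line of `mount` output into device, mountpoint, type, options."""
    match = MOUNT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised mount line: {line!r}")
    info = match.groupdict()
    options = {}
    for option in info["options"].split(","):
        key, sep, value = option.partition("=")
        # Flags such as "hard" carry no value; store them as True.
        options[key] = value if sep else True
    info["options"] = options
    return info


if __name__ == "__main__":
    info = parse_mount_line(MOUNT_LINE)
    print(info["device"], "->", info["mountpoint"], f"({info['fstype']})")
    for key in sorted(info["options"]):
        print(f"  {key} = {info['options'][key]}")
```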

After poking around a bit, we opened a service request, which was answered pretty quickly by Oracle support.  It turns out that there is a known bug with the NFS driver included in the version of the UEK packaged with 11.2.3.2.1. Oracle provided 3 possible fixes, which I'll detail below. The fixes were:

1Feb/130

2013 Presentation Schedule

Well, it's been a really busy past few months, and I hate to admit it, but I've been neglecting this space more than anything. Despite having many different posts in the works, nothing is quite finished yet. I do have a little time to mention a few of my upcoming speaking events, though.

First, I'll be in Denver for the Rocky Mountain Oracle Users Group Training Days 2013, February 11-13.  Enkitec has more than a few sessions during the conference, ranging from Exadata to Big Data to APEX.  Check out the agenda for a full list.  Also, we'll have an Enkitec booth there, which is the place that I'll be hanging out when I'm not attending the numerous interesting sessions.  I have 2 presentations during the week:

I'm really looking forward to the RMAN session, as I'll finally be able to talk about some of the new stuff that's coming up in 12c.

I'm also going to be presenting a couple of Expert Seminars for Oracle University in the coming months.  I'll be presenting a "Getting Ready for Oracle Exadata" seminar over 2 half days in March (March 19-20) and May (May 28-29).  I'll be talking about topics like how to functionally manage your Exadata environment (backups, monitoring, patching, etc).  The sessions will be entirely online, and will have plenty of time for Q&A to ask any nagging questions.

Finally, I would like to mention that the call for papers is open for Enkitec's Engineered Systems conference, E4.  Once again, it will be held at the Four Seasons Las Colinas in August.  I know it's not the best time to come to the Dallas area, but they have great air conditioning, and it's nearly impossible to get this kind of access to some of the best Exadata professionals in the world.  The list of speakers from last year was incredible, and I'm sure that this year will be just as good.  Also, the attendees were great to talk to, since many of them were there to share their experiences running Exadata.  If you have something you'd like to share, please submit your abstract!  We'll be covering more than just Exadata, with plans for several talks about Big Data, and possibly even an Exalogic or Exalytics session or two.

1Oct/120

Oracle Announces Exadata X3-2 and X3-8

Well, it's finally public, so we're able to openly talk about the new Exadata X3 systems.  Looking back on my pre-OpenWorld predictions, I was pretty close on a few things.  I was correct on the database servers, which will have Xeon E5-2690 CPUs (8 core, 2.9GHz) with 128GB RAM upgradeable to 256GB.  It looks like we won't get active/active Infiniband for a while, since the cards in there are staying the same.  On the X3-8, the compute nodes are staying the same, for reasons detailed by Kevin Closson a few weeks ago.  I also previously blogged about the X3-2 eighth rack.  I think this will become one of the more popular options for customers, based on the quarter racks that we're seeing purchased.  I'm definitely interested to get my hands on one and see how half of the components have been disabled.  It's very cool that Oracle was able to still give the redundancy of a true Exadata in a smaller footprint.

One of the bigger improvements on the X3 series comes down at the storage level.  I was a little bit off on the CPUs, which will be E5-2630L (6 core, 2.0GHz) with an upgrade from 24GB to 64GB of RAM.  The biggest differences on the storage servers will come via the F40 flash cards, which increase storage 4x (400GB per card), meaning that you'll get 1.6TB of flash per cell.  Also, the version of the Exadata storage server software shipping with the X3 systems will be 11.2.3.2.0, which contains the famous "flash for all writes" cache.  Disk drives will stay the same (600GB or 3TB).

The new storage server software (11.2.3.2.0) should be released to the public some time this week, and it will include the flash write cache for previous systems.  I'm very interested to see what the performance of this feature will look like on the older X2 and V2 systems, where the flash cards are a little bit slower at writes than the new F40 cards.  It is worth noting that the write cache feature will be something that users can enable or disable, so if the performance is not what's expected, it can be disabled.  Rest assured that once the patch is released, it'll find its way onto one of Enkitec's Exadata shortly thereafter.

Also, this new storage server software release will introduce Oracle's Unbreakable Enterprise Kernel to the 2-socket Exadata crowd.  The UEK has been available for the X2-8 systems since their release, but Oracle had yet to run it on X2-2 systems.  This will change with the release of 11.2.3.2.0.  It is worth noting that it is still possible to go back to the Red Hat compatible kernel if there is adverse performance on the UEK.

That's it for now, and as new things come up during the week, I'll try to post on here.

7Sep/124

Exadata X3-2 1/8th Rack

There have been a couple of posts lately about expectations of an Exadata X3-2 and X3-8 release at Oracle Open World 2012.  In my previous post, I mentioned the possible release of an X3-2 1/8th rack configuration.  I had guessed that this would be similar to the old V2 basic system that would include one compute node, one storage server, and one infiniband switch - all placed in your own rack.  It sounds like I was a little bit off from this original idea.

Oracle has stopped taking orders on X2-2 and X2-8 hardware, and we have had a handful of our customers let us know about emails that they have received from Oracle reps announcing an Exadata X3-2 1/8th rack for sale.  This configuration will work as "capacity on demand" (insert salesy buzz words).  The plan for the Exadata X3-2 1/8th rack is to contain all of the hardware that exists within a 1/4 rack configuration (2 compute nodes, 3 storage servers, 2 infiniband switches), but to disable half of the CPU cores, half of the flash cards, and half of the hard disks via software controls.

Here's what I would expect this to look like:

  • Compute Nodes
    • 8 CPU cores (16 threads)
    • 128GB RAM
  • Storage Servers
    • 6 or 8 CPU cores (12 or 16 threads)
    • 2 PCIe flash cards
    • 6 X 600GB SAS or 3TB SAS hard disks
  • 2 Infiniband Switches

This would leave you with either roughly 10.8TB or 54TB of raw disk space depending on whether high performance or high capacity drives were chosen.  The CPU cores and other hardware components would be disabled using software, probably similar to how unlicensed CPU cores in an ODA are disabled.  This would mean that the 1/8th rack configuration would still contain RAC (including RAC licenses), multiple storage servers (only half of the Exadata storage server licenses), and lots of flash cache.  The process of upgrading a 1/8 rack to a 1/4 rack system would simply be a matter of enabling the extra hardware, most likely through a license key.  Based on the increase in CPU/memory/flash that I'm expecting to see from the X2 --> X3 release, I would expect to see quite a few customers looking at Exadata as an option for many hardware refresh upgrades.  It will be really nice to actually test the improvements from the flash write cache that should be announced at Open World as well.
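The raw disk figures fall straight out of the expected configuration listed above: 3 storage servers, each with 6 enabled disks. As a quick check, here is the arithmetic as a Python sketch. The counts are my expectations from this post rather than published specifications, and the function name is made up.

```python
STORAGE_SERVERS = 3          # same cell count as a 1/4 rack
ENABLED_DISKS_PER_CELL = 6   # half of the disks in each cell


def raw_disk_gb(disk_size_gb, cells=STORAGE_SERVERS,
                disks_per_cell=ENABLED_DISKS_PER_CELL):
    """Raw (pre-ASM-redundancy) disk capacity in GB for the enabled disks."""
    return cells * disks_per_cell * disk_size_gb


if __name__ == "__main__":
    for label, size_gb in (("high performance (600GB)", 600),
                           ("high capacity (3TB)", 3000)):
        total_gb = raw_disk_gb(size_gb)
        print(f"{label}: {total_gb / 1000:.1f}TB raw")
```

Keep in mind that this is raw space; usable space after ASM redundancy will be considerably lower.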

31Aug/123

Pre OpenWorld Predictions (Exadata X3-2?)

With Larry Ellison's keynote at Oracle OpenWorld 2012 only a month away, I thought that I would make a couple of wild guesses about new products that may or may not get announced this year.  I'll lump them into two groups: educated guesses and wild conjecture.  Insert standard blogging disclaimer (please read this part, Oracle lawyers):

Everything contained in this blog post is pulled from publicly available information and conclusions drawn from products that are currently available outside of Exadata.  None of this information comes from within Oracle - not that Oracle would be willing to give me any information otherwise.

30Aug/120

Where in the World is Andy Colvin?

I've got a handful of presentations coming up in the latter part of the year, so I thought I'd add a quick post with where all I'm going to be.  Seems like I'm all over the map and I couldn't stop thinking of a game that I played way back when I was in elementary school.  Well, over the next few months, I'll be in a few places talking about Exadata, OEM, and other Oracle topics.  Here's a list of where I'll be, and what I'll be talking about.

Oracle Open World (San Francisco, CA - September 30 - October 4)

UK Oracle Users Group Conference (Birmingham, UK - December 3 - December 5)

  • Patching Exadata Demystified (December 4, 11:15AM)
  • Exadata Zero Downtime Migration (December 5, 11:15AM)

Of course, I'll be at Enkitec's booth (Moscone South, #421) at Open World as well, so feel free to stop by and say hi.  We may just have some goodies to give out as well.  I'm also teaching Enkitec's Exadata Administration course for a few sessions over the next 3 months.

25Aug/120

Exadata Flash Write-back – Sooner Than We Think?

If you missed Andy Mendelsohn's keynote at E4 last week, you may not have heard the hubbub that surrounded one of his last slides (tweeted by Frits Hoogland here).  The mention of the write-back cache enticed Kevin Closson to talk about the potential ramifications of such a feature.  There's a lot of information on that slide to digest (what's a pluggable database?  virtualization of database servers?), but I'm going to focus on the flash-based write-back cache.  Note that this is not the "Exadata Smart Flash Log" feature introduced last year with the 11.2.2.4.0 cell patch, discussed by Guy Harrison recently.  That feature sends writes to both flash and disk at the same time.  In my experience, the disk wins on > 90% of those writes.
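
To illustrate the "send it to both and see who wins" idea behind the smart flash log, here is a toy model in Python.  This is only a sketch of the concept, not how the cell software is written: the function names and the latencies are made up, and the two writes are simulated with sleeps.

```python
import asyncio


async def simulated_write(target: str, latency_s: float) -> str:
    """Pretend to write a redo block; the latency is an invented number."""
    await asyncio.sleep(latency_s)
    return target


async def redo_write(flash_latency_s: float, disk_latency_s: float) -> str:
    """Issue the same write to flash and disk, return whichever finishes first."""
    tasks = [
        asyncio.create_task(simulated_write("flash", flash_latency_s)),
        asyncio.create_task(simulated_write("disk", disk_latency_s)),
    ]
    # The write is acknowledged as soon as the first copy completes.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    winner = next(iter(done)).result()
    # The slower copy is not abandoned; it still completes afterwards.
    await asyncio.gather(*pending)
    return winner


# Hypothetical latencies where the disk (with its controller cache) is quicker.
winner = asyncio.run(redo_write(flash_latency_s=0.05, disk_latency_s=0.01))
print(f"Acknowledged by: {winner}")
```

The point is that the feature protects against the occasional slow write on either device.  It does not move writes to flash permanently, which is what a write-back cache would do.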

This is something larger than just sending writes to flash...an issue that Oracle has likely been working on for a few years. Kevin had mentioned in his post that he expected it to be a feature in the 12.2 release, possibly 12.1 of the database. Because Mendelsohn mentioned that there was a 12-month timeframe for these items, I expected it would occur with the release of the new version of the Oracle database, 12c. I've been doing some poking around in the latest Exadata patch notes and saw a couple of interesting bugs around a write-back cache on Exadata using flash. Bug 14143451 "Enhancement for ASM write-back flash cache resilvering support" and bug 14132953 "Enhanacement to add Write-back flash cache resilvering support" have both been added to the August 2012 bundle patch for 11.2.0.3 (MOS note #1393410.1). If you look at these bugs, you will see that they are currently listed as fixed in 11.2.0.4. The fact that the enhancement has been added to 11.2.0.3 interests me. It looks similar to the introduction of the Exadata smart flash log feature, introduced in the 11.2.2.4.0 Exadata storage server version, released October 2011. If you look through the Exadata bundle patches for 11.2.0.2, you'll see that it was introduced into the database code in bundle patch 9 (MOS note #1314319.1). That bundle patch was released in July 2011. Sound familiar? I wouldn't put it past Oracle to include the write-back cache through a new version of the storage server software.

This sounds like the kind of feature that Larry Ellison would be very happy to announce at Open World in October. We'll just have to wait and see what gets announced. I'll have another post in the next week or so guessing about what may get announced a month from now in San Francisco.