Wednesday, November 13, 2013

How to shut down a Peoplesoft OVA after init

The Peoplesoft team continues to deliver the PUM images on schedule. Great work. The latest image, HCM 9.2.003, was released two or three weeks ago.
I’m still using VMWare ESXi vSphere Hypervisor 5.1, and my workaround for moving these OVAs to VMWare still works fine. You can find out more about this workaround, which I wrote about a few months ago, here.

There are a few changes, though. The number of updates required in the ovf file has dropped dramatically. Right now, we only have two parts to modify:
From
<     <OperatingSystemSection ovf:id="109">
<       <Description>Oracle_64</Description>
<       <vbox:OSType ovf:required="false">Oracle_64</vbox:OSType>
To
>     <OperatingSystemSection ovf:id="101">
>       <Description>oracleLinux64Guest</Description>
>       <vbox:OSType ovf:required="false">oracleLinux64Guest</vbox:OSType>
And From
<         <vssd:VirtualSystemType>virtualbox-2.2</vssd:VirtualSystemType>
To
>         <vssd:VirtualSystemType>vmx-07</vssd:VirtualSystemType>
Apart from that, no other changes to the file are necessary.
Note that the sound card tag is no longer there, and neither are many of the other tags that caused incompatibilities with VMWare.
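
The two modifications above can also be scripted rather than made by hand. The following is only a minimal sketch using sed; the function name patch_ovf is my own invention, and it assumes the tags appear in the ovf file exactly as in the extract above. Keep a copy of the original file, as I did with the .ovf.orig.

```shell
#!/bin/bash
# patch_ovf SRC DST: write a copy of a VirtualBox ovf descriptor with the
# two modifications described above. The function name is hypothetical.
patch_ovf() {
    local src=$1 dst=$2
    sed -e 's|<OperatingSystemSection ovf:id="109">|<OperatingSystemSection ovf:id="101">|' \
        -e 's|<Description>Oracle_64</Description>|<Description>oracleLinux64Guest</Description>|' \
        -e 's|\(<vbox:OSType ovf:required="false">\)Oracle_64\(</vbox:OSType>\)|\1oracleLinux64Guest\2|' \
        -e 's|\(<vssd:VirtualSystemType>\)virtualbox-2\.2\(</vssd:VirtualSystemType>\)|\1vmx-07\2|' \
        "$src" > "$dst"
}
```

For example: patch_ovf HCMDB-853-07.ovf.orig HCMDB-853-07.ovf. Lines that do not match any of the four patterns are copied unchanged.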

Then I deployed the OVA with ovftool (the link points to the latest version, 3.5, but I’m still using the previous version, 3.0.1; it probably makes no difference in this case):
[root@omsa:/nfs/software/PeopleSoftCD/OVA/HCMDB-85307-PI003_OVA]# ls -la
total 16223832
drwxrwxrwx 2 root root       4096 Nov  8 13:16 .
drwxrwxrwx 7 root root       4096 Oct 31 15:13 ..
-rw------- 1 root root 1235787776 Oct 10 11:36 HCMDB-853-07-disk1.vmdk
-rw------- 1 root root 2931841536 Oct 10 11:39 HCMDB-853-07-disk2.vmdk
-rw------- 1 root root 4845944320 Oct 10 11:46 HCMDB-853-07-disk3.vmdk
-rw------- 1 root root 7583338496 Oct 10 11:54 HCMDB-853-07-disk4.vmdk
-rw------- 1 root root      13757 Nov  8 09:37 HCMDB-853-07.ovf
-rw------- 1 root root      13747 Nov  7 08:32 HCMDB-853-07.ovf.orig
-rw-r--r-- 1 root root        103 Nov  8 12:03 .ovftool
[root@omsa:/nfs/software/PeopleSoftCD/OVA/HCMDB-85307-PI003_OVA]# more .ovftool
lax
datastore=vm
skipManifestCheck
overwrite
powerOffTarget
net:HostOnly=VM Network 2
name=HCM92003
[root@omsa:/nfs/software/PeopleSoftCD/OVA/HCMDB-85307-PI003_OVA]# ovftool HCMDB-853-07.ovf vi://root:<mypwd>@192.168.1.10:443
Opening OVF source: HCMDB-853-07.ovf
Opening VI target: vi://root@192.168.1.10:443/
Deploying to VI: vi://root@192.168.1.10:443/
Transfer Completed
Completed successfully
[root@omsa:/nfs/software/PeopleSoftCD/OVA/HCMDB-85307-PI003_OVA]#

Of course, if you’re working with VirtualBox, you don’t have any of the problems above.

And finally booting the VM from the console works like a charm:
HCM92003_boot_002

At the end of this boot, as expected, we are prompted for the Terms of Use, root password, network properties, database name…
HCM92003_boot_003

HCM92003_boot_005 
etc.
Then, on the last screen, the setup is complete. We are now ready to go to the front end:
HCM92003_boot_006
Note that all the actions above are done from within the console; you don’t actually have any other choice for the configuration (the vm-template init script is triggered on the VM’s first boot).
!!! For the sake of this article, I deliberately left the console like this and did not touch the keyboard from here on: no key was pressed !!!

We can really appreciate all the effort made by the Peoplesoft team. It has never been simpler to deploy a working environment. Now, the login page:
HCM92003_boot_007
HCM92003_boot_008
So far, so good. All works as expected.

Now for the catch, which brings me to the subject of this article.

Not long after setting up this new VM, I wanted to shut down my host server for maintenance, so I had to shut down this VM first. How should I do it? Should I go to the console? Or can I do it through Putty (I know the IP, so why not)?

CASE 1: shutting down the VM when connected through Putty
Note that at this very moment the console is still waiting with the message (Will continue in 1800 seconds…) while I shut down the VM as root, connected to it through Putty (most of the time I don’t use the console to work on the VM):
HCM92003_boot_010
Now, there is nothing wrong with wanting to restart the VM. Let’s see:
HCM92003_boot_011 
And here we end up in the middle of nowhere:
HCM92003_boot_012
For some reason, the VM is being reinitialized! What the heck?!?! I don’t even want to go any further; I’m too worried about the VM’s stability now.
For the record, here are the last lines of /var/log/oraclevm-template.log:
[LOG]  Nov 12 09:12:04 S96psft-abw: Started PIA Domain peoplesoft
[LOG]  Nov 12 09:12:04 oraclevm-template: ==> 2013-11-12 09:12:04: oraclevm-template --config <==
[LOG]  Nov 12 09:12:04 oraclevm-template: Running oraclevm-template --config
[LOG]  Nov 12 09:12:04 oraclevm-template: Loading parameters from config file: /etc/sysconfig/oraclevm-template
[LOG]  Nov 12 09:12:04 oraclevm-template: Reconfiguring OS
[INFO] Nov 12 09:12:06 oraclevm-template: Regenerating SSH host keys.
[LOG]  Nov 12 09:12:08 oraclevm-template: calling /opt/oracle/psft/vm/oraclevm-template.sh
[LOG]  Nov 12 09:12:08 oraclevm-template.sh: Creating Virtual Environment. Date: Tue Nov 12 09:12:08 UTC 2013

We can clearly see that the configuration is being run again…
And before rebooting:
[root@hcm92003 ~]# more /etc/sysconfig/oraclevm-template
#
# Template configuration
#

#
# Allow configuration of the template when the service is enabled
#
RUN_TEMPLATE_CONF=YES

Given this, it is perfectly logical that the VM returns to a pre-configuration state.

CASE 2: shutting down the VM when connected through the VM console
Starting again from scratch, but this time, instead of shutting down the VM through Putty, we do it from within the VM console itself. Let’s see how it behaves.
HCM92003_boot_014  
On that screen, which was still waiting for the 1800 seconds to elapse, I finally pressed a key (actually “Enter”). Please note the message “Template configuration disabled”; I’m sure it plays a role somewhere.
Now restarting the same VM:
HCM92003_boot_015 
Much better: now everything appears to work fine, and it actually does:
HCM92003_boot_016
From now on, no matter how or where we shut down the VM, it will always work.
And the log file is pretty clear; all looks OK now:

[LOG]  Nov 12 08:56:52 oraclevm-template: ==> 2013-11-12 08:56:51: oraclevm-template --disable <==
[LOG]  Nov 12 08:56:52 oraclevm-template: Running oraclevm-template --disable
[LOG]  Nov 12 08:56:53 oraclevm-template: Changed RUN_TEMPLATE_CONF=NO in /etc/sysconfig/oraclevm-template
[INFO] Nov 12 08:56:53 oraclevm-template: Template configuration disabled.

And the config file:
[root@hcm92003 log]# more /etc/sysconfig/oraclevm-template
#
# Template configuration
#

#
# Allow configuration of the template when the service is enabled
#
RUN_TEMPLATE_CONF=NO
This time, no doubt, it won’t reconfigure the VM.
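
Since the whole difference between the two cases comes down to this one flag, it can be checked from a Putty session before any shutdown. Here is a minimal sketch; the function name template_conf_state is hypothetical, and the accepted "yes" values are an assumption copied from the case pattern of the init script extract quoted further down in this article.

```shell
#!/bin/bash
# template_conf_state [FILE]: print "pending" if the first-boot template
# configuration is still enabled, "disabled" otherwise, "unknown" if the
# file cannot be read. The function name is hypothetical.
template_conf_state() {
    local conf=${1:-/etc/sysconfig/oraclevm-template}
    local value
    [ -r "$conf" ] || { echo "unknown"; return 1; }
    # Keep the last uncommented assignment found in the file.
    value=$(sed -n 's/^RUN_TEMPLATE_CONF=//p' "$conf" | tail -n 1)
    case "$value" in
        YES|yes|Yes|1) echo "pending" ;;
        *)             echo "disabled" ;;
    esac
}
```

If it prints "pending", a shutdown at that point would presumably lead to the reinit seen in case 1.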

HOW DOES IT ALL WORK ?
After a little digging into the oraclevm-template scripts triggered at boot, I think I found the reason for this behavior.
To understand it, I’ll go step by step:

1. At the very beginning, the first script called when the VM boots is /etc/rc.d/init.d/oraclevm-template.
In this script, we can see a configuration file, CONFFILE=/etc/sysconfig/oraclevm-template (the same one shown above), which states whether the VM has to be configured or not (RUN_TEMPLATE_CONF=YES).
If the VM has to be configured, it calls another script with the flag “--config”, as follows: “/usr/sbin/oraclevm-template --config”. It eventually disables the template by calling the same script with the flag “--disable”, which sets RUN_TEMPLATE_CONF=NO.
Here we go (just an extract):
CONFFILE=/etc/sysconfig/oraclevm-template
...
                . $CONFFILE
                case "$RUN_TEMPLATE_CONF" in
                    YES|yes|Yes|1)
                        doconfig=1
...
        if [ $forceconfig -eq 1 ] || [ $doconfig -eq 1 ]; then
                /usr/sbin/oraclevm-template --config
                ret=$?
...
            # Just disable no matter firstboot reconfig succeeds or not.
            /usr/sbin/oraclevm-template --disable

2. If the VM has to be configured (RUN_TEMPLATE_CONF=YES), the script /usr/sbin/oraclevm-template is launched with the “--config” flag. It calls a function, “do_config”, which goes through every step needed to initialize the VM, including a call to another script, /opt/oracle/psft/vm/oraclevm-template.sh, which runs everything else that is needed.
When /usr/sbin/oraclevm-template is called with the “--config” flag, it ends by running the function “ovm_press_anykey 1800”.

3. This last function can be found in /usr/lib/oraclevm-template/functions:
function ovm_press_anykey
{
    local to=0
    [ -n "$1" ] && to=$1 || to=$DEFAULT_TIMEOUT
    echo
    stty -echo
    if [ $to -eq 0 ]; then
      read -n 1 -p "Press any key to continue..."
    else
      read -t $to -n 1 -p "Will continue in $to seconds, or press any key to continue..."
    fi
    echo
    stty echo
}
Here we find the last message displayed while the console is waiting, as shown in the screenshots above.

In short: at the end of the first VM configuration, the script waits for 1800 seconds or until a key is pressed (step 3, function ovm_press_anykey) before continuing (step 2, exit from the script /usr/sbin/oraclevm-template), and only then disables the template (step 1, /etc/rc.d/init.d/oraclevm-template) with the command “/usr/sbin/oraclevm-template --disable”…
If you neither press Enter nor wait 1800 seconds before shutting down from outside the console, the template is never disabled…
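
The sequence described in the three steps can be reproduced with a toy model. This is only a sketch: it works on a temporary file, not on the real /etc/sysconfig/oraclevm-template, and the functions first_boot and next_boot are hypothetical stand-ins for the real scripts, not extracts from them.

```shell
#!/bin/bash
# Toy model of the first-boot sequence. It never touches the real
# scripts or config file; all function names are hypothetical.
CONF=$(mktemp)

first_boot() {
    # Pass "interrupted" to model a shutdown issued from outside the
    # console while the "Will continue in 1800 seconds" prompt is waiting.
    echo "RUN_TEMPLATE_CONF=YES" > "$CONF"
    echo "configuring VM"                  # stands in for oraclevm-template --config
    if [ "${1:-}" = "interrupted" ]; then
        return 0                           # the disable step is never reached
    fi
    echo "RUN_TEMPLATE_CONF=NO" > "$CONF"  # stands in for oraclevm-template --disable
}

next_boot() {
    # Same test as in the init script extract shown in step 1.
    . "$CONF"
    case "$RUN_TEMPLATE_CONF" in
        YES|yes|Yes|1) echo "reconfigure" ;;
        *)             echo "normal boot" ;;
    esac
}
```

Running first_boot interrupted followed by next_boot corresponds to case 1, and first_boot followed by next_boot corresponds to case 2.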

CONCLUSION:
From my point of view, this is a mistake: a bug in the template deployment script. It should set RUN_TEMPLATE_CONF to NO as soon as the configuration is finished, as stated on the console screen (“The setup of the Peoplesoft Virtual Machine is completed”), and not after we press a key or wait for 1800 seconds… It is probably not an easy fix, though, since a lot of interaction between different scripts is involved here.
Maybe disabling the template immediately after configuration would be the simplest way? At step 2, the “config” function should trigger the “disable” function (as is done at step 1) before any waiting.

What should we do now ?
Until a fix is available, after the very first boot and configuration of the VM, always, always press Enter from within the console!!! No matter what you plan to do later on, press a key! Then, and only then, can you close the console.
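
If the console is really out of reach, flipping the flag by hand might be an alternative. This is an untested assumption, based only on the log line “Changed RUN_TEMPLATE_CONF=NO in /etc/sysconfig/oraclevm-template”; the real “--disable” call may do more than that. The function name disable_template_conf is hypothetical, and pressing a key in the console remains the only method verified in this article.

```shell
#!/bin/bash
# disable_template_conf [FILE]: set RUN_TEMPLATE_CONF=NO in the given file,
# leaving every other line untouched. Hypothetical helper, untested on a
# real Peoplesoft VM.
disable_template_conf() {
    local conf=${1:-/etc/sysconfig/oraclevm-template}
    local tmp rc
    tmp=$(mktemp) || return 1
    # Rewrite into a temporary file, then copy back so that the original
    # file keeps its owner and permissions.
    sed 's/^RUN_TEMPLATE_CONF=.*/RUN_TEMPLATE_CONF=NO/' "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

On the VM itself, calling “/usr/sbin/oraclevm-template --disable” directly, the same command the init script runs, is probably the more faithful route, but that is untested as well.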

Thanks and enjoy the appliances !

Nicolas.

2 comments:

Nicolas Gasparotto said...

A much easier fix would probably be to drop the call to the function "ovm_press_anykey 1800" at the end of the function do_config() in the script /usr/sbin/oraclevm-template. Waiting 1800 seconds or waiting for a key is not needed anyway.

Nicolas.

Nicolas Gasparotto said...

Sorry for the poor quality of the screenshots; it was working fine until recently. I have to find out why they look so bad, or look for better software.

Nicolas.