Upgrading to OpenNebula 3.4 with Running VMs

Overview

This guide, pulled out from this thread: Upgrade 3.2 -> 3.4 with active VMs, shows how to upgrade to OpenNebula 3.4 without shutting down the running VMs.

It is not meant as a generic solution, and users should follow this method only if they understand its limitations. Otherwise, the normal upgrade procedure is recommended.

Rationale

Text based on this email:

The reason why VMs have to be shut down is because the storage model has changed quite a bit in this last version, and we couldn't come up with a migration process that felt solid enough for any deployment.

But, since you said you are ready for lots of manual work, let's try this:

First, edit /usr/lib/one/ruby/onedb/3.3.0_to_3.3.80.rb and comment out lines 35 to 60 to disable the VM status check. Then upgrade the OpenNebula DB with the onedb command.
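In concrete commands, this step might look like the following sketch, assuming the default SQLite backend at /var/lib/one/one.db (use onedb's MySQL options if that is your backend):

# back up the database first, just in case
onedb backup --sqlite /var/lib/one/one.db
# after commenting out the VM status check in
# /usr/lib/one/ruby/onedb/3.3.0_to_3.3.80.rb, run the upgrade:
onedb upgrade -v --sqlite /var/lib/one/one.db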

Before you start OpenNebula again, you need to adapt the previous VM storage directories to the new 3.4 hierarchy. Assuming you have neither VM_DIR nor DATASTORE_LOCATION defined in oned.conf, this is how the VM directories look in both versions:

3.2:  /var/lib/one/<vm_id>/images/disk.i            (VM directory)
      /var/lib/images/abcde                         (image repository)

3.4:  /var/lib/one/datastores/0/<vm_id>/disk.i      (system datastore)
      /var/lib/one/datastores/1/abcde               (image datastore)

So what you need to do on the front-end *and each host* is to individually link each existing VM dir:

ln -s /var/lib/one/<vm_id>/images /var/lib/one/datastores/0/<vm_id>

Let me stress that this must be done on all the hosts.
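If you have many VMs, a loop like this sketch (not from the original email; it assumes the default paths above and that /var/lib/one/datastores/0 already exists) can do the linking:

# front-end and every host: link each existing VM dir into the system datastore
for d in /var/lib/one/[0-9]*; do
  id=`basename $d`
  if [ -d $d/images -a ! -e /var/lib/one/datastores/0/$id ]; then
    ln -s $d/images /var/lib/one/datastores/0/$id
  fi
done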

Please take into account that this is just the basic idea, and that *we haven't tested this*. You may run into problems depending on your infrastructure configuration. For instance, if you were using the tm_shared drivers, VMs containing persistent Images will have links like this one:

/var/lib/one/7/images/disk.1 → /var/lib/images/qwerty

If you move the image files to their new destination in /var/lib/one/datastores/1, like the documentation says, you will break those VM disk links.
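After such a move, any disk links that broke can be spotted with a find one-liner (a generic sketch; it prints symlinks whose target no longer exists):

# find VM disk symlinks that point to files that were moved away
find /var/lib/one/*/images -type l ! -exec test -e {} \; -print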

Another problem that may be important to you is that VMs on which you executed saveas won't be able to save the changes back to the locked Image once the VM is shut down. In the best scenario, you will find the files inside /var/lib/one/<vm_id>/images/, but I'm not sure about this; you may lose your changes… You can locate these VMs by looking for VM/DISK/SAVE_AS; the affected Images all have “-” as the value of IMAGE/SOURCE.
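To locate them from the shell, something like this sketch works (it assumes onevm list prints the VM ID in the first column after a header line; onevm show -x prints the VM's XML body):

for id in `onevm list | awk 'NR>1 {print $1}'`; do
  onevm show -x $id | grep -q '<SAVE_AS>' && echo "VM $id has a SAVE_AS disk"
done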

As I said, this is not a tested or recommended procedure.

Good luck, Carlos

Implementation

Text based on this email.

The following is a longer email, but I believe there are others who also need to make a “smooth” upgrade from 3.2.x to 3.4.

Thanks again to Carlos for showing me the right way. Here is how we did the migration from 3.2.1 to 3.4 with certain VMs active and no chance to shut them down. Some background for the steps described below. Our old setup was:

- ssh as transfer manager (a little bit modified)
- VM_DIR was /one/vm in 3.2
- lots of persistent images, no images marked with SAVE_AS
- path to image repository: /one/var/images

New setup:

- ssh transfer manager
- filesystem datastore 1 (/one/var/datastores/1)
- system datastore 0 (/one/var/datastores/0)

The migration steps:

1) Normal upgrade procedure (installation of ONE 3.4, without the onedb upgrade yet)
2) As described by Carlos, I commented out lines 36 to 60 in /usr/lib/one/ruby/onedb/3.3.0_to_3.3.80.rb (line 35, “header_done=false”, does not have to be commented out), then ran onedb upgrade, which worked fine
3) On each node, create a link for each active VM from /one/var/datastores/0/<VM_ID> → /one/vm/<VM_ID>/images (the system datastore; note the script below does exactly this)

#!/bin/sh
#
DATASTORE_LOCATION=/one/var/datastores
SYSTEM_STORE=$DATASTORE_LOCATION/0
IMAGE_STORE=$DATASTORE_LOCATION/1
VM_DIR=/one/vm
 
# Create Datastore Directory
test -d $DATASTORE_LOCATION || mkdir $DATASTORE_LOCATION
# Create System Datastore
test -d $SYSTEM_STORE || (echo "creating vm dir $SYSTEM_STORE"; mkdir $SYSTEM_STORE)
#
# In our setup the datastore 1 is available on each host, but was mounted to /one/images
#
test -L $IMAGE_STORE || (echo "linking $IMAGE_STORE"; ln -s /one/images $IMAGE_STORE )
 
 
for i in $VM_DIR/*; do
  ID=`basename $i`
  if [ -d $i/images -a ! -L $SYSTEM_STORE/$ID -a ! -d $SYSTEM_STORE/$ID ]; then
    echo "linking running vm $ID to new system datastore"
    ln -s $i/images $SYSTEM_STORE/$ID
  fi
done

After this step ONE was able to monitor the running VMs again, but as I noticed at the first test, it was not able to shut down a VM and migrate its images back to image datastore 1. What was the problem? The TM_MAD and datastore information was missing in the persistent images; this information is stored in the XML body in the ONE database, table vm_pool.

<DATASTORE>default</DATASTORE>
<DATASTORE_ID>1</DATASTORE_ID>
<TM_MAD>ssh</TM_MAD>

This information is needed to use the correct transfer manager for the persistent image migration. (Without it you get an error like: Error executing image transfer script: Transfer Driver 'hostname:/one/var/datastores/0/1322/disk.0' not available)

So we had to add the needed information to each persistent disk of each active VM.

Diving into Ruby for the first time in my life, I decided to do this by creating a kind of DB update script in the “onedb” way, as this brings everything needed out of the box (mainly just some data changes in the database) and was the fastest way for me. What had to be done? I created the update script included below, which does the following:

- Adds the above-mentioned information to each disk, if it is persistent and not from a VM started the new way (the check on PERSISTENT and a missing DATASTORE element in the script).
- Changes the SOURCE of the images to the correct new location.

This worked fine for us, again using the normal “onedb upgrade”. But to start oned again (the db_version is now set to 3.4.0.5), I had to delete the entry for this version from the table db_versioning: just do a SELECT * FROM db_versioning, note the oid of the row for version 3.4.0.5, and delete that row.
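With the default SQLite backend that amounts to something like this sketch (replace the placeholder oid with the value you noted; adjust for MySQL accordingly):

sqlite3 /var/lib/one/one.db "SELECT * FROM db_versioning;"
# note the oid of the row with version 3.4.0.5, then:
sqlite3 /var/lib/one/one.db "DELETE FROM db_versioning WHERE oid = <noted_oid>;"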

After having done this, ONE was running fine, and shutting down some “older” test VMs also worked fine, migrating the persistent images back to datastore 1.

Beware that when shutting down the “old” VMs, only the links on the nodes are deleted, not the image files themselves. Cleaning those up may be done manually or by a small extension to the delete script of the ssh transfer driver.
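A possible starting point for the manual cleanup on each node (an untested sketch, assuming our VM_DIR of /one/vm; it only reports candidates, and the rm stays commented out until you have verified them):

# VM dirs whose system-datastore link is gone belong to shut-down VMs
for i in /one/vm/*; do
  ID=`basename $i`
  if [ ! -L /one/var/datastores/0/$ID ]; then
    echo "VM $ID seems shut down; leftover files in $i"
    # rm -rf $i
  fi
done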

Perhaps this mail gives some useful information to people with the same upgrade need.

Best, Michael

# -------------------------------------------------------------------------- #
# Copyright 2002-2012, OpenNebula Project Leads (OpenNebula.org)             #
#                                                                            #
# Licensed under the Apache License, Version 2.0 (the "License"); you may    #
# not use this file except in compliance with the License. You may obtain    #
# a copy of the License at                                                   #
#                                                                            #
# http://www.apache.org/licenses/LICENSE-2.0                                 #
#                                                                            #
# Unless required by applicable law or agreed to in writing, software        #
# distributed under the License is distributed on an "AS IS" BASIS,          #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.   #
# See the License for the specific language governing permissions and        #
# limitations under the License.                                             #
#--------------------------------------------------------------------------- #
 
require "rexml/document"
require "OpenNebula/XMLUtils"
include REXML
 
module Migrator
    def db_version
        "3.4.0.5"
    end
 
    def one_version
        "OpenNebula 3.4.0"
    end
 
    def up
 
        ########################################################################
        # Add to each running VM's persistent disks:
        #    DATASTORE default (DATASTORE_ID 1)
        #    TM_MAD ssh (that's what we use)
        ########################################################################
 
        @db.run "ALTER TABLE vm_pool RENAME TO old_vm_pool;"
        @db.run "CREATE TABLE vm_pool (oid INTEGER PRIMARY KEY, name VARCHAR(128), body TEXT, uid INTEGER, gid INTEGER, last_poll INTEGER, state INTEGER, lcm_state INTEGER, owner_u INTEGER, group_u INTEGER, other_u INTEGER);"
 
        @db.fetch("SELECT * FROM old_vm_pool") do |row|
            doc = Document.new(row[:body])
            if ( row[:state] == 3 )   # 3 == ACTIVE
              # Update old VMs - or better, VMs with the old image format
              doc.root.each_element("TEMPLATE/DISK") { |e|
                # if it is an "old" format
                if ( e.elements['PERSISTENT'] && (e.elements['PERSISTENT'].text == "YES") && (!e.elements['DATASTORE']) )
                  # Add additional elements needed by 3.4 (check your TM_MAD)
                  e.add_element("DATASTORE_ID").text = CData.new("1")
                  e.add_element("DATASTORE").text    = CData.new("default")
                  e.add_element("TM_MAD").text       = CData.new("ssh")
                  # Replace old path with new default path (check your paths)
                  ipath=e.elements['SOURCE'].text
                  ipath=ipath.gsub("/one/var/images", "/one/var/datastores/1")
                  ipath=ipath.gsub("/one/images",     "/one/var/datastores/1")
                  e.elements['SOURCE'].text=ipath
                end
              } 
              # puts "FINAL #{'%s' % doc.root.to_s}"
 
            end
            @db[:vm_pool].insert(
                                 :oid        => row[:oid],
                                 :name       => row[:name],
                                 :body       => doc.root.to_s,
                                 :uid        => row[:uid],
                                 :gid        => row[:gid],
                                 :last_poll  => row[:last_poll],
                                 :state      => row[:state],
                                 :lcm_state  => row[:lcm_state],
                                 :owner_u    => row[:owner_u],
                                 :group_u    => row[:group_u],
                                 :other_u    => row[:other_u]
                                 )
        end
 
        @db.run "DROP TABLE old_vm_pool;"
 
 
        return true
    end
end
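To apply the migrator with the normal “onedb upgrade”, we assume it has to be named after the versions it bridges, like the stock migrators shipped in /usr/lib/one/ruby/onedb/ (the source file name below is hypothetical; the target name matches the db_version the script declares):

# assumed naming convention, matching the stock <from>_to_<to>.rb migrators
cp migration.rb /usr/lib/one/ruby/onedb/3.4.0_to_3.4.0.5.rb
onedb upgrade -v --sqlite /var/lib/one/one.db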