XenServer - Automating Metadata backups

Well, we had some issues with XenServer's “automated” metadata backup, so I decided - with the help of one of our consultants - to automate it on our own to an external target.

```
#!/bin/bash

# Crontab entry for each server:
# 02 5 * * * root /usr/local/sbin/xen-pool-backup.sh

# Get the pool name
POOL_NAME="$( xe pool-list | grep name-label | awk '{ print $4 }' )"
HOST_UUID="$( xe host-list hostname=`hostname` | grep "uuid ( RO)" | awk '{ print $5 }' )"
DAILY_GENERATIONS=7
WEEKLY_GENERATIONS=4
NFS_MOUNT="nfs.home.barfoo.org:/srv/xenbackup"
NFS_LOCAL="/tmp/backup-mount/$POOL_NAME"

# Figure out if we're the pool master
POOL_MASTER="$( xe pool-list | grep master | awk '{ print $4 }' )"

if [ "$POOL_MASTER" == "$HOST_UUID" ] ; then
    # Only the pool master should back up the pool database, as it is
    # the only one with an authoritative pool database

    # Create the necessary directory and mount the NFS volume
    mkdir -p ${NFS_LOCAL%/*}
    mount -t nfs $NFS_MOUNT ${NFS_LOCAL%/*}
    mkdir -p $NFS_LOCAL

    if [ -f $NFS_LOCAL/daily.$DAILY_GENERATIONS.gz ]; then
        rm -f $NFS_LOCAL/daily.$DAILY_GENERATIONS.gz
    fi

    OLD_DAILY="$( echo "scale=0; $DAILY_GENERATIONS - 1" | bc )"
    for OLD in $( seq $OLD_DAILY -1 1 ); do
        if [ -f $NFS_LOCAL/daily.$OLD.gz ] ; then
            NEW="$( echo "scale=0; $OLD+1" | bc )"
            # Save the time stamp somewhere
            touch $NFS_LOCAL/.timestamp -r $NFS_LOCAL/daily.$OLD.gz
            mv $NFS_LOCAL/daily.$OLD.gz $NFS_LOCAL/daily.$NEW.gz
            # Restore the date
            touch $NFS_LOCAL/daily.$NEW.gz -r $NFS_LOCAL/.timestamp
        fi
    done

    [ -f $NFS_LOCAL/daily.0.gz ] && mv $NFS_LOCAL/daily.0.gz $NFS_LOCAL/daily.1.gz

    xe pool-dump-database file-name=$NFS_LOCAL/daily.0
    gzip -9 $NFS_LOCAL/daily.0

    [ -f $NFS_LOCAL/.timestamp ] && rm $NFS_LOCAL/.timestamp

    umount ${NFS_LOCAL%/*}
    rm -rf ${NFS_LOCAL%/*}
fi
```

With that, I have at least a daily backup - and in combination with our daily TSM backup, I have at least a month-long history of metadata backups.
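The generation rotation above can be isolated into a small standalone function. This is a minimal sketch, using plain shell arithmetic instead of bc, skipping the timestamp preservation, and folding the final daily.0 move into the same loop; the directory and file contents are made up for the demonstration:

```shell
#!/bin/bash
# Minimal sketch of the daily-generation rotation:
# daily.N.gz -> daily.N+1.gz, dropping the oldest generation first.
rotate_generations() {
    local dir="$1" max="$2"
    # Drop the oldest generation so we never exceed $max copies
    rm -f "$dir/daily.$max.gz"
    # Shift every remaining generation up by one, oldest first
    for old in $( seq $(( max - 1 )) -1 0 ); do
        if [ -f "$dir/daily.$old.gz" ]; then
            mv "$dir/daily.$old.gz" "$dir/daily.$(( old + 1 )).gz"
        fi
    done
}

# Demonstration against a throw-away directory with fake generations
demo="$( mktemp -d )"
for i in 0 1 2 3; do echo "gen$i" > "$demo/daily.$i.gz"; done
rotate_generations "$demo" 3
ls "$demo"
```

After the rotation, daily.1.gz holds yesterday's dump (gen0) and the old daily.3.gz is gone.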

June 5, 2012 · 2 min · 337 words · christian

NetApp - Get a list of all volumes not being used

Well, I had another task for today … I have a number of FlexVolumes (currently sixty per controller), and I didn't know whether any of them had no LUNs on them. Now I thought there was a command for that, since my co-worker mentioned something like that. However, once again … there isn't.

```
#!/bin/bash

MAILTO="san@barfoo.org"
KEY_FILE="/root/.ssh/netapp.dsa"
SSH_OPTS="/root/.ssh/netapp-ssh_config"
FAS_CTRL=$1
TMPDIR="$( mktemp -d )"

ssh_fas() {
    # $@: commands for Data ONTAP
    COMMANDS="$@"
    /usr/bin/ssh -i $KEY_FILE -l root -F $SSH_OPTS $COMMANDS
}

# Get the hostname of the controller, necessary for the reporting
CTRL_HOSTNAME="$( ssh_fas $FAS_CTRL rdfile /etc/rc | grep ^hostname | cut -d' ' -f2 | tr 'a-z' 'A-Z' )"

# Get a list of all volumes / luns
VOL_LIST="$( ssh_fas $FAS_CTRL vol status | grep ^vol | egrep -v '(nfs|cifs)' | awk '{ print $1 }' )"
LUN_LIST="$( ssh_fas $FAS_CTRL lun show | grep '/vol' | awk '{ print $1 }' )"

for lun in $LUN_LIST; do
    VOL_EXTRACT="$( echo $lun | cut -d/ -f3 )"
    VOL_LIST=${VOL_LIST/${VOL_EXTRACT}/}
done

for vol in $VOL_LIST; do
    echo "Empty Flex Volume: $vol."
done > $TMPDIR/mailcontent

if [ "$( grep Flex $TMPDIR/mailcontent )" ] ; then
    cat $TMPDIR/mailcontent | mailx -r $MAILTO -s "$CTRL_HOSTNAME: Empty volume check" $MAILTO
fi

rm -r $TMPDIR
```
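One caveat with the filtering step: `${VOL_LIST/${VOL_EXTRACT}/}` is a bash substring substitution, so it can over-match when one volume name is a prefix of another (removing `vol1` also mangles `vol10`). A word-boundary-safe variant - a sketch with made-up volume names and LUN paths - would look like this:

```shell
#!/bin/bash
# Made-up sample data: volumes on the controller and LUN paths in use
VOL_LIST="vol1 vol10 vol2"
LUN_LIST="/vol/vol1/lun0"

for lun in $LUN_LIST; do
    VOL_EXTRACT="$( echo $lun | cut -d/ -f3 )"
    # Filter whole words instead of substrings, so removing "vol1"
    # leaves "vol10" untouched
    VOL_LIST="$( echo $VOL_LIST | tr ' ' '\n' | grep -vx "$VOL_EXTRACT" | tr '\n' ' ' )"
done

echo "Volumes without LUNs: $VOL_LIST"
```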

May 30, 2012 · 2 min · 246 words · christian

Automating qual_devices updates

Well, once again I was presented with a nice AutoSupport warning once I logged into my NOW account. Since we don't have CIFS and/or NFS licensed on our filers, I wrote a cute little script that does the whole work for me.

```
#!/bin/bash

TMPDIR="$( mktemp -d )"
KEY_FILE="/root/.ssh/netapp.dsa"
SSH_OPTS="/root/.ssh/netapp-ssh_config"
FAS_CTRL=$1
QUALDEVICES=$2

ssh_fas() {
    # $@: commands for Data ONTAP
    COMMANDS="$@"
    /usr/bin/ssh -i $KEY_FILE -l root -F $SSH_OPTS $COMMANDS
}

#set -x
echo "Updating qual_devices on $FAS_CTRL"

# Enable ftpd if it isn't enabled already
echo -n "Checking FTP subsystem"
FTPD_INITIAL_STATE="$( ssh_fas $FAS_CTRL options ftpd.enable | awk '{ print $2 }' )"
if [ $FTPD_INITIAL_STATE == "off" ] ; then
    ssh_fas $FAS_CTRL options ftpd.enable on
    echo " .... ok"
else
    echo " .... ok"
fi

echo
read -s -p "Please supply the root password for $FAS_CTRL: "
ROOT_PASSWD=$REPLY
echo

echo "Checking qual_devices:"
echo -n "  Running: "
mkdir $TMPDIR/fas-version

# Check the old version.
ftp -n $FAS_CTRL >/dev/null <<END_SCRIPT
prompt
user root $ROOT_PASSWD
lcd $TMPDIR/fas-version
cd /etc
mget qual_devices*
quit
END_SCRIPT

FAS_VERSION="$( grep Datecode $TMPDIR/fas-version/qual_devices_v3 | head -n1 | cut -d' ' -f3 )"
echo "$FAS_VERSION"

# Unzip the qual_devices.zip file and compare it.
ORIGPWD=$PWD
mkdir $TMPDIR/new-version
cd $TMPDIR/new-version
unzip $QUALDEVICES >/dev/null
echo -n "  New: "
NEW_VERSION="$( grep Datecode $TMPDIR/new-version/qual_devices_v3 | head -n1 | cut -d' ' -f3 )"
echo "$NEW_VERSION"
echo

# Upload the supplied version if the new file doesn't match the one running
# on the controller
if [ $NEW_VERSION != $FAS_VERSION ] ; then
    echo "Uploading qual_devices to $FAS_CTRL"
    ftp -n $FAS_CTRL >/dev/null <<END_SCRIPT
prompt
user root $ROOT_PASSWD
lcd $TMPDIR/new-version
cd /etc
mput qual_devices*
quit
END_SCRIPT

    # Send an asup message, that the issue is corrected
    echo "Generating AutoSupport message"
    ssh_fas $FAS_CTRL options autosupport.doit qual_devices_fixed_$NEW_VERSION
fi
#set +x

# Disable ftpd again if it wasn't enabled before we started
if [ $FTPD_INITIAL_STATE == "off" ] ; then
    ssh_fas $FAS_CTRL options ftpd.enable off
fi

cd $ORIGPWD
rm -r $TMPDIR
```

The whole thing of course relies on FTP being configured (as I described previously).
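The Datecode extraction and comparison step can be demonstrated in isolation. The file contents below are fabricated; all the script relies on is the "Datecode" line having the version as its third whitespace-separated field:

```shell
#!/bin/bash
# Sketch of the qual_devices version check, with made-up file contents
get_datecode() {
    grep Datecode "$1" | head -n1 | cut -d' ' -f3
}

tmp="$( mktemp -d )"
printf '# Datecode: 20120410\n' > "$tmp/running"
printf '# Datecode: 20120530\n' > "$tmp/new"

FAS_VERSION="$( get_datecode "$tmp/running" )"
NEW_VERSION="$( get_datecode "$tmp/new" )"

# Only bother uploading when the versions actually differ
if [ "$NEW_VERSION" != "$FAS_VERSION" ]; then
    echo "Update needed: $FAS_VERSION -> $NEW_VERSION"
fi
```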

May 30, 2012 · 2 min · 413 words · christian

TSM and NetApp - Another Quick Hint

Well, we’ve been trying to come up with a decent way to back up NetApp snapshots to tape (SnapMirror To Tape), so we evaluated all the available methods of doing NDMP backups: there’s the Image Backup in two different variants - FULL and DIFFerential - and there’s SnapMirror To Tape. So the Image Backup is one of the ways. However, the DIFFerential backup only works for CIFS and NFS shares (which we don’t use). We only have FC LUNs (or rather FCoE LUNs), so there’s only a single file in each volume (or, in the case of the boot LUNs, more than one). With that, however, each run of the Image Backup with the DIFFerential option is gonna back up the full size of the volume (plus the deduplicated amount). ...

May 27, 2012 · 2 min · 215 words · christian

vm-online-backup - Another day, another PowerCLI script

Well, on Friday I had a short chat with someone from one of our application departments, saying he wanted a backup copy of a VM (ain’t too hard), but a) they don’t want any downtime and b) it has to be identical to the original. So I sat down today, googled for a bit and actually found something that pretty much does what I want, though I had to fix it up a bit … So find attached a script, which creates a hot-clone from a snapshot and then deletes the old clone - but only if the latest clone was successful. ...

April 30, 2012 · 3 min · 431 words · christian

TSM and NetApp - Quick Hint

Well, to save everyone else the trouble (since it isn’t documented anywhere - and I just spent about an hour finding the cause for this): if you need to configure NDMP on your NetApp Filer, make sure you also configure an interface other than e0M. Apparently the necessary control port for NDMP (10000) is being blocked on e0M, so NDMP may be configured and running, yet TSM is gonna complain that it is unable to connect to the specified data mover. ...
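On a 7-mode filer the workaround boils down to pinning NDMP to a non-e0M interface. A sketch - this assumes the `ndmpd.preferred_interface` option exists on your Data ONTAP release and that `e0a` is a usable data interface on your filer:

```
options ndmpd.preferred_interface e0a
ndmpd off
ndmpd on
```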

April 25, 2012 · 1 min · 118 words · christian

SLES11-1 and updated multipath-tools

Well, after I scripted the installation the other day, I tried installing the SLES11.1 updates on the freshly installed systems. Guess what? The thing broke. Initially (it was late Friday afternoon - like 6 PM - right before my one-week vacation) I didn’t have much time to debug the issue, so I sat down last week and looked at it. During the installation, when multipath is first started from the command line, the scsi-mpatha device appears, and each and every occurrence of it is subsequently used (and actually replaces other stuff) during the whole installation phase. ...
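For reference, the mpatha-style names come from multipath's user_friendly_names setting; turning it off in /etc/multipath.conf makes multipath fall back to WWID-based names instead. A sketch - whether this is appropriate for your installation environment depends on your setup:

```
defaults {
    user_friendly_names no
}
```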

April 7, 2012 · 2 min · 401 words · christian

VMware Update Manager issues

Well, I recently (last Wednesday) had a lot of trouble with Update Manager. At first I thought upgrading vCenter and its modules to 5.0U1 would solve my troubles, however it did not. Update Manager was still complaining about something. Since neither the vCenter Update Manager log nor the vCenter log itself had any useful information, I enabled SSHd and the ESXi Shell via the vCenter client, SSH’ed into the ESX host, looked at /var/log/esxupdate.log, and found this particular log: ...

March 31, 2012 · 2 min · 305 words · christian

WDS and multi-architecture boot images

Well, I recently stumbled upon another cute bug/feature with Windows Deployment Services. When you already have 32bit boot images (as we do) and then add a 64bit boot image (which we needed, since the drivers for UCS firmware v2.0 only support Windows Server 2008 R2), you still only see the 32bit images. Why? Because apparently the client (in my case a UCS blade) isn’t reporting its architecture correctly in the PXE phase. Microsoft actually has a KB article for this. You only need to enable architecture discovery. ...
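Enabling architecture discovery boils down to a one-liner on the WDS server. This is from memory, so double-check it against the KB article:

```
wdsutil /Set-Server /ArchitectureDiscovery:Yes
```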

February 24, 2012 · 1 min · 108 words · christian

Microsoft Cluster on VMware and Devices

Well, once again the Microsoft Cluster on VMware bit my ass … As you might know, MSCS on VMware is a particular kind of pain; with each upgrade you end up with the same problems over and over again (SCSI reservations on the RDM LUNs being one, and the passive node not booting being the other). So I opened another support case with VMware, and they responded like this: Please see this kb entry: http://kb.vmware.com/kb/1016106 ...

February 11, 2012 · 2 min · 299 words · christian