patch2mail for SLES10

Well, there is this “nifty” tool called patch2mail, which basically converts the XML for the updates to a more readable format. But you’re screwed if you want to do the same on SLES10. Since it ain’t shipping with the zypper XML wrapper thing, you need to do it a bit differently. So I ended up writing a small (and yet, ugly) shell script to generate me a mail of my liking ...

August 8, 2014 · 2 min · 302 words · christian

MessPC Ethernetbox 2 and Nagios

As I talked to Tobi yesterday, we came to talk about our Ethernet Box thermometer. It’s a neat device, which works pretty much out of the box. Integrating it with Nagios is a bit of a bummer. That’s what the ~300 EUR box looks like. It’s basically a small black box with an RJ45 jack and four RJ11 jacks for attached external devices. The box itself only functions as a “management station” and doesn’t come with a sensor. Normally, you can attach up to four RJ11 sensors to it. But MessPC also has RJ11 port splitters, which enable you to attach up to eight RJ11 sensors to the MessPC. As you can see, the box has an RJ45 jack on one side, which you basically hook up to your network and then configure an IP address (or if you fancy DHCP for those things, that’s possible too). On the opposite side are the RJ11 jacks for the sensors. As you can see, we currently have four splitters attached to the box, enabling up to eight sensors to be measured. Once you have it up and running, you can look at the web interface and you’ll be able to see the state of the sensors right on the first page.
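To give an idea of what the Nagios side boils down to, here is a minimal sketch in Python of the threshold logic of a check plugin. It only assumes the standard Nagios plugin conventions (exit codes 0/1/2/3 and a status line followed by `|` and performance data). The function name `check_temperature` and the thresholds are made up for illustration, and how the reading is actually fetched from the box is deliberately left out.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_temperature(value, warn, crit):
    """Map one temperature reading (degrees Celsius) to a Nagios state.

    `value` is None when no reading could be obtained. Fetching the
    reading from the device is out of scope for this sketch.
    Returns (exit_code, status_line).
    """
    if value is None:
        return UNKNOWN, "TEMPERATURE UNKNOWN - no reading from sensor"

    if value >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif value >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"

    # Performance data after the pipe: label=value;warn;crit
    perfdata = f"temp={value:.1f};{warn:.1f};{crit:.1f}"
    return code, f"TEMPERATURE {label} - {value:.1f} C | {perfdata}"
```

A real plugin would print the status line and exit with the returned code, for example via `print(line)` followed by `sys.exit(code)`.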

August 8, 2014 · 3 min · 480 words · christian

Linux-HA and Tivoli Storage Manager

Well, since we received part of our shipment on Wednesday, I finally looked at how we’re gonna deploy our active/active Tivoli Storage Manager configuration. Right now, we have a single pSeries box hosting ~100 client nodes, which we’re looking to split in two (since we have two x366 for that purpose now). Now, as there ain’t no solution for this scenario yet (neither from International Business Machines nor from someone in the open source community), I sat down and started writing an OCF resource agent for dsmserv (that is the Tivoli Storage Manager server). ...

August 8, 2014 · 13 min · 2746 words · christian

VMware vCenter: is not connected

Well, today I once again had the case where a virtual machine (in my case a Virtual Machine Template) was kinda stuck. You couldn’t remove the template (as in the entry for “Remove from inventory” was grayed out) and you couldn’t re-add the Virtual Machine’s VMX from the datastore browser either. Simply putting the host into maintenance mode and rebooting it fixed the problem, though. Maybe there is a simpler solution for this, I just don’t know about it. ...

August 8, 2014 · 1 min · 182 words · christian

Tivoli Storage Manager Client and Microsoft Cluster Services

Well, I just had another look at our client scheduler services on our Microsoft Cluster. A while back we noticed that those scheduler services were going nuts after some time. Well, as it turns out, I can tell why. Microsoft Cluster Services have a feature called registry replication, which replicates a given key, if changed while the resource is online, to all connected cluster nodes. Now, we added the obvious registry key to the settings of our cluster resources for the scheduler services (SOFTWARE\IBM\ADSM\CurrentVersion\BackupClient\Nodes), assuming the scheduler service would use the same registry key to store its passwords. But it seems we were far off with that assumption. ...

August 8, 2014 · 1 min · 132 words · christian

IBM RDAC: Installing the driver for a (not yet) running version

Well, kernel updates on our Linux servers running IBM’s RDAC driver (developed by LSI) are a real pest, especially if you have to reboot the box two times in order to install the drivers/initrd correctly. So I sat down and looked at the Makefile. Turns out, it just needs four tweaks in order to work with a different kernel version (which you have to pass to make using environment variables). ...

August 8, 2014 · 2 min · 290 words · christian

Novell KMP: KMP'ing IBM's RDAC driver

Well, after yesterday’s lesson about getting the IBM RDAC to install for a not-yet-running kernel, I decided to take it a step further. Novell does have some documentation about KMPs, which is actually rather good, especially the guide written by Andreas Grünbacher. After a short tinkering, I actually got it working. I was kinda surprised at how easy it actually is. One problem I still have to deal with is modifying the %post to generate the mpp-initrd image. For now, the KMP only contains the default %post, which updates the modules.* stuff. ...

August 8, 2014 · 2 min · 318 words · christian

Novell KMP: Useable version of ibm-rdac-ds4000

After some more tinkering, a lot more looking at the macros in /usr/lib/rpm/rpm-suse-kernel-module-subpackage and /usr/lib/rpm/suse_macros, I think I finally have a usable RPM’ified version of IBM’s Multipathing driver ready for use. There is still one major annoyance left: each time you install a new ibm-rdac-ds4000-kmp RPM, you also need to reinstall the corresponding ibm-rdac-ds4000-initrd package, as the macros in /usr/lib/rpm don’t allow for custom %post or %postun. As mentioned before, I’m gonna send them to LSI/IBM for review, and maybe, MAYBE they are actually gonna make use of that. ...

August 8, 2014 · 2 min · 278 words · christian

Weird TS3500 problem

Well, today we had a rather weird problem with our TS3500. TSM running on AIX basically went bonko and spat out weird media sense errors, all stating that there is a hardware or media error of unknown nature:

ANR8943E Hardware or media error on library LIB3584 (OP=00006C03, CC=-1, KEY=04, ASC=44, ASCQ=00,
SENSE=70.00.04.00.00.00.00.46.00.00.00.00.44.00.00.00.00-
.00.40.82.00.00.00.40.00.00.02.00.48.01.A1.00.00.00.00.0-
0.06.1B.00.01.09.00.00.00.00.00.00.00.00.00.00.00.00.00.-
00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00-
.00.00.00.00.00., Description=An undetermined error has occurred).
Refer to Appendix C in the 'Messages' manual for recommended action.
ANR8381E LTO volume HG4480L4 could not be mounted in drive DR9 (/dev/rmt8).

After restarting the TSM server (as in the service, not the whole box) five times, which didn’t resolve squat, we decided to take a look at the TS3500 itself. We opened up the management interface and tried moving a tape into a drive. That didn’t work. Hrmmmmm. ...
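As a side note, the ANR8943E message is less opaque than it looks. Here is a rough sketch in Python that pulls KEY, ASC, ASCQ and the raw sense bytes out of such a message. The function name `parse_anr8943e` is made up, and the parsing rests on two assumptions drawn from the message above: the trailing hyphens are only line-wrap markers, and the sense data uses the SCSI fixed format (first byte 0x70), where the sense key sits in byte 2 and ASC/ASCQ in bytes 12 and 13.

```python
import re

# SCSI sense key names (subset of the standard table).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}

# The message from the post, with the wrapped lines joined by spaces.
MESSAGE = (
    "ANR8943E Hardware or media error on library LIB3584 (OP=00006C03, "
    "CC=-1, KEY=04, ASC=44, ASCQ=00, "
    "SENSE=70.00.04.00.00.00.00.46.00.00.00.00.44.00.00.00.00- "
    ".00.40.82.00.00.00.40.00.00.02.00.48.01.A1.00.00.00.00.0- "
    "0.06.1B.00.01.09.00.00.00.00.00.00.00.00.00.00.00.00.00.- "
    "00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00- "
    ".00.00.00.00.00., Description=An undetermined error has occurred)."
)


def parse_anr8943e(message):
    """Extract the SCSI sense information from an ANR8943E message."""
    # KEY, ASC and ASCQ are printed as two hex digits each.
    fields = {
        name: int(value, 16)
        for name, value in re.findall(
            r"\b(KEY|ASC|ASCQ)=([0-9A-Fa-f]{2})", message
        )
    }
    match = re.search(r"SENSE=([0-9A-Fa-f.\-\s]+?),", message)
    if match is None:
        raise ValueError("no SENSE field found")
    # Drop the wrap hyphens and whitespace, then split on the dots.
    raw = re.sub(r"[-\s]", "", match.group(1))
    sense = bytes(int(tok, 16) for tok in raw.split(".") if tok)
    key = fields.get("KEY")
    return {
        "key": key,
        "key_name": SENSE_KEYS.get(key, "unknown"),
        "asc": fields.get("ASC"),
        "ascq": fields.get("ASCQ"),
        "sense": sense,
    }


info = parse_anr8943e(MESSAGE)
```

For the message above, `info["key_name"]` comes out as `HARDWARE ERROR`, and the raw bytes agree with the printed fields (byte 12 is 0x44, byte 13 is 0x00). In the SCSI standard, ASC 44h with ASCQ 00h stands for an internal target failure, which points at the library itself rather than at TSM.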

August 8, 2014 · 2 min · 269 words · christian

Linux-HA and Tivoli Storage Manager (Finito!)

As I previously said, I was writing my own OCF resource agent for IBM’s Tivoli Storage Manager Server. And I just finished it yesterday evening (it took me about two hours to write this post). It only took me about four work days (that is roughly four hours each, which weren’t recorded in that subversion repository) plus most of this week at home (which is 10 hours a day) and about one hundred subversion revisions. The good part about it is that it actually just works :-D (I was amazed at how well, actually).

Now you’re gonna say, “But Christian, why didn’t you use the included init script and just fix it up, so it is actually compliant with the LSB standard?” The answer is rather simple: yeah, I could have done that, but you also know that wouldn’t have been fun. Life is all about learning, and learn something I did (even if I hit my head against the wall from time to time ;-) during those few days) …

There are still a few things I might want to add or change in the future (that is, maybe next week). I’d like to add support for monitor depth by querying the dsmserv instance via dsmadmc (if you read through the resource agent, I already use it for the shutdown/pre-shutdown stuff). I still have to properly test it in a pre-production environment (like Alan Robertson mentioned in his hour-and-a-half talk on Linux-HA 2.0 and on his slides, pages 100-102). And I’ll probably configure the IBM RSA to act as a STONITH device (shoot the other node in the head), just in case one of them ever gets stuck in a state where the box is still up but doesn’t react to any requests anymore.
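The monitor depth idea can be sketched roughly like this. This is not the actual resource agent, just an illustration in Python of how a shallow check and an optional deep check map to the standard OCF return codes (0 for success, 1 for a generic error, 7 for not running). The names `monitor`, `process_alive` and `deep_probe` are hypothetical. In a real agent the deep probe would be something like a dsmadmc query against the running instance, which is not shown here.

```python
# Standard OCF resource agent return codes used by the monitor action.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7


def monitor(process_alive, deep_probe=None):
    """Return an OCF exit code for the monitor action.

    process_alive: callable returning True if the server process exists
                   (the shallow check).
    deep_probe:    optional callable returning 0 if the server answers
                   an administrative query (hypothetical deep check).
    """
    # Shallow check: no process means the resource is cleanly stopped.
    if not process_alive():
        return OCF_NOT_RUNNING

    # Deep check: the process exists but does not answer, so report a
    # failure and let the cluster manager decide how to recover.
    if deep_probe is not None and deep_probe() != 0:
        return OCF_ERR_GENERIC

    return OCF_SUCCESS
```

Passing the checks in as callables keeps the decision logic testable without a running dsmserv instance.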

August 8, 2014 · 7 min · 1337 words · christian