Snmp | BAFM

Nagios: Integrating Cisco switches

Well, as I wrote recently, we received a new BladeCenter a few weeks back. Now, as we slowly take it into service I was interested in watching the utilization of the back planes as well as the CPU utilization of the Cisco Catalyst 3012 network switches. The first mistake I made, was to trust Cisco with their guide about how to get the utilization from the device using SNMP. They stated some OID’s, which I tried with snmpwalk and got a result from. 1 2 snmpwalk -v1 -c public -O n 10.0.0.35 .1.3.6.1.4.1.9.5.1.1.8 .1.3.6.1.4.1.9.5.1.1.8.0 = INTEGER: 0 Now, as I tried retrieving the SNMP data by means of the check_snmp plugin, I got some flaky results: 1 2 3 4 /usr/lib/nagios/plugins/check_snmp -H 10.0.0.35 -C public .1.3.6.1.4.1.9.5.1.1.8 SNMP problem - No data received from host CMD: /usr/bin/snmpget -t 1 -r 5 -m '' -v 1 [authpriv] 10.0.0.35:161 Those of you, who read the excerpts carefully will notice the difference between snmpwalk and the OID I passed on to check_snmp. The point being, the OID’s Cisco gave in their Design tech notes are either old, or just not accurate at all. After passing on the .0 to each value given by Cisco, the check_snmp is all honky dory and integrated into Nagios. As usual, the Nagios definitions are further down, for those interested.

Monitoring Brocade FC switches with SNMP/Nagios

I looked into the mess a bit more, and as it turns out, the weird crap I was talking about only happens if you have a port with LossofSynchronization, LossofSignal or LinkFailures value with the base of ten (i.e. 10, 101 or 10.000). Additionally, the OID’s for those three failure elements seem to be dependent on the firmware version, as with 6.3.x they appear as different OIDs. So I may need to introduce another command-line switch, which selects the firmware version and depending on that, the OID. ...

Monitoring Brocade FC switches with Nagios

The last four days I spent looking for ways on monitoring a Brocade Fibrechannel switch (in my case IBM 2145 B32/F40). The first thing I came up with, is using SNMP. As it was already configured for the previous monitoring with Munin, getting information should be quite easy. After looking through Google for a bit, there is already one script that worked for me. Only trouble I had with that script, is that it crams every single port into one result. As I wanted something, that a) could watch a single port and b) return performance data, I went ahead an used the script to do a basic rewrite. But after a short while, I grew antsy and started writing a script from scratch, using the OIDs I got from that script and a Cacti template. ...

Nagios: SNMP OID's for IBM's RSA II adapter

Well, after some poking around I finally found some OID’s for the RSA’s (only through these two links: check_rsa_fan and check_rsa_temp). For Nagios, I dismissed the fans, since the fan speed is only passed on in percent values. So I only added this: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 define hostgroup{ hostgroup_name rsa-snmp alias Remote Supervisor Adapter (allowing SNMP connections) } define service{ use generic-perfdata check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.1.1!45!60!°C!Temperature CPU0! hostgroup_name rsa-snmp service_description TEMP CPU0 } define service{ use generic-perfdata check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.2.1!45!60!°C!Temperature CPU1! hostgroup_name rsa-snmp service_description TEMP CPU1 } define service{ use generic-perfdata check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.5.1.0!29!35!°C!Temperature Ambient! hostgroup_name rsa-snmp service_description TEMP AMBIENT } Oh, and if anyone else is curious like me, here’s the list with the OID’s, courtesy of Gerhard Gschlad and Leonardo Calamai. ...

Monitoring the IBM BladeCenter chassis with Nagios

Today I ended up working out the details on what we want to monitor regarding our BladeCenter. The most interesting details (for us that is) are these: Fan speeds for Chassis Cooling/Power Module Cooling Bay(s) Temperature Power Domain utilization It wasn’t * that* hard to implement. Only trouble(s) I ran into, were ( 1) IBM did a real shitty job with the MIB’s. If you look closely into the mmblade.mib, you’re gonna notice, that not a single OID is specified for the events. ( 2) As the MIB’s weren’t documented anywhere, I had to look them up via snmpwalk (which I had never used before). So as a reminder (to myself), here’s how it is done: 1 snmpwalk -v1 -c public -O n 10.0.0.35 .1.3.6.1.4.1.2.3.51.2.2 This will get you a list, with a lot of output (5154 lines to be exact). Lucky me, the web interface of the management module/ssh interface is rather verbose, so all you need to do is compare those values with what you are looking for. So for myself (and anyone interested) read ahead for the list of checks we are currently running on the management module.