As I wrote recently, we received a new BladeCenter a few weeks back. Now that we are slowly taking it into service, I wanted to monitor the backplane utilization as well as the CPU utilization of its Cisco Catalyst 3012 network switches.

The first mistake I made was to trust Cisco’s guide on how to get the utilization from the device using SNMP. It lists some OIDs, which I tried with snmpwalk and got a result from.

snmpwalk -v1 -c public -O n 10.0.0.35 .1.3.6.1.4.1.9.5.1.1.8
.1.3.6.1.4.1.9.5.1.1.8.0 = INTEGER: 0

When I tried retrieving the same SNMP data with the check_snmp plugin, though, I got an error:

/usr/lib/nagios/plugins/check_snmp -H 10.0.0.35 -C public \
                                   .1.3.6.1.4.1.9.5.1.1.8
SNMP problem - No data received from host
CMD: /usr/bin/snmpget -t 1 -r 5 -m '' -v 1 [authpriv] 10.0.0.35:161

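Those of you who read the excerpts carefully will notice the difference between the OID that snmpwalk returned and the one I passed to check_snmp: the trailing .0. snmpwalk traverses the subtree below the OID it is given, so it finds the instance at .0 on its own, whereas check_snmp runs snmpget, which needs the exact instance OID.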

The point being, the OIDs Cisco gives in its design tech notes are either old or just not accurate: they lack the .0 instance suffix that snmpget needs. After appending .0 to each OID given by Cisco, check_snmp is all hunky-dory and integrated into Nagios.
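
The fix is easy to script. The following is a minimal sketch, assuming the OIDs in question are all scalar objects (which have exactly one instance, .0). `scalar_instance` is a made-up helper name, not part of Net-SNMP or the Nagios plugins.

```shell
#!/bin/sh
# Hypothetical helper: turn a scalar object OID into its instance OID by
# appending ".0". Assumes the argument is a scalar object, not a table column.
scalar_instance() {
  case "$1" in
    *.0) printf '%s\n' "$1" ;;    # ends in .0 already, assumed to be an instance OID
    *)   printf '%s.0\n' "$1" ;;  # append the scalar instance suffix
  esac
}

# Example with the backplane utilization OID from the snmpwalk above:
# snmpget -v1 -c public -O n 10.0.0.35 "$(scalar_instance .1.3.6.1.4.1.9.5.1.1.8)"
```

The `*.0` shortcut is only a heuristic; an object whose own OID happens to end in .0 would be left without its instance suffix.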

As usual, the Nagios definitions follow, for those interested.

define host {
  use                   generic-network
  host_name             c3012-1
  alias                 c3012-1.home.barfoo.org
  address               10.0.0.35
  hostgroups            network
}

define service {
  use                   generic-service
  host_name             c3012-1
  service_description   SYS: Backplane utilization
  check_command         check_snmpv1_public!.1.3.6.1.4.1.9.5.1.1.8.0!60!80!%!"Backplane utilization:"
  action_url            /pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
  notes                 View PNP RRD graph
}

define service {
  use                   generic-service
  host_name             c3012-1
  service_description   SYS: CPU utilization - 5sec
  check_command         check_snmpv1_public!.1.3.6.1.4.1.9.2.1.56.0!130!160!%!"CPU utilization (5sec):"
  action_url            /pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
  notes                 View PNP RRD graph
}

define service {
  use                   generic-service
  host_name             c3012-1
  service_description   SYS: CPU utilization - 1min
  check_command         check_snmpv1_public!.1.3.6.1.4.1.9.2.1.57.0!80!95!%!"CPU utilization (1min):"
  action_url            /pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
  notes                 View PNP RRD graph
}

define service {
  use                   generic-service
  host_name             c3012-1
  service_description   SYS: CPU utilization - 5min
  check_command         check_snmpv1_public!.1.3.6.1.4.1.9.2.1.58.0!35!60!%!"CPU utilization (5min):"
  action_url            /pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$
  notes                 View PNP RRD graph
}
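
The check_snmpv1_public command definition itself is not shown here. As a rough sketch, assuming its arguments map to OID, warning threshold, critical threshold, unit and label in that order, a service check would correspond to a plugin call along the lines printed below. The mapping is a guess from the argument order, not the actual definition, and `snmpv1_public_cmd` is a made-up name.

```shell
#!/bin/sh
# Sketch only: prints the check_snmp call that one service check might expand to.
# Assumed argument order (not the real command definition):
#   host, OID, warning, critical, unit, label
PLUGIN=/usr/lib/nagios/plugins/check_snmp

snmpv1_public_cmd() {
  printf '%s -H %s -P 1 -C public -o %s -w %s -c %s -u %s -l "%s"\n' \
    "$PLUGIN" "$1" "$2" "$3" "$4" "$5" "$6"
}

# Backplane utilization service from the definitions above.
snmpv1_public_cmd 10.0.0.35 .1.3.6.1.4.1.9.5.1.1.8.0 60 80 '%' "Backplane utilization:"
```

Printing the command instead of running it makes it easy to compare against the CMD line that check_snmp reports when something goes wrong.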

For now, the pnp4nagios graphing for the backplane utilization isn’t working (don’t ask me why). Also, it might be a good idea to combine the CPU utilization checks into one, so you’d get a single graph out of it.
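
One way to combine the three CPU checks could look like the sketch below. It assumes the installed check_snmp accepts comma-separated lists for -o, -w and -c, as the stock plugin's help text describes; it is untested against this switch. `cpu_check_cmd` is a made-up name, and the thresholds are simply copied from the separate service definitions above.

```shell
#!/bin/sh
# Sketch: build a single check_snmp call that queries all three CPU OIDs.
# Order is 5sec, 1min, 5min; thresholds are listed in the same order.
OIDS=.1.3.6.1.4.1.9.2.1.56.0,.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0
WARN=130,80,35
CRIT=160,95,60

cpu_check_cmd() {
  # $1 = host address
  printf '%s -H %s -P 1 -C public -o %s -w %s -c %s -l "%s"\n' \
    /usr/lib/nagios/plugins/check_snmp "$1" "$OIDS" "$WARN" "$CRIT" "CPU utilization:"
}

cpu_check_cmd 10.0.0.35
```

Whether pnp4nagios then draws the three values into one graph depends on its template, which is not covered here.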