Well, I just stumbled upon something: my Nagios at work had stopped working, so I went looking.
nagios3 ~ [0] > tail -f /var/log/nagios/nagios.log
[1238658394] Error: Unable to save status file: No space left on device
[1238658403] Error: Unable to save status file: No space left on device
[1238658413] Error: Unable to save status file: No space left on device
[1238658423] SERVICE ALERT: tsm1;POWER WARN;OK;SOFT;4;-u OK - 0
[1238658423] Error: Unable to save status file: No space left on device
[1238658433] SERVICE ALERT: tsm2;LOAD;WARNING;SOFT;1;WARNING - load average: 6.25, 5.72, 5.36
[1238658433] Error: Unable to save status file: No space left on device
[1238658443] Error: Unable to save status file: No space left on device
[1238658453] Error: Unable to save status file: No space left on device
[1238658463] Error: Unable
After that: zip, nada. Next thing to check is whether the device is really full, so on to df.
nagios3 ~ [130] > df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 3.5G 1.2G 2.1G 37% /
udev 506M 88K 506M 1% /dev
/dev/sdb1 7.9G 7.7G 0 100% /var
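(Side note: the Use% column is easy to pull out in a script, which would have warned me before things got this far. The following is only a minimal sketch: the function names `usage_pct` and `is_nearly_full` and the 90% default are my own assumptions, and it relies on `df -P`, the POSIX output format that keeps every filesystem on a single line.)

```shell
#!/bin/sh
# Print the Use% value (without the % sign) for one mount point.
# Reads df output on stdin; $1 is the mount point, e.g. /var.
usage_pct() {
    awk -v mnt="$1" 'NR > 1 && $6 == mnt { sub(/%/, "", $5); print $5 }'
}

# Succeed when the mount point is at or above the threshold.
# $1: mount point (must be the mount point itself, not a subdirectory)
# $2: threshold in percent (hypothetical default: 90)
is_nearly_full() {
    pct=$(df -P "$1" | usage_pct "$1")
    [ -n "$pct" ] && [ "$pct" -ge "${2:-90}" ]
}
```

Usage would be something like `is_nearly_full /var 90 && echo "/var is filling up"`. Mount points containing spaces are not handled.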
So /var really is completely full, and now we need to find out who's hogging the space. Since I had a suspicion (pnp4nagios), I went straight for /var/lib:
nagios3 lib [0] > du -sh *
16K CAM
1.1M YaST2
8.0K acpi
4.0K apache2
28K autoinstall
16K dhcpcd
4.0K empty
96K hardware
4.0K logrotate.status
8.0K misc
78M mysql
2.1M nagios
4.0K net-snmp
4.0K news
24K nfs
8.0K nobody
36K ntp
4.0K pam_devperm
824K php5
359M pnp4nagios
22M rpm
28K scpm
4.0K smpppd
4.0K sshd
4.0K support
8.0K suseRegister
4.0K uniconf
4.0K update-messages
4.0K wwwrun
33M zmd
14M zypp
That wasn’t it: pnp4nagios only takes up 359M. So on to the next place that’s suspicious most of the time, /var/log.
nagios3 log [0] > du -sh *
5.2G YaST2
4.0K acpid
1.4G apache2
28K boot.msg
28K boot.omsg
4.0K cups
4.0K dsmerror.log
148K dsmsched.log
4.0K faillog
4.0K krb5
12K lastlog
4.0K localmessages
16K mail
16K mail.info
198M messages
0 mysqld.log
14M nagios
0 ntp
4.0K pnp4nagios
4.0K sa
8.0K scpm
4.0K vmdesched.log
16K vmware-imc
4.0K vmware-tools-guestd
82M warn
348K wtmp
115M zmd-backend.log
24M zmd-messages.log
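With listings this long the big entries are easy to overlook, so sorting by size helps. A small sketch of how that could look; the function name `biggest` and the default of 10 entries are made up for illustration, and it uses kilobyte output (`-k`) so that a plain numeric sort works everywhere.

```shell
#!/bin/sh
# List the entries of a directory by disk usage, largest first.
# $1: directory to inspect (defaults to the current one)
# $2: number of entries to show (hypothetical default: 10)
biggest() {
    # -s: one total per entry, -k: sizes in KiB, -x: stay on one filesystem
    du -skx -- "${1:-.}"/* 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

Something like `biggest /var/log 5` would then put the largest entries at the top. Hidden files directly inside the directory are skipped, because the `*` glob does not match them.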
I was like “WTF? 5.2G for YaST2 logs?” when I first saw that du output. For now, I have a cron job emptying /var/log/YaST2 every 24 hours.
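For the curious, such a cleanup job could look roughly like the sketch below. This is not my exact crontab: the function name `empty_log_dir`, the script path and the 03:00 schedule are all assumptions for illustration.

```shell
#!/bin/sh
# Delete every regular file below a directory, keeping the directories.
# $1: directory to empty; refuses to run without an existing directory.
empty_log_dir() {
    [ -n "$1" ] && [ -d "$1" ] || return 1
    find "$1" -type f -exec rm -f {} +
}

# Hypothetical crontab entry for root, once a day at 03:00, assuming this
# file is saved as /usr/local/bin/empty-yast2-logs.sh and ends with the
# call `empty_log_dir /var/log/YaST2`:
# 0 3 * * * /usr/local/bin/empty-yast2-logs.sh
```

One caveat: deleting a file that a running process still holds open does not free the space until that process closes it.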