A few days ago a colleague came to me with a strange problem. His server could not write to the /var partition: every attempt failed with a "No space left on device" error. Of course he checked the df output to see whether the partition was really full. This is what he got:
# df -h /var
/dev/mapper/VG_LOCAL_STORAGE-LV_VAR  504M  309M  171M  65% /var
# touch /var/test
No space left on device
So what is happening here? The first thing I noticed was the relatively small size of the partition, so quotas and inodes were among the first things that came to mind. Inodes turned out to be the answer: they were all used up, and since every new file needs at least one inode, no file could be created even though free blocks remained.
Now that we know the problem, we can search for the root cause. First we need to find the directory that contains all these files. A quick way to do this is the following script, which prints the number of files and directories below each subdirectory of /var:
# Iterate over absolute paths so the loop works from any working directory
for DIR in /var/*/
do
    echo -n "$DIR "
    find "$DIR" | wc -l
done
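The inode check and the per-directory count described above can also be wrapped into a small helper. This is only a sketch: the function name count_per_subdir is made up for illustration, and it assumes a find that supports -xdev (GNU and BSD find both do).

```shell
# Print "<directory> <entry count>" for each immediate subdirectory of $1.
# count_per_subdir is a hypothetical helper name, not an existing tool.
count_per_subdir() {
    base=${1:-/var}
    for dir in "$base"/*/; do
        # Skip the literal pattern if the glob matched nothing
        [ -d "$dir" ] || continue
        # -xdev keeps find on the filesystem of the starting directory,
        # so other mount points below it are not counted
        count=$(find "$dir" -xdev | wc -l | tr -d ' ')
        printf '%s %s\n' "$dir" "$count"
    done
}

# Example usage (run as root so find can read every directory):
#   df -i /var                            # IUse% at 100% means no inodes are left
#   count_per_subdir /var | sort -k2 -n   # directory with the most entries comes last
```

Each count includes the directory itself, so an empty subdirectory is reported as 1.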
In the end we found that pacemaker stores every cluster transition in a separate file. These files are located in /var/lib/pengine (pacemaker) or /var/lib/heartbeat/pengine (heartbeat2). Pacemaker never deletes any of them, so over time they pile up until the partition runs out of inodes. If your /var partition is multiple gigabytes in size, you will usually never notice, but I think it is better to prevent this from ever happening.
Both pacemaker and heartbeat2 have options that set the maximum number of files kept as history. I think 1000 files per series is a reasonable amount for debugging possible problems. Before setting this you will have to delete all previously created files manually, but from then on Pacemaker will keep no more than 1000 files each for the error, warning, and input series of pengine operations. The limits can be set from the crm shell or with crm_attribute:
crm configure
crm(live)configure# property pe-error-series-max=1000 pe-input-series-max=1000 pe-warn-series-max=1000
crm(live)configure# commit
crm_attribute -t crm_config -n pe-error-series-max -v 1000
crm_attribute -t crm_config -n pe-warn-series-max -v 1000
crm_attribute -t crm_config -n pe-input-series-max -v 1000
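For the manual cleanup mentioned above, something like the following might do. It is a sketch under assumptions: prune_oldest is a hypothetical helper, it keeps the newest files instead of deleting everything, and it assumes the file names contain no newlines. Check what is actually in the directory before running anything like it.

```shell
# Keep only the newest $2 regular files in directory $1 and delete the rest.
# prune_oldest is a hypothetical helper name, not part of pacemaker.
prune_oldest() {
    dir=$1
    keep=$2
    # ls -t lists newest first; tail -n +N starts printing at line N,
    # so everything after the first $keep entries is passed on
    ls -t "$dir" | tail -n +"$((keep + 1))" | while IFS= read -r name; do
        if [ -f "$dir/$name" ]; then
            rm -f -- "$dir/$name"
        fi
    done
}

# Example usage, with the path taken from the text above:
#   prune_oldest /var/lib/pengine 1000
```

The pipeline creates no temporary files, so it still works on a partition that has no free inodes left.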