How to find growing files on Linux
This comes in handy when you are logged in to a machine you do not know much about, and you can see that the disk space is running out. I think this scenario is pretty common. I have seen it caused simply by logs, but also by buggy applications producing tons of data.
Specific files
Create a temporary file, and use the find command to find files newer than that file.
Create an empty file:
$ touch newer_than_this_file
Look for files on the whole machine ("/") that were modified more recently ("-newer") than the file you just created ("newer_than_this_file"). Skip files in /proc/ ("-not -path "/proc/*""). Run the ls -lh command on the files found:
$ find / -newer newer_than_this_file -not -path "/proc/*" -exec ls -lh {} \;
We can also specify a minimum size if we only care about larger files (larger than 1M in this example):
$ find / -newer newer_than_this_file -size +1M -not -path "/proc/*" -exec ls -lh {} \;
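The -newer approach shows which files changed, but not by how much. A minimal sketch of one way to measure growth is to record file sizes twice and compare them. The function name growing_files and its defaults are hypothetical, and it assumes GNU find (for -printf) and file names without tabs or newlines:

```shell
# Hypothetical helper: list files under DIR that grew during an interval.
# Usage: growing_files DIR [SECONDS]
growing_files() (
    dir=${1:-/var/log}
    interval=${2:-15}
    before=$(mktemp)
    after=$(mktemp)

    # Record "size<TAB>path" for every regular file. -xdev stays on one
    # file system, which also keeps find out of /proc when DIR is /.
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null > "$before"
    sleep "$interval"
    find "$dir" -xdev -type f -printf '%s\t%p\n' 2>/dev/null > "$after"

    # Load the first snapshot, then print "growth-in-bytes<TAB>path" for
    # files that got bigger. Files created in between are not reported.
    awk -F '\t' '
        NR == FNR { size[$2] = $1; next }
        ($2 in size) && ($1 + 0 > size[$2] + 0) {
            printf "%d\t%s\n", $1 - size[$2], $2
        }
    ' "$before" "$after" | sort -rn

    rm -f "$before" "$after"
)
```

Calling growing_files /var/log 30 would print the files that grew in those 30 seconds, with the fastest-growing one first.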
Open files
Use the lsof (list open files) command to figure out what files your machine is using right now:
$ lsof
To make things easier I have made this one-liner, which does a few things:
$ lsof / > lsof_1.txt; sleep 15; lsof / > lsof_2.txt; sdiff -w250 lsof_1.txt lsof_2.txt > lsof_difference.txt; cat lsof_difference.txt | egrep '\||<|>'
- It calls lsof and writes the list of open files to a file called lsof_1.txt. When you call lsof with /, it only lists open files on the file system mounted at /.
- Then it sleeps (waits) for 15 seconds
- Then it calls lsof again and writes the list of open files to a new file called lsof_2.txt, again restricted to the file system mounted at /.
- Then we call sdiff to make a side-by-side difference of those two files, and redirect stdout to a file called lsof_difference.txt.
- Finally we egrep (regular expression grep) for the symbols "|", "<", or ">".
Then we get all differences between the two snapshots, taken 15 seconds apart, printed out.
sdiff symbols explained briefly:
< - means that the line only exists in file 1
> - means that the line only exists in file 2
| - means that the lines from file 1 and file 2 are different
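The same snapshot-and-compare idea can be wrapped in a small function that cleans up after itself. This is only a sketch: the name watch_changes is hypothetical, and it swaps sdiff for plain diff, which marks lines only in the first run with "<" and lines only in the second run with ">" (a changed line shows up as one of each):

```shell
# Hypothetical helper: run a command twice, INTERVAL seconds apart,
# and print only the lines that differ between the two runs.
# Usage: watch_changes INTERVAL COMMAND [ARGS...]
watch_changes() (
    interval=$1
    shift
    first=$(mktemp)
    second=$(mktemp)

    "$@" > "$first" 2>/dev/null
    sleep "$interval"
    "$@" > "$second" 2>/dev/null

    # Keep only diff's "<" and ">" lines; drop the "1a2"-style headers.
    diff "$first" "$second" | grep -E '^[<>]' || true

    rm -f "$first" "$second"
)
```

With this in place, watch_changes 15 lsof / would print roughly what the one-liner above prints, without leaving the lsof_*.txt files behind.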
What processes
iotop (top-like diagnostics for I/O) is great for seeing which processes/services/applications are reading and writing the most data. It usually needs root privileges. Just use it like the following:
$ iotop
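If iotop is not installed, the kernel's per-process I/O counters give a rough picture. The sketch below is not how iotop itself works, just an alternative under some assumptions: the function name top_writers is hypothetical, the kernel must expose /proc/<pid>/io, and without root you only see your own processes. The numbers are totals since each process started, not a current rate:

```shell
# Hypothetical helper: rank processes by bytes written to storage,
# using the write_bytes counter in PROC_ROOT/<pid>/io.
# Usage: top_writers [PROC_ROOT]
top_writers() (
    proc_root=${1:-/proc}
    for io in "$proc_root"/[0-9]*/io; do
        # Skip processes that have exited or that we may not read.
        [ -r "$io" ] || continue
        pid_dir=${io%/io}
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$io" 2>/dev/null)
        name=$(cat "$pid_dir/comm" 2>/dev/null)
        if [ -n "$bytes" ]; then
            # Output: bytes written, PID, command name.
            printf '%s\t%s\t%s\n' "$bytes" "${pid_dir##*/}" "$name"
        fi
    done | sort -rn | head -n 10
)
```

Running top_writers twice a few seconds apart and comparing the numbers shows which processes are writing right now.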
Go for a specific folder
Here the du command is very handy. Go to, e.g., the logs folder and check the size of each directory:
$ du -h /var/log/
78M /var/log/apache2
24K /var/log/redis
81M /var/log/munin
28K /var/log/mongodb
4.0K /var/log/iptraf
4.0K /var/log/unattended-upgrades
12K /var/log/fsck
8.0K /var/log/dbconfig-common
4.0K /var/log/mysql
4.0K /var/log/varnish
108K /var/log/proftpd
4.0K /var/log/atop
12K /var/log/ajenti
4.0K /var/log/samba
4.0K /var/log/sysstat
176K /var/log/apt
4.0K /var/log/puppet
4.0K /var/log/news
252K /var/log/nginx
233M /var/log/
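With many directories the plain du output is hard to scan. A small sketch that sorts it so the biggest directories come first is shown here. The function name largest_dirs is hypothetical, and it assumes a du that supports -d and a sort that supports -h (both true for current GNU coreutils):

```shell
# Hypothetical helper: show DIR and its largest immediate subdirectories.
# Usage: largest_dirs [DIR] [COUNT]
largest_dirs() (
    dir=${1:-/var/log}
    count=${2:-10}
    # -x stays on one file system; -d 1 reports only direct children
    # plus the total for DIR itself; sort -rh orders human-readable
    # sizes from largest to smallest.
    du -x -h -d 1 "$dir" 2>/dev/null | sort -rh | head -n "$count"
)
```

For example, largest_dirs /var/log 5 would list the total for /var/log and its biggest subdirectories.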
Then you can simply use ls on the directory you find evil (-lathr are good flags for listing: long listing format, hidden files, sort by modification time, human-readable sizes, reverse order so the most recently modified files come last):
$ ls -lathr
Good luck hunting them down.