Friday, April 5, 2013

DU - Show Directories Over A Certain Size


Space seems to be something there's never enough of, and when you need to find out what's using it up, du is your friend. Getting useful output from it is half the battle though, and you'll often say to yourself "I only want to find directories over a certain size!" Let's say you only care about directories that are over 2 gigs - how do we filter the output from du to show that?

Here's my favorite method:


du --max-depth=5 --block-size=GiB | grep "^[2-9]" | sort -n

So what's going on here? 

First we're telling du to only scan no deeper than five levels of sub-directories - this will help speed things up. We can always increase this number later if we need to.

Next, we're telling it to show the results in gigabytes, which will be much more human-friendly for us to read.

After that, we pipe it into grep, where we use a simple regular expression to include only lines that start with a number from two to nine. This excludes things that are smaller than two gigs.

The final step is running it through sort with the -n (numeric) option, so we can see the results sorted in order by smallest to largest.

We'll run this on my local test machine so you can see what the results will look like. I've switched the block size to megabytes, because it's a very small test server - there's nothing even in the 2 gig range! We're using a folder depth of two as well, so we can have a nice compact example.


[root@localhost /]# du --max-depth=2 --block-size=MiB | grep "^[2-9]" | sort -n
2MiB ./etc/pki
2MiB ./lib/udev
2MiB ./var/www
3MiB ./lib64/security
3MiB ./lib/kbd
8MiB ./bin
8MiB ./usr/include
9MiB ./etc/gconf
25MiB ./lib64
27MiB ./boot
28MiB ./usr/sbin
33MiB ./lib/firmware
33MiB ./usr/libexec
35MiB ./etc
48MiB ./usr/src
65MiB ./var/lib
83MiB ./var/cache
99MiB ./lib/modules
304MiB ./var/log
316MiB ./usr/lib
355MiB ./usr/lib64
452MiB ./var
3836MiB .
[root@localhost /]# 



And that's all there is to it! Short, sweet and simple.

1 comment:

  1. Hi James. I was reading your site_test.sh code on GitHub and I'm trying to modify it for my specific purposes. Would you mind answering a question about my methodology?

    1) The site that I am trying to log into is http://www.americanmanuscripteditors.com/login.aspx

    2) I'm trying to parse out what data I need to provide (like you had --data "userName=YourUserNameHere", etc.). I'm confused about the name values
    for my username and password ("ctl00$itemUserStatus$txtPassword"?), and if I need to provide anything else besides a username and password ("login=Login", etc.)

    Any advice you could offer would be greatly appreciated.

    Thank you very much. -Kathy (hobbies4kk@gmail.com)

    ReplyDelete