Torsten Curdt’s weblog

Recursive file listing in java

It turns out that when you search the web for “recursive file java”, you only find horrible examples of how to implement directory traversal. It’s such a simple algorithm, but I feel obliged to provide the world with a better example. In fact, directory traversal can be hidden very elegantly by using anonymous classes. You just need to extend the following class:


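import java.io.File;
import java.io.IOException;

public class FileTraversal {
	// Walks f depth-first; a directory is reported before its contents.
	public final void traverse( final File f ) throws IOException {
		if (f.isDirectory()) {
			onDirectory(f);
			// Note: listFiles() can return null (e.g. on an I/O error), which is not handled here.
			final File[] children = f.listFiles();
			for( File child : children ) {
				traverse(child);
			}
			return;
		}
		onFile(f);
	}

	// Called for each directory encountered; override as needed.
	public void onDirectory( final File d ) {
	}

	// Called for each entry that is not a directory; override as needed.
	public void onFile( final File f ) {
	}
}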

and then you can locally override and implement the methods you need.


new FileTraversal() {
    public void onFile( final File f ) {
        System.out.println(f);
    }
}.traverse(new File("somedir"));

This cleanly separates out the actual traversal. So simple!

7 Responses to “Recursive file listing in java”

  1. leopard said, on 12. November 2007 at 22:25

    erm… this is calling onFile() on directories too!?

  2. vastheman said, on 13. November 2007 at 0:40

    I know I’m being evil, but a user could have a purpose for this or something similar:

    mkdir parent parent/child
    cd parent/child
    ln -s ../.. backlink

    Infinite recursion for you just by having a symlink to a parent directory! Then there’s mutual loops like this:

    mkdir parent parent/a parent/b
    cd parent/a
    ln -s ../b b
    cd ../b
    ln -s ../a a

    Another infinite recursion crash! There are two ways to sidestep this: first, you can put an arbitrary limit on the recursion depth (make traverse() take an additional depth parameter, pass depth - 1 when it recurses, and return immediately if depth is zero); second, you can keep a hash set, add each directory path to it as you see it, and skip directories you’ve already processed.

    Simple problems are never as simple as you think to begin with.
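
The two safeguards described in this comment could be combined roughly as follows. This is only a sketch: the class name GuardedFileTraversal and the maxDepth constructor parameter are made up for illustration and are not part of the original post. It also assumes that getCanonicalPath() resolves symbolic links, which the JDK documents for UNIX platforms, and it includes a null check on listFiles().

```java
import java.io.File;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant of FileTraversal with a depth limit and a visited set.
class GuardedFileTraversal {
    // Canonical paths of the directories that have already been entered.
    private final Set<String> visited = new HashSet<String>();
    private final int maxDepth;

    GuardedFileTraversal(final int maxDepth) {
        this.maxDepth = maxDepth;
    }

    public final void traverse(final File f) throws IOException {
        traverse(f, maxDepth);
    }

    private void traverse(final File f, final int depth) throws IOException {
        if (depth == 0) {
            return; // depth budget used up
        }
        if (f.isDirectory()) {
            // A symlink pointing back to an ancestor resolves to the ancestor's
            // canonical path, so add() returns false and the loop is cut short.
            if (!visited.add(f.getCanonicalPath())) {
                return;
            }
            onDirectory(f);
            final File[] children = f.listFiles();
            if (children != null) { // null if the directory could not be listed
                for (final File child : children) {
                    traverse(child, depth - 1);
                }
            }
            return;
        }
        onFile(f);
    }

    public void onDirectory(final File d) throws IOException {
    }

    public void onFile(final File f) throws IOException {
    }
}
```

It is used like the original: create an anonymous subclass, for example new GuardedFileTraversal(64) { ... }, override onFile(), and call traverse(new File("somedir")). Either safeguard alone would stop the recursion; the visited set additionally avoids processing the contents of the same directory more than once.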

  3. tcurdt said, on 13. November 2007 at 0:58

    leopard: No, you missed the return statement in the directory block.

  4. tcurdt said, on 13. November 2007 at 1:33

    Vas: hey, mate! :) …of course you are right. There is no such thing as “simple” …and this probably should be dealt with better. But believe it or not - this will not give those suspected crashes! Actually this turns out to be more interesting than I thought…

    In the first situation you get a single file:

    “parent/child/backlink/parent/child/backlink/parent/ ….very long line… /child/backlink”

    For your second example I get two files:
    “parent/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b” “parent/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a”

    …at least on OSX.

  5. Niall said, on 13. November 2007 at 5:57

    Alternatively use Commons IO’s DirectoryWalker - which provides an extensible object for directory traversal and uses FileFilters to narrow down the traversal results.

  6. vastheman said, on 15. November 2007 at 7:49

    Interesting. OSX appears to be limiting the depth somehow. I’m sure I’ve crashed someone’s directory traversal program by doing this at some point. Can’t remember the technology and the platform, though.

    BTW - if you put a file in one of the directories, it will process it multiple times, though.

  7. Reinhard Pötz said, on 22. November 2007 at 15:01

    If you want to delete directories in the onDirectory() method, you have to slightly alter the FileTraversal class by adding a null check:

    final File[] children = f.listFiles();
    if (children != null) {
        for (File child : children) {
            traverse(child);
        }
    }

    Otherwise you run into a NullPointerException, because listFiles() returns null for a directory that no longer exists.

    Otherwise thanks for posting this helpful piece of code!
