
Torsten Curdt’s weblog

Recursive file listing in java

It turns out that when you search the web for “recursive file java” you only find horrible examples of how to implement directory traversal. It’s such a simple algorithm, but I feel obliged to provide a better example to the world. In fact, directory traversal can be hidden very elegantly by using anonymous classes. You just need to extend the following class:


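import java.io.File;
import java.io.IOException;

public class FileTraversal {
        // Walks f depth-first: onDirectory() is called for each directory,
        // onFile() for everything that is not a directory.
        public final void traverse( final File f ) throws IOException {
                if (f.isDirectory()) {
                        onDirectory(f);
                        final File[] children = f.listFiles();
                        for( File child : children ) {
                                traverse(child);
                        }
                        return;
                }
                onFile(f);
        }

        // Called for every directory before its children are visited; override as needed.
        public void onDirectory( final File d ) {
        }

        // Called for every non-directory entry; override as needed.
        public void onFile( final File f ) {
        }
}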

and then you can locally override and implement the methods you need:


new FileTraversal() {
    public void onFile( final File f ) {
        System.out.println(f);
    }
}.traverse(new File("somedir"));

This cleanly separates out the actual traversal. So simple!
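
For what it's worth, here is a minimal end-to-end sketch of the same pattern that collects the files into a list instead of printing them. The names `CollectingDemo` and `collectFiles` are made up for illustration, and the null guard around `listFiles()` is an addition that the class above does not have.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Copy of the traversal class so that this sketch compiles on its own.
class FileTraversal {
    public final void traverse(final File f) throws IOException {
        if (f.isDirectory()) {
            onDirectory(f);
            final File[] children = f.listFiles();
            // Assumption: silently skip directories that cannot be listed.
            if (children != null) {
                for (File child : children) {
                    traverse(child);
                }
            }
            return;
        }
        onFile(f);
    }

    public void onDirectory(final File d) {
    }

    public void onFile(final File f) {
    }
}

class CollectingDemo {
    // Hypothetical helper: returns every non-directory entry below root.
    static List<File> collectFiles(final File root) throws IOException {
        final List<File> found = new ArrayList<>();
        new FileTraversal() {
            @Override
            public void onFile(final File f) {
                found.add(f); // 'found' is final, so the anonymous class may use it
            }
        }.traverse(root);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway tree: root/a.txt and root/sub/b.txt
        final File root = Files.createTempDirectory("traversal-demo").toFile();
        final File sub = new File(root, "sub");
        sub.mkdir();
        new File(root, "a.txt").createNewFile();
        new File(sub, "b.txt").createNewFile();

        for (File f : collectFiles(root)) {
            System.out.println(f);
        }
    }
}
```

The list reference itself never changes, only its contents do, which is why it can be declared final and still be filled from inside the anonymous class.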

  • Astabi
    Thank You!

    Just what I was looking for.
  • SF
    Very nice solution. Stylistically it is very clean. However, if you need to access any variable outside of your traversal, it becomes much less elegant.

    For example,

    Boolean someUsefulParameter;

    new FileTraversal() {
        public void onFile( final File f ) {
            if( someUsefulParameter )
                System.out.println( f.toString() );
        }
    }.traverse(new File("somefile"));

    Doesn't work.

    You can only use final variables, which can be unfortunate/unwanted/ugly.

    Thanks for the post though!
  • SK
    Hello,

    The solution looks cool, but it may throw a "Too many files open" exception if the directory tree is very large or very deep.

    A non-recursive solution would be advisable for serious projects.

    Thank you,
  • thnx
    thnx, thnx, thnx...
    u r so right with: "...you only find horrible examples on how to implement directory traversal..."
    so thnx again, its so easy!
  • If you want to delete directories in the onDirectory() method, you have to slightly alter the FileTraversal class by adding a null check:

    final File[] children = f.listFiles();
    if (children != null) {
        for (File child : children) {
            traverse(child);
        }
    }

    Otherwise you run into a NullPointerException.

    Otherwise thanks for posting this helpful piece of code!
  • Interesting. OSX appears to be limiting the depth somehow. I'm sure I've crashed someone's directory traversal program by doing this at some point. Can't remember the technology and the platform, though.

    BTW - if you put a file in one of the directories, it will process it multiple times, though.
  • Niall
    Alternatively use Commons IO's DirectoryWalker - which provides an extensible object for directory traversal and uses FileFilters to narrow down the traversal results.
  • Vas: hey, mate! :) ...of course you are right. There is no such thing as "simple" ...and this probably should be dealt with better. But believe it or not - this will not give those suspected crashes! Actually this turns out to be more interesting than I thought...

    In the first situation you get a single file:

    "parent/child/backlink/parent/child/backlink/parent/ ....very long line... /child/backlink"

    For your second example I get two files:
    "parent/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b" "parent/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a/b/a"

    ...at least on OSX.
  • leopard: No, you missed the return statement in the directory block.
  • I know I'm being evil, but a user could have a purpose for this or something similar:

    mkdir parent parent/child
    cd parent/child
    ln -s ../.. backlink

    Infinite recursion for you just by having a symlink to a parent directory! Then there's mutual loops like this:

    mkdir parent parent/a parent/b
    cd parent/a
    ln -s ../b b
    cd ../b
    ln -s ../a a

    Another infinite recursion crash! There are two ways to sidestep this: first of all, you can put an arbitrary limit on the recursion depth (make traverse() take an additional depth parameter, pass depth - 1 when it recurses and return immediately if depth is zero); the second is to keep a hash set, add each directory path to it as you see it, and skip directories you've already processed.

    Simple problems are never as simple as you think to begin with.
  • leopard
    erm... this is calling onFile() on directories too!?
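
Several of the suggestions in the comments above (a non-recursive walk, a depth limit, and a hash set of already-visited directories) can be combined into one rough sketch. Everything here is illustrative: `SafeTraversal`, `Entry` and `maxDepth` are hypothetical names, the depth counts up from zero rather than down, and the loop detection assumes that `File.getCanonicalPath()` resolves symbolic links on the platform in use.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

class SafeTraversal {
    // Pairs an entry with its distance from the starting point.
    private static final class Entry {
        final File file;
        final int depth;

        Entry(final File file, final int depth) {
            this.file = file;
            this.depth = depth;
        }
    }

    private final int maxDepth;

    // maxDepth = 0 reports the start directory only; 1 also reports its children; and so on.
    SafeTraversal(final int maxDepth) {
        this.maxDepth = maxDepth;
    }

    public final void traverse(final File root) throws IOException {
        final Set<String> seen = new HashSet<>();          // canonical paths of visited directories
        final Deque<Entry> pending = new ArrayDeque<>();   // explicit stack instead of recursion
        pending.push(new Entry(root, 0));

        while (!pending.isEmpty()) {
            final Entry e = pending.pop();
            if (!e.file.isDirectory()) {
                onFile(e.file);
                continue;
            }
            // A symlink back to an ancestor resolves to a path that is already in the set.
            if (!seen.add(e.file.getCanonicalPath())) {
                continue;
            }
            onDirectory(e.file);
            if (e.depth >= maxDepth) {
                continue; // depth limit reached: report the directory but do not descend
            }
            final File[] children = e.file.listFiles();
            if (children == null) {
                continue; // directory could not be listed
            }
            for (File child : children) {
                pending.push(new Entry(child, e.depth + 1));
            }
        }
    }

    public void onDirectory(final File d) {
    }

    public void onFile(final File f) {
    }

    public static void main(String[] args) throws IOException {
        final File start = new File(args.length > 0 ? args[0] : ".");
        new SafeTraversal(32) {
            @Override
            public void onFile(final File f) {
                System.out.println(f);
            }
        }.traverse(start);
    }
}
```

Because of the explicit stack, entries are visited in a different order than with the recursive version, so anything that relies on a particular ordering would need extra care.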