Filebucketing to the MAXXXXX

March 12, 2009 Technical, General

Every now and then we see an example of application failure so astounding it literally brings tears to our eyes. We have a client whose legacy application is unfortunately still running on an ancient version of Oracle WebLogic, and it must be maintained until the new, flashy .NET version of their site is complete.

We were alerted this morning to a problem with some of the WebLogic content – the pages were timing out. Diagnostics were fairly fruitless – packet captures showed nothing useful, and the logging from WebLogic left much to be desired. We started considering more outlandish possibilities, such as I/O load, recently applied updates and so on. Even rebooting was considered (given it is running on Windows).

The first clue of note was the open file list from the WebLogic processes – one example stood out:

C:weblogicstateSa0Vb1gRO1OkWqYN9kivIQT2SHGxC3riaE1z
L1YHX5QWgdkBB2PBpPPwuHDKp1a7I0l594sUkQ43+5335517
573874846253_-1062731519_6_8888_8888_7002_702_8888_

For your sanity, I have manually wrapped this Godzilla-like filename.

Perhaps you are familiar with file bucketing already, but if not: files are spread across subdirectories so that no single directory grows too large. Typically the directory structure will have a relatively sane scheme for locating files and only extend a few levels deep. What we saw in this instance was a completely new breed of monster. Admittedly the absolute path of this file is less than 200 characters out of a limit of more than 32,000, but the naming strategy and depth of the structure have us flummoxed.
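For comparison, here is a minimal sketch of what we would call a sane bucketing scheme. It is not what this application does (we have no idea what it does); it simply hashes a key and uses the first few characters of the digest as directory names. The function name bucket_path, the root C:\app\state, the session key and the choice of SHA-256 with two levels of two characters are all our own assumptions for illustration.

```python
import hashlib
from pathlib import PureWindowsPath


def bucket_path(root, key, levels=2, width=2):
    """Return a bucketed location for `key` under `root`.

    Hypothetical helper: the digest is split into `levels` directory
    names of `width` hex characters each, and the full digest is used
    as the file name. Nothing is created on disk.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # e.g. digest "ab12cd..." gives the bucket directories "ab" and "12"
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PureWindowsPath(root, *buckets, digest)


# Hypothetical root and key, purely for illustration.
path = bucket_path(r"C:\app\state", "session-12345")
print(path)
```

With two levels of two hex characters there are at most 256 × 256 = 65,536 bucket directories, no matter how many files are stored, and every file sits exactly two levels below the root.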

But this was only the tip of the proverbial iceberg. When we asked Windows to show us the properties of this state folder, it took over an hour to calculate the file and folder totals, and the result is impressive:

Web logic makes efficient use of the filesystem

Yes, you read that right – over 10 million nested directories. By this stage we had already moved the state directory out of the way, created a new one, and restarted WebLogic. It seemed happy and quite responsive after that. My suspicion is that someone developing this application at some point ran into a limitation with their filebucketing algorithm and resolved to solve the problem once and for all, evidently by making it possible to efficiently filebucket every file in the known universe.