How to Detect and Prevent Directory Traversal Hacks
Directory traversal is a basic weakness, but it can expose interesting, sometimes sensitive, information about a web system and make that system easier to attack. The attack involves browsing a site for clues about the server’s directory structure and for sensitive files that were placed there, intentionally or not.
Perform the following tests to determine information about your website’s directory structure.
A spider program, such as the free HTTrack Website Copier, can crawl your site to look for every publicly accessible file. To use HTTrack, load it, give your project a name, and tell it which website(s) to mirror. After a few minutes, or possibly hours, you’ll have everything that’s publicly accessible on the site stored on your local drive in C:\My Web Sites.
Complicated sites often reveal more than they should, including old data files and even application scripts and source code.
When performing web security assessments, you will almost always find .zip or .rar files on web servers. Sometimes they contain junk, but they often hold sensitive information that shouldn’t be publicly accessible.
Look at the output of your crawling program to see what files are available. Regular HTML and PDF files are probably okay because they’re most likely needed for normal web usage. But it wouldn’t hurt to open each file to make sure it belongs there and doesn’t contain sensitive information you don’t want to share with the world.
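Reviewing a large mirror by hand is tedious, so a first pass can be automated. The following is a minimal sketch, assuming the mirror is an ordinary folder on disk. The function name find_suspect_files and the extension list are illustrative choices, not part of HTTrack.

```python
from pathlib import Path

# Extensions that often indicate leftovers rather than live content.
# This list is an illustrative assumption; tune it for your own site.
SUSPECT_EXTENSIONS = {
    ".zip", ".rar", ".bak", ".old", ".sql", ".db", ".dbf", ".log", ".inc",
}


def find_suspect_files(mirror_root):
    """Return sorted relative paths under mirror_root with a suspect extension."""
    root = Path(mirror_root)
    hits = []
    for path in root.rglob("*"):
        # Compare case-insensitively so SITE.ZIP is caught as well as site.zip.
        if path.is_file() and path.suffix.lower() in SUSPECT_EXTENSIONS:
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling find_suspect_files on the folder that holds your mirrored project returns a short list to inspect first. It does not replace opening the remaining files, because a harmless-looking HTML or PDF file can still contain sensitive data.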
Google can also be used for directory traversal. In fact, Google’s advanced queries are so powerful that you can use them to root out sensitive information, critical web server files and directories, credit card numbers, webcams — basically anything that Google has discovered on your site — without having to mirror your site and sift through everything manually. It’s already sitting there in Google’s cache waiting to be viewed.
The following are a couple of advanced Google queries that you can enter directly into the Google search field:
site:hostname keywords — This query searches for any keyword you list, such as SSN, confidential, credit card, and so on. An example would be:
filetype:file-extension site:hostname — This query searches for specific file types on a specific website, such as doc, pdf, db, dbf, zip, and more. These file types might contain sensitive information. An example would be:
Other advanced Google operators include the following:
allintitle: searches for keywords in the title of a web page.
inurl: searches for keywords in the URL of a web page.
related: finds pages similar to a given web page.
link: shows other sites that link to a given web page.
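If you run these checks regularly, it can help to generate the query strings rather than type them. The sketch below builds the two query forms described earlier. The helper names and the hostname www.example.com are placeholders chosen for illustration, and the output is meant to be pasted into the Google search field by hand.

```python
def site_keyword_query(hostname, *keywords):
    """Build a `site:hostname keywords` query string."""
    # Quote multi-word keywords so they are searched as a phrase.
    terms = " ".join(f'"{k}"' if " " in k else k for k in keywords)
    return f"site:{hostname} {terms}".strip()


def filetype_query(hostname, extension):
    """Build a `filetype:file-extension site:hostname` query string."""
    # Accept either "pdf" or ".pdf".
    return f"filetype:{extension.lstrip('.')} site:{hostname}"


if __name__ == "__main__":
    host = "www.example.com"  # placeholder hostname
    print(site_keyword_query(host, "confidential", "credit card"))
    for ext in ("doc", "pdf", "db", "dbf", "zip"):
        print(filetype_query(host, ext))
```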
An excellent resource for Google hacking is Johnny Long’s Google Hacking Database.
When sifting through your site with Google, be sure to look for sensitive information about your servers, network, and organization in Google Groups, which is the Usenet archive. If you find something that doesn’t need to be there, you can work with Google to have it edited or removed. For more information, refer to Google’s Contact us page.
Countermeasures against directory traversals
You can employ three main countermeasures against having files compromised via malicious directory traversals:
Don’t store old, sensitive, or otherwise nonpublic files on your web server. The only files that should be in your /htdocs or DocumentRoot folder are those that are needed for the site to function properly. These files should not contain confidential information that you don’t want the world to see.
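Configure your robots.txt file to ask search engines, such as Google, not to crawl the more sensitive areas of your site. Keep in mind that robots.txt is advisory and publicly readable, so it complements access controls rather than replacing them.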
Ensure that your web server is properly configured to allow public access to only those directories that are needed for the site to function. Minimum privileges are key here, so provide access to only the files and directories needed for the web application to perform properly.
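For the robots.txt countermeasure, it is worth confirming that the rules you wrote cover the paths you meant them to cover. The following sketch uses Python’s standard urllib.robotparser module. The sample rules and the /admin/ and /backup/ paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Sample rules for illustration only; substitute your own robots.txt content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""


def blocked_paths(robots_text, paths, user_agent="Googlebot"):
    """Return the paths that robots_text tells user_agent not to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    to_check = ["/index.html", "/admin/users.html", "/backup/site.zip"]
    print(blocked_paths(ROBOTS_TXT, to_check))
```

A path that shows up in the result is only hidden from crawlers that honor the file. Anyone who requests it directly still gets whatever the web server is configured to serve.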
Check your web server’s documentation for instructions on controlling public access. Depending on your web server, these access controls are set in the following places:
The httpd.conf file and .htaccess files for Apache
Internet Information Services Manager for IIS
The latest versions of these web servers have good directory security by default, so if possible, make sure you’re running the latest versions.
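One visible symptom of loose directory access is an auto-generated file listing served in place of a real page. As a rough check on pages you have already mirrored, the sketch below looks for the "Index of /" title that Apache-style listings typically use. This is a heuristic under that assumption, not a complete test: other servers format their listings differently, and the function name is hypothetical.

```python
import re

# Apache-style auto-generated listings typically use a title that starts
# with "Index of /". Other servers differ, so treat this as a heuristic.
LISTING_TITLE = re.compile(r"<title>\s*Index of /", re.IGNORECASE)


def looks_like_directory_listing(html):
    """Return True if the HTML looks like an Apache-style directory listing."""
    return bool(LISTING_TITLE.search(html))
```

If the check returns True for any saved page, review the access settings for that directory on the server.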
Finally, consider using a search engine honeypot, such as the Google Hack Honeypot. A honeypot draws in malicious users so you can see how the bad guys are working against your site. Then, you can use the knowledge you gain to keep them at bay.