Scratchy

About

Scratchy is a set of scripts that parse Apache web server log files and extract useful information. From this data, Scratchy creates HTML reports so that website administrators can easily view the information, spot trends, and identify their typical audience.

Scratchy began as a proof of concept that let me compile stats about my personal website. Over time I kept adding features and improvements, and I came to feel that it would be useful to others.

Why Scratchy?

Well, the name of the project of course comes from the Simpsons' "Itchy and Scratchy Show". The project aims to be a complete log-parsing and report-generating tool. There also seemed to be a need for such a project in Python: I have seen some other Apache log parsers, but they were developed in other languages (such as Perl and C). Another goal of this project is extensibility; to that end, most of the report appearance can be easily modified by tweaking a single config file.

What information does Scratchy report?

  • Accessed web pages
  • Hosts accessing your website
  • Operating systems
  • Browsers and versions
  • Search engines
  • Robots
  • File types accessed
  • Errors
  • Country name lookups (if enabled)
  • Charts of most data (if enabled)
  • A trace of pages accessed by each IP address (if enabled)
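
To give a feel for how a few of these figures can be pulled out of a log, here is a minimal sketch. It is not Scratchy's actual code: it assumes the log uses Apache's "combined" format, and the names LOG_PATTERN, summarize and SAMPLE, as well as the sample log lines, are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count hits per host, hits per page, and error responses (hypothetical helper)."""
    hosts, pages, errors = Counter(), Counter(), Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        hosts[match["host"]] += 1
        pages[match["path"]] += 1
        status = int(match["status"])
        if status >= 400:  # 4xx and 5xx responses are treated as errors
            errors[status] += 1
    return hosts, pages, errors


# Made-up sample lines, for illustration only.
SAMPLE = [
    '192.0.2.1 - - [10/Oct/2004:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.2 - - [10/Oct/2004:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 209 "-" "Mozilla/5.0 (X11; U; Linux i686)"',
    '192.0.2.1 - - [10/Oct/2004:13:57:12 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

hosts, pages, errors = summarize(SAMPLE)
print(pages.most_common(1))
print(hosts.most_common(1))
print(dict(errors))
```

Counting with collections.Counter keeps the sketch short; a real report would also need to handle other log formats and malformed lines more carefully.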

Status

Scratchy is under active development, and although it currently has a rich feature set, there are still some areas for improvement. I'm also interested in receiving feedback and suggestions, because I don't know what other people are using Scratchy for.

Although every attempt has been made to recognize user-agents, robots and search engines, I realize that some (if not many) are not recognized. If you encounter user-agents that are not being detected (correctly or at all), please email me the name of the user-agent (such as Netscape or IE) and any line(s) from your log file that contain it. In particular, if the user-agent is described in different ways (Win NT, Windows NT, Windows_NT, etc.), please provide me with all variations.

The next version will probably contain some bug fixes and additional data pages (such as full listings of data that is only partially listed on the main summary page).
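
To show why the user-agent variations mentioned above matter, here is a rough sketch of how several spellings can be folded into one canonical name. It is only an illustration, not the detection code Scratchy ships with: OS_PATTERNS and guess_os are hypothetical names, and the patterns cover just a handful of cases.

```python
import re

# Hypothetical table: canonical name paired with a pattern for its known spellings.
OS_PATTERNS = [
    # Matches "Win NT", "Windows NT", "Windows_NT" and "WinNT".
    ("Windows NT", re.compile(r"Win(?:dows)?[ _]?NT", re.IGNORECASE)),
    ("Linux", re.compile(r"Linux", re.IGNORECASE)),
    ("Mac OS", re.compile(r"Mac[ _]?OS|Macintosh", re.IGNORECASE)),
]


def guess_os(user_agent):
    """Return the first canonical OS name whose pattern occurs in the user-agent string."""
    for name, pattern in OS_PATTERNS:
        if pattern.search(user_agent):
            return name
    return "Unknown"


for agent in ("Mozilla/4.0 (compatible; MSIE 5.0; Win NT)",
              "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
              "Mozilla/4.7 [en] (Windows_NT; U)"):
    print(guess_os(agent))
```

Each new variation reported from a real log file would become either a wider pattern or a new entry in the table, which is why complete sample lines are so helpful.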