|
Parse

The Scratchy parser, parse.py, attempts to parse an Apache log file:

    $ python parse.py

This assumes you have a config file named "config". If you have multiple config files, or a config file with a different name, use the -c or --config command line parameter to tell the parser which config file to use:

    $ python parse.py --config=config_search
    -or-
    $ python parse.py -c config_search

Either of these commands invokes the parser with the file named config_search as the config file. Assuming everything went well, the Apache log described in the config file was parsed and the respective reports were produced.

If you want to parse a log file other than the one named in the config file, specify the alternative log file on the command line with the -f or --file option:

    $ python parse.py --file=/usr/local/apache/log/access_log.4
    -or-
    $ python parse.py -f /usr/local/apache/log/access_log.4

Either of these commands will attempt to parse the file /usr/local/apache/log/access_log.4 rather than the file named in the config file. The config file parameter is useful when you usually parse the same log. The command line parameter is useful when you want to parse a different (perhaps historical) access log.

The parser keeps track of parsed files and offsets, so it is safe to parse access_log repeatedly. Consider this scenario where log rotation is used:
The parser uses a file called filetracker to record every log file that has ever been parsed, along with the last location in that file that was parsed. This way, if the log file grows, the previously parsed data is ignored. Even if the file changes names (due to rotation, for instance), its first line always remains the same, so the parser still recognizes it and continues to work properly.

The parser collects data from the desired Apache web server log file and organizes it for easy retrieval and manipulation by the reporter. For each month of data that is found, the data is written to a file and the reporter is invoked. That is, if an Apache log contains data that spans three months (Jan, Feb and Mar), a file is created for Jan and the reporter is invoked, and then the process repeats for Feb and Mar.

The name of the data file depends on DATA_DIR and DATA_NAME (both specified in your config file) and on the month and year of the data being parsed. Assume that we are parsing data for November 2002, and that DATA_DIR and DATA_NAME are defined as follows:
DATA_DIR=/home/phil/scratchy_data
The file that will be created will then be:

After viewing the report, you may wish to modify your config file and re-run the report manually.
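
The offset tracking described earlier can be illustrated with a minimal sketch. This is not the actual Scratchy implementation: the on-disk format of filetracker is not documented here, so the JSON layout and the names `first_line_key` and `read_new_lines` are assumptions made for illustration only. The sketch identifies a log by its first line, so a renamed (rotated) file is still recognized, and it remembers the byte offset where parsing last stopped.

```python
import json
import os


def first_line_key(log_path):
    """Return the first line of the log, used as a rename-proof identity."""
    with open(log_path, "rb") as fh:
        return fh.readline().decode("utf-8", "replace").rstrip("\r\n")


def read_new_lines(log_path, tracker_path="filetracker.json"):
    """Return only the lines added to the log since the previous call.

    The tracker file is a hypothetical JSON mapping of
    {first line of log: byte offset already parsed}.
    """
    tracker = {}
    if os.path.exists(tracker_path):
        with open(tracker_path, "r", encoding="utf-8") as fh:
            tracker = json.load(fh)

    # A log that has never been seen starts at offset 0.
    key = first_line_key(log_path)
    offset = tracker.get(key, 0)

    # Skip everything parsed on earlier runs and read the remainder.
    with open(log_path, "rb") as fh:
        fh.seek(offset)
        new_data = fh.read()
        tracker[key] = fh.tell()

    # Persist the new offset so the next run resumes from here.
    with open(tracker_path, "w", encoding="utf-8") as fh:
        json.dump(tracker, fh)

    return new_data.decode("utf-8", "replace").splitlines()
```

Calling `read_new_lines` twice on an unchanged log returns an empty list the second time, and calling it on the same log after it has been renamed also returns nothing new, because the first line (and therefore the tracker key) is unchanged. The sketch does not handle a log that is truncated and rewritten with an identical first line.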
|