Scratchy

Configuring Scratchy

You can customize Scratchy to suit your preferences. To override a default value, add the parameter to your config file (or modify its existing entry) with the new value. Here is a sample config file:

# this is a comment
ACCESS_LOG: /usr/local/apache/logs/access_log
KNOWN_PAGES: html, cgi, doc

In the above example, ACCESS_LOG and KNOWN_PAGES are overridden; all other parameters keep their default values.
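
The override behaviour described above can be sketched in Python. This is only an illustration of the file format shown in the sample, not Scratchy's actual parser; the function name parse_config and the DEFAULTS dictionary are assumptions made for the example.

```python
def parse_config(text):
    """Parse 'NAME: value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'NAME: value' line; ignored in this sketch
        settings[name.strip()] = value.strip()
    return settings


sample = """\
# this is a comment
ACCESS_LOG: /usr/local/apache/logs/access_log
KNOWN_PAGES: html, cgi, doc
"""

# Hypothetical subset of the defaults listed in the table below.
DEFAULTS = {
    "ACCESS_LOG": "/var/log/apache/access_log",
    "KNOWN_PAGES": "html, cgi",
    "VERBOSE": "1",
}

# Values from the config file replace the defaults; the rest are kept.
config = {**DEFAULTS, **parse_config(sample)}
print(config["ACCESS_LOG"])
print(config["VERBOSE"])
```

Splitting at the first colon only keeps values such as the TIME_DATE_STR format string intact.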

The config file parameters are used by both the parser and reporter, where appropriate. Here is a complete list of parameters that can be customized:

Complete List of Configuration Options
Parameter Type Default Applies to
ACCESS_LOG String /var/log/apache/access_log Parse
CHARSET_ENCODING String UTF-8 Report
CHART_COUNTRIES Integer 1 Report
CHART_DAILY Integer 1 Report
CHART_DAY_OF_WEEK Integer 1 Report
CHART_FILE_TYPES Integer 1 Report
CHART_HEIGHT Integer 600 Report
CHART_HOURLY Integer 0 Report
CHART_OPERATING_SYSTEMS Integer 1 Report
COLOR_BANDWIDTH String #00CCCC Report
COLOR_BANDWIDTH_TEXT String #000000 Report
COLOR_BROWSER String #CCCC99 Report
COLOR_BROWSER_TEXT String #000000 Report
COLOR_DEFAULT String #ffffff Report
COLOR_DEFAULT_TEXT String #000000 Report
COLOR_HEADER String #cccccc Report
COLOR_HEADER_TEXT String #000000 Report
COLOR_HITS String #3399FF Report
COLOR_HITS_TEXT String #000000 Report
COLOR_LINK_HOVER_TEXT String #000066 Report
COLOR_LINK_TEXT String #006666 Report
COLOR_LINK_VISITED_TEXT String #006666 Report
COLOR_MORE String #eeeeee Report
COLOR_MORE_TEXT String #000000 Report
COLOR_PAGES String #00CCFF Report
COLOR_PAGES_TEXT String #000000 Report
COLOR_SESSION_AVG String #FFCC33 Report
COLOR_SESSION_AVG_TEXT String #000000 Report
COLOR_SESSION_HEADER String #FFCC66 Report
COLOR_SESSION_HEADER_TEXT String #000000 Report
COLOR_SESSION_MAX String #FFCC33 Report
COLOR_SESSION_MAX_TEXT String #000000 Report
COLOR_SESSION_MIN String #FFCC33 Report
COLOR_SESSION_MIN_TEXT String #000000 Report
COLOR_TITLE String #666666 Report
COLOR_TITLE_TEXT String #ffffff Report
COLOR_VISIT_FIRST String #FFFF99 Report
COLOR_VISIT_FIRST_TEXT String #000000 Report
COLOR_VISIT_HEADER String #FFFF66 Report
COLOR_VISIT_HEADER_TEXT String #000000 Report
COLOR_VISIT_LAST String #FFFF99 Report
COLOR_VISIT_LAST_TEXT String #000000 Report
COLOR_VISIT_NUMBER String #FFFF99 Report
COLOR_VISIT_NUMBER_TEXT String #000000 Report
COUNTRY_LOOKUP Integer 1 Parse & Report
GEOIP_DB String /usr/local/share/GeoIP/GeoIP.dat Parse
DATABASE String mysql Parse & Report
DATA_DIR String data Parse
DATA_NAME String yourdata Parse & Report
ENABLE_IP_TRACE Integer 0 Parse & Report
EXCLUDE_HOSTNAMES String(s) None Report
EXCLUDE_SEARCH_TERMS String(s) None Parse
EXCLUDE_URLS String(s) None Report
KNOWN_ALIASES String(s) 'yourdomain.com', 'www.yourdomain.net' Parse
KNOWN_PAGES String(s) 'html, cgi' Parse
MAX_BROWSERS Integer 10 Report
MAX_BROWSER_VERSIONS Integer 10 Report
MAX_COUNTRIES Integer 15 Report
MAX_ERROR_CODES Integer 5 Report
MAX_ERROR_PAGES Integer 10 Report
MAX_EXTERNAL_LINKS Integer 20 Report
MAX_FILE_TYPES Integer 10 Report
MAX_HOSTS Integer 25 Report
MAX_OPERATING_SYSTEMS Integer 20 Report
MAX_FILES Integer 25 Report
MAX_PAGES Integer 25 Report
MAX_ROBOTS Integer 20 Report
MAX_SEARCH_ENGINES Integer 10 Report
MAX_SEARCH_KEYWORDS Integer 20 Report
MAX_SEARCH_STRINGS Integer 20 Report
MAX_STATUS_CODE Integer 10 Report
MYSQL_HOST String localhost Parse & Report
MYSQL_PORT String None Parse & Report
MYSQL_USERNAME String scratchy Parse & Report
MYSQL_PASSWORD String itchy Parse & Report
SESSION_TIME Integer 3600 Report
SORT_COUNTRIES String hits Report
SORT_FILES String hits Report
SORT_FILE_TYPES String hits Report
SORT_HOSTS String hits Report
SORT_HTTP_STATUS String hits Report
SORT_PAGES String hits Report
SQLITE_DB String scratchydb Parse & Report
TEMPLATE_REPORT String misc/template_report Report
TEMPLATE_SUMMARY String misc/template_summary Report
TIME_DATE_STR String %m/%d/%y %H:%M:%S Report
TIME_STR String %m/%d/%y Report
VERBOSE Integer 1 Parse & Report


ACCESS_LOG

Specify the default location of the Apache access_log file. This value is used by the parser and can be overridden on the command line with the -f or --file argument.

The value for this parameter can either be a single filename or a comma-separated list of filenames.
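
As a rough illustration of the two accepted forms, a value can be split into individual filenames as follows. The helper name split_list and the second log path are assumptions for this example; how Scratchy itself splits the value is not shown here.

```python
def split_list(value):
    """Split a comma-separated config value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# A single filename yields a one-element list.
print(split_list("/var/log/apache/access_log"))

# A comma-separated list yields one entry per file (second path is hypothetical).
print(split_list("/var/log/apache/access_log, /var/log/apache/access_log.1"))
```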


CHARSET_ENCODING

Specify the HTML charset encoding.

CHART_BROWSERS

Include the browsers chart in report? 1 = yes, 0 = no

CHART_COUNTRIES

Include the countries chart in report? 1 = yes, 0 = no

CHART_DAILY

Include the daily chart in report? 1 = yes, 0 = no

CHART_DAY_OF_WEEK

Include the day-of-week chart in report? 1 = yes, 0 = no

CHART_FILE_TYPES

Include the file types chart in report? 1 = yes, 0 = no

CHART_HEIGHT

Specify the chart height (in pixels) for most of the charts. This also affects the maximum label length that is printed on these charts: the larger the CHART_HEIGHT, the more characters of each label along the x-axis can be printed.

This attribute does not affect the daily, hourly or day-of-week charts (these are fixed at 300 because their labels are also fixed).


CHART_HOURLY

Include the hourly chart in report? 1 = yes, 0 = no

CHART_OPERATING_SYSTEMS

Include the operating systems chart in report? 1 = yes, 0 = no

COLOR_BANDWIDTH

Specify the background color for the bandwidth column header

COLOR_BANDWIDTH_TEXT

Specify the text color for the bandwidth column header

COLOR_BROWSER

Specify the background color for the row of the browser name.

COLOR_BROWSER_TEXT

Specify the text color for the row of the browser name.

COLOR_DEFAULT

Specify the default background color

COLOR_DEFAULT_TEXT

Specify the default text color

COLOR_HEADER

Specify the background color of the header row

COLOR_HEADER_TEXT

Specify the text color of the header row

COLOR_HITS

Specify the background color of the hits column header

COLOR_HITS_TEXT

Specify the text color of the hits column header

COLOR_LINK_HOVER_TEXT

Specify the text color for hovering over links

COLOR_LINK_TEXT

Specify the text color for normal (unvisited) links

COLOR_LINK_VISITED_TEXT

Specify the text color for visited links

COLOR_MORE

Specify the background color of the more row

COLOR_MORE_TEXT

Specify the text color of the more row

COLOR_PAGES

Specify the background color of the pages column header

COLOR_PAGES_TEXT

Specify the text color of the pages column header

COLOR_SESSION_AVG

Specify the background color of the session average column header

COLOR_SESSION_AVG_TEXT

Specify the text color of the session average column header

COLOR_SESSION_HEADER

Specify the background color of the session column header

COLOR_SESSION_HEADER_TEXT

Specify the text color of the session column header

COLOR_SESSION_MAX

Specify the background color of the session max column header

COLOR_SESSION_MAX_TEXT

Specify the text color of the session max column header

COLOR_SESSION_MIN

Specify the background color of the session min column header

COLOR_SESSION_MIN_TEXT

Specify the text color of the session min column header

COLOR_TITLE

Specify the background color of the title row (the separator between tables)

COLOR_TITLE_TEXT

Specify the text color of the title row (the separator between tables)

COLOR_VISIT_FIRST

Specify the background color of the first visit column header of the hosts table

COLOR_VISIT_FIRST_TEXT

Specify the text color of the first visit column header of the hosts table

COLOR_VISIT_HEADER

Specify the background color of the visit column header of the hosts table

COLOR_VISIT_HEADER_TEXT

Specify the text color of the visit column header of the hosts table

COLOR_VISIT_LAST

Specify the background color of the last visit column header of the hosts table

COLOR_VISIT_LAST_TEXT

Specify the text color of the last visit column header of the hosts table

COLOR_VISIT_NUMBER

Specify the background color of the visit number column header of the hosts table

COLOR_VISIT_NUMBER_TEXT

Specify the text color of the visit number column header of the hosts table

COUNTRY_LOOKUP

Enable (1, the default) or disable (0) country lookups based on IP addresses. This feature requires the GeoIP APIs and DAT file.

GEOIP_DB

Designates the path of the GeoIP ip-to-country lookup database file. The default value is /usr/local/share/GeoIP/GeoIP.dat.

DATABASE

Specify the database that Scratchy will use. The recognized values are mysql (the default) and sqlite.

Note: sqlite and gadfly are not yet fully supported. Do not use them yet.


DATA_DIR

Specify the path where parsed log data will be stored. This directory will be the root of all data subdirectories (DATA_NAME).

DATA_NAME

Path, relative to DATA_DIR, where parsed log data will be stored. If DATA_DIR is /home/fred/scratchy and DATA_NAME is mysite, then data will be stored in /home/fred/scratchy/mysite.
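
The way the two parameters combine can be sketched as a simple path join. The function name data_path is an assumption for this example, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def data_path(data_dir, data_name):
    """Join DATA_DIR and DATA_NAME into the directory holding parsed data."""
    return posixpath.join(data_dir, data_name)


print(data_path("/home/fred/scratchy", "mysite"))  # /home/fred/scratchy/mysite
```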

ENABLE_IP_TRACE

The behavior of IP tracing has changed for version 0.7 and above. Only the report module uses this flag now. If set to 1, the report module will produce a page for each IP address (up to the value of MAX_HOSTS). If set to 0, no IP address pages will be produced.

EXCLUDE_HOSTNAMES

Instructs the Report script to exclude this list of hostnames from being output. The hostnames are still recorded by the Parse script, but they are left out when the report is generated. The hostnames must be given exactly as they appear in the access_log; that is, wildcards (as in an SQL LIKE clause) are not used to exclude hostnames.

EXCLUDE_SEARCH_TERMS

Designate search terms that should be excluded from being collected (and ultimately reported). If a visitor to your website was referred by a search engine and the query contained any of the EXCLUDE_SEARCH_TERMS then:
  • the search phrase will not be recorded
  • any keywords of this phrase that are in the EXCLUDE_SEARCH_TERMS list will not be recorded.

An example:
A visitor arrives from Google after searching for "sprockets gizmos gadgets". Your config file contains EXCLUDE_SEARCH_TERMS: 'gadgets'.

After running parse.py, no search phrase is recorded for this Google search, because the phrase contains the excluded term "gadgets". Furthermore, only 2 keywords ("gizmos" and "sprockets") are recorded, since "gadgets" is excluded.

If you wish to exclude more than one term, specify this parameter as a comma-separated list of strings.

Example 1: excluding a single term
EXCLUDE_SEARCH_TERMS: gizmos

Example 2: excluding multiple terms
EXCLUDE_SEARCH_TERMS: gizmos, gadget, foo bar, sprocket
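
The two exclusion rules can be sketched in Python as follows. This is a minimal illustration, not the code in parse.py: the function name filter_search is hypothetical, matching is assumed to be case-insensitive and word-by-word, and multi-word terms such as "foo bar" are not handled.

```python
def filter_search(query, excluded_terms):
    """Apply the exclusion rules to one search query.

    Returns (phrase, keywords): phrase is None when the query contains
    an excluded term, and keywords omits every excluded word.
    """
    excluded = {term.strip().lower() for term in excluded_terms}
    words = query.lower().split()
    contains_excluded = any(word in excluded for word in words)
    phrase = None if contains_excluded else query
    keywords = [word for word in words if word not in excluded]
    return phrase, keywords


# Excluding "gadgets": the phrase is dropped and two keywords remain.
print(filter_search("sprockets gizmos gadgets", ["gadgets"]))

# No excluded term present: phrase and all keywords are kept.
print(filter_search("sprockets gizmos", ["gadgets"]))
```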


EXCLUDE_URLS

Instructs the Report script to exclude this list of URLs from the pages and files reports. The URLs are still recorded by the Parse script, but they are left out when the report is generated. The excluded URL terms are automatically treated as wildcards (as in an SQL LIKE clause).
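
The difference between the wildcard matching used here and the exact matching used for EXCLUDE_HOSTNAMES can be sketched as follows. The function names are hypothetical, and the sketch assumes that "wildcard" means the term may appear anywhere in the URL (similar to LIKE '%term%'); the actual query Scratchy issues is not shown in this document.

```python
def is_excluded_url(url, exclude_urls):
    """True if any excluded term occurs anywhere in the URL (assumed wildcard)."""
    return any(term in url for term in exclude_urls)


def is_excluded_host(hostname, exclude_hostnames):
    """True only on an exact match, as described for EXCLUDE_HOSTNAMES."""
    return hostname in exclude_hostnames


print(is_excluded_url("/admin/stats.html", ["/admin"]))          # True
print(is_excluded_url("/index.html", ["/admin"]))                # False
print(is_excluded_host("crawler.example.com", ["example.com"]))  # False
```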

KNOWN_ALIASES

Specify the names that the server is known as (such as www.yahoo.com, yahoo.com). This data is used to deduce whether a hit was from an external link (from another website) or an internal link (within your site).
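
One way to picture the internal/external decision is to compare the referrer's hostname against the alias list. This is a sketch under assumptions: the function name link_origin and the "none" result for hits without a usable referrer are illustrative, not taken from Scratchy.

```python
from urllib.parse import urlparse


def link_origin(referrer, known_aliases):
    """Classify a referrer as 'internal', 'external' or 'none'."""
    host = urlparse(referrer).hostname  # None for '-' or an empty referrer
    if host is None:
        return "none"
    aliases = {alias.strip().lower() for alias in known_aliases}
    return "internal" if host.lower() in aliases else "external"


aliases = ["www.yahoo.com", "yahoo.com"]
print(link_origin("http://yahoo.com/news/", aliases))
print(link_origin("http://www.google.com/search?q=scratchy", aliases))
print(link_origin("-", aliases))
```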

KNOWN_PAGES

A hit is any file that is accessed. A page is any document that is accessed. The distinction is subtle but useful, because a single html file can contain dozens of images and you may want to view page statistics separately from the statistics for every file.

You can specify which files you consider to be pages by listing their suffixes in a comma-separated list: doc, html

Additionally, if you change these entries after you have parsed some log files, you will want to run the update_known_pages.py script in the Scratchy/scripts directory. This script updates the appropriate database tables so that reports produced with this data reflect the currently recognized KNOWN_PAGES.
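
The hit/page distinction can be sketched as a suffix check. The function name is_page is hypothetical, and requests without a suffix (such as a directory index) are simply treated as non-pages here; how Scratchy handles them is not stated in this document.

```python
import posixpath
from urllib.parse import urlparse


def is_page(url, known_pages):
    """True if the requested file's suffix is listed in KNOWN_PAGES."""
    path = urlparse(url).path  # drops any query string
    suffix = posixpath.splitext(path)[1].lstrip(".").lower()
    return suffix in known_pages


known_pages = {"html", "cgi"}
print(is_page("/docs/index.html?lang=en", known_pages))  # True
print(is_page("/images/logo.png", known_pages))          # False
```

Every request counts as a hit; only those for which is_page returns True would also count as a page.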


MAX_BROWSERS

Specify the maximum number of browsers to display in the report. A value of 0 displays all entries.

MAX_BROWSER_VERSIONS

Specify the maximum number of versions of each browser to display in the report. A value of 0 displays all entries.

MAX_COUNTRIES

Specify the maximum number of country names to display. A value of 0 displays all countries.

Countries will not be displayed at all if COUNTRY_LOOKUP is set to 0 (disabled).


MAX_DATES

Specify the maximum number of dates to display in the report. A value of 0 displays all entries. Since data is collected monthly, a value greater than 31 has no effect.

MAX_ERROR_CODES

Specify the maximum number of error codes to display. A value of 0 displays all entries.

MAX_ERROR_PAGES

Specify the maximum number of error pages to display. A value of 0 displays all entries.

MAX_EXTERNAL_LINKS

Specify the maximum number of external links to display. A value of 0 displays all entries.

MAX_FILE_TYPES

Specify the maximum number of file types to display. A value of 0 displays all entries.

MAX_HOSTS

Specify the maximum number of hosts to display. A value of 0 displays all entries.

MAX_OPERATING_SYSTEMS

Specify the maximum number of operating systems to display. A value of 0 displays all entries.

MAX_FILES

Specify the maximum number of files to display. A value of 0 displays all entries.

MAX_PAGES

Specify the maximum number of pages to display. A value of 0 displays all entries.

MAX_ROBOTS

Specify the maximum number of robots to display. A value of 0 displays all entries.

MAX_SEARCH_ENGINES

Specify the maximum number of search engines to display. A value of 0 displays all entries.

MAX_SEARCH_KEYWORDS

Specify the maximum number of search keywords to display. A value of 0 displays all entries.

MAX_SEARCH_STRINGS

Specify the maximum number of search strings (phrases) to display. A value of 0 displays all entries.

MAX_STATUS_CODE

Specify the maximum number of HTTP status codes to display. A value of 0 displays all entries.

MYSQL_HOST

Specify the hostname that the MySQL server is running on (localhost is the default).

MYSQL_PORT

Specify the port that the MySQL server is running on (None is the default, which indicates the MySQL default port).

MYSQL_USERNAME

Specify the username that Scratchy will use to connect to the MySQL server.

MYSQL_PASSWORD

Specify the password for the MYSQL_USERNAME that Scratchy will use to connect to the MySQL server.

SORT_COUNTRIES

Designate the sort column for the countries table and chart. Valid values are 'hits', 'bandwidth' and 'pages'

SORT_FILE_TYPES

Designate the sort column for the filetypes table and chart. Valid values are 'hits' and 'bandwidth'

SORT_FILES

Designate the sort column for the files table and chart. Valid values are 'hits', 'bandwidth' and 'pages'

SORT_PAGES

Designate the sort column for the pages table. Valid values are 'hits' and 'bandwidth'

SORT_HOSTS

Designate the sort column for the hosts table. Valid values are 'hits', 'bandwidth' and 'pages'

SORT_HTTP_STATUS

Designate the sort column for the HTTP status table. Valid values are 'hits', 'bandwidth' and 'pages'
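
How a SORT_* column and a MAX_* limit might combine when a table is built can be sketched as follows. This is an assumption-laden illustration: the function name top_rows, the row layout and the descending sort order are not taken from Scratchy; only the valid column names and the "0 displays all entries" rule come from this page.

```python
def top_rows(rows, sort_column="hits", max_rows=0):
    """Sort rows by the chosen column, largest first, then truncate.

    max_rows == 0 keeps every row, mirroring the MAX_* parameters.
    """
    if sort_column not in ("hits", "bandwidth", "pages"):
        raise ValueError("unsupported sort column: %r" % sort_column)
    ordered = sorted(rows, key=lambda row: row[sort_column], reverse=True)
    return ordered if max_rows == 0 else ordered[:max_rows]


# Hypothetical host statistics.
hosts = [
    {"host": "a.example.org", "hits": 40, "bandwidth": 9000, "pages": 12},
    {"host": "b.example.org", "hits": 75, "bandwidth": 3000, "pages": 30},
    {"host": "c.example.org", "hits": 10, "bandwidth": 9500, "pages": 2},
]

# SORT_HOSTS: hits, MAX_HOSTS: 2
print([row["host"] for row in top_rows(hosts, "hits", 2)])

# SORT_HOSTS: bandwidth, MAX_HOSTS: 0 (all entries)
print([row["host"] for row in top_rows(hosts, "bandwidth", 0)])
```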

SQLITE_DB

Specify the database (file) name for Scratchy to use. This file is relative to DATA_DIR/DATA_NAME.

Note: For SQLite you will also need to set DATABASE to sqlite (the default is mysql).


TEMPLATE_REPORT

Designates the template file you wish to use for the main report page.

TEMPLATE_SUMMARY

Designates the template file you wish to use for the summary index page.

TIME_DATE_STR

The format string for displaying the time and date in the report. Refer to the Python time module strftime function for details or consult the unix man page for strftime.

TIME_STR

The format string for displaying the time in the report. Refer to the Python time module strftime function for details or consult the unix man page for strftime.
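
To see what the default format strings from the table above produce, they can be passed to the Python time module's strftime function. The sample moment (9 March 2005, 5:30 PM) is arbitrary and chosen only for illustration; the second format is listed as TIME_STR in the table.

```python
import time
from datetime import datetime

TIME_DATE_STR = "%m/%d/%y %H:%M:%S"  # default from the table
TIME_STR = "%m/%d/%y"                # default from the table

# An arbitrary sample moment, converted to the tuple form strftime expects.
moment = datetime(2005, 3, 9, 17, 30, 0).timetuple()

print(time.strftime(TIME_DATE_STR, moment))  # 03/09/05 17:30:00
print(time.strftime(TIME_STR, moment))       # 03/09/05
```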

VERBOSE

If VERBOSE is 1 then the parser and reporter will output some (useful) information. If VERBOSE is 0 the output will be quiet with the exception of warnings and errors.

VISIT_TIME

Specify the period of inactivity, in seconds, after which a new visit begins. With a value of 3600 (1 hour), if an IP address accesses a page within your site at 5:00 PM and then again at 5:30 PM, this is considered a single visit. If the IP address is then inactive for more than 1 hour before accessing your site again (e.g. at 6:31 PM), that access is considered a new visit.

In the first case, the session is considered to have lasted 1800 seconds (30 minutes).
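
The visit rule can be sketched by grouping one IP address's access times. This is an illustration of the rule as described, not Scratchy's implementation; the function name group_visits is hypothetical and times are given as seconds since midnight.

```python
def group_visits(timestamps, visit_time=3600):
    """Group access times (in seconds) into (start, end) visits.

    An access starts a new visit when it comes more than visit_time
    seconds after the previous access.
    """
    visits = []
    for ts in sorted(timestamps):
        if visits and ts - visits[-1][1] <= visit_time:
            visits[-1][1] = ts       # still the same visit: extend it
        else:
            visits.append([ts, ts])  # inactive too long: new visit
    return [(start, end) for start, end in visits]


def at(hour, minute):
    """Seconds since midnight for a 24-hour clock time."""
    return hour * 3600 + minute * 60


# Accesses at 5:00 PM, 5:30 PM and 6:31 PM.
visits = group_visits([at(17, 0), at(17, 30), at(18, 31)])
print(len(visits))                   # 2
print(visits[0][1] - visits[0][0])   # 1800
```

The first two accesses form one visit lasting 1800 seconds; the access at 6:31 PM comes 61 minutes after the previous one, so it starts a second visit.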