iwla
====

Introduction
------------

iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it is a very monolithic project, with everything in one big Perl file. In contrast, iwla has been designed to be very modular: a small analysis core and a lot of filters. It can be viewed as a chain of UNIX pipes. The philosophy of iwla is: add, update, delete! That is the job of each filter: modify the statistics until the final result is reached. It is written in Python.

However, iwla focuses only on HTTP logs. It uses data (robot definitions, search engine definitions) and design from awstats. Moreover, it is not dynamic: it only generates static HTML pages (with an optional gzip compression).

Usage
-----

    ./iwla [-c|--config-file file] [-C|--clean-output] [-i|--stdin] [-f FILE|--file FILE] [-d LOGLEVEL|--log-level LOGLEVEL] [-r|--reset year/month] [-z|--dont-compress] [-p] [-D|--dry-run]

    -c : Configuration file to use (default conf.py)
    -C : Clean output (database and HTML) before starting
    -i : Read data from stdin instead of conf.analyzed_filename
    -f : Analyse this log file instead of conf.analyzed_filename. Multiple files can be specified (comma separated). gz files are accepted
    -d : Loglevel in ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL']
    -r : Reset analysis to a specific date (month/year)
    -z : Don't compress databases (bigger but faster, not compatible with compressed databases)
    -p : Only generate display
    -D : Dry run (don't write/update files to disk)

Basic usage
-----------

In addition to the command line, iwla reads parameters from default_conf.py. Users can override the default values in a _conf.py_ file. Each module requires its own parameters.

The main values to edit are:

 * **analyzed_filename** : Web server log
 * **domaine_name** : Domain name to filter
 * **pre_analysis_hooks** : List of pre analysis hooks
 * **post_analysis_hooks** : List of post analysis hooks
 * **display_hooks** : List of display hooks
 * **locale** : Displayed locale (_en_ or _fr_)

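
As an illustration, a minimal _conf.py_ overriding the values listed above might look like the following sketch. The log path, the domain and the plugin names are placeholders invented for this example (none of them comes from this document); replace them with values and plugins that actually exist in your installation.

```python
# conf.py: hypothetical overrides of default_conf.py (every value is a placeholder)

# Web server log to analyse
analyzed_filename = '/var/log/apache2/access.log'

# Domain name to filter (key spelled as in the list above)
domaine_name = 'example.com'

# Hooks run in list order; these plugin names are invented for illustration
pre_analysis_hooks = ['my_robot_filter']
post_analysis_hooks = ['my_top_pages']
display_hooks = ['my_top_pages_display']

# Displayed locale: 'en' or 'fr'
locale = 'en'
```
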
You can also append an element to an existing default configuration list by using the "_append" suffix. Example:

    multimedia_files_append = ['xml']

or

    multimedia_files_append = 'xml'

will append 'xml' to the current multimedia_files list.

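
To make the "_append" behaviour concrete, here is a minimal sketch of how such a merge could be done. It is not iwla's actual code: the function name `merge_conf`, the dictionaries and the default extension list are assumptions made for the example.

```python
def merge_conf(defaults, user):
    """Return a new dict: defaults overridden by user values.

    Keys ending in '_append' extend the matching list instead of
    replacing it (assumed behaviour, modelled on the description above).
    """
    suffix = '_append'
    merged = dict(defaults)

    # Plain keys simply override the default value
    for key, value in user.items():
        if not key.endswith(suffix):
            merged[key] = value

    # '<name>_append' keys add one element or a list of elements to '<name>'
    for key, value in user.items():
        if key.endswith(suffix):
            target = key[:-len(suffix)]
            extra = value if isinstance(value, list) else [value]
            merged[target] = list(merged.get(target, [])) + extra

    return merged


# Hypothetical defaults, for illustration only
defaults = {'multimedia_files': ['png', 'jpg'], 'locale': 'en'}

as_list = merge_conf(defaults, {'multimedia_files_append': ['xml']})
as_string = merge_conf(defaults, {'multimedia_files_append': 'xml'})
print(as_list['multimedia_files'])
print(as_string['multimedia_files'])
```

Both calls give the same list, and the defaults dictionary itself is left untouched.
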
Then, you can launch iwla. Output HTML files are created in the _output_ directory by default. To view them quickly, go into _output_ and type

    python -m SimpleHTTPServer 8000

With Python 3, the equivalent command is _python3 -m http.server 8000_.

Open your favorite web browser at _http://localhost:8000_. Enjoy!

**Warning** : The order of the hooks lists is important: some plugins may require other plugins, and the order of display_hooks is the order of the displayed blocks in the final result.

Interesting default configuration values
----------------------------------------

 * **DB_ROOT** : Default database directory (default ./output_db)
 * **DISPLAY_ROOT** : Default HTML output directory (default _./output_)
 * **log_format** : Web server log format (nginx style). Default is the Apache log format
 * **time_format** : Time format used in log format
 * **pages_extensions** : Extensions that are considered an HTML page (or result), as opposed to hits
 * **viewed_http_codes** : HTTP codes that are considered OK (200, 304)
 * **count_hit_only_visitors** : If False, don't count visitors that don't GET a page but only resources (images, RSS...)
 * **multimedia_files** : Multimedia extensions (not accounted as downloaded files)
 * **css_path** : CSS path (you can add yours)
 * **compress_output_files** : File extensions to compress with gzip during display build

Plugins
-------

As previously described, plugins act like UNIX pipes: statistics are constantly updated by each plugin to produce the final result. There are three types of plugins:

 * **Pre analysis plugins** : Called before generating daily statistics. They are in charge of filtering robots, crawlers, bad pages...
 * **Post analysis plugins** : Called after the basic statistics computation. They are in charge of enriching them with their own algorithms
 * **Display plugins** : They are in charge of producing HTML files from the statistics.

To use plugins, just insert their file names (without the _.py_ extension) in the _pre_analysis_hooks_, _post_analysis_hooks_ and _display_hooks_ lists in conf.py.

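
The pipe idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: here the hooks are plain functions sharing one dictionary, whereas iwla loads plugins by file name, and the layout of the `visits` entries is invented for the example.

```python
def drop_robots(stats):
    # Pre analysis step: delete visitors flagged as robots
    stats['visits'] = {ip: v for ip, v in stats['visits'].items()
                       if not v['robot']}


def count_pages(stats):
    # Post analysis step: add a value derived from the remaining visits
    stats['total_pages'] = sum(v['pages'] for v in stats['visits'].values())


def render_summary(stats):
    # Display step: produce HTML from the statistics
    stats['html'] = '<p>%d pages viewed</p>' % stats['total_pages']


# Order matters: each step works on what the previous ones left behind
pre_analysis_hooks = [drop_robots]
post_analysis_hooks = [count_pages]
display_hooks = [render_summary]


def run_pipeline(stats):
    """Run every hook in order on the same statistics dictionary."""
    for hook in pre_analysis_hooks + post_analysis_hooks + display_hooks:
        hook(stats)
    return stats


stats = run_pipeline({'visits': {
    '10.0.0.1': {'pages': 3, 'robot': False},
    '10.0.0.2': {'pages': 7, 'robot': True},
}})
print(stats['html'])
```
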
Statistics are stored in dictionaries:

 * **month_stats** : Statistics of the month currently being analysed
 * **valid_visitor** : A subset of month_stats without robots
 * **days_stats** : Statistics of the day currently being analysed
 * **visits** : All visitors with all of their requests
 * **meta** : Final result of month statistics (by year)

Create a plugin
---------------

To create a new plugin, you must subclass IPlugin (_iplugin.py_) in the right directory (_plugins/xxx/yourPlugin.py_).

Plugins can define required configuration values (self.conf_requires) that must be set in conf.py (or can be optional). They can also define required plugins (self.requires).

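
Because a plugin may require other plugins and hooks run in list order, it can help to check a hook list before launching an analysis. The helper below is a hypothetical sketch, not part of iwla: it assumes a single ordered list of plugin names and a dictionary giving each plugin's requirements, and the plugin names are invented.

```python
def check_hook_order(hooks, requires):
    """Return (plugin, requirement) pairs where the requirement is not
    listed before the plugin that needs it.

    hooks    : plugin names in execution order
    requires : dict mapping a plugin name to the names it requires
    """
    problems = []
    seen = set()
    for name in hooks:
        for needed in requires.get(name, []):
            if needed not in seen:
                problems.append((name, needed))
        seen.add(name)
    return problems


# Invented plugin names, for illustration only
requires = {'my_top_pages_display': ['my_top_pages']}

good_order = check_hook_order(['my_top_pages', 'my_top_pages_display'], requires)
bad_order = check_hook_order(['my_top_pages_display', 'my_top_pages'], requires)
print(good_order)
print(bad_order)
```

An empty result means that every requirement appears earlier in the list than the plugin depending on it.
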
Two functions must be overridden. The first is _load(self)_, which must return True if all is good and False otherwise; it is called after _init_. The second is _hook(self)_, which is the body of the plugin.

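
A plugin skeleton following this description might look like the sketch below. The `IPlugin` class here is a stand-in written for the example so that the code runs on its own; the real base class lives in _iplugin.py_, and its constructor arguments are not described in this document, so the ones used here are assumptions. The `example_threshold` value and the shape of `visits` are hypothetical as well.

```python
class IPlugin(object):
    """Stand-in for the real base class in iplugin.py (assumed shape)."""

    def __init__(self, conf, stats):
        self.conf = conf            # configuration values (assumed: a dict)
        self.stats = stats          # shared statistics (assumed: a dict)
        self.requires = []          # names of required plugins
        self.conf_requires = []     # names of required configuration values

    def load(self):
        return True

    def hook(self):
        pass


class FrequentVisitors(IPlugin):
    """Hypothetical post analysis plugin listing the most active visitors."""

    def __init__(self, conf, stats):
        super(FrequentVisitors, self).__init__(conf, stats)
        self.conf_requires = ['example_threshold']  # invented configuration key

    def load(self):
        # Called after init: refuse to load if a required value is missing
        return all(key in self.conf for key in self.conf_requires)

    def hook(self):
        # Body of the plugin: update the shared statistics
        threshold = self.conf['example_threshold']
        self.stats['frequent_visitors'] = sorted(
            ip for ip, hits in self.stats['visits'].items() if hits >= threshold)


plugin = FrequentVisitors({'example_threshold': 2},
                          {'visits': {'10.0.0.1': 1, '10.0.0.2': 5}})
if plugin.load():
    plugin.hook()
print(plugin.stats['frequent_visitors'])
```
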
For display plugins, a lot of code has been written in _display.py_ to simplify the creation of HTML blocks, tables and bar graphs.

Plugins
=======

Optional configuration values end with *.

