iwla

iwla Commit Details

Date: 2014-12-23 09:18:30
Author: Grégory Soutadé
Commit: 2180f9e7d4003ca4adb58c1da7bf3aa1f85023af
Parents: 3b91fa0a443ca3d0bf478ab815b11a58e2beaa90
Message: Update doc

Changes:
M docs/index.md (18 diffs)
M docs/main.md (4 diffs)
M docs/modules.md (14 diffs)

File differences

docs/index.md
  Introduction
  ------------
- iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolothic project with everything in one big PERL file. In opposite, iwla has be though to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. Philosophy of iwla is : add, update, delete ! That's the job of each filters : modify statistics until final result. It's written in Python.
+ iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolothic project with everything in one big PERL file. In opposite, iwla has been though to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. Philosophy of iwla is : add, update, delete ! That's the job of each filter : modify statistics until final result. It's written in Python.
  Nevertheless, iwla is only focused on HTTP logs. It uses data (robots definitions, search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
......
  * **display_hooks** : List of display hooks
  * **locale** : Displayed locale (_en_ or _fr_)
- Then, you can then iwla. Output HTML files are created in _output_ directory by default. To quickly see it go in output and type
+ Then, you can launch iwla. Output HTML files are created in _output_ directory by default. To quickly see it, go into _output_ and type
  python -m SimpleHTTPServer 8000
  Open your favorite web browser at _http://localhost:8000_. Enjoy !
- **Warning** : The order is hooks list is important : Some plugins may requires others plugins, and the order of display_hooks is the order of displayed blocks in final result.
+ **Warning** : The order in hooks list is important : Some plugins may requires others plugins, and the order of display_hooks is the order of displayed blocks in final result.
  Interesting default configuration values
  ----------------------------------------
  * **DB_ROOT** : Default database directory (default ./output_db)
- * **DISPLAY_ROOT** : Default HTML output directory (default ./output)
+ * **DISPLAY_ROOT** : Default HTML output directory (default _./output_)
  * **log_format** : Web server log format (nginx style). Default is apache log format
  * **time_format** : Time format used in log format
  * **pages_extensions** : Extensions that are considered as a HTML page (or result) in opposit to hits
  * **viewed_http_codes** : HTTP codes that are cosidered OK (200, 304)
- * **count_hit_only_visitors** : If False, doesn't cout visitors that doesn't GET a page but resources only (images, rss...)
+ * **count_hit_only_visitors** : If False, don't count visitors that doesn't GET a page but resources only (images, rss...)
  * **multimedia_files** : Multimedia extensions (not accounted as downloaded files)
  * **css_path** : CSS path (you can add yours)
  * **compress_output_files** : Files extensions to compress in gzip during display build
......
  * **Post analysis plugins** : Called after basic statistics computation. They are in charge to enlight them with their own algorithms
  * **Display plugins** : They are in charge to produce HTML files from statistics.
- To use plugins, just insert their name in _pre_analysis_hooks_, _post_analysis_hooks_ and _display_hooks_ lists in conf.py.
+ To use plugins, just insert their file name (without _.py_ extension) in _pre_analysis_hooks_, _post_analysis_hooks_ and _display_hooks_ lists in conf.py.
  Statistics are stored in dictionaries :
......
  Create a Plugins
  ----------------
- To create a new plugin, it's necessary to create a derived class of IPlugin (_iplugin.py) in the right directory (_plugins/xxx/yourPlugin.py_).
+ To create a new plugin, it's necessary to subclass IPlugin (_iplugin.py) in the right directory (_plugins/xxx/yourPlugin.py_).
  Plugins can defines required configuration values (self.conf_requires) that must be set in conf.py (or can be optional). They can also defines required plugins (self.requires).
......
None
plugins.display.top_downloads
-----------------------------
plugins.display.all_visits
--------------------------
Display hook
Create TOP downloads page
Create All visits page
Plugin requirements :
post_analysis/top_downloads
None
Conf values needed :
max_downloads_displayed*
create_all_downloads_page*
display_visitor_ip*
Output files :
OUTPUT_ROOT/year/month/top_downloads.html
OUTPUT_ROOT/year/month/all_visits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.all_visits
--------------------------
plugins.display.referers
------------------------
Display hook
Create All visits page
Create Referers page
Plugin requirements :
None
post_analysis/referers
Conf values needed :
display_visitor_ip*
max_referers_displayed*
create_all_referers_page*
max_key_phrases_displayed*
create_all_key_phrases_page*
Output files :
OUTPUT_ROOT/year/month/all_visits.html
OUTPUT_ROOT/year/month/referers.html
OUTPUT_ROOT/year/month/key_phrases.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_hits
------------------------
plugins.display.top_visitors
----------------------------
Display hook
Create TOP hits page
Create TOP visitors block
Plugin requirements :
post_analysis/top_hits
None
Conf values needed :
max_hits_displayed*
create_all_hits_page*
display_visitor_ip*
Output files :
OUTPUT_ROOT/year/month/top_hits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.referers
------------------------
plugins.display.top_pages
-------------------------
Display hook
Create Referers page
Create TOP pages page
Plugin requirements :
post_analysis/referers
post_analysis/top_pages
Conf values needed :
max_referers_displayed*
create_all_referers_page*
max_key_phrases_displayed*
create_all_key_phrases_page*
max_pages_displayed*
create_all_pages_page*
Output files :
OUTPUT_ROOT/year/month/referers.html
OUTPUT_ROOT/year/month/key_phrases.html
OUTPUT_ROOT/year/month/top_pages.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_visitors
----------------------------
plugins.display.top_hits
------------------------
Display hook
Create TOP visitors block
Create TOP hits page
Plugin requirements :
None
post_analysis/top_hits
Conf values needed :
display_visitor_ip*
max_hits_displayed*
create_all_hits_page*
Output files :
OUTPUT_ROOT/year/month/top_hits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_pages
-------------------------
plugins.display.top_downloads
-----------------------------
Display hook
Create TOP pages page
Create TOP downloads page
Plugin requirements :
post_analysis/top_pages
post_analysis/top_downloads
Conf values needed :
max_pages_displayed*
create_all_pages_page*
max_downloads_displayed*
create_all_downloads_page*
Output files :
OUTPUT_ROOT/year/month/top_pages.html
OUTPUT_ROOT/year/month/top_downloads.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.post_analysis.top_downloads
-----------------------------------
plugins.pre_analysis.page_to_hit
--------------------------------
Post analysis hook
Count TOP downloads
Pre analysis hook
Change page into hit and hit into page into statistics
Plugin requirements :
None
Conf values needed :
None
page_to_hit_conf*
hit_to_page_conf*
Output files :
None
None
Statistics update :
month_stats:
top_downloads =>
uri
visits :
remote_addr =>
is_page
Statistics deletion :
None
plugins.post_analysis.top_hits
------------------------------
plugins.pre_analysis.robots
---------------------------
Post analysis hook
Pre analysis hook
Count TOP hits
Filter robots
Plugin requirements :
None
Conf values needed :
None
page_to_hit_conf*
hit_to_page_conf*
Output files :
None
None
Statistics update :
month_stats:
top_hits =>
uri
visits :
remote_addr =>
robot
Statistics deletion :
None
None
plugins.post_analysis.reverse_dns
---------------------------------
plugins.post_analysis.top_pages
-------------------------------
Post analysis hook
Replace IP by reverse DNS names
Count TOP pages
Plugin requirements :
None
Conf values needed :
reverse_dns_timeout*
None
Output files :
None
None
Statistics update :
valid_visitors:
remote_addr
dns_name_replaced
dns_analyzed
month_stats:
top_pages =>
uri
Statistics deletion :
None
plugins.post_analysis.top_pages
-------------------------------
plugins.post_analysis.reverse_dns
---------------------------------
Post analysis hook
Count TOP pages
Replace IP by reverse DNS names
Plugin requirements :
None
Conf values needed :
None
reverse_dns_timeout*
Output files :
None
None
Statistics update :
month_stats:
top_pages =>
uri
valid_visitors:
remote_addr
dns_name_replaced
dns_analyzed
Statistics deletion :
None
plugins.pre_analysis.page_to_hit
--------------------------------
plugins.post_analysis.top_hits
------------------------------
Pre analysis hook
Change page into hit and hit into page into statistics
Post analysis hook
Count TOP hits
Plugin requirements :
None
Conf values needed :
page_to_hit_conf*
hit_to_page_conf*
None
Output files :
None
None
Statistics update :
visits :
remote_addr =>
is_page
month_stats:
top_hits =>
uri
Statistics deletion :
None
plugins.pre_analysis.robots
---------------------------
plugins.post_analysis.top_downloads
-----------------------------------
Pre analysis hook
Post analysis hook
Filter robots
Count TOP downloads
Plugin requirements :
None
Conf values needed :
page_to_hit_conf*
hit_to_page_conf*
None
Output files :
None
None
Statistics update :
visits :
remote_addr =>
robot
month_stats:
top_downloads =>
uri
Statistics deletion :
None
docs/main.md
  Introduction
  ------------
- iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolothic project with everything in one big PERL file. In opposite, iwla has be though to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. Philosophy of iwla is : add, update, delete ! That's the job of each filters : modify statistics until final result. It's written in Python.
+ iwla (Intelligent Web Log Analyzer) is basically a clone of [awstats](http://www.awstats.org). The main problem with awstats is that it's a very monolothic project with everything in one big PERL file. In opposite, iwla has been though to be very modular : a small core analysis and a lot of filters. It can be viewed as UNIX pipes. Philosophy of iwla is : add, update, delete ! That's the job of each filter : modify statistics until final result. It's written in Python.
  Nevertheless, iwla is only focused on HTTP logs. It uses data (robots definitions, search engines definitions) and design from awstats. Moreover, it's not dynamic, but only generates static HTML page (with gzip compression option).
......
  * **display_hooks** : List of display hooks
  * **locale** : Displayed locale (_en_ or _fr_)
- Then, you can then iwla. Output HTML files are created in _output_ directory by default. To quickly see it go in output and type
+ Then, you can launch iwla. Output HTML files are created in _output_ directory by default. To quickly see it, go into _output_ and type
  python -m SimpleHTTPServer 8000
  Open your favorite web browser at _http://localhost:8000_. Enjoy !
- **Warning** : The order is hooks list is important : Some plugins may requires others plugins, and the order of display_hooks is the order of displayed blocks in final result.
+ **Warning** : The order in hooks list is important : Some plugins may requires others plugins, and the order of display_hooks is the order of displayed blocks in final result.
  Interesting default configuration values
  ----------------------------------------
  * **DB_ROOT** : Default database directory (default ./output_db)
- * **DISPLAY_ROOT** : Default HTML output directory (default ./output)
+ * **DISPLAY_ROOT** : Default HTML output directory (default _./output_)
  * **log_format** : Web server log format (nginx style). Default is apache log format
  * **time_format** : Time format used in log format
  * **pages_extensions** : Extensions that are considered as a HTML page (or result) in opposit to hits
  * **viewed_http_codes** : HTTP codes that are cosidered OK (200, 304)
- * **count_hit_only_visitors** : If False, doesn't cout visitors that doesn't GET a page but resources only (images, rss...)
+ * **count_hit_only_visitors** : If False, don't count visitors that doesn't GET a page but resources only (images, rss...)
  * **multimedia_files** : Multimedia extensions (not accounted as downloaded files)
  * **css_path** : CSS path (you can add yours)
  * **compress_output_files** : Files extensions to compress in gzip during display build
......
  * **Post analysis plugins** : Called after basic statistics computation. They are in charge to enlight them with their own algorithms
  * **Display plugins** : They are in charge to produce HTML files from statistics.
- To use plugins, just insert their name in _pre_analysis_hooks_, _post_analysis_hooks_ and _display_hooks_ lists in conf.py.
+ To use plugins, just insert their file name (without _.py_ extension) in _pre_analysis_hooks_, _post_analysis_hooks_ and _display_hooks_ lists in conf.py.
  Statistics are stored in dictionaries :
......
  Create a Plugins
  ----------------
- To create a new plugin, it's necessary to create a derived class of IPlugin (_iplugin.py) in the right directory (_plugins/xxx/yourPlugin.py_).
+ To create a new plugin, it's necessary to subclass IPlugin (_iplugin.py) in the right directory (_plugins/xxx/yourPlugin.py_).
  Plugins can defines required configuration values (self.conf_requires) that must be set in conf.py (or can be optional). They can also defines required plugins (self.requires).
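
Note on the plugin section of docs/main.md: the text says a plugin subclasses IPlugin and may declare self.requires and self.conf_requires. The sketch below illustrates that shape only. The IPlugin class here is a stand-in written for this example (the real one lives in iplugin.py and may differ), and the plugin name, the max_robots_displayed value, the hook method name and its visits argument are all hypothetical. The requirement string follows the "pre_analysis/robots" notation used in the module listings of this commit.

```python
class IPlugin(object):
    """Stand-in for the base class in iplugin.py (assumed shape, not the real code)."""

    def __init__(self, iwla):
        self.iwla = iwla
        self.requires = []       # plugins that must be loaded before this one
        self.conf_requires = []  # configuration values expected in conf.py


class CountRobots(IPlugin):
    """Hypothetical post-analysis plugin that counts visits flagged as robots."""

    def __init__(self, iwla):
        super(CountRobots, self).__init__(iwla)
        # Depends on the robots filter documented in this commit.
        self.requires = ['pre_analysis/robots']
        # Hypothetical configuration value, not part of iwla.
        self.conf_requires = ['max_robots_displayed']

    def hook(self, visits):
        # 'visits' maps remote_addr to a dict carrying a 'robot' flag,
        # mirroring the "visits : remote_addr => robot" entry in the docs.
        return sum(1 for info in visits.values() if info.get('robot'))


plugin = CountRobots(iwla=None)
sample_visits = {
    '192.0.2.1': {'robot': True},
    '192.0.2.2': {'robot': False},
}
print(plugin.requires, plugin.hook(sample_visits))
```

Following the documented layout, such a file would sit under _plugins/xxx/_ and be enabled by listing its file name, without the _.py_ extension, in the matching hooks list of conf.py.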
docs/modules.md
None
plugins.display.top_downloads
-----------------------------
plugins.display.all_visits
--------------------------
Display hook
Create TOP downloads page
Create All visits page
Plugin requirements :
post_analysis/top_downloads
None
Conf values needed :
max_downloads_displayed*
create_all_downloads_page*
display_visitor_ip*
Output files :
OUTPUT_ROOT/year/month/top_downloads.html
OUTPUT_ROOT/year/month/all_visits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.all_visits
--------------------------
plugins.display.referers
------------------------
Display hook
Create All visits page
Create Referers page
Plugin requirements :
None
post_analysis/referers
Conf values needed :
display_visitor_ip*
max_referers_displayed*
create_all_referers_page*
max_key_phrases_displayed*
create_all_key_phrases_page*
Output files :
OUTPUT_ROOT/year/month/all_visits.html
OUTPUT_ROOT/year/month/referers.html
OUTPUT_ROOT/year/month/key_phrases.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_hits
------------------------
plugins.display.top_visitors
----------------------------
Display hook
Create TOP hits page
Create TOP visitors block
Plugin requirements :
post_analysis/top_hits
None
Conf values needed :
max_hits_displayed*
create_all_hits_page*
display_visitor_ip*
Output files :
OUTPUT_ROOT/year/month/top_hits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.referers
------------------------
plugins.display.top_pages
-------------------------
Display hook
Create Referers page
Create TOP pages page
Plugin requirements :
post_analysis/referers
post_analysis/top_pages
Conf values needed :
max_referers_displayed*
create_all_referers_page*
max_key_phrases_displayed*
create_all_key_phrases_page*
max_pages_displayed*
create_all_pages_page*
Output files :
OUTPUT_ROOT/year/month/referers.html
OUTPUT_ROOT/year/month/key_phrases.html
OUTPUT_ROOT/year/month/top_pages.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_visitors
----------------------------
plugins.display.top_hits
------------------------
Display hook
Create TOP visitors block
Create TOP hits page
Plugin requirements :
None
post_analysis/top_hits
Conf values needed :
display_visitor_ip*
max_hits_displayed*
create_all_hits_page*
Output files :
OUTPUT_ROOT/year/month/top_hits.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.display.top_pages
-------------------------
plugins.display.top_downloads
-----------------------------
Display hook
Create TOP pages page
Create TOP downloads page
Plugin requirements :
post_analysis/top_pages
post_analysis/top_downloads
Conf values needed :
max_pages_displayed*
create_all_pages_page*
max_downloads_displayed*
create_all_downloads_page*
Output files :
OUTPUT_ROOT/year/month/top_pages.html
OUTPUT_ROOT/year/month/top_downloads.html
OUTPUT_ROOT/year/month/index.html
Statistics creation :
None
plugins.post_analysis.top_downloads
-----------------------------------
plugins.pre_analysis.page_to_hit
--------------------------------
Post analysis hook
Count TOP downloads
Pre analysis hook
Change page into hit and hit into page into statistics
Plugin requirements :
None
Conf values needed :
None
page_to_hit_conf*
hit_to_page_conf*
Output files :
None
None
Statistics update :
month_stats:
top_downloads =>
uri
visits :
remote_addr =>
is_page
Statistics deletion :
None
plugins.post_analysis.top_hits
------------------------------
plugins.pre_analysis.robots
---------------------------
Post analysis hook
Pre analysis hook
Count TOP hits
Filter robots
Plugin requirements :
None
Conf values needed :
None
page_to_hit_conf*
hit_to_page_conf*
Output files :
None
None
Statistics update :
month_stats:
top_hits =>
uri
visits :
remote_addr =>
robot
Statistics deletion :
None
None
plugins.post_analysis.reverse_dns
---------------------------------
plugins.post_analysis.top_pages
-------------------------------
Post analysis hook
Replace IP by reverse DNS names
Count TOP pages
Plugin requirements :
None
Conf values needed :
reverse_dns_timeout*
None
Output files :
None
None
Statistics update :
valid_visitors:
remote_addr
dns_name_replaced
dns_analyzed
month_stats:
top_pages =>
uri
Statistics deletion :
None
plugins.post_analysis.top_pages
-------------------------------
plugins.post_analysis.reverse_dns
---------------------------------
Post analysis hook
Count TOP pages
Replace IP by reverse DNS names
Plugin requirements :
None
Conf values needed :
None
reverse_dns_timeout*
Output files :
None
None
Statistics update :
month_stats:
top_pages =>
uri
valid_visitors:
remote_addr
dns_name_replaced
dns_analyzed
Statistics deletion :
None
plugins.pre_analysis.page_to_hit
--------------------------------
plugins.post_analysis.top_hits
------------------------------
Pre analysis hook
Change page into hit and hit into page into statistics
Post analysis hook
Count TOP hits
Plugin requirements :
None
Conf values needed :
page_to_hit_conf*
hit_to_page_conf*
None
Output files :
None
None
Statistics update :
visits :
remote_addr =>
is_page
month_stats:
top_hits =>
uri
Statistics deletion :
None
plugins.pre_analysis.robots
---------------------------
plugins.post_analysis.top_downloads
-----------------------------------
Pre analysis hook
Post analysis hook
Filter robots
Count TOP downloads
Plugin requirements :
None
Conf values needed :
page_to_hit_conf*
hit_to_page_conf*
None
Output files :
None
None
Statistics update :
visits :
remote_addr =>
robot
month_stats:
top_downloads =>
uri
Statistics deletion :
None
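
Note on hook ordering: the docs touched by this commit warn that some plugins require others, and the module listings give each display plugin's requirement. A minimal sketch of how a conf.py hooks setup could be checked against those requirements follows. The three list names come from the docs. The particular selection of plugins, the DISPLAY_REQUIRES table and the missing_requirements helper are written for this example and are not part of iwla.

```python
# Hypothetical conf.py excerpt: plugin file names without the .py extension.
pre_analysis_hooks = ['page_to_hit', 'robots']
post_analysis_hooks = ['reverse_dns', 'referers', 'top_pages', 'top_hits', 'top_downloads']
display_hooks = ['top_visitors', 'all_visits', 'referers', 'top_pages', 'top_hits', 'top_downloads']

# Requirements copied from the "Plugin requirements" entries in the listings above.
DISPLAY_REQUIRES = {
    'referers': ['post_analysis/referers'],
    'top_pages': ['post_analysis/top_pages'],
    'top_hits': ['post_analysis/top_hits'],
    'top_downloads': ['post_analysis/top_downloads'],
}


def missing_requirements(display, post_analysis):
    """Return (display plugin, requirement) pairs whose requirement is not enabled."""
    enabled = set('post_analysis/' + name for name in post_analysis)
    missing = []
    for plugin in display:
        for requirement in DISPLAY_REQUIRES.get(plugin, []):
            if requirement not in enabled:
                missing.append((plugin, requirement))
    return missing


print(missing_requirements(display_hooks, post_analysis_hooks))
```

Display plugins listed with "None" as their requirement (all_visits, top_visitors) are simply absent from the table, so they never produce an entry.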
