Webalizer-ER

2005-Mar05:  I installed version 2.01-10 of The Webalizer, a web-stats-analyzer by Bradford L Barrett, then proceeded to make several modifications to it, in order to:
• get dates in a sensible format (year-month-day)
• have the names of all output files begin with a common prefix "wa-"
• have the index in chronological order, rather than reverse-chronological
• improve the link colours
• replace the term "Sites" with "Visitors"
• in the Usage-by-Country table, show the 2-letter country-code as well as the full name
• label the Usage-Summary-Graph with the Year-Month-range it covers

I also wanted support for more than 12 months of stats, but made do with a bash script that combines the newly produced Index page with the previous such page.  The result is a page containing a 12-month graph, followed by a table with 12 months of data and links, then another such pair for the next year, and so on.  One flaw with this method is that the separate 12-month graphs are not plotted to the same scale.

2008-Dec28:  I learned that the original author has implemented his version of more-than-12-month support, so I installed that version (2.20-03), refitted the mods described above, and made a few more:
• produce Yearly totals
• add a ROBOTS NOFOLLOW meta-tag, badly needed in older versions, less so in this one as it has non-links or rel-nofollow links in the Referrer-table
• fix the colours in the Usage-Summary-Graph, which became progressively more greyish as more months were graphed (this became serious for graphs of more than 36 months)

2009-Sep:  The "Search-Strings" part of the webalizer reports, in its as-distributed configuration, has been deteriorating for years, and has become essentially useless.  A partial solution is to update the SearchEngine lines in /etc/webalizer.conf for new and revised search-engines, and I've done that in the supplied webalizer.conf.sample.  A better solution is described below.

2009-Oct:  Before switching from month-at-a-time webalizing to daily webalizing, I ran a comparison: (1) doing an entire month with a separate invocation for each day; (2) doing the same month all at once.  I first made copies of the webalizer-state (its hist & current files), so as to be able to restore it prior to the 2nd run.  The results are mildly alarming: for each stat except Monthly Total Visitors, the incremental results are 3-4% lower.  It turns out to be due to webalizer treating out-of-order records differently when invoked incrementally, in which case it doesn't honour the -f option.  Perhaps this behaviour is a kluge to make it ignore cases where the same day's logrecords are fed in more than once.  Incidentally, all webalizer runs in this test used the exact same options, including -f and -p, so even my non-incremental run was done in incremental-mode.  I'm starting to use daily webalizing despite the flaw.
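To gauge how exposed a given log is to this flaw, one could count its out-of-order records before webalizing it.  The following is only a rough Python sketch, not part of webalizer or of webalizer-ER: it assumes Common/Combined Log Format timestamps, and the function name count_out_of_order is made up for illustration.  It does not reproduce webalizer's own handling of such records.

```python
import re
from datetime import datetime

# Timestamp field of Common/Combined Log Format,
# e.g. [05/Oct/2009:13:55:36 -0600]  (assumed log format)
TS_RE = re.compile(r'\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]')

def count_out_of_order(lines):
    """Return (total, out_of_order) for an iterable of log lines.

    A record counts as out-of-order when its timestamp is earlier
    than the latest timestamp seen so far.
    """
    total = 0
    out_of_order = 0
    latest = None
    for line in lines:
        m = TS_RE.search(line)
        if not m:
            continue  # skip lines without a recognisable timestamp
        ts = datetime.strptime(m.group(1), '%d/%b/%Y:%H:%M:%S %z')
        total += 1
        if latest is not None and ts < latest:
            out_of_order += 1
        else:
            latest = ts
    return total, out_of_order
```

It could be run as count_out_of_order(open('access.log')), where access.log is a placeholder name.  A high out-of-order count suggests that the daily and monthly results will differ more.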

Download:  webalizer-ER.zip

For questions, suggestions, etc about my version, contact ereimer@shaw.ca;  the original author can be contacted via his website (although in my experience he doesn't answer).

Some websites with stats by webalizer-ER:
www.nativeorchid.org/wa-stats
www.debwendon.org/wa-stats
stats on this website


Weblog-search-strings-report

2009-Sep:  Updating the SearchEngine lines in /etc/webalizer.conf is only a partial solution.  Some search-engines, most notably Google-Images, cannot be handled without a rewrite of that part of webalizer.  Are you wondering why your webalizer report shows implausible strings such as "-" or "p-" amongst the searches supposedly leading people to your site?  (An example.)  The answer is that these are due to bugs in webalizer.  Trying to fix code with bugs like that would likely be a waste of time, at least for someone other than the author; a rewrite is indicated.  I don't expect to do such a rewrite, at least not in the near future.  However, I've written an alternative program to do Search-String reporting, whose output serves as a replacement for that section of a webalizer-report.  Besides handling more search-engines, it also distinguishes between "Web" and "Image" searches, another thing I've always wished the webalizer did.

The Search-Strings report (Top-5) for www.nativeorchid.org for 2009-June as made by webalizer (configured as distributed):

Top 5 of 1029 Total Search Strings
# Hits Search String
1 146 7.80% pictures of chigger bites
2 34 1.82% -
3 32 1.71% seneca root
4 29 1.55% p-
5 28 1.50% seed pod identification


The output of weblog-search-strings-report when run on the exact same logrecords:

Top 5 of 2239 Distinct, 5905 Total Search Strings
# Hits Search String
1 366 img:pictures of chigger bites
2 246 web:orchids
3 135 img:georgian bay
4 78 web:"catherine wishart tract"
5 61 img:wreck
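
The "web:" and "img:" labels in this report come from classifying each referrer by the kind of search that produced it.  The Python sketch below illustrates the idea only; it is not the code of weblog-search-strings-report.  It handles only Google, and it assumes that Google-Images referrers of this period use the path /imgres and carry the search terms nested inside a URL-encoded "prev" parameter.  The function name search_string is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

def search_string(referrer):
    """Return 'web:<terms>' or 'img:<terms>' for a Google referrer, else None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    if 'google.' not in host:
        return None  # crude host test; only Google is handled in this sketch
    params = parse_qs(parts.query)
    if parts.path.startswith('/imgres'):
        # Assumption: the image-search terms are nested inside "prev",
        # e.g. prev=/images?q=seneca+root&hl=en (URL-encoded in the referrer).
        prev = params.get('prev', [''])[0]
        nested = parse_qs(urlsplit(prev).query)
        terms = nested.get('q', [''])[0]
        kind = 'img'
    else:
        # Ordinary web search: the terms are in the top-level "q" parameter.
        terms = params.get('q', [''])[0]
        kind = 'web'
    terms = terms.strip().lower()
    return f'{kind}:{terms}' if terms else None
```

Tallying the returned strings over all referrers in a log (for instance with collections.Counter) would give a ranked list of the same shape as the report above.  A webalizer SearchEngine line cannot express the nested "prev" lookup, since it names only a single query variable.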

Download:  weblog-search-strings-report