Webmaster Tools

2001-10:  I took over as webmaster for NativeOrchid.Org and wrote the first of these tools.  The most basic one is webput, which uploads, by FTP, all webpages that have changed in my local copy of the websites I look after.  It has undergone many changes over the years, mostly because no two webhosting outfits ever do things the same way, and the webhosting for NativeOrchid.Org has changed several times.  Webput is now designed for the file-structure one sees at HostExcellence.Com (which will be very similar to what's seen at most any modern webhoster), and for redirects via .htaccess.  It used to do redirects with META-REFRESH tags, which work only for HTML files but do work at a webhoster who doesn't let you supply the Apache .htaccess file.
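The idea of uploading only what has changed can be sketched as follows.  This is a minimal illustration and not webput's actual code: the function name list_changed_files and the stamp-file convention are assumptions made for the example.  It lists the files modified since a marker file was last touched; an upload step would then work through that list and touch the marker afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from webput): list the files in a local
# site copy that were modified after a "last upload" stamp file.

list_changed_files() {
    local site_dir=$1 stamp=$2
    if [ -f "$stamp" ]; then
        # Only files modified more recently than the stamp file.
        find "$site_dir" -type f -newer "$stamp" | sort
    else
        # No stamp yet: treat every file as changed (first upload).
        find "$site_dir" -type f | sort
    fi
}
```

Usage might look like `list_changed_files ~/websites/mysite ~/.lastupload`, followed by `touch ~/.lastupload` once the upload has succeeded; both paths are placeholders.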

The weblinkcheck.cron script is another important labour-saver; it automates the checking of links, a task that is painfully tedious when done manually.  It is something that too few webmasters ever do, whether with automated or manual methods, judging by the broken links one encounters on the world-wide-web.
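To give a feel for what automated link-checking involves, here is a minimal sketch.  It is not the code of WEBLINKCHECK-LL; the function names classify_status and check_url are made up for the example, and it assumes curl is installed.  It asks the server for the headers of one URL and sorts the HTTP status code into ok, redirected or broken.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from WEBLINKCHECK-LL): check one link.

# Map an HTTP status code to a coarse verdict.
classify_status() {
    case $1 in
        2??) echo ok ;;
        3??) echo redirected ;;
        *)   echo broken ;;      # 4xx, 5xx, or 000 when no response arrived
    esac
}

# Request only the headers of a URL and print "<verdict> <code> <url>".
check_url() {
    local url=$1 code
    code=$(curl -s -o /dev/null -I --max-time 20 -w '%{http_code}' "$url")
    echo "$(classify_status "$code") $code $url"
}
```

Some servers answer header-only (HEAD) requests differently from ordinary requests, so a "broken" verdict from a sketch like this is worth confirming with a normal fetch before editing the page.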

An entirely different tool is my sendtoElist script.  There are many commercial products that solve the problem of managing an "elist" and sending periodic emails to the email-addresses on such a list; here, however, is a simple command-line tool that solves the same problem.
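The basic loop behind such a tool can be sketched in a few lines.  This is an illustration under assumptions, not sendtoElist itself: the function name send_to_elist, the DRY_RUN variable and the one-address-per-line file format are invented for the example, and it assumes a mail command that accepts -s for the subject.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from sendtoElist): send one message body
# to every address in a list file, one address per line.

send_to_elist() {
    local list_file=$1 subject=$2 body_file=$3 addr
    while IFS= read -r addr || [ -n "$addr" ]; do
        # Skip blank lines and "#" comment lines.
        case $addr in ''|'#'*) continue ;; esac
        if [ -n "${DRY_RUN:-}" ]; then
            echo "would send to: $addr"
        else
            # mail reads the message from the body file, not from the list.
            mail -s "$subject" "$addr" < "$body_file"
        fi
    done < "$list_file"
}
```

Running it first as `DRY_RUN=1 send_to_elist members.txt "Newsletter" body.txt` (file names are placeholders) shows who would receive the message without sending anything.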

ertools.zip -- zipfile containing all of the webmaster-tools, the general-purpose scripts, and the other scripts I've published on this site, for easy installation.  See README.htm for more info on this collection of tools and how to install them, and see Unix-tools for Windows for how to install them on Windows.

Here are the individual shell (bash) scripts:

webput -- FTP-upload of modified files to remotely-hosted website
WEBPUT-LL -- FTP-upload of modified files to remotely-hosted website -- the low-level part
WEBVARS -- needed by WEBPUT-LL
webclean -- needed by webput
webmksubsets -- needed by webput
webnavreplicate -- needed by webput
sitemap-gen -- needed by webput
webcp -- needed by webput (ER-specific)
genByCaption-ER -- needed by webput (ER-specific)
genOrchidsBySpecies-ER -- needed by webput (ER-specific)
genFrontpageCandidates-ER -- needed by webput (ER-specific)
INSTALL-new-version-of-nopercart -- needed by webput (ER-specific)
mk-webalizer-ER-distribution -- needed by webput (ER-specific)
webmv -- needed by rename with --web option (ER-specific)
WEBLINKCHECK-LL -- check for broken links in a webpage (general-purpose)
weblinkcheck.cron -- cron-task to find and repair broken and redirected links in websites; an example of how to use WEBLINKCHECK-LL
webcommentcheck -- check webpages for HTML-comments containing "--" (general-purpose)
WEBGETLOGS.cron -- cron-task to download access-log files
FTP -- needed by WEBGETLOGS.cron
404report -- needed by WEBGETLOGS.cron
webalize-auto -- invoked from WEBGETLOGS.cron
webalize -- needed by webalize-auto
bk-auto.cron -- cron-task to do daily backups
bkall -- needed by bk-auto.cron
bk -- needed by bkall
bk-restore -- restore file(s) backed up by bk
sendtoElist -- send supplied email to each member of supplied elist
generateElistFromHISTORY+MBRS -- needed by sendtoElist
subsetMBRStoSMALL -- needed by sendtoElist
fixPeggyMBRS -- convert XL to HTML in one-line-per-row form
countMBRS -- count members in membership-db
ListSubtract-e -- set-difference for an elist
harvest-emailids -- a "bot" to harvest email-addresses for targeted email-campaign
uniquify-emailidlist -- eliminate duplicates in an elist
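
The two elist housekeeping operations in the list above, set-difference and duplicate removal, can be sketched with standard tools.  These are not the published ListSubtract-e or uniquify-emailidlist scripts; the function names and the assumption of one address per line, compared case-insensitively, are mine for the sake of the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketches (not taken from ListSubtract-e or uniquify-emailidlist).

# Print the addresses in list A that do not appear in list B.
# -F fixed strings, -x whole-line match, -i ignore case, -v invert.
elist_subtract() {
    local list_a=$1 list_b=$2
    grep -v -x -i -F -f "$list_b" "$list_a"
}

# Print each address once, keeping the first spelling seen and the
# original order; comparison ignores upper/lower case.
elist_uniquify() {
    awk '!seen[tolower($0)]++' "$1"
}
```

Note that grep exits with a non-zero status when the subtraction leaves nothing, which matters if the calling script runs under `set -e`.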

Most of these scripts will require some customization before being useful for someone else, and some are so highly dependent on features of my own website that it's hard to imagine them being useful to anyone else (these are labelled "ER-specific"). 

Some of these scripts use regular-expressions via grep and sed to select and modify HTML and will only function correctly on HTML written in a simplified subset of HTML.  The comments will explain limitations of that sort -- please let me know if I've missed any. 
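
As an illustration of the kind of limitation meant here, consider this minimal sketch of pulling link targets out of a page with grep and sed.  It is not code from any of the scripts above, and the function name extract_hrefs is made up.  It only finds href attributes that are double-quoted and sit entirely on one line; single-quoted or unquoted values, and attributes split across lines, are silently missed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the href values in an HTML file.
# Works only for href="..." written with double quotes on a single line.

extract_hrefs() {
    # grep -o prints each match on its own line; -i accepts HREF as well.
    grep -o -i 'href="[^"]*"' "$1" |
        # Strip everything up to the opening quote, then the closing quote.
        sed 's/^[^"]*"//; s/"$//'
}
```

On a page written in that simple style, `extract_hrefs index.htm` prints one link target per line, ready to be fed to a checker.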


Send your questions, suggestions, corrections to ereimer@shaw.ca.