WASD Hypertext Services - Technical Overview

20 - Utilities

20.1 - Echo Facility
20.2 - Where Facility
20.3 - Xray Facility
20.4 - ApacheBench
20.5 - HTTPd Monitor
20.6 - MD5digest
20.7 - QDLogStats
20.8 - Scrunch Utility (obsolete)
20.9 - StreamLF Utility
20.10 - Server Workout (stress-test)
[next] [previous] [contents] [full-page]

Foreign commands for external utilities (and the HTTPD control functionality) will need to be assigned from the administration users' LOGIN.COM, either explicitly or by calling the HT_ROOT:[EXAMPLE]WASDVERBS.COM procedure. 

  $ AB == "$HT_EXE:AB"
  $ HTTPD == "$HT_EXE:HTTPD"
  $ HTTPDMON == "$HT_EXE:HTTPDMON"
  $ MD5DIGEST == "$HT_EXE:MD5DIGEST"
  $ QDLOGSTATS == "$HT_EXE:QDLOGSTATS"
  $ STREAMLF == "$HT_EXE:STREAMLF"
  $ WWWRKOUT == "$HT_EXE:WWWRKOUT"


20.1 - Echo Facility

Ever had to go to extraordinary lengths to find out exactly what your browser is sending to the server?  The server provides a request echo facility.  This merely returns the complete request as a plain-text document.  It can be used for checking the request header lines being provided by the browser, and can be valuable in the diagnosis of POSTed forms, etc. 

This facility must be enabled through a mapping rule entry. 

  script /echo/* /echo/*

It may then be used with any request merely by inserting "/echo" at the start of the path, as in the following example. 

  http://wasd.dsto.defence.gov.au/echo/ht_root/
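
As a concept demonstration only (WASD implements this inside the server; none of the Python names below are part of WASD), the essence of the facility can be sketched as a tiny handler that returns the raw request line and header fields as a text/plain document:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    # Return the raw request line and header fields as plain text.
    def do_GET(self):
        body = (self.requestline + "\r\n" + str(self.headers)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet

# Serve on an ephemeral port and echo one request back to ourselves.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
request = urllib.request.Request(
    "http://127.0.0.1:%d/echo/ht_root/" % server.server_port,
    headers={"X-probe": "hello"})
echoed = urllib.request.urlopen(request).read().decode()
server.shutdown()
```

The echoed text begins with the request line itself, followed by every header field exactly as the client supplied it.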


20.2 - Where Facility

Need to locate where VMS has the HTTPd files?  This simple facility maps the supplied path then parses it to obtain a resulting VMS file specification.  This does not demonstrate whether the path actually exists! 

This facility must be enabled through a mapping rule entry. 

  script /where/* /where/*

It may then be used with any request merely by inserting "/where" at the start of the path, as in the following example. 

  http://wasd.dsto.defence.gov.au/where/ht_root/


20.3 - Xray Facility

The Xray facility returns a request's complete response, both header and body, as a plain text document.  Being able to see the internals of the response header as well as the contents of the body rendered in plain text can often be valuable when developing scripts, etc. 

This facility must be enabled through a mapping rule entry. 

  script /Xray/* /Xray/*

It may then be used with any request merely by inserting "/xray" at the start of the path, as in the following example. 

  http://wasd.dsto.defence.gov.au/xray/ht_root/


20.4 - ApacheBench

This server stress-test and benchmarking tool, as used in the Apache distribution, is included with the WASD package (sourced from http://webperf.zeus.co.uk/ab.c), within its license conditions. 

  Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd.
  Copyright (c) 1998 The Apache Group.

ApacheBench will only compile and run for Alpha or VAX systems with VMS 7.n or greater available.  It is a simple but effective tool, allowing a single resource to be requested from a server a specified number of times and with a specified concurrency.  This can be used to benchmark a server or servers, or to stress-test a server configuration's handling of variable loads of specific requests (before exhausting process quotas, etc.)

A small addition to functionality has been made.  The WASD ApacheBench displays a count of the HTTP response categories received (i.e. the number of 2xx, 4xx responses, etc.)  This allows easier assessment of the relevance of results (i.e. measuring performance of some aspect only to find the results showed the performance of 404 message generation - and yes, an annoying experience of the author's prompted the changes!)

The following examples illustrate its use. 

  $ AB -H
  $ AB -C 10 -N 100 http://the.server.name/ht_root/exercise/0k.txt
  $ AB -C 50 -N 500 -K http://the.server.name/ht_root/exercise/64k.txt
  $ AB -C 10 -N 100 http://the.server.name/cgi-bin/cgi_symbols
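
The category-count addition can be illustrated with a short sketch (the function name is invented; ApacheBench itself is written in C and does this tallying internally):

```python
from collections import Counter

def tally_categories(status_codes):
    # Map each HTTP status to its category: 200 -> "2xx", 404 -> "4xx".
    return Counter("%dxx" % (code // 100) for code in status_codes)
```

For example, tally_categories([200, 200, 404, 500]) reports two 2xx, one 4xx and one 5xx response, making an accidental benchmark of 404 message generation immediately obvious.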


20.5 - HTTPd Monitor

The HTTP server may be monitored in real-time using the HTTPDMON utility. 

[graphic]  HTTPDMON Graphic

This utility continuously displays a screen of information comprising three or four of the following sections:

  1. Process Information
    HTTPd process information includes its up-time, CPU-time consumed (excluding any subprocesses), I/O counts, and memory utilization.  The "Servers:" item shows how many servers are currently running on the node/cluster.  Changes in this count are indicated by the second, parenthesized number. 
  2. General Server Counters
The server counters keep track of the total connections received, accepted, rejected, etc., and totals for each request type (file transfer, directory listing, image mapping, etc.).
  3. Proxy Serving Counters
    The server counters keep track of proxy serving connections, network and cache traffic, cache status, etc. 
  4. Latest Request
This section provides the response status code, some transaction statistics, the service being accessed, the originating host, and the HTTP request.  Note that long request strings may be truncated (indicated by a bolded ellipsis). 
  5. Status Message
If the server is in an exceptional condition, for example exited after a fatal error, starting up, etc., a textual message may be displayed in place of the request information.  This may be used to initiate remedial actions, etc. 

The following shows example output (after WWWRKOUT server testing):

   WASD:: 1         HTTPDMON v2.0.0 AXP        Saturday, 12-MAY-2001 11:26:18
 
 Process: HTTPd:80  PID: 2020022F  User: HTTP$SERVER  Version: 7.2
      Up: 0 21:05:51.88  CPU: 0 07:34:19.99  Startup: 1
 Pg.Flts: 3526  Pg.Used: 37%  WsSize: 6402  WsPeak: 4729
     AST:  196/200   BIO: 1003/1020  BYT: 202270/202720  DIO: 498/500
     ENQ: 1989/2000  FIL:  298/300   PRC:    100/100      TQ:  26/30
 
 Request: 184865  Current:16  Throttle: 0/0/0%  Peak: 29
  Accept: 178212  Reject:0  Busy:0  SSL: 17 
 CONNECT: 0  GET: 164717  HEAD: 20148  POST: 0  PUT: 0
   Admin: 0  Cache: 48/39992/0  DECnet: 3327/6655  Dir: 6670
     DCL: CLI:16730 CGI:9987 CGIplus:13310/13182 RTE:16/15 Processes: 168/5
    File: 3384/0  IsMap: 0  Proxy: 64911  Put: 0  SSI: 16657  Upd: 9988
 
     1xx: 0  2xx: 157816  3xx: 3  4xx: 18496 (403:27)  5xx: 1864  (none:0)
      Rx: 19,762,345  Tx: 1,053,847,437  (bytes)
 
   Proxy: enabled
 CONNECT: 0  GET: 58720  HEAD: 6191  POST: 0  PUT: 0
     Not: Cacheable Request:14555 Response:23786
 Network: Rx:118,463,055 Tx:7,208,716 (29%)  Accessed: 38356
  Lookup: Numeric:19 DNS:21 Cache:38317 Error:1
   Cache: Rx:2,863,617 Tx:320,164,012 (71%)  Read:26554/3 Write:10
  Device: DKA0: 4110480 blocks (2007MB)  Space: available
          2171744 used (1060MB 53%), 1938736 free (946MB 47%)
   Purge: 18 00:00:54, 98 files (2254/2412), 0 deleted (0/0)
 
    Time: 12 11:25:11  Status: 200  Rx: 95  Tx: 34479  Duration: 0.2400
 Service: beta.dsto.defence.gov.au:80  
    Host: beta.dsto.defence.gov.au (131.185.250.201)
 Request: GET /ht_root/exercise/64k.txt

The "/HELP" qualifier provides a brief usage summary. 

The server counter values are carried over when a server (re)starts (provided the system has stayed up).  To reset the counters use the on-line server administration menu (see 15 - Server Administration). 

If [DNSlookup] is disabled for the HTTP server the HTTPDMON utility attempts to resolve the numeric address into a host name.  This may be disabled using the /NORESOLVE qualifier. 


20.6 - MD5digest

From RFC1321 ...

" The [MD5] algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input.  It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest.  "

The MD5DIGEST utility is primarily provided with WASD for verifying kits as unchanged from the originals released.  With the proliferation of mirror sites and other distribution resources it has become good practice to ensure kits remain unchanged from release, to distribution, to installation site (changes due to data corruption or malicious intent - as remote a possibility as that may seem).  Of course it may also be used for any other purpose where the MD5 hash is useful. 

For verifying the contents of a WASD release, connect to the original WASD distribution site, refer to the download page, and compare the release MD5 hash found in the list of all archive hashes with the MD5 hash of your archive.  That can be done as follows:

  $ MD5DIGEST == "$HT_EXE:MD5DIGEST"
  $ MD5DIGEST device:[dir]archive.ZIP
The result will look similar to:
  MD5 (kits:[000000]htroot710.zip;1) = 404bbdfe0f847c597b034feef2d13d2d

Of course, if you have not yet installed your first WASD distribution, using the MD5DIGEST utility that is part of it is not feasible.  The original site can provide kits and pre-built executables for this purpose. 
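
Where the utility itself is not yet available, the same hash can be computed with any MD5 implementation; for instance, a minimal sketch using Python's standard hashlib (the function name is illustrative only):

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    # Hash the file in chunks so arbitrarily large archives fit in memory.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hexadecimal string can be compared directly against the hash published on the download page.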


20.7 - QDLogStats

Quick-and-Dirty LOG STATisticS is a utility to extract very elementary statistics from Web server common/combined format log files.  It is intended for those moments when we think "I wonder how many times that new archive has been downloaded? ", "How much data was transferred during November? ", "How often is such-and-such a client using the authenticated so-and-so service? ", "How much has the mail service been used? " ... and want the results in a matter of seconds (or at least a few tens of seconds ;^) It is available at the command-line and as a CGI script. 

For QDLOGSTATS to be available as a CGI script it must have authorization enabled against it (to prevent potential ad hoc browsing of a site's logs).  The following provides some indication of this configuration, although of course it requires tailoring for any given site. 

  [VMS]
  /cgi-bin/qdlogstats ~webadmin,131.185.250.*,r+w ;

It could then be accessed using

  http://the.host.name/cgi-bin/qdlogstats

The initial access provides a form allowing the various filters and other behaviours to be selected.  The CGI form basically parallels the command-line behaviour described below. 


Filters

A number of filters allow subsets of the log contents to be selected.  These filters are simple "sort-of-regular" expressions, deliberately not case-sensitive, and can contain wildcards (such as asterisks (*), question marks (?), and percent signs (%)) as well as "semi-regular" expressions (such as the range [a-z]).  THERE IS NO WAY (that I know of) TO ESCAPE THESE RESERVED CHARACTERS!  (This functionality uses the decc$match_wild() function.)  All matches are made by string pattern matching, hence a query /AFTER=01-NOV cannot be done.  Of course date pattern matching can! 

Special constructs allow more complex expressions to be built up.  Combinations of required and excluded strings may be specified in the one expression.  When a string begins with a "+{" it must be present for the record not to be filtered out.  If it begins "-{" then it must not be present.  Such specifications must be terminated with a matching closure "}".

A knowledge of the format and contents of the common and combined log formats will assist in deciding which filters should be used, and to what purpose.  Record filtering is done in the same order as is finally displayed, so method would be processed before user-agent, for instance.  Normally record matching terminates on the first non-matched filter (to expedite processing).  To compare and report each filter for every record apply the /ALL qualifier.  To view records as they are processed use the /VIEW qualifier.  By default this displays all matched records, but the optional =ALL or =NOMATCH parameters will display all records, or all but the matches. 
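
The required/excluded construct can be sketched in Python (a simplified, hypothetical model: it tests for case-insensitive substrings, whereas the utility applies its full wildcard matching inside the braces):

```python
import re

def passes(record, expression):
    # "+{text}" requires the text to be present; "-{text}" forbids it.
    record = record.lower()
    for sign, text in re.findall(r'([+-])\{(.*?)\}', expression):
        present = text.lower() in record
        if (sign == '+' and not present) or (sign == '-' and present):
            return False
    return True
```

So passes(record, '+{GET}-{robot}') accepts only records that mention GET and do not mention robot.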

$ QDLOGSTATS log-file-spec [pattern qualifiers] [other qualifiers]

/ALL compare and report on all supplied filters
/AUTHUSER= pattern (any authenticated username)
/CLIENT= pattern (client host name or IP address)
/DATETIME= pattern ("11/Jun/1999:14:08:49 +0930")
/METHOD= pattern (HTTP "GET", "POST", etc.)
/OUTPUT= file specification
/PATH= pattern (URL path component only)
/PROGRESS show progress during processing
(a "+" for each file started, a "." for each 1000 records processed)
/QUERY= pattern (URL query component only)
/REFERER= pattern (HTTP "Referer:" field, COMBINED only)
/USERAGENT= pattern (HTTP "User-Agent:" field, COMBINED only)
/VIEW[=type] display matching log records (ALL, NOMATCH, MATCH)
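
The pattern matching itself can be approximated with Python's fnmatch module (an illustrative stand-in for decc$match_wild(); "%" matches a single character in the utility, so it is translated to fnmatch's "?", and lower-casing supplies the case insensitivity):

```python
import fnmatch

def matches(record, pattern):
    # Case-insensitive wildcard match in the spirit of decc$match_wild().
    return fnmatch.fnmatchcase(record.lower(),
                               pattern.replace("%", "?").lower())

def filter_by_path(records, path_pattern):
    # Select common-format records whose request path satisfies the pattern.
    return [r for r in records if matches(r, '* "* ' + path_pattern + ' *" *')]
```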


Examples


20.8 - Scrunch Utility (obsolete)

SCRUNCH Obsolete with 7.2

Changes with server-internal SSI document handling have made the SCRUNCH utility obsolete for WASD versions 7.2 and later.  Previously SCRUNCHed documents will continue to be processed without needing to be explicitly UNSCRUNCHed.  This description is maintained for historical reasons only. 

The "scrunch"er is used to decrease the processing overhead of Server-Side Includes (SSI) files.  See the "Environment Overview" for more detail on this utility. 

A scrunched SSI file is a variable length record file, where each record either comprises a single SSI directive, with the "<!--#" beginning on the record boundary, or a record of other text to be output (i.e. not beginning with "<!--#").

Why do this?  Well, if all SSI directives begin on a record boundary you only have to check the first five characters of each record to establish whether it should be interpreted or directly output!  This saves checking every character of every record for the opening "<" and the following "!--#".

Files that have been scrunched are basically unsuitable for editing (only due to the often inappropriately sized records).  Previously scrunched files may be returned to something (often exactly) resembling their original condition using the /UNSCRUNCH qualifier. 
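
The idea can be sketched as follows (an illustrative model only; the real utility reads and writes RMS variable-length record files):

```python
import re

def scrunch(text):
    # Split the SSI source so each "<!--#...-->" directive becomes its
    # own record, beginning on a record boundary.
    return [r for r in re.split(r'(?s)(<!--#.*?-->)', text) if r]

def is_directive(record):
    # The payoff: a five-character test decides each record's fate.
    return record[:5] == "<!--#"
```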


20.9 - StreamLF Utility

This utility converts VARIABLE format files to STREAM_LF.  The WASD HTTPd server accesses STREAM_LF files in block/IO-mode, far more efficiently than the record-mode required by variable-record format files.  Use "STREAMLF/HELP" for some basic usage information. 

NOTE: The server can also be configured to automatically convert any VARIABLE record format files it encounters to STREAM_LF. 


20.10 - Server Workout (stress-test)

The WWWRKOUT ("World Wide Web Workout") utility exercises an HTTP server, both in the number of concurrent connections maintained and in the number of back-to-back sequential connection requests and data transfers. 

This utility can be used to stress-test the WASD VMS HTTP server (or any other), or to make comparisons between it and other servers.  When stress-testing a server, evaluating performance or just using it to try and break a test-bed server, it is best used concurrently from multiple, autonomous systems. 

It sets up and maintains a specified number of concurrent connections to a server.  It reads a buffer of data from each connection in turn, where data is waiting (it does not block), until the document transfer is complete and the connection is closed by the server.  It then closes the local end and immediately reuses the now-free socket to initiate another sequence.  If enabled (it is by default), the utility attempts to reflect the real world in varying the data transfer rate for each connection, by setting the number of bytes read during each read loop differently for each connection.  All transferred data is discarded. 

The data transfer rate for each connection is displayed at connection close.  By default this is the effective transfer rate, that is the rate from opening the connection to closing it, and so includes request processing time, etc.  If the "/NOEFFECTIVE" qualifier is employed it measures the document data transfer rate only. 
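
The distinction is simple arithmetic (names invented for illustration):

```python
def rates(bytes_rx, t_open, t_first_byte, t_close):
    # Effective rate spans connection open to close (includes request
    # processing); data-only rate spans the transfer itself.
    return bytes_rx / (t_close - t_open), bytes_rx / (t_close - t_first_byte)
```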

Although a single document path may be specified on the command line it is preferable to supply a range of document paths, one per line in a plain text file.  Each document path and/or type specified should be different to the others, to exercise the server and file system cache.  Any number of paths may be specified in the file.  If the file is exhausted before the specified number of connections have been established the file contents are recycled from the first path.  If a path or a file of paths is not specified the utility just continually requests the welcome document. 

To assess a server's total throughput choose paths that lead to large documents (> 50K), where the overhead of connection setup, rule processing and transfer initiation are less significant than the data transfer itself.  The buffer size variation functionality should be disabled using the "/NOVARY" qualifier when assessing data transfer rates.  Responsiveness is better assessed using small documents (< 2K), where the overhead of the non-data-transfer activities is more significant. 

$ WWWRKOUT [server_host_name[:port]] [path] [qualifiers]

/[NO]BREAK will randomly break a small number of connections during the data transfer (tests server recovery under those circumstances)
/BUFFER= number of bytes to be read from server each time (default is 1024, will be modified by the default "/VARY" qualifier)
/COUNT= total number of connections and requests to be done (default is 100)
/[NO]EFFECTIVE measures data transfer rate from request to close (if "/NOEFFECTIVE" is applied the rate is measured during data transfer only)
/FILEOF= file name containing paths to documents
/HELP display brief usage information
/OUTPUT= file name for output
/PATH= single path to document to be retrieved
/PORT= IP port number of HTTP server (default is 80)
/PROXY= host name and port of proxy server
/SERVER= HTTP server host name
/SIMULTANEOUS= number of simultaneous connections to be set up at any one time (default is 10)
/[NO]VARY varies the size of the read buffer for each connection (default is vary)

Examples:

  $ WWWRKOUT
  $ WWWRKOUT www.server.host "/web/home.html"
  $ WWWRKOUT www.server.host:8080 /FILEOF=PATHS.TXT
  $ WWWRKOUT /SERVER=www.server.host /PORT=8080 /NOBREAK /NOVARY /FILEOF=PATHS.TXT
  $ WWWRKOUT www.server.host:8080 /FILEOF=PATHS.TXT /NOEFFECTIVE /NOVARY
  $ WWWRKOUT www.server.host /FILEOF=PATHS.TXT /COUNT=500 /SIMUL=20 /BUFFER=512

The "/HELP" qualifier provides a brief usage summary. 

