The HTML pre-processor provides dynamic information inside an otherwise static HTML (HyperText Markup Language) document. The HTTPd server provides this as internal functionality, scanning the input document for special pre-processor directives, each of which is replaced by dynamic information determined by that directive.
The HTML pre-processor is invoked when the document file's extension is .SHTML (the "s" may have originated from the SSI nomenclature). As there is a significant overhead with pre-processed HTML compared to normal HTML, it should only be used when it serves a useful documentary purpose, and not just for the novelty.
One effective use for pre-processed HTML is the creation of single
virtual documents from two or more physical documents. That is, the
pre-processed document is used to include multiple physical documents, which
may even be independently administered, to return a composite document to the
client. This is a relatively low-overhead activity, but because the result is
a dynamic document it loses the advantages of "If-Modified-Since:" processing
(see 2 - HyperText Transport Protocol Daemon).
5.1 - Pre-Expiring Documents
HTML-preprocessed documents are dynamic in the sense that the information presented can be different every time the document is generated (e.g. if time directives are included). If it is important that each time the document is accessed it is regenerated then an HTML META tag can be included in the HTML header to cause the document to expire. This will result in the document being reloaded with each access.
Ensure the document is structured similarly to the following and include the <META HTTP-EQUIV= ...> tag with a legitimate date well in the past:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="expires" CONTENT="Thu, 01-Jan-1970, 00:00:01 GMT">
<TITLE> etc. </TITLE>
</HEAD>
<BODY>
etc.
</BODY>
</HTML>
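Where such documents are generated by a tool, the expiry tag can be produced rather than typed. The following is a minimal Python sketch, not part of the server; the function name expired_meta_tag is hypothetical. It uses the standard library to format a date just after the Unix epoch in the RFC 1123 style, which differs slightly from the hyphenated form shown above.

```python
from email.utils import formatdate


def expired_meta_tag(epoch_seconds=1.0):
    """Return a META tag whose expiry date is the given time in GMT.

    The default is one second after the Unix epoch, i.e. well in the past.
    """
    http_date = formatdate(epoch_seconds, usegmt=True)
    return f'<META HTTP-EQUIV="expires" CONTENT="{http_date}">'


print(expired_meta_tag())
```

Any legitimate date in the past should serve the same purpose; the epoch is used here only because it is unambiguous.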
The syntax follows closely that used by the other implementations, but some directives are tailored to the VMS environment. The directive is enclosed within an HTML comment and takes the form:
<!--#command [[tag="value"] ...]-->
A directive can be split over multiple lines provided the new line begins naturally on white-space within the directive. For example, this is correctly split
<!--#echo
created[="time-format"] -->
while the following is not (and would produce an error)
<!--#echo creat
ed[="time-format"] -->
The command and tag keywords are case insensitive. The tag value may or
may not be case sensitive, depending upon the command/tag. Generally the
effect of a command is to produce additional text to be inserted in the
document.
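The directive form described above can be illustrated with a small parser. This is a sketch in Python under stated assumptions, not the server's own scanner: the names DIRECTIVE_RE, TAG_RE and parse_directives are hypothetical, and a tag given without a value is represented by an empty string.

```python
import re

# Matches <!--#command ...-->; DOTALL lets the white-space include newlines,
# so a directive split over several lines is still recognised.
DIRECTIVE_RE = re.compile(r'<!--#(\w+)(.*?)-->', re.DOTALL)

# Matches a tag keyword, optionally followed by ="value".
TAG_RE = re.compile(r'(\w+)(?:\s*=\s*"([^"]*)")?')


def parse_directives(text):
    """Return a list of (command, {tag: value}) pairs found in text.

    Command and tag keywords are lower-cased (they are case insensitive);
    tag values are left untouched because they may be case sensitive.
    """
    found = []
    for match in DIRECTIVE_RE.finditer(text):
        command = match.group(1).lower()
        tags = {name.lower(): value
                for name, value in TAG_RE.findall(match.group(2))}
        found.append((command, tags))
    return found


print(parse_directives('<!--#DCL Say="Hello." -->'))
```

Splitting a keyword across lines (as in the incorrect example above) would make this parser see two separate tags, which is one way to understand why the server treats it as an error.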
5.3 - Directive Commands
5.3.1 - ACCESSES
The accesses directive allows the number of times the document has been accessed to be included. It does this by creating a counter file in the same location and using the same name with a dollar symbol appended to the type (extension). The count may be reset by deleting the file. This is an expensive function (in terms of file system activity) and so should be used appropriately. It can be disabled by server configuration. Two tags provide additional functionality:
<!--#accesses since="text" -->
This tag includes the specified text immediately after the access count is displayed, then adds the creation date of the counter file.
<!--#accesses since="text" timefmt="[time-format]" -->
Allows the time format of the since tag to be supplied, where time-format is specified according to 5.5 - Time Format.
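The counter-file mechanism can be sketched as follows. This is an illustration in Python on an ordinary file system, not the server's implementation: the function names are hypothetical, and storing the count as decimal text is an assumption (the format of the real counter file is not described here).

```python
import os
import tempfile


def counter_path(document_path):
    """Counter file: same location and name, with '$' appended to the type."""
    return document_path + "$"


def bump_access_count(document_path):
    """Increment and return the access count for a document.

    Assumption: the count is stored as decimal text in the counter file.
    """
    path = counter_path(document_path)
    try:
        with open(path) as handle:
            count = int(handle.read().strip() or 0)
    except FileNotFoundError:
        count = 0  # no counter yet, or it was deleted to reset the count
    count += 1
    with open(path, "w") as handle:
        handle.write(str(count))
    return count


# Usage in a scratch directory; deleting the counter file resets the count.
with tempfile.TemporaryDirectory() as scratch:
    document = os.path.join(scratch, "page.shtml")
    print(bump_access_count(document))  # first access
    print(bump_access_count(document))  # second access
    os.remove(counter_path(document))
    print(bump_access_count(document))  # counting starts again
```

Each access costs a read and a write of the counter file, which is why the directive is described as expensive in terms of file system activity.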
5.3.2 - CONFIG
The config directive allows time and file size formats to be specified for all subsequent directives providing these values. Optional specifications may still be made on individual directives; these override the config setting for that directive only and do not supersede it for later directives. A config directive may be made once, or any number of times in a document, and applies until another is made, or until the end of the document.
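The scoping rule for config can be modelled in a few lines. This is a sketch in Python, assuming a hypothetical FormatState class and a placeholder default of "%c" (the server's real default format is not specified here).

```python
class FormatState:
    """Tracks the document-wide time format set by config directives."""

    def __init__(self, default_timefmt="%c"):
        # "%c" is a placeholder default chosen for this sketch only.
        self.timefmt = default_timefmt

    def config(self, timefmt):
        """A config directive applies until another one replaces it."""
        self.timefmt = timefmt

    def effective(self, directive_timefmt=None):
        """A format given on a directive wins for that directive only."""
        if directive_timefmt is not None:
            return directive_timefmt
        return self.timefmt


state = FormatState()
state.config("%d/%m/%y")
print(state.effective())      # format from the config directive
print(state.effective("%A"))  # one-off format on a single directive
print(state.effective())      # the config format is still in force
```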
This directive, provided in other implementations, is ignored by the WASD HTTPd. In those implementations it controls whether an error message is generated upon encountering a pre-processor error. The WASD implementation always reports an error and aborts processing of the document.
<!--#config timefmt="time-format" -->
Where time-format is specified according to 5.5 - Time Format.
<!--#config sizefmt="size-format" -->
Where size-format is specified using the following keywords:
5.3.3 - DIR
The dir directive generates an "Index of ..." directory listing inside an HTML document. Apart from not generating a title (it is up to the pre-processed document to title, or otherwise caption, the listing) it provides all the functionality of the WASD HTTPd directory listing (see 4 - Directory Listing), including query string format control via the "par=" parameter (note that the "httpd=index" introducer used with directory listings is not necessary from SSI). It is a WASD HTTPd extension to pre-processed HTML.
Listing specified using a VMS file path.
<!--#dir file="file-name" [par="server-directive(s)"] -->
Listing specified using URL-style syntax.
<!--#dir virtual="path" [par="server-directive(s)"] -->
For example:
<!--#dir virtual="/ht_root/src/httpd/" -->
<!--#dir virtual="/ht_root/src/httpd/*.c" par="layout=UL__S&nops=yes" -->
5.3.4 - DCL
The dcl directive executes a DCL command and incorporates the output into the processed document. It is a WASD HTTPd extension to the more common exec directive, which is also included.
By default, output from the DCL command has all HTML-forbidden characters (e.g. "<", "&") escaped before inclusion in the processed document. Thus command output cannot interfere with document markup, but nor can the DCL command provide HTML markup. This behaviour may be changed by appending the following tag to the directive:
type="text/html"
Some dcl directives are for privileged documents only. A privileged document is one owned by the SYSTEM account and not world-writeable. There are inherent security concerns with any document being able to execute arbitrary DCL commands, even in a completely unprivileged process, so only innocuous commands are allowed in standard documents.
Execute the DCL WRITE SYS$OUTPUT command, using the specified parameter.
<!--#dcl say="hello." -->
Execute the DCL SHOW command, using the specified parameter.
<!--#dcl show="device/full tape1:" -->
Execute the DCL DIRECTORY command, using the supplied file specification. Qualifiers may be included in the optional par tag to control the format of the listing.
<!--#dcl dir="web:[000000]" -->
<!--#dcl dir="web:[000000]" par="/nohead/notrail" -->
<!--#dcl dir="web:[000000]" par="/size/date" -->
Execute the specified DCL command.
<!--#dcl exec="show device/full tape1:" -->
Execute the DCL command procedure specified as a VMS file path, with any specified parameters applied to the procedure.
<!--#dcl file="HT_ROOT:[SHTML]TEST.COM" par="PARAM1 PARAM2" -->
Execute the DCL command procedure specified in URL-style syntax, with any specified parameters applied to the procedure.
<!--#dcl virtual="../shtml/test.com" par="PARAM1 PARAM2" -->
5.3.5 - ECHO
The echo directive incorporates the specified information into the processed document.
The date/time of the current document's creation.
<!--#echo created[="time-format"] -->
Include the current date/time.
<!--#echo date_local[="time-format"] -->
Include the current Greenwich Mean Time (UTC) date/time.
<!--#echo date_gmt[="time-format"] -->
The current document's URL-style path.
<!--#echo document_name -->
The current document's VMS file path.
<!--#echo file_name -->
The date/time of the current document's last modification.
<!--#echo last_modified[="time-format"] -->
Comma separated list of browser accepted content-types.
<!--#echo http_accept -->
Comma separated list of browser accepted character sets.
<!--#echo http_accept_charset -->
Comma separated list of browser accepted languages.
<!--#echo http_accept_language -->
Any gateways or proxy servers that have handled the request.
<!--#echo http_forwarded -->
Host and port request was directed to.
<!--#echo http_host -->
URL of document containing link to current document (if any).
<!--#echo http_referer -->
Identification string of browser.
<!--#echo http_user_agent -->
URL path of current document.
<!--#echo path_info -->
VMS file name of current document.
<!--#echo path_translated -->
Query string of URL (unusual if there is one!)
<!--#echo query_string -->
IP address of browser system.
<!--#echo remote_addr -->
IP host-name of browser system (if DNS lookup is configured).
<!--#echo remote_host -->
Authenticated username (if any).
<!--#echo remote_user -->
IP host name of server system.
<!--#echo server_name -->
IP port number server host accepted client connection on.
<!--#echo server_port -->
"HTTP/1.0"
<!--#echo server_protocol -->
Identification string of server software.
<!--#echo server_software -->
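The effect of simple echo directives (those without a time format) can be sketched as a text substitution. This is an illustration in Python, not the server's code: ECHO_RE and expand_echo are hypothetical names, the variable values are invented sample data, and raising an error for an unknown tag is an assumption modelled on the server's practice of reporting pre-processor errors.

```python
import re

# Matches <!--#echo tag --> with any white-space, including newlines.
ECHO_RE = re.compile(r'<!--#echo\s+(\w+)\s*-->', re.IGNORECASE)


def expand_echo(text, variables):
    """Replace each echo directive with the value of the named variable."""
    def replace(match):
        name = match.group(1).lower()  # tag keywords are case insensitive
        if name not in variables:
            raise KeyError(f"unknown echo tag: {name}")
        return variables[name]
    return ECHO_RE.sub(replace, text)


# Invented sample values for illustration only.
sample = {"http_host": "www.example.com:80", "server_protocol": "HTTP/1.0"}
print(expand_echo("Served by <!--#ECHO HTTP_HOST --> using "
                  "<!--#echo server_protocol -->", sample))
```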
5.3.6 - EXEC
The exec directive executes a DCL command and incorporates the output into the processed document. It is the VMS equivalent of the exec shell directive of some Unix implementations. It is implemented in the same way as the dcl directive, and so the general detail of that directive applies. It supports only the cmd tag. The cgi tag, which allows execution of CGI scripts, is not supported.
<!--#exec cmd="show device/full tape1:" -->
The exec directive is for privileged documents only. A privileged document
is one owned by the SYSTEM account and not world-writeable. There are
inherent security concerns with any document being able to execute arbitrary
DCL commands, even in a completely unprivileged process.
5.3.7 - FCREATED
The fcreated directive incorporates the creation date/time of a specified file/document into the processed document.
Document specified using a VMS file path.
<!--#fcreated file="file-name" [fmt="time-format"] -->
Document specified using URL-style syntax.
<!--#fcreated virtual="path" [fmt="time-format"] -->
5.3.8 - FLASTMOD
The flastmod directive incorporates the last modification date/time of a specified file/document into the processed document.
Document specified using a VMS file path.
<!--#flastmod file="file-name" [fmt="time-format"] -->
Document specified using URL-style syntax.
<!--#flastmod virtual="path" [fmt="time-format"] -->
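What flastmod produces can be approximated outside the server. The sketch below is Python on an ordinary file system rather than VMS; the function name flastmod simply mirrors the directive, the "%c" default is a placeholder, and the time is formatted in UTC only to keep the example deterministic.

```python
import os
import tempfile
import time


def flastmod(path, fmt="%c"):
    """Format a file's last modification time using a strftime() format."""
    modified = os.path.getmtime(path)
    return time.strftime(fmt, time.gmtime(modified))


# Usage: create a scratch file, give it a known modification time, format it.
with tempfile.TemporaryDirectory() as scratch:
    sample = os.path.join(scratch, "doc.html")
    open(sample, "w").close()
    os.utime(sample, (31536000, 31536000))  # 365 days after the Unix epoch
    print(flastmod(sample, "%Y-%m-%d"))
```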
5.3.9 - FSIZE
The fsize directive incorporates the size, in bytes, kbytes or Mbytes, of a specified file/document into the processed document.
Document specified using a VMS file path.
<!--#fsize file="file-name" [fmt="size-format"] -->
Document specified using URL-style syntax.
<!--#fsize virtual="path" [fmt="size-format"] -->
5.3.10 - INCLUDE
The include directive incorporates the contents of a specified file/document into the processed document.
Include the contents of the document specified using a VMS file path.
<!--#include file="file-name" -->
Include the contents of the document specified using URL-style syntax.
<!--#include virtual="path" -->
The contents of the specified file are included differently depending on the MIME content-type of the file. Files of text/html content-type (HTML documents) are included directly, and any HTML tags within them contribute to the markup of the document. Files of text/plain content-type (plain-text documents) are encapsulated in <PRE></PRE> tags and have all HTML-forbidden characters (e.g. "<", "&") escaped before inclusion in the processed document. An HTML file can be forced to be included as plain-text by using the following syntax:
<!--#include virtual="example.html" type="text/plain" -->
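The content-type rule for included files can be sketched as follows. This is a Python illustration, not the server's code; include_content is a hypothetical name, and rejecting other content-types is an assumption, since their handling is not described here.

```python
import html


def include_content(content, content_type):
    """Return the text to insert for an included file of the given MIME type."""
    if content_type == "text/html":
        # HTML is included directly and contributes to the document markup.
        return content
    if content_type == "text/plain":
        # Plain text is escaped and wrapped so it cannot affect the markup.
        return "<PRE>" + html.escape(content, quote=False) + "</PRE>"
    # Assumption for this sketch: anything else is treated as an error.
    raise ValueError(f"unsupported content-type: {content_type}")


fragment = "<B>bold</B> & more"
print(include_content(fragment, "text/html"))
print(include_content(fragment, "text/plain"))  # as if type="text/plain" were given
```

Forcing an HTML file to text/plain, as in the directive above, is a convenient way to display the source of a document rather than its rendering.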
Documents may be specified using either the FILE or VIRTUAL tags.
The FILE tag expects an absolute VMS file specification.
The VIRTUAL tag expects a URL-style path to a
document. This can be an absolute or relative path. See
3.1 - Document Specification for further details.
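The difference between absolute and relative VIRTUAL paths can be illustrated with conventional URL resolution. This Python sketch assumes a relative path is resolved against the path of the current document in the usual URL manner; the server's exact rules are those given in 3.1 - Document Specification, and resolve_virtual and the document path used are hypothetical.

```python
from urllib.parse import urljoin


def resolve_virtual(document_path, virtual):
    """Resolve a VIRTUAL tag value against the current document's URL path."""
    return urljoin(document_path, virtual)


current = "/ht_root/doc/index.shtml"  # hypothetical current document
print(resolve_virtual(current, "../shtml/test.com"))    # relative path
print(resolve_virtual(current, "/ht_root/src/httpd/"))  # absolute path
```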
5.5 - Time Format
Note: Time formatting only applies if the HTTPd server has been compiled using DEC C.
Whenever a time directive is used an optional tag can be included to specify the format of the output. The default looks a little VMS-ish. If a format specification is made it must conform to the C programming language function strftime().
The format specifier follows a similar syntax to the C standard library printf() family of functions, where conversion specifiers are introduced by percentage symbols. Here are some example uses:
The date is <!--#echo date_local="%d/%m/%y" -->.
The time is <!--#echo date_local="%r" -->.
The day-of-the-week is <!--#echo date_local="%A" -->.
A problem with any supplied time formatting specification will be reported.
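The effect of a format string can be tried outside the server with any strftime() implementation. The sketch below uses Python's time.strftime(), which wraps the C library function; format_time is a hypothetical helper, and a fixed moment in UTC is used so the result does not depend on the current time.

```python
import time


def format_time(fmt, moment):
    """Format a broken-down time using a strftime() format string."""
    return time.strftime(fmt, moment)


epoch = time.gmtime(0)  # 1 January 1970, 00:00:00 UTC, a Thursday
print(format_time("%d/%m/%y", epoch))
print(format_time("%H:%M:%S", epoch))
print(format_time("%A", epoch))  # weekday name in the current locale
```

Locale-dependent specifiers such as %A, %c and %r may produce different text on different systems.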
The following table provides the general conversion specifiers. For further information on the formatting process refer to a C programming library document on the strftime() function.
Specifier  Replaced by
---------  -------------------------------------------------------------
a          The locale's abbreviated weekday name
A          The locale's full weekday name
b          The locale's abbreviated month name
B          The locale's full month name
c          The locale's appropriate date and time representation
C          The century number (the year divided by 100 and truncated to
           an integer) as a decimal number (00 - 99)
d          The day of the month as a decimal number (01 - 31)
D          Same as %m/%d/%y
e          The day of the month as a decimal number (1 - 31) in a 2 digit
           field with the leading space character fill
Ec         The locale's alternative date and time representation
EC         The name of the base year (period) in the locale's alternative
           representation
Ex         The locale's alternative date representation
EX         The locale's alternative time representation
Ey         The offset from the base year (%EC) in the locale's
           alternative representation
EY         The locale's full alternative year representation
h          Same as %b
H          The hour (24-hour clock) as a decimal number (00 - 23)
I          The hour (12-hour clock) as a decimal number (01 - 12)
j          The day of the year as a decimal number (001 - 366)
m          The month as a decimal number (01 - 12)
M          The minute as a decimal number (00 - 59)
n          The newline character
Od         The day of the month using the locale's alternative numeric
           symbols
Oe         The date of the month using the locale's alternative numeric
           symbols
OH         The hour (24-hour clock) using the locale's alternative
           numeric symbols
OI         The hour (12-hour clock) using the locale's alternative
           numeric symbols
Om         The month using the locale's alternative numeric symbols
OM         The minutes using the locale's alternative numeric symbols
OS         The seconds using the locale's alternative numeric symbols
Ou         The weekday as a number in the locale's alternative
           representation (Monday=1)
OU         The week number of the year (Sunday as the first day of the
           week) using the locale's alternative numeric symbols
OV         The week number of the year (Monday as the first day of the
           week) as a decimal number (01 - 53) using the locale's
           alternative numeric symbols. If the week containing January 1
           has four or more days in the new year, it is considered as
           week 1. Otherwise, it is considered as week 53 of the previous
           year, and the next week is week 1.
Ow         The weekday as a number (Sunday=0) using the locale's
           alternative numeric symbols
OW         The week number of the year (Monday as the first day of the
           week) using the locale's alternative numeric symbols
Oy         The year without the century using the locale's alternative
           numeric symbols
p          The locale's equivalent of the AM/PM designations associated
           with a 12-hour clock
r          The time in AM/PM notation
R          The time in 24-hour notation (%H:%M)
S          The second as a decimal number (00 - 61)
t          The tab character
T          The time (%H:%M:%S)
u          The weekday as a decimal number between 1 and 7 (Monday=1)
U          The week number of the year (the first Sunday as the first day
           of week 1) as a decimal number (00 - 53)
V          The week number of the year (Monday as the first day of the
           week) as a decimal number (01 - 53). If the week containing
           January 1 has four or more days in the new year, it is
           considered as week 1. Otherwise, it is considered as week 53
           of the previous year, and the next week is week 1.
w          The weekday as a decimal number (0 [Sunday] - 6)
W          The week number of the year (the first Monday as the first day
           of week 1) as a decimal number (00 - 53)
x          The locale's appropriate date representation
X          The locale's appropriate time representation
y          The year without century as a decimal number (00 - 99)
Y          The year with century as a decimal number
Z          Timezone name or abbreviation. If timezone information is not
           available, no character is output.
%          %
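The week-numbering rule described for the V specifier (the week containing January 1 is week 1 only if four or more of its days fall in the new year) is the ISO 8601 rule. It can be checked with Python's datetime module, which implements the same rule independently of strftime(); iso_week is a hypothetical helper name.

```python
from datetime import date


def iso_week(day):
    """Return (year, week) under the four-or-more-days rule, Monday first."""
    year, week, _weekday = day.isocalendar()
    return (year, week)


# 1 January 2020 was a Wednesday: five days of that week fall in 2020,
# so it belongs to week 1 of 2020.
print(iso_week(date(2020, 1, 1)))

# 1 January 2021 was a Friday: only three days of that week fall in 2021,
# so it belongs to the last week of the previous year.
print(iso_week(date(2021, 1, 1)))
```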