HP Services Software Patches

HP Services Software Patches - vaxshad09_061

 
NOTE:  An OpenVMS saveset or PCSI installation file is stored
       on the Internet in a self-expanding compressed file.
       The name of the compressed file will be kit_name-dcx_vaxexe
       for OpenVMS VAX or kit_name-dcx_axpexe for OpenVMS Alpha.
 
       Once the file is copied to your system, it can be expanded
       by typing RUN compressed_file.  The resultant file will
       be the OpenVMS saveset or PCSI installation file which
       can be used to install the ECO.
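
The naming convention in the note above can be illustrated with a short
Python sketch. The function name, the architecture labels, and the
lowercase rendering of the kit name are assumptions made for this
example; only the two suffixes come from the note.

```python
# Suffixes taken from the note above; the dictionary keys are assumed labels.
DCX_SUFFIX = {"VAX": "dcx_vaxexe", "Alpha": "dcx_axpexe"}


def compressed_kit_file(kit_name: str, arch: str) -> str:
    """Return kit_name-dcx_vaxexe or kit_name-dcx_axpexe for the platform."""
    if arch not in DCX_SUFFIX:
        raise ValueError(f"unknown architecture: {arch!r}")
    # Lowercasing is an assumption made for this sketch.
    return f"{kit_name.lower()}-{DCX_SUFFIX[arch]}"


print(compressed_kit_file("VAXSHAD09_061", "VAX"))
```

Once the file named this way is on the system, RUN expands it as
described in the note.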
 
Copyright (c) Digital Equipment Corporation 1994, 1995. All rights reserved.

                      *****  WARNING!!!  *****

     Future OpenVMS VAX V6.1 kits that are issued for facilities
     included in the VAXSHAD09_061 kit will not install unless the
     VAXSHAD09_061 kit is installed on your system first.  It is
     highly recommended that the complete VAXSHAD09_061 remedial kit
     be installed as soon as possible.  Installation of individual
     images from the VAXSHAD09_061 remedial kit is not supported and
     could result in unpredictable system behavior.

     Descriptions for problems that were corrected in previous VAX
     Shadow kits are included in the VAXSHAD09_061 Release Notes.
     The release notes can be found in the save set VAXSHAD09_061.A. 
     If you have not installed a previous shadow kit it is
     recommended that you read these release notes before installing
     the VAXSHAD09_061 Shadow kit.  To access the release notes,
     restore them from the saveset by issuing a command with the
     following format:

     $ BACKUP/SEL=VAXSHAD09_061.RELEASE_NOTES DEVICE:[DIR]VAXSHAD09_061.A/SA-
       DEVICE:[DIR]VAXSHAD09_061.RELEASE_NOTES


PRODUCT:     Volume Shadowing for OpenVMS  (Phase II)

             NOTE:  The problems fixed in this ECO Kit also affect the
                    following products:

                         VAXcluster Software for OpenVMS VAX
                         VAXcluster Console System (VCS)

OP/SYS:      OpenVMS VAX

COMPONENTS:  System, Bugcheck, Backup,
             Mount, Dismount, MSCP, TMSCP, MTAAACP,
             I/O Routines, Audit Server,
             Security, System Primitives,
             Adaptive Pool Management (APM),
             Operator Communication Manager (OPCOM),
             User Environmental Test Package (UETP),
             Media Management Extensions (MME)

SOURCE:      Digital Equipment Corporation

ECO INFORMATION:

     ECO Kit Name:  VAXSHAD09_061
     ECO Kits Superseded by and Included in this ECO Kit:

          VAXSHADFT09_061  (Never Officially Released)
          VAXSHAD08_061    (Never Officially Released)
          VAXSHAD07_061    (For OpenVMS VAX V6.1 systems only)
          VAXSHAD06_061
          VAXSHAD05_061
          VAXSHAD04_061
          VAXSHAD03_061
          VAXSHAD02_061    (CSCPAT_1160)
          VAXSHAD01_061    (CSCPAT_1160)
          VAXMTAA01_062    (For OpenVMS VAX V6.1 systems only)
          VAXMTAA02_061
          VAXMTAA01_061    (CSCPAT_1154)
          VAXMONT01_061    (For OpenVMS VAX V6.1 systems only)
          VAXSYS14_061     (For OpenVMS VAX V6.1 systems only)
          VAXSYS12_061
          VAXSYS07_061
          VAXSYS01_061     (CSCPAT_1113)
          VAXMME01_061     (CSCPAT_1174)
          VAXOPCO01_061    (CSCPAT_1144)
          VAXAUDI02_061

     ECO Kit Size:  3960
     Kit Applies To:  OpenVMS VAX V6.1
     System/Cluster Reboot Necessary:  Yes


CAUTION:

Before Installing this Kit, Read the Following Cautions:

     After installation of this kit, the following issues may occur:

      1)  ISSUE:  When a node reboots into the cluster there may not
                  be an OPCOM message that reports the node is joining
                  the cluster.  Absent messages occur on a random
                  basis.

          WORKAROUND:  In order to verify the node has entered the
                       cluster, after the node has fully rebooted, the
                       user should enter the command:

                            $ SHOW CLUSTER

                       to verify the node is a valid member of the
                       VAXcluster.

      2) ISSUE FROM THE CSC:  An INVEXCEPTN in SNDRIVER may be seen if
                              DECnet/SNA V2.1 is used in conjunction
                              with the IO_ROUTINES from the VAXSHAD
                              ECO kit.  SNAVMS_E04021 (CSCPAT_5041) will 
                              fix this problem by replacing the
                              incompatible SNDRIVER in DECnet/SNA V2.1.

                              NOTE: SNAVMS_E04021 applies to
                                    DECnet/SNA V2.1 only.

     These issues are being addressed and will be corrected in a
     future version of OpenVMS VAX.


ECO KIT SUMMARY:

An ECO kit exists for Volume Shadowing on OpenVMS VAX V6.1.  This
kit contains the fixes described below.

Problems Addressed in the VAXSHAD09_061 Kit:

  o  A 'SET SECURITY' or 'SET ACL' command on a volume in an OpenVMS
     cluster places a high I/O load on the server process.  This
     exhausts paged pool and the AUDIT_SERVER goes into an
     RWPAG state.

     This problem is corrected in OpenVMS VAX V6.2.

  o  A field in the IRP that is used during Volume Processing is
     not initialized in clones of USER IOs.  If an error occurs,
     the code that determines the severity of the error can be
     misled by data in these fields.  It can fail to locate the
     error and return the IO as successful.  Since a zero-byte count  
     is returned, an Incomplete Segmented Transfer error will occur.  
     The fix is to initialize the field when the clone is allocated.

  o  While creating a page, a user process might be swapped out and
     then return using a different balance set slot.  

     This problem is corrected in OpenVMS VAX V6.2.

  o  Certain applications calling $AUDIT_EVENT with ASTs disabled
     will be interrupted when $AUDIT_EVENT returns to the caller.

     This problem is corrected in OpenVMS VAX V6.2.

  o  The code relies on a page being present when it attempts
     to release a spinlock.  If the system is paging heavily,
     the page may not be available.  This may result in pagefaults
     in EXE$BRKTHRU at IPL greater than 2.

     This problem is corrected in OpenVMS VAX V6.2.
 
  o  Repeating wakeups from $SCHDWK show an accumulating drift over
     time.

     This problem is corrected in OpenVMS VAX V6.2.

  o  Magnetic tape position may be lost in differing circumstances:

     -  COPY and/or BACKUP of a DISK to a TMSCP-Served TAPE, will fail
        when the tape device is placed in an MV state.  The failure
        does not occur if the same task is performed locally.

     -  COPY will fail with: "SYSTEM-F-TAPEPOSLOST, magnetic tape
        position lost"

     -  BACKUP will fail with: "-SYSTEM-F-DATALOST, data lost"

     This problem is corrected in OpenVMS VAX V6.2.

  o  To transition an OpenVMS process from the virtual balance set
     to the real balance set, the SPTEs (system page table entries)
     which describe its process PTE pages (process page table pages)
     need to be copied from saved memory back into the real balance
     slot from where they originally came.  This makes the process'
     P0 and P1 space accessible again.  SPTEs for the process page
     table pages describing the undefined area between P0 and P1
     must be represented by pre-initialized null values (actually,
     ERKW DZERO-type values).  When this undefined void area is
     exactly zero pages (i.e., P0 and P1 are tangent), the
     VBSS$READ_OPT2_VBSM routine takes the wrong branch, causing a
     VBSSERR bugcheck.  This fix adds a test for this case and
     takes the correct branch.

     This problem is corrected in OpenVMS VAX V6.2.

  o  When a process is switched from a real balance slot to a
     virtual balance slot, the allocation may fail, causing a 
     VBSSERR bugcheck.

     This problem is corrected in OpenVMS VAX V6.2.

  o  Incorrect quota value is returned when process quota (BYTLM) is
     returned to a process for a created system global section.

     This problem is corrected in OpenVMS VAX V6.2.

  o  System crashes may occur due to corrupted PTE entries.  The
     corruption appears to be Global Section Table Entries pointing
     to Global Section Descriptors.

     The problem occurs only when more than 4095 global sections
     (GBLSECTIONS) are in use.  To check the number of global sections
     currently in use, add the entry counts reported by the following
     commands:

     - SDA> VALIDATE QUEUE EXE$GL_GSDSYSFL     !global sections

     - SDA> VALIDATE QUEUE EXE$GL_GSDDELFL     !delete pending global
                                               !sections

     - SDA> VALIDATE QUEUE EXE$GL_GSDGRPFL     !group global sections

  o  Devices can remain allocated to processes that no longer exist. 
     The device remains unusable until the system is rebooted.

  o  If a previously shadowed disk is mounted with a MOUNT/OVER=SHADOW 
     command and a new shadow set is created using this disk,
     OpenVMS VAX will attempt to create the old shadow set using the
     old physical device names.

  o  The system crashes with a NOBVPVCB bugcheck.  The crash occurs
     on the kernel stack with MTAAACP.EXE as the current image.

  o  The system crashes with an XQPERR while dismounting a MAD
     drive.

  o  SUBTRACED errors are not correctly determined for images installed
     with /HEADER_RESIDENT.

     This problem is corrected in OpenVMS VAX V6.2.
                         
  o  Users of ORACLE[R] Rdb V6.1 may get ILLIOFUNC errors when doing IO
     to a Host Based Shadowset whose members are served.

  o  The user will see a large number of shadow copies being done by
     OpenVMS rather than the controller, even when both disks are on
     the same controller and the controller has DCD (Disk Copy Data)
     capabilities.

  o  If a three-member Shadowset has its index zero member as a copy
     target and all three members require a MERGE, when the COPY
     completes the MERGE does not take place.  The LBN for the just
     completed COPY (the last LBN on the disk) is passed as the
     MERGE starting LBN, so it completes without doing any IO.

  o  Failures occur during attempts to start copies or restart
     copies, usually after a node halt, shutdown or reboot. 
     Additional symptoms observed include inconsistent values for
     HBS_CIP when compared to SHADOW_MAX_COPY, negative values for
     HBS_CIP and copies that should continue start over from the
     beginning.

  o  System hangs may occur when I/Os pending to a shadow set do 
     not complete.  
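
The global section check described earlier in this list (adding the
three SDA VALIDATE QUEUE results and comparing the total against 4095)
can be sketched in Python as follows. The function names and the sample
counts are hypothetical and serve only as an illustration; the real
numbers must be read from SDA on the running system.

```python
# Limit quoted in the global section item above.
GBLSECTIONS_LIMIT = 4095


def global_sections_in_use(system: int, delete_pending: int, group: int) -> int:
    """Add the three queue entry counts reported by SDA VALIDATE QUEUE."""
    return system + delete_pending + group


def exceeds_limit(total: int) -> bool:
    """True when the total is above the 4095 threshold."""
    return total > GBLSECTIONS_LIMIT


# Hypothetical counts, for illustration only.
total = global_sections_in_use(system=3900, delete_pending=12, group=200)
print(total, exceeds_limit(total))
```

A total at or below 4095 means the system is outside the condition
under which this corruption was reported.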

 
Problems Addressed in the VAXSHAD07_061 Kit:  

  o  In the VAXSHAD05 and VAXSHAD06 kits two new fields were added
     to the IRP data structure for shadow write logging information.
     This new IRP definition size conflicts with the IRP sizes of
     other images on the system that are not part of the SHADOW kits.
     This conflict may cause a variety of errors, including fatal
     bugchecks.  This fix changes the IRP definitions back to the SBB
     versions and adds some special definitions to the SHDRIVER for
     the new IRP fields.

  o  Fatal bugchecks from data structure corruption may occur due
     to the addition of the value 10 HEX to the corrupted field.
     Crashes are of various types and include node and cluster
     crashes, crashes due to invalid UCB addresses, invalid VCB
     addresses, invalid member IDs, and invalid number of devices.

  o  When trying to access a DFS disk, the following error may be
     seen:

        -SYSTEM-F-FILALRACC, file already accessed on channel

     The disk can be accessed immediately after reboot; however,
     after a period of time of not accessing the disk, a simple
     directory command will return this error.

  o  If a tape is initialized with a non-blank accessibility field
     and then mounted using /OVERRIDE=(ACCESSIBILITY), the tape
     mounts but cannot be read or written to.  The command format
     to initialize the tape would be similar to:

        INIT/LABEL=VOLUME_ACCESSIBILITY="+" tape: LABEL

     In addition, the following OPCOM messages are generated and
     the tape volume is automatically unloaded after an attempt to
     WRITE or READ the tape volume:

        %%%%%%%%%%%  OPCOM  12-DEC-1994 12:57:23.53 %%%%%%%%%%%
        Message from user USERXX on NODEXX
        non-blank accessibility field in volume labels on SYS$DEVICE:
        %%%%%%%%%%%  OPCOM  12-DEC-1994 12:57:23.54 %%%%%%%%%%%

  o  MTAAACP posts attention ASTs to its mailbox.  If the AST
     QUOTA reaches zero and an attempt is made to kill the MTAAACP
     process or the process that emitted the QIO, MTAAACP will go
     into the RWAST state and hang.


Problems Addressed in the VAXSHAD06_061 Kit:  

  o  When using PATHWORKS, data corruption may occur on the file
     container.  The corruption can be seen by running CHKDSK on the PC
     container disk.  Also using PCDISK to IMPORT and EXPORT files to
     and from the container will show a corrupted file when EXPORTed
     back to VMS.

  o  System crashes with INVEXCEPTN bugcheck at SCH$POSTEF+21.

     To correct this problem, a change was made in the IOC$SIMREQCOM
     routine to cause the destination of the IFNOWET test to 
     initialize R4 before calling the IOC$SCHEDEF routine.
     IOC$SCHEDEF expects R4 to have the address of the user's PCB.


Problems Addressed in the VAXSHAD05_061 Kit for OpenVMS VAX V6.1:

  o  After a node crashes, on reboot it cannot mount a Host Based
     Volume Shadowing virtual unit.  The error message usually
     returned is "volume not software enabled"; however, "Medium
     Offline" may also be seen.  A SHOW DEVICE will show that the
     Shadowset is in 0% merge but SDA will show that a minimerge
     is pending.

  o  A double deallocation crash may occur as the result of MOUNT not
     properly initializing the Mounted Volume List (MTL) pointer.  This
     pointer had a stale value as a result of two calls to SYS$VMOUNT
     from a single program.  The stale pointer will only cause a problem
     if the system is unable to allocate space for defining the logical
     name.

     NOTE:  Since cells are initialized at image activation, this
            problem should not occur as a result of DCL commands.

  o  Tape devices with stacker/loaders, such as the TF857, may take
     up to 6 minutes to rewind/unload/load the next tape.  In
     VAXSHAD01_061, a change was made to the behavior of MOUNT to take
     this delay into account.   However, a side effect of that change
     was that non-stacker drives may also wait 6 minutes before failing.

  o  System crashes with an INVEXCEPTN during a SHDRIVER COPY_DATA_REPAIR
     copy operation.

  o  If the value of the ALLOCLASS SYSGEN parameter is not set and the
     user tries to use shadowing, a shadow volume can be created but
     members cannot be added to the shadow set.  No error messages are
     received up until a second member is added.  On the MOUNT command,
     the customer will receive the error messages:

        $ mount /system dsa500 /shadow=dkb400 alphavms015
        %MOUNT-I-SHDWMEMFAIL, DKB400 failed as a member of the shadow set
        -SYSTEM-F-INCSHAMEM, incompatible shadow set member

     "Incompatible" is an inappropriate statement of the problem.  A
     more accurate message would be "missing allocation class," or
     "incorrect allocation class."

  o  If a shadow set member is dismounted at the same time from multiple
     nodes within a cluster, I/O to the shadow set may become stalled.

  o  Mount will not add shadow set members unless they are either
     MSCP or SCSI.

  o  Shadow set member expulsion is currently based on the time it takes
     a fork & wait and a PACKACK to complete rather than the actual time
     transpired.  On some devices, particularly SCSI, where a PACKACK
     can take approximately one minute, the timeout was much too long.
     Using the default value of 20 (seconds) for SHADOW_MBR_TMO would
     actually mean that it would take 20 minutes to expel from a SCSI
     shadow set a member experiencing errors.

  o  SHDRIVER loss of synchronization may result in a crash where
     SHADDETINCON is triggered by the check at the end of
     MATCH_MASTER_SCB.  In this consistency check, the
     SHAD$W_DEVSTS_PASSIVE_MV_CNTR is verified to be zero and is not.
     Another symptom is that the virtual unit UCB$W_RWAITCNT is
     zero.  Shadow set member counts of zero may also be seen.

  o  Crashes may occur in EXPEL_PACKACK_ANY with connections broken to
     all members and IRP$L_SHD_LOCK_FR5 = 1 (packack retries exhausted).

  o  All members of a shadow set become inaccessible at the same time and
     remain inaccessible for a period of time greater than "shadow
     member timeout" (SHADOW_MBR_TMO or SHADOW_SYS_TMO) seconds but
     less than MVTIMEOUT seconds.  All members subsequently become
     accessible within seconds of each other but not at exactly the same
     time.  This results in all but one member being expelled from the
     shadow set.

     This often occurs when changing HSJ microcode and all members are
     connected to the same HSJ.  When brought back online, polling will
     cause the devices to be found seconds apart which will result in
     all but one member being expelled.

  o  All members of the set must be checked to see if they meet the
     criteria of being MSCP.  The original design did not allow
     for having no index zero member.

  o  When the mounting of full copy targets exceeds the SHADOW_MAX_COPY
     threads for a given node, other nodes with the shadow set mounted
     do not pick up the copy work.

  o  In a cluster, using $PROCESS_SCAN explicitly or implicitly with the
     DCL 'SHOW USER' command sometimes causes a system crash due to an
     ACCVIO in kernel mode or an IVSSRVRQST bugcheck.

  o  When a node with a SCSI bus boots, it resets the SCSI bus.  In a
     multi-host SCSI cluster, this can cause the other node to
     experience I/O failures.  Normally, this results in a brief mount
     verification.  The I/O is retried, succeeds, and there is
     no serious consequence.  However, if the other node is in the
     process of booting and the system disk is a shadow set, the
     system will crash.

  o  A PGFIPLHI bugcheck may occur in the SHADOW_SERVER process at 
     the REMQUE in K_GET_COPYSHAD_IRP.  On OpenVMS VAX, the PC is 
     A0E and the VA is 274.

  o  A page setup module which draws a frame and company logo on each
     page of output is used on a queue pointing to an LN03.  This page
     setup module works on OpenVMS VAX Version 5.5-2 and prior versions.
     However, with VAXQMAN8_U2055 (CSCPAT_1165) or OpenVMS VAX Version
     6.1 installed, this page setup module causes the printer to
     continually spew out paper with only the output from the page setup
     module.  This continues until the entry is deleted from the queue.

  o  If a multi-programming application uses a non-homogenous access
     pattern to a file which is resident in Virtual I/O cache, there is
     a possibility that the size returned in the I/O status block from a
     READ operation will be truncated.

  o  If a clustered application uses a large number of concurrent
     processes to perform file operations consisting of an OPEN, WRITE,
     and CLOSE sequence repetitively on the same data file, data
     corruption may occur.

  o  In a multi-programming environment where a significant amount of
     NEW data from a file is being loaded into the cache concurrently by
     multiple processes, the system may HANG.

  o  If a user attempts to mount a disk that is 100% full on OpenVMS VAX
     V6.* and the disk was originally initialized with a version of
     OpenVMS VAX prior to V6.0, paged pool can be corrupted leading to
     system crashes.  If the disk is filled AFTER it has been mounted
     under V6.*, there will not be any problem.

  o  The class driver will sometimes attempt to send an MSCP command
     packet on the wrong connection.  This fix detects this mismatch and
     corrects it.

  o  Due to invalid allocation counts, processes hang in RWNPG state
     waiting for a request for non-paged pool (NPP) so large that it
     cannot be satisfied.

  o  The system crashes with the current process executing a $CHKPRO
     system service call.

  o  A $AUDIT_EVENT system crash may occur in SECURITY.EXE due to corrupt
     scan structure storage.

  o  When a rights list is passed into $CHKPRO (CHP$_RIGHTS), it is
     copied into the ARB within the NSA$A_SCRATCH area.  This area
     will hold a maximum of eight rights.  The code that handles this
     copy operation will split any larger rights list into the first
     eight, which are copied into the local rights area, and the
     remainder, for which a descriptor is created and its address is
     added as extended process rights.

     The code involved in copying the first eight rights is looping
     incorrectly and copying rights to random locations within the
     NSA$A_SCRATCH area usually resulting in a SSRVEXCPT crash.

  o  When a value block or value status block cannot be returned,
     SYS$GETLKI returns the error SS$_ILLRSDM.  A correction has been
     made to SYS$GETLKI to now return all other requested information
     and update the wildcard search index.
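
The SHADOW_MBR_TMO item earlier in this list gives a figure of 20
minutes for a SCSI member.  The arithmetic behind it can be sketched in
Python under the assumption that each unit of SHADOW_MBR_TMO was being
counted as one fork-and-wait plus one PACKACK rather than as one second
of elapsed time.  The function name and the one-second fork-and-wait
interval are assumptions made for this illustration.

```python
def expulsion_delay_seconds(shadow_mbr_tmo: int,
                            packack_seconds: float,
                            fork_wait_seconds: float = 1.0) -> float:
    """Elapsed time when every timeout unit costs one fork-and-wait
    plus one PACKACK (an assumed model of the old behavior)."""
    return shadow_mbr_tmo * (fork_wait_seconds + packack_seconds)


# Default SHADOW_MBR_TMO of 20 with a PACKACK of about one minute.
delay = expulsion_delay_seconds(shadow_mbr_tmo=20, packack_seconds=60.0)
print(f"{delay / 60:.1f} minutes")
```

With a PACKACK that completes almost immediately, the same model gives
a delay close to the nominal 20 seconds, which is why the problem was
most visible on SCSI devices.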


Problems Addressed in the VAXSHAD04_061 Kit:  

  o  When booting two or more systems simultaneously from shadowed
     system disks, the systems may appear to hang.  Crashing the
     systems and examining the crash dumps indicates that shadowing
     driver blocking AST routines have not run.

  o  When a node runs out of SHADOW_MAX_COPY threads while mounting
     new copy target units, other nodes in the cluster that have
     available SHADOW_MAX_COPY threads will not pick up the copy
     work.  This results in the copy not being started for copy
     members that are added to shadow sets.


Problems Addressed in the VAXSHAD03_061 Kit for OpenVMS VAX V6.1:

  o  A double-deallocation crash may occur as the result of MOUNT not
     properly initializing the MTL pointer.  This pointer had a stale
     value as a result of 2 calls to SYS$VMOUNT from a single program.
     The problem will not happen as a result of DCL commands, as the
     cells are initialized at image activation.  The stale pointer
     will only cause a problem if the system is unable to allocate
     space for defining the logical name.

  o  An OPCOM message was being output even though /NOASSIST was
     specified in the MOUNT command.  This caused problems for UETP.

  o  A system crash may occur in SECURITY.EXE.

  o  A process is in RWPAG while auditing an event.

  o  When the current process executes a $CHKPRO system service call,
     the system will crash.

  o  Processes hang in RWNPG state (Call to $CRMPSC) waiting for a
     request for NPP so large that it cannot be satisfied.

  o  DISMOUNT/OVERRIDE=CHECKS against the SYSTEM disk is allowed.
     Once this command is issued nothing else can be done.
     After installation of this kit, this command can be issued
     only on non-system disks.

  o  When booting from a Controller-Based Shadowed System disk
     for the first time as a Host-Based Shadowed System disk, boot
     fails with a SHADBOOTFAIL bugcheck.  A SHADBOOTFAIL may also 
     occur if SHADOW_SYS_UNIT is changed at boot time.

  o  During a copy operation the system may crash with an ACCVIO.

  o  When a user program allocates a read buffer from a TMSCP-served
     tape creator, the record on tape will get server node system data
     returned along with the data on tape.  Printing the buffer will
     show that the data from tape is in the correct location of the
     buffer but it will also show that the area of the buffer that was
     not supposed to be changed contains server node system data.


Problems Addressed in the VAXSHAD02_061 Kit:  

  o  The local MSCP server issues a fatal MSCPSERV bug check when it
     should not.  The server should instruct the remote DISKCLASS
     driver to BUGCHECK.

  o  When a serving node becomes so busy that it occasionally
     exhausts resource limits, the RWAITCNT for heavily used disks
     gets incremented.  If a client node requests an ONLINE and
     RWAITCNT is bumped, it is rejected by MSCP.  This makes
     MOUNTing devices very difficult.

  o  On OPCOM restart, the old privilege mask's upper 32 bits may
     not be restored to their original value.  This mask is
     declared as a longword, but used as a quadword.

  o  When OPCOM receives a message that it does not recognize, the
     message is included in the log file with the following text:

     %%%%%%  OPCOM  19-APR-1994 11:20:40.06  %%%%%%  DUMP_LOG_FILE
     OPCOM has noticed a condition which might be due to an internal
     error. might also be explained by normal events, especially if
     nodes have just crashed or rebooted in a VAXcluster.  Please
     bring this message to Digital's attention only if you are having
     problems with operator communications.
     Buffer is     8 (%X0008) bytes -- "-  Unknown message received"
     00000000 00000000 00000000 00000000 00000000 00000000 -
     41534403 0015007B

  o  When an assisted merge is performed, an inaccurate number of
     LBNs (Logical Block Numbers) and bytes transferred may be
     computed.  Therefore, all LBNs may not be merged in assisted
     merge operations.

  o  Access path attention (ACPTH) messages are used by MSCP to
     determine secondary paths for disks that are attached to dual
     controllers.  DUDRIVER might incorrectly assign this
     information to the wrong device if two units with the same
     unit number and allocation class exist.  These messages may
     also trigger unnecessary failover attempts.

  o  Servers in VAXclusters with more than 127 nodes may crash
     when the 128th node attempts to access a given disk.  This
     usually occurs after a serving node crashes for other reasons,
     but this causes the rest of the servers to crash.

  o  In a small working set, it is possible for the EXE$PSCAN_NEXT_PID
     routine (called by $GETJPI) to take a page fault at IPL 8.  This
     causes a PGFIPLHI bugcheck.  The page referenced is in the
     PROCESS_SCAN context block (PSCANCTX$ data structure) in process
     virtual address space.

  o  While running a UETP tape test, fatal controller errors may
     occur.  This problem is caused by TMSCP (the tape server)
     incorrectly interpreting a TUDRIVER status subcode.  This
     misinterpretation is converted to a fatal controller error
     status and returned to the user.

  o  Shadow sets have separate mount verification done by SHDRIVER,
     instead of the usual system mount verification.  The SHDRIVER
     mount verification has an error updating the volume label on
     shadow sets that have the volume label changed except on the
     node that issues the label change.  Once the devices are in
     this state, they cannot be recovered until MVTIMEOUT is
     reached or a reboot of all affected nodes is performed.

     This correction enables the behavior of virtual units to be
     consistent with the behavior of physical units.

  o  Unnecessary calls to MOUNT verification or host-based
     volume shadowing processing may occur.  On Alpha nodes,
     these mount verification or Host-Based Volume Shadowing
     processing calls will fail, resulting in I/O hangs and,
     eventually, volume invalid errors.

  o  AVAILABLE or OFFLINE status returned from a transfer command
     does not implement the MSCP specification correctly.

  o  OpenVMS VAX MSCP Parity with OpenVMS Alpha.  A served disk may
     appear to be ONLINE when it is really OFFLINE.  This occurs
     because the MSCP server's CHECK_SERVICE routine searches the
     device database and incorrectly returns an ONLINE status.

  o  There is no synchronization between SHADOW_PROCESSING and
     INVALIDATE_ALL_ENTRIES, which allows these two code threads to
     run simultaneously.  This can cause a system crash due to the
     fact that the SHADOW_PROCESSING thread may remove a member from
     a multimember shadow set and the INVALIDATE_ALL_ENTRIES thread
     is not aware that the member has been removed.  The system
     crash occurs in RESTORE_WLE because no Write Log table
     exists.

  o  A problem exists with the SHADOW_SERVER.  The symptoms of this
     problem are:

     +  Undiagnosable hangs in individual copy operations or on
        the entire server

     +  Unexpected copy aborts

     +  Poor copy performance

     +  Shadow set inconsistency

  o  High interrupt stack activity occurs on a node performing a merged
     copy operation.  This could adversely affect configurations using
     HSJ40 controllers with many shadow sets.

  o  Data inconsistency may exist between members of a Phase II shadow
     set.  This occurs under very heavy I/O operations to a shadow
     set while the members of that shadow set are undergoing failover
     from one controller to another.

  o  Invalid Command status processing of Write History Management
     commands unconditionally puts an entry into the error log.
     This occurs even when there is no actual error.

  o  A second shadow server may accidentally be created using the
     startup command procedure.  This results in desynchronization
     of shadow sets.  The startup procedure has been modified so
     that it does not allow multiple servers.

  o  After a system failure, the number of blocks to be rewritten
     is not computed correctly.  This may cause inconsistent data
     between shadow set members.  This occurs during an assisted
     merge when the information regarding which LBNs to include
     is only requested from one shadow set member.

  o  A process issuing I/O to a TMSCP tape device may appear to
     hang after a controller failover attempt.  This is caused by
     an incorrect check of the cached data's lost error status,
     which results in an endless loop trying to recover a
     nonexistent error.

  o  In the past, Volume Shadowing checked device IDs and the
     maximum logical block numbers (LBNs).  Volume Shadowing
     now checks for geometries and maximum LBNs.  This
     enables devices like the RZ28 and RZ28B to operate in
     the same shadow set.  Even though their device IDs differ,
     their geometries and maximum LBNs will match when configured
     on like controllers.

     NOTE:  If this remedial kit is installed across a VMScluster
            system, SCSI shadow sets that are configured across
            different controller types are not supported and will
            no longer work.

  o  A device may be mounted by an MSCP server, even though a local
     controller could be used.  This situation may still occur after
     the installation of this ECO kit under extreme timing circumstances.

  o  When new MSCP server I/O is sent to a device that is RWAITCNT
     stalled and the connection from the driver to the device fails,
     server I/O is posted to the restart queue if it is active.  If
     not, the I/O is incorrectly left on the UCB (Unit Control Block)
     pending queue.  This causes shadow sets to appear to be stalled.

     If the connection from the client to the server then fails,
     I/O from the client that has been passed to the driver is
     then allowed to complete.  If this I/O is stalled on the
     pending queue, it completes much later, possibly after
     the client has reissued the stalled I/O.

  o  Incorrect MSCP-served disk synchronization might cause I/O to
     an MSCP-served disk to become stalled on an internal queue
     which would be restarted later.

  o  I/O hangs to a shadow set might occur because the shadowing
     driver has no way to disable write logging if the write log
     entries are mismanaged or depleted to a point that the
     shadow set is unusable.

  o  An Invalid Exception bugcheck might occur in DUDRIVER during
     I/O request complete processing.

  o  In the past, MSCP could only serve 256 disks.  It can now
     serve 512.

  o  During the processing of a write-log entry in SHDRIVER, a
     register value may be improperly maintained if the system
     is low on nonpaged pool.  This will cause a system crash
     with an INVEXCEPTN Bugcheck within SHSB$GET_WLE_TABLE in
     module SHDSUBS when the entry is resumed.

  o  After approximately 18 hours of operation, some OPCOM
     messages that should be logged are skipped.

  o  If two members of a three-member shadow set are
     simultaneously removed, either intentionally or in
     a failover situation, the system may hang or fail.

  o  System crashes might occur during virtual I/O cache (VIOC)
     expansion under the following circumstances:

     +  Multiple processes (or processors) are accessing the same
        file concurrently;

     +  The cache space for that file was being expanded;

     +  That expansion caused the need for a new hash table
        structure.

  o  When subjected to a high I/O load and multiple failures,
     the write logging (minimerge) and shadowing synchronization
     subsystems become unreliable.

  o  Unreliable shadow subsystem behavior and shadow-set hangs
     occur when VMScluster nodes fail to relinquish shadow-set
     resources.

  o  The TMSCP server bugchecks in TMSCP$FIND_UQB when a command
     that refers to a specific unit is processed and that unit
     does not have the Server Local Unit Number (SLUN) bit set.

     The fix contained in this ECO kit will cause the bugcheck
     to occur in TUDRIVER instead of the TMSCP server.

  o  I/O may stall to a served shadow-set member.  Load balancing
     makes this condition more likely.

  o  System crashes may occur during processing of stale I/O in
     Host-Based Volume Shadow Sets.  This I/O does not properly
     reflect changes in shadow set configuration, notably removal of
     members and changes in the write-logging state.

  o  Shadow set members may be inconsistent after the failure
     of a node accessing a shadow set served by an Alpha node.
     The amount of corrupted data depends on previous I/O
     operations to the shadow set.
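
The change from device-ID checking to geometry checking described in
the list above can be pictured with a short sketch.  This is only an
illustration under stated assumptions: the DiskInfo record, its field
names, and every sample value are hypothetical and are not taken from
the Volume Shadowing software or from real RZ28/RZ28B parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskInfo:
    """Hypothetical description of a candidate shadow-set member."""
    device_id: str            # device type string, e.g. "RZ28"
    cylinders: int            # geometry as reported through the controller
    tracks_per_cylinder: int
    sectors_per_track: int
    max_lbn: int              # highest logical block number


def geometry(disk: DiskInfo) -> tuple:
    """Return the geometry fields as one comparable tuple."""
    return (disk.cylinders, disk.tracks_per_cylinder, disk.sectors_per_track)


def compatible_by_device_id(a: DiskInfo, b: DiskInfo) -> bool:
    """Old rule (as described in the notes): device ID and maximum LBN."""
    return a.device_id == b.device_id and a.max_lbn == b.max_lbn


def compatible_by_geometry(a: DiskInfo, b: DiskInfo) -> bool:
    """New rule (as described in the notes): geometry and maximum LBN."""
    return geometry(a) == geometry(b) and a.max_lbn == b.max_lbn


# Illustrative values only; they are NOT real drive parameters.
rz28 = DiskInfo("RZ28", 1000, 10, 100, 999_999)
rz28b = DiskInfo("RZ28B", 1000, 10, 100, 999_999)
# Same drive type seen through a different (hypothetical) controller
# that reports a different geometry for the same capacity.
rz28b_other = DiskInfo("RZ28B", 500, 20, 100, 999_999)

print(compatible_by_device_id(rz28, rz28b))       # False: IDs differ
print(compatible_by_geometry(rz28, rz28b))        # True: geometry and LBN match
print(compatible_by_geometry(rz28, rz28b_other))  # False: geometry differs
```

The last case mirrors the NOTE above: once geometry is part of the
check, members that report different geometries through unlike
controllers no longer qualify for the same shadow set.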


Problems Addressed in the VAXSHAD01_061 Kit:  

  o  In Volume Shadowing for OpenVMS Alpha Version 6.1, several
     changes were made to the assisted merge (minimerge)
     functionality.  These changes disabled minimerge functionality
     across mixed architecture VMSclusters.  With minimerge
     disabled, shadowing continued to function normally, except that
     a full merge was always done when a merge operation occurred. 
     Full merges take considerably longer than minimerges.  If you
     want minimerge functionality, Digital recommends that you
     install this kit across any VMSclusters that contain an Alpha
     node running OpenVMS Alpha Version 6.1.

     Mixed-architecture VMSclusters that are running OpenVMS Alpha
     Version 6.1 must apply this kit and reboot the entire cluster
     simultaneously.  In these cases, rolling upgrades are not
     supported.

  o  Prior to this remedial kit, if attempts were made to mount an
     RZ28B disk device with an RZ28 in the same shadow set, Volume
     Shadowing detected different device IDs and may not have
     allowed the devices to be mounted.  This behavior applied only
     to an RZ28/RZ28B shadow-set combination when connected with a
     local SCSI controller.  Since RZ28 and RZ28B are different
     device types but can be shadowed, the checking for shadow-set
     membership in the host-based shadowing software needed to be
     modified.

     This remedial kit enables the combination of RZ28 and RZ28B
     devices in a shadow set, as long as they are connected to like
     controllers.  With the use of SCSI devices, like controllers
     are required because geometry can vary from controller to
     controller.  Digital recommends that SCSI shadow sets be
     configured across like controller types.  Existing SDI and DSSI
     configurations are unaffected; if they are not using SCSI
     drives and are shadowing SDI devices across different
     controllers, these configurations will continue to work
     without this remedial kit.

     VMSclusters with shadowed SCSI disks and mixed-architecture
     VMSclusters running OpenVMS Alpha Version 6.1 must apply the
     kit and reboot the entire cluster simultaneously, so that the
     entire VMScluster is running the same version of Volume
     Shadowing software.  The kit is required for both VAX and Alpha
     nodes.  Do not mount shadow sets containing RZ28 and RZ28B
     devices without first applying this kit.

  o  The MME$$MNTREQ function, which requests that a volume be
     selected for mount, allowed the use of logical names for the
     device name.  However, because these are process logical names
     that belong to the caller's process, they are not available to
     the media manager.

  o  A "device not ready" error for magtapes is not reported until
     a delay of up to 6 minutes has expired.

  o  If a user creates a shadow set, dismounts the set, then mounts
     just one of the members, the other members of the set will be
     marked "ONLINE" when viewed from the HSC.  As a result, no HSC
     operations are allowed until the disk is MOUNTed then
     DISMOUNTed from the shadow set.

  o  If MOUNT fails to create a logical name, no error information
     is displayed.  In this case, the logical name may point to
     an incorrect device.

  o  If a device is MOUNTED/SYSTEM and then it is MOUNTED/CLUSTER
     with conflicting /OWNER_UIC or /PROTECTION qualifiers,
     incorrect error messages may be displayed.  The following two
     types of errors may occur.

       +  The error message may generate garbage which would
          force terminal characteristics to be reset to
          ASCII.

       +  The following error messages may be displayed:

            inconsistent /PROTECTION option.  Cluster mounted (garbage)
            inconsistent /OWNER_UIC option.  Cluster mounted (garbage)

  o  When a disk with a large EXTENT value is mounted under
     V6.* for the first time or if the SECURITY.SYS file is
     missing from the system, the SECURITY.SYS file will be
     created as EXTENT size and rounded up for the disk
     cluster size.  This may waste disk space.

  o  The message for %MOUNT-F-BADUNDFAT has a typographical error.

  o  If the VOLUME_ACCESSIBILITY option is used in conjunction
     with the INITIALIZE/LABEL= command upon tape initialization,
     a user with all privileges enabled is unable to access the
     tape unless he/she is the owner.

  o  In an OPCOM message, there is no separator between the
     device name and the comment text.

  o  After a BACKUP operation, the header of the INDEXF.SYS file
     of the backup save set is corrupted.  This can be seen by
     issuing the following DCL command:

          $ ANALYZE/DISK DJA0:

  o  Previously, MOUNT only waited 10 seconds to allow magtape
     devices to become ready before determining that the device is
     off line.  Tx8x7 tape devices may take up to 6 minutes to
     become ready during a volume switch.  This fix causes the wait
     to be done in user mode so that the wait can be aborted by the
     user via a CTRL/C.
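
The behavior described in the last item above (a longer wait for the
tape to become ready that the user can still abort) can be pictured
as an interruptible polling loop.  This is a minimal sketch and not
the MOUNT implementation: the function name, the one-second poll
interval, and the injectable clock and sleep parameters are
assumptions made for illustration.  Only the 6-minute upper bound
comes from the text.

```python
import time


def wait_until_ready(is_ready, timeout_s=360.0, poll_s=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the device became ready, and False on timeout or
    when the user interrupts the wait (Ctrl-C raises KeyboardInterrupt
    in Python, standing in here for CTRL/C at the terminal).
    """
    deadline = clock() + timeout_s
    try:
        while True:
            if is_ready():
                return True
            if clock() >= deadline:
                return False          # device treated as off line
            sleep(poll_s)             # hypothetical poll interval
    except KeyboardInterrupt:
        return False                  # user aborted the wait
```

A caller would pass a function that probes the device, for example
wait_until_ready(probe) with the default 360-second limit, or a
shorter timeout_s to approximate the earlier 10-second behavior.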


Problems Addressed in the VAXMTAA01_062 Kit:  

  o  The system crashes with a NOBVPVCB bugcheck.  The crash occurs
     on the kernel stack with MTAAACP.EXE as the current image.

  o  The system crashes with an XQPERR while dismounting a MAD
     drive.


Problems Addressed in the VAXMTAA02_061 Kit:  

  o  If a tape is initialized with a non-blank accessibility field
     and then mounted using /OVERRIDE=(ACCESSIBILITY), the tape
     mounts but cannot be read or written to.  The command format to
     initialize the tape would be similar to:

          INIT/LABEL=VOLUME_ACCESSIBILITY="+" tape: LABEL

     In addition, the following OPCOM messages are generated and the
     tape volume is automatically unloaded after an attempt to WRITE
     or READ the tape volume:

     %%%%%%%%%%% OPCOM 12-DEC-1994 12:57:23.53 %%%%%%%%%%%
     Message from user USERXX on NODEXX
     non-blank accessibility field in volume labels on SYS$DEVICE:

     %%%%%%%%%%% OPCOM 12-DEC-1994 12:57:23.54 %%%%%%%%%%%

  o  If a user attempts to stop the MTAAACP process or a process that
     emitted a QIO, MTAAACP will go into RWAST state and hang.


Problems Addressed in the VAXMTAA01_061 Kit:  

  o  If the wrong magnetic tape volume is inserted as the next volume,  
     MTAAACP cancels the request and then hangs.


Problems Addressed in the VAXMONT01_061 Kit:  

  o  Specifying the DISK Class to Monitor can result in unexpected
     side effects to the display.  When the MONITOR DISK command is
     issued on a system with DFS devices mounted, only the first
     three characters of the DFS name are displayed correctly.
     Instead of the fourth character, the low byte of the unit
     number is output.  It is often displayed as a non-printable
     character or as an escape sequence (in which case, it may cause
     terminal lock-ups, resetting characteristics, etc.).

     The following command illustrates this problem when executed
     on a system with DFS disks mounted:

          $ MONITOR DISK

                              DISK I/O STATISTICS
         
                               on node NODENAME
                              7-APR-1994 16:25:17

          I/O Operation Rate

          DSA2241:         FOLKLORE            6.27    6.27    6.27    6.27
          DSA2249:         AUDIT               0.00    0.00    0.00    0.00
          DSA2263:         VMS19NOVC3L         0.00    0.00    0.00    0.00
          DSA2264:         LAV19NOVC3L         0.00    0.00    0.00    0.00
          DSA2265:         MDF19NOVC3L        15.84   15.84   15.84   15.84
          DSA2266:         VMS28APRB3E         0.00    0.00    0.00    0.00
          DSA2267:         LAV28APRB3E         0.00    0.00    0.00    0.00
          DSA2268:         MDF28APRB3E         0.00    0.00    0.00    0.00
          DSA2269:         VMS18JANC3L         0.00    0.00    0.00    0.00
          DSA2270:         MDF18JANC3L         0.00    0.00    0.00    0.00
          DSA2271:         LAV18JANC3L         0.00    0.00    0.00    0.00
          DSA2280:         VMS12OCTM3C         0.00    0.00    0.00    0.00
          $254$DFSé1001()  DEC:..._STAR        0.00    0.00    0.00    0.00
          $254$DFSH8008()  V501_RESD           0.00    0.00    0.00    0.00
          $254$DFSI8009()  V51_RESD            0.00    0.00    0.00    0.00

  o  The 'MONITOR DISK' command hangs when monitoring a system with
     more than 800 disks.  MONITOR contains an arbitrary upper limit
     of 800 on the number of disks it can monitor.  When a system
     contains more than 800, MONITOR generates an error status, but
     the status is not properly signaled, and the display appears to
     hang.  This can also be seen with a 'MONITOR CLUSTER' command
     (which collects DISK data implicitly).

  o  Due to an inadequate synchronization mechanism, the MONITOR
     DISK command can go into an infinite loop on multi-processor
     machines.

  o  MONITOR PROCESS in a local environment will fail if the
     SYSGEN parameter MAXPROCESSCNT is set to allow more than 1040
     processes.  When Virtual Balance Slots were added in OpenVMS
     V6.0, this number dropped to 978.
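
The DFS name corruption in the first item above can be checked
against the sample display with a little arithmetic.  The sketch
below takes the low byte of each unit number and shows the character
it maps to.  Decoding the byte as ISO Latin-1 is an assumption made
here for illustration (the terminal's actual character set is not
stated in these notes), and unit 1051 is a hypothetical value chosen
only to show how an escape character can arise.

```python
def low_byte(unit: int) -> int:
    """Return the low-order byte of a device unit number."""
    return unit & 0xFF


def low_byte_char(unit: int) -> str:
    """Return the character displayed for that byte (Latin-1 assumed)."""
    return bytes([low_byte(unit)]).decode("latin-1")


# Unit numbers taken from the sample MONITOR DISK display above.
for unit in (1001, 8008, 8009):
    print(unit, hex(low_byte(unit)), repr(low_byte_char(unit)))

# Hypothetical unit number whose low byte is ESC (0x1B), the kind of
# value that can start an escape sequence on the terminal.
print(1051, hex(low_byte(1051)))
```

Units 1001, 8008, and 8009 give the bytes 0xE9, 0x48, and 0x49, which
correspond to the fourth characters shown after "DFS" in the three
DFS lines of the sample display.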


Problems Addressed in the VAXSYS14_061 Kit:

  o  There is a race condition possible when a CFCB (Cache File
     Control Block) is being deleted due to XQP action and cache
     space is being reclaimed from a LIMBO file.

  o  Disk corruption can occur when heavy open/read/write/close/delete 
     operations are occurring.

  o  At some point after a node CLUEXITs, 2 or more cluster nodes
     crash with LOCKMGRERR Bugchecks.

  o  When two or more VAX or Alpha nodes are booting at the same
     time, one or both of them will crash.


Problems Addressed in the VAXSYS12_061 Kit:  

  o  When a value block or value status block cannot be returned,
     SYS$GETLKI returns the error SS$_ILLRSDM.  A correction has
     been made to SYS$GETLKI so that it now returns all other 
     requested information and updates the wildcard search index.


Problems Addressed in the VAXSYS07_061 Kit:  

  o  If a multi-programming application uses a non-homogeneous
     access pattern to a file which is resident in Virtual I/O
     cache, there is a possibility that the size returned in the I/O
     status block from a READ operation will be truncated.

     If a clustered application consists of a large number of
     concurrent processes that repeatedly perform an OPEN, WRITE,
     CLOSE sequence on the same data file, a possibility of data
     corruption exists.

     In a multi-programming environment, where a significant amount
     of NEW data from a file is being loaded into the cache
     concurrently by multiple processes, the possibility of a HANG
     exists.


Problems Addressed in the VAXSYS01_061 Kit:  

  o  SYS$CHKPRO had several problems that did not manifest themselves
     in a readily visible effect to the end user.  The problems
     include:

       -  accepting up to 11 rights lists even though no more than two
          would actually be processed.

       -  CHKPRO would accept a CHP$_UIC and write it over a location
          which was to contain a rightslist pointer.

       -  In most cases the wrong UIC was used in access checking.

     The only time the customer would notice a problem is if they
     specifically tested access to an object known to be protected
     from current rights and UIC settings.

  o  Nonpaged dynamic memory (NPAGEDYN) expansion occurs even when
     there is a large amount of free space available.  This can lead
     to performance problems as pool expansion causes free memory to
     be diverted away from that available to processes and dedicated
     to nonpaged pool usage.  For example, with a SHOW MEMORY/POOL
     command you can observe that the "Total" amount of "Nonpaged
     Dynamic Memory" increases when the amount of "Free" bytes is
     quite large:

          Dynamic Mem Usage (bytes): Total     Free      In Use    Largest
          Nonpaged Dynamic Mem       38555136  17372224  21182912    38720
          Paged Dynamic Mem          17282048   8295888   8986160  8265232

     Starting with the introduction of the Adaptive Pool Management
     (APM) feature, in OpenVMS VAX V6.0, these figures include the
     contributions of both the lookaside lists and the variable pool.
     So, a large "Free" figure is indicative of large (and possibly,
     growing) lookaside lists.  If the "Total" figure is increasing,
     it indicates that pool expansion is occurring, and that the
     lookaside list space is not being used effectively.

     The above symptom can result from either of the two following
     separate problems:

       -  A routine in the software which supports security features
          such as "rightslists" was obtaining a nonpaged pool block
          and then freeing it in two smaller pieces.

       -  An internal loop counter governing the number of times a
          lookaside list allocation was attempted, was set too low.
          This problem will most likely be seen on the VAX 6000 - 500
          and 600.

     A third software change associated with APM will also be
     available in a future OpenVMS VAX version, but is not available
     as a remedial change.  The third change provides a potential
     performance benefit under very specialized conditions, such as
     during VMScluster state transitions.
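
The SHOW MEMORY/POOL figures quoted in the second item above can be
checked with simple arithmetic.  The sketch below uses only the
numbers from that sample display; the function names are
illustrative, and reading a small "Largest" value as a sign that
free space is held in many small pieces is an interpretation, not a
statement from the kit documentation.

```python
# Figures copied from the sample SHOW MEMORY/POOL display (bytes).
POOLS = {
    "Nonpaged Dynamic Mem": {"total": 38555136, "free": 17372224,
                             "in_use": 21182912, "largest": 38720},
    "Paged Dynamic Mem": {"total": 17282048, "free": 8295888,
                          "in_use": 8986160, "largest": 8265232},
}


def free_percent(pool: dict) -> float:
    """Share of the pool reported as free, in percent."""
    return 100.0 * pool["free"] / pool["total"]


def largest_percent_of_free(pool: dict) -> float:
    """Largest contiguous free block as a share of all free bytes."""
    return 100.0 * pool["largest"] / pool["free"]


for name, pool in POOLS.items():
    # In the sample display, Free + In Use adds up to Total.
    consistent = pool["free"] + pool["in_use"] == pool["total"]
    print(f"{name}: consistent={consistent}, "
          f"free={free_percent(pool):.1f}%, "
          f"largest/free={largest_percent_of_free(pool):.2f}%")
```

For nonpaged pool, roughly 45 percent of the total is free, yet the
largest free block is well under 1 percent of the free bytes, which
fits the description of free space sitting on lookaside lists while
the pool continues to expand.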


Problems Addressed in the VAXMME01_061 Kit:  

  o  The MME$$MNTREQ function, which requests that a volume be
     selected for MOUNT, allows the use of logical names for
     the device name.  However, because these are process logical
     names that belong to the caller's process, they are not
     available to the media manager.

  o  MME applications are no longer able to set mount and device
     context.


Problems Addressed in the VAXOPCO01_061 Kit for OpenVMS VAX V6.1:

  o  When a node leaves a VAXcluster, OPCOM goes into a tight
     loop on one of the remaining nodes in the cluster.  OPCOM
     can be seen using 90-95% of the CPU.


Problems Addressed in the VAXAUDI02_061 Kit for OpenVMS VAX V6.1:

  o  The Audit Server EXCLUDE process list may become corrupt after
     the DCL 'SET AUDIT/EXCLUDE=pid' command is issued.


INSTALLATION NOTES:

This kit *MUST* be installed on every VAX in a mixed-architecture
VMScluster, and the Alpha (ALPSHAD) version of this kit *MUST* be 
installed on every Alpha system in the cluster BEFORE any systems 
are re-booted into the VMScluster.  If the correct kit is not 
installed on each system, shadow sets cannot be created.  System 
crashes may also occur if the kits are not installed on all
appropriate cluster nodes.

The following restrictions will apply upon completion of the
installation:

  o  VMSclusters with shadowed SCSI disks and mixed-architecture
     VMSclusters running OpenVMS Alpha V6.1 must apply the kit and
     reboot the entire cluster simultaneously.  In these cases,
     rolling upgrades are not supported.

  o  Working configurations that contain SCSI shadow sets on
     dissimilar controllers may no longer work.

References:

ORACLE is a registered trademark of Oracle Corporation.
WordPerfect is a trademark of WordPerfect Corporation.

Files on this server are as follows:

»vaxshad09_061.README
»vaxshad09_061.CHKSUM
»vaxshad09_061.CVRLET_TXT
»vaxshad09_061.a-dcx_vaxexe




© 1994-2002 Hewlett-Packard Company