Windows NT / 2000 Partition Size and Disk Errors on HSxx
Controllers, Disk Admin Signature Issue
Problem Description
Windows NT Disk Administrator can read incorrect data from a newly configured RAID volume
on an HSZ / HSJ / HSG controller if the first disk was previously part of another NT disk, or if
the first disk contains an EISA volume or any other data that Disk Administrator interprets as
a valid disk signature. This can lead to unpredictable errors and loss of customer data.
See Advisory OD010710_CW01
Symptoms
Symptoms vary widely, but can include the following:
o Disk Administrator shows a partition size larger than the physical disk size, and compression
is not checked.
o Applications fail, and Event log shows event 9 and event 11 paired together:
Event 9 from aic78xx: "The device, \Device\ScsiPortx, did not respond within the timeout
period."
Event 11 from hszdisk: "The driver detected a controller error on
\Device\Harddisk1\Partition1."
DECEvent will break out the sense data: "ASC:21 ASCQ:00 Logical block address out of
range"
o Windows NT or 2000 experiences Lost Delayed Write errors
o Windows NT or 2000 experiences unexplained data corruption
o Windows NT or 2000 Secure Path crashes or won't boot when storage
is attached
o Windows NT or 2000 Secure Path will not install or refuses to properly
recognize the storage
o Windows NT or 2000 system hangs or crashes during bootup when storage
is connected
o Windows NT or 2000 MSCS cluster services reports signature errors or
won't operate properly with units
o Windows NT or 2000 MSCS cluster drive letters won't stick and other
strange unexplained behavior with MSCS
o Windows NT or 2000 MSCS unexplained corruption of registry entries
containing disk signatures.
o Windows NT or 2000 Virtual Replicator will not recognize any available
HSG80 devices
o Windows NT or 2000 8MB partitions and other invalid data visible on
the disks when first installed
o Windows NT or 2000 Disk Administrator fails to prompt the user to
write a new signature on the new storage when first detected
o Windows NT or 2000 full format fails or triggers lost delayed write
errors before completion; a quick format will not detect the problem
in most cases
o Windows NT or 2000 EVM or other utilities refuse to recognize disks or
storage as valid resources
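The paired event 9 / event 11 symptom can be checked mechanically once the System event log has been exported. The following is a minimal sketch, not a supported tool: the record format (dictionaries with "time", "source" and "id" keys), the function name find_paired_errors and the 5-second pairing window are all assumptions made for illustration. The event numbers, driver names and sense-code text are the ones quoted in the symptom list above.

```python
from datetime import datetime, timedelta

# Sense code quoted in the symptom list (ASC 21h, ASCQ 00h).
SENSE_TEXT = {(0x21, 0x00): "Logical block address out of range"}


def find_paired_errors(events, window_seconds=5):
    """Return (event 9, event 11) pairs logged within window_seconds of each other.

    `events` is a list of dicts with "time" (datetime), "source" (str) and
    "id" (int). This record layout is hypothetical, not a Windows export format.
    """
    window = timedelta(seconds=window_seconds)
    timeouts = [e for e in events if e["id"] == 9]
    pairs = []
    for err in events:
        if err["id"] != 11:
            continue
        for t in timeouts:
            # Pair in either order, as long as the two events are close in time.
            if abs(err["time"] - t["time"]) <= window:
                pairs.append((t, err))
                break
    return pairs


if __name__ == "__main__":
    base = datetime(2001, 7, 10, 12, 0, 0)
    log = [
        {"time": base, "source": "aic78xx", "id": 9},
        {"time": base + timedelta(seconds=1), "source": "hszdisk", "id": 11},
        {"time": base + timedelta(minutes=30), "source": "hszdisk", "id": 11},
    ]
    for timeout, error in find_paired_errors(log):
        print(timeout["source"], timeout["id"], "paired with",
              error["source"], error["id"])
    print(SENSE_TEXT[(0x21, 0x00)])
```

An isolated event 11 with no nearby event 9, like the third record in the example, is not reported as a pair.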
Cause
HSxx controllers do not overwrite user data on disks when creating RAID volumes. If the first
member of a new RAID volume was previously a member of a larger or smaller volume, the
signature block will remain intact and be picked up by Disk Administrator. If a new Universal
drive with an 8 MB EISA partition is the first member of a RAID volume, its signature can be
picked up as well.
The types of errors seen will vary depending on the information in the signature block.
Disk Administrator does no error checking of the disk signature block, and there is no way to
force an update. Usually by the time a customer notices this, it is too late and data loss has
occurred.
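To illustrate why leftover data is dangerous, the sketch below parses the first sector of a disk the way a conventional PC master boot record is laid out, then flags partition entries that extend past the capacity of the new unit. This advisory does not describe the on-disk layout, so the offsets used here (4-byte disk signature at 0x1B8, four 16-byte partition entries at 0x1BE, 0x55 0xAA marker in the last two bytes) are assumed from the standard MBR format. The function names and example values are hypothetical.

```python
import struct

SECTOR_SIZE = 512  # bytes per sector (assumed)


def parse_sector0(sector):
    """Extract the disk signature and partition entries from an MBR-style sector."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    # 4-byte little-endian disk signature at offset 0x1B8.
    (signature,) = struct.unpack_from("<I", sector, 0x1B8)
    # Boot-record marker 0x55 0xAA in the last two bytes of the sector.
    has_marker = sector[510:512] == b"\x55\xaa"
    partitions = []
    for i in range(4):
        # Entry layout: status, CHS start (skipped), type, CHS end (skipped),
        # starting LBA, sector count.
        _status, ptype, start_lba, sectors = struct.unpack_from(
            "<B3xB3xII", sector, 0x1BE + 16 * i)
        if ptype != 0:  # type 0 marks an unused entry
            partitions.append(
                {"type": ptype, "start_lba": start_lba, "sectors": sectors})
    return {"signature": signature, "has_marker": has_marker,
            "partitions": partitions}


def stale_entries(info, unit_sectors):
    """Return partition entries that end beyond the capacity of the unit."""
    return [p for p in info["partitions"]
            if p["start_lba"] + p["sectors"] > unit_sectors]


def build_example_sector():
    """Build a sector resembling one left over from a larger, older volume."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<I", sector, 0x1B8, 0x1234ABCD)  # leftover signature
    # One partition of 8,000,000 sectors starting at LBA 63.
    struct.pack_into("<B3xB3xII", sector, 0x1BE, 0x00, 0x07, 63, 8_000_000)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


if __name__ == "__main__":
    info = parse_sector0(build_example_sector())
    # The new unit holds only 4,000,000 sectors, so the old entry overruns it.
    for p in stale_entries(info, unit_sectors=4_000_000):
        print("partition extends past end of unit:", p)
```

A partition entry that overruns the unit in this way is consistent with the first symptom listed above (a partition larger than the physical disk) and with the "Logical block address out of range" sense data.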
Resolution/Workaround
Resolution 1 (Recommended)
Run the DILX exerciser as indicated below before presenting any new storage unit to a
Microsoft Windows NT, Windows 2000, or any other operating system that may misinterpret
user data left on the disk. NOTE: Windows NT 4.0 8 MB partitions can be deleted from
within Disk Administrator.
If the customer is already experiencing problems listed above, back up the customer's data,
shut down the server, then run the DILX utility as indicated.
When Disk Administrator is first run, it should tell you that it is writing signature information to
the disk. When formatting a partition under Disk Administrator for the first time, the customer
should be encouraged to select a full format instead of a quick format. This will force a write to
every block on the disk and expose any possible corruption issues or partition problems.
The following instructions must be followed carefully to ensure the disk signature information
will be erased. WARNING: All Data will be destroyed!
Execute the basic function test:
1. Be sure the device is offered as a unit and that it is preferred
to the controller that you are working from.
2. Run DILX.
3. Answer no to the Autoconfigure prompt.
4. Answer no to the "read only mode" prompt.
5. Answer 5 to the "execution time" prompt. (We only need to write
over sector 1.)
6. Accept the defaults until the "test number" prompt; enter 1 for
the basic function test.
7. Answer "y" to the "write enable disk" prompt.
8. Accept defaults until you receive the "Perform initial write"
prompt, answer "Y".
9. Accept defaults until the "enter unit number to be tested"
prompt, enter the number for your device, without the
"D" (ex. 350, not D350).
10. Answer "Y" to the "do you still want to add this unit" prompt.
11. Answer "Y" to the "select another unit" prompt, and enter any
other unit numbers to test, then answer "N". You can only test
units served by "This" controller.
12. The test starts now.