Disk fundamentals
Old edition chapter 14

The virtual layering
• Virtual layering of the disk storage system:
1. Disk controller firmware – controller chips or a card that map physical disk geometry for different drive brands and models
2. BIOS – low-level functions to read/write sectors or format tracks
3. OS API – services to open/close files, set properties, and read/write files

Virtual levels of disk access
[Figure: layers from top to bottom: OS API, System BIOS, Disk controller firmware]

Common to all systems
• Physical partitioning of data
• Access to data at the file level
• Map filenames to physical locations

Hardware level
• Platters
• Sides
• Tracks
• Cylinders
• Sectors

OS level
• The OS-level view of the disk is in terms of partitions, directories and files

Assembly access to disk
• Readily available using the BIOS under MS-DOS for ME, NT, XP, Windows 7, etc.
• Store and retrieve data in a special format (like Hamming or Huffman codes)
• Recover lost data
• Perform diagnostics
• Under NT or XP you must use the Win32 API for disk manipulation, or write device drivers with high privilege

Tracks, cylinders, sectors
• A disk is made up of multiple platters
• The platters are attached to a spindle which rotates at constant speed
• Above the surface of each platter is a r/w head that records magnetic pulses
• The heads move in or out as a group
• See text sketch p 465

Tracks, cylinders, sectors
• The surface of the disk is formatted into (invisible) concentric bands called tracks, where data is stored magnetically.
• A disk will have thousands of tracks.
• Moving the r/w head from one track to another is called seeking.
• (Not mentioned in the text) Latency is the time it takes a particular sector to rotate around under the head.
• Seek time for a disk is one sort of performance measure.
• RPM is another performance measure, usually 7200.
• The outermost track is track 0, and numbers increase as you move toward the center.

Tracks, cylinders, sectors
• All tracks readable from a given r/w head position together form a cylinder.
• A file would typically be stored on disk using adjacent cylinders. This reduces seek time.
• A sector is a 512-byte portion of a track.
• Physical sectors are magnetically marked at the factory using low-level formatting. Their size does not change regardless of the OS used. A hard disk may have 63 or more sectors per track.

[Photo: hard disk with reflective platter visible]
[Figure 1: A platter from a 5.25" hard disk, with 20 concentric tracks drawn over the surface. Each track is divided into 16 imaginary sectors.]

Sectors & tracks
• A sector is the basic unit of data storage on a hard disk. The term "sector" comes from the mathematical term for the pie-shaped section of a circle bounded on two sides by radii and on the third by the perimeter of the circle (see Figure 1). In its simplest form, a hard disk is made up of groups of predefined sectors, each group forming a circle. Each circle of predefined sectors is a single track. A group of concentric circles (tracks) defines a single surface of a disk's platter. Early hard disks had just a single one-sided platter, while today's hard disks have several platters with tracks on both sides, all of which together make up the entire hard disk capacity. Early hard disks had the same number of sectors in every track, and in fact the number of sectors per track was fairly standard between models. Today's advances in drive technology have allowed the number of sectors per track, or SPT, to vary significantly, but more about that later.

More about disks
• When a hard disk is prepared with its default values, each sector is able to store 512 bytes of data. A few operating system disk setup utilities permit this 512-byte sector size to be modified; however, 512 is the standard and is found on virtually all hard drives by default.
• Each sector, however, actually holds much more than 512 bytes of information.
Additional bytes are needed for control structures: information necessary to manage the drive, locate data and perform other functions. The exact sector structure depends on the drive manufacturer and model; however, the contents of a sector usually include the following elements:
• ID Information: Within each sector a small space is left to identify the sector's number and location, which is used to locate the sector on the disk and to provide status information about the sector itself. For example, a single bit is used to indicate whether the sector has been marked defective and remapped.
• Synchronization Fields: These are used internally by the drive controller to guide the read process.
• Data: The actual data in the sector.
• ECC: Error correcting code used to ensure data integrity.
• Gaps: Often referred to as spacers, these separate sector areas and give the controller time to process what has been read before processing additional data.
• Servo Information: In addition to the sectors, each of which contains the items above, space on each track is allocated for servo information on drives that use embedded servo. Most, if not all, modern drives now employ servo technology.

Aside: Zoned Bit Recording
• We would be remiss in our discussion of drive sectors, tracks and performance without mentioning major improvements such as Zoned Bit Recording. One of the methods used to increase capacity and data access speeds on hard disks is to improve the utilization of the larger, outer tracks of the disk.
• Early hard disks were extremely primitive, and their controllers weren't capable of handling complicated arrangements that varied from track to track. As a result, every track had the same number of sectors, with the standard set at 17 sectors per track.
As you can see from our sketch above, Figure 1, tracks are concentric circles, with the ones on the outside of the platter much larger in circumference than the ones closer to the center. Since there is a constraint on how tightly the inner circles can be packed with bits, developers packed them as tightly as possible given the state of technology at the time. By reducing the bit density on the outer circles, developers were able to assign them the same number of sectors. Essentially this meant that the inner sectors were packed so tightly there was no room for error, while the outer sectors were underutilized: in theory they could hold many more sectors given the same linear bit density limitations as were imposed on the inner sectors.

Zoned Bit Recording
• Drive developers, in an effort to create larger drive sizes as well as improve utilization and performance, developed a technology referred to as zoned bit recording (ZBR). Zoned bit recording is often referred to as multiple zone recording or just zone recording.
• With this technology, tracks are grouped into zones based on their distance from the center of the disk, and each zone is assigned a number of sectors per track. As you move from the innermost part of the disk to the outer edge, you move through different zones, each containing more sectors per track than the one before. This makes more efficient use of the larger tracks on the outside of the disk.
• In essence, with ZBR, the size (or length) of a sector remains reasonably constant over the entire surface of the disk. This is in stark contrast to very early hard disks that did not employ ZBR, whose tracks were limited to only 17 sectors regardless of track size.
• An interesting added benefit of zoned bit recording is that the raw data transfer rate of the disk, also referred to as the media transfer rate (a bit of a misnomer), is considerably higher when reading the outside cylinders than when reading the inside ones.
Although the angular velocity of the platters is constant regardless of which track is being read, the outer cylinders contain more data, so more data passes under the head on each revolution. Bear in mind that although the angular velocity is constant, the outer tracks (at the periphery of the platter) move past the head much faster than the tracks near the center of the platter. Note that not all drive technologies use constant angular velocity; older CD-ROM drives, for example, do not. Data is written to the outer tracks of a drive first, so the drive is filled from the outside in. The fastest data transfer occurs when the drive is first used and the data is held in the outer tracks. Many people benchmark their systems and hard drives when new, make some tweaks and changes, and then return to their benchmarks weeks or months later, only to be unpleasantly surprised that the disk appears to be getting slower. Actually, the disk has probably not changed at all, but the second benchmark may have been run on tracks closer to the center of the disk. Fragmentation of the file system can also affect benchmark results, although most people who take benchmarking seriously defragment their drives before running the tests.

Fragmentation
• Disk storage becomes fragmented over time, just like main memory.
• A fragmented file is not located in contiguous disk sectors. This slows access time.

Translation
• Translation is the process of converting physical geometry into logical structure.
• The drive itself or a card has a controller to perform this operation.
• The OS works with logical (not physical) sector numbers.

Logical Block Addressing: aside
• Prior to the advent of Logical Block Addressing, all hard drives were accessed via CHS (Cylinder, Head, Sector) or Extended CHS, which means that a location on the drive was accessed by specifying its cylinder, head and sector address.
More appropriately, it was referred to as accessing the drive through its "geometry". Extended CHS was a transitional change in the way a drive was accessed in order to work around the 504 MiB barrier; however, the addressing was still done in terms of cylinder, head and sector numbers, which were then translated one or more times before actually accessing the drive itself.
• By contrast, logical block addressing (LBA) involves a completely new method of addressing sectors, new in the sense that it is new to the EIDE/IDE interface; LBA was first developed around SCSI hard drives. With LBA, instead of referring to a drive's cylinder, head and sector number geometry in order to access or "address" it, each sector is assigned a unique "sector number". In essence, LBA is a means by which a drive is accessed by addressing sectors linearly, beginning with sector 1 of head 0, cylinder 0 as LBA 0, and proceeding in sequence to the last physical sector on the drive, which, for instance, on a standard 540 Meg drive would be LBA 1,065,456. While this was new in the AT Attachment specification ATA-2, it has always been the one and only addressing mode in SCSI. ATA-2 has subsequently been replaced, and the latest AT specification is ATA-7. Note also that LBA does not allow you to address more sectors than CHS-style addressing would.

Logical Block Addressing
• In order for you to employ LBA, it must be supported by both the BIOS and the operating system. In addition, since it is a new method of communicating with the hard drive, the drive itself must support LBA as well. All newer hard drives do in fact support LBA.
• Often we review other sites to ensure that we provide you with accurate information, and with respect to LBA, we came upon a unique, but inaccurate, statement. One purported authority on computer systems stated that when drives supporting LBA are auto-detected by a BIOS that supports LBA, the drive will be set up to use that mode.
This is inaccurate and misleading, as there's nothing in the BIOS code that will set up your drive to use LBA mode. If you have ever used Fdisk, you may recall that during the drive setup process you are asked whether you want to enable LBA. Hence it is a function of the operating system, so don't expect your BIOS to somehow mysteriously set up your drive.
• While it is true that a drive enabled for LBA is not subject to the 504 MiB drive size barrier, there still remains considerable confusion about Logical Block Addressing and what it does. Many knowledgeable technicians and users believe that it is LBA addressing that avoids the 504 MiB barrier; however, this is not quite accurate. Logical Block Addressing by itself doesn't get around the barrier, because it is just another manner in which to address the same geometry. If you were still limited to 1,024 cylinders, 16 heads and 63 sectors, you would still have logical sectors beginning with number 0 and progressing sequentially through to 1,032,191, with the 504 MiB barrier still in place.
• What does avoid this barrier is that LBA mode automatically enables geometry translation. This translation is required because the operating system calling the BIOS Int 13h routines knows nothing about LBA. Therefore it is the translation part of LBA that really gets around the barrier. When LBA is enabled, the BIOS will enable geometry translation. This translation may be done in the same way that it is done in Extended CHS or large mode via a drive's geometry, or it may be done using a different algorithm called LBA-assist translation. It is this translated geometry that is presented to the operating system for use in Int 13h calls. Basically, the difference between LBA and ECHS is that when using ECHS the BIOS translates the parameters used by these calls from the translated geometry to the drive's logical geometry. With LBA, it translates from the translated geometry directly into a logical block (sector) number.
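The relationship between a CHS address and a linear block number described above can be sketched in a few lines of Python. This is only an illustration of the conventional arithmetic (sector numbers start at 1, cylinders and heads at 0). The function names are made up for this example, and it does not model any particular BIOS translation scheme such as ECHS or LBA-assist.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated earlier in these notes


def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Convert a (cylinder, head, sector) address to a linear block number.

    Hypothetical helper: 'heads' and 'sectors_per_track' describe the
    geometry being addressed. Sector 1 of head 0, cylinder 0 maps to LBA 0.
    """
    if cylinder < 0 or not 0 <= head < heads or not 1 <= sector <= sectors_per_track:
        raise ValueError("address outside the given geometry")
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads, sectors_per_track):
    """Inverse of chs_to_lba for the same geometry."""
    cylinder, remainder = divmod(lba, heads * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


def capacity_bytes(cylinders, heads, sectors_per_track):
    """Total bytes addressable with the given geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


if __name__ == "__main__":
    # Geometry behind the 504 MiB barrier: 1,024 cylinders, 16 heads, 63 sectors.
    last = chs_to_lba(1023, 15, 63, heads=16, sectors_per_track=63)
    print(last)                                        # highest logical sector number
    print(capacity_bytes(1024, 16, 63) / 2**20)        # capacity in MiB
    print(capacity_bytes(1024, 256, 63))               # Int 13h limit in bytes
```

With the 1,024 x 16 x 63 geometry, the last sector comes out as 1,032,191 and the capacity as 504 MiB, matching the figures quoted in the text.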
• LBA is currently the dominant form of hard disk addressing. When the 8.4 GB limit of the Int 13h interface was reached in 1998-1999, it became impossible to express the geometry of large hard disks using cylinder, head and sector numbers, translated or not, while remaining below the Int 13h limits of 1,024 cylinders, 256 heads and 63 sectors. This is one of the reasons that today's hard drives no longer indicate their classical geometry.

Disk partitioning
• A single hard drive may be partitioned into logical units named partitions or volumes, each represented by a letter: A, B, C, ...
• A partition may be primary or extended, and a drive may contain both types.
• A primary partition is bootable.
• An extended partition may be further divided into an unlimited number of logical partitions. Each is mapped to a drive letter and cannot be bootable, but each may be formatted with a different file system.

Multiboot systems
• It is common to create multiple primary partitions, each booting a different OS.
• Mathlab is dual boot
• In industry, you might have primary partitions for development and production.
• Logical partitions hold data. Different OSes can access the same file systems. Both Linux and DOS can read FAT32 disks.

FDISK.exe under MS-DOS
• Creates and removes partitions
• Does not preserve data
• Later versions (Win2000 and later) have a disk manager utility

File systems
• Every OS has some disk management system.
• At the lowest level it manages partitions; at the next highest, files and directories.
• It must keep track of location, size and attributes for each file.

FAT File-Allocation-Table (see also later slide)
• Maps logical sectors to clusters (a basic storage unit)
• Maps files and directories to sequences of clusters.
• A cluster is the smallest unit of space used by a file, consisting of one or more adjacent disk sectors.
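Because a file always occupies whole clusters, the last cluster of a file is usually only partly used. The sketch below illustrates that arithmetic, assuming 512-byte sectors. The function names and the example file sizes are hypothetical and chosen only for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector


def clusters_needed(file_size, sectors_per_cluster):
    """Number of whole clusters a file of file_size bytes occupies."""
    cluster_size = sectors_per_cluster * SECTOR_SIZE
    # Ceiling division without floating point; an empty file needs no clusters.
    return -(-file_size // cluster_size)


def slack_bytes(file_size, sectors_per_cluster):
    """Bytes allocated to the file but not used by its data."""
    cluster_size = sectors_per_cluster * SECTOR_SIZE
    return clusters_needed(file_size, sectors_per_cluster) * cluster_size - file_size


if __name__ == "__main__":
    # A 9,192-byte file on a volume with 8 sectors (4,096 bytes) per cluster.
    print(clusters_needed(9192, 8), slack_bytes(9192, 8))
    # A 1-byte file on a volume with 64 sectors (32 KiB) per cluster.
    print(clusters_needed(1, 64), slack_bytes(1, 64))
```

The second case shows why large clusters waste space on small files: a single byte of data still consumes a full 32 KiB cluster.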
Wikipedia FAT
• File Allocation Table (FAT) is a file system developed by Microsoft for MS-DOS and was the primary file system for consumer versions of Microsoft Windows up to and including Windows Me. FAT as it applies to flexible/floppy and optical disk cartridges (FAT12 and FAT16 without long file name support) has been standardized as ECMA-107 and ISO/IEC 9293. The file system is partially patented.
• The FAT file system is relatively uncomplicated, and is supported by virtually all existing operating systems for personal computers. This ubiquity makes it an ideal format for floppy disks and solid-state memory cards, and a convenient way of sharing data between disparate operating systems installed on the same computer (a dual boot environment).
• The most common implementations have a serious drawback in that when files are deleted and new files written to the media, directory fragments tend to become scattered over the entire media, making reading and writing a slow process. Defragmentation is one solution to this, but is often a lengthy process in itself and has to be performed regularly to keep the FAT file system clean.

Wikipedia NTFS
• NTFS (New Technology File System) is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, and Windows Vista.
• NTFS replaced Microsoft's previous FAT file system, used in MS-DOS and early versions of Windows. NTFS has several improvements over FAT and HPFS (High Performance File System) such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization plus additional extensions such as security access control lists and file system journaling. The exact specification is a trade secret, although (since NTFS v3.00) it can be licensed commercially from Microsoft through their Intellectual Property Licensing program.
XP disk management tool

Cluster sizes for a 1.25-2 GB volume
FAT Type   Cluster Size   Number of FAT Entries   Size of FAT
FAT16      32 KiB         65,526                  ~128 KiB
FAT32      4 KiB          524,208                 ~2 MiB

Clusters used by FAT
• A chain of clusters is referenced by a FAT that keeps track of all clusters used by a file. Pictures show cluster chain and wasted space examples.
[Figure: sectors 1-8 grouped into clusters (cluster1, cluster2, ...); a cluster chain labeled "4096 used", "4096 used", "1000 bytes used", showing wasted space in the last cluster]

FAT 12
• Still supported by Windows and Linux
• Cluster size is 512 bytes – perfect for small files
• Each table entry is 12 bits
• A volume holds fewer than 4087 clusters

FAT 16
• The only system for drives formatted under MS-DOS
• Supported by all versions of Windows and Linux
• Drawbacks:
– Storage is inefficient on volumes over 1 GB due to large cluster size
– Each table entry is 16 bits, limiting the total number of clusters that can be accessed
– A volume holds between 4087 and 65,526 clusters
– The boot sector has no backup, so a read error can be catastrophic
– No built-in security or individual user permissions

FAT 32
• Introduced with the OSR2 (OEM Service Release 2) release of Windows 95 and later refined
• A single file can be up to 4 GB (minus 2 bytes)
• Each table entry is 32 bits
• A volume holds from 65,526 up to 268,435,456 clusters
• A volume can hold up to 32 GB
• Smaller clusters than FAT 16 on volumes of 1 GB to 8 GB, resulting in less waste
• The boot record has a backup of critical information

NTFS
• Supported under NT, 2000, XP
• Handles large volumes, possibly spread over multiple drives
• For disks > 2 GB, the default cluster is 4 KB
• Supports Unicode filenames up to 255 chars long
• Permissions
• Built-in encryption
• Change journal can track file revisions
• Disk quotas for individuals or groups of users
• Robust recovery from data errors; automatically repairs errors
• Supports multiple disk mirroring (a mirror is a copy)

ECC and Hamming
• The Hamming code is a fairly expensive single-error correction scheme developed by Richard Hamming at Bell Labs.
• Bits at power-of-2 positions (1, 2, 4, 8, ...) store the parity of the other bits, which they correct. So bit 1 is parity for all the odd bits. Bit 2 is parity for bits 3, 6, 7, 10, 11, 14, 15. Bit 4 is the correction bit for bits 5, 6, 7, 12, 13, 14, 15. Bit 8 corrects 9, 10, ..., 15, and so on.

Hamming… performance
• To send an 8-bit piece of data (an ASCII code, for example), we use correcting bits 1, 2, 4, and 8 (4 bits) plus the 8 data bits, so we "package" 12 bits. Notice this is a 33% overhead.
• To send 16 bits of data we would use correction bits 1, 2, 4, 8 and 16 for a 21-bit package, where the overhead has dropped to less than 25%.
• We can send up to 247 bits of data using parity bits 1, 2, 4, 8, 16, 32, 64, and 128 (= 8 correction bits), so the overhead has dropped to 8/255, or about 3%.

Hamming… a 12-bit example
• Compute the correcting bits to send 8 bits of data, like 'A' or '9'.
• Assume even parity bits.

ECC example
• With bit-interleaved parity, disk 4 might hold the parity bits for data on the other three disks.
• Bits are read simultaneously off the 4 disks. If data is lost on one of the 3 data disks, it can be recovered from the parity disk.
• For example, if the data bits read are 1X1 (two good bits, with X marking the lost bit) and the parity bit is 1, then under even parity the lost bit X must be a 1.

MS DOS boot record
• See text pg 471
• The root directory is the main directory for a disk volume.
• A directory entry for a file contains filename, size, attribute and starting cluster number.

Directory trees
• FAT and NTFS have root directories containing the primary list of files on the disk.
• Subdirectories may be contained in the directory.

Directory trees
[Figure: directory tree with a root directory and subdirectories labeled cpp, java, asm, lib, jdk, source, bin, jar, bin, etc]

MS DOS directory structure
• MS-DOS entries are 32 bytes long, with the fields shown in table 14-5

MS DOS directory entry
Hex ofs   Field           Format
00        Filename        ASCII
08        Extension       ASCII
0B        Attribute       8-bit binary
0C        Reserved
16        Time            16-bit binary
18        Date            16-bit binary
1A        Start cluster   16-bit binary
1C        Size            32-bit binary

Filename status byte
Status byte   Description
00h           Entry never used
01h           With attr=0Fh and status byte 01h, this is the first entry of a long filename
05h           First character of the filename is actually the character E5h
E5h           Entry is for a filename where the file has been erased
2Eh           (.) Entry is for a directory name
4nh           First long name entry; with attr=0Fh this marks the end. n = number of entries for the filename

Attribute field is bit-mapped
Bit   Meaning
7     Reserved
6     Reserved
5     Archive
4     Subdirectory
3     Volume label
2     System
1     Hidden
0     Read-only
An entry of 0Fh indicates that the current directory entry is for an extended filename.

Date stamp
• Year = 0..119 and is added to 1980
• Month = 1..12
• Day = 1..31
• Bit layout: year in bits 15-9, month in bits 8-5, day in bits 4-0

Time stamp
• Hour = 0..23
• Minute = 0..59
• Seconds = 0..59 (stored in 2-second increments in the 5-bit field)
• Bit layout: hours in bits 15-11, minutes in bits 10-5, seconds in bits 4-0

MSDOS 32 bit date/time
Same as the 16-bit fields, but the date is the high word of a doubleword:
• Year: bits 31-25
• Month: bits 24-21
• Day: bits 20-16
• Hour: bits 15-11
• Min: bits 10-5
• Sec: bits 4-0

Cluster chain example (just links are shown)
Cluster  1   2   3   4   5   6   7   8   9   10   11  12  13  14  15  16
Link     2   3   4   8   -   -   -   9   10  eoc  -   -   -   -   -   -
File starting cluster = 1, file size = 7 clusters

Cluster chain example #2 (just links are shown)
Cluster  1   2   3   4   5   6   7   8   9   10  11  12   13  14  15  16
Link     -   -   -   -   6   7   11  -   -   -   12  eoc  -   -   -   -
File starts in cluster 5, size = 5 clusters

FAT
• When a file is created, the OS looks for an available cluster entry in the FAT. Gaps occur if insufficient contiguous entries are available, typically as files are deleted and new ones added.
• As files are modified and resaved, their chains become fragmented.
• As the r/w heads jump between cylinders to locate all of a file's clusters, performance degrades.
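The two cluster chain examples above can be traced with a short Python sketch. It assumes the FAT is modeled as a dictionary mapping each cluster number to the next cluster in the file, with a made-up sentinel standing in for the end-of-chain (eoc) marker. A real FAT stores fixed-width table entries, so this shows only the linking logic, not the on-disk format.

```python
EOC = -1  # hypothetical stand-in for the FAT end-of-chain marker


def cluster_chain(fat, start):
    """Follow the links in 'fat' from 'start' and return the clusters in order."""
    chain = []
    visited = set()
    current = start
    while current != EOC:
        if current in visited:
            raise ValueError("loop detected in cluster chain")
        if current not in fat:
            raise ValueError(f"cluster {current} has no FAT entry")
        visited.add(current)
        chain.append(current)
        current = fat[current]
    return chain


def count_fragments(chain):
    """Count runs of consecutive cluster numbers; 1 means fully contiguous."""
    if not chain:
        return 0
    breaks = sum(1 for a, b in zip(chain, chain[1:]) if b != a + 1)
    return breaks + 1


# Example 1 from the slides: file starts in cluster 1 and uses 7 clusters.
fat_example_1 = {1: 2, 2: 3, 3: 4, 4: 8, 8: 9, 9: 10, 10: EOC}
# Example 2 from the slides: file starts in cluster 5 and uses 5 clusters.
fat_example_2 = {5: 6, 6: 7, 7: 11, 11: 12, 12: EOC}

if __name__ == "__main__":
    for fat, start in ((fat_example_1, 1), (fat_example_2, 5)):
        chain = cluster_chain(fat, start)
        print(chain, "fragments:", count_fragments(chain))
```

Both example files come out as two fragments, which is the kind of gap that forces the r/w heads to move between separate regions of the disk.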
3 programs
• The previous (5th) edition of the text contained 3 programs: to read sectors, check free disk space, and look at clusters.
• But the first two do not run under XP.
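For comparison only, free disk space can also be queried from a high-level language without BIOS calls. The sketch below is not one of the textbook's programs. It assumes Python's standard `shutil.disk_usage` function, and the helper name `free_space_report` is made up for this example.

```python
import shutil


def free_space_report(path="."):
    """Return total, used and free bytes for the volume containing 'path'."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free
    percent_free = 100.0 * usage.free / usage.total if usage.total else 0.0
    return {
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "free_percent": percent_free,
    }


if __name__ == "__main__":
    report = free_space_report(".")
    for name, value in report.items():
        print(f"{name}: {value}")
```

This goes through the operating system's API layer rather than the BIOS, which is the route the earlier slides say is required on NT-based systems.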