Block-IO

Juan Ortega
9/23/09
NTW412
A block is a sequence of bytes or bits having a nominal length (block size).
Blocking is used to facilitate the handling of the data-stream by the computer
program receiving the data. Blocked data are normally read a block at a time.
Blocking is employed in storage devices such as floppy disks, hard disks,
optical discs, and flash memory.
The unit in which information is actually arranged on the surface of a disk
platter is referred to as a physical block. The Physical Block Number (PBN)
is an address used for identifying a particular block on the surface of the
disk. A physical block is usually 512 bytes.
When the blocks on a disk are considered from a programming point of
view, they are viewed as logical blocks. The address of a logical block
on a disk is its Logical Block Number (LBN). LBN 0 (zero) is the first LBN
on a disk. Logical blocks correspond one-for-one to physical blocks, but
a logical block number is not necessarily the same as the number of the
physical block it maps to.
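The relationship between LBNs and PBNs can be sketched in a few lines of Python. This is only an illustration: the disk size, the remap table, and the function names are assumptions made for the example, not something a real drive exposes in this form.

```python
BLOCK_SIZE = 512            # bytes; the usual physical block size
NUM_LOGICAL_BLOCKS = 1000   # assumed disk size for this sketch (LBN 0..999)

# Hypothetical remap table: logical blocks that live somewhere other than
# the physical block with the same number (here, in a spare area that
# starts at PBN 1000, so the mapping stays one-for-one).
REMAPPED = {3: 1000}


def lbn_to_pbn(lbn: int) -> int:
    """Return the physical block number that holds logical block lbn."""
    if not 0 <= lbn < NUM_LOGICAL_BLOCKS:
        raise ValueError("LBN out of range")
    return REMAPPED.get(lbn, lbn)


def lbn_to_byte_offset(lbn: int, block_size: int = BLOCK_SIZE) -> int:
    """Byte position of a logical block, counted from LBN 0."""
    return lbn * block_size


print(lbn_to_pbn(0))          # same number as the LBN
print(lbn_to_pbn(3))          # remapped to the spare area
print(lbn_to_byte_offset(2))  # 2 * 512
```

A program only ever sees the LBN; whether block 3 really sits at PBN 3 or somewhere else is hidden from it.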
The block size specifies the size of the unit that the filesystem uses to
read and write data. Larger block sizes help improve disk I/O performance
when working with large files, such as databases, because the disk can read
or write data for a longer period of time before having to seek to the next
block.
On the downside, if you are going to have a lot of small files on that
filesystem, like those in /etc, there is the potential for a lot of wasted
disk space.
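The trade-off can be made concrete with a little arithmetic. The sketch below assumes a 100 MiB database file and a 300-byte configuration file (both sizes are made up for illustration) and counts how many blocks each one occupies and how many bytes are left unused at a few block sizes.

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of blocks a file occupies (ceiling division)."""
    return -(-file_size // block_size)


def wasted_bytes(file_size: int, block_size: int) -> int:
    """Bytes allocated to the file but not used by its contents."""
    return blocks_needed(file_size, block_size) * block_size - file_size


BIG_FILE = 100 * 1024 * 1024   # assumed 100 MiB database file
SMALL_FILE = 300               # assumed 300-byte configuration file

for block_size in (512, 4096, 65536):
    print(
        block_size,
        blocks_needed(BIG_FILE, block_size),    # fewer blocks as size grows
        wasted_bytes(SMALL_FILE, block_size),   # more waste as size grows
    )
```

With larger blocks the big file is covered by far fewer blocks, while the small file wastes almost an entire block.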
In classical file systems, a single block may contain part of only one
file. This leads to space inefficiency due to internal fragmentation: file
lengths are often not multiples of the block size, so the last block of a
file remains partially empty. This creates slack space, which averages half
a block per file. Some newer file systems attempt to solve this through
techniques called block suballocation and tail merging.
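The "half a block per file" figure can be checked numerically. The sketch below assumes file sizes are spread evenly relative to the block size, which is a simplification; real file-size distributions will give a somewhat different average.

```python
def slack(file_size: int, block_size: int) -> int:
    """Unused bytes in the last block of a file."""
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


def average_slack(file_sizes, block_size: int) -> float:
    """Mean slack over a collection of file sizes."""
    return sum(slack(size, block_size) for size in file_sizes) / len(file_sizes)


BLOCK_SIZE = 4096
# Every file size from 1 byte up to ten full blocks, one file of each size.
sizes = range(1, 10 * BLOCK_SIZE + 1)
print(average_slack(sizes, BLOCK_SIZE))  # close to BLOCK_SIZE / 2
```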
Doing block I/O means that the application or file system is sending
blocks to the disk drive to be written or asking for blocks using a
logical block address (LBA).
File systems turn file requests into block I/O. Applications (including
databases) can do file I/O or they can bypass the filesystem and do
block I/O (this is usually called raw I/O).
File I/O is clearly easier to do, and it makes file sharing much easier.
Block I/O may have performance advantages (the application controls the
buffering/caching and avoids the file system overhead).
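A minimal sketch of block I/O by logical block address follows. An in-memory `io.BytesIO` buffer stands in for the disk so the example runs anywhere; the function names and the 512-byte block size are assumptions for illustration. On a real system the same seek-and-read pattern would be applied to a raw device file, which normally requires elevated privileges.

```python
import io

BLOCK_SIZE = 512  # assumed block size in bytes


def read_block(device, lba: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Read exactly one block at the given logical block address."""
    device.seek(lba * block_size)
    data = device.read(block_size)
    if len(data) != block_size:
        raise OSError(f"short read at LBA {lba}")
    return data


def write_block(device, lba: int, data: bytes,
                block_size: int = BLOCK_SIZE) -> None:
    """Write exactly one block at the given logical block address."""
    if len(data) != block_size:
        raise ValueError("data must be exactly one block long")
    device.seek(lba * block_size)
    device.write(data)


# An 8-block "disk" filled with zeros stands in for a real device.
disk = io.BytesIO(bytes(8 * BLOCK_SIZE))
write_block(disk, 3, b"x" * BLOCK_SIZE)
print(read_block(disk, 3)[:4])
print(read_block(disk, 2)[:4])
```

Note that nothing here knows about file names: the caller has to keep track of which blocks hold which data, which is the bookkeeping a file system normally does.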
NAS does file I/O while SANs typically do block I/O.
File I/O means referencing data as a file entity from a remote file system.
When referencing a file, an application uses a "file handle:offset", which is
really the name of the file and the number of bytes into the file at which
the data is accessed.
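How a file system might turn a "file handle:offset" reference into a block request can be sketched as follows. The file name, the block map, and the function name are hypothetical; real file systems keep this mapping in their own on-disk structures.

```python
BLOCK_SIZE = 512  # assumed block size in bytes

# Hypothetical block map: the i-th entry is the logical block number that
# holds the i-th block of the file.
FILE_BLOCK_MAP = {"report.txt": [17, 42, 8]}


def file_offset_to_block(name: str, offset: int,
                         block_size: int = BLOCK_SIZE):
    """Translate (file name, byte offset) into (LBN, offset within block)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index, within = divmod(offset, block_size)
    blocks = FILE_BLOCK_MAP[name]
    if index >= len(blocks):
        raise ValueError("offset is past the end of the file's blocks")
    return blocks[index], within


# Byte 600 of the file lies 88 bytes into the file's second block.
print(file_offset_to_block("report.txt", 600))
```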
For NAS, a redirector diverts the access from a local file system to a
remote file system that is accessed across a network usually through
TCP/IP over Ethernet. The NAS device turns the remote file system
access into its own local file system access that results in a block I/O
(raw I/O) to attached devices.
1) The I/O request is intercepted by a
network redirector, also referred to simply
as a redirector, on the local computer.
2) The redirector constructs a data packet
containing all of the information about the
request, and sends it to the server where
the file is located.
3) The redirector on the server receives the packet from the client,
authenticates the access to the file required by the I/O request, and, if
authenticated, executes the request on behalf of the client. If not, it returns an
error code to the redirector on the client.
4) When the request has been executed, the redirector on the server
sends any data resulting from the I/O request to the redirector on the
client along with a success notification.
5) The redirector on the client receives the packet from the server and passes
the data in the packet to the application along with a success notification.
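The five steps above can be simulated in a single process. This is a toy sketch, not a real network redirector: the class names, the packet fields, the permission table, and the user names are all invented for illustration, and a direct method call stands in for the trip across the network.

```python
from dataclasses import dataclass


@dataclass
class IORequest:
    """Step 2: the packet the client redirector builds."""
    user: str
    path: str
    offset: int
    length: int


@dataclass
class IOResponse:
    """Steps 3 and 4: data plus a success flag, or an error code."""
    ok: bool
    data: bytes = b""
    error: str = ""


class ServerRedirector:
    def __init__(self, files, permissions):
        self.files = files              # path -> file contents
        self.permissions = permissions  # path -> set of allowed users

    def handle(self, request: IORequest) -> IOResponse:
        # Step 3: authenticate, then execute on behalf of the client.
        if request.user not in self.permissions.get(request.path, set()):
            return IOResponse(ok=False, error="access denied")
        content = self.files.get(request.path)
        if content is None:
            return IOResponse(ok=False, error="no such file")
        end = request.offset + request.length
        # Step 4: send the resulting data back with a success flag.
        return IOResponse(ok=True, data=content[request.offset:end])


class ClientRedirector:
    def __init__(self, server: ServerRedirector, user: str):
        self.server = server
        self.user = user

    def read(self, path: str, offset: int, length: int) -> bytes:
        # Steps 1 and 2: intercept the request and package it.
        request = IORequest(self.user, path, offset, length)
        response = self.server.handle(request)  # stands in for the network
        # Step 5: hand the data (or the error) to the application.
        if not response.ok:
            raise OSError(response.error)
        return response.data


server = ServerRedirector(
    files={"/share/notes.txt": b"block I/O and file I/O"},
    permissions={"/share/notes.txt": {"alice"}},
)
client = ClientRedirector(server, user="alice")
print(client.read("/share/notes.txt", 0, 9))
```

On the server side, the lookup in `files` is where a real NAS device would use its own local file system and, ultimately, block I/O to its attached disks.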
For SAN access, block I/O is usually done: the application accesses the
file, the file system on the local server turns that into a request for a
block on a particular device (a LUN, or logical unit), and the block I/O is
then done over an interface such as Fibre Channel.
The bottom line is that file I/O eventually (and always) turns into a
block I/O -- locally it's done through the file system, for NAS it's
redirected to a NAS device to do it on a remote system.
Basically, a SAN does block I/O just like having a disk directly attached
to a server. A NAS is really remote file system I/O where the file
request is redirected over a network to a device (really a processing
entity with its own file system) where the file I/O is actually performed.
Deciding whether to use a SAN or NAS involves many factors, so there is
no single answer. In general, if your application requires block I/O or
has a significant performance requirement, use a SAN. If the application
uses file-based I/O, or you need to share files and want simple
administration, use NAS.
2, 5) Block (data storage). Retrieved September 23, 2009 from Wikipedia Web
site: http://en.wikipedia.org/wiki/Block_(data_storage)
3, 4) Fragmentation: Chapter 1. Retrieved September 23, 2009 from diskeeper
Web site: http://www.diskeeper.com/fragbook/chapter1.htm
6) Kerns, R. (2005). What is block I/O?. Retrieved September 23, 2009 from
techtarget Web site:
http://searchstorage.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid5_gci1135655,00.html
7, 9) Kerns, R. (2001). Block I/O and I/O transfer. Retrieved September 23,
2009 from techtarget Web site:
http://searchstorage.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid5_gci750396_mem1,00.html
6) A Windows File system Win32 library functions reference and info used in
Windows system programming. Retrieved September 23, 2009 from
tenouk Web site: http://www.tenouk.com/Supplement.html
10-12) Kerns, R. (2005). SAN vs. NAS: A diagram of the differences. Retrieved
September 23, 2009 from techtarget Web site:
http://searchstorage.techtarget.com/expert/KnowledgebaseAnswer/0,289625,sid5_gci1150562_mem1,00.html