Chapter 15: Network Integrity
Chapter Outline
1. On the Test
a. 3.4: Identify the main characteristics of network attached storage.
b. 3.5: Identify the purpose and characteristics of fault tolerance.
c. 3.6: Identify the purpose and characteristics of disaster recovery.
2. Network Integrity
a. The definition of network integrity is maintaining the state of the network
such that all parts function as a whole in a sound and unimpaired state.
b. The areas that must be included in a plan to maintain network integrity
are:
i. Documentation
ii. Disaster Planning/Recovery
iii. Fault Tolerance
iv. System Backup
3. Documentation
a. Documentation for the network includes information on the following:
i. LAN/WAN topology
ii. Hardware inventory
iii. Software inventory
iv. Change logs
v. Server information
vi. Router and switch configurations
vii. User policies and profiles
viii. Baseline documents
ix. Mission-critical applications and hardware
x. Network service configuration
xi. Procedures
b. Good network documentation aids in troubleshooting problems that occur
within the network such as failed connections, failed servers, hung
applications, WAN connection failure, and user resource access failure.
c. Documentation can be formalized using custom forms or informally kept
using inexpensive notebooks to record change and repair events.
4. Disaster Planning
a. A disaster is an event that causes widespread destruction or distress, or
total failure.
b. Planning for the worst-case scenario allows the network administrator,
planning team, and technicians to anticipate the consequences of both
natural and man-made disasters.
c. Disaster planning may be as simple as writing a procedure to back up all
data, or as complex as contracting for remote hot-sites with 100% uptime
so that a network could sustain a disaster without loss of service.
5. The Disaster Recovery Plan
a. A disaster recovery (DR) plan follows a set of well-defined steps:
i. Creation of the disaster recovery (DR) team
ii. Identification of the risks and vulnerabilities that threaten the network
iii. Business Impact Assessment (BIA)
iv. Definition of needs
v. Detailed plan development
vi. Testing
vii. Maintenance of the plan
b. A disaster recovery plan is a living document that may take many weeks
or months to develop and implement, and this plan must be consistently
updated as changes are made to the network.
6. Mirrored Servers (Failover Clustering)
a. Mirrored servers provide 99.9% uptime for mission-critical applications
and data.
b. To build mirrored servers, both servers must be configured with identical
equipment and software, and both must be attached to the network.
c. The “primary” mirrored server answers requests from the network, and
issues a “heartbeat” to its twin to let the secondary mirrored server know
that the primary is servicing the network.
d. When the secondary server does not hear a heartbeat in a predetermined
time frame, it will begin answering requests from the network. The
window between failure of the primary server and “cutover” to the
secondary is usually 30 to 45 seconds.
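The heartbeat and cutover behavior described above can be sketched as follows. This is only an illustration, not the mechanism of any real clustering product: the class name `SecondaryMonitor`, the 30-second timeout, and the injectable clock are assumptions made for the example.

```python
import time


class SecondaryMonitor:
    """Hypothetical standby server that listens for the primary's heartbeat."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # assumed cutover window
        self.clock = clock                      # injectable so it can be tested
        self.last_heartbeat = clock()
        self.active = False                     # True once this server takes over

    def receive_heartbeat(self):
        """Record that the primary is still servicing the network."""
        self.last_heartbeat = self.clock()

    def check(self):
        """Take over if no heartbeat has arrived within the timeout."""
        silence = self.clock() - self.last_heartbeat
        if not self.active and silence > self.timeout_seconds:
            self.active = True  # cutover: start answering network requests
        return self.active
```

A real deployment would call `check()` on a timer and would also need a way to hand control back to the primary, which this sketch leaves out.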
7. Clustered Servers
a. Clustered servers represent two or more servers that are configured with
identical applications and file structures, all attached to the network, and
all answering requests from the network. All servers are acting as one very
large server.
b. Clustered servers can make use of replication services. Several servers
may be located off-site and participate in replication to assure that data is
identical on all boxes in the event of failure within the network or disaster.
c. Clustering is expensive and complex to implement. For this reason,
small- to medium-sized businesses usually do not choose this option for
disaster planning and recovery.
8. Power Protection
a. Power loss is one of the small disasters that an administrator can mitigate
without undue expense or complex configurations.
b. Several types of power protection can be used in the network. The choice
will depend on the nature of the operations and the stability of the
geographical location of the business.
9. Surge Protectors
a. Surge protectors are designed to minimize the effects of power spikes
and surges.
b. Surge protectors do not protect equipment from brownouts (voltage
sags), “dirty power,” noise on the line, or power failure.
c. Over time, the protective components in a surge protector (typically
metal-oxide varistors) degrade with each surge they absorb and can allow
greater variation in power to pass through to components. This weakens
the components and may contribute to premature failure.
d. Surge protectors should be replaced at least yearly on equipment to reduce
the likelihood of component damage.
10. Online Uninterruptible Power Supplies (UPS)
a. The purpose of a UPS is to provide enough power for enough time to
allow a server or other critical machine to shut down gracefully.
b. An online UPS provides protection for equipment by conditioning the
power before it reaches the equipment.
c. Inside the UPS is a battery that stores power coming from a wall outlet.
That power is then sent to the equipment. All noise and fluctuation is
minimized, thus making the power used by the server “clean” again, and
providing a power source should there be a loss of power.
d. The size of the UPS depends on the wattage of the attached equipment. As
a rule of thumb, allow 1.4 VA for every watt of load: total the wattage of
the equipment and multiply it by 1.4 to get the minimum VA rating. Then
determine the length of time necessary to complete the shutdown process
and any other routines that must run while the machine is still up. Most
UPSs will provide 15-20 minutes' worth of power by default; if a longer
runtime is needed, both the total wattage and the additional time (above
15-20 minutes) must be factored in to determine the size of the UPS.
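The VA calculation above can be sketched as a short function. The function name `required_va` and the sample wattages are hypothetical, and the 1.4 multiplier is the rule of thumb from this outline rather than a measured value. The sketch covers only the VA rating; the runtime requirement still has to be checked separately.

```python
def required_va(device_watts, va_per_watt=1.4):
    """Estimate the minimum UPS VA rating for a list of device wattages."""
    total_watts = sum(device_watts)
    return total_watts * va_per_watt


# Hypothetical loads: a server, a monitor, and a switch.
loads = [350, 120, 30]
print(f"{required_va(loads):.0f} VA")  # 500 W total, so about 700 VA
```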
11. Standby UPS
a. A standby UPS allows power to go directly to the equipment while
charging a battery in the UPS. When a power failure occurs, the UPS
detects a reduction in power and cuts over to battery power.
b. Some devices, such as servers, may reboot or shut down during a short
gap between loss of power and cut-over to battery backup.
12. Fault Tolerance
Fault tolerance is the system’s capacity to continue functioning given a “fault”
or malfunction of one or more components.
13. Disk Fault Tolerance
a. Disk fault tolerance provides the network with the ability to recover from
loss of function of a hard disk storage device, and to prevent loss of data
stored on that device.
b. One of several disk fault tolerance strategies can be implemented in the
servers to protect the data. The most common is some form of Redundant
Array of Inexpensive Disks (RAID).
14. RAID Level 0
a. RAID level 0 is commonly called disk striping without parity.
b. This form of RAID allows data to be written across multiple disks, but
does not provide any fault tolerance.
c. RAID 0 requires at least two hard disks to implement.
d. With RAID 0, both read and write performance will improve over
single-disk usage.
e. RAID 0 uses all available disk space for storage.
f. This form of RAID is useful for noncritical data that is routinely backed
up.
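Striping without parity can be pictured as dealing blocks out to the disks in turn. The sketch below is a simplified model with a hypothetical function name `stripe_blocks`; real controllers work with fixed-size stripe units rather than Python lists.

```python
def stripe_blocks(blocks, disk_count):
    """Distribute blocks round-robin across disks (RAID 0 style, no parity)."""
    if disk_count < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(disk_count)]
    for index, block in enumerate(blocks):
        disks[index % disk_count].append(block)
    return disks


layout = stripe_blocks(["A", "B", "C", "D", "E"], 2)
print(layout)  # [['A', 'C', 'E'], ['B', 'D']]
```

Because every block lives on exactly one disk, losing either disk loses part of every large file, which is why this level offers no fault tolerance.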
15. RAID Level 1
a. RAID level 1 is commonly referred to as disk mirroring (or disk duplexing
when two controllers are used).
b. With RAID 1, data is written to both disks at the same time. Should one
disk fail, the other disk takes over servicing requests from the network.
c. RAID level 1 requires two disks to implement.
d. Mirroring/duplexing will provide good read and write access to data on
the disk.
e. Only 50% of the total disk space can be used for storage.
f. This form of RAID is used where fault tolerance is needed, but cost is a
concern.
16. RAID Level 2
a. RAID level 2 is known as bit-level striping with Hamming code ECC.
b. This level of RAID is not used in modern systems.
17. RAID Level 3
a. RAID level 3 uses byte-level striping with dedicated parity.
b. Data is striped across multiple drives and parity information is written
to a dedicated hard disk for recovery of lost data.
c. Read performance with RAID 3 is good, but write performance is only
poor to fair.
d. This type of RAID is costly to implement and is not as efficient as other
implementations.
18. RAID Level 4
a. RAID level 4 uses a method called block-level striping with dedicated
parity.
b. The difference between RAID 3 and RAID 4 is simply that 4 uses blocks
of a size determined by the administrator and 3 uses a stripe at the byte
level.
c. Read performance is good and write performance is fair.
d. This type of RAID is a midline between 3 and 5, but is not frequently
implemented.
19. RAID Level 5
a. RAID level 5 is commonly known as striping with parity.
b. This form of RAID requires at least three disks. Data is striped across the
disks, and parity information is distributed across the same disks. This is
not a dedicated parity disk system.
c. Read performance is very good, while write performance is fair.
d. When figuring available storage space, add the amount of disk space on all
drives and subtract the amount of space on one drive.
e. RAID 5 is considered to be the best choice for fault tolerance and
performance.
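Parity-based recovery is commonly implemented with XOR: the parity block is the XOR of the data blocks in a stripe, so any one missing block can be rebuilt from the survivors. The sketch below shows this for a single stripe. The helper name `xor_blocks` and the sample bytes are assumptions, and it does not model how RAID 5 rotates parity from disk to disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of data spread over three disks (hypothetical contents).
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Simulate losing the second disk and rebuilding it from the rest plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```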
20. RAID Level 6
a. RAID level 6 uses block-level striping with dual distributed parity.
b. This form of RAID requires a minimum of four disks to implement. The
equivalent of two disks is lost to parity.
c. The read performance is good and the write performance is poor to fair
due to the parity bits written to the drives.
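The storage figures quoted for RAID 0, 1, 5, and 6 can be collected into one calculation. This is a sketch that assumes all disks are the same size and that RAID 1 is a two-disk mirror; the function name `usable_capacity_gb` is hypothetical.

```python
# Minimum disk counts as stated in this outline.
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4}


def usable_capacity_gb(level, disk_count, disk_size_gb):
    """Usable space for an array of identical disks (equal sizes assumed)."""
    if level not in MIN_DISKS:
        raise ValueError(f"RAID level {level} is not covered by this sketch")
    if disk_count < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        return disk_count * disk_size_gb        # all space is usable
    if level == 1:
        if disk_count != 2:
            raise ValueError("this sketch models a two-disk mirror only")
        return disk_size_gb                     # 50% of the total
    if level == 5:
        return (disk_count - 1) * disk_size_gb  # one disk's worth of parity
    return (disk_count - 2) * disk_size_gb      # RAID 6: two disks' worth


for level, count in [(0, 4), (1, 2), (5, 4), (6, 4)]:
    print(f"RAID {level}, {count} x 500 GB:", usable_capacity_gb(level, count, 500))
```

With 500 GB disks this gives 2000 GB for RAID 0, 500 GB for RAID 1, 1500 GB for RAID 5, and 1000 GB for RAID 6.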
21. RAID Level 7
a. RAID Level 7 is a proprietary form of RAID that uses an asynchronous
cached striping mechanism with dedicated parity storage.
b. Although RAID 7 is a defined RAID level, its details are vendor-specific;
consult the vendor for more information.
22. Backups
a. When determining a backup strategy, the first two considerations are how
you want to accomplish the backup (the hardware) and what software you
will use to complete this task.
b. Some of the options for backup include:
i. Small- and large-capacity removable disks
ii. Optical disks
iii. Magnetic tape (the most commonly used medium)
c. Once the medium is identified, the administrator will determine a schedule
of backups using one or more of the following methods:
i. Full backups
ii. Incremental backups
iii. Differential backups
23. Full Backups
a. A full backup takes all data and commits it to tape.
b. During a full backup, the archive bit (attribute) is reset to off to notify the
backup software that the file has been saved to tape.
c. Full backups done on a daily basis allow quick restore because only one
tape will be used to complete the restore.
24. Incremental Backups
a. During an incremental backup, only files that have changed since the last
full or incremental backup (that is, the last backup that reset the archive
bit) are committed to tape.
b. This method of backup is used in conjunction with weekly full backups.
c. Incremental backups reduce the amount of time it takes to complete the
backup process because of the limited selection of files that are backed up.
d. When restoring, the administrator or technician must locate the last full
backup, and all incremental backup tapes since the last full backup.
e. Incremental backups reset the archive bit to off.
25. Differential Backups
a. A differential backup saves all files that have changed since the last full
backup. It does not reset the archive bit.
b. To restore, only the tapes from the last full backup and the last (most
recent) differential backup will be used.
c. This method of backing up data is used in conjunction with weekly full
backups.
d. A weekly full backup and daily differential backups are considered the
most efficient and safest strategy for maintaining data integrity.
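The way the three backup types interact with the archive bit can be modeled in a few lines. This is a simplified sketch: the function name `run_backup` and the file names are hypothetical, and it relies on the standard behavior that a differential backup leaves the archive bit set.

```python
def run_backup(files, kind):
    """Return the file names copied by this backup and update archive bits.

    `files` maps a file name to its archive bit (True = changed, needs backup).
    """
    if kind == "full":
        copied = sorted(files)
    elif kind in ("incremental", "differential"):
        copied = sorted(name for name, bit in files.items() if bit)
    else:
        raise ValueError(f"unknown backup type: {kind}")
    if kind in ("full", "incremental"):
        for name in copied:
            files[name] = False  # reset the archive bit to off
    return copied


files = {"a.doc": True, "b.xls": True, "c.txt": True}
print(run_backup(files, "full"))          # ['a.doc', 'b.xls', 'c.txt']
files["a.doc"] = True                     # a.doc is edited on Monday
print(run_backup(files, "differential"))  # ['a.doc']
files["b.xls"] = True                     # b.xls is edited on Tuesday
print(run_backup(files, "differential"))  # ['a.doc', 'b.xls']
```

Each differential grows to include everything changed since the full backup, which is why a restore needs only the full backup plus the latest differential. Replacing "differential" with "incremental" in the run above would copy only `a.doc` on Monday and only `b.xls` on Tuesday, so a restore would need every tape in the chain.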
26. Other Considerations for Backup Strategy
a. Tape rotation patterns are determined when the backup strategy is
designed.
b. The choices are:
i. Daily rotation
ii. Weekly rotation
iii. Monthly rotation
iv. Yearly rotation
c. With each option, the administrator must consider what archive of past
data must be maintained for the business, and whether the cost of
maintaining a large archive of tapes outweighs the protection of the data.
d. Most businesses use either a weekly rotation or a monthly rotation to
manage archived data.
e. Tape storage is important to consider as well.
f. Magnetic tape is susceptible to damage from natural elements including
heat, sun, water, and humidity. Proper storage for disaster recovery is
necessary.
g. Tapes should be stored in climate-controlled rooms that are physically
protected or at an off-site storage facility. The best option for disaster
recovery is to contract with a third party to maintain the tape archive at a
remote location. This method allows the tapes to remain safe should there
be a disaster at the location of the business. Restoration of the data can
then take place at the new location should the old one be rendered
unusable.
27. Network Attached Storage (NAS)
a. NAS is a relatively new data storage concept that attaches large data
storage boxes to the network, but does not require a server to manage.
b. Access to NAS is controlled through file system permissions.
c. NAS can use multiple file-sharing protocols such as CIFS and NFS,
allowing the storage facility to be platform (operating system) independent.
d. When considering NAS, keep in mind that the NAS box is a storage
facility, and does not expend any resources providing any other services to
the network.
e. NAS boxes can be brought down for maintenance without causing outages
on the network.