OceanStor S2600 Technical White Paper

Issue 01
Date 2010-05-03

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2012. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

[logo] and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute the warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base, Bantian, Longgang, Shenzhen 518129, People's Republic of China
Website: http://www.huawei.com
Email: support@huawei.com

Issue 5.0 (2010-06-11)
Technical White Paper for Oceanspace S2600 Storage System

Change History

Date      Version  Description      Author
2012-5-3  V1.0     Initial version  Peng Xiao Wu
Contents

Change History
1 Executive Summary
2 Introduction
  2.1 Evolutionary
  2.2 Easy
  2.3 Enhanced
  2.4 Energy-Saving
  2.5 Economical
3 Solution
  3.1 Hardware Architecture Transported from Mid-Range Products
    3.1.1 Advanced Bus Technology
    3.1.2 Full-Redundancy Architecture
    3.1.3 Active-Active Dual-Controller Technology
    3.1.4 Coffer Technology
  3.2 Various Software Functions
    3.2.1 Redundant Copy
    3.2.2 Disk Spin-Down Technology
    3.2.3 Automatic Recovery of Bad Sectors
    3.2.4 HyperImage
    3.2.5 HyperCopy
    3.2.6 Remote Replication
    3.2.7 WORM Technology
    3.2.8 DHA Technology
    3.2.9 Split Mirror
4 Experience
  4.1 Unique Values of the S2600
    4.1.1 Combo Interface Technology
    4.1.2 Disk Hibernation Technology
    4.1.3 Distributed Power Supply System
    4.1.4 Perfect Combination of HyperImage and Backup Software
  4.2 S2600 Application Cases
    4.2.1 Yueyang City of Hunan Province
    4.2.2 Harbin Railway Station
5 Acronyms and Abbreviations

Figures

Figure 3-1 Logical architecture of the S2600
Figure 3-2 Principle of the redundant copy technology
Figure 3-3 Conditions for setting a RAID group to the hibernation state
Figure 3-4 Two ways to set a RAID group to allow hibernation
Figure 3-5 Working process of the WORM technology
Figure 3-6 Working principle of the DHA system
Figure 4-1 Storage networking of Yueyang city
Figure 4-2 Storage backup system of Harbin railway station

1 Executive Summary

As the third-generation storage product developed by Huawei, the OceanStor S2600 (hereinafter referred to as the S2600) is designed for mid-range and low-end markets as well as small- and medium-sized enterprises. In addition to the stability and reliability provided by the last-generation mid-range product, the S2600 is easier to use. Thus, it provides you with stable, reliable, convenient, and easy-to-use data storage and management services with various functions.

Considering the variety of requirements and the management complexity of small- and medium-sized enterprises, the S2600 focuses on the needs of customers and is designed to solve your problems. The S2600 provides a wide range of flexible host ports, including FC, SAS, iSCSI and a special combo port combining FC and iSCSI, to adapt to complex networking environments. The combo port helps you integrate the FC network with the IP network to simplify networking and leverage your investment.
Another major problem of small- and medium-sized enterprises is that the staff work in scattered places and possess different levels of technical skills, which makes management difficult. To solve this problem, Huawei developed the Integrated Storage Manager (ISM) to help you manage your S2600. The ISM is designed with a Java Web Start (JWS) architecture. The ISM is easy to operate and configure, and provides functions such as a configuration wizard and typical application scenarios, especially for mid-range and low-end markets. Configuring the S2600 for the first time takes less than five minutes. Ease of use, as the major advantage of the ISM, meets the requirements of mid-range and low-end markets.

2 Introduction

The S2600 provides a variety of features to ensure superior and convenient services.

2.1 Evolutionary

- Seamlessly combines FC SAN with IP SAN
  Dual-protocol support, with FC and iSCSI as the base configuration on all models.
- Supports functions of mid-range products
  Provides many functions of mid-range storage products, thus improving cost-effectiveness significantly. For example, the remote copy function provides higher-level disaster recovery services; the data coffer ensures data integrity when the system collapses; the disk pre-copy function reduces the risk of two disks failing at the same time.

2.2 Easy

- Easy installation and operation
  The JWS installation-free technology is introduced to the ISM software. With a configuration wizard, you can complete all configurations within five minutes.
- Convenient maintenance and support
  The S2600 generates alarms in the form of SMS messages, emails, sound, and light. The controller, power supply, and hard disk modules are hot-swappable.
- Ubiquitous products and services
  Our worldwide marketing and service network provides customers with quick and quality services.

2.3 Enhanced

- Enhanced data protection
  The S2600 supports multiple enhanced software functions, such as snapshot, local copy and remote copy, providing overall data security protection.
- Complete disk protection solutions
  The S2600 combines snapshot with backup software, thus backing up data rapidly and efficiently. The S2600 supports automatic check and repair of disk bad tracks.
- Powerful scalability
  The S2600 supports background initialization of the RAID and online expansion. It also supports a maximum of 96 disks and 256 hosts.
- Carrier-class availability
  The system is 99.999% reliable.

2.4 Energy-Saving

- Energy-saving for hard disks
  The industry-leading disk dormancy and disk down-speeding technologies reduce the power consumption of backup and archiving applications by over 40%.
- Energy-saving for components
  The S2600 adopts low-power and lead-free components, making it a green product.

2.5 Economical

- Provides eight host ports
  The S2600 provides eight host ports. You do not need a switch when the number of hosts on the network is fewer than eight.
- Intermixing SAS and SATA to optimize space
  You can choose either SAS or SATA disks to store data, depending on specific requirements for access and security, which leverages your investment.
- Space-saving design
  The overall height of the subrack is less than 2U. It is specifically designed to help you save space in cabinets.
- Energy-saving for power supplies
  The S2600 provides AC and DC power supplies, thus meeting the requirements of a complicated power supply environment. The DC power supply is more efficient and energy-saving.
3 Solution

3.1 Hardware Architecture Transported from Mid-Range Products

With the features of the last-generation mid-range product incorporated, the S2600 adopts the mid-range architecture although it is designed for mid-range and low-end markets. You can trust the stability and reliability of the S2600, mainly because of the following aspects:

- Active-active dual-controller architecture
- Full 64-bit operating system
- System bus with a higher bandwidth
- Full-redundancy hardware design to ensure high reliability of the system

Figure 3-1 shows the logical architecture of the S2600.

Figure 3-1 Logical architecture of the S2600

3.1.1 Advanced Bus Technology

The S2600 adopts the advanced PCI-E technology. Compared with the PCI-X bus technology, the PCI-E technology ensures higher bandwidth. The S2600 provides a system bandwidth of up to 24 Gbit/s. Even if the system is fully configured with FC host ports, line-rate access can be ensured on each FC port. In addition, the system uses high-end SAS chips to provide a bandwidth of up to 24 Gbit/s.

3.1.2 Full-Redundancy Architecture

The S2600 uses a fully redundant architecture. All the key components of the system, such as power modules, fan modules, and controllers, are configured in 1+1 redundancy mode. Physically, the mirroring channels between the two controllers use two redundant links in SAS interconnection mode. The total bandwidth of the mirroring channels can reach 2.4 Gbit/s.

3.1.3 Active-Active Dual-Controller Technology

In storage systems with two or more controllers, controllers may work in either of the following two modes:

- Active-passive mode (AP mode)
  It is also called the active/standby mode.
  In this mode, only one controller is activated (active controller) to process I/O requests sent from application servers. The other controller is idle (standby controller). The standby controller takes over services from the active controller when the active controller fails or goes offline.
- Active-active mode (AA mode)
  In the active-active mode, both controllers are activated. The two controllers concurrently process I/O requests sent from application servers. Once one controller fails or goes offline, the other controller takes over the services from the failed one without affecting its own services. This active-active mode, where the two controllers serve as the backup of each other, ensures high system reliability and resource usage, balances service traffic, and improves system performance.

The two controllers of the S2600 work in AA mode. In addition to high reliability, the controllers can balance service traffic, make full use of system resources, and boost system performance.

If a controller fails, for example, if the link connected to the controller fails, the services on the failed controller can be switched over to the other controller. After the link recovers and the failed controller resumes normal operation, that controller takes back its previous services. Throughout the process, the switchover of services is transparent to you. On the host, you see a link failure for a short time and then a link recovery.

Another main function of the two controllers working in AA mode is load balancing. Storage services are shared between the two controllers. This prevents the situation in which one controller bears an excessive load while the other controller is idle for a long time. Thus, the load on each controller is reduced, system resources are used more effectively, and the working efficiency and performance of the system are improved.

3.1.4 Coffer Technology

In general, the storage system uses the cache to improve the read/write performance of the host.
The host data is written to the cache first instead of directly to the disk. However, the cache of the storage system is made of volatile media. If the storage system experiences a power failure, the integrity and completeness of the data are hard to guarantee.

The S2600 adopts the coffer technology to provide data protection in case of a power failure. A coffer is a group of disks that stores cache data in case of a power failure. The coffer can permanently store the cache data after a power failure, which ensures the reliability of the system. In addition, the coffer of the S2600 can store system configuration data and alarm logs.

The cache data saved in case of a power failure, the system configuration data, and the alarm logs are all critical for the storage system. Therefore, the reliability of the coffer is of vital importance. The coffer of the S2600 consists of four disks, which work in full redundancy mode. That is, data is stored on the disks of the coffer as four copies. If a disk fails and is then replaced with a new one, the data restoration mechanism of the coffer ensures that the data is completely restored onto the new disk. In addition, the entire operation is performed online and does not affect services.

3.2 Various Software Functions

3.2.1 Redundant Copy

Data security is the basic requirement for a storage system. The S2600 storage system uses the RAID technology to ensure system reliability. However, the security of RAID algorithms relies on the reliability of disks. When a disk runs for a long time, its probability of failure increases. Especially for a storage system using disks of the same batch, the failure of one disk indicates that the failure probability of the entire system has increased.
In addition, any RAID algorithm allows only a specific number of disks to fail at the same time. If you cannot find the potential faults in running disks and handle the faults in time, data security is at great risk. If a disk fails, it takes a certain time to reconstruct the data of the failed disk, which degrades the performance of the entire system. The redundant copy technology is introduced to prevent or reduce this impact of disk reliability on the storage system.

The redundant copy technology obtains information about disk status through the self-monitoring, analysis and reporting technology (S.M.A.R.T.). The redundant copy algorithm checks the running status of the disks to calculate the probability of potential disk failures, and copies the data from a disk with a potential failure to the hot-spare disk in advance. The entire copy process is performed when the system is idle to prevent impact on host services. This predictive action shortens the reconstruction time after a disk failure and reduces the probability of further disk failures during reconstruction. In addition, it greatly improves storage security and ensures service continuity.

The S2600 uses the redundant copy technology. The accuracy of disk status prediction is the key to the redundant copy technology. By recording traceable property items while disks are running, it determines the health status of the disks.

Common disk status prediction faces the following problems:

- Because the disk is a precision mechanical component, some faults cannot be found in time through S.M.A.R.T. According to statistics from professional organizations, up to 36% of disks receive no S.M.A.R.T. alerts before they fail.
- S.M.A.R.T. mainly detects mechanical problems, but disk damage can also be caused by problems in electrical components.
- The latest research in reliability shows that disk reliability is partly misunderstood. Specific environmental factors should be considered in failure judgment in the future.

The following innovation is introduced in the redundant copy technology of the S2600. Figure 3-2 shows the principle of the redundant copy technology.

Figure 3-2 Principle of the redundant copy technology

1. The first part of data: disk items checked by S.M.A.R.T. and the disk status that S.M.A.R.T. predicts.
2. The second part of data: customized data periodically collected and checked by the storage device, including:
   - Special properties of disks, including:
     - Age of each disk
     - Manufacturer and production batch of disks
     - Reliability specifications of the disks of this batch
     - Appearance features of disks, for example, a disk is too new or too old
   - Running parameters of RAID groups, which indicate dynamic statistics of the usage of disks, such as:
     - Sequential read and write
     - Random read and write
     - Bandwidth
     - Running period features
   - Device management data
   - Environmental factors
3. The preceding two parts of data are called extra safe data. A specific disk status prediction algorithm processes the data and presents the predicted disk failures. The algorithms of the S2600 analyze the data based on different conditions, weights, and prediction strategies. For example:
   - Magnetic calibration during frequent random read/write operations is considered.
   - Read/write errors during frequent sequential read/write operations are considered.
   - The rate of performance change while a failing disk is running is considered.
4. Enable redundant copy according to the predicted result.
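The weighting idea in steps 1 to 4 can be illustrated with a small sketch. This is not the S2600 algorithm, which the white paper does not publish: the field names, the weights, the normalization constants, and the threshold are all assumptions chosen only to show how S.M.A.R.T. results and the customized data might be combined into one score that triggers redundant copy.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """One observation of a disk; every field name here is hypothetical."""
    smart_alert: bool          # part 1: S.M.A.R.T. reported a problem
    age_years: float           # part 2: special property of the disk
    batch_failure_rate: float  # part 2: reliability of the production batch, 0..1
    calibration_events: int    # magnetic calibrations seen under random I/O
    rw_errors: int             # read/write errors seen under sequential I/O
    perf_drop: float           # fractional performance drop while running, 0..1


# Illustrative weights only; the real weighting strategy is not documented.
WEIGHTS = {
    "smart": 0.40,
    "age": 0.10,
    "batch": 0.10,
    "calibration": 0.15,
    "rw_errors": 0.15,
    "perf": 0.10,
}


def _clamp(x: float) -> float:
    """Limit a normalized factor to the range 0..1."""
    return max(0.0, min(1.0, x))


def failure_score(d: DiskSample) -> float:
    """Combine both parts of the data into a score between 0 and 1."""
    factors = {
        "smart": 1.0 if d.smart_alert else 0.0,
        "age": _clamp(d.age_years / 5.0),             # assumed: 5 years = fully aged
        "batch": _clamp(d.batch_failure_rate),
        "calibration": _clamp(d.calibration_events / 100.0),  # assumed scale
        "rw_errors": _clamp(d.rw_errors / 50.0),              # assumed scale
        "perf": _clamp(d.perf_drop),
    }
    return sum(WEIGHTS[name] * value for name, value in factors.items())


def should_start_redundant_copy(d: DiskSample, threshold: float = 0.5) -> bool:
    """Step 4: enable redundant copy when the predicted risk is high enough."""
    return failure_score(d) >= threshold
```

In this sketch a disk can cross the threshold without any S.M.A.R.T. alert if enough of the other factors are elevated, which is the gap the second part of the data is meant to close.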
The OceanStor series storage system provides the redundant copy technology together with a complete and effective disk status prediction technology, which ensures data security and service continuity.

3.2.2 Disk Spin-Down Technology

The disk spin-down technology aims to reduce the power consumption of storage devices and prolong the service life of disks by hibernating certain disks. A disk can be in any of the following three states:

- Active: The I/O operations on the disk work properly. The disk motor and read/write head work properly.
- Idle: There is no I/O operation on the disk. The disk motor works properly, but the read/write head does not work.
- Standby: The disk is supplied with power and the system works properly, but the disk motor and read/write head do not work.

In the S2600, you can set the disks in a RAID group to the hibernation state only as a whole. That is, you cannot set a single disk in the RAID group to the hibernation state. The same applies to waking up the disks in a RAID group: you can wake up the disks in the RAID group only as a whole.

Before setting the disks in a RAID group to the hibernation state, you must set the RAID group to allow disk hibernation. It is recommended that you allow hibernation only for a RAID group whose data is in the near-line or offline state. After you set a RAID group to allow hibernation, the storage device monitors the I/O operations of the disks in this RAID group. If no I/O operation is performed on any disk in the RAID group within a specified period, for example, 30 minutes, all the disks in the RAID group change to the hibernation state. If I/O operations are performed on any disk, the disks in the RAID group do not change to the hibernation state.
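The hibernation rule can be sketched as a small state holder per RAID group. This is a minimal illustration, not the S2600 firmware logic: the class and method names are hypothetical, and time is passed in explicitly as seconds so that the behaviour is easy to follow.

```python
from dataclasses import dataclass, field


@dataclass
class RaidGroup:
    """Hypothetical model of a RAID group that hibernates as a whole."""
    name: str
    disks: list = field(default_factory=list)
    hibernation_allowed: bool = False
    idle_timeout_s: float = 30 * 60   # the 30-minute example from the text
    last_io_at: float = 0.0
    hibernating: bool = False

    def record_io(self, now: float) -> None:
        """Any I/O on any member disk wakes the whole group and resets the timer."""
        self.last_io_at = now
        self.hibernating = False

    def tick(self, now: float) -> bool:
        """Periodic check: hibernate only if allowed and idle for the full period."""
        idle_for = now - self.last_io_at
        if self.hibernation_allowed and idle_for >= self.idle_timeout_s:
            self.hibernating = True
        return self.hibernating
```

Because the state is kept per group rather than per disk, a single I/O to one member keeps every disk in the group awake, which matches the all-or-nothing behaviour described for the S2600.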
Figure 3-3 Conditions for setting a RAID group to the hibernation state
[Figure labels: RAID group 1; RAID group 2; no I/O access in the preset period of time; I/O access; RAID groups that allow hibernation; RAID groups that do not allow hibernation]

As shown in Figure 3-4, there are two ways to set a RAID group to allow hibernation on the S2600:

- Out-of-band management channel: using the ISM or CLI
- SCSI inband management channel: using commands sent by the host program

Figure 3-4 Two ways to set a RAID group to allow hibernation
[Figure labels: application server (inband management: using commands sent by the host program); maintenance terminal (out-of-band management: using the ISM or CLI); RAID groups that allow hibernation]

3.2.3 Automatic Recovery of Bad Sectors

Disks are one of the most important components in a storage system. However, disks fail or are damaged easily. Data is stored on magnetic media. During long-time running of disks, a sector may go bad due to frequent read and write operations or improper use. As the number of bad sectors on a disk increases, the disk eventually has to be discarded.

In addition, the maintenance costs of disks are the highest in the entire storage system. Because the failure rate of disks is high, users have to replace failed disks to ensure data security. With increasing data volume and read/write operations, the failure rate of disks rises and thus costs increase.

To solve the problem, Huawei developed the automatic bad-sector recovery technology for the OceanStor series storage system.
This technology provides the following features:

- Recovering the data in bad sectors of disks intelligently
- Prolonging the service life of disks
- Reducing the failure rate of disks
- Reducing total costs

Among the software modules of the OceanStor series storage system, a dedicated bad-sector monitoring module and a bad-sector recovery module are developed to implement intelligent recovery of bad sectors of disks.

Combining the S.M.A.R.T. data of disks with the intelligent prefetch technology of Huawei, the bad-sector monitoring module monitors bad sectors of disks completely. This monitoring method detects bad sectors sooner than the method of finding them through read/write data errors returned by disks. By combining the two technologies, the accuracy of monitoring bad sectors of disks reaches 100%.

The working principle of the bad-sector recovery module is as follows:

1. After the storage system detects a bad sector on a disk through the bad-sector monitoring module, the data in the bad sector is reconstructed from the data on the other disks in the RAID group.
2. The system searches for a reserved sector on the disk and maps the address of the damaged sector to the reserved sector of the disk.
3. The system writes the data to the mapped address, after which data can be read from and written to this address.

Because only the data in the bad sector is reconstructed, the reconstruction time is negligible.

The automatic bad-sector recovery technology of the OceanStor series storage system can find bad sectors on disks and recover the bad sectors automatically, without manual intervention. The intelligent recovery technology greatly improves data security and effectively reduces the failure rate of disks. The failure rate of disks is reduced by more than 48%.
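The three recovery steps can be sketched as follows. The sketch assumes a RAID level with XOR parity (for example RAID 5), which the white paper does not specify, and the `Disk` class, its attributes, and the reserved sector addresses are hypothetical names used only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class Disk:
    """Hypothetical disk with a pool of reserved sectors and a remap table."""

    def __init__(self, sectors, reserved_addresses):
        self.media = dict(enumerate(sectors))     # physical address -> data
        self.reserved = list(reserved_addresses)  # unused spare sectors
        self.remap = {}                           # damaged address -> spare address
        self.bad = set()                          # physical addresses known to be bad

    def read(self, address):
        physical = self.remap.get(address, address)
        if physical in self.bad:
            raise IOError(f"bad sector at address {address}")
        return self.media[physical]

    def recover(self, address, peers):
        # Step 1: rebuild the lost data from the other disks in the RAID group.
        data = xor_blocks([peer.read(address) for peer in peers])
        # Step 2: map the damaged address to a reserved sector on this disk.
        spare = self.reserved.pop(0)
        self.remap[address] = spare
        # Step 3: write the rebuilt data to the mapped address.
        self.media[spare] = data
        return data
```

Only one sector per peer disk is read during recovery, which is why the reconstruction cost stays small compared with rebuilding a whole disk.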
3.2.4 HyperImage

As defined by the Storage Networking Industry Association (SNIA), a snapshot is a fully usable copy of a defined collection of data that contains an image of the data as it appeared at the point in time at which the copy was initiated.

There are many methods of implementing snapshots on disk arrays, and the methods vary by manufacturer. The HyperImage technology of the OceanStor series storage devices is a virtual snapshot, which combines mapping tables with copy-on-write to implement snapshots. Virtual copy-on-write snapshots are a widely used technology, so this document does not describe them in detail. In the S2600, you can take snapshots of each LUN at eight time points and save eight data duplicates at different time points for future use.

HyperImage provides the following features:

- No backup window
  A backup window refers to an interval of time during which a set of data can be backed up without seriously affecting applications that use the data. Data backup can be performed online through HyperImage. In the backup process, almost no backup window is involved and the services keep running properly.
- Quick data recovery
  HyperImage can directly read snapshots to obtain the original data at the time points of the snapshots. If the data of the original LUN is damaged, you can restore the data at specific time points from snapshots to implement data rollback.
- Periodical HyperImage to ensure continuous data protection
  HyperImage supports virtual snapshots at multiple time points for an original LUN. In addition, you can set policies to allow the system to activate or stop snapshots automatically. In this way, snapshots are automatically and periodically taken at multiple time points, which saves costs and provides continuous data protection.
- Redefinition of data purposes
  Through HyperImage, you can directly read consistent images of the original data through snapshots at different time points.
  The system can allocate them to other applications such as testing, archiving, and querying. This protects the production system and also defines new purposes for backup data.

3.2.5 HyperCopy

HyperCopy creates a copy of an original volume on a target volume. The original volume and the target volume may reside on the same disk array or on different disk arrays, so HyperCopy can make copies within a disk array or between disk arrays.

HyperCopy provides two approaches to copying: full copy and incremental copy.

- Full copy means copying the data in the original volume to a target volume completely. Full copy is performed offline, that is, when the services are stopped. Otherwise, the copy captures data that is still being processed. If the original volume holds a large amount of data, it takes a long time to complete a full copy.
- Incremental copy means copying only updated data to the target volume after initial synchronization between the original volume and the target volume, that is, copying the data updated between the time of the last copy and the current time. Unlike full copy, incremental copy can be performed online. In addition, incremental copy is implemented together with HyperImage.

The HyperCopy technology provided by the S2600 supports full copy and incremental copy to meet different requirements on data copy and data backup.

HyperCopy provides the following features:

- Combined with HyperImage, incremental copy of HyperCopy can be performed online, without interruption of services.
- As an array-based data replication technology, HyperCopy has no impact on application servers (ASs) and service networks.
- HyperCopy can make copies based on FC links or IP networks.
- HyperCopy can make copies in an OceanStor array, between two OceanStor arrays, or between an OceanStor array and third-party arrays.
HyperCopy can make copies from one OceanStor array to multiple OceanStor arrays.

3.2.6 Remote Replication

As a data mirroring technology, remote replication allows you to maintain data copies at two or more sites to avoid data loss caused by a disaster. Of the many remote replication technologies, synchronous replication and asynchronous replication are the most commonly used. The S2600 supports both synchronous replication (HyperMirror/S) and asynchronous replication (HyperMirror/A) to provide multiple disaster recovery modes.

3.2.7 WORM Technology

The write once read many (WORM) technology allows data to be written to the storage medium only once. After being written to the storage medium, data cannot be modified, deleted, or written again. This technology is used to protect important data.

The WORM technology functions on the disk array. If a LUN is set to a WORM LUN, it becomes read-only. Figure 3-5 shows the working process of WORM.

Figure 3-5 Working process of the WORM technology
(Figure labels: Host side; Write request from the host; Read request from the host; Array controller; Write request from the array; Read request from the array)

1. The host sends read/write I/O requests to the disk array.
2. The target (TGT) layer of the disk array determines whether the LUN is a WORM LUN.
3. If the LUN is a WORM LUN, the disk array serves read requests by reading data from the LUN and forwarding the data to the host. A WORM LUN does not support write operations, so a write request returns an error code to the host.

You can set a LUN as a WORM LUN and set a protection period for the LUN in days. During the protection period, the data stored in the LUN cannot be modified or deleted. Before the protection period expires, you can extend it but not shorten it.
When the protection period expires, you can set the WORM LUN back to a common LUN. The restrictions on a WORM LUN do not apply to a common LUN. The protection period ranges from 1 day to 60*365 days (21,900 days, about 60 years).

3.2.8 DHA Technology

The disk health analyzer (DHA) technology collects, stores, and automatically transmits information about the status of disks. It also checks and analyzes disk status offline and generates an alert when it finds a faulty disk. These functions are highlights of the DHA technology and can remarkably enhance data security.

The DHA collects changes in disk information, including SMART information, while the disks are working. It analyzes the collected information, predicts the trend of disk status, and generates alerts for disks that are about to fail, to avoid data loss.

A DHA system contains the disk information collection module in the controller of the disk array, the analysis module and Call Home module in the ISM, and, in the dedicated system of the service center, the data pre-processing module, the disk information database, and the data analysis module. Figure 3-6 shows the working principle of the DHA system.

Figure 3-6 Working principle of the DHA system

The working principle is described as follows:
The information collection module of the DHA system is integrated into the controller of the disk array. The module collects disk information and forwards the information to the ISM.
The analysis module of the DHA system is integrated into the ISM. The module analyzes the disk information collected by the information collection module and performs diagnosis based on the analysis results. The functions of the DHA system can be enabled or disabled in the ISM.
The ISM sends the disk information to the Huawei service center over the network by using the Call Home function.
After receiving the disk information, the Huawei service center stores the information in the database. The data analysis system analyzes the data to predict and identify faults, and provides suggestions for rectifying the faults.

3.2.9 Split Mirror

Split mirror, a snapshot technology, creates a complete physical copy of a LUN at a point in time. The split mirror feature of the OceanStor series is named HyperClone.

Generally, the volume that stores the original services is called the master volume, and the volume generated as a copy is called the slave volume. In split mirror, they are called the master LUN and the slave LUN respectively. When a user synchronizes and then splits a master LUN and a slave LUN, split mirror generates a complete physical copy of the master LUN at a point in time without interrupting services. After the split, writing data to or reading data from the copy has no impact on the data of the master LUN. Therefore, split mirror can be used in online backup, data mining, and application test scenarios.

Split mirror adopts the bitmap and copy-on-write technologies, and the bitmap and dual-copy (write on the slave LUN and then the master LUN) technologies.

The working principle of split mirror is as follows:
1. After the slave LUN is added to the split mirror group, by default a complete synchronization from the master LUN to the slave LUN is performed once, and the copy progress is displayed during the data copy process.
2. If the master LUN receives a write request from the production host during the initial synchronization, the system checks the initial synchronization progress.
If the data block to which the new data will be written has not yet been copied to the slave LUN, the new data is written to the master LUN and the production host is notified that the write operation is complete. The new data then reaches the slave LUN as part of the initial synchronization. If the data block has already been copied to the slave LUN, the new data is written to both the master LUN and the slave LUN. If the data block is being copied, the new data is written to both the master LUN and the slave LUN after the data block has been fully copied to the slave LUN.
3. After the initial synchronization is complete, the data on the master LUN is consistent with that on the slave LUN. If the master LUN receives a write request from the production host, the data is written to both the master LUN and the slave LUN.
4. After the initial synchronization is complete, the master LUN and the slave LUN can be split from each other. After the split, the master LUN and the slave LUN can be used for independent data analyses and tests, and changes to the data on one do not affect the other. Changes to data blocks on the master LUN and the slave LUN are recorded only in progress bitmaps.

HyperClone of the OceanStor series has the following advantages:
1. One-to-eight mode: In this mode, HyperClone supports one master LUN and eight slave LUNs and backs up eight data copies, which can be used for different kinds of data analysis.
2. Zero backup window: The backup window is the period that the application accepts for data backup tasks, that is, the maximum downtime that the applications can tolerate. When a backup task is processed by using HyperClone, the user does not need to shut down applications, and the backup window is close to zero.
3. Reverse data synchronization: HyperClone supports reverse data synchronization.
When the data on the master LUN is incomplete or damaged and needs to be recovered, you can reversely synchronize the incremental data from the slave LUN to the master LUN to recover the original service data, which ensures the integrity and consistency of data.
4. Dynamic copy rate modification: HyperClone supports dynamic copy rate modification to avoid conflicts between copy tasks and production services. When services on the storage array are busy, you can manually lower the copy rate to leave system resources for services. When services are idle, you can manually raise the copy rate to speed up the copy so that the copy task does not run into peak hours.

4 Experience

4.1 Unique Values of the S2600

4.1.1 Combo Interface Technology

In the S2600, combo interfaces mean that a controller provides two types of interfaces: FC interfaces and iSCSI interfaces. Each controller of the S2600C storage system supports two FC interfaces and two iSCSI interfaces. That is, two controllers support four FC interfaces and four iSCSI interfaces in total. Combo interfaces have the following advantages:
Combining two types of host interfaces in a single storage system lets the system meet different networking requirements, that is, FC networking and IP networking. This greatly simplifies your networking environment.
Two separate applications are integrated in a single storage system. This improves equipment manageability and saves costs.
4.1.2 Disk Hibernation Technology

Green storage, energy saving, and emission reduction are popular topics in the storage field. Huawei is dedicated to researching energy-saving technologies. The S2600 uses a DC power supply system and disks that support energy saving. In addition, it adopts disk hibernation, its most noticeable energy-saving technology.

As the name suggests, disk hibernation means that disks stop running and enter a hibernation state. The power consumption of disks in the hibernation state is far lower than that of disks in the normal running state. The following takes SATA disks with a rotating speed of 7200 rpm and a capacity of 1 TB as an example. The power consumption of such a disk in the normal read/write state is about 15 W. After the disk enters the hibernation state, its power consumption is only about 3 W. If an S2600 controller subrack holds 12 disks, the power consumption drops by 144 W after all the disks in the subrack enter the hibernation state, that is, (15 - 3) * 12 = 144. In other words, more than 40% of the power of the entire controller subrack is saved.

The disks of the S2600 enter the hibernation state in units of RAID groups. That is, all the disks in a RAID group enter the hibernation state or are woken together. You can set the hibernation attribute of a RAID group through the ISM.

For example, you can set a RAID group to enter hibernation after an interval of 10 minutes. If the RAID group has no I/O stream for 10 minutes, it enters the hibernation state. You can also wake a RAID group in the hibernation state automatically or manually as required. For example, if you set automatic wakening, once an I/O stream arrives for a member disk, the RAID group is woken within 10 seconds.
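The power arithmetic and the idle-timeout behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not S2600 or ISM code: the names hibernation_saving_w and RaidGroupModel are hypothetical, the 15 W and 3 W figures are the approximate per-disk values quoted in the text, and the model ignores the wake-up latency of up to 10 seconds.

```python
from dataclasses import dataclass

ACTIVE_W = 15.0     # approx. per-disk power in normal read/write state (from the text)
HIBERNATE_W = 3.0   # approx. per-disk power in hibernation state (from the text)


def hibernation_saving_w(disks: int,
                         active_w: float = ACTIVE_W,
                         hibernate_w: float = HIBERNATE_W) -> float:
    """Power saved, in watts, when `disks` disks hibernate."""
    return (active_w - hibernate_w) * disks


@dataclass
class RaidGroupModel:
    """Hypothetical model: the whole RAID group hibernates or wakes as a unit."""
    idle_timeout_s: float = 600.0   # 10 minutes, as in the example
    last_io_s: float = 0.0
    hibernating: bool = False

    def on_io(self, now_s: float) -> None:
        # Any I/O to a member disk resets the idle timer and wakes the group
        # (automatic wakening; the real wake-up delay is not modeled).
        self.last_io_s = now_s
        self.hibernating = False

    def tick(self, now_s: float) -> bool:
        # Enter hibernation once the group has seen no I/O for the timeout.
        if not self.hibernating and now_s - self.last_io_s >= self.idle_timeout_s:
            self.hibernating = True
        return self.hibernating


if __name__ == "__main__":
    print(hibernation_saving_w(12))   # (15 - 3) * 12
    group = RaidGroupModel()
    group.on_io(0.0)
    print(group.tick(599.0), group.tick(600.0))
```

Keeping the timer on the RAID group rather than on each disk mirrors the statement that hibernation is applied in units of RAID groups.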
The disk hibernation technology can be widely used in near-line or offline applications, such as backup and archiving. It saves power and thus reduces the total cost of ownership (TCO).

4.1.3 Distributed Power Supply System

Most disk array systems use a centralized power supply mode to power the disks in an entire subrack, with the power supply system integrated in the control boards. This mode is simple and convenient: when you turn off the power switch, all the disks in the subrack are powered off. However, this mode cannot control the power-on and power-off of a single disk, so its reliability is low. In addition, the power consumption of all the disks in the subrack places higher requirements on the power supply system integrated in the control boards.

In contrast to the traditional centralized power supply mode, the S2600 is designed with a distributed power supply mode. In this design, the power supply system that would otherwise be integrated in the control boards is moved to the disk conversion boards. This makes it possible to control the power-on and power-off of each disk. In addition, it has the following advantages:

Helps turn off the 5 V power supply to a disk
The distributed power supply mode enables you to turn off the 5 V power output to a disk by controlling the 5 V conversion circuit through a logic level. In the centralized power supply mode, by contrast, powering off each disk individually means switching a high-current power circuit, which is costly, difficult to implement, and unreliable.

Implements zero power consumption of disks in standby state
After the 5 V control circuit is cut off, the associated power conversion circuit stops working and the power consumption is zero.

Prolongs the service life of the system
The disk shutdown mode with zero power consumption greatly reduces the power consumption of the system and prolongs the service life of disks.

Provides more reliable hot-swap management
Because each disk can be powered off individually, hot swapping is in effect carried out as cold swapping.
4.1.4 Perfect Combination of HyperImage and Backup Software

Backup through the HyperImage technology is currently one of the mainstream backup modes. A copy (snapshot) of the original volume is created at a specific time point, and the source data is backed up from the snapshot. This backup mode has the following advantages:
It reduces the amount of backup data and the backup window.
The snapshot backup can be performed online, which does not affect the production system on the original volume.

The preceding process must be performed manually, step by step:
1. On the page of the array management software, take a snapshot of the original volume.
2. On the backup software page, back up the snapshot.

However, backing up a database by using only the snapshot function provided by the array may cause database data inconsistency.

Considering the preceding conditions as well as the applications and features of the S2600, a solution has been developed based on Backup Exec (BE), Windows VSS, and HyperImage.

Volume Shadow Copy Service (VSS) is a service integrated in the Microsoft OS. It works with applications, backup programs, and storage hardware to create shadow copies. VSS provides functional modules and external interfaces.

The S2600 adopts the Windows VSS interface driver, which allows you to call HyperImage through VSS from BE, where the VSS interface program is integrated. In this way, backup through HyperImage is implemented, which greatly simplifies the backup operation. Data consistency is ensured by VSS. In addition, you do not need to install a database agent for each type of backup software.
4.2 S2600 Application Cases

4.2.1 Yueyang City of Hunan Province

Challenges Faced by Customers
To protect the public security of the city, prevent crimes effectively, handle public incidents, modernize management, and build a safe and harmonious Yueyang, an electronic security protection system was set up according to the requirements of the government and the police bureau.

Solution
The high-performance S2600C is used, and the storage architecture adopts the centralized storage mode. That is, all storage devices are deployed in the control center, and all devices and data are managed by the data center. The S2600 is integrated with the integrated video management platform and requests from that platform the video information needed for playback. The storage network solution includes the centralized storage management platform, data management tools, and storage devices.

Figure 4-1 Storage networking of Yueyang city
(Figure labels: data center equipment room of the Yueyang electronic surveillance system with application server, S2600, workstation, digital decoders, and TV walls; optical private network; urban management office/command center/district office; monitoring center; local police station with video distributor group, analog matrix, multichannel decoder, TV wall, and workstation; fiber access through optical transceivers and a multichannel optical transceiver)

Benefits to Customers
Because data storage is the core of the city surveillance storage system, data classification and the management of storage devices are critical. The S2600C provides a high-performance storage device with up to 8 GB cache.
To improve the efficiency and security of system management, the storage system must be equipped with storage management tools and provide various management interfaces, such as LED, Web UI, RS232, GUI, and CLI. In addition, several pre-alert and alarm methods are provided.

4.2.2 Harbin Railway Station

Challenges Faced by Customers
Harbin railway station already has several PC servers running the Windows OS, on which applications such as office automation and finance run. Data is stored in the servers. As services develop, the storage in the servers cannot accommodate the ever-increasing data. Furthermore, misoperations, hardware or software faults, viruses, and natural disasters might cause data loss and even result in inestimable loss.

Solution
One high-performance, high-reliability S2600 is used as the main storage array for service data, and the data of the various hosts is stored on this array. Another S2600 is used as a backup array. The world-class data protection software Backup Exec is used as the backup software to back up key data based on the policy specified by the railway station.

Figure 4-2 Storage backup system of Harbin railway station
(Figure labels: File server; Database server; Backup Exec server)

Benefits to Customers
Service data is stored in a centralized manner, which speeds up response and simplifies management and maintenance. The data backup system protects data security effectively and eases the concerns of system users and operations staff.

5 Acronyms and Abbreviations

Acronym/Abbreviation    Expansion
ATA                     advanced technology attachment
BE                      Backup Exec
cache                   cache
FC                      Fibre Channel
IDC                     Internet data center
IP                      Internet Protocol
iSCSI                   Internet SCSI
LUN                     logical unit number
NAS                     network attached storage
PCI                     peripheral component interconnect
RAID                    redundant array of independent disks
S.M.A.R.T               self-monitoring, analysis, and reporting technology
SAS                     serial attached SCSI
SATA                    serial advanced technology attachment
SCSI                    small computer system interface
TCP                     Transmission Control Protocol
UPS                     uninterruptible power supply
VSS                     volume shadow copy service