Version 1.5, Revision 1
Chapter 4
Storage Design
Reference Architecture Guide
Abstract
This chapter describes the Microsoft® Systems Architecture Internet Data Center storage design. Considerations for storage planning are discussed in the areas of SAN fabrics, storage systems and host bus adapters. SAN security issues are also discussed.
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESSED OR IMPLIED, IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.
© 2002 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows NT, Active Directory, BizTalk, JScript, and Visual Basic are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.
Microsoft Corporation
Contents

Introduction
SAN Fabric Design
    SAN Interconnect Technologies
    Fibre Channel Topologies
        Basic Dual-Switch SAN Design
        Departmental SAN Design
        Two-Tier Layered Design
        Backbone Design
        Redundant Star Design
        Fan-Out Design
    Fibre Channel Switches
    Host Bus Adapters (HBA)
Storage Array
    Storage Device
    Storage Disk Subsystem
    Logical Unit Numbers (LUNs)
    RAID Configuration
SAN Security Design
    LUN Security
    Zoning
    Persistent Binding
Summary
A SAN is a networked infrastructure designed to provide a flexible, high-performance, highly scalable storage environment. SANs enable many direct connections between servers and storage devices such as disk storage systems and tape libraries.
The SAN is made up of specific devices, such as host bus adapters (HBAs) in the host servers, SAN switches that route SAN traffic much as a LAN switch routes network traffic, disk storage subsystems, and tape libraries. All these devices are interconnected by copper or, more commonly, fibre. The fibre and switches used to connect the host and storage devices are commonly referred to as the fabric of the SAN.
The main characteristic of a SAN is that the storage subsystems are generally available to multiple hosts at the same time, offering scalability and flexibility in the use of the storage subsystem.
Figure 1. SAN Architecture
The main advantages of a SAN are:
· Scalability. More storage can be added to the fabric, and more hosts can be connected to the storage subsystems.
· Increased availability of the storage subsystems through multiple paths in the SAN fabric.
The main disadvantages of a SAN are:
· Complexity. SANs require specialist knowledge to design, operate, and maintain.
· Expense. A SAN is a more expensive infrastructure than direct-attached storage (DAS) or network-attached storage (NAS).
SAN fabrics can become increasingly complex as more devices are added. As with LAN subnets that are interconnected by routers and switches, separate fabrics called SAN islands can be interconnected by fabric switches. When designing the SAN fabric, it is important to understand the available interconnect technologies.
The two interconnect technologies in SAN fabric design are Fibre Channel and SCSI. Fibre Channel is an up-and-coming technology poised to eventually replace SCSI hardware technology as the disk subsystem interconnect of choice. Fibre Channel, which is based on ANSI standards similar to SCSI, is the next step in the disk I/O connectivity evolution as parallel SCSI will eventually reach electrical limitations that will impede continued performance enhancements.
Fibre Channel's primary strength is that it can be implemented over fibre optics and is serial in nature, as opposed to the parallel nature of SCSI. With the use of fibre optics, Fibre Channel provides higher throughput, fewer distance limitations, and more topology options than traditional SCSI technologies. Currently, throughput for a Fibre Channel bus ranges up to 200 Mbytes per second. Fibre Channel can be implemented in several topologies: arbitrated loop (the most commonly used), point-to-point, and switched fabric. Although differential SCSI limits the distance from the server to the disk drive to 25 meters, the fibre-optic implementation of Fibre Channel allows distances up to 10,000 meters. This increased distance allows more flexibility in disaster planning and overall system architecture.
Even with the new features and possibilities that Fibre Channel brings, the same concepts apply as when reviewing SCSI technologies. The primary performance constraint for a disk drive attached to a Fibre Channel bus is still the physical disk drive itself. Because of the higher available throughput, however, Fibre Channel can support more disk drives per bus.
The expanding role of open systems computing environments requires high performance and higher availability. To meet these needs, Fibre Channel:
· Relieves input/output (I/O) congestion by providing low latency and high throughput at the application and end-user level.
· Improves operational efficiency because Fibre Channel is highly scalable, flexible, and easy to manage.
· Extends connectivity by providing any-to-any connectivity over longer distances.
· Maintains investment protection by providing a simple growth path for open systems-based environments.
· Improves the quality of service by maximizing system uptime with non-disruptive maintenance and superior reliability, availability, and serviceability.
Fibre Channel architectures are based on gigabit speeds, with an effective throughput of 200 MB per second (400 MB per second full duplex). All Fibre Channel architectures allow both copper and fibre-optic cable plant, with maximum distances appropriate to the media (30 m for copper, 500 m for short-wave laser over multimode fibre, and 10 km for long-wave laser over single-mode fibre).
The IDC SAN is based on the Fibre Channel architecture.
Some of today's most common SAN fabric topologies include the following approaches:
· Basic dual-switch SAN design
· Departmental SAN design
· Layered 2-tier design
· Backbone design
· Redundant star design
· Fan-out design
One of the most basic types of SANs is a dual-switch design, which generally includes a small number of hosts and storage devices (see Figure 2). It is not highly scalable, because all servers and storage must be connected to both switches to achieve high availability. Dual fabrics (fabrics in which switches are not connected to each other) protect organizations against errors such as a user erasing or changing the zoning information. Because the zoning information is separate for each fabric, altering the zoning in one fabric does not affect the second.
Figure 2. A basic dual-switch design
A dual-switch, dual-fabric design provides very high availability. Each server requires dual HBAs, and each storage device must have at least two ports. Failover for a failed path or even a failed switch depends on host failover software. The switches do not reroute traffic for a failed link, because there is no multiswitch fabric or network in this design; each switch is its own fabric.
A more scalable design is the departmental SAN, which is typically a mesh or a full-mesh network with an optimal design of four switches (see Figure 3). In a full-mesh design each switch connects to every other switch. This departmental SAN design has limited scalability: as the number of switches in the mesh increases, the number of ports available for server and storage nodes decreases.
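To illustrate this scaling limit, the arithmetic can be sketched in a few lines of Python. This is an illustration only; the 16-port switch size and the function name are assumptions, not values from this guide.

def full_mesh_ports(switches: int, ports_per_switch: int, isls_per_pair: int = 1):
    """Return (free ports per switch, total free ports) for a full mesh in
    which every switch connects to every other switch with isls_per_pair links."""
    isl_ports = (switches - 1) * isls_per_pair      # ports consumed by ISLs on each switch
    free_per_switch = max(ports_per_switch - isl_ports, 0)
    return free_per_switch, free_per_switch * switches

# Hypothetical 16-port switches: the ports left on each switch for servers
# and storage shrink as the mesh grows.
for n in (2, 4, 6, 8):
    per_switch, total = full_mesh_ports(n, 16)
    print(f"{n} switches: {per_switch} free ports per switch, {total} total")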
Figure 3. A departmental SAN design
In contrast, a benefit of the full-mesh design is the ability to handle failures through the switches' Fabric Shortest Path First (FSPF) rerouting functions instead of relying only on the host failover software. For example, Gigabit Interface Converter (GBIC), link, path, and entire-switch failures can often be handled by simple rerouting through the network; the host server never sees the failure. However, host failover software is still required for certain failures (such as HBA or certain types of link failures). As a result, the highest level of availability stems from a combination of host failover software and the intelligence in the Brocade fabric.
The next more scalable design is a 2-tier layered design. Organizations can implement this design with dual fabrics, which essentially look like two versions of Figure 4. A 2-tier design typically has a host tier and a storage tier of switches. All host servers are connected to the host tier, and all storage is connected to the storage tier. A path from a host server to its storage is always a single hop away. This design is often used for connecting a few large servers to a few large storage devices.
Figure 4. A 2-tier layered design
Although this design is more scalable than some others, its scalability is still limited. There is a limit to how many switches can reside in either the host or the storage tier, because all host-tier switches are connected to all storage-tier switches and vice versa. Organizations would eventually run out of ports if they added too many switches to either tier. However, this design can support hundreds of end-node connections with high throughput and high availability.
A much more scalable approach is a backbone design borrowed from traditional networking methodologies that include core switches and edge switches. The edge switches are networked in any topology (such as dual switch, mesh, full mesh, 2-tier, or redundant star) to form SAN islands. The SAN islands then connect to each other to form a single fabric by using the core switches. Servers and storage devices are usually connected to the switches in the SAN islands, not directly to the switches in the core, which is typically reserved for connecting the islands. One exception to this rule might be a centralized tape backup solution, where all SAN islands can access a large tape library connected to the core switches. However, attaching a tape library to the core reduces the number of core ports available to connect the islands. Figure 5 shows two sets of core switches in a 4-switch mesh design. However, many other core switch topologies are possible.
Figure 5. A backbone design with edge and core switches
SAN backbone designs are an easy way to deploy enterprise fabrics, even if organizations begin with several smaller fabrics. Backbone designs provide a simple migration path to a single fabric without requiring complete SAN redesign, because the SAN islands are typically dispersed either geographically or logically. Geographic dispersion might be for departments or applications that are many miles apart. Logical dispersion might be a way to separate platforms or applications from each other even if they are in the same physical location. The most logical use for a large, high-port-count switch might be as a core switch in a SAN backbone. For the highest availability requirements, however, multiple core switches might be necessary.
The most scalable, enterprise-level design is a topology borrowed from the traditional networking world: the redundant star topology consisting of a 3-layer switch design. The left central switch is connected to all switches above and below it as shown in Figure 6, forming a star. Higher availability requires a redundant star (the right central switch). The central switches are not connected to each other. This is a highly scalable design because it can grow both vertically and horizontally, and organizations can connect servers to one tier and storage to another tier.
Figure 6. A redundant star design
The fan-out design typically connects many servers to a large storage array with very high availability. As shown in Figure 7, this redundant fabric design with a 16-port storage array enables 160 host server ports per fabric (using 16-port Brocade switches) and two ports per switch for tape backup connections.
Figure 7. A redundant fan-out design
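The port budget of a fan-out fabric can be checked with the same kind of arithmetic. The following Python sketch is illustrative only; the reserved-port counts are placeholders, and the exact 160-port figure quoted above depends on how many ports each edge switch dedicates to the storage array, tape, and any interswitch links.

def fan_out_host_ports(edge_switches: int, ports_per_switch: int,
                       reserved_per_switch: int) -> int:
    """Host-facing ports per fabric once each edge switch reserves
    reserved_per_switch ports for the storage array, tape, and any ISLs."""
    return edge_switches * (ports_per_switch - reserved_per_switch)

# Placeholder numbers: sixteen 16-port edge switches, each reserving one port
# for the storage array, two for tape backup, and one for an interswitch link.
print(fan_out_host_ports(edge_switches=16, ports_per_switch=16, reserved_per_switch=4))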
The IDC SAN is a basic dual-switch SAN design.
A Fibre Channel switch typically provides 8, 16, or more ports, with full gigabit speeds available at each port. Following the model previously established by Ethernet switches, a Fibre Channel switch port may be configured to support a single node or a shared segment of multiple nodes (for example, a loop). Because a switch requires more processing power, memory, and microcode at each port to properly route frames, switch per-port costs are usually more than four times arbitrated loop hub per-port costs.
A Fibre Channel switching hub is a hybrid technology that offers the advantages of both arbitrated loop and fabric switching. A switching hub manages the address space of two or more arbitrated loop segments to create a larger, logical loop. This larger loop allows nodes on physically separate loops to communicate transparently with one another, while maintaining higher available bandwidth on each physical loop.
Fibre Channel switches and Fibre Channel network protocols ensure that device connections between host servers and SAN-based disk subsystems are both reliable and efficient. These connections are based on either native Fibre Channel or SCSI. One or more Fibre Channel switches provide the interconnectivity for the host servers and storage devices in a meshed topology referred to as a SAN fabric.
The Fibre Channel switch provides high port counts and throughput, with an opportunity to connect multiple data center SANs by using one or more dedicated wavelengths over dense wavelength division multiplexing (DWDM) technology and a metropolitan area network (MAN). By extending the SAN across MAN infrastructures, SANs improve disaster tolerance through a seamless connection to remote facilities. Connecting SANs over MAN infrastructures increases the distances between SAN components by as much as 150 km with little or no decrease in performance. This provides separate, dedicated bandwidth for storage replication operations and allows for scaling as the storage volume grows.
Fibre Channel switches are capable of extremely high reliability. Availability improves even further with fibre switches that support redundant hot-swappable power supplies and cooling, as well as hot-swappable optic modules that enable single-port replacement of optics without impacting working devices. One of the easiest ways to increase SAN availability through fibre switching is to implement a core-to-edge topology. Core-to-edge topologies connect devices to edge switches, which then connect to central interconnecting switches that in turn connect to other parts of the SAN or other devices. A core layer of switches enables scaling of both the central bandwidth of the SAN and the attachment of additional edge switches. As higher-speed Fibre Channel switches become available, the existing 2 Gbit/sec core switches can be migrated to the edge to enable an even higher-speed core fabric.
Fabric elements cooperate to receive data from a port of an attached device, route the data through the proper switch ports, and deliver the data to the port of a destination device. The data transmission path through the fabric is typically determined by the fabric elements and is transparent to the user. Subject to zoning restrictions, devices attached to any of the interconnected switches can communicate with each other through the fabric.
A multiswitch fabric is typically complex, and provides the facilities to maintain routing to all device ports attached to the fabric, handle flow control, and satisfy the requirements of the classes of Fibre Channel service that are supported.
Most enterprises have unique configurations determined by the characteristics of the end devices, fabric elements, cost, and the installation's performance objectives (such as high data transfer rate or high availability). These factors, along with non-disruptive growth and service requirements, must be evaluated when planning an initial fabric.
The distance between fibre switching devices in a fabric affects the type of interswitch links required. Consider the following:
· If the distance between two switches is less than 300 meters, any port type (shortwave or longwave laser) and any fibre-optic cable type (multimode or single mode) can be used to create an interswitch link connection. In this case, cost or port card availability may be the determining factor.
· If the distance between two switches exceeds 500 meters, only longwave laser ports and single-mode fibre-optic cable can be used to create an interswitch link connection.
· Distance limitations can be increased and port type restrictions eliminated by using multiple fibre switching devices. Each switch retransmits received signals, thus performing a repeater function. However, cost may be a determining factor.
Variables such as the number of connections, grade of fibre-optic cable, device restrictions, application restrictions, buffer-to-buffer credit limits, and performance requirements can affect distance requirements.
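These distance guidelines can be expressed as a simple selection helper. The Python sketch below is illustrative only; the 300-meter and 500-meter thresholds come from the list above, and because the text does not explicitly address the intermediate range, that branch simply defers to vendor guidance.

def isl_media_options(distance_m: float) -> str:
    """Suggest port and cable types for an interswitch link, based on the
    distance guidelines above (an illustrative helper, not a vendor tool)."""
    if distance_m < 300:
        return "any port type (shortwave or longwave) with multimode or single-mode fibre"
    if distance_m <= 500:
        return "consult vendor cable-plant guidance for this intermediate distance"
    return "longwave laser ports with single-mode fibre, or insert a repeating switch"

print(isl_media_options(150))
print(isl_media_options(2000))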
Interswitch link connections can be used to increase the total bandwidth available for data transfer between two fibre-optic switches in a fabric. Increasing the number of interswitch links between switches increases the total interswitch bandwidth, but decreases the number of port connections available to devices.
Planning consideration must be given to the amount of data traffic expected through the fabric. Because the fabric automatically determines and uses the least cost (shortest) data transfer path between source and destination ports, some interswitch link connections may provide insufficient bandwidth while the bandwidth of other connections is unused. Special consideration must also be given to devices that participate in frequent or critical data transfer operations.
The HBA is the Fibre Channel interconnect between the server and the SAN, replacing the traditional SCSI adapter for storage connectivity. Fibre Channel was initially offered as an HBA connecting servers to external disk drives, with little intelligence in the HBA itself. Without any inherent intelligence, Fibre Channel HBAs do not currently provide much hardware RAID support, which limits the current viability of this technology in the midrange Windows Server space. In the higher end of the Windows Server space, Fibre Channel HBAs are best used when paired with an intelligent external disk array: the array provides the hardware RAID implementation, and the Fibre Channel HBA provides the connectivity back to the server.
As Fibre Channel technology matures and more products become available, HBA technology will become a more viable option in the price-sensitive Windows Super Server product space.
After high availability for servers is taken into account, the path between the server and the storage should be considered as the next single point of failure. Potential points of failure on this path might include HBA failures, fibre-optic cable malfunction, or storage connection problems. Using a dual-redundant HBA configuration ensures that an active path through the SAN fabric to the data is available. In addition to providing redundancy, this configuration may enable overall higher performance due to additional SAN connectivity.
To achieve better fault tolerance, the multiple paths through the SAN fabric to the data provided by the dual-redundant HBA configuration should be connected to separate dual SAN fabrics. This configuration ensures that no fault scenario can cause the loss of both data paths. Server based software for path failover enables the use of multiple HBAs, and typically allows a dual-active configuration that can divide workload between multiple HBAs to improve performance. This software can monitor the availability of the storage, the servers, and the physical paths to the data and can automatically reroute to an alternate path should a failure occur.
Should an HBA failure occur, host server software detects that the data path is no longer available and transfers the failed HBA workload to the redundant active HBA. The remaining active HBA assumes the workload until the failed HBA is repaired or replaced. After identifying failed paths or failed-over storage devices and resolving the problem, the host server software automatically initiates failback and restores the dual path without impacting applications. The host server software that performs this failover is provided by system vendors, storage vendors, or value-added software developers.
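The failover and failback behavior described above can be sketched, in greatly simplified form, as follows. The class and path names are hypothetical; in practice this logic lives in vendor-supplied multipath drivers or HBA firmware, not in application code.

# Minimal sketch of the dual-HBA failover/failback behavior described above.
class MultipathDevice:
    def __init__(self, paths):
        self.paths = {p: True for p in paths}   # path name -> healthy?

    def active_path(self):
        for path, healthy in self.paths.items():
            if healthy:
                return path
        raise IOError("no healthy path to storage")

    def fail(self, path):
        self.paths[path] = False                # failover: next I/O uses another path

    def restore(self, path):
        self.paths[path] = True                 # failback once the HBA or link is repaired

dev = MultipathDevice(["hba0->fabricA", "hba1->fabricB"])
print(dev.active_path())        # hba0->fabricA
dev.fail("hba0->fabricA")
print(dev.active_path())        # hba1->fabricB takes over the workload
dev.restore("hba0->fabricA")
print(dev.active_path())        # hba0->fabricA again after failback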
The HBA moves requested data from the SCSI bus onto the PCI bus. HBAs provide a range of functions from basic SCSI connectivity to complex RAID support. Four primary features distinguish the HBAs available today:
· Number and type of SCSI channels supported
· Amount of server CPU overhead introduced
· RAID support level
· I/O workload supported
The channel density that a SCSI HBA supports directly affects the number of disk devices that Windows servers can support, as well as how the disk devices are configured in the server. Today, one HBA can support from one to four onboard SCSI channels, which conserves precious PCI slots in the server. Using these higher-density HBAs, a standard high-volume server can support up to five HBAs for a total of 20 SCSI channels. Whether configuring that many SCSI channels in a single standard high-volume Windows server is a good idea depends on the environment. When implementing high numbers of HBAs in a single Intel-based standard high-volume server, a shortage of interrupts is likely to become the limiting factor. Each server vendor implements the setup of interrupts in a slightly different manner; refer to the vendor's server documentation for the steps to follow for interrupt control.
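As a rough illustration of the channel-density arithmetic above, the following Python sketch multiplies HBAs by channels and by an assumed device count per channel; the helper names are hypothetical, and the 15-device figure is an assumption based on a wide SCSI bus with one ID taken by the initiator.

def total_scsi_channels(hbas: int, channels_per_hba: int) -> int:
    """Total SCSI channels, as in the five-HBA, four-channel example above."""
    return hbas * channels_per_hba

def max_disk_devices(hbas: int, channels_per_hba: int, devices_per_channel: int) -> int:
    """Upper bound on attachable disks; devices_per_channel is an assumption
    that depends on the SCSI variant and cabling in use."""
    return total_scsi_channels(hbas, channels_per_hba) * devices_per_channel

print(total_scsi_channels(5, 4))        # 20 channels, matching the text
print(max_disk_devices(5, 4, 15))       # 300 devices under the stated assumption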
Each HBA interrupts the Windows server when an operation requires server CPU intervention. The number of CPU cycles required varies between HBAs; the CPU time required to service an HBA is the overhead the HBA places on the Windows server. In one sense this overhead is productive, because it is spent retrieving data from the disk drives so that useful work can be completed. Other I/O devices, such as network interface cards, introduce similar CPU interrupt overhead. It is important to be aware that CPU cycles are required to drive the I/O subsystem. For smaller servers configured with small disk subsystems, the CPU overhead introduced by the disk arrays is not significant. As Windows server solutions grow, however, the CPU cycles required to drive the disk subsystem become more of an influencing factor. If the server application requires a great deal of computational power and there is a large amount of disk I/O activity, CPU capacity must be planned to support both.
Various vendor specification sheets for four- or eight-CPU Pentium Pro servers state that multiple terabytes of disk can be connected to one server; one vendor even claims support for 7 terabytes. Although this is possible from a connectivity perspective, the server will never be able to drive the disk subsystem if all of the disks and applications are active. Somewhere a balance must be found.
When examining relevant TPC-C benchmarks closely, for example, it becomes apparent that the larger configurations maintain a consistent balance between CPUs, SCSI channels, and disk drives. These test results are subject to many factors, such as Windows Server version, database version, and number of clients, but they still provide a rough estimate of the relative sizing of CPUs, SCSI channels, and disk drives in a client/server database environment.
There is more to consider than just throughput when reviewing a disk drive's performance, and more to consider than just the number of SCSI channels an HBA can support. When sizing the number of HBAs required, also consider the theoretical I/Os per second that the HBA can support. For example, when configuring ten disk drives for a sequentially intensive disk environment, do the associated math. Each 7200-RPM SCSI disk drive in a sequential environment should be able to support up to 190 disk I/O operations per second and 2 Mbytes/sec of throughput. A 10-drive implementation would therefore require an HBA or HBAs that can support 1,900 I/Os per second and 20 megabytes (MB) per second of aggregate throughput. To meet this requirement, a two-channel Ultra Wide SCSI-3 HBA can be used, which has a workload rating of 2,400 I/Os per second and leaves room for growth by providing a second usable SCSI channel. Several HBAs are available that meet the throughput requirement (40 Mbytes/sec per channel). This type of performance data can be obtained from the server or HBA vendor's Web site, or by contacting the vendors directly.
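The same sizing math can be written out as a small Python sketch. The per-drive figures and the 2,400 I/Os-per-second rating are the example numbers from the paragraph above, not specifications for any particular product.

def required_hba_capacity(drives: int, iops_per_drive: int, mb_per_sec_per_drive: float):
    """Aggregate I/O rate and throughput the HBA(s) must sustain, using the
    per-drive figures quoted in the sizing example above."""
    return drives * iops_per_drive, drives * mb_per_sec_per_drive

iops, throughput = required_hba_capacity(drives=10, iops_per_drive=190, mb_per_sec_per_drive=2.0)
print(f"{iops} I/Os per second, {throughput} MB/sec")   # 1900 I/Os per second, 20.0 MB/sec

# The 2400 I/Os-per-second rating is the example figure from the text,
# not a specification for any particular HBA model.
print("single two-channel HBA sufficient:", iops <= 2400)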
The HBA setup determines the operating performance level of the SCSI channels connected to the HBA. An HBA that is rated for Ultra Fast and Wide SCSI speeds is not necessarily configured to run at those speeds. As the server boots, or by using the tools provided by the HBA vendor, confirm that the SCSI channel speeds are set to operate at the desired performance level. Some HBA settings to enable and confirm are:
· SCSI bus speed
· Tagged command queuing
· Disconnect
· Wide transfers
Manufacturers of disk drive adapters are constantly working to remove bugs and improve the performance of their disk adapters. Typically, the latest drivers are available from the manufacturer's Web site.
However, downloading and installing HBA drivers from the manufacturer's Web site is not a recommended best practice. HBA drivers are tested for performance and expected serviceability as part of the MSA process, and the Windows HCL lists the driver versions tested and supported by the MSA IDC.
The MSA IDC implements two HBAs in each SAN-connected mission critical server. Path failover capabilities are supported either by software or by firmware support on the HBA.
It is important to determine the storage requirements for each service within the solution. Here are a few of the questions that should be asked when evaluating the design considerations for the subsystem usage:
· What applications or user groups will access the subsystem?
· How much capacity will be required?
· What are the I/O requirements?
· If an application is data transfer-intensive, what is the required transfer rate?
· If it is I/O request-intensive, what is the required response time?
· What is the read/write ratio for a typical request?
· Are most I/O requests directed to a small percentage of the disk drives?
· Can the I/O requests be redirected to balance the I/O load?
· Is the data being stored mission-critical?
· Is availability the highest priority or would standard backup procedures suffice?
A SAN comprises a number of subsystems that work together to deliver storage functionality to the enterprise services. These subsystems are the topic of the following section.
Machines that host the storage subsystem have the following characteristics:
· Large main memories for caching or speed-matching to the user network
· Many I/O channels for streaming data between storage subsystems and memory
· Multiple processors to manage time-critical processing of I/O requests
A storage subsystem is a collection of devices that share a common power distribution, packaging, or management scheme. The SAN-attached disk storage subsystem is connected to the server through Fibre Channel by way of fibre-optic cables, a switching device, and a host bus adapter installed in the server.
Generally, the SAN can be enhanced with critical software components to achieve continuous availability. For example, manufacturers produce software that allows a single storage volume to be split across multiple remote redundant disk storage subsystems. These facilities enable remote clustering of servers, which in turn ensures service availability even in the event of complete data center failure.
Other value added software can provide the functionality to create synchronous remote mirrors of all data as the data is written, including any local mirroring for redundancy. Each data center can contain an up-to-date copy of all information. In the event of failover, the mirror copy can become the production data, and the mirror can be synchronized to the failed site when the site is restored. Most disk storage subsystem vendors also offer software to produce snapshots of data that can be used to recover accidentally deleted data, for tape archiving, or for offline processing such as reporting and analysis.
SAN disk subsystems vary remarkably in functionality and capacity. Matching application and data consolidation requirements for each customer is a critical first step in identifying an appropriate MSA IDC SAN disk subsystem and configuration.
Storage subsystems have one or more addresses on the I/O bus (called SCSI IDs), while the devices inside the subsystem are addressed by the host I/O controllers as LUNs. Each SCSI ID can have up to 8 LUN addresses. The addressing model used for SCSI bus operations is a target-LUN pair. LUN security is a feature of the storage subsystem that limits the visibility and accessibility of a LUN to certain HBAs. Access to a given LUN is granted based on the World Wide Name (WWN) of the HBA.
A server device name is bound to a specific Fibre Channel storage volume or LUN through a specific HBA and eight-byte storage port WWN. Each server HBA is explicitly bound to a storage volume or LUN and access is explicitly authorized (access is blocked by default).
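Conceptually, storage-side LUN security amounts to a lookup table keyed by HBA WWN. The following Python sketch is a simplified model; the WWNs and LUN numbers are invented, and real subsystems implement this inside the storage controller, not in host code.

# Illustrative model of storage-side LUN masking: the controller keeps a table
# of which HBA WWNs may see which LUNs, and hides everything else.
lun_masking = {
    "10:00:00:00:c9:12:34:56": {0, 1},   # server A's HBA sees LUNs 0 and 1
    "10:00:00:00:c9:ab:cd:ef": {2},      # server B's HBA sees LUN 2 only
}

def visible_luns(hba_wwn: str) -> set:
    """LUNs presented to an HBA; access is blocked by default."""
    return lun_masking.get(hba_wwn, set())

print(visible_luns("10:00:00:00:c9:12:34:56"))   # {0, 1}
print(visible_luns("10:00:00:00:c9:99:99:99"))   # set(): an unknown HBA sees nothing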
With RAID, fault tolerance can be provided by storing redundant data on an array of small, inexpensive disks. Although there are six levels of RAID architecture, the levels that are most commonly used for fault tolerance are RAID 1, RAID 0+1, and RAID 3/5.
The level of RAID used will depend on the following factors:
· Services to be hosted. In most cases, mission-critical databases will be hosted on RAID 1 or RAID 0+1, while logs and snapshot images might be placed on RAID 5 disks.
· RAID implementation cost. RAID 1 and RAID 0+1 are more expensive implementations than RAID 5 because, for the same number of physical disks, the net disk space available for use after a RAID 1 or RAID 0+1 configuration is less than that from RAID 5.
RAID Level | Relative Availability | Request Rate (Read/Write) I/O per Second | Transfer Rate (Read/Write) MB per Second | Applications
Stripe set (RAID 0) | Proportionate to the number of disk drives; worse than a single disk drive | Excellent if used with large chunk size | Excellent if used with small chunk size | High performance for non-critical data
Mirror set (RAID 1) | Excellent | Good/Fair | Good/Fair | System drives; critical files
RAID set (RAID 3/5) | Excellent | Excellent/Good | Read: Excellent; Write: Good (if used with small chunk sizes) | High request rates; read-intensive; data lookup
Striped mirror set (RAID 0+1) | Excellent | Excellent if used with large chunk size | Excellent if used with small chunk size | Any critical response-time application

Table 1. RAID characteristics
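The capacity trade-off noted above can be approximated with simple arithmetic. The Python sketch below uses the usual capacity rules of thumb for these RAID levels; the eight-drive, 72-GB example is hypothetical.

def usable_capacity_gb(disks: int, disk_gb: float, raid_level: str) -> float:
    """Approximate usable capacity for the RAID levels discussed above."""
    if raid_level in ("1", "0+1"):
        return disks * disk_gb / 2          # half the raw space holds mirror copies
    if raid_level in ("3", "5"):
        return (disks - 1) * disk_gb        # one disk's worth of parity
    if raid_level == "0":
        return disks * disk_gb              # striping only, no redundancy
    raise ValueError(f"unsupported RAID level: {raid_level}")

# Hypothetical example: eight 72 GB drives.
for level in ("0", "1", "5", "0+1"):
    print(f"RAID {level}: {usable_capacity_gb(8, 72, level)} GB usable")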
Access to data on the SAN can and should be controlled by a combination of three methods: LUN security, zoning, and persistent binding. The actual names and capabilities of these security mechanisms vary depending on the hardware vendor involved. Some terms, such as LUN masking, are used by different vendors and authors to refer to both persistent binding (host-side LUN masking) and LUN security (storage-side LUN masking).
The advantages of any-to-any connectivity afforded by SAN can be a liability unless LUN security is implemented. There are four ways in which vendors provide a measure of LUN security today. In a mixed environment with many different vendor storage devices, there may well be a need to implement some or all of these methods.
LUN security or masking techniques work adequately in a well-behaved environment, but they can be easily bypassed. The user should be aware of the shortcomings of each implementation and take additional measures to ensure security. Such measures might include an authorization process for access to these features and implementation of multiple masking techniques to ensure a cross-check between the HBAs and the storage subsystem, for example. The coordination and management of these different methods is not integrated at this time. Although the vendors provide configuration utilities that can be launched from a common desktop, the user will need to map LUN and server configurations on a spreadsheet in order to coordinate mapping across these different implementations.
Some storage subsystems can perform LUN masking within their storage controllers. The WWN of every Fibre Channel-attached host bus adapter can be mapped against the LUNs. This allows multiple host bus adapters to access different LUNs through the same storage port, independent of any intervening SAN infrastructure, such as hubs or switches.
After LUNs are added to or deleted from the mask for each host bus adapter WWN, the initiators see only the LUNs they have permission to view at boot time. The advantage of LUN masking in the storage controller is that it allows many more hosts, such as Windows 2000 servers, to attach to a given storage device through a common Fibre Channel port while still maintaining LUN security. It can work in point-to-point mode or through hubs or switches in loop or fabric mode. Because it is based on the WWN of the host bus adapter, it is independent of the physical loop or switch address. This LUN masking is implemented or checked at boot time, not with every I/O, in order to sustain high performance. However, it is possible for a user with the proper systems authorization to change configurations after boot time and bypass the mask. The mask must be remapped if a host bus adapter fails and needs to be replaced. Ease of access to the WWN of the host bus adapter depends on the operating system and driver.
Precautions must be taken to prevent unintentional access to the storage devices. For switched fabrics, this precaution would involve setting up zones. In any case, devices may implement LUN masking capabilities to prevent data corruption caused by unintended access to the storage.
Most fibre switching devices support a simple name server zoning feature, referred to as software zoning, virtual zoning, WWN zoning, or soft zoning, that partitions attached devices into restricted-access groups called zones. Devices in the same zone can recognize and communicate with each other through switched port-to-port connections. Devices in separate zones cannot see each other's name server information and cannot readily communicate with each other. Soft zoning allows a device to communicate only with other devices whose WWNs are included in the same zone. In environments where cables may be moved between ports, either for load balancing or after a hardware failure, soft zoning is required.
Hardware zoning, or port zoning, in a Fibre Channel SAN fabric provides an additional level of protection. Hardware zoning allows devices attached to certain ports on the switch to communicate only with devices attached to other ports in the same zone. After a hardware zone is established, the switch creates a table of devices that can communicate with other devices in the fabric. Only traffic from devices in this zoned list is passed to the destination; all unauthorized traffic is blocked and dropped by the switch ASIC hardware. Hardware zoning is analogous to caller ID blocking in a telephone system, where only an approved list of phone numbers can cause the telephone to ring; any number not on the list gets a busy signal and is not connected. By blocking the flow of unauthorized data and control information, hardware zoning can restrict interaction between devices within the fabric. As a result, zoning provides an additional level of assurance against potential failures. In a static environment where cables are never moved, hardware zoning is typically implemented.
Both software and hardware zoning segment the fabric into two or more separate but possibly overlapping zones.
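In either form, zoning reduces to a membership test: two devices may communicate only if some zone in the active zone set contains them both. The Python sketch below models that test; the zone names, WWNs, and port numbers are invented, and actual enforcement happens in the fabric name server and switch ASICs rather than in host code.

# Illustrative zoning check: devices may talk only if some zone contains both,
# whether members are listed by WWN (soft zoning) or by switch port (hard zoning).
zones = {
    "zone_win_servers": {"wwn:10:00:00:00:c9:12:34:56", "wwn:50:06:04:82:bf:d0:54:32"},
    "zone_backup":      {"port:1/4", "port:1/9"},
}

def can_communicate(a: str, b: str) -> bool:
    """True if any zone in the active zone set contains both members."""
    return any(a in members and b in members for members in zones.values())

print(can_communicate("wwn:10:00:00:00:c9:12:34:56",
                      "wwn:50:06:04:82:bf:d0:54:32"))            # True: same zone
print(can_communicate("wwn:10:00:00:00:c9:12:34:56", "port:1/9"))  # False: no shared zone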
System administrators create zones to increase network security measures, differentiate between operating systems, and prevent data loss or corruption by controlling access between devices (such as servers and data storage units), or between separate user groups (such as engineering or human resources). Zoning allows an administrator to establish logical subsets of closed user groups. Administrators can authorize access rights to specific zones for specific user groups, thereby protecting confidential data from unauthorized access.
Zoning can establish barriers between devices that use different operating systems. For example, it is often critical to separate servers and storage devices with different operating systems because accidental transfer of information from one to another can delete or corrupt data. Zoning prevents this by grouping devices that use the same operating systems into zones.
Zoning can establish groups of devices that are separate from devices in the rest of a fabric. Zoning allows certain processes (such as maintenance or testing) to be performed on devices in one group without interrupting devices in other groups.
Zoning can establish temporary access between devices for specific purposes. Administrators can remove zoning restrictions temporarily (for example, to perform nightly data backup), then restore zoning restrictions to perform normal processes.
Zoning is configured by authorizing or restricting access to name server information associated with devices. A zone member is specified by the port number of the port to which a device is attached, or by the eight-byte WWN assigned to the HBA or Fibre Channel interface installed in a device.
If name server zoning is implemented by port number, a change to the fibre switch fibre-optic cable configuration disrupts zone operation and may incorrectly include or exclude a device from a zone.
If name server zoning is implemented by WWN, removal and replacement of a device HBA or Fibre Channel interface (thereby changing the device WWN) disrupts zone operation and may incorrectly include or exclude a device from a zone.
Zones are grouped into zone sets. A zone set is a group of zones that is enabled (activated) or disabled across all fibre switches in a multiswitch fabric. Only one zone set can be enabled at one time.
Zone members are defined and zones or zone sets are created using proprietary fibre switch Manager applications.
Typically, a fibre switch supports any or all of the following zoning features:
· Zone members: the maximum number of members configurable for a zone varies according to the number of zones in the zone set, the length of zone names, and other factors, but is essentially bounded by the available nonvolatile random-access memory (NV-RAM) in the fibre switch.
· Number of zones: the maximum number of configurable zones in a zone set varies from manufacturer to manufacturer.
· Number of zone sets: the maximum number of configurable zone sets varies from manufacturer to manufacturer.
· Active zone set: the zone set that is active across all fibre switches in a multiswitch fabric. For the active zone set:
· When a specific zone set is activated, that zone set replaces the active zone set.
· If the active zone set is disabled, all devices attached to the fabric become members of the default zone.
· All devices not included as members of the active zone set are included in the default zone.
· Default zone: the default zone consists of all devices not configured as members of a zone contained in the active zone set. If there is no active zone set, then all devices attached to the fabric are in the default zone. For the default zone:
· The default zone is enabled or disabled separately from the active zone set.
· If the default zone is disabled and there is no active zone set, then the zoning feature is completely disabled for the fabric and no devices can communicate with each other.
· All devices are considered to be in the default zone if there is no active zone set.
· RSCN service requests: registered state change notification (RSCN) service requests are transmitted to specific ports attached to the fibre switch when the zoning configuration is changed.
· Broadcast frames: Class 3 broadcast frames are transmitted to specific ports attached to the fibre switch, regardless of zone membership.
To enhance the network security provided by zoning through the fibre switch, security measures for SANs should also be implemented at servers and storage devices.
Server-level access control is called persistent binding. Persistent binding uses configuration information stored on the server and is implemented through the server's HBA driver. The process binds a server device name to a specific Fibre Channel storage volume or LUN through a specific HBA and storage port WWN. For persistent binding, each server HBA is explicitly bound to a storage volume or LUN, and access is explicitly authorized (access is blocked by default).
Any LUN that is accessed through a port on the storage subsystem that is visible to an HBA can be mapped for use by that host's HBA. Many HBAs allow LUNs to be mapped either automatically or manually; this capability is usually a function of the HBA's device driver. The main concern when using the automap capability is that a LUN may be visible to the HBA, either intentionally or accidentally, that should not be accessible by that HBA. For this reason, most SAN implementations manually map the LUNs that are accessed by a given HBA. The term persistent binding comes from the fact that once a LUN is mapped (bound) to an HBA, the binding is retained until it is manually removed.
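Conceptually, persistent binding is a host-side table that maps an operating system device name to a storage port WWN and LUN, and that survives reboots until an administrator removes it. The Python sketch below models that table; the device names and WWN are invented for illustration, and real bindings are maintained by the HBA driver rather than by a script like this.

# Sketch of host-side persistent binding: a manually maintained table that
# binds an OS device name to a storage port WWN and LUN.
persistent_bindings = {
    # OS device name : (storage port WWN, LUN)
    "\\\\.\\PhysicalDrive2": ("50:06:04:82:bf:d0:54:32", 0),
    "\\\\.\\PhysicalDrive3": ("50:06:04:82:bf:d0:54:32", 1),
}

def resolve(device_name: str):
    """Return the bound (storage port WWN, LUN), or None if no explicit
    binding exists -- with persistent binding, unbound LUNs are not presented."""
    return persistent_bindings.get(device_name)

print(resolve("\\\\.\\PhysicalDrive2"))   # ('50:06:04:82:bf:d0:54:32', 0)
print(resolve("\\\\.\\PhysicalDrive9"))   # None: no binding, no access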
This process is compatible with open systems interconnection (OSI) standards. Generally, the following are transparently supported:
· Different operating systems and applications.
· Different storage volume managers and file systems.
· Different fabric devices, including disk drives, tape drives, and tape libraries.
· If the server is rebooted, the server-to-storage connection is automatically re-established.
· The connection is bound to a storage port WWN. If the fibre-optic cable is disconnected from the storage port, the server-to-storage connection is automatically re-established when the port cable is reconnected. The connection is automatically re-established if the storage port is cabled through a different fibre switch port.
Access control can also be implemented at the storage device as an addition or enhancement to redundant array of independent disks (RAID) controller software. Data access is controlled within the storage device, and server HBA access to each LUN is explicitly limited (access is blocked by default).
Storage-level access control:
· Provides control at the storage port and LUN level, and does not require configuration at the server.
· Supports a heterogeneous server environment and multiple server paths to the storage device.
· Is typically proprietary and protects only a specific vendor's storage devices. Storage-level access control may not be available for many legacy devices.
IDC will use a combination of LUN masking and HBA port binding to ensure the security of data within the SAN. Zoning is strongly recommended in more complex SAN architectures.
Note: Zones should be configured before HBA port binding. This configuration order will prevent the possible corruption of data should multiple HBAs gain access to the same storage device.
This chapter provides an overview of SAN technologies and insight into designing SAN architectures. Six fabric topologies are discussed; each has its own advantages and disadvantages, and the chapter provides selection criteria that depend on scalability and availability requirements.
A proper storage subsystem design is critical for optimal application performance. The chapter provides a brief description of storage subsystems and explains the various levels of RAID and their applications. Finally, the chapter discusses SAN security in depth and recommends that access to data on the SAN be controlled by a combination of three methods: LUN security, zoning, and persistent binding.