IBM Israel Research Seminars
 
Storage devices are becoming cheaper, but reliable and highly available storage systems are still expensive. Moreover, as long as any given ECC group resides in a single box, it is susceptible to failures that affect the entire box. This problem can be overcome by a multi-box RAID comprising a controller that is connected to multiple target "boxes", with each ECC group comprising at most one block from any given box. However, retaining performance despite the use of an external controller remains a challenge. In addition, retaining the same size of storage box (for cost-effectiveness) requires the controller to manage more storage and activity, resulting in a scalability challenge.
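The placement constraint above (at most one block of an ECC group per box) can be illustrated with a minimal sketch. The rotating layout and the names `place_group` and `one_block_per_box` are hypothetical and chosen only for illustration; the abstract does not say which layout the prototype uses.

```python
def place_group(group_index, group_size, num_boxes):
    """Return the box index for each block of one ECC group.

    Hypothetical rotating layout: consecutive blocks of a group go to
    consecutive boxes, starting at a box that depends on the group index.
    """
    if group_size > num_boxes:
        raise ValueError("an ECC group cannot have more blocks than there are boxes")
    start = group_index % num_boxes
    return [(start + i) % num_boxes for i in range(group_size)]


def one_block_per_box(placement):
    """True if no box holds more than one block of the group."""
    return len(set(placement)) == len(placement)


if __name__ == "__main__":
    # 4-block groups (e.g. 3 data + 1 parity) spread over 6 boxes.
    for g in range(8):
        boxes = place_group(g, group_size=4, num_boxes=6)
        print(g, boxes, one_block_per_box(boxes))
```

With such a layout, the failure of a whole box removes at most one block from each group, which a single-parity code can tolerate.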
iSCSI over iSER is an extension of iSCSI that splits the control and data paths. It also takes advantage of an RDMA mechanism (provided, for example, by InfiniBand) for data transfers, while sending of control messages is left unchanged. This inter-box communication solution, which we use as a baseline for comparison, is a candidate substitute for the intra-box DMA, but leaves two problems unsolved:
- All data to/from hosts passes through the controller, rendering the controller a communication bottleneck.
- ECC calculations are carried out in the controller, requiring additional data transfers between the controller and the disks, further aggravating the controller bottleneck problem.
Our TPT-MB RAID jointly addresses the aforementioned challenges. The main ideas are:
- A multi-box RAID that uses iSCSI over iSER as described above. Specifically, RDMA over InfiniBand is used for data transfers.
- Separation of the data path from the control path, permitting data to be transferred directly between hosts and targets as well as among targets. To this end, we have extended iSCSI over iSER by introducing a third-party transfer mechanism. With this, one iSCSI entity (the controller) instructs a second iSCSI entity (target or host) to read data from, or write data to, a third iSCSI entity.
- ECC calculations (parity update etc.) are carried out by the target boxes under management of the controller.
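The interplay of these ideas can be sketched with a toy in-memory model. It assumes a single-parity code and a delta-based small-write update (new parity = old parity XOR old data XOR new data); the abstract does not specify the actual code or message flow, so the classes `Target` and `Controller` and their methods are hypothetical. The point of the sketch is only that the controller issues commands while data and parity deltas move between the other entities.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class Target:
    """A storage box that stores blocks and can XOR locally (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_and_diff(self, addr, new_data):
        # Store the new data and return old XOR new, computed on the target.
        old = self.blocks.get(addr, bytes(len(new_data)))
        self.blocks[addr] = new_data
        return xor(old, new_data)

    def apply_delta(self, addr, delta):
        # Fold a delta received from another target into the local parity block.
        old = self.blocks.get(addr, bytes(len(delta)))
        self.blocks[addr] = xor(old, delta)


class Controller:
    """Issues commands only; no payload is ever copied into the controller."""

    def __init__(self):
        self.commands_sent = 0
        self.data_bytes_seen = 0  # never incremented in this model

    def small_write(self, host_buffer, data_target, parity_target, addr):
        # Command 1: the data target fetches the block directly from the host.
        self.commands_sent += 1
        delta = data_target.write_and_diff(addr, host_buffer)
        # Command 2: the data target sends the delta directly to the parity target.
        # (The local variable stands in for a target-to-target transfer.)
        self.commands_sent += 1
        parity_target.apply_delta(addr, delta)


if __name__ == "__main__":
    d0, d1, p = Target("d0"), Target("d1"), Target("p")
    ctrl = Controller()
    ctrl.small_write(b"\x01\x02", d0, p, addr=0)
    ctrl.small_write(b"\x10\x20", d1, p, addr=0)
    ctrl.small_write(b"\xff\x00", d0, p, addr=0)  # overwrite an existing block
    print(p.blocks[0] == xor(d0.blocks[0], d1.blocks[0]))
    print(ctrl.commands_sent, ctrl.data_bytes_seen)
```

In this model the parity block stays equal to the XOR of the data blocks after every write, while the controller's byte counter stays at zero.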
Unlike the aforementioned baseline approach, commands and data thus follow different physical paths rather than merely using different communication semantics over the same paths. This leaves the controller out of the main data path, thereby sharply mitigating the bottleneck and enhancing scalability while retaining the simplicity of centralized control. In summary, we have successfully extended the idea of out-of-band controllers that manage multiple boxes to the intra-RAID level, as demonstrated by our proof-of-concept prototype.
Joint work with Yitzhak Birk.
 
- Speaker: Erez Zilber, Technion
- Time: 28/02/2006, 10:00 AM - 11:00 AM
