Hydra FS High Throughput FS

HydraFS: a High-Throughput File System for the HYDRAstor Content-Addressable Storage System

10/28/22

Authors
• Cristian Ungureanu
• Benjamin Atkin
• Akshat Aranya
• Salil Gokhale
• Stephen Rago
• Grzegorz Calkowski
• Cezary Dubnicki
• Aniruddha Bohra

Contents
• HYDRAstor Introduction
• Content-Addressable Storage (CAS)
• Problems and Solutions for CAS
• Challenges for the new implementation
• Filesystem layout and software architecture
• Read/write processing, metadata cleaning, deletion
• Conclusion

HYDRAstor Introduction
• Multi-node CAS system
• Stores blocks at configurable redundancy levels
• Supports high-throughput reads and writes for large blocks
• Used for:
  • Backup solutions
  • Disaster recovery (DR) solutions
  • Data archiving

NEC HYDRAstor
• HYDRAstor architecture introduced in March 2007
• R&D conducted in the USA, Japan, Germany, and China
• High-performance, capacity-optimized, highly available storage for enterprise backup, long-term data archiving, and DR

Competitors
• EMC Corporation
  • Centera, Data Domain
• Hewlett-Packard
  • Information Access Platform (RISS)
• Hitachi Data Systems
  • Content Archive Platform (HCAP)
• IBM
  • DR550

Content-Addressable Storage
• A Content-Addressable Storage (CAS) system:
  • Eliminates duplicate blocks
  • Gives high throughput for streaming access
• Once an object is saved, it cannot be deleted until its retention period expires
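
To make the duplicate-elimination point concrete, here is a minimal in-memory sketch of content addressing. It is not HYDRAstor's API: the class name, the method names, and the choice of SHA-256 are all assumptions made for illustration, since the slides do not say how addresses are computed.

```python
import hashlib


class ContentAddressableStore:
    """Toy in-memory CAS (hypothetical): a block's address is the hash of its bytes."""

    def __init__(self):
        self._blocks = {}  # hex digest -> block bytes

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumption; any collision-resistant hash would do here.
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored only once.
        self._blocks.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def block_count(self) -> int:
        return len(self._blocks)


store = ContentAddressableStore()
a1 = store.put(b"backup chunk")
a2 = store.put(b"backup chunk")   # duplicate content, same address
a3 = store.put(b"another chunk")
```

Because the address is derived from the content, a block cannot be changed in place: writing different bytes always yields a different address. That is the immutability the following slides refer to.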

Problems with CAS Systems
• No standardized API, which is a barrier to using CAS with existing applications
• Applications have to be rewritten to use the CAS-specific API
• Applications have to deal with the unique characteristics of CAS:
  • Immutability of blocks
  • High latency of operations

Solution
• Build the HydraFS file system on top of the CAS system to give applications a standard file system interface
• Supports distributed CAS systems

[Figure: layered diagram with App 1, App 2, and App 3 on top of HydraFS, which sits on top of the CAS system]

Solution (cont.)
• HydraFS presents a standard file system interface, so applications can use the CAS system effectively without modification
• Gains storage performance by matching the design to the applications' most common access patterns
• HydraFS achieves 82–100% of HYDRAstor's throughput for streaming reads and writes

Challenges
• Immutable blocks
• High latency
• Chunking algorithm to determine block boundaries

Filesystem Design Achievements
• High throughput for sequential reads and writes
• Minimized dependent I/O
• Guarantees data availability
• Supports both local and remote FS access

Filesystem Layout

• Designed as a directed acyclic graph (DAG)
• The root is a searchable block that holds the superblock
• The superblock holds the imap and the current FS version
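
The layout above can be sketched as a small hash-linked structure. This is an illustration only: the JSON node format, the `put_node` and `make_version` helpers, and SHA-256 addressing are assumptions, and the searchable root block (which is located by key rather than by content) is not modeled.

```python
import hashlib
import json

blocks = {}  # address -> bytes; stands in for the block store


def put(data: bytes) -> str:
    address = hashlib.sha256(data).hexdigest()
    blocks.setdefault(address, data)
    return address


def put_node(node: dict) -> str:
    # Serialize deterministically so equal nodes get equal addresses.
    return put(json.dumps(node, sort_keys=True).encode())


def make_version(files: dict, version: int) -> str:
    """Build data -> inode -> imap -> superblock bottom-up; return the superblock address."""
    imap = {}
    for inode_number, file_chunks in files.items():
        data_addresses = [put(c) for c in file_chunks]
        imap[str(inode_number)] = put_node({"data": data_addresses})
    imap_address = put_node({"inodes": imap})
    return put_node({"imap": imap_address, "version": version})


v1 = make_version({1: [b"aaa", b"bbb"], 2: [b"ccc"]}, version=1)
count_after_v1 = len(blocks)
# Only file 2 changes; file 1's data blocks and inode are shared with version 1.
v2 = make_version({1: [b"aaa", b"bbb"], 2: [b"ddd"]}, version=2)
count_after_v2 = len(blocks)
```

Since every parent embeds its children's addresses, changing one data block forces new blocks along the path up to the superblock, while everything off that path is reused. The structure is a DAG rather than a tree because the same block can be referenced from several versions.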

Software Architecture

• Implemented as user-level processes
• File server manages the FS interface
• Commit server generates new FS versions

Write Processing
• File server buffers file write data
• Applies a content-defined chunking algorithm
• Creates new variable-size blocks, marked 'dirty'
• Writes dirty blocks to disk
• Uncommitted block table: modified FS metadata
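
The slides do not say which chunking algorithm HydraFS uses. As a stand-in, the sketch below uses a gear-style rolling hash to pick block boundaries from the content itself; the table seed, the size limits, and the function name `chunk` are all assumptions.

```python
import random

rng = random.Random(0)
GEAR = [rng.getrandbits(32) for _ in range(256)]  # one random 32-bit value per byte value

MIN_CHUNK = 32       # assumed limits, chosen small for illustration
MAX_CHUNK = 256
BOUNDARY_BITS = 6    # a boundary fires when the top 6 hash bits are zero


def chunk(data: bytes) -> list:
    """Split data into variable-size chunks at content-defined boundaries."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the 32-bit value, so h depends only on
        # the most recent 32 bytes of content.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (h >> (32 - BOUNDARY_BITS)) == 0
        if at_boundary or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


data = bytes(rng.getrandbits(8) for _ in range(2000))
chunks = chunk(data)
```

Because boundaries depend on nearby bytes rather than on fixed offsets, an insertion early in a file tends to disturb only the chunks around it, which leaves later chunks eligible for duplicate elimination.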

Metadata Cleaning
• Commit server creates the new FS superblock
• After that, the file server can clean its dirty metadata
• This generates a new version of the FS

Admission Control
• Mechanism to control memory usage
• Defines a memory cost for each object type, e.g. data blocks, inodes
• New events are admitted when memory is available; otherwise the event blocks
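
One way to picture this mechanism is a shared memory budget guarded by a condition variable. The class name, the per-object costs, and the budget below are hypothetical; the slides give no concrete figures.

```python
import threading

# Hypothetical per-object memory costs in arbitrary units.
COST = {"data_block": 60, "inode": 10}


class AdmissionController:
    """Admit an event only if its memory cost fits in the remaining budget."""

    def __init__(self, budget: int):
        self._available = budget
        self._cond = threading.Condition()

    def admit(self, cost: int, timeout=None) -> bool:
        with self._cond:
            # Blocks until enough memory is free, or until the timeout expires.
            admitted = self._cond.wait_for(lambda: self._available >= cost, timeout)
            if admitted:
                self._available -= cost
            return admitted

    def release(self, cost: int) -> None:
        with self._cond:
            self._available += cost
            self._cond.notify_all()  # wake any events waiting for memory


controller = AdmissionController(budget=100)
first = controller.admit(COST["data_block"])                  # fits in the budget
second = controller.admit(COST["data_block"], timeout=0.05)   # would block; times out
controller.release(COST["data_block"])
third = controller.admit(COST["data_block"])                  # fits again after release
```

The timeout exists only so the example terminates in a single thread; in a server, a blocked event would simply wait until another event releases its memory.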

Read Processing
• Read-ahead metadata is fetched into memory
• Content addresses for the requested range are retrieved from the inode's B-tree
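
The range lookup can be sketched with a sorted list standing in for the inode's B-tree. The offsets, the addresses, and the `read_ahead` parameter are made up for illustration; a real B-tree would give the same ordered lookup without keeping every key in one array.

```python
import bisect

# Stand-in for the inode's B-tree: sorted chunk start offsets and,
# in parallel, the content address of each chunk (hypothetical values).
starts = [0, 100, 250, 400, 700]
addresses = ["addr0", "addr1", "addr2", "addr3", "addr4"]
FILE_SIZE = 900


def addresses_for_range(offset: int, length: int, read_ahead: int = 0) -> list:
    """Return the content addresses covering [offset, offset + length + read_ahead)."""
    end = min(offset + length + read_ahead, FILE_SIZE)
    if offset >= end:
        return []
    first = bisect.bisect_right(starts, offset) - 1  # chunk containing offset
    last = bisect.bisect_left(starts, end)           # first chunk starting at or after end
    return addresses[first:last]


plain = addresses_for_range(120, 50)
with_read_ahead = addresses_for_range(120, 50, read_ahead=300)
```

With read-ahead, the lookup already returns the addresses of the next chunks, so their fetches can be issued before the application asks for them. That matters on a store where each operation has high latency.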

Deletion
• Removes the pointer to the data block from the namespace; the storage space is not reclaimed at that point
• After a new FS version is created, the old version is marked for deletion
• The number of retained FS versions is configurable
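
A rough sketch of version retention follows. The slides do not describe how the underlying store reclaims space, so the set-difference step is only an assumption about what "reclaimable" would mean, and `VersionHistory` and `reclaimable` are hypothetical names.

```python
from collections import deque


class VersionHistory:
    """Keep the newest `retain` FS versions; mark older ones for deletion."""

    def __init__(self, retain: int):
        self.retain = retain
        self.live = deque()
        self.marked_for_deletion = []

    def commit(self, version: int) -> None:
        self.live.append(version)
        while len(self.live) > self.retain:
            self.marked_for_deletion.append(self.live.popleft())


def reclaimable(version_blocks: dict, history: VersionHistory) -> set:
    """Blocks referenced only by versions marked for deletion."""
    still_needed = set()
    for v in history.live:
        still_needed |= version_blocks[v]
    candidates = set()
    for v in history.marked_for_deletion:
        candidates |= version_blocks[v]
    return candidates - still_needed


# Hypothetical block sets reachable from each version.
version_blocks = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "d"}}
history = VersionHistory(retain=2)
for v in (1, 2, 3):
    history.commit(v)
free_candidates = reclaimable(version_blocks, history)
```

Block "a" survives because retained versions still reference it; only "b", reachable solely from the deleted version 1, becomes a candidate for reclamation.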

Conclusion
• HydraFS outcome:
  • High-throughput read/write access
  • High duplicate elimination rate
• Evaluation results:
  • Efficient: up to 82% of HYDRAstor throughput for reads and 100% for writes
• Best suited for:
  • Backup applications and repositories

Thank you…

Questions?
