Managing unstructured data
Today’s rapid growth of unstructured data calls for efficient storage management of emerging data types, a critical consideration for keeping up with evolving business demands cost-effectively. In particular, organizations are experiencing a major shift in data composition, driven by trends such as virtualization, electronic document stores, Web 2.0 technologies and digital records retention. As a result, the capacity needed to store unstructured file data continues to escalate far beyond the capacity required for the structured, database-type data typically stored on block storage.
Now, IT leaders are looking to deploy additional file storage to meet data growth without intensifying administrative burden — for example, by simplifying data migration, backup and disaster recovery. At the same time, IT leaders must minimize capital expenditures to help the enterprise run as efficiently as possible.
Traditional file servers and network attached storage (NAS) appliances often do not address the demands of massive, high-growth data volumes. Although file servers offer an easy way to add file capacity, they have scaling limitations. They create data silos that limit access to mission-critical information, complicating day-to-day administration and increasing data center–wide management complexity. Traditional NAS appliances commonly incur performance bottlenecks as storage capacity is added and typically require a forklift upgrade every refresh cycle.
These approaches often cannot efficiently meet the requirements of growing file systems that must scale performance and capacity transparently, independently and linearly. To address these scalability challenges, IT leaders are rethinking their storage strategy and seeking tools that help them manage burgeoning file data in a simple, cost-efficient manner.
Foundation for scalable file storage
A core component of the Dell Fluid Data™ architecture, Dell Fluid File System (FluidFS) was developed from the ground up to avoid the scalability limitations associated with traditional, monolithic NAS and file servers, such as limited volume size, rigid allocation of file systems to physical volumes and siloed namespaces. FluidFS is an advanced scale-out NAS technology designed to completely separate the management of data from the underlying disks and logical units (LUNs).
The FluidFS architecture is a symmetric clustered file system with distributed metadata, native load balancing, advanced caching capabilities and a rich set of enterprise-class features. A FluidFS cluster includes up to four 2U NAS gateway appliances. Each appliance, which is Dell hardware specifically designed for FluidFS, houses a pair of redundant active-active controllers, or nodes. FluidFS is virtualized across controllers in the cluster, enabling any node to serve any file. The file system supports industry-standard protocols, including Network File System (NFS) and Common Internet File System (CIFS), to translate between client-side protocol requests and internal file system requests. The cluster connects to a shared back-end storage area network (SAN) fabric — a Dell Compellent™ Storage Center™ or Dell EqualLogic™ PS Series group (see figure).
A shared infrastructure for block-based and file-based storage enables exceptional efficiencies and cost savings. FluidFS unifies block and file data by delivering dynamic, scale-out NAS capabilities across Dell Compellent or EqualLogic SAN arrays. Through FluidFS, advanced features of the SAN back end are extended to the file storage environment: for instance, the automated tiered storage and thin provisioning capabilities of Dell Compellent arrays or the peer scaling feature of EqualLogic arrays.
FluidFS is easily configured to support the requirements of a wide variety of applications, from standard user shares to cost-effective high-density archiving to high-performance storage in compute-intensive, vertical-industry workloads. FluidFS is designed to scale capacity and performance independently and transparently, avoiding disruption of system availability. Administrators can scale capacity by adding disks to the SAN and scale performance by adding NAS appliances or SAN controllers. This approach allows organizations to purchase only the performance and capacity they need today and simply add NAS appliances, storage controllers or disk capacity to accommodate block and/or file growth as required.
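The independence of the two scaling axes can be illustrated with a toy model. The following Python sketch is not a FluidFS interface; the class and method names are hypothetical, and the only figures taken from the article are the limit of four appliances per cluster and the pair of controllers in each appliance.

```python
from dataclasses import dataclass

MAX_APPLIANCES = 4             # article: up to four NAS appliances per cluster
CONTROLLERS_PER_APPLIANCE = 2  # article: a pair of active-active controllers each


@dataclass
class NasCluster:
    """Hypothetical toy model: performance and capacity grow independently."""
    appliances: int = 1
    san_capacity_tb: float = 0.0

    @property
    def controllers(self) -> int:
        # Every appliance contributes two controllers that can serve files.
        return self.appliances * CONTROLLERS_PER_APPLIANCE

    def add_appliance(self) -> None:
        """Scale performance: add a NAS appliance, leaving capacity unchanged."""
        if self.appliances >= MAX_APPLIANCES:
            raise ValueError("cluster already has the maximum number of appliances")
        self.appliances += 1

    def add_disks(self, capacity_tb: float) -> None:
        """Scale capacity: add SAN disks, leaving the controller count unchanged."""
        if capacity_tb <= 0:
            raise ValueError("capacity_tb must be positive")
        self.san_capacity_tb += capacity_tb
```

In this model, calling `add_disks` never changes `controllers`, and calling `add_appliance` never changes `san_capacity_tb`, which is the property the paragraph above describes.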
To bolster these scale-out capabilities and meet the needs of demanding enterprise workloads, Dell is introducing the next major release of FluidFS, version 3. FluidFS v3 includes planned features such as enhanced file-protocol support, Fluid Data Reduction with policy-based deduplication and compression, and an expanded maximum namespace. (For more information, see the sidebar, “Advancing efficiency.”)
Advancing efficiency
Dell Fluid File System (FluidFS) v3, which Dell plans to release toward the end of this year, is designed to offer organizations advanced file storage capabilities that accommodate enterprise workloads.
FluidFS v3 is expected to be the industry’s only primary storage solution with policy-driven, variable block data reduction. Administrators can use this feature, called Fluid Data Reduction, to establish best-practice lifecycle management policies on a per-volume basis to enable the efficient storage of large data volumes.
Fluid Data Reduction is a policy-driven, post-process operation. After files are written to the network attached storage (NAS) appliance, they are deduplicated once they meet an administrator-defined set of criteria. System administrators also have the option to compress file data after it is deduplicated. The post-process implementation is designed to align data reduction with the aging of files, as defined by the administrators. As a result, Fluid Data Reduction in FluidFS v3 is designed to impose little or no performance overhead on active data I/O. Fluid Data Reduction utilizes advanced variable-block/sliding-window deduplication technology and Level Zero Processing System (LZPS) compression, an algorithm that maximizes throughput while minimizing system resource consumption.
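To make the general technique concrete, the following Python sketch shows one common way to build policy-gated, variable-block deduplication with a sliding-window hash. It is a minimal illustration under stated assumptions, not Dell's implementation: every name and constant is hypothetical, the age-based policy is only one example of an administrator-defined criterion, and zlib stands in for LZPS, whose details are not described here.

```python
import hashlib
import zlib
from dataclasses import dataclass

# All constants are illustrative assumptions, not FluidFS settings.
WINDOW = 16        # bytes covered by the sliding window
MASK = 0x3F        # cut a chunk when the low 6 hash bits are zero
MIN_CHUNK = 32     # never cut a chunk shorter than this
MAX_CHUNK = 1024   # always cut a chunk that reaches this size
BASE = 257
MOD = (1 << 31) - 1


@dataclass
class ReductionPolicy:
    """Hypothetical policy: reduce a file only after it has aged."""
    min_age_days: float

    def eligible(self, last_modified: float, now: float) -> bool:
        # Times are seconds since the epoch; 86400 seconds per day.
        return now - last_modified >= self.min_age_days * 86400


def split_chunks(data: bytes):
    """Yield variable-size chunks whose boundaries depend on content."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    start = 0
    h = 0
    for i, byte in enumerate(data):
        pos = i - start
        if pos >= WINDOW:
            # Slide the window: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        size = pos + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


class ChunkStore:
    """Stores each unique chunk once, optionally compressed (zlib stand-in)."""

    def __init__(self, compress: bool = True):
        self.compress = compress
        self.chunks = {}  # SHA-256 digest -> stored chunk bytes

    def put(self, data: bytes) -> list:
        """Deduplicate data and return the list of digests that rebuilds it."""
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = zlib.compress(chunk) if self.compress else chunk
            recipe.append(digest)
        return recipe

    def get(self, recipe) -> bytes:
        """Reassemble the original bytes from a recipe."""
        parts = (self.chunks[d] for d in recipe)
        if self.compress:
            parts = (zlib.decompress(p) for p in parts)
        return b"".join(parts)
```

Because boundaries are chosen from the content inside the sliding window rather than from fixed offsets, inserting a few bytes near the start of a file typically shifts only the first chunk or two; later chunks tend to realign and deduplicate against the stored copies. A post-process job in this sketch would call `ReductionPolicy.eligible` for each file and pass only the eligible files to `ChunkStore.put`, leaving recently written data untouched.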
Other features planned for FluidFS v3 include the following:
- Expanded multiprotocol support for Server Message Block (SMB) 2.1 and Network File System (NFS) v4 with the MIT Kerberos™ v5 network authentication protocol
- Thin-volume cloning
- Multiple back-end storage area network (SAN) and client-side connectivity options
- Access-based enumeration at file and directory levels
- Nondisruptive scaling to as many as four dual-controller appliances and as much as 2 PB within a single namespace in the Dell Compellent FS8600 scale-out NAS
All enhancements are included with the upgrade, which is available at no additional cost to existing owners of Dell Compellent and EqualLogic storage; licenses do not need to be repurchased.
Enterprise-class file solution that supports diverse workloads
Because of its flexible, scalable architecture, a FluidFS-based storage solution can serve a wide range of applications, including traditional NAS workloads, performance-intense workloads and high-capacity workloads.
File server consolidation
Many organizations use FluidFS to consolidate file servers and/or traditional NAS devices on a single FluidFS-based platform. Unifying block and file storage helps increase storage utilization, centralize storage management and streamline backup and recovery. By mitigating server sprawl and reducing the time spent on system management, many organizations are able to minimize operational expenditures and free resources for mission-critical tasks.
Electronic design automation
Demanding electronic design automation (EDA) workloads require file storage that can keep up with the large number of EDA data files generated by many users who access shared data throughout the process of designing electronic systems such as integrated circuits or printed circuit boards. EDA users create and share massive libraries of files that can grow extremely large as engineers and designers work together to complete a project. The intelligent caching of FluidFS lends itself to this type of workload by dynamically adjusting the size of the shared read/write cache, helping reduce latency.
Animation and special effects
In the media and entertainment industry, high-capacity and high-bandwidth performance is essential to efficiently complete animated features or special-effects scenes. Continual competition to enhance scene details and support novel viewing models such as 3D applies intense pressure on the scale and performance of the rendering infrastructure. Advanced storage technology can help keep rendering times low while accommodating dramatically expanding data scale. FluidFS-based NAS solutions are well suited to meet these requirements. End users can scale up a FluidFS cluster by adding storage to the back end and scale out the cluster by adding NAS appliances — thus accelerating performance.
Healthcare and life sciences
Research and clinical healthcare organizations using medical imaging technology such as picture archiving and communication systems (PACSs) need high-capacity systems for storing and retrieving large medical files. Providers can consolidate disparate storage systems into a single, scalable FluidFS-based platform with the ability to capture thousands of patient files. These patient records are then readily available to doctors from their workstations; multiprotocol support enables fast access for NFS and CIFS clients.
The same healthcare organizations also may invest in biotech applications such as sequencing, proteomics and metabolomics. Processes such as DNA sequence assembly and protein-folding simulation create novel storage challenges because of the heavy processing required. These processes not only produce massive data volumes, but also need gigabyte-per-second performance for analysis in clustered compute environments — both requirements that are easily addressed by the flexible FluidFS architecture. Organizations can easily configure a FluidFS NAS system to meet performance, capacity and budget requirements for biotech workflows.
Virtual desktop infrastructure
A virtual desktop infrastructure (VDI) allows employees to easily access organizational resources without compromising IT security. Organizations looking to enhance virtual machine performance and end-user productivity can separate user data from desktop virtual machines on a FluidFS-based platform. The virtual machines can be placed on block storage for high performance and the user data on file storage for granular data protection.
Designed for performance and scalability
For organizations facing ever-growing streams of unstructured data, Dell FluidFS is designed to go beyond the limitations of traditional file systems. Its flexible architecture helps add nondisruptive scale-out and scale-up NAS capabilities to Dell Compellent and EqualLogic storage. This scalability allows organizations to keep pace with growth while avoiding the risk and expense of forklift upgrades. By growing storage in step with the business, organizations can leverage their existing infrastructures and minimize capital expenditures.
Furthermore, not only can IT teams manage diverse data sources efficiently, they can also meet the needs of business users by putting the data to work. FluidFS leverages advanced data management features of Dell Compellent and EqualLogic storage, enabling data to be tiered according to value and processed in a way that keeps it readily available to inform business decisions. By maximizing performance and streamlining data management, storage solutions based on FluidFS help organizations gain control of data, minimize complexity and cost-effectively meet ever-changing, ever-expanding demands.
Julita Kussmaul is a senior marketing manager in the Dell Enterprise Infrastructure Solutions Marketing Group.
Emily Rund is the product marketing manager for the Dell NAS portfolio and leads go-to-market strategy for the FluidFS product portfolio.
Oliver Kaven is a senior product manager for enterprise storage in the Dell Product Group.
David Stevens is a NAS customer architect for the Dell NAS portfolio, focusing on architecting complex NAS infrastructures.