AmpliStor is an Optimized Object Storage system designed specifically for the world’s largest Big Data applications. It was purpose-built to be the most durable, scalable and easy-to-manage storage system for applications that create and manage multiple petabytes of data, consisting of large media objects such as images, video and audio files for online and archive applications. The system can also efficiently store large backup payloads, simulation output and scientific big data archives from genomics, manufacturing and research. Regular files can be stored as objects along with optional user-defined metadata. AmpliStor handles Big Data efficiently by solving key problems related to large-scale storage:

  • Scaling to petabytes and beyond
    • Scaling capacity and performance seamlessly beyond the limits of a single host
    • Removing the limitations of file system based data organization & management
  • Providing “unbreakable” storage durability on high-density drives at low cost
    • Avoiding RAID-related problems: long rebuild times, poor performance, limited failure protection, and corruption or data loss induced by bit errors
    • Avoiding the expense of storing multiple full copies of data (mirrored RAID or 3 copies)
    • Reducing or eliminating the need for tape backups for large-scale data
  • Containing rising operational costs
    • Curbing constantly increasing administrative and operational costs for data storage
    • Enabling each administrator to manage 10-20x more capacity
    • Reducing the environmental cost of power and cooling requirements

The system solves the scalability problem imposed by common file system limitations. It provides the highest levels of storage durability, enabled by the BitSpread erasure-coding algorithm. BitSpread protects data against any number of disk, storage module, rack or site-level failures, and it preserves data integrity in the face of component failures or disk-related bit rot (bit errors). BitSpread also reduces the storage footprint by half or more compared to other “high-durability” storage schemes such as triple mirroring or mirrored RAID. Combined with an automated, self-healing architecture and a commodity-based platform, this yields a low Total Cost of Ownership for large-scale disk-based storage.
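The footprint comparison above can be checked with a little arithmetic. The sketch below is only an illustration: the function names and the example policy (16 blocks per object, any 4 of which may be lost) are assumptions chosen for the example, not documented AmpliStor settings.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw capacity consumed per unit of user data when keeping full copies."""
    return Fraction(copies)


def erasure_overhead(total_blocks: int, tolerated_losses: int) -> Fraction:
    """Raw capacity per unit of user data for an erasure-coded spread.

    An object is encoded into `total_blocks` blocks, and any
    `total_blocks - tolerated_losses` of them suffice to rebuild it.
    """
    needed = total_blocks - tolerated_losses
    if needed <= 0:
        raise ValueError("at least one block must be required to decode")
    return Fraction(total_blocks, needed)


# Hypothetical policy: 16 blocks, up to 4 may be lost (12 needed to decode).
triple_mirror = replication_overhead(3)
spread = erasure_overhead(16, 4)
saving = 1 - spread / triple_mirror

print(f"triple mirroring overhead: {float(triple_mirror):.2f}x")
print(f"erasure-coded overhead:    {float(spread):.2f}x")
print(f"footprint reduction:       {float(saving):.0%}")
```

Under these assumed parameters the erasure-coded layout needs about 1.33x raw capacity versus 3x for triple mirroring, which is consistent with the "half or more" reduction stated above.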

Scale-out architecture

AmpliStor is built on a scale-out architecture consisting of high-density, low-power Storage Nodes and high-performance Storage Controller nodes, connected over a 10GbE/1GbE network fabric. Controller nodes give applications fully shared access to the global storage pool. Applications access AmpliStor over a variety of language-specific object interfaces as well as industry-standard HTTP/REST protocols, all with the full performance of 10GbE network interfaces. This allows the system to scale out both throughput and capacity to meet any application requirements. AmpliStor Controllers provide the patent-pending BitSpread encoder, a scalable and durable metadata store, and the AmpliStor management framework. The back-end storage pool can be spread over as many Storage Nodes as are required to meet the application’s capacity requirements, with the system supporting an essentially unbounded number of Storage Nodes. The system automatically balances storage allocation for optimal utilization and performance, and to provide the highest levels of data durability (protection against both data loss and data corruption).

Performance

AmpliStor provides high throughput for big file applications. Each AmpliStor Controller can individually deliver up to 750 MB/sec of aggregate throughput over two 10GbE ports when paired with at least 16 AmpliStor Storage Nodes (containing a total of 160 SATA spindles). In a standard rack, with three Controllers and thirty-two Storage Nodes, the system delivers 2.1 GB/sec of aggregate throughput. Systems can be scaled up to multiple racks with many more Controllers to deliver the level of throughput that applications require. This performance makes AmpliStor an ideal fit for many big file archiving applications.
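As a rough sizing aid, the per-rack figure above can be turned into a rack count for a target throughput. This sketch assumes throughput scales linearly with the number of standard racks and that 1 GB = 1000 MB; both are simplifying assumptions, and the function name is illustrative.

```python
# Figure taken from the text: a standard rack (three Controllers,
# thirty-two Storage Nodes) delivers 2.1 GB/sec of aggregate throughput.
RACK_THROUGHPUT_MB_S = 2100


def racks_needed(target_mb_s: int) -> int:
    """Smallest number of standard racks meeting the target throughput.

    Assumes linear scaling across racks, which is a simplification.
    """
    if target_mb_s <= 0:
        raise ValueError("target throughput must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_mb_s // RACK_THROUGHPUT_MB_S)


print(racks_needed(5000))
```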

BitSpread Encoder for Storage Durability

AmpliStor features patent-pending BitSpread erasure coding technology, which was designed as a highly durable and available replacement for traditional RAID5 and RAID6. Due to growing disk densities, RAID can no longer meet the data reliability requirements of modern storage infrastructures, and users have turned to managing multiple copies of files to work around these shortcomings. BitSpread provides far better durability than RAID on multi-terabyte drives and is considerably more efficient than replication or maintaining multiple file copies. For example, BitSpread not only protects data against any number of simultaneous disk or Storage Node failures (it is not limited to just 2 failures as RAID6 is), but also protects every stored object against thousands of bit errors, with full data integrity and availability. BitSpread can also “Geo-Spread” a single instance of data geographically across multiple data centers, tolerate the failure of (for example) 1 in 3 data centers, and still provide full data availability.
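The data center example can be reasoned about with a simple count: if an object's encoded blocks are spread evenly over the sites, the object stays readable as long as the surviving sites still hold enough blocks to decode it. The sketch below checks that condition. The policy used (18 blocks, any 12 sufficient) and the function name are hypothetical, chosen only to illustrate the reasoning.

```python
def survives_site_loss(total_blocks: int, needed_blocks: int,
                       sites: int, failed_sites: int = 1) -> bool:
    """Return True if an object remains decodable after sites fail.

    Blocks are assumed to be distributed as evenly as possible, and the
    worst case (the fullest sites fail) is considered.
    """
    base, extra = divmod(total_blocks, sites)
    per_site = [base + (1 if i < extra else 0) for i in range(sites)]
    lost = sum(sorted(per_site, reverse=True)[:failed_sites])
    return total_blocks - lost >= needed_blocks


# Hypothetical policy: 18 blocks over 3 sites (6 each), 12 needed to decode.
print(survives_site_loss(18, 12, sites=3))   # one site down, 12 blocks remain
print(survives_site_loss(18, 13, sites=3))   # 13 needed, only 12 remain
```

With these assumed numbers a single stored instance consumes 18/12 = 1.5x raw capacity while still tolerating the loss of one of three sites.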

BitSpread takes a different approach to how data is written to disk: it first splits and encodes each data object into many sub-blocks, which are in turn spread over as many disks in the system as possible. Because the system requires only a subset of the sub-blocks to restore the original data object, it can survive the failure of multiple disks or even entire Storage Nodes. BitSpread dynamically spreads data across the entire storage pool, with the aim of maximizing data durability while balancing allocation, resources and performance.
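The "any sufficient subset restores the object" property can be demonstrated with a textbook erasure code. The sketch below is not the BitSpread algorithm, whose internals are not described here. It substitutes a small Reed-Solomon-style polynomial code over the prime field of size 257, and all names in it are illustrative. The k data bytes are treated as values of a polynomial at x = 0..k-1, and the n sub-blocks are that polynomial's values at x = 0..n-1, so any k sub-blocks determine the polynomial and therefore the data.

```python
import random

P = 257  # prime modulus; large enough to hold any byte value (0..255)


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: bytes, n: int):
    """Encode k = len(data) bytes into n sub-blocks, each an (x, value) pair."""
    k = len(data)
    if not 0 < k <= n <= P:
        raise ValueError("need 0 < k <= n <= 257")
    points = list(enumerate(data))
    return [(x, lagrange_eval(points, x)) for x in range(n)]


def decode(blocks, k: int) -> bytes:
    """Rebuild the original k bytes from any k surviving sub-blocks."""
    if len(blocks) < k:
        raise ValueError("not enough sub-blocks to restore the object")
    subset = blocks[:k]
    return bytes(lagrange_eval(subset, x) for x in range(k))


data = b"BitSpread"                     # k = 9 data bytes
blocks = encode(data, n=13)             # 13 sub-blocks, up to 4 may be lost
survivors = random.sample(blocks, 9)    # simulate losing any 4 sub-blocks
assert decode(survivors, k=len(data)) == data
```

A production system would encode large objects in many such stripes and place each sub-block on a different disk or node. This toy version handles one short stripe and keeps every sub-block in memory.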