XtreemFS

Hardware virtualization lets several users share hardware devices, which increases utilization and reduces the cost for each user. Most cloud providers offer storage services with guaranteed capacity and performance for random access patterns. Performance guarantees for streaming access are much less common, and a reservation sized for random access performs sub-optimally when it is used for streaming access. In HARNESS, we implemented a more general storage reservation system in which applications describe their storage requirements in detail. An application can ask for a volume optimized for random or streaming access, and it can also specify performance and capacity.

Storage devices come with a fixed set of performance properties. For example, a hard disk with 4 TB of capacity can perform about 120 random access operations per second (IOPS). To fully utilize that device, applications must reserve about 33 GB (4 TB / 120) of storage capacity for each reserved IOPS. Any deviation from this ratio wastes either capacity or IOPS. For streaming access, a hard disk can easily achieve 150 MB/s for a wide range of reservation sizes. SSDs, on the other hand, are much more flexible: they can easily mix random and streaming access on the same device.
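The capacity-to-IOPS ratio for the hard disk can be checked with a short calculation. The following is a minimal sketch that uses only the figures quoted in the text (4 TB, 120 IOPS); the function names are hypothetical, and filling the disk with identical reservations is a simplifying assumption.

```python
# Figures from the text: a 4 TB hard disk with about 120 random IOPS.
DISK_CAPACITY_GB = 4000.0
DISK_RANDOM_IOPS = 120.0


def balanced_capacity_gb(reserved_iops):
    """Capacity that consumes the same share of the disk as the reserved IOPS."""
    return reserved_iops * DISK_CAPACITY_GB / DISK_RANDOM_IOPS


def stranded_resources(reserved_gb, reserved_iops):
    """Return (stranded_gb, stranded_iops) when the disk is filled with
    identical reservations of the given shape (assumption: fractional
    reservation counts are allowed, to keep the arithmetic simple)."""
    capacity_share = reserved_gb / DISK_CAPACITY_GB
    iops_share = reserved_iops / DISK_RANDOM_IOPS
    # The resource that runs out first limits how many reservations fit.
    count = 1.0 / max(capacity_share, iops_share)
    stranded_gb = DISK_CAPACITY_GB - count * reserved_gb
    stranded_iops = DISK_RANDOM_IOPS - count * reserved_iops
    return stranded_gb, stranded_iops


if __name__ == "__main__":
    print(balanced_capacity_gb(1))      # about 33.3 GB per reserved IOPS
    print(stranded_resources(100, 1))   # capacity-heavy: IOPS are left unused
    print(stranded_resources(10, 1))    # IOPS-heavy: capacity is left unused
```

A reservation of 100 GB per IOPS fills the disk after 40 reservations and leaves 80 IOPS unused, while 10 GB per IOPS exhausts the IOPS after 120 reservations and leaves 2800 GB unused.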

Approach

In HARNESS, we extended XtreemFS, a cloud file system[1], with mechanisms for virtualising and reserving storage resources. Before HARNESS, XtreemFS had no notion of performance for storage devices or performance isolation for multi-tenancy scenarios. For HARNESS, we developed a resource allocation service and scheduler. In addition, we extended the storage servers with mechanisms for performance isolation.

File access with XtreemFS using a FUSE/CBFS client

When users want to reserve storage, they have to specify the capacity and the expected performance. For performance, they can specify the throughput and the access pattern of their application (e.g., streaming or random access). XtreemFS will reserve the necessary resources from a pool of different devices (e.g., solid state disks, rotating disks, or large RAID systems). Taking existing reservations and the requirements of the new reservation into account, it will reserve storage and performance capacity on the most suitable devices.
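To make the placement step concrete, here is a minimal sketch of a best-fit choice over a device pool. It is not the XtreemFS scheduler: the `Device` and `Request` types, the `reserve` function, and the scoring rule are all hypothetical, and treating a device's random and streaming budgets as independent is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    name: str
    free_gb: float          # unreserved capacity
    free_iops: float        # unreserved random-access budget
    free_stream_mbs: float  # unreserved streaming budget (assumed independent)


@dataclass
class Request:
    capacity_gb: float
    pattern: str        # "random" or "streaming"
    performance: float  # IOPS for "random", MB/s for "streaming"


def _budget(dev: Device, req: Request) -> float:
    if req.pattern == "random":
        return dev.free_iops
    if req.pattern == "streaming":
        return dev.free_stream_mbs
    raise ValueError("unknown access pattern: " + req.pattern)


def fits(dev: Device, req: Request) -> bool:
    return req.capacity_gb <= dev.free_gb and req.performance <= _budget(dev, req)


def leftover(dev: Device, req: Request) -> float:
    """Fraction of the device's free capacity and performance that would
    remain after placing the request; smaller means a tighter fit."""
    cap_left = (dev.free_gb - req.capacity_gb) / dev.free_gb
    perf_left = (_budget(dev, req) - req.performance) / _budget(dev, req)
    return cap_left + perf_left


def reserve(devices: List[Device], req: Request) -> Optional[Device]:
    """Place the request on the tightest-fitting device, or return None."""
    if req.capacity_gb <= 0 or req.performance <= 0:
        raise ValueError("capacity and performance must be positive")
    candidates = [d for d in devices if fits(d, req)]
    if not candidates:
        return None
    best = min(candidates, key=lambda d: leftover(d, req))
    # Deduct the reserved resources so later requests see the new state.
    best.free_gb -= req.capacity_gb
    if req.pattern == "random":
        best.free_iops -= req.performance
    else:
        best.free_stream_mbs -= req.performance
    return best
```

With a pool of one hard disk and one SSD, a request for 100 GB at 1000 IOPS can only land on the SSD, while a 1000 GB streaming volume only fits on the hard disk. A real scheduler would also have to model how random and streaming workloads interfere on the same rotating disk.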

We also developed a monitoring service. It analyses all accesses to a volume and provides feedback to the user. For example, suppose a user reserved a volume with 1 TB of capacity and 100 MB/s of bandwidth. The monitoring service can report that at most 800 GB were used and that, except for a few spikes, the used bandwidth stayed below 50 MB/s. For the next run, the user can ask for a smaller, slower, and cheaper volume.
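The kind of feedback described here can be sketched as a summary over usage samples. This is an illustration only, not the monitoring service's actual interface: the function names, the sample format, and the choice of a 95th-percentile (nearest-rank) bandwidth to ignore short spikes are all assumptions.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, with p in (0, 100]."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_usage(used_gb_samples, bandwidth_mbs_samples, p=95):
    """Condense monitoring samples into figures a user could base the
    next reservation on."""
    return {
        "peak_used_gb": max(used_gb_samples),
        "peak_bandwidth_mbs": max(bandwidth_mbs_samples),
        # A high percentile ignores a few short spikes.
        "typical_bandwidth_mbs": percentile(bandwidth_mbs_samples, p),
    }


if __name__ == "__main__":
    used = [300, 650, 800, 720]        # GB, sampled over the run
    bandwidth = [40] * 95 + [90] * 5   # MB/s, mostly 40 with a few spikes
    print(summarize_usage(used, bandwidth))
```

For the sample data, the summary reports a peak of 800 GB and a typical bandwidth of 40 MB/s, which suggests that a reservation well below 1 TB and 100 MB/s would have sufficed.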

The techniques developed in HARNESS have been part of XtreemFS since release 1.5.1. That release includes the reservation system, the monitoring system, user quotas, and an improved Hadoop adapter.

Further Reading

  1. J. Stender, M. Berlin and A. Reinefeld. XtreemFS – a File System for the Cloud. Data Intensive Storage Services for Cloud Environments. 2013.
  2. C. Kleineweber, A. Reinefeld and T. Schütt. QoS-aware Storage Virtualization for Cloud File Systems. International Workshop on Programmable File Systems. 2014.