From File Systems to the Cloud and Back

Cloud storage solutions today are a great alternative to storing data on a local computer or in NAS storage. Starting with Amazon S3, such solutions are now offered by a dozen companies, including Microsoft with Azure Blob Storage.

The advantages of cloud storage are nearly unlimited capacity (you use as much as you need, not just as much as you have), physical separation between the storage and your location (the data won't be lost in a local accident, and third-party access to your data is severely limited), and lower data management costs.

At the same time, cloud storage services work in a way that doesn't match the usual approaches to storage access, such as hierarchical file systems and relational databases. Internally designed as huge tables with an index and a BLOB field for data, they can't match the flexibility that file systems or database management systems offer to developers and users. The developer has to translate between the data in the application and the back-end cloud storage.

Another significant disadvantage is the difference between the APIs offered by different services. While most services offer a so-called REST API, this API is in fact only a format for requests and responses sent over HTTP. The request commands, parameters, and functions offered by the services differ significantly. As a result, each cloud service requires separate code written for its API.

Finally, the main factor in the (in)feasibility of cloud storage-based solutions is the question of guaranteeing data safety. Though service providers tell us about encryption used on their side, such encryption is performed on their systems, and there's no guarantee that it is really reliable or that it is performed at all.

CBFS Storage offers the missing pieces that fit well into the cloud storage architecture.

Like most file systems, CBFS Storage is page-based. This means that it doesn't operate on arbitrary sequences of bytes, but on blocks of fixed size (sectors on the disk, pages in memory). This makes it easy to back CBFS Storage with almost any storage.
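To make the page-based idea concrete, here is a minimal Python sketch of how a byte range maps onto fixed-size pages. It is an illustration only, not CBFS Storage code: the 4096-byte page size and the function name `pages_for_range` are assumptions chosen for the example.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Split a byte range into (page_number, start, stop) pieces.

    start and stop are byte positions inside the page, stop exclusive.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    pos = offset
    end = offset + length
    while pos < end:
        page_number = pos // page_size           # which page holds this byte
        start = pos % page_size                  # where the piece begins in it
        stop = min(page_size, start + (end - pos))
        pieces.append((page_number, start, stop))
        pos += stop - start                      # continue in the next page
    return pieces
```

A 200-byte read starting at offset 4000 touches the tail of page 0 and the head of page 1, so the back-end storage only ever sees requests for whole pages.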

To make such backing possible, CBFS Storage supports callback mode, in which it asks your application to store or retrieve the block to or from the back-end storage. So all you need to do is implement two simple functions in your code, "put page #X to the cloud storage" and "retrieve page #X from the storage", and that's all: you have a file system in the cloud!
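The shape of those two functions can be sketched in Python as follows. This is not the CBFS Storage API: the names `put_page`, `get_page`, and `make_callbacks`, the key format, and the 4096-byte page size are all assumptions, and `InMemoryCloud` is a stand-in for a real cloud client so that the example runs on its own.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class InMemoryCloud:
    """Stand-in for a cloud object store (hypothetical, keeps objects in a dict)."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, data):
        self._objects[key] = bytes(data)

    def get_object(self, key):
        return self._objects.get(key)  # None if the key was never written


def make_callbacks(cloud, prefix="pages/"):
    """Build the two page callbacks on top of a put/get object store."""

    def key_for(page_number):
        return f"{prefix}{page_number:012d}"

    def put_page(page_number, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        cloud.put_object(key_for(page_number), data)

    def get_page(page_number):
        data = cloud.get_object(key_for(page_number))
        # In this sketch, a page that was never stored reads back as zeros.
        return data if data is not None else bytes(PAGE_SIZE)

    return put_page, get_page
```

In a real application the body of `put_object` and `get_object` would be replaced by calls to the chosen service's client library; everything above that layer stays the same.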

But that's the least that CBFS Storage can offer. The file system offers several advanced features, such as built-in encryption and compression (performed on your side, if you remember the cloud security problem referenced above), nearly unlimited possibilities for storing metadata (various supplementary information about the main file or data), and SQL-like searches for files. Moreover, if you need custom encryption (e.g., using keys stored on cryptographic hardware tokens), this is possible with two other callbacks: "encrypt page #X" and "decrypt page #X".
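The idea of transforming each page on your side before it leaves for the cloud can be sketched as a wrapper around the store and retrieve functions. The sketch below is an assumption-laden illustration, not the CBFS Storage API: `with_transform` and the callback signatures are hypothetical, and zlib compression stands in for the per-page transform so the example needs only the standard library. A real "encrypt page #X" callback would call a vetted cryptographic library or a hardware token in the same place.

```python
import zlib


def with_transform(put_page, get_page, encode, decode):
    """Wrap page callbacks so every page is transformed client-side.

    encode and decode receive the page number and the page bytes,
    mirroring the "encrypt page #X" / "decrypt page #X" idea.
    """

    def put(page_number, data):
        put_page(page_number, encode(page_number, data))

    def get(page_number):
        return decode(page_number, get_page(page_number))

    return put, get


# Demo back end: a dict standing in for the cloud storage (hypothetical).
backend = {}


def raw_put(page_number, data):
    backend[page_number] = data


def raw_get(page_number):
    return backend[page_number]


def compress_page(page_number, data):
    return zlib.compress(data)    # stand-in for a real encryption step


def decompress_page(page_number, data):
    return zlib.decompress(data)  # stand-in for a real decryption step


put, get = with_transform(raw_put, raw_get, compress_page, decompress_page)
```

Because the transform runs before `raw_put` is called, the back end only ever receives the transformed bytes.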

And what if you need a relational database rather than a file system? You can use your favorite DBMS and have it store its files on the virtual disk created by CBFS Storage (OS Edition). This way the database files are stored in the cloud storage, and your application works with them via the database management system of your choice.

One more benefit of CBFS Storage is that moving from one cloud storage service to another is as simple as rewriting two basic functions for storing and retrieving pages to and from the cloud storage.
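A rough Python sketch of why the switch is cheap: if the rest of the code depends only on the two page functions, changing the service means supplying another pair. The class and function names here are hypothetical, and a dict and a local directory stand in for two different cloud services so the example is runnable.

```python
import os


class DictBackend:
    """Stand-in for cloud service A: pages kept in a dict."""

    def __init__(self):
        self.pages = {}

    def put_page(self, page_number, data):
        self.pages[page_number] = bytes(data)

    def get_page(self, page_number):
        return self.pages[page_number]


class DirectoryBackend:
    """Stand-in for cloud service B: one file per page in a directory."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, page_number):
        return os.path.join(self.root, f"{page_number:012d}.page")

    def put_page(self, page_number, data):
        with open(self._path(page_number), "wb") as f:
            f.write(data)

    def get_page(self, page_number):
        with open(self._path(page_number), "rb") as f:
            return f.read()


def copy_pages(source, target, page_numbers):
    """Move stored pages from one back end to another, page by page."""
    for page_number in page_numbers:
        target.put_page(page_number, source.get_page(page_number))
```

Only the two methods differ between the back ends; `copy_pages` and any other caller work with either one unchanged.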

CBFS Storage simplifies the code you need to write against the cloud service's APIs: it's much easier to write code that stores and retrieves fixed-size blocks (each page has the same size) by page number than to try to implement a relational database or a file system in the cloud yourself.

Ready to get started?

Learn more about Callback Technologies or download a free trial.
