7 Best Self-Hosted S3-Compatible Object Storage Software
Data drives online businesses, doesn't it?
That data can be images, audio, video, documents, and other files, and its volume keeps growing.
As a result, storing it with traditional methods can become complicated, time-consuming, and expensive. Thankfully, cloud technologies make it easier to store data properly and affordably.
Finding the right storage solution is crucial for protecting your data and keeping it accessible, among other things.
There are many storage solutions to choose from, and object storage is one of them.
What Is Object Storage?
Object storage is designed to store static data as flat files. Each object bundles the data itself, a unique identifier, and metadata, and is stored durably; the metadata can be customized. Objects are accessible over HTTP and can be organized by associated information such as creation date, size, name, and file type.
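To make the data, identifier, and metadata triple concrete, here is a minimal in-memory sketch. The `StoredObject` and `FlatObjectStore` names are hypothetical and exist only for illustration; real object stores expose this model over HTTP rather than as a Python class.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    """One object: payload bytes, descriptive metadata, and a unique identifier."""
    data: bytes
    metadata: dict
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class FlatObjectStore:
    """Hypothetical in-memory store: a flat key space with no directory tree."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        # Record size and creation date automatically, alongside user metadata.
        meta = {
            "size": len(data),
            "created": datetime.now(timezone.utc).isoformat(),
            **metadata,
        }
        obj = StoredObject(data=data, metadata=meta)
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id):
        return self._objects[object_id].data

    def find(self, **criteria):
        # Return identifiers of objects whose metadata matches every criterion.
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = FlatObjectStore()
photo_id = store.put(b"\x89PNG...", name="logo.png", file_type="image/png")
store.put(b"hello", name="notes.txt", file_type="text/plain")
print(store.find(file_type="image/png") == [photo_id])
```

Looking objects up by metadata rather than by path is what the searchability claim in this section refers to.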
Developers and businesses often prefer object storage because it is easy to access and because metadata makes objects searchable. It is also cost-efficient.
Many cloud object storage services are available, with Amazon S3, Google Cloud Storage (GCS), and Azure among the leading ones. Not everyone wants to keep data in a public cloud, for various reasons, but that doesn't mean you can't take advantage of object storage.
You can opt for self-hosted, S3-compatible software and run it on your own server, in your datacenter, or on-premises.
So, let's look at some object storage software so that you can decide which one best suits your requirements.
MinIO
MinIO is a Kubernetes-native, high-performance object storage platform designed to meet hybrid cloud demands. It is capable of delivering stable functionality for your applications.
MinIO supports use cases across a wide range of environments and has been cloud-native since its inception. Its software-defined suite runs seamlessly in public and private clouds and at the edge, which has made it a front-runner in hybrid cloud object storage.
With industry-leading scalability and performance, MinIO serves use cases such as data analytics, AI, ML, modern mobile and web applications, and backup and restore.
It is native to cloud architectures and technologies such as orchestration with Kubernetes, containerization, multi-tenancy, and microservices.
MinIO is one of the fastest object storage platforms, with reported read and write speeds of 183 GB/s and 171 GB/s on standard hardware. It can function as the primary storage tier for workloads such as Spark, TensorFlow, Presto, and H2O, and as a replacement for Hadoop HDFS.
It is open source. By following minimalist design principles, MinIO reduces the possibility of errors, delivers reliability, and improves uptime.
You can install and configure it within minutes, with no confusing variations or options, which results in lower failure rates and minimal administration. Alternatively, if you don't have time to install and manage it, you can get MinIO ready-made on a Kamatera VM.
Ceph
Ceph's object storage interface is built on librados and gives client applications RESTful access to Ceph Storage Clusters. The same foundation underlies its other advanced features, such as the RADOS Gateway (RGW), the RADOS Block Device (RBD), and the Ceph File System (CephFS).
Apart from being S3-compatible, Ceph also offers an interface compatible with the OpenStack Swift API. Ceph's librados libraries support applications written in Java, C, C++, PHP, Python, and more, and let these applications access the object storage platform through a native API.
The advanced features of the librados library include:
- Snapshots
- Object-level key-value mappings
- Complete or partial writes and reads
- Atomic transactions, including features such as truncate, clone range, and append
Zenko
Design and integrate your applications faster with Zenko's S3-compatible platform, and store your objects and data wherever you want. It provides access to a cloud of your choice through a single S3 API set.
Zenko offers a single interface that unifies multiple operations in one place and supports multi-cloud data storage, both on-premises (for example, in Docker volumes or Scality RING) and in public clouds such as Amazon S3.
You get the full suite of S3 language-specific wrappers and bindings, including SDKs, so you can develop apps in the language of your choice. Zenko CloudServer also helps developers access data that is stored on-premises or in public clouds such as Azure, S3, or GCP.
Riak S2
Riak S2 is an easy-to-operate, highly available, and highly scalable storage software optimized for storing objects.
It is a simple yet powerful storage solution for large objects, designed for public, private, and hybrid cloud environments. Riak S2 is a cost-effective option when you need object storage for your apps or other service offerings.
The software is compatible with both Amazon S3 and OpenStack Swift. Riak S2 has powerful APIs, scales easily, and handles petabytes of data on commodity hardware, with performance increasing as you add capacity.
Riak S2 comes with robust functionality that helps you run and manage your Big Data apps smoothly. It replicates objects across the cluster so that they remain available. Because it is S3- and OpenStack-compatible, developers can use the existing tools and libraries for those APIs.
Riak S2 monitors data continuously and repairs it automatically when it finds inconsistencies. Per-tenant reporting on data usage and statistics enables metering and billing in multi-tenant deployments. You can also optimize for low latency at an affordable cost by placing frequently accessed data on the fastest media.
With multipart upload, Riak S2 can store large files, from gigabytes to terabytes in size, quickly and easily. Installing Riak S2 is simple, and you can quickly increase capacity by adding nodes to the cluster. It uses multi-cluster replication and low-latency storage to maintain high availability in case of site failures.
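The idea behind multipart upload can be sketched as follows: a large payload is cut into numbered parts that are sent independently and joined in order once all of them have arrived. The `plan_parts` and `reassemble` helpers are hypothetical and do not call any Riak S2 API. The 5 MiB default mirrors Amazon S3's minimum part size; whether Riak S2 enforces the same limit is an assumption.

```python
MIB = 1024 * 1024


def plan_parts(total_size, part_size=5 * MIB):
    """Split an upload of total_size bytes into (part_number, offset, length) tuples."""
    if total_size <= 0 or part_size <= 0:
        raise ValueError("sizes must be positive")
    parts = []
    offset = 0
    while offset < total_size:
        # The last part may be shorter than part_size.
        length = min(part_size, total_size - offset)
        parts.append((len(parts) + 1, offset, length))  # part numbers start at 1
        offset += length
    return parts


def reassemble(parts_data):
    """Join uploaded parts in part-number order, regardless of arrival order."""
    return b"".join(parts_data[number] for number in sorted(parts_data))


payload = bytes(12 * MIB)
plan = plan_parts(len(payload))
uploaded = {n: payload[off:off + length] for n, off, length in plan}
assert reassemble(uploaded) == payload
print([length // MIB for _, _, length in plan])  # [5, 5, 2]
```

Because each part carries its own number, parts can be uploaded in parallel or retried individually, which is where the speed-up for very large files comes from.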
Riak S2 offers an enterprise-ready solution.
Triton
Control your data effectively with Triton, the object storage platform by Joyent. It comes with a minimalist, uncluttered file manager with a pleasant color scheme.
You can easily add files, create folders, download files, view data information, delete files, and more. Triton is developer-friendly and straightforward for anyone familiar with Unix. You interact with it through a simple API and CLI.
Triton has robust built-in security, including fine-grained role-based access control, object-level access and security, data encryption, and client SSH. It is scalable, durable, and proven in production. It provides data replication, failover, backup and recovery, and clustering.
You can perform search and transform operations as well as CRUD operations through a REST API that supports JSON. Triton is a highly scalable, clustered, distributed object storage platform with object-level granularity. It replicates data across data centers and lets you control replication per object.
You can store objects of any number, size, or type because the underlying infrastructure scales linearly. Triton provides read-after-write consistency to protect your data from corruption caused by interrupted connections or data loss. Other capabilities include arbitrary object versioning and higher durability with ZFS RAID-Z storage.
LeoFS
LeoFS is a highly available, distributed, eventually consistent object storage platform. It is well suited to storing large amounts of data of various sizes and types in their native format.
It offers a high cost-performance ratio: you can build LeoFS clusters from commodity hardware running Linux and still get solid performance. LeoFS needs a smaller cluster than many other storage platforms, and it is easy to set up and operate.
Its design builds on Erlang/OTP and targets high reliability, with up to 99.9999999% uptime. Even if a hardware failure or software issue occurs within the cluster, LeoFS remains available.
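To put a figure like 99.9999999% into perspective, it helps to convert the percentage into allowed downtime per year. The helper below is a small arithmetic sketch, not a LeoFS tool; it assumes a 365-day year and uses `Decimal` so that the many nines are not lost to floating-point rounding.

```python
from decimal import Decimal

SECONDS_PER_YEAR = Decimal(365 * 24 * 60 * 60)  # assumes a 365-day year


def downtime_seconds_per_year(availability_percent):
    """Convert an availability percentage (passed as a string) to yearly downtime."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * SECONDS_PER_YEAR


for target in ("99.9", "99.99", "99.9999999"):
    print(f"{target}% -> {downtime_seconds_per_year(target)} s/year")
```

Nine nines works out to roughly 0.03 seconds of downtime per year, compared with about 8.8 hours for 99.9%, so the figure is best read as a design target rather than a measured guarantee.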
You also get high scalability: adding or removing nodes is quick and simple, so you can react promptly as your needs change. Think of a LeoFS cluster as elastic object storage that stretches as often and as much as you require.
It has a built-in object cache mechanism and handles HTTP requests and responses efficiently. LeoFS also includes replicator, queuing, and recovery mechanisms that maintain consistency and keep the storage nodes running. For higher uptime, LeoFS monitors node status as well as the RING's checksum.
Other features of LeoFS include a RESTful interface, multi-protocol support, the Amazon S3 API, support for multiple data centers, a data lake solution, cloud integration, bucket and user management, support for custom metadata and AWS Signature Version 4, and improved Spark integration.
HyperStore
Cloudian's S3-compatible object storage solution, HyperStore, aims to cover all of your storage requirements and challenges. You can deploy it wherever you need more storage capacity and then scale it seamlessly.
Use HDD-based platforms for the lowest total cost of ownership (TCO), or use all-flash drives for up to 3x faster performance. With these technologies, Cloudian HyperStore reduces storage complexity and gives you a simple, effective storage solution.
You can even combine flash and HDD in an adaptive hybrid environment with smart data placement. HyperStore runs on the platform you prefer, whether virtual machine or bare-metal server, and you get all of its features regardless of your choice.
Cloudian also offers storage appliances with plug-and-play deployment and end-to-end support. Capacities range from 77 TB to 1.5 PB or more per appliance, and the appliances are configured for top performance at an affordable cost.
Moreover, HyperStore has a proven S3 API that protects your investment, plus NFS and SMB support through the HyperFile NAS controller. It uses a hyperscale data fabric for limitless growth, with modular expansion through additional nodes, geo-distribution, and cloud integration for added capacity.
Other features include multi-tenancy, QoS, encryption, compression, a 100% native S3 API, interoperability, and data durability. You can try Cloudian HyperStore free for 45 days with 100 TB of storage.
Conclusion
Storing data with traditional methods can be challenging, which is why cloud storage exists. Object storage software leverages cloud capabilities and stores data of any size and type effectively. So go ahead, get a cloud VM, and try the software listed above to see what works for you.
Elastic, Redundant S3-Compatible Storage in 15 Minutes
S3 hardly surprises anyone these days. It is used as backend storage for web services, as file storage in the media industry, and as an archive for backups.
Let's walk through a small example of deploying S3-compatible storage on top of the Ceph object store.
Quick background
Ceph is an open-source, elastic, easily scalable, petabyte-scale storage system. At its core, it pools the disk space of several dozen servers into a single object store, which makes flexible, multi-copy, pseudo-random data redundancy possible. The Ceph developers complement this object store with three more projects:
- RADOS Gateway: an S3- and Swift-compatible RESTful interface
- RBD: a block device with thin provisioning and snapshots
- Ceph FS: a distributed POSIX-compliant file system
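The pseudo-random redundancy mentioned above rests on the idea that any client can compute where an object's copies live, without asking a central directory. The sketch below illustrates that idea with rendezvous hashing, a much simpler stand-in that is NOT Ceph's actual CRUSH algorithm; the OSD names are hypothetical, and the sketch ignores failure domains, so both copies may land on the same node.

```python
import hashlib


def place_replicas(object_name, osds, replicas=2):
    """Pick `replicas` distinct OSDs for an object, deterministically.

    Each (object, OSD) pair gets a pseudo-random score from a hash, and the
    highest-scoring OSDs win. This is rendezvous hashing, not CRUSH.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical OSD names: three nodes with two data disks each.
osds = [f"node0{n}:/dev/sd{d}" for n in (1, 2, 3) for d in ("b", "c")]
print(place_replicas("photos/cat.jpg", osds))
```

Because the placement depends only on the object name and the list of disks, removing a disk that holds no copy of an object leaves that object's placement unchanged.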
About the example
In my example I continue to use 3 servers with 3 SATA disks each: /dev/sda as the system disk, and /dev/sdb and /dev/sdc for the object store's data. Various programs, modules, and frameworks that work with S3-compatible storage can act as the client. I have successfully tested DragonDisk, CrossFTP, and S3Browser.
In this example I also use just one RADOS Gateway, on node node01. The S3 interface will be available at s3.ceph.labspace.studiogrizzly.com.
Note that Ceph currently supports the S3 operations listed at http://ceph.com/docs/master/radosgw/s3/.
Let's get started
Step 0. Preparing Ceph
Since I am continuing to use an already deployed Ceph cluster, I only need to adjust the configuration in /etc/ceph/ceph.conf slightly: add a definition for RADOS Gateway
and update it on the other nodes
Step 1. Install Apache2, FastCGI, and RADOS Gateway
Step 2. Configure Apache
Enable the required modules
Create a VirtualHost for RADOS Gateway in /etc/apache2/sites-available/rgw.conf
Enable the new VirtualHost and disable the default one
Create the FastCGI script /var/www/s3gw.fcgi:
and make it executable
Step 3. Prepare RADOS Gateway
Create the required directory
Generate a key for the new RADOS Gateway service
and add it to the cluster
Step 4. Start
Restart Apache2 and RADOS Gateway
Step 5. Create the first user
To use an S3 client, we need to obtain the access_key and secret_key for a new user
look at the command's output and copy the keys into your client
Step 6. DNS
For buckets to work, the DNS server must resolve any subdomain of s3.ceph.labspace.studiogrizzly.com to the IP address of the host running RADOS Gateway.
For example, when a bucket named mybackups is created, the domain mybackups.s3.ceph.labspace.studiogrizzly.com. must point to the IP address of node01, which is 192.168.2.31.
In my case I will simply add a CNAME record
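The reason every subdomain has to resolve is that S3 clients commonly put the bucket name into the hostname (virtual-hosted-style addressing), and the gateway recovers it from the Host header. The sketch below illustrates that mapping; the helper functions are hypothetical and are not part of RADOS Gateway, and the parsing ignores details such as a port in the Host header.

```python
GATEWAY_DOMAIN = "s3.ceph.labspace.studiogrizzly.com"


def bucket_hostname(bucket, domain=GATEWAY_DOMAIN):
    """Return the virtual-hosted-style hostname a client will request."""
    return f"{bucket}.{domain}"


def bucket_from_host(host, domain=GATEWAY_DOMAIN):
    """Recover the bucket name from a Host header, or None for the bare domain."""
    host = host.rstrip(".").lower()  # tolerate a trailing dot and mixed case
    if host == domain:
        return None
    suffix = "." + domain
    if host.endswith(suffix):
        return host[: -len(suffix)]
    raise ValueError(f"{host!r} is not served by this gateway")


print(bucket_hostname("mybackups"))
# mybackups.s3.ceph.labspace.studiogrizzly.com
print(bucket_from_host("mybackups.s3.ceph.labspace.studiogrizzly.com."))
# mybackups
```

A wildcard DNS entry for the gateway domain therefore covers buckets that do not exist yet, so no DNS change is needed when a new bucket is created.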
Afterword
In 15 minutes we managed to deploy S3-compatible storage. Now try connecting your favorite S3 client.
Bonus
I asked sn00p to talk about his experience running RADOS Gateway in production at 2GIS. His review follows:
Overview
We run Varnish in front, with 4 Apache instances serving RADOS Gateway attached to it as backends. The application goes to Varnish first; if that fails, it hits the Apache instances directly in round-robin fashion. This setup handles 20,000 rps without problems in synthetic JMeter tests that replay a month of access logs. It holds half a million photos, and the production load on the frontend is about 300 rps.
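The application-side logic described here (cache first, then the gateways in rotation) could look roughly like the sketch below. It is an illustration, not 2GIS's actual code: `FallbackFetcher` and the stub callables are hypothetical, and treating a failure as a `ConnectionError` is an assumption.

```python
import itertools


class FallbackFetcher:
    """Try the cache first; on failure, go to the backends in round-robin order."""

    def __init__(self, cache, backends):
        self._cache = cache
        self._backends = list(backends)
        self._next_backend = itertools.cycle(self._backends)

    def fetch(self, key):
        try:
            return self._cache(key)
        except ConnectionError:
            pass  # cache unavailable: fall through to the backends
        last_error = None
        # Give each backend at most one attempt per request.
        for _ in range(len(self._backends)):
            backend = next(self._next_backend)
            try:
                return backend(key)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError("cache and all backends failed") from last_error


def broken_cache(key):
    raise ConnectionError("cache is down")


def make_backend(name):
    return lambda key: f"{name} served {key}"


fetcher = FallbackFetcher(broken_cache, [make_backend(f"rgw{i}") for i in range(1, 5)])
print(fetcher.fetch("photo-1.jpg"))  # rgw1 served photo-1.jpg
print(fetcher.fetch("photo-2.jpg"))  # rgw2 served photo-2.jpg
```

Keeping the rotation state between requests spreads the fallback traffic evenly across the gateways instead of sending all of it to the first one.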
Ceph currently runs on 5 machines, each with a dedicated disk for the OSD and a separate SSD for the journal. Replication is the default, ^2. The system survives the simultaneous failure of two nodes without problems, and further variations on that as well. In six months we have not shown a single error to a client.
Flexibility is not a problem: storage size, inodes, and directory layout are all a thing of the past.
Solution specifics
All of this has been running for us for six months and requires no administrator intervention at all :)
Plans for the future
4-200 KB. With S3 this is rather inconvenient: there are no bulk-copy operations, you cannot delete a bucket that still contains data, and the initial population of the storage is painfully slow. We are investigating how to tune this.
But the main goal is a geo-cluster: we are based in Siberia and want to serve data to clients from a geographically close location. Content already reaches Moscow with up to 100 ms of extra latency, which is not acceptable. The Ceph developers appear to have all of this in their plans.