Empowering Generative AI with MinIO: Unveiling AIStor
Generative AI is full of promise to transform our lives in countless ways. However, it also brings new challenges, especially around the rising costs of storing, accessing, and managing the massive data and compute resources required for training and inference workloads. Although the cost of tokens has notably decreased from GPT-3 to GPT-4, demand has surged to unprecedented levels, pushing businesses to seek strong returns on investment (ROI).
At KubeCon and CloudNativeCon North America in November 2024, I had the privilege of seeing MinIO showcase an impressive lineup of new innovations. Known for its cloud-native, cutting-edge approach to object storage, MinIO is now making significant strides in the AI space with solutions aimed at addressing critical challenges around cost, security, and reliability of datasets and AI models’ storage.
In this blog, I’ll share insights from MinIO’s latest announcements and their potential impact on generative AI. But before diving into the details, let’s set the stage with a brief history of MinIO for those who might be unfamiliar with the company.
MinIO Background
MinIO is an object storage server designed for high-performance workloads. Its lightweight, distributed architecture seamlessly enables horizontal scaling, with a focus on three core principles: simplicity, scalability, and speed. Founded in 2014 by Anand Babu Periasamy, Harshavardhana, and Garima Kapoor, MinIO set out to streamline data storage for the cloud-native world.
Written in Go, MinIO is fully compatible with Amazon S3, allowing organizations to modernize storage infrastructure without disrupting existing S3-based applications. MinIO also integrates seamlessly with Kubernetes, supporting containerization and orchestration, which enables flexible deployment across on-premises environments, both public and private clouds, and edge devices.
MinIO’s developer-friendly design has fostered a robust community, boasting almost two billion Docker pulls, making it one of the most widely adopted object storage solutions with millions of active instances globally. In fact, MinIO claims that it has more S3 deployments than even Amazon Web Services!
Enterprise-grade features like security, low latency, and high-throughput data processing have attracted over six million deployments worldwide, with prominent clients such as Mastercard, PayPal, and Salesforce. Many users now manage petabyte-scale data on MinIO, and some even operate at exabyte scale — collectively, MinIO manages multiple exabytes of data across diverse industries.
What’s new at MinIO?
In its ten years of operation, MinIO has built an impressive suite of features, but at KubeCon North America 2024, it unveiled one of its most significant releases to date: an object storage platform specifically designed to tackle the exascale data infrastructure challenges of modern AI workloads. Named AIStor, this platform introduces a range of new features tailored to meet the unique demands of AI at scale.
Figure 1 highlights these innovative features and underscores how AIStor sets itself apart from existing object storage solutions.
MinIO’s object storage is its core offering, and many of the capabilities discussed in this blog apply directly to this flagship product. Unlike legacy SAN/NAS providers, which initially focused on file and block storage before adding object storage, MinIO was built as a native object store from the ground up. This focus means it carries no legacy code, resulting in a highly efficient, lightweight platform. For instance, MinIO’s streamlined binary is less than 100 MB, which allows MinIO to be deployed everywhere from cameras in retail stores to exabyte-scale lakehouses.
In the sections that follow, we’ll explore each of AIStor’s attributes, beginning with the use case criteria to ensure it meets the specific needs and goals of real-world AI challenges. Once AIStor’s relevance is established, we’ll examine its expanded support for unstructured data, performance, reliability, ecosystem compatibility, and other key features.
Use Cases
AI and machine learning models require processing massive datasets at high speeds, ideally close to where the data originates or is stored. Moving petabytes or even exabytes of data across systems incurs high costs, adds latency, and increases the risk of security breaches. Additionally, creating multiple copies of data can lead to silos, inconsistencies, and proliferation of different tools for data management and analysis.
AIStor enables organizations to consolidate diverse business use cases within a single object store under a unified namespace, significantly simplifying data management. As part of supporting AI, AIStor empowers data lakehouses by storing Apache Iceberg, Apache Hudi, and Delta Lake open table formats and integrating with best-in-class query engines to facilitate table creation, transformation, and efficient querying for large Iceberg tables.
MinIO’s architecture is engineered to provide high throughput and low-latency data access, making it ideal for training and inference of large AI models, as well as meeting the high-frequency data demands of data lakehouse environments. Its design is IO-bound and optimized for minimal CPU and RAM usage, allowing it to maximize performance with minimal resource overhead. With the speeds provided by today’s NVMe drives, a few of these drives can fully utilize 400GbE/800GbE networks, ensuring efficient data handling across high-speed infrastructures. As network infrastructure continues to advance and compute clusters become increasingly data-intensive, MinIO is well-positioned to support these faster, data-hungry environments.
In a data lakehouse, the open table formats support data versioning, allowing modification and rollback. MinIO’s support for versioned objects aligns with this rollback feature perfectly, allowing scalable, efficient management of historical versions and incremental updates without performance trade-offs.
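Conceptually, the interplay between versioned objects and table rollback can be sketched with a toy in-memory model (illustrative only, not MinIO's actual implementation): each PUT creates a new version rather than overwriting, and rollback simply promotes an older version to latest.

```python
from collections import defaultdict

class VersionedBucket:
    """Toy model of a versioned object store (illustration only)."""

    def __init__(self):
        self._versions = defaultdict(list)  # object key -> list of versions

    def put(self, key, data):
        # Each PUT appends a new version instead of overwriting.
        self._versions[key].append(data)
        return len(self._versions[key]) - 1  # version id

    def get(self, key, version=None):
        versions = self._versions[key]
        return versions[-1] if version is None else versions[version]

    def rollback(self, key, version):
        # Restore an older version by re-putting it as the latest.
        return self.put(key, self.get(key, version))

bucket = VersionedBucket()
v0 = bucket.put("table/metadata.json", b"snapshot-0")
v1 = bucket.put("table/metadata.json", b"snapshot-1")
bucket.rollback("table/metadata.json", v0)
print(bucket.get("table/metadata.json"))  # b"snapshot-0"
```

Open table formats apply the same idea at the table level: a rollback is just a pointer to an earlier snapshot, with the object store retaining every historical version underneath.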
AIStor embraces a multi-compute engine paradigm for the data lakehouse, allowing users to leverage Spark, Flink, Trino, DuckDB, Dremio, and more to interact seamlessly with files on the object store. AIStor’s strong support for unstructured and structured data, and complete compatibility with Amazon S3 remain core advantages, both of which have seen major upgrades, as detailed in the rest of the blog.
Unstructured Data
MinIO’s S3-API-compatible, cloud-native deployments enable organizations to store and access unstructured data with low latency and high throughput. With its latest release, AIStor adds significant new capabilities for AI workloads. Two such enhancements include:
- A new S3 API, promptObject, which enables users to interact with unstructured objects.
- A private cloud repository, known as AI Hub, for securely storing both data and AI models.
The promptObject API reimagines the traditional REST API operations of PUT and GET into a new paradigm: PUT and PROMPT. Just as we interact with large language models (LLMs) through prompts, we can now use similar prompts to access unstructured objects, such as an MRI image.
The promptObject API is seamlessly integrated for user applications, requiring no prior knowledge of retrieval-augmented generation (RAG) models, vector databases, or other AI concepts. It functions out of the box with the default multimodal LLM provided in AIStor and also supports multi-agent architectures with built-in orchestration, allowing smooth interactions with smaller, domain-specific AI models.
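Since the full wire format of promptObject is not spelled out here, the sketch below only illustrates the PUT-and-PROMPT idea; the request path, query string, and JSON field names are hypothetical, not MinIO's published API.

```python
import json

def build_prompt_request(bucket, key, prompt):
    """Assemble a hypothetical promptObject-style request.

    The path and field names are illustrative guesses at what a
    PROMPT operation against an object might look like.
    """
    return {
        "method": "POST",
        "path": f"/{bucket}/{key}?prompt",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prompt": prompt}),
    }

# PUT an MRI image as usual, then PROMPT it instead of GET-ing raw bytes.
req = build_prompt_request(
    "medical", "scans/mri-001.dcm",
    "Is there any anomaly visible in this MRI?")
print(req["path"])  # /medical/scans/mri-001.dcm?prompt
```

The point of the paradigm is that the caller never touches embeddings, vector indexes, or RAG plumbing: the object store resolves the prompt against the object on the server side.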
The new AI Hub feature within AIStor provides a private cloud repository for models and datasets, functioning as a more secure, controllable version of Hugging Face. It’s especially beneficial for organizations in regulated industries or those handling sensitive data, as it allows them to fine-tune open-source models locally without compromising privacy. Hugging Face has become a widely-used platform for sharing AI models and datasets, which enables companies to kickstart their machine learning projects using pre-trained models, particularly for complex tasks such as large language models (LLMs) that are otherwise costly and resource-intensive to develop from scratch.
AI Hub acts as a proxy server that mimics Hugging Face’s APIs, letting developers pull models from either Hugging Face or MinIO as needed. If a model or dataset doesn’t exist in AI Hub, it can be downloaded from Hugging Face and saved to an AIStor bucket for subsequent use. This setup enables engineers to continue using the Hugging Face API without modifications, making it a turnkey solution for AI teams. Through this, organizations gain a secure, scalable hub where they can store, manage, and fine-tune models and datasets — all while controlling access and protecting sensitive data.
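One concrete way this turnkey behavior works: the huggingface_hub library honors the HF_ENDPOINT environment variable, so pointing it at an AI Hub deployment leaves existing code untouched. The URL below is a placeholder for your own deployment, not a real endpoint.

```python
import os

# Point Hugging Face tooling at the private AI Hub proxy instead of
# huggingface.co. The hostname is a placeholder for your deployment.
os.environ["HF_ENDPOINT"] = "https://aihub.example.internal"

# With the endpoint set, existing huggingface_hub code runs unchanged,
# e.g. (requires the huggingface_hub package and a reachable AI Hub):
#
#   from huggingface_hub import snapshot_download
#   snapshot_download("mistralai/Mistral-7B-v0.1")
#
# Models missing from AI Hub are fetched upstream and cached in an
# AIStor bucket for subsequent pulls.
print(os.environ["HF_ENDPOINT"])
```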
Reliability / Security
MinIO provides several advanced features for data protection and storage integrity, making it a strong choice for enterprise-grade applications that require a secure, fault-tolerant, and reliable object storage solution.
MinIO’s erasure coding splits objects into multiple pieces and stores them across different nodes or disks in a distributed environment for higher data durability. Even if some of the data nodes or disks fail, the system can still reconstruct the original data using the remaining pieces. This provides fault tolerance without the overhead of full replication, making it a more efficient way of ensuring data integrity in large-scale storage environments and offering an optimal balance between redundancy and storage efficiency.
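To see why parity makes reconstruction possible, here is the simplest erasure code, a single XOR parity shard. MinIO actually uses Reed-Solomon coding with configurable parity counts, but the recovery idea is the same: surviving shards plus parity deterministically rebuild what was lost.

```python
def xor_parity(shards):
    """Compute a single XOR parity shard over equal-length shards.

    This is the simplest possible erasure code; production systems
    use Reed-Solomon to tolerate multiple simultaneous failures.
    """
    parity = bytearray(len(shards[0]))
    for shard in shards:
        for i, b in enumerate(shard):
            parity[i] ^= b
    return bytes(parity)

# Split an object into 3 data shards and derive 1 parity shard.
data_shards = [b"obj-", b"ect-", b"data"]
parity = xor_parity(data_shards)

# Simulate losing shard 1, then rebuild it from the survivors + parity:
# since parity = s0 ^ s1 ^ s2, it follows that s0 ^ s2 ^ parity = s1.
survivors = [data_shards[0], data_shards[2], parity]
recovered = xor_parity(survivors)
print(recovered)  # b"ect-"
```

With one parity shard, any single loss is recoverable at only 33% storage overhead here, versus 200% for triple replication — the efficiency trade-off described above.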
MinIO’s object immutability feature allows users to lock objects, making them tamper-proof for a specified retention period. This is crucial for compliance with regulatory requirements, such as those governing data integrity, retention, and protection. Once an object is marked as immutable, it cannot be deleted or altered, ensuring that the data remains intact for as long as necessary.
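The retention semantics can be sketched with a toy model (illustrative only, not MinIO's implementation): deletes and overwrites are simply refused until the retention date passes.

```python
from datetime import datetime, timedelta, timezone

class LockedObject:
    """Toy model of object retention: deletes are refused until the
    retention period expires (illustrative, not MinIO's internals)."""

    def __init__(self, data, retain_days):
        self.data = data
        self.retain_until = (datetime.now(timezone.utc)
                             + timedelta(days=retain_days))

    def delete(self, now=None):
        now = now or datetime.now(timezone.utc)
        if now < self.retain_until:
            raise PermissionError(
                "object is immutable until " + self.retain_until.isoformat())
        self.data = None

# Lock an audit log for one year, then try to delete it immediately.
obj = LockedObject(b"audit-log", retain_days=365)
try:
    obj.delete()
except PermissionError as e:
    print("delete refused:", e)
```

In real deployments this is the S3 Object Lock model: the retention date travels with the object, so even an administrator cannot shorten a compliance-mode lock.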
MinIO provides bit rot protection, a feature that helps ensure the long-term integrity of stored data. Bit rot refers to the gradual corruption of data over time due to various factors, such as hardware wear and tear or environmental issues. MinIO’s checksum-based approach continuously checks for data corruption, ensuring that any detected issues are flagged and the system can take corrective actions like data repair.
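The checksum-based detection loop can be sketched in a few lines. SHA-256 is used here because it ships with the Python standard library; MinIO itself uses a faster hash internally, but the verify-on-read pattern is the same.

```python
import hashlib

def store(data):
    # Record a checksum alongside the data at write time.
    return {"data": bytearray(data),
            "sha256": hashlib.sha256(data).hexdigest()}

def verify(obj):
    # On read (or during a background scrub), recompute and compare.
    return hashlib.sha256(bytes(obj["data"])).hexdigest() == obj["sha256"]

obj = store(b"important bytes")
assert verify(obj)       # healthy object passes verification

obj["data"][0] ^= 0x01   # flip one bit to simulate bit rot
print(verify(obj))       # False -> the system can repair from parity
```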
MinIO’s object-level encryption ensures that each object stored in the system is encrypted individually, protecting data from unauthorized access, both in transit and at rest. By encrypting each object independently, MinIO also allows fine-grained control over the access and management of encrypted data. It uses industry-standard encryption algorithms, such as AES-256.
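For customer-managed keys, MinIO supports the standard S3 SSE-C convention, in which the client supplies a 256-bit key with each request via well-known headers; a minimal sketch of building those headers:

```python
import base64
import hashlib
import os

def sse_c_headers(key: bytes):
    """Build the standard S3 SSE-C request headers for a 256-bit
    customer-provided key; the server encrypts the object with AES-256."""
    assert len(key) == 32, "SSE-C requires a 256-bit key"
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(key).decode(),
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode(),
    }

# Attach these headers to PUT and GET requests for the object; the key
# itself is never stored by the server, only used transiently.
headers = sse_c_headers(os.urandom(32))
print(headers["x-amz-server-side-encryption-customer-algorithm"])  # AES256
```

Because each object can carry its own key, access to encrypted data can be controlled per object, which is the fine-grained control described above.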
Together, these features make MinIO a powerful and secure object storage solution suitable for organizations handling large volumes of sensitive data that require high levels of governance, integrity, and security.
Hardware / Network Optimization
Traditional network speeds are no longer enough to handle data-intensive, time-sensitive AI applications like large language models, real-time analytics, and computer vision. Here’s where MinIO’s latest advancements come in, leveraging 400GbE/800GbE networking along with RDMA (Remote Direct Memory Access) to offer a high-performance, future-ready storage solution for modern AI infrastructure.
RDMA allows data to move directly between the memory of two systems without CPU or OS involvement, reducing latency and increasing throughput — ideal for high-speed data transfers required in AI workflows. As Ethernet speeds advance, MinIO’s use of RDMA mitigates traditional TCP/IP limitations by reducing CPU bottlenecks, latency, and memory bandwidth constraints.
Key characteristics of MinIO’s S3 Over RDMA include:
- Low Latency: RDMA’s memory-to-memory data transfer enables MinIO to handle S3 requests (GET and PUT) with minimal delay, accelerating data retrieval for AI training and analytics.
- High Throughput: By bypassing the CPU, MinIO can manage highly parallel data transfers needed for GPU-intensive AI tasks without bottlenecks.
- Better resource allocation: RDMA-enabled NICs handle data movement, freeing up CPU resources, which improves efficiency and lowers operational costs.
- Cost-Effectiveness using open standards: RDMA over Converged Ethernet (RoCE) brings RDMA’s benefits to Ethernet — a widely compatible and cost-efficient option for enterprises to future-proof infrastructure needed for building AI-ready workloads.
Another popular option for high-speed network traffic has been the InfiniBand standard. However, this standard requires specialized hardware, which increases costs and requires specific skills and knowledge for setup and maintenance. Also, it lacks the flexibility to operate across IP-based networks, which makes it challenging to scale beyond individual clusters or single data center environments. Finally, Ethernet has been evolving faster. Even Nvidia, which has invested heavily in InfiniBand since its 2019 acquisition of Mellanox, has ramped up its Ethernet support.
Ecosystem Integration
MinIO easily integrates with cloud platforms, on-premises infrastructure, and various application frameworks. Its Kubernetes-based architecture enables it to fit effortlessly into containerized and cloud-native deployments in hybrid or multi-cloud environments without significant modifications, ensuring flexibility in storage architecture. Products like TensorFlow, MLFlow, and the vector database Milvus are all built on top of MinIO.
As mentioned earlier in this blog, MinIO’s support for open standards, like S3 and the open table formats like Apache Iceberg enable high-performance data processing directly on object storage via any key analytical framework.
In terms of security and access management, MinIO offers integrations with widely used identity management systems such as OpenID Connect, LDAP, and Active Directory, enabling streamlined user authentication and authorization. It also offers interoperability with popular IAM (Identity and Access Management) systems, with connectors for identity providers like Okta and Keycloak. This enables organizations to enforce unified access policies across their entire storage infrastructure.
One of the most exciting recent additions to MinIO’s ecosystem is Hugging Face compatibility. Thanks to AI Hub’s API compatibility, any code designed to work with Hugging Face will run seamlessly against AIStor, allowing users to work with datasets and models without any modifications.
Deployment
MinIO can be deployed on bare metal or as a container. With native Kubernetes support and a lightweight, stateless design, MinIO is frequently deployed with platforms like Red Hat OpenShift, VMware Tanzu, and various public cloud Kubernetes offerings in hybrid and multi-cloud environments. This architecture enables flexibility and adaptability without significant modifications, making MinIO an ideal choice for modern, microservices-based applications. Its containerized approach facilitates seamless integration within orchestrated environments, streamlining deployment and management across complex infrastructures.
MinIO has also updated its Kubernetes Operator to simplify the management of large-scale data infrastructure, supporting the demands of AI workloads in exascale environments. The operator enhances automation, scalability, and ease of use, addressing the evolving needs of enterprises engaged in high-performance, data-intensive workloads.
The MinIO Global Console serves as a centralized management interface providing streamlined access to storage resources, monitoring, and administrative tools across multiple MinIO deployments. This intuitive console consolidates data management and operational insights, allowing administrators to monitor and manage large-scale, distributed storage infrastructure from a single, unified dashboard. Multi-tenancy support enables organizations to create secure, isolated environments tailored to different departments, teams, or clients.
In its latest release, the Global Console features a completely redesigned user interface with expanded capabilities, including IAM, Information Lifecycle Management (ILM), load balancing, firewall, security, caching, and orchestration — all accessible through a single pane of glass. This comprehensive toolkit empowers administrators to oversee and efficiently optimize storage operations across a unified, integrated platform.
Conclusion
MinIO is natively designed around speed and simplicity. It is lightweight and built to manage objects as efficiently as possible, saturating the network so that its speed is limited only by the hardware it is deployed on.
MinIO has positioned itself as a solution for handling the data demands of generative AI, large-scale machine learning applications, and data lakehouses. Through AIStor, MinIO not only meets the technical and performance requirements for AI workloads but also addresses cost-efficiency, security, and scalability — three critical factors in today’s data-driven economy. With innovations such as the AI Hub for secure, private AI model storage, the promptObject API for advanced unstructured data interactions, and its cutting-edge RDMA over Ethernet capabilities, MinIO demonstrates a forward-thinking approach to storage infrastructure.
MinIO’s commitment to open standards and seamless ecosystem integrations further enhances its appeal, allowing organizations to leverage familiar tools and platforms while benefiting from MinIO’s high-performance, scalable object storage. This dedication to compatibility and ease of use makes it an ideal choice for enterprises that aim to future-proof their data infrastructure to meet the evolving demands of AI. As we continue to witness the rapid advancement of generative AI, solutions like MinIO’s AIStor will undoubtedly play a crucial role in enabling organizations to unlock the full potential of their data while maintaining control over cost, security, and reliability.
To learn more about MinIO, be sure to watch the theCUBE session recorded at KubeCon North America 2024.