Groundbreaking Insights from AWS re:Invent 2024

Sanjeev Mohan
17 min read · Dec 28, 2024


AWS re:Invent 2024 was one of the most rewarding editions since I began attending in 2017. While earlier conferences were marked by a deluge of standalone product announcements, recent years have focused on integrating AWS’s expansive, purpose-built — and often overlapping — offerings. In 2024, that evolution culminated in key announcements underscoring the significance of open standards for data foundations and the seamless integration of generative AI.

Make no mistake: AWS unveiled enough innovations to leave anyone reeling. However, this year’s announcements were deeply rooted in a strategy aimed at enhancing user experience, simplifying complex workloads, and democratizing generative AI to its full potential.

Note: The audio version of this blog is available here.

Figure 1 shows how AWS is unifying its extensive ecosystem with the evolution of SageMaker into a comprehensive, centralized platform for data, artificial intelligence (AI), and machine learning (ML). By integrating tools for data preparation, feature engineering, model training, deployment, and monitoring, SageMaker bridges the gap between data, infrastructure, and ML workflows. This unified approach allows businesses to seamlessly transition from raw data stored in services like Amazon S3, SageMaker Lakehouse, and Redshift to actionable insights, leveraging comprehensive data governance. This holistic integration enables organizations to scale business initiatives with consistency, efficiency, and security across the cloud ecosystem.

Figure 1: AWS unifies its data and AI ecosystem through Amazon SageMaker

This was Matt Garman’s first keynote as CEO, and he laid out a clear and compelling vision for AWS’s future. Adding to the excitement, Andy Jassy’s return to the stage brought its own energy. True to form, Jassy announced cutting-edge frontier models, firmly placing AWS alongside leading model providers like Meta, OpenAI, Google, and Anthropic.

Read on for a comprehensive breakdown of the key data and AI announcements from AWS re:Invent 2024. As always, refer to the latest AWS documentation for up-to-date information on the status of these products or capabilities — whether in beta, preview, or general availability (GA).

Given the complexity and interconnectedness of AWS’s data and AI offerings, I’ve categorized the re:Invent 2024 announcements into thematic groups to provide a clearer and more insightful analysis. As shown in Figure 2, my framework diverges from AWS’s traditional go-to-market classifications, yet comprehensively captures the scope of these announcements, offering a fresh perspective on their strategic implications.

Figure 2: Key AWS re:Invent 2024 data and AI announcements

Let’s begin with the Lakehouse category, as it exemplifies AWS’s approach to simplifying and unifying its data architecture while embracing open standards to foster greater adoption and interoperability.

Unified Analytics

AWS is making a significant push to unify its analytics services, offering a more integrated and streamlined experience across data, analytics, ML, and AI. Tight integration between services like Amazon S3, Redshift, EMR, and QuickSight into a “lakehouse” architecture is meant to serve the needs of multiple personas, such as data analysts, data engineers, data scientists, and AI engineers, using a common plane.

By simplifying access to structured and unstructured data, AWS aims to provide a comprehensive analytics solution that empowers businesses to derive faster insights from their enterprise data, which may be spread across many different sources. This move emphasizes a unified approach to data ingestion, storage, processing, and visualization, ultimately accelerating data-driven decision-making.

The designation of SageMaker for this broader purpose was unforeseen, as it had previously been associated solely with data science workflows centered on machine learning. This nomenclature reflects AWS’s intention to converge the user experience across its data, analytics, and machine learning offerings.

Figure 3 shows the technical architecture of AWS’ unified analytics approach.

Figure 3: AWS unified analytics layers

Let’s dive into each of these critical advancements starting from the storage layer.

S3 Tables

Amazon S3, a cornerstone of AWS since its 2006 debut, has achieved unparalleled success, now hosting over 400 trillion objects. Its evolution has been equally impressive, with innovations like Intelligent Tiering, which has saved customers an estimated $4 billion by optimizing storage costs.

The latest enhancement, S3 Tables, introduces a new type of storage bucket, designed to deliver 3x higher performance and up to 10x more transactions per second for Iceberg Tables compared to self-managed Iceberg tables stored in standard S3 buckets.

To understand the need for S3 Tables, consider the evolution of querying data in object stores like Amazon S3. Initially, systems like Apache Hive simplified querying Parquet files using SQL, with metadata stored in relational databases such as PostgreSQL or MySQL. However, Hive’s scalability and functionality were limited. Newer open table formats like Iceberg, Delta, and Hudi emerged to address these gaps, offering features like transactionality, schema evolution, and time travel.

Databricks, Uber, and Netflix developed Delta, Hudi, and Iceberg respectively. However, Iceberg’s original specification struggled with the “small files” problem — frequent ingestion runs created numerous small files, increasing query latency and cost due to multiple read operations.

To address the challenges of managing small Parquet files (e.g., 1MB), organizations commonly use a process called compaction, which consolidates these files into larger ones (e.g., 512MB). Traditionally, this task is performed by compute engines like Apache Spark or embedded in query engines of databases such as Snowflake and Crunchy Data. These systems handle compaction during data processing, often adding complexity and resource demands.
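
To make this concrete, here is a minimal sketch of what engine-side compaction looks like today, using Apache Iceberg's built-in rewrite_data_files procedure from Spark. The catalog name ("my_catalog") and table ("db.events") are placeholders, and the Spark session is assumed to already be configured with an Iceberg catalog and the Iceberg SQL extensions.

```python
# Engine-side compaction with Apache Iceberg's rewrite_data_files procedure.
# Catalog and table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("iceberg-compaction").getOrCreate()

# Consolidate many small data files into ~512 MB files.
spark.sql("""
    CALL my_catalog.system.rewrite_data_files(
        table => 'db.events',
        options => map('target-file-size-bytes', '536870912')
    )
""")
```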

With S3 Tables, AWS redefines this approach by moving compaction to the storage layer. This fully managed solution streamlines workflows, reduces latency, and eliminates the need to orchestrate compaction tasks. Additionally, it automatically detects and handles orphan files, ensuring data consistency and performance. S3 Tables integrates seamlessly with any compute engine that supports Iceberg tables.

S3 Table storage incurs costs approximately 15% higher than S3 Standard, along with additional fees for object monitoring and compaction — tasks users would otherwise need to perform. Customers retain the option to use standard S3 buckets.

AWS’s implementation of S3 Tables adheres to REST APIs. To enable safe commits to S3 Tables, AWS introduced a low-level catalog called the S3 Tables Catalog. The Amazon S3 Tables Catalog for Apache Iceberg is an open-source library hosted by AWS Labs, distributed as a client-side JAR file.

Users can access S3 tables from open-source query engines by using the Amazon S3 Tables Catalog for Apache Iceberg client catalog. It works by translating Apache Iceberg operations in your query engines (such as table discovery, metadata updates, and adding or removing tables) into S3 Tables API operations.
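
As an illustration, the snippet below sketches how that client-side catalog might be wired into a Spark session. The package coordinates, class names, and table-bucket ARN follow the pattern published in the AWS Labs repository, but treat them as assumptions and verify them against the current S3 Tables documentation.

```python
# Sketch: registering the Amazon S3 Tables Catalog for Apache Iceberg in Spark.
# Package version, catalog name, and the table-bucket ARN are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("s3-tables-demo")
    # Client-side JAR published by AWS Labs (version is illustrative).
    .config("spark.jars.packages",
            "software.amazon.s3tables:s3-tables-catalog-for-iceberg-runtime:0.1.3")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    # Register an Iceberg catalog backed by the S3 Tables catalog implementation.
    .config("spark.sql.catalog.s3tables", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.s3tables.catalog-impl",
            "software.amazon.s3tables.iceberg.S3TablesCatalog")
    .config("spark.sql.catalog.s3tables.warehouse",
            "arn:aws:s3tables:us-east-1:111122223333:bucket/my-table-bucket")
    .getOrCreate()
)

spark.sql("SELECT * FROM s3tables.mynamespace.mytable LIMIT 10").show()
```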

Optionally, users can also connect to S3 Tables through the SageMaker Lakehouse, which is covered next.

SageMaker Lakehouse & Iceberg REST API

Compute engines need to efficiently discover Parquet files, their contents, and associated partitions. Apache Hive, though pioneering in its time, fell short by requiring users to manually define and manage partition details, making it cumbersome for dynamic or large-scale workloads.

Modern Iceberg REST Catalogs (IRC) overcome these limitations by abstracting file details and offering a standardized REST API interface, enabling any supported compute engine to seamlessly interact with and manipulate the files. This shift signifies a critical evolution: the center of gravity in data processing has moved from the storage layer to the metadata catalog or metastore, which now acts as the backbone for efficient file management and querying.

Similar to Snowflake’s Polaris and Databricks’ Unity Catalog, AWS offers SageMaker Lakehouse as its technical metadata catalog. In addition to existing Glue Data Catalog APIs, SageMaker Lakehouse provides Iceberg REST APIs, enabling connectivity for any compute engine supporting Iceberg REST specifications, including AWS services like Amazon EMR and Athena, as well as third-party tools.
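
For third-party engines, connecting over the Iceberg REST specification looks roughly like the PyIceberg sketch below. The endpoint URL, warehouse identifier, and SigV4 signing properties mirror the pattern used for AWS Glue's Iceberg REST interface; treat them as assumptions and confirm the exact values in the SageMaker Lakehouse documentation.

```python
# Sketch: connecting to an Iceberg REST catalog (IRC) endpoint with PyIceberg.
# Endpoint, region, and warehouse (account/catalog ID) are placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "https://glue.us-east-1.amazonaws.com/iceberg",  # placeholder endpoint
        "warehouse": "111122223333",                             # AWS account / catalog ID
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "glue",
        "rest.signing-region": "us-east-1",
    },
)

# Any IRC-aware engine can now discover and read tables through the catalog.
table = catalog.load_table("mynamespace.mytable")
print(table.schema())
```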

Amazon Redshift

Amazon Redshift was the OG cloud data warehouse when it was introduced at re:Invent 2012. A constant stream of updates has modernized Redshift over the last few years, including more than 100 features and enhancements in 2024 alone. Over the past few years, Redshift decoupled compute from storage and added data sharing and multi-warehouse writes, culminating in 2024 with its integration with SageMaker Lakehouse.

Now, native Redshift Managed Storage data can be “published” in the Lakehouse as Iceberg tables, which allows any other Redshift compute cluster or third-party analytical engine to query this data. This creates an entry in the SageMaker Lakehouse (based on the Glue Data Catalog, as mentioned earlier). Redshift compute (Serverless or provisioned clusters) automatically mounts the catalogs that have been published in the SageMaker Lakehouse.

Redshift’s analytics-optimized managed storage, query engine, and C++ implementation result in efficient data reads and writes. As a result, EMR Spark on Redshift Managed Storage delivers 3.5x faster small writes and 50% faster reads compared to Spark on native Iceberg tables. Redshift has also upgraded its serverless compute engine from Apache Impala to the native engine.

Data sources from operational databases, streaming services, and applications can be ingested into SageMaker Lakehouse and stored in native Redshift Managed Storage. In 2023, AWS introduced a number of zero-ETL integrations from operational databases like Aurora and RDS. This year, some of the new zero-ETL announcements with SageMaker Lakehouse as the destination include:

  • Amazon DynamoDB: the zero-ETL integration is now GA, so DynamoDB’s operational data can be analyzed in SageMaker Lakehouse without building custom pipelines.
  • Enterprise applications: ingest data from applications like Salesforce, ServiceNow, SAP, Zendesk, Facebook Ads, Instagram Ads, and Zoho CRM directly into Amazon SageMaker Lakehouse and Amazon Redshift via AWS Glue.
  • Observability and security data: via integration with Amazon CloudWatch and Amazon OpenSearch.

The lakehouse supports querying data in place via federated queries. The recently announced SageMaker Lakehouse Federation includes six connectors, such as Google BigQuery and Snowflake. These complement the mid-2024 general availability (GA) of the integration with Salesforce Data Cloud, enabling bidirectional data sharing between Salesforce and customer data lakes.

Data and AI Governance

A key component of SageMaker data and AI governance is SageMaker Catalog, which is built on Amazon DataZone. This new capability simplifies how organizations discover, govern, and collaborate on data and AI initiatives across their data lakehouse, AI models, and applications. It uses Amazon Q to build a business metadata catalog.

Amazon Q, an AI-powered business assistant, can automatically generate metadata, making it easier for business users to understand and find relevant data. This bridges the gap between technical metadata and business terminology. In addition, it combines the technical governance capabilities of DataZone with the business-friendly features of Amazon Q to enable a unified and accessible governance framework. Organizations can define and enforce access policies consistently using a single permission model with fine-grained access controls.

SageMaker Unified Studio

While AWS emphasizes user choice, it has also adopted a data fabric approach, abstracting multiple engines through a single interface: SageMaker Unified Studio. As shown in Figure 1, Unified Studio acts as the experience layer to help customers build their data fabric through concepts such as domains, data products, and the unification of customer data across operational systems, data lakes, data warehouses and federated data sources. It integrates seven services, four of which are currently available:

  • Data processing using EMR, Glue and Athena
  • Analytics using Redshift
  • Model development using SageMaker
  • Gen AI app development including Bedrock IDE

The other three services (streaming, BI, and search) will be added in the future. Unified Studio is in preview, and each service is available through the unified interface.

SageMaker Unified Studio provides a comprehensive web-based integrated development environment (IDE) for the entire machine learning lifecycle. It streamlines workflows for data preparation, model building, training, deployment, and monitoring, all within a single interface. Notably, SageMaker Unified Studio now integrates the Bedrock IDE, enabling developers to seamlessly build generative AI applications directly within the familiar Studio environment, further simplifying the process of working with large language models and other generative AI technologies.

Operational Databases

AWS has several purpose-built databases supporting various relational (RDS, Aurora, and Redshift) and nonrelational (DynamoDB, Neptune, DocumentDB, Keyspaces, Timestream, Valkey) use cases. The era of launching a brand-new database at every re:Invent is history, but the existing databases are undergoing massive upgrades. Some of the changes pertain to better optimization and the use of new instance types that provide better price performance. Other updates are more foundational, such as Aurora DSQL, which we look at next.

Aurora Distributed SQL (DSQL)

Aurora is AWS’s optimized RDBMS with MySQL and PostgreSQL compatibility, but it uses its own storage layer to provide higher performance, scalability, and reliability than the open-source implementations in the Relational Database Service (RDS).

Aurora Serverless v2 previously scaled its compute down only to 0.25 ACUs (Aurora Capacity Units), leading many to question its use of the term serverless. In 2024, AWS introduced scaling to zero, which releases all compute resources when they are not being used.

In 2023, Aurora Limitless added horizontal sharding of data across multiple nodes so that multiple writers could write to a single database in parallel. It has a single serverless endpoint that routes requests to three types of tables: standard PostgreSQL, sharded (distributed across shards), and reference (replicated on every shard for faster joins). This capability went GA in 2024.

However, this year witnessed Aurora’s biggest release — Aurora DSQL (preview). It combines the serverless and distributed features and provides a multi-region active-active highly available database with PostgreSQL compatibility. In its initial release, it doesn’t support all the PostgreSQL commands but those will be added in subsequent releases.

To achieve low-latency, multi-region strong consistency, Aurora DSQL decouples transaction processing from the storage layer. It uses the Amazon Time Sync Service, which adds hardware reference clocks to every EC2 instance and synchronizes them to satellite-connected atomic clocks to provide microsecond-level time accuracy. Aurora DSQL is available in two formats:

  • Single Region: Requires a minimum of 3 availability zones (2 must be active and 1 witness)
  • Multi-Region: Requires a minimum of 3 regions (2 must be active and 1 witness)
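
Because DSQL speaks the PostgreSQL wire protocol, connecting from an application is unremarkable by design. The sketch below uses a standard PostgreSQL driver; the endpoint format is hypothetical, and DSQL authenticates with short-lived IAM tokens (generated via the AWS SDK or CLI) rather than static passwords.

```python
# Minimal connection sketch illustrating Aurora DSQL's PostgreSQL compatibility.
import os
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.dsql.us-east-1.on.aws",   # hypothetical cluster endpoint
    dbname="postgres",
    user="admin",
    password=os.environ["DSQL_AUTH_TOKEN"],    # short-lived IAM auth token
    sslmode="require",
)

with conn.cursor() as cur:
    cur.execute("SELECT now()")
    print(cur.fetchone())
```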

Aurora is brimming with new features requiring its own blog, but here are some key highlights:

  • pgvector, the extension that stores and searches vector embeddings, has been enhanced to support distributed options. The code used to distribute pgvector has been shared with the community, and AWS reported 8x performance gains over standard pgvector (a basic usage sketch follows this list).
  • Bedrock integration
  • Optimizations such as storing the buffer pool cache on NVMe, storage checkpoints, and caching of dirty pages.
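
For context, here is a generic pgvector sketch (not specific to Aurora's distributed implementation) showing how embeddings are stored and searched; the table, column, and connection string are illustrative.

```python
# Generic pgvector usage: store embeddings and run a nearest-neighbor search.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")  # placeholder DSN
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS documents (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(3)   -- real embeddings are typically 768-1536 dims
        )
    """)
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        ("hello world", "[0.1, 0.2, 0.3]"),
    )
    # Cosine-distance nearest neighbors for a query embedding.
    cur.execute(
        "SELECT content FROM documents ORDER BY embedding <=> %s LIMIT 5",
        ("[0.1, 0.2, 0.25]",),
    )
    print(cur.fetchall())
```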

Amazon Neptune GraphRAG

Graph databases’ biggest challenge is user adoption. Although the graph data model is highly intuitive, users have been hesitant to adopt it at scale. So, the Neptune team at AWS embarked on a journey to find ways to give users the advantages of graphs without having to use them directly. That strategy shows up in embedding graphs into RAG flows, aka GraphRAG.

GraphRAG helps increase the accuracy of RAG workflows by 5–15%. It performs the normal chunking and embedding, but instead of just storing the embeddings in a vector store, it also creates a lexical graph of terms. The user performs a vector search, and GraphRAG then traverses the graph and reranks the results. For example, an AWS client was using RAG on emails to check for compliance violations, but this approach was not adequate when looking across multiple emails. With GraphRAG, the client is now able to create a graph that spans multiple emails.
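
The retrieval pattern is easier to see in code. The toy sketch below is purely illustrative, not Neptune's or Bedrock's API: a vector search picks seed chunks, the lexical graph pulls in related chunks (for example, other emails mentioning the same entities), and the expanded set is reranked.

```python
# Toy illustration of the GraphRAG retrieval pattern (not an AWS API).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Chunk embeddings (toy 3-dim vectors) and a lexical graph of related chunks.
embeddings = {
    "email_1": [0.9, 0.1, 0.0],
    "email_2": [0.1, 0.9, 0.0],
    "email_3": [0.2, 0.8, 0.1],
}
graph = {  # edges connect chunks that mention the same entities
    "email_1": ["email_3"],
    "email_2": ["email_3"],
    "email_3": ["email_1", "email_2"],
}

def graph_rag_retrieve(query_vec, top_k=1):
    # Step 1: plain vector search for seed chunks.
    seeds = sorted(embeddings, key=lambda c: cosine(query_vec, embeddings[c]),
                   reverse=True)[:top_k]
    # Step 2: expand via graph neighbors so related chunks are pulled in.
    candidates = set(seeds)
    for seed in seeds:
        candidates.update(graph.get(seed, []))
    # Step 3: rerank the expanded set (here, simply by similarity again).
    return sorted(candidates, key=lambda c: cosine(query_vec, embeddings[c]),
                  reverse=True)

print(graph_rag_retrieve([0.85, 0.15, 0.0]))
```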

Behind the scenes, Neptune Analytics uses an in-memory property graph. This feature is available through Amazon Bedrock Knowledge Bases, which is covered later in this document.

Amazon DynamoDB

Before we end this section, it is important to note that Aurora DSQL isn’t the only distributed offering this year. Amazon DynamoDB Global Tables now also supports multi-region strong consistency (in preview). Interestingly, it uses the same technology as Aurora DSQL and provides five nines (99.999%) of availability.

Vector Search

Vector search capabilities are now available across many different AWS databases, as shown in Figure 4.

Figure 4: AWS is enabling vector search in a comprehensive suite of tools and services.

AI Infrastructure

Although the generative AI space is only two years old, it has reached the limits of scaling laws. Even more data, synthetic or real, is not able to make models more accurate and reliable. The next breakthrough is going to come from “post-training” contextually rich models and execution through smart agents. AWS is addressing these needs through a combination of advancements in compute, models capable of reasoning, and agentic capabilities.

Hardware and instances

AWS deepened its relationship with NVIDIA by announcing a new Blackwell GPU-based EC2 instance, called P6, with an incredible six nines of availability. It will be available in early 2025.

Meanwhile, Graviton4 went GA. It was also incredible to learn that the amount of compute running on Graviton chips now exceeds all of AWS’s compute workloads in 2019.

AWS showcased the new Trainium2 chips for training and inference workloads, claiming they are 70% more efficient than equivalent Nvidia GPUs. Amazon EC2 Trn2 instances are powered by 16 Trainium2 chips and offer 30–40% better price performance than the current generation of GPU-based EC2 P5e and P5en instances.

Trn2 UltraServers use NeuronLink, AWS’s proprietary chip-to-chip interconnect, to connect 64 Trainium2 chips across four Trn2 instances. Trn2 UltraServers are being used by Anthropic to train Claude models. AWS is already thinking ahead: the 3 nm Trainium3 was also announced.

Frontier Models

AWS announced a family of six Nova models, available through Amazon Bedrock and covering a wide range of text and multi-modal use cases. These models balance affordability and performance; AWS claims they are 75% more cost-effective than similar models. The Amazon Nova models supersede the older Titan models and are a complete re-architecture.

The family includes four text models:

  • Micro: Focused on text input and output. It supports 200 languages and is the fastest of all the Nova models, with a 128K-token context window.
  • Lite: Handles text, image, and video inputs, with a larger 300K-token context window.
  • Pro: Offers higher accuracy on tasks like Q&A, summarization, and document analysis, also with a 300K-token context window.
  • Premier: Built for complex tasks like acting as a teacher model to train smaller models. Amazon plans to increase the context window to 2 million tokens in the future.

And two creative models:

  • Canvas: Used for image generation and editing
  • Reel: Used for video generation. The realistic videos are currently up to 30 seconds long, but AWS plans to increase the length to 2 minutes in the future.

These models include watermarking and content moderation features. Although AWS doesn’t disclose the training data sets used, it offers an indemnity policy to protect users against copyright claims. These are just the first of many new models; AWS plans to release a speech-to-speech model in early 2025.

AI Development

Through the AWS Generative AI Innovation Center (GenAIIC), AWS aims to help businesses accelerate their Gen AI initiatives. Its major announcements pertain to Amazon Bedrock, Q Developer and Q Business.

Amazon Bedrock

Amazon Bedrock provides fully managed access to foundation models from AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API. It allows users to build generative AI applications with security, privacy, and responsible AI.

Amazon Bedrock is geared toward higher-level AI tasks than Amazon SageMaker, the data-scientist-focused ML development tool covered in the next section. Some of the new announcements include:

  • Intelligent Prompt Routing and Prompt Caching: Bedrock can now route each request to the right model within a family of models, helping reduce costs by up to 30%. In addition, Bedrock caches repeated prompts, further reducing token costs by up to 90% and latency by up to 85% (a minimal invocation sketch follows this list).
  • Model Distillation: While the world is focused on building larger models, AWS is promoting smaller, customized models that serve organizational needs with accurate predictions. In 2025, we will see a rise of “post-training” small language models (SLMs) over “fine-tuning” large models. Amazon Bedrock Model Distillation enables large teacher models to train efficient smaller models. For example, Nvidia’s 4-billion-parameter Llama 3.1 Minitron model was distilled from a larger Meta Llama 3.1 model.
  • Automated Reasoning: Hallucinations are the biggest roadblock to the adoption of generative AI apps. Automated Reasoning uses mathematical validation to prevent factual inaccuracies; it detects errors and suggests corrections. This verification helps increase trust in generative AI apps.
  • Guardrails: This capability provides responsible AI safeguards, including when models are fine-tuned.
  • Multi-agent orchestration: As 2025 gears up to be the year of agents, an orchestrator is needed to chain agents performing specific tasks by sharing state, memory and context. Q Developer is an example of an agent and is covered below.
  • Knowledge Bases support for GraphRAG: Knowledge Bases offers fully managed RAG workflows with citations. In addition to Amazon OpenSearch, developers building RAG applications can now choose Neptune Analytics as their vector store. Bedrock will automatically create vector embeddings in Amazon Neptune and a graph representation of entities and their relationships.
  • Knowledge Bases support for Amazon Redshift and Amazon SageMaker Lakehouse: Users can build fully managed RAG pipelines on structured data. Previously, RAG pipelines were limited to text data using OpenSearch. Now, you can select Redshift databases or the Glue Catalog as your data source. During the keynote, Swami Sivasubramanian highlighted key benefits, stating that this new feature “adjusts to your schema and data, it learns from your query patterns, and provides the customization options for enhanced accuracy.”
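
As a rough illustration of the prompt-routing item above, the sketch below calls Bedrock through the Converse API in boto3. Passing a prompt-router ARN as the model identifier is how Intelligent Prompt Routing is surfaced, but the ARN shown here is a placeholder; a regular foundation-model ID works in the same position.

```python
# Sketch: invoking Bedrock via the Converse API, optionally through a prompt router.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    # A foundation model ID works here too, e.g. an Amazon Nova model ID;
    # the prompt-router ARN below is a placeholder.
    modelId="arn:aws:bedrock:us-east-1:111122223333:default-prompt-router/example",
    messages=[{"role": "user", "content": [{"text": "Summarize our Q3 sales."}]}],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```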

Amazon Bedrock is being embedded across many other services inside AWS. For example, a Bedrock-based assistant is now used by the Schema Conversion Tool (SCT) inside the Database Migration Service (DMS) to achieve up to 90% automated conversions when the target is a PostgreSQL-compatible database. On a related note, AWS claims to have used DMS to migrate 1.45 million databases.

Amazon Q Developer

Q Developer is a generative AI-powered conversational assistant that helps developers build, manage, and operate applications. At AWS re:Invent 2024, Q Developer was pervasive, from being embedded in the AWS Management Console to appearing across many other AWS products. It doesn’t just simplify the software development life cycle; its capabilities include:

  • Helping developers with coding, unit testing and documentation tasks
  • Modernizing legacy Windows .NET, VMware, and mainframe code in a matter of days so it can run on EC2 Linux instances inside AWS.
  • Querying security logs using natural language interfaces. The logs may be produced natively by CloudWatch or by third parties like Wiz and Datadog.

Why should AI be left to developers only? AWS is not just for builders; it also aims to democratize AI for business teams.

Amazon Q Business

Q Business is a generative AI-powered assistant for finding information, gaining insight, and taking action at work. It indexes corporate data in places like Microsoft Office, Gmail, PDFs in SharePoint, and other sources.

Using natural language, employees can request information or assistance to generate content or create lightweight apps that automate workflows from web browsers, Amazon QuickSight, and applications like Slack and Microsoft Teams. For example, users can ask complex questions and receive quick, accurate, and relevant answers from documents, images, files, and other application data, as well as data stored in databases and data warehouses.

Organizations like Hearst are reporting a 70% reduction in the volume of requests for guidance and support from the various business units, thanks to a self-service assistant built on Amazon Q Business. The Hearst team is now able to focus on more strategic business initiatives rather than “repetitive, routine requests.”

ML Development

We have already discussed how Amazon SageMaker is becoming a one-stop shop for data, analytics and AI. In this section, we look at the announcements catering to its original role — as an essential tool for data scientists who are looking to train AI models.

As reported in the re:Invent 2023 announcements, SageMaker HyperPod technology speeds up model training through parallelization. This year, AWS introduced HyperPod Task Governance to allocate resources intelligently across multiple AI workloads. This is especially important for eliminating underutilized GPU resources.

SageMaker HyperPod Flexible Training Plans are designed to streamline and optimize model training timelines, resource utilization, and cost for generative AI workloads. They are built on EC2 Capacity Blocks to dynamically allocate resources that meet customers’ desired completion date, budget, and maximum compute resources for their AI training jobs.

Partner apps can now be deployed in SageMaker, allowing easy discovery, deployment, and use of AI development applications from AWS partners like Comet, Deepchecks, Fiddler AI, and Lakera directly within the platform.

As we have seen, Bedrock IDE is integrated into the SageMaker Unified Studio which enables end-to-end model development and generative AI application creation within a unified environment. We began this document by highlighting AWS’s efforts to unify user experiences across workloads and personas, and it’s fitting that we end on the same note.

It will be exciting to see how AWS continues to push the boundaries of innovation and integration at re:Invent 2025.


Written by Sanjeev Mohan

Sanjeev researches the space of data and analytics. Most recently he was a research vice president at Gartner. He is now a principal with SanjMo.
