AWS re:Invent 2023 — Key Findings

Sanjeev Mohan
8 min read · Dec 3, 2023


Yet another AWS re:Invent has come and gone and made an indelible mark in the technology sector. There were so many fresh developments that the keynotes barely scratched the surface. In this blog, I cover the ones pertaining to data, analytics and AI — areas that I am most passionate about.

I am a veteran of conferences and am used to the craziness these events bring, but AWS re:Invent takes that craziness to new heights. From the jam-packed Expo Hall to oversubscribed sessions, there are lines rivaling Starbucks everywhere one turns. But this is also the best place for chance encounters with acquaintances.

So, without further delay, here are the key announcements that stood out to me. As with my other conference write-ups, I have not noted each feature's beta/preview/GA status, so please check the relevant AWS sources for the latest release information.

Compute

AWS continued to showcase its proficiency in silicon. Graviton4 is the fourth generation of its ARM-based processor line for cloud workloads in the line's five-year lifespan. AWS claims it is 30% faster than its predecessor and packs 96 cores, up from 64 in Graviton3. A new instance type, R8g, was announced for memory-intensive workloads to take advantage of it.

AWS has purpose-built processors for training (Trainium) and inference (Inferentia), both now in their second generation. The new Trn2 instances promise to cut the training time of large language models down to weeks.

Jensen Huang, CEO of Nvidia, made yet another keynote appearance and regaled us with massive GPU farms coming to an AWS data center near you. However, AWS is pursuing a dual path: one with Nvidia and the other with its own chips. For example, it offers a migration service for moving PyTorch apps from Nvidia GPUs to AWS's own accelerators. Also, Anthropic's CEO mentioned that the company will use Trainium and Inferentia processors for its AI models.

EC2 Capacity Blocks now allow users to reserve GPUs for periods of 1–14 days. This is great news for anyone who needs GPUs for ephemeral workloads, since GPUs have been in short supply. Capacity Block pricing is based on supply and demand.
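The workflow is to search for an available offering and then purchase it. Here is a minimal boto3 sketch of that flow; the instance type and duration are illustrative placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Search for an available Capacity Block offering. Durations run in
# 24-hour increments, from 1 to 14 days.
offerings = ec2.describe_capacity_block_offerings(
    InstanceType="p5.48xlarge",   # illustrative GPU instance type
    InstanceCount=1,
    CapacityDurationHours=48,     # a 2-day block
)

# Purchase the first offering returned.
offering = offerings["CapacityBlockOfferings"][0]
ec2.purchase_capacity_block(
    CapacityBlockOfferingId=offering["CapacityBlockOfferingId"],
    InstancePlatform="Linux/UNIX",
)
```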

Storage and Data Security

S3 Express One Zone is the newest storage class. It uses custom hardware to deliver 10x faster performance than S3 Standard, with request costs 50% lower. The tradeoff is that data lives in a single AZ, which users can now choose for the first time. Over the years, S3 has become the de facto standard for object storage, with key capabilities like strong consistency. Now, with single-digit millisecond access times, and coupled with table formats like Iceberg, this one announcement can significantly transform and simplify future data infrastructure.
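Under the hood, Express One Zone data lives in a new directory bucket type whose name embeds the chosen Availability Zone. A minimal boto3 sketch, with the AZ ID and bucket name as illustrative placeholders:

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# Directory buckets embed the chosen Availability Zone ID in their name.
bucket = "my-fast-bucket--use1-az5--x-s3"

s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={
        "Location": {"Type": "AvailabilityZone", "Name": "use1-az5"},
        "Bucket": {"Type": "Directory", "DataRedundancy": "SingleAvailabilityZone"},
    },
)

# Reads and writes use the familiar S3 API, just against the new class.
s3.put_object(Bucket=bucket, Key="hot/metrics.parquet", Body=b"...")
obj = s3.get_object(Bucket=bucket, Key="hot/metrics.parquet")
```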

S3 Access Grants is the get-out-of-IAM-role-jail card. IAM policies carry complex authorization logic and are hard to express. S3 Access Grants is a new security model for access to S3 objects that maps identities in directories like Active Directory (AD) to datasets in S3. It has a control plane and a data plane and is part of Trusted Identity Propagation. It simplifies attribute-based access control (ABAC) on S3 permissions by reducing the number of policies required.
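Conceptually, each grant pairs an identity with a location prefix and a permission level. A hedged boto3 sketch of creating one; the account ID, location ID, and grantee identifier are placeholders, and the location is assumed to be registered already:

```python
import boto3

s3control = boto3.client("s3control")
ACCOUNT = "111122223333"  # illustrative account ID

# Grant a directory user read access to one prefix of a registered
# Access Grants location, instead of authoring another IAM policy.
s3control.create_access_grant(
    AccountId=ACCOUNT,
    AccessGrantsLocationId="default",   # previously registered location
    AccessGrantsLocationConfiguration={"S3SubPrefix": "sales-reports/*"},
    Grantee={
        "GranteeType": "DIRECTORY_USER",  # identity from the corporate directory
        "GranteeIdentifier": "user-id-from-identity-center",  # illustrative
    },
    Permission="READ",
)
```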

Operational Databases

Amazon RDS for Db2 is the newest addition to the RDS family, which already includes Oracle, SQL Server, MySQL, PostgreSQL, and MariaDB. AWS also added Database Migration Service (DMS) support for Db2 LUW (Linux, Unix, and Windows) and for Db2 for z/OS on mainframes. Note, however, that DMS translates Java stored procedures, not COBOL stored procedures; presumably, one can use IBM's Granite LLM to translate COBOL to Java first. On a side note, Db2 is unique compared to other databases in that all its users are operating system users, not database users.
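Provisioning should look like any other RDS engine. A hedged boto3 sketch; the "db2-se" engine string and instance class are my assumptions, and Db2 on RDS is bring-your-own-license:

```python
import boto3

rds = boto3.client("rds")

# Db2 on RDS is BYOL; 'db2-se' (Standard Edition) vs. 'db2-ae'
# (Advanced Edition) engine strings are assumptions here.
rds.create_db_instance(
    DBInstanceIdentifier="db2-demo",
    Engine="db2-se",
    DBInstanceClass="db.m6i.large",   # illustrative instance class
    AllocatedStorage=100,             # GiB
    MasterUsername="db2admin",
    ManageMasterUserPassword=True,    # store the password in Secrets Manager
    LicenseModel="bring-your-own-license",
)
```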

A continuing trend is making various services serverless. The latest salvo is Amazon ElastiCache Serverless for Redis and Memcached, with microsecond response times.
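Creating a serverless cache reduces to naming it and picking an engine; there are no node types or shard counts to size. A minimal sketch, with the cache name as a placeholder:

```python
import boto3

elasticache = boto3.client("elasticache")

# One call; capacity scales automatically with the workload.
elasticache.create_serverless_cache(
    ServerlessCacheName="session-store",  # illustrative name
    Engine="redis",
)
```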

Amazon Aurora is the fastest-adopted AWS service. One of the biggest announcements in this category was Aurora Serverless Limitless Database. It automates horizontal sharding while maintaining vertical scaling at the partition level. Aurora remains a single-writer design, but it can now scale writes significantly.

Zero-ETL continues its march. Aurora MySQL to Redshift was introduced last year. Now the following sources have zero-ETL capabilities to Redshift:

  • Aurora PostgreSQL
  • RDS for MySQL
  • DynamoDB

In addition to Redshift, DynamoDB also has a zero-ETL capability to Amazon OpenSearch, which enables both lexical and semantic search over DynamoDB's data. AWS has also expanded zero-ETL to non-AWS properties like Salesforce Data Cloud.
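Mechanically, a zero-ETL integration is a single resource that ties a source database to a Redshift namespace, after which AWS manages replication. A hedged boto3 sketch using the RDS CreateIntegration API; all ARNs are placeholders:

```python
import boto3

rds = boto3.client("rds")

# One resource links the source database to the target Redshift
# namespace; AWS handles the ongoing replication.
rds.create_integration(
    IntegrationName="orders-to-redshift",
    SourceArn="arn:aws:rds:us-east-1:111122223333:cluster:orders-cluster",
    TargetArn=(
        "arn:aws:redshift-serverless:us-east-1:111122223333:"
        "namespace/11111111-2222-3333-4444-555555555555"
    ),
)
```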

Although the feature is called Zero-ETL, it is really just “EL”. There is minimal transformation of data as it lands in Redshift. However, once in Redshift, performance features like AutoMV can monitor query patterns and automatically create materialized views with incremental refresh.

Analytical Database

Redshift Serverless now runs on Graviton processors. It gains new settings to prevent runaway queries, a workload slider to trade off cost against performance, and the ability to sort on query predicates. Redshift Serverless AI Optimizations further automate workload management. Redshift already spins up a warm pool of nodes for concurrency scaling; it now adds two new scaling triggers: data volume and query complexity.

Redshift ML does not get vector support as yet, but it can now create a user-defined function that calls SageMaker JumpStart, an ML hub of foundation models with built-in functions for tasks like summarization, translation, and sentiment analysis.
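The pattern is to wrap a deployed SageMaker endpoint in a SQL function via Redshift ML's CREATE MODEL ... SAGEMAKER syntax. A hedged sketch driven through the Redshift Data API; the endpoint, IAM role, and workgroup names are placeholders:

```python
import boto3

rsd = boto3.client("redshift-data")

# Wrap a deployed SageMaker endpoint in a SQL user-defined function,
# then call it like any other scalar function.
create_model_sql = """
CREATE MODEL review_summarizer
FUNCTION generate_summary(varchar)
RETURNS varchar
SAGEMAKER 'jumpstart-summarization-endpoint'
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftMLRole';
"""

rsd.execute_statement(
    WorkgroupName="analytics-wg",   # Redshift Serverless workgroup
    Database="dev",
    Sql=create_model_sql,
)

# Later: SELECT generate_summary(review_text) FROM reviews LIMIT 10;
```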

Redshift Data Sharing now supports writes in addition to reads. It can also share third-party data via its integration with AWS Data Exchange.

Amazon Neptune Analytics is AWS's latest analytics database engine. It enhances AWS's property-graph and RDF-graph use cases. Neptune's traditional use cases were customer 360 identity, fraud, and security graphs, but knowledge graphs are now being leveraged for Gen AI applications and vector searches, and it has integrations with tools like LangChain. Neptune supports three query languages: openCypher, Gremlin, and SPARQL, but not yet the upcoming Graph Query Language (GQL) standard.
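For a flavor of the developer experience, here is a hedged sketch running an openCypher query through the NeptuneData API against a Neptune cluster endpoint; the endpoint and graph schema are placeholders, and Neptune Analytics itself exposes a separate API:

```python
import boto3

# The NeptuneData API talks to the cluster endpoint directly.
neptune = boto3.client(
    "neptunedata",
    endpoint_url="https://my-cluster.cluster-xxxx.us-east-1.neptune.amazonaws.com:8182",
)

# A small fraud-graph style query in openCypher.
result = neptune.execute_open_cypher_query(
    openCypherQuery=(
        "MATCH (a:Account)-[:SHARES_DEVICE]->(b:Account) "
        "WHERE a.flagged = true RETURN b.id LIMIT 10"
    )
)
print(result["results"])
```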

AI — LLMs / Vectors

Amazon SageMaker is used to build and deploy Gen AI apps. SageMaker HyperPod uses parallel training across resilient clusters to speed up model training, with features like automatic checkpointing and failover.
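A HyperPod cluster is described as a set of named instance groups bootstrapped by lifecycle scripts. A hedged boto3 sketch; the instance type, count, S3 paths, and IAM role are all placeholders:

```python
import boto3

sm = boto3.client("sagemaker")

# Instance groups define the cluster shape; lifecycle scripts in S3
# bootstrap each node as it joins.
sm.create_cluster(
    ClusterName="llm-training-pod",
    InstanceGroups=[
        {
            "InstanceGroupName": "workers",
            "InstanceType": "ml.trn1.32xlarge",   # illustrative
            "InstanceCount": 4,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://my-bucket/hyperpod-lifecycle/",
                "OnCreate": "on_create.sh",
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodRole",
        }
    ],
)
```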

The coolest word in AI is optionality. Hugging Face, at the time of writing, hosts 420K models! Amazon Bedrock provides a range of models from partners like Anthropic (Claude 2.1), Cohere, and Meta, besides its own Titan family of models. The Titan family has grown to become multimodal, and AWS also introduced an image-generation model with an invisible, tamper-proof watermark. Bedrock Model Evaluation helps with selecting the right model for your workload.
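Whichever model you pick, invocation goes through one runtime API; switching models largely means changing the modelId and request schema. A minimal sketch calling Claude 2.1, with the prompt and region as illustrative choices:

```python
import boto3
import json

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# Claude 2.1 uses a prompt/completion request schema.
body = json.dumps({
    "prompt": "\n\nHuman: Summarize the key re:Invent storage news.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = bedrock.invoke_model(modelId="anthropic.claude-v2:1", body=body)
print(json.loads(response["body"].read())["completion"])
```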

Knowledge Bases for Bedrock is used for retrieval-augmented generation (RAG). It fetches, chunks, and embeds data from Amazon S3. The embeddings can be stored in your choice of Amazon database, like Aurora, or externally in Pinecone, Redis, and, soon, MongoDB Atlas Vector Search.
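Once a knowledge base is built, a single call handles both retrieval and grounded generation. A hedged sketch against the Bedrock agent runtime; the knowledge base ID and question are placeholders:

```python
import boto3

agent_rt = boto3.client("bedrock-agent-runtime")

# One call retrieves relevant chunks and generates a grounded answer.
response = agent_rt.retrieve_and_generate(
    input={"text": "What is our refund policy for enterprise customers?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",  # placeholder ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2:1",
        },
    },
)
print(response["output"]["text"])
```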

Where are vector embeddings stored in AWS? Thus far:

  1. RDS and Aurora PostgreSQL editions — these benefit from the open-source pgvector extension (see the sketch after this list). RDS Optimized Reads allows the use of local NVMe-based SSDs in lieu of EBS to store vector embeddings, which AWS says accelerates vector searches by 20%.
  2. DocumentDB — this is the MongoDB compatible document store
  3. OpenSearch — this is the open-source fork of Elasticsearch
  4. MemoryDB for Redis — this in-memory key-value store provides millisecond access
  5. Amazon Neptune — this is a graph database

So far, there is no vector support for RDS for MySQL or Aurora MySQL.
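To make option 1 concrete, here is a hedged pgvector sketch against an RDS or Aurora PostgreSQL instance; the host, credentials, and toy 3-dimensional embeddings are placeholders:

```python
import psycopg2

conn = psycopg2.connect(
    host="mydb.cluster-xxxx.us-east-1.rds.amazonaws.com",  # placeholder
    dbname="postgres", user="postgres", password="...",
)
cur = conn.cursor()

# Enable pgvector and create a table with a vector column.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(3)   -- tiny dimension for illustration
    );
""")
cur.execute(
    "INSERT INTO docs (content, embedding) VALUES (%s, %s::vector)",
    ("hello re:Invent", "[0.1, 0.2, 0.3]"),
)

# Nearest-neighbor search by Euclidean distance (the <-> operator).
cur.execute(
    "SELECT content FROM docs ORDER BY embedding <-> %s::vector LIMIT 5",
    ("[0.1, 0.2, 0.25]",),
)
print(cur.fetchall())
conn.commit()
```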

AI — Applications

Amazon Q will probably go down as the most significant announcement of the event. Following in the footsteps of Google's Duet AI and Microsoft's Copilot, Q is a natural language assistant being embedded across AWS's entire portfolio, from Redshift to AWS Glue to applications like Amazon Connect (for contact centers). Q has been around for a long time in Amazon's BI tool, QuickSight; now, powered by generative AI, it has expanded its wings.

Amazon Q connects to about 40 sources, indexes the data, and captures semantics as vectors. It supports sources like Google Drive, Gmail, Slack, Amazon S3, and Office 365. Using agents, it can open tickets in tools like Jira and ServiceNow.

Amazon Q can dramatically change the nature of our work. For example, we can ask it to write SQL queries in Redshift or ETL pipelines in Glue. AWS demonstrated how it used Q to migrate 1,000 Java 8 apps to Java 17 in just two days, and announced that .NET-to-Linux migration capabilities are coming soon.

Amazon Q comes in two flavors: Business ($20/user/month) and Builder ($25/user/month). It is priced below Duet AI and Copilot, which cost $30/user/month.

AWS CodeWhisperer also writes code but, unlike Amazon Q, lacks context, security, privacy, and IP-safeguarding features. It is still used to write code for services like DynamoDB.

PartyRock reveals Amazon's fun side: an interactive playground for creating generative AI applications. No AWS account is required, and AWS claims tens of thousands of apps have already been created.

Governance and Operations

Amazon DataZone is used to discover, catalog, and govern data on AWS, on premises, and in third-party sources. Now, Amazon's Titan model automatically creates business descriptions for cataloged assets.

CloudWatch also gained new APIs that ease observability workflows.

Operations can be further simplified, as Amazon Q can pick the right instance type, troubleshoot network issues, create policies and firewall rules, and automatically open tickets in Jira and ServiceNow.

Conclusion

An estimated 65,000 people attended the event in Las Vegas from Nov 27 to Dec 1, 2023. The keynote speakers were informative and inspiring. The Analyst Summit hosted almost 150 of us analysts and was the most professionally run to date. Although we went nonstop with sessions and one-on-ones, we still could not cover the full gamut of announcements.

AWS's maturity is evident in two areas. One, it is no longer releasing a rampant number of new services, choosing instead to focus on improving performance, cost, ease of use, and reliability. Two, it is embracing the wider external ecosystem via connectors and integrations. On the flip side, its messaging is starting to make more direct and aggressive comparisons with the competition. Interestingly, while AWS is starting to support on-premises and other clouds, it still shuns the term multicloud!

Finally, I want to summarize the different patterns I learned for performing generative AI tasks:

  1. Natively store vector embeddings that a GenAI app can use (e.g. Aurora PostgreSQL)
  2. Replicate data to a service that provides vector search (e.g. DynamoDB to OpenSearch)
  3. Embed a user defined function to perform the needed task in an external service (e.g. Amazon Redshift ML SageMaker integration)
  4. Call an application that performs all the Gen AI lifecycle tasks (e.g. Amazon Q and Knowledge Bases for Bedrock)

My wish for the next re:Invent is to see a unification of the various data stores and the data catalog. For a conversational bot to deliver trusted responses, it needs to see every place a particular piece of data is stored and map it to a data catalog that has the business glossary and a semantic layer.

Thank you for reading all the way to the end. Please watch an informative and entertaining analysis of AWS re:Invent 2023 between three former Gartner analysts: Donald Feinberg, Merv Adrian, and Sanjeev Mohan. Your help in subscribing to the Medium and YouTube channels means a lot. Thank you.


Sanjeev Mohan

Sanjeev researches the space of data and analytics. Most recently he was a research vice president at Gartner. He is now a principal with SanjMo.