What are Key AI Requirements and How Google Cloud is Paving the Way
The generative AI ecosystem continues to innovate at a frantic pace with new models and approaches being unleashed every day. Each new development underscores the importance of data to make generative AI meaningful and effective, and the challenges organizations face to access, manage, and activate all their data across multiple systems.
In 2024, organizations want to move beyond experimentation and prototypes and deploy AI workloads that generate significant value. The initial workloads have used AI as a copilot or an assistant. These use cases are valuable because they increase the productivity of many personas, from developers to customer support agents. Now the focus is shifting to deploying AI applications that can infer context, perform reasoning, and take actions.
In this blog, I preview the latest developments from Google Cloud pertaining to AI and its Data Cloud. I anticipate we’ll continue to see further developments at Google Cloud Next in April 2024 where I plan to dive deeper into new announcements and hear from customers on the ground.
So, let’s go.
Requirements and Strategies for Personalizing AI
One of the most important requirements is to have AI workloads leverage organizational data in a safe, cost-effective, and reliable manner. Wait. That was a very loaded statement! If we were to unpack it, then it translates into:
- Accuracy & Context. Gen AI is non-deterministic, and that is its kryptonite. The goal is to use existing deterministic approaches to improve the accuracy of probabilistic AI models and get the best of both worlds.
- Security & Governance. No organization wants to expose its crown jewels, aka proprietary data, to the outside world, for reasons ranging from regulatory compliance to customer trust and competitive advantage. Continuous monitoring of AI models and their usage in production is critical.
- Performance & Scale. If our AI workloads take on the order of seconds to return results, user adoption will suffer.
- Sustainable Cost. AI model training and inference perform better when they use specialized hardware, which is in short supply. This has driven up the cost of running AI workloads. If organizations are going to use AI at scale, then the cost needs to be sustainable.
Interestingly, all these requirements are ‘must-haves’, and we can’t trade off one for the sake of another. Hence, a few critical strategies are in order:
- Avoid moving data across multiple data and AI stacks. We don’t want to fall back into the traditional approach of separate data silos for different workloads and personas. Instead, we want a single copy of data and disaggregated compute that may use multiple approaches — SQL, PySpark, conversational natural-language interfaces, accelerated analytical query engines, Ray, etc.
- Diligently identify candidate use cases. Businesses don’t need to be sold on the potential of AI. But because AI is still maturing, they need to work closely with IT and run hackathons to identify the strategic use cases that will deliver the most significant business value.
- Select appropriate technology. Users have multiple choices: a DIY assembly of best-of-breed tools, or purpose-built, integrated solutions. This is the familiar tussle between ‘build’ and ‘buy’, and the choice depends on the outcomes of your hackathons. Also, technologies that provide AI capabilities in a serverless manner tend to be more cost-effective.
To enable these strategies, Google Cloud has enhanced its AI capabilities in both its operational and analytical offerings and across the ecosystem. Some of these new features are in preview mode, and readers should check the Google Cloud documentation for the latest updates. Let’s look at these one product at a time.
BigQuery
Google Cloud has introduced tighter integration between BigQuery and Vertex AI so that data engineers and analysts can directly call Gemini 1.0 Pro models and bring multimodal and advanced reasoning capabilities to their existing data. The models can be called via SQL statements or via Python through the embedded DataFrame API. This opens up a range of use cases, from text summarization to analyzing unstructured data, such as speech-to-text analysis of call-center recordings. The resulting data can further be leveraged in a retrieval-augmented generation (RAG) approach using BigQuery vector search to improve recommendations and remediate client requests faster.
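As a rough sketch of what calling a model from SQL looks like, the Python helper below assembles a BigQuery `ML.GENERATE_TEXT` query. The project, dataset, model, and table names are hypothetical, and the exact function signature should be verified against BigQuery's current documentation:

```python
# Hypothetical names throughout; the remote-model setup that exposes Gemini to
# BigQuery via Vertex AI is assumed to already exist in `my_dataset`.
def generate_text_sql(model: str, source_table: str, prompt_column: str,
                      temperature: float = 0.2) -> str:
    """Build an ML.GENERATE_TEXT query that runs a prompt column through the model."""
    return f"""
SELECT ml_generate_text_result
FROM ML.GENERATE_TEXT(
  MODEL `{model}`,
  (SELECT {prompt_column} AS prompt FROM `{source_table}`),
  STRUCT({temperature} AS temperature)
)""".strip()

# Example: summarize call-center transcripts stored in a (hypothetical) table.
sql = generate_text_sql("my_project.my_dataset.gemini_pro",
                        "my_project.support.call_transcripts",
                        "transcript")
print(sql)
```

The string would then be submitted through any BigQuery client; the point is that the model call lives inside the warehouse query itself.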
Importantly, data never leaves the data warehouse, so it remains secure and adheres to existing regulatory compliance guidelines. BigQuery ML text embeddings enable embedding generation and vector search using SQL commands, without data leaving your security perimeter.
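To make the idea of embedding-based retrieval concrete (this is a conceptual toy, not BigQuery's implementation), here is a minimal cosine-similarity search in plain Python; the vectors are hard-coded stand-ins for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": in BigQuery these would come from a text-embedding model.
corpus = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.3],
    "account security": [0.0, 0.2, 0.9],
}

def vector_search(query_vec, docs, top_k=1):
    """Return the top_k document keys ranked by cosine similarity to the query."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

print(vector_search([0.85, 0.15, 0.05], corpus))  # → ['refund policy']
```

A RAG pipeline feeds the retrieved documents back into the model's prompt as grounding context.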
On Google’s roadmap is the addition of the Gemini 1.0 Pro Vision model to analyze images and videos using SQL to generate descriptions, categorize them and annotate features.
AlloyDB
Not to be left behind, Google’s PostgreSQL-compatible transactional database, AlloyDB, introduced pgvector-based capabilities at Next ’23 and quickly moved AlloyDB AI from Preview to General Availability last week. AlloyDB AI adds enhancements that increase the speed of vector queries by 10x and quadruple the supported prompt size, raising the context window to 32K. Because operational databases typically store the majority of application data, they play a key role in how developers build AI-assisted user experiences.
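To illustrate the pgvector pattern that AlloyDB builds on, the sketch below assembles the relevant SQL as strings; the table and column names are hypothetical, while the `vector` type and the `<->` (L2 distance) operator follow pgvector's documented syntax:

```python
# Hypothetical schema: a documents table with a 768-dimensional embedding column.
ddl = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  body TEXT,
  embedding vector(768)
);
""".strip()

def nearest_neighbors_sql(k: int) -> str:
    """k-nearest-neighbor query; the %s placeholder takes the query embedding."""
    return ("SELECT id, body FROM documents "
            "ORDER BY embedding <-> %s LIMIT " + str(k))

print(ddl)
print(nearest_neighbors_sql(5))
```

At execution time a client such as psycopg would bind the query embedding to the placeholder, keeping the similarity search inside the operational database next to the application data.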
AlloyDB Omni allows the database to run on-premises and in other cloud providers. It benefits from AlloyDB AI going GA as gen AI apps can now be developed wherever your data resides.
Ecosystem
Two key developments are notable in the ecosystem category. First, Google Cloud is expanding vector search capabilities beyond BigQuery and AlloyDB, adding them across all Google Cloud databases — Spanner, Cloud SQL for MySQL, and Memorystore for Redis — by natively integrating Vertex AI. Firestore and Bigtable also gain vector search capabilities through integration with Vertex AI Vector Search.
Second, a robust ecosystem is needed to deliver and deploy AI workloads. The best platforms are modular ones that allow users to add specialized applications and models in a plug-and-play manner. Google Cloud’s Model Garden, for example, gives access to many open-source and proprietary models. It has also expanded its integration with LangChain for all Google Cloud databases, which will allow developers to create context-aware gen AI applications.
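As a rough, dependency-free sketch of the retrieve-then-generate flow that such a LangChain integration automates (the retriever and prompt template below are simplified stand-ins, not LangChain's actual API):

```python
import re

def tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, store: dict) -> str:
    # Stand-in retriever: pick the stored document sharing the most words with
    # the question. A real chain would use a vector store and embeddings.
    q = tokens(question)
    return max(store.values(), key=lambda doc: len(q & tokens(doc)))

def build_prompt(question: str, context: str) -> str:
    # Stand-in prompt template; a real chain would pass this to an LLM.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

store = {1: "Returns are accepted within 30 days of purchase.",
         2: "Standard shipping takes 5 business days."}

question = "How many days for returns?"
prompt = build_prompt(question, retrieve(question, store))
print(prompt)
```

The value of the database integrations is that the `retrieve` step runs where the application data already lives, so the context fed to the model is both current and governed.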
Conclusion
The worlds of data and AI are colliding and becoming indistinguishable. This is true for both the technology stack and its usage. At the stack level, the platforms for exploiting structured, semi-structured, and unstructured data are unifying. At the usage level, data developers are now performing tasks that were hitherto the domain of data scientists.
With the democratization of AI, more business users can now ask questions in natural language and extract insights. However, this raises the stakes, as business users expect fast and accurate responses from all their data assets in one place. Personalizing AI and delivering such use cases faster is critical for enterprises to succeed with their AI initiatives.
Enterprise use cases are often contextual to the proprietary data that enterprises hold. The ability to leverage a variety of LLMs, fine-tune them, and deploy them is crucial to the success of a specific use case. Google brings the ability to leverage both its own AI models and open-source models, along with best-in-class interfaces and physical GPU clusters for fine-tuning and deploying open-source and fine-tuned models. This enables development teams to bring use cases to production faster and more seamlessly.
New AI developments in Google Cloud databases are a positive step toward increasing the adoption of AI and delighting users. Google’s approach of bringing AI to wherever your data currently resides is the closest implementation of an Intelligent Data Platform.
I will report on further developments at Google Cloud Next in April 2024. So, stay tuned. We are living in historic times and it’s a great time to be alive. Data rules…