Google Cloud’s data agents promise to end the 80% toil problem plaguing enterprise data teams


Data doesn’t just magically appear in the right place for enterprise analytics or AI; it has to be prepared and routed through data pipelines. That’s the domain of data engineering, and it has long been one of the most thankless and tedious tasks enterprises have to deal with.

Today, Google Cloud is taking direct aim at the tedium of data preparation with the launch of a series of AI agents. The new agents span the entire data lifecycle. The Data Engineering Agent in BigQuery automates complex pipeline creation through natural language commands. A Data Science Agent transforms notebooks into intelligent workspaces that can autonomously perform machine learning workflows. The enhanced Conversational Analytics Agent now includes a Code Interpreter that handles advanced Python analytics for business users.

“When I think about who is doing data engineering today, it’s not just engineers; data analysts, data scientists, every data persona complains about how hard it is to find data, how hard it is to wrangle data, how hard it is to get access to high quality data,” Yasmeen Ahmad, managing director, data cloud at Google Cloud, told VentureBeat. “Most of the workflows that we hear about from our users are 80% mired in those toilsome jobs around data wrangling, data engineering and getting to good quality data they can work with.”


Targeting the data preparation bottleneck

Google built the Data Engineering Agent in BigQuery to create complex data pipelines through natural language prompts. Users can describe multi-step workflows and the agent handles the technical implementation. This includes ingesting data from cloud storage, applying transformations and performing quality checks.


The agent writes complex SQL and Python scripts automatically. It handles anomaly detection, schedules pipelines and troubleshoots failures. These tasks traditionally require significant engineering expertise and ongoing maintenance.

The agent breaks down natural language requests into multiple steps. First it understands the need to create connections to data sources. Then it creates appropriate table structures, loads data, identifies primary keys for joins, reasons over data quality issues and applies cleaning functions.
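The decomposition described above can be pictured as a simple planning step that turns one natural-language request into an ordered list of pipeline stages. The sketch below is purely illustrative: the `Pipeline` class, `plan_pipeline` function and step names are assumptions invented for this example, not BigQuery's actual Data Engineering Agent implementation.

```python
# Hypothetical sketch only: the Pipeline class and step names below are
# invented to illustrate the staged breakdown the article describes; they
# are not part of BigQuery's Data Engineering Agent.

from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Accumulates the stages planned for one natural-language request."""
    steps: list = field(default_factory=list)

    def add(self, step: str) -> None:
        self.steps.append(step)


def plan_pipeline(request: str) -> Pipeline:
    """Break a natural-language request into the ordered stages described
    in the article: connect, create tables, load, infer keys, check
    quality, clean."""
    pipeline = Pipeline()
    pipeline.add(f"connect: resolve data sources mentioned in {request!r}")
    pipeline.add("schema: create appropriate target table structures")
    pipeline.add("load: ingest data from cloud storage")
    pipeline.add("keys: identify primary keys for joins")
    pipeline.add("quality: reason over data quality issues")
    pipeline.add("clean: apply cleaning functions")
    return pipeline


plan = plan_pipeline("load daily sales files and join to the customer table")
```

The point of the staged plan is that each step remains inspectable, which matters for the engineer-in-the-loop workflow Ahmad describes below.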

“Ordinarily, that entire workflow would have meant a data engineer writing a lot of complex code, building this complex pipeline and then managing and iterating on that code over time,” Ahmad explained. “Now, with the data engineering agent, it can create new pipelines from natural language. It can modify existing pipelines. It can troubleshoot issues.”

How enterprise data teams will work with the data agents

Data engineers are often a very hands-on group of people. 

The various tools commonly used to build a data pipeline, covering data streaming, orchestration, quality and transformation, don’t go away with the new Data Engineering Agent.

“Engineers are still aware of those underlying tools, because what we see from how data people operate is, yes, they love the agent, and they actually see this agent as an expert partner and a collaborator,” Ahmad said. “But often our engineers actually want to see the code; they actually want to visually see the pipelines that have been created by these agents.”

As such, while the data engineering agents can work autonomously, data engineers can see exactly what the agent is doing. She explained that data professionals will often review the code written by the agent and then make additional suggestions to the agent to further adjust or customize the data pipeline.

Building a data agent ecosystem on an API foundation

There are multiple vendors in the data space that are building out agentic AI workflows.

Startups like Altimate AI are building out specific agents for data workflows. Large vendors including Databricks, Snowflake and Microsoft are all building out their own respective agentic AI technologies that can help data professionals as well.

The Google approach is a little different in that it is building out its agentic AI services for data with its Gemini Data Agents API. It’s an approach that can enable developers to embed Google’s natural language processing and code interpretation capabilities into their own applications. This represents a shift from closed, first-party tools to an extensible platform approach.

“Behind the scenes for all of these agents, they’re actually being built as a set of APIs,” Ahmad said. “With those API services, we increasingly intend to make those APIs available to our partners.”

The umbrella API service will publish foundational API services and agent APIs. Google has lighthouse preview programs where partners embed these APIs into their own interfaces, including notebook providers and ISV partners building data pipeline tools.
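A partner embedding pattern like the one Google describes might look roughly like the sketch below. Everything here is hypothetical: the `AgentClient` and `PartnerNotebook` classes, their methods and the endpoint URL are invented for illustration, since the actual Gemini Data Agents API surface is not documented in this article.

```python
# Hypothetical sketch of how an ISV partner might embed an agent API
# behind its own interface. AgentClient, PartnerNotebook and the endpoint
# are invented names; they do not reflect Google's real API.

class AgentClient:
    """Stand-in for an HTTP client that calls an agent API endpoint."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def generate_pipeline(self, prompt: str) -> dict:
        # A real client would POST the prompt to self.endpoint and receive
        # generated SQL/Python back; this stub returns a placeholder.
        return {"prompt": prompt, "sql": "-- generated SQL would appear here"}


class PartnerNotebook:
    """A notebook-style partner surface that wraps the agent client."""

    def __init__(self, client: AgentClient):
        self.client = client

    def run_cell(self, natural_language: str) -> str:
        result = self.client.generate_pipeline(natural_language)
        # Surface the generated code so engineers can inspect and adjust
        # it, matching the transparency workflow described above.
        return result["sql"]


notebook = PartnerNotebook(AgentClient("https://example.invalid/agent"))
code = notebook.run_cell("ingest events from cloud storage, dedupe, load")
```

The design choice worth noting is that the partner tool owns the user experience while the agent API supplies the natural-language-to-code capability, which is the extensibility Google is signaling with the umbrella API service.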

What it means for enterprise data teams

For enterprises looking to lead in AI-driven data operations, this announcement signals an acceleration toward autonomous data workflows. These capabilities could provide significant competitive advantages in time-to-insight and resource efficiency. Organizations should evaluate their current data team capacity and consider pilot programs for pipeline automation.

For enterprises planning later AI adoption, the integration of these capabilities into existing Google Cloud services changes the landscape. The infrastructure for advanced data agents becomes standard rather than premium. This shift potentially raises baseline expectations for data platform capabilities across the industry.

Organizations must balance the efficiency gains against the need for oversight and control. Google’s transparency approach may provide a middle ground, but data leaders should develop governance frameworks for autonomous agent operations before widespread deployment.

The emphasis on API availability indicates that custom agent development will become a competitive differentiator. Enterprises should consider how to leverage these foundational services to build domain-specific agents that address their unique business processes and data challenges.
