Overview

Debugging agents is frustrating. You need to find failing agent runs, identify the step(s) where the trajectory broke down, and isolate the cause. This slows you down while you’re developing your agent app, and it becomes untenable when you’re serving it in prod a thousand times a day.

Fortunately, Synth already does this work for you under the hood, and we’re releasing our Errors product as the best way to explore and act on this data.

How it Works

  • You upload agent logs to Synth. Synth applies standard algorithms to score individual trajectories and identify common failure modes across all available data. Those become error clusters.
  • Each error cluster comes with high-level information and a list of individual trajectories we believe demonstrate the problem. This information populates the Errors dashboard.
    • Paid customers can configure agents to review and update these errors offline to reflect their priorities.
    • All customers can configure Slack notifications to be alerted to errors as they’re detected.
  • To dig into an error, select its cluster. Its details and instances appear in the left panel, and the cluster is added to context in the AI panel on the right.
    • For queries that can be quickly answered just with the high-level data in the error cluster / instances, throw queries into the chat panel.
    • For queries that require more context - many instances, comparing clusters, or reviewing un-clustered traces - query our search agent, Cuvier. It comes equipped with vector search, SQL access to the underlying data, and plenty of compute to throw at the problem, so you can investigate error patterns without manually reviewing thousands of logs.

Errors will alert you to issues sooner and help you fix them faster.
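
To make the clustering step in How it Works more concrete, here is a minimal sketch of grouping failed trajectories by a shared failure signature. It is an illustration only and not Synth's actual algorithm: the Step and Trajectory types, the signature() normalization, and cluster_errors() are all hypothetical names invented for this example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical data model for this sketch; not Synth's log schema.
@dataclass
class Step:
    index: int
    ok: bool
    message: str = ""


@dataclass
class Trajectory:
    run_id: str
    steps: list[Step]


def first_failure(traj):
    """Return the first failing step, or None if every step succeeded."""
    return next((s for s in traj.steps if not s.ok), None)


def signature(message):
    """Normalize a message so similar failures share a key (digits become '#')."""
    return re.sub(r"\d+", "#", message.lower()).strip()


def cluster_errors(trajectories):
    """Group failing runs by signature, most common cluster first."""
    clusters = defaultdict(list)
    for traj in trajectories:
        step = first_failure(traj)
        if step is not None:
            # Record which run failed and at which step it broke down.
            clusters[signature(step.message)].append((traj.run_id, step.index))
    return sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True)


if __name__ == "__main__":
    runs = [
        Trajectory("a", [Step(0, True), Step(1, False, "Tool timeout after 30s")]),
        Trajectory("b", [Step(0, False, "tool timeout after 45s")]),
        Trajectory("c", [Step(0, True)]),
    ]
    for sig, instances in cluster_errors(runs):
        print(sig, instances)
```

A real system would likely rely on richer signals than string normalization (for example, embeddings of the failing step), but the output has the same shape as an error cluster: a high-level description plus the list of runs that demonstrate it.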

Key Features

Flexible Compute

  • Configure agents to ignore errors you’ve descoped and collect more data on priority errors you care about.
  • Deploy search agents to answer questions using the amount of compute you need - faster results on quick questions, thorough findings on deep dives.

Information-Dense User Interface

  • Many error clusters contain a vast amount of log data. Use Synth analyses and AI chat to quickly assess what’s worth looking at.
  • Quickly see the most common errors Synth has detected at a glance.

Process and Alerting

  • Set up Slack alerts to be notified about errors on a cadence that makes sense for you.

What’s in the Name?

Cuvier’s Beaked Whale is the deepest-diving mammal, and likely one of the most intelligent. We think it’s a fitting name!

Getting Started - Demo

For guidance on getting started with Synth, see:

Feedback

Cuvier is in public beta. Give us feedback and tell us what you want in the Slack.