Research Agent Dashboard

The Research Agent Dashboard at usesynth.ai/research provides a visual interface for creating, monitoring, and managing automated prompt optimization jobs.

Overview

Research Agents are autonomous systems that analyze your code and apply MIPRO optimization to improve prompt performance. The dashboard provides:
  • Job Creation: Configure and launch optimization jobs
  • Real-time Monitoring: Watch agent progress through optimization phases
  • Artifact Inspection: View optimized prompts, diffs, and reports
  • Job History: Track past optimizations and results

Creating a Job

1. Connect Your Repository

Click “New Job” to open the job kickoff form:
  1. Select Repository: Choose from your connected GitHub repositories
  2. Select Branch: Pick the branch containing your code
  3. Optional Commit: Pin to a specific commit or use latest
To connect GitHub repositories, you’ll need to install the Synth GitHub App on your account or organization.

2. Configure the Task

Describe what you want to optimize:
Task Description:
Optimize the prompt for Banking77 intent classification.

The Banking77 dataset contains customer banking queries that need to be
classified into 77 intent categories (e.g., "card_arrival", "lost_or_stolen_card").

Goals:
1. Load the Banking77 dataset from /app/data/
2. Create a task app that evaluates classification prompts
3. Use MIPRO to optimize the system prompt for better accuracy
4. Save the best prompt and results to /app/artifacts/

3. Select Dataset

Choose a dataset source:
  • HuggingFace: Select from public datasets (e.g., PolyAI/banking77)
  • Upload: Upload your own JSONL files
  • Inline: Paste small datasets directly
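Before uploading a JSONL file, it can help to check that every line parses and carries the fields your task expects. The following is a minimal sketch, not part of the dashboard: the `validate_jsonl` helper and the `text` and `label` field names are assumptions chosen to match a classification dataset such as Banking77.

```python
import json


def validate_jsonl(lines, required_keys=("text", "label")):
    """Return (line_number, problem) pairs for lines that fail validation.

    The required keys are hypothetical; adjust them to your own schema.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are skipped rather than reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        missing = [key for key in required_keys if key not in record]
        if missing:
            problems.append((number, f"missing keys: {', '.join(missing)}"))
    return problems
```

To check a file on disk, pass the open file handle, for example `validate_jsonl(open("train.jsonl", encoding="utf-8"))`, where `train.jsonl` is a placeholder name. An empty result means every non-blank line passed.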

4. Configure Optimization

| Setting | Description | Recommended |
| --- | --- | --- |
| Algorithm | Optimization method | MIPRO |
| Iterations | Number of optimization rounds | 10 |
| Primary Metric | What to optimize | accuracy |
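For a classification task, accuracy is the fraction of predictions that match the reference label. As a rough sketch of what that metric measures, assuming an exact string match after trimming surrounding whitespace (the agent's actual evaluation harness may score differently):

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match their reference label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    # Assumption: whitespace around a prediction is not counted as an error.
    correct = sum(p.strip() == l for p, l in zip(predictions, labels))
    return correct / len(labels)
```

With three examples where the last prediction says `card_arrival` but the label is `card_delivery`, the function returns two thirds.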

5. Agent Settings

| Setting | Description | Recommended |
| --- | --- | --- |
| Model | Agent reasoning model | gpt-5.1-codex-mini |
| Reasoning Effort | Thinking depth | medium |
| Max Agent Spend | Limit for agent LLM calls | $25 |
| Max Synth Spend | Limit for optimization | $150 |

6. Launch

Click Start Job to begin. You’ll see a job ID like ra_abc123def456.

Monitoring Progress

Job List

The main dashboard shows all your jobs with:
  • Status: Queued, Running, Succeeded, Failed
  • Progress: Current iteration and elapsed time
  • Metric: Best achieved metric value
Filter jobs by status using the status tabs.

Job Detail Panel

Click a job to open the detail panel showing:

Agent Timeline

A visual timeline of optimization phases:
[✓] Sandbox Setup      (00:00 - 00:15)
[✓] Repository Clone   (00:15 - 00:23)
[✓] Code Analysis      (00:23 - 01:02)
[→] MIPRO Optimization (01:02 - ...)
    Iteration 3/10 - accuracy: 0.801
[ ] Artifact Export
[ ] Cleanup

Live Logs

Stream agent reasoning and actions in real-time:
[Agent] Analyzing repository structure...
[Agent] Found DSPy pipeline in src/pipeline.py
[Agent] Setting up evaluation harness...
[Agent] Running baseline evaluation: 0.723
[Agent] Starting MIPRO optimization with 15 trials...

Training Jobs

If the agent spawns child training/evaluation jobs, they appear here with:
  • Mini training dashboard
  • Loss curves
  • Metric progression

Viewing Results

Artifacts

When a job succeeds, click Artifacts to view:
| Artifact | Description |
| --- | --- |
| optimized_prompt.txt | The best-performing prompt |
| optimization_report.md | Detailed optimization report with metrics |
| changes.diff | Git diff of all code changes made |
| result.json | Structured results with metrics |

Artifact Viewer

Click any artifact to open the inline viewer:
  • Diffs: Syntax-highlighted with additions/deletions
  • JSON: Pretty-printed and collapsible
  • Text/Markdown: Rendered with formatting
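If you download `changes.diff` and want a quick summary outside the viewer, counting added and removed lines is straightforward. This is a minimal sketch that assumes the file is standard `git diff` output (each file section starts with a `diff --git` line and each hunk with `@@`); the `diff_stats` helper is illustrative and not part of any Synth tooling.

```python
def diff_stats(diff_text):
    """Count added and removed lines inside the hunks of a git diff."""
    added = removed = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section begins; headers follow
        elif line.startswith("@@"):
            in_hunk = True  # hunk header; content lines follow
        elif in_hunk and line.startswith("+"):
            added += 1
        elif in_hunk and line.startswith("-"):
            removed += 1
    return added, removed
```

Tracking whether the parser is inside a hunk keeps the `---` and `+++` file headers from being counted as changes.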

Download

Click the download icon to save artifacts locally.

Job Actions

Cancel

Click Cancel on a running job to stop it. The agent will:
  1. Complete any in-progress operation
  2. Save partial results to artifacts
  3. Clean up the sandbox

Retry

For failed jobs, click Retry to start a new job with the same configuration.

View Configuration

Click Config to see the full job configuration as TOML, which you can copy for CLI use.

Best Practices

Task Descriptions

Write detailed task descriptions that name the dataset, the required output format, and the cases to focus on:
✓ Good:
"Optimize the prompt for Banking77 intent classification.
The dataset has 77 intent categories. Output must be EXACTLY
one of the category labels. Focus on improving accuracy on
edge cases like 'card_arrival' vs 'card_delivery'."

✗ Bad:
"Make classification better"

Iteration Count

| Dataset Size | Recommended Iterations |
| --- | --- |
| Small (<1K samples) | 5-10 |
| Medium (1K-10K) | 10-15 |
| Large (>10K) | 15-20 |
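The size guidance above can be written as a small lookup. This is only a sketch of the table: the `recommended_iterations` name is hypothetical, and it treats exactly 1,000 and exactly 10,000 samples as "Medium", following the table's ranges.

```python
def recommended_iterations(num_samples):
    """Return the (low, high) iteration range suggested for a dataset size."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if num_samples < 1_000:
        return (5, 10)   # Small: fewer than 1K samples
    if num_samples <= 10_000:
        return (10, 15)  # Medium: 1K to 10K samples
    return (15, 20)      # Large: more than 10K samples
```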

Spend Limits

Start conservative and increase if needed:
| Task Complexity | Agent Spend | Synth Spend |
| --- | --- | --- |
| Simple (3-class) | $10 | $50 |
| Medium (10-50 class) | $25 | $150 |
| Complex (50+ class) | $50 | $300 |
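As a starting point, the spend guidance can be keyed off the number of classes. The table leaves some boundaries open, so this sketch makes two assumptions that you may want to change: tasks with fewer than 10 classes are treated as "Simple", and a task with exactly 50 classes is treated as "Medium". The `recommended_spend_limits` helper is hypothetical.

```python
def recommended_spend_limits(num_classes):
    """Return (agent_spend_usd, synth_spend_usd) for a classification task."""
    if num_classes < 2:
        raise ValueError("a classification task needs at least 2 classes")
    if num_classes < 10:
        return (10, 50)    # Simple (assumed: fewer than 10 classes)
    if num_classes <= 50:
        return (25, 150)   # Medium (assumed: 10 to 50 classes inclusive)
    return (50, 300)       # Complex: more than 50 classes
```

Under these assumptions, a 77-class task such as Banking77 falls into the "Complex" row.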

Troubleshooting

Job Stuck in “Queued”

Jobs may queue during periods of high demand. Typical queue time is under 5 minutes.

Job Failed During Setup

Check that:
  • The repository is accessible to the Synth GitHub App
  • The selected branch (and pinned commit, if any) exists
  • The files your task description refers to are present

Low Optimization Improvement

Try:
  • More detailed task description
  • More iterations
  • Higher reasoning effort
  • Larger dataset sample
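To judge whether an improvement is actually low, compare the best metric against the baseline reported in the logs. A minimal sketch, using the sample values from this guide (baseline 0.723, best so far 0.801), which works out to roughly 0.078 absolute and about 10.8% relative; the `improvement` helper is illustrative only:

```python
def improvement(baseline, optimized):
    """Return (absolute, relative) improvement of a metric over its baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute relative change")
    absolute = optimized - baseline
    relative = absolute / baseline
    return absolute, relative


absolute, relative = improvement(0.723, 0.801)
print(f"absolute: {absolute:.3f}, relative: {relative:.1%}")
```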
