Research Agent Dashboard
The Research Agent Dashboard at usesynth.ai/research provides a visual interface for creating, monitoring, and managing automated prompt optimization jobs.
Overview
Research Agents are autonomous systems that analyze your code and apply MIPRO optimization to improve prompt performance. The dashboard provides:
- Job Creation: Configure and launch optimization jobs
- Real-time Monitoring: Watch agent progress through optimization phases
- Artifact Inspection: View optimized prompts, diffs, and reports
- Job History: Track past optimizations and results
Creating a Job
1. Connect Your Repository
Click “New Job” to open the job kickoff form:
- Select Repository: Choose from your connected GitHub repositories
- Select Branch: Pick the branch containing your code
- Optional Commit: Pin to a specific commit or use latest
To connect GitHub repositories, you’ll need to install the Synth GitHub App on your account or organization.
2. Describe Your Task
Describe what you want to optimize:
Task Description:
Optimize the prompt for Banking77 intent classification.
The Banking77 dataset contains customer banking queries that need to be
classified into 77 intent categories (e.g., "card_arrival", "lost_or_stolen_card").
Goals:
1. Load the Banking77 dataset from /app/data/
2. Create a task app that evaluates classification prompts
3. Use MIPRO to optimize the system prompt for better accuracy
4. Save the best prompt and results to /app/artifacts/
3. Select Dataset
Choose a dataset source:
- HuggingFace: Select from public datasets (e.g., PolyAI/banking77)
- Upload: Upload your own JSONL files
- Inline: Paste small datasets directly
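For the Upload and Inline options, each line should be a standalone JSON record. A minimal sketch of what an uploaded JSONL file might look like for a Banking77-style task (the text/label field names here are illustrative, not a required schema):
{"text": "When will my new card arrive?", "label": "card_arrival"}
{"text": "I think my card was stolen.", "label": "lost_or_stolen_card"}
{"text": "Can I top up my account by bank transfer?", "label": "top_up_by_bank_transfer"}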
4. Optimization Settings
| Setting | Description | Recommended |
|---|---|---|
| Algorithm | Optimization method | MIPRO |
| Iterations | Number of optimization rounds | 10 |
| Primary Metric | What to optimize | accuracy |
5. Agent Settings
| Setting | Description | Recommended |
|---|---|---|
| Model | Agent reasoning model | gpt-5.1-codex-mini |
| Reasoning Effort | Thinking depth | medium |
| Max Agent Spend | Limit for agent LLM calls | $25 |
| Max Synth Spend | Limit for optimization | $150 |
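These settings map to the job configuration, which you can also view as TOML from the Config action (see Job Actions below). A hedged sketch of how such a configuration might look; the field names and structure here are assumptions for illustration, so copy the real TOML from the dashboard rather than hand-writing it:
# Illustrative job configuration sketch; field names are assumptions,
# not the authoritative schema. Use the Config view for the real TOML.
[job]
repository = "your-org/your-repo"
branch = "main"

[optimization]
algorithm = "MIPRO"
iterations = 10
primary_metric = "accuracy"

[agent]
model = "gpt-5.1-codex-mini"
reasoning_effort = "medium"
max_agent_spend_usd = 25
max_synth_spend_usd = 150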
6. Launch
Click Start Job to begin. You’ll see a job ID like ra_abc123def456.
Monitoring Progress
Job List
The main dashboard shows all your jobs with:
- Status: Queued, Running, Succeeded, Failed
- Progress: Current iteration and elapsed time
- Metric: Best achieved metric value
Filter jobs by status using the status tabs.
Job Detail Panel
Click a job to open the detail panel showing:
Agent Timeline
A visual timeline of optimization phases:
[✓] Sandbox Setup (00:00 - 00:15)
[✓] Repository Clone (00:15 - 00:23)
[✓] Code Analysis (00:23 - 01:02)
[→] MIPRO Optimization (01:02 - ...)
Iteration 3/10 - accuracy: 0.801
[ ] Artifact Export
[ ] Cleanup
Live Logs
Stream agent reasoning and actions in real time:
[Agent] Analyzing repository structure...
[Agent] Found DSPy pipeline in src/pipeline.py
[Agent] Setting up evaluation harness...
[Agent] Running baseline evaluation: 0.723
[Agent] Starting MIPRO optimization with 15 trials...
Training Jobs
If the agent spawns child training/evaluation jobs, they appear here with:
- Mini training dashboard
- Loss curves
- Metric progression
Viewing Results
Artifacts
When a job succeeds, click Artifacts to view:
| Artifact | Description |
|---|---|
| optimized_prompt.txt | The best-performing prompt |
| optimization_report.md | Detailed optimization report with metrics |
| changes.diff | Git diff of all code changes made |
| result.json | Structured results with metrics |
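As an example, result.json collects the headline numbers from the run. A hedged sketch of what it might contain, using values from the job shown above; the exact fields are illustrative, not a guaranteed schema:
{
  "job_id": "ra_abc123def456",
  "algorithm": "MIPRO",
  "primary_metric": "accuracy",
  "baseline_metric": 0.723,
  "best_metric": 0.801,
  "iterations_completed": 10
}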
Artifact Viewer
Click any artifact to open the inline viewer:
- Diffs: Syntax-highlighted with additions/deletions
- JSON: Pretty-printed and collapsible
- Text/Markdown: Rendered with formatting
Download
Click the download icon to save artifacts locally.
Job Actions
Cancel
Click Cancel on a running job to stop it. The agent will:
- Complete any in-progress operation
- Save partial results to artifacts
- Clean up the sandbox
Retry
For failed jobs, click Retry to start a new job with the same configuration.
View Configuration
Click Config to see the full job configuration as TOML, which you can copy for CLI use.
Best Practices
Task Descriptions
Write detailed task descriptions:
✓ Good:
"Optimize the prompt for Banking77 intent classification.
The dataset has 77 intent categories. Output must be EXACTLY
one of the category labels. Focus on improving accuracy on
edge cases like 'card_arrival' vs 'card_delivery'."
✗ Bad:
"Make classification better"
Iteration Count
| Dataset Size | Recommended Iterations |
|---|---|
| Small (<1K samples) | 5-10 |
| Medium (1K-10K) | 10-15 |
| Large (>10K) | 15-20 |
Spend Limits
Start with conservative limits and increase them if needed:
| Task Complexity | Agent Spend | Synth Spend |
|---|---|---|
| Simple (3-class) | $10 | $50 |
| Medium (10-50 class) | $25 | $150 |
| Complex (50+ class) | $50 | $300 |
Troubleshooting
Job Stuck in “Queued”
Jobs may queue during high demand. Typical queue time is < 5 minutes.
Job Failed During Setup
Check that:
- Repository is accessible
- Branch exists
- Required files are present
Low Optimization Improvement
Try:
- More detailed task description
- More iterations
- Higher reasoning effort
- Larger dataset sample
See Also