Rubric mode (default)
Use rubric when you can describe quality with criteria.
- Add criteria per‑task via tasks[].rubric and/or globally via default_rubric.
- Criteria arrays are merged: task criteria first, then defaults.
- Best for classification, extraction, and structured outputs.
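
The merge order described above can be sketched in a few lines of Python. This is an illustration only: the dictionary layout, the `input` key, and the helper name `merged_criteria` are assumptions made for the example, and only `tasks[].rubric` and `default_rubric` come from the text. Whether the real tool removes duplicate criteria is not stated, so the sketch keeps them.

```python
from typing import Any


def merged_criteria(task: dict[str, Any], config: dict[str, Any]) -> list[str]:
    """Return the task's own criteria followed by the global defaults."""
    # Task-level criteria come first, then defaults, matching the stated order.
    return list(task.get("rubric", [])) + list(config.get("default_rubric", []))


# Hypothetical configuration, written as a plain dict for illustration.
config = {
    "default_rubric": ["Output is valid JSON"],
    "tasks": [
        {"input": "Extract the invoice total.", "rubric": ["Total matches the invoice"]},
        {"input": "Classify the ticket."},  # no task rubric: defaults only
    ],
}

for task in config["tasks"]:
    print(merged_criteria(task, config))
```

A task without its own rubric is scored on the defaults alone, which is why a global `default_rubric` is a convenient place for checks that apply to every task.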
Contrastive mode
Use contrastive for open‑ended generation where “good” is about style or feel.
- Provide gold outputs that represent the target distribution.
- The judge compares candidates to gold examples and scores closeness.
- Best for writing style, creative text, image/video generation, and other subjective outputs.
- Text style matching: outlines in, reference essays out. See cookbooks/workflows/style-matching.
- Image style matching: prompts in, reference images out (base64 data URLs) with a VLM‑capable judge. See cookbooks/workflows/image-style-matching.
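
To make "scores closeness" concrete, the toy sketch below ranks candidates by how similar they are to a set of gold outputs. It is not the judge this tool uses: the actual judge is a model, and here character-level similarity from Python's standard `difflib` stands in for it purely to show the shape of the comparison. The function name `closeness` and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def closeness(candidate: str, gold_outputs: list[str]) -> float:
    """Toy stand-in for the judge: mean similarity to the gold outputs, in [0, 1]."""
    if not gold_outputs:
        raise ValueError("contrastive scoring needs at least one gold output")
    ratios = [SequenceMatcher(None, candidate, gold).ratio() for gold in gold_outputs]
    return sum(ratios) / len(ratios)


# Hypothetical gold outputs that represent the target style.
gold_outputs = [
    "The harbor woke slowly, gulls stitching the fog to the masts.",
    "Morning came in on the tide, gray and patient.",
]

# Hypothetical candidate outputs to score against the gold set.
candidates = {
    "a": "The harbor woke slowly, gulls threading the fog between the masts.",
    "b": "Q3 revenue increased by four percent year over year.",
}

# Rank candidates from closest to farthest from the gold examples.
ranked = sorted(candidates, key=lambda name: closeness(candidates[name], gold_outputs), reverse=True)
print(ranked)
```

The point of the sketch is the interface rather than the metric: there are no written criteria, only gold outputs and a score that rises as a candidate moves toward them.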
Gold examples mode
Use gold_examples when gold outputs should be shown as reference context rather than used for direct comparison.
- Gold outputs are passed to the judge as few‑shot references.
- Helpful when you want “match these patterns” but still rely on rubric‑like scoring.
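
One way to picture this mode is a judge prompt in which the gold outputs appear as reference context next to the scoring criteria. The sketch below assembles such a prompt; the wording of the prompt, the function name `build_judge_prompt`, and the sample data are all assumptions for illustration, not the tool's actual template.

```python
def build_judge_prompt(criteria: list[str], gold_outputs: list[str], candidate: str) -> str:
    """Assemble a judge prompt that shows gold outputs as few-shot references."""
    rubric = "\n".join(f"- {criterion}" for criterion in criteria)
    references = "\n\n".join(
        f"Reference {i}:\n{gold}" for i, gold in enumerate(gold_outputs, start=1)
    )
    return (
        "Score the candidate against each criterion.\n\n"
        f"Criteria:\n{rubric}\n\n"
        # The references are context for the judge, not a direct comparison target.
        f"Reference outputs (context only):\n{references}\n\n"
        f"Candidate:\n{candidate}\n"
    )


# Hypothetical inputs for illustration.
prompt = build_judge_prompt(
    criteria=["Uses a warm, concise tone", "Ends with a clear next step"],
    gold_outputs=["Thanks for flagging this. I've reset your token; try again now."],
    candidate="Your token was reset. Please retry.",
)
print(prompt)
```

Compared with contrastive mode, the score here still comes from the criteria; the gold outputs only show the judge what matching patterns look like.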