Beyond our pre-built Agent & RAG frameworks, Composo’s real power lies in writing custom criteria for any quality aspect you care about, and most teams do exactly this for their specific use cases.

What is Response Quality Evaluation?

Response quality evaluation assesses subjective and domain-specific aspects of assistant responses: tone, style, safety, adherence to guidelines, and any custom quality metric unique to your application.

Example Criteria

Core Quality Metrics

  • Conciseness: "Reward responses that are clear and direct, avoiding unnecessary verbosity, repetition, or extraneous details"
  • Information Structure: "Reward responses that present information in a logical, well-organized format that prioritizes the most important details"
  • Professional Tone: "Reward responses that maintain appropriate professional language and tone suitable for the context"
  • Actionable Guidance: "Reward responses that provide practical next steps or actionable recommendations when appropriate"

Safety & Compliance

  • Harmful Content: "Penalize responses that provide inappropriate advice (e.g., medical advice, harmful instructions) outside the system's intended scope"
  • System Compliance: "Penalize responses that violate explicit system constraints, limitations, or instructions"

Domain-Specific Examples

  • Healthcare: "Reward responses that use precise medical terminology appropriate for the audience (clinician vs patient)"
  • Customer Service: "Reward responses that express appropriate empathy when the user is frustrated"
  • Technical Support: "Reward responses that precisely adhere to the technical user manual's resolution steps"
  • Education: "Reward responses that adapt explanation complexity to match the user's learning level"

Writing Effective Criteria

Every criterion follows this simple template:

[Prefix] [quality] [qualifier (optional)]

  • Prefix: “Reward responses that…” or “Penalize responses that…”
  • Quality: The specific behavior you want to evaluate
  • Qualifier: Optional “if” statement for conditional application

Example: "Reward responses that provide code examples if the user asks for implementation details"

  • Prefix: “Reward responses that”
  • Quality: “provide code examples”
  • Qualifier: “if the user asks for implementation details”
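
The template can be sketched as a small helper that assembles a criterion string from its three parts. This is only an illustration: `build_criterion` and `PREFIXES` are hypothetical names, not part of any Composo SDK, and the only behavior assumed is plain string joining.

```python
from typing import Optional

# The two prefixes named in the template above.
PREFIXES = ("Reward responses that", "Penalize responses that")


def build_criterion(prefix: str, quality: str, qualifier: Optional[str] = None) -> str:
    """Join prefix, quality, and optional qualifier into one criterion string.

    Hypothetical helper for illustration; not a Composo API.
    """
    if prefix not in PREFIXES:
        raise ValueError(f"prefix must be one of {PREFIXES}")
    parts = [prefix, quality.strip()]
    if qualifier:
        # The qualifier is optional, so it is appended only when provided.
        parts.append(qualifier.strip())
    return " ".join(parts)


criterion = build_criterion(
    "Reward responses that",
    "provide code examples",
    "if the user asks for implementation details",
)
print(criterion)
# Reward responses that provide code examples if the user asks for implementation details
```

Keeping the parts separate until the final join makes it easy to reuse one quality with different qualifiers, or to drop the qualifier entirely.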

Key Principles

  • Be specific: Focus on one quality at a time
  • Use clear direction: Start with “Reward” or “Penalize”
  • Add qualifiers when needed: Use “appropriate” for non-monotonic qualities, where more is not always better (for example, empathy or level of detail)
  • Leverage domain expertise: Your knowledge of what “good” looks like is your secret weapon
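
Some of these principles can be checked mechanically before a criterion is used. The sketch below is a rough, hypothetical lint (the `check_criterion` function and its heuristics are assumptions made for illustration, not Composo functionality): it verifies the direction prefix and flags criteria that mix both directions, which usually means more than one quality is being evaluated.

```python
from typing import List


def check_criterion(criterion: str) -> List[str]:
    """Return a list of warnings for a criterion string (empty if none).

    Hypothetical lint for illustration; the heuristics are deliberately simple.
    """
    warnings: List[str] = []
    text = criterion.strip()

    # Principle: use clear direction by starting with "Reward" or "Penalize".
    if not text.startswith(("Reward", "Penalize")):
        warnings.append("start the criterion with 'Reward' or 'Penalize'")

    # Principle: one quality at a time. Mentioning both directions in a single
    # criterion is treated here as a sign that it should be split in two.
    lowered = text.lower()
    if "reward" in lowered and "penalize" in lowered:
        warnings.append("mixes 'Reward' and 'Penalize'; split into separate criteria")

    return warnings


good = "Reward responses that express appropriate empathy when the user is frustrated"
vague = "Responses should be concise"

print(check_criterion(good))
print(check_criterion(vague))
```

A check like this cannot judge whether a criterion is specific enough or reflects real domain expertise; those principles still need human review.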

Next Steps

📚 Browse our Criteria Library - Explore tried & tested criteria across domains for inspiration
✏️ How to Write Criteria Guide - Master the art of writing precise evaluation criteria