Evals API Overview
Superhuman-quality evaluation tools for complex LLM applications
The Composo Evals API provides superhuman-quality evaluations for your LLM applications. It uses proprietary, research-backed models that learn from expert feedback, replacing manual human assessment with scalable, automated evaluation tailored to your application's specific needs. These evaluation methods are designed to handle complex and subjective applications with precision.
Key Evaluation Endpoints
The Evals API offers three primary evaluation mechanisms:
- Continuous Evaluation: Use for fine-grained assessments based on custom, subjective criteria. Ideal for optimizing responses during development.
- Binary Evaluation: Use for simple pass/fail assessments against specific rules or policies. Well suited to content moderation and compliance checks.
- Context Verification: Use to ensure responses are accurate and faithful to the provided context, preventing misinformation. Essential for applications that rely on external knowledge bases.
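As a rough illustration of how the three mechanisms divide up the work, the sketch below maps a description of an evaluation need to one of them. It makes no API calls and is not part of the Composo client: the names `Mechanism`, `EvalNeed`, and `choose_mechanism` are hypothetical, and the order in which the checks are applied is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Mechanism(Enum):
    """The three evaluation mechanisms named in this overview."""
    CONTINUOUS = "continuous"
    BINARY = "binary"
    CONTEXT_VERIFICATION = "context_verification"


@dataclass
class EvalNeed:
    """Hypothetical description of what the caller wants to check."""
    # The response must stay faithful to context supplied alongside it
    # (for example, passages retrieved from a knowledge base).
    has_reference_context: bool = False
    # The check is a rule or policy that is either satisfied or not.
    is_pass_fail_rule: bool = False


def choose_mechanism(need: EvalNeed) -> Mechanism:
    """Pick a mechanism for the given need (assumed priority order)."""
    if need.has_reference_context:
        return Mechanism.CONTEXT_VERIFICATION
    if need.is_pass_fail_rule:
        return Mechanism.BINARY
    # Anything else is treated as a graded, subjective criterion.
    return Mechanism.CONTINUOUS
```

For example, a compliance check would be described as `EvalNeed(is_pass_fail_rule=True)` and mapped to `Mechanism.BINARY`, while a tone or empathy criterion with no hard rule falls through to `Mechanism.CONTINUOUS`.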
Getting Started
To use the Evals API:
- Create a Composo account
- Create an API key
- Run a quickstart example below
Why Choose Composo
Composo provides hyper-personalized evaluation that you can rely on:
- Simple Setup: Integrate our evaluation API with minimal code changes. Just three lines of code link your app.
- Scalable Solutions: Efficiently handle high volumes of data for real-time production monitoring and offline development testing.
- Actionable Metrics: Gain detailed insights across dimensions like relevance, adherence to guidelines, factual accuracy, and empathy.
- Adaptability and Continuous Learning: Our custom models learn from expert feedback and adapt over time, ensuring consistent quality aligned with evolving standards.
- Works with Any Application: Our solution supports any application, from chatbots to copilots, including complex setups like agents and retrieval-augmented generation (RAG).
- Industry-Leading Research: We go beyond standard evaluations, incorporating state-of-the-art hallucination detection and custom-trained models to deliver the best performance.
Get Started with Composo
Ready to take your LLM application evaluation to the next level? Sign up to create a Composo account and start using our evaluation tools today.
For any questions or to learn more about how Composo can support your needs, reach out to us at contact@composo.ai.