Evals API
Binary Evaluation
Rule-based assessment of LLM outputs
Binary Evaluation allows you to perform rule-based assessments of LLM outputs against specific criteria, resulting in a simple pass or fail outcome.
When to Use Binary Evaluation
Use Binary Evaluation when you need straightforward compliance checks, such as:
- Content moderation
- Policy compliance
- Ensuring responses meet specific safety guidelines
Example: Policy Compliance Check
Suppose your application must ensure that the assistant does not provide medical advice.
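The check itself is not shown here, so the following is only a minimal local sketch of what a pass/fail rule of this kind could look like. It is not the Evals API client: the names `BinaryResult`, `MEDICAL_ADVICE_PATTERNS`, and `check_no_medical_advice` are hypothetical, and the regular expressions are illustrative stand-ins for a real policy definition.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Hypothetical result shape mirroring the "Passed" and "Feedback" fields
# described in this document; not the actual Evals API response type.
@dataclass
class BinaryResult:
    passed: bool
    feedback: Optional[str] = None


# Illustrative patterns only; a real policy would need a far more careful rule set.
MEDICAL_ADVICE_PATTERNS = [
    r"\byou should take\b",
    r"\b\d+\s?mg\b",
    r"\bdosage\b",
]


def check_no_medical_advice(response: str) -> BinaryResult:
    """Pass if the response matches none of the medical-advice patterns."""
    for pattern in MEDICAL_ADVICE_PATTERNS:
        match = re.search(pattern, response, flags=re.IGNORECASE)
        if match:
            # Fail on the first matching rule and say which text triggered it.
            return BinaryResult(False, f"Matched disallowed pattern: {match.group(0)!r}")
    return BinaryResult(True)
```

Calling `check_no_medical_advice("You should take 200 mg of ibuprofen.")` under these assumptions yields a failing result with feedback naming the matched text, while a response that only recommends seeing a doctor passes.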
Interpreting the Results
- Passed: True if the response meets the criteria; False otherwise.
- Feedback: Optional detailed feedback explaining the evaluation outcome.
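One way an application might act on these two fields is sketched below. This is an assumption-laden illustration rather than documented behavior: `BinaryResult`, `enforce`, and `FALLBACK` are hypothetical names, and the result object simply mirrors the Passed and Feedback fields listed above.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical container for the two fields described above.
@dataclass
class BinaryResult:
    passed: bool
    feedback: Optional[str] = None


# Assumed fallback message; choose wording that suits your application.
FALLBACK = "I'm sorry, I can't help with that request."


def enforce(response: str, result: BinaryResult, fallback: str = FALLBACK) -> str:
    """Return the response if it passed; otherwise log the reason and return a fallback."""
    if result.passed:
        return response
    # Feedback is optional, so handle the case where none was provided.
    reason = result.feedback or "no feedback provided"
    print(f"Blocked response: {reason}")
    return fallback
```

Because the outcome is a single boolean, the enforcement path stays a simple branch; the feedback string is used only for logging or debugging.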
Binary Evaluation is an efficient way to enforce clear-cut rules within your application.