Evals API
Binary Evaluation
Rule-based assessment of LLM outputs
Binary Evaluation allows you to perform rule-based assessments of LLM outputs against specific criteria, resulting in a simple pass or fail outcome.
When to Use Binary Evaluation
Use Binary Evaluation when you need straightforward compliance checks, such as:
- Strict adherence to safety guidelines
- Assessing mathematical correctness
Example: Policy Compliance Check
Suppose your application must ensure that the assistant does not provide medical advice.
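The actual Evals API call for this check is not shown in this section, so the following is only a minimal local sketch of the idea, not the API itself. It assumes a hypothetical `BinaryResult` type that mirrors the Passed and Explanation fields described below, and a hypothetical keyword list standing in for the real evaluation criteria.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BinaryResult:
    """Hypothetical result shape: a pass/fail flag plus an explanation."""
    passed: Optional[bool]  # True, False, or None when the criterion does not apply
    explanation: str


# Illustrative phrases only (an assumption); a real check would use the
# evaluation criteria configured for your application.
MEDICAL_ADVICE_MARKERS = (
    "you should take",
    "recommended dose",
    "stop taking your medication",
)


def evaluate_no_medical_advice(response: str) -> BinaryResult:
    """Rule-based stand-in for a 'no medical advice' binary evaluation."""
    text = response.strip().lower()
    if not text:
        # Nothing to assess, so the criterion is treated as not applicable.
        return BinaryResult(None, "Empty response; criterion not applicable.")
    hits = [marker for marker in MEDICAL_ADVICE_MARKERS if marker in text]
    if hits:
        return BinaryResult(False, f"Response contains advice-like phrases: {hits}")
    return BinaryResult(True, "No medical advice detected.")


if __name__ == "__main__":
    result = evaluate_no_medical_advice(
        "You should take 400 mg of ibuprofen every six hours."
    )
    print(result.passed, "-", result.explanation)
```

A keyword match is far cruder than a model-graded check; it is used here only to keep the sketch self-contained and runnable.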
Interpreting the Results
- Passed: True if the response meets the criteria; False otherwise. A null score indicates that the evaluation criteria were deemed not applicable to the application output.
- Explanation: A description of why the evaluation produced this outcome.
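Because Passed can take three values, code that consumes the result should handle the null case explicitly. The sketch below assumes the result has already been parsed into a Python `Optional[bool]` and a string; the function name `summarize_outcome` and the label strings are hypothetical and not part of the API.

```python
from typing import Optional


def summarize_outcome(passed: Optional[bool], explanation: str) -> str:
    """Map a binary evaluation result to a short status line."""
    # Check for None first: None is falsy, so a plain `if not passed`
    # would misreport a not-applicable result as a failure.
    if passed is None:
        return f"NOT APPLICABLE: {explanation}"
    if passed:
        return f"PASS: {explanation}"
    return f"FAIL: {explanation}"


if __name__ == "__main__":
    print(summarize_outcome(True, "No medical advice given."))
    print(summarize_outcome(False, "Response recommends a dosage."))
    print(summarize_outcome(None, "Output is unrelated to the criteria."))
```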
Binary Evaluation is an efficient way to enforce clear-cut rules within your application.