Rubric Development

Create a rubric to evaluate your best prompts

Farsight utilizes the Prometheus prompting methodology to auto-evaluate system prompts.

To begin creating your rubric, we suggest:

1) Synthesizing a description of your use case. For example:

To develop a secure and efficient internal HR chatbot for a financial
institution that assists employees with HR-related queries while ensuring the 
protection of private personal information.

2) Using ChatGPT to generate your evaluation rubric by inserting your use case into the Prometheus prompt. To do so, prompt the chat as follows:

Given this use case: To develop a secure and efficient internal HR chatbot for a financial
institution that assists employees with HR-related queries while ensuring the 
protection of private personal information.

I would like to create an evaluation rubric to effectively evaluate chatbot
responses. Can you provide an example {instruction} and an example {reference_answer},
and fill in the {criteria_description} and the five {score_descriptions} for my
use case? Please keep the rest of the format exactly the same. Please create one
evaluation rubric from 1 to 5 with no subcategories.

### Reference Answer (Score 5):
{reference_answer}

### Score Rubric:
[{criteria_description}]
Score 1: {score1_description}
Score 2: {score2_description}
Score 3: {score3_description}
Score 4: {score4_description}
Score 5: {score5_description}

Please provide a single, consolidated rubric for evaluating these criteria.
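
If you want to reuse this prompt for several use cases, the text can be generated programmatically. The sketch below is only an illustration: `build_rubric_request` and `RUBRIC_REQUEST` are hypothetical names, not part of Farsight, and the prompt wording closely follows the text above. It uses Python's `string.Template` so that only the use case is substituted and the curly-brace placeholders reach ChatGPT unchanged.

```python
from string import Template

# Hypothetical template, not part of Farsight. string.Template only replaces
# $use_case, so the {curly_brace} placeholders are left for ChatGPT to fill in.
RUBRIC_REQUEST = Template("""Given this use case: $use_case

I would like to create an evaluation rubric to effectively evaluate chatbot
responses. Can you provide an example {instruction} and an example {reference_answer},
and fill in the {criteria_description} and the five {score_descriptions} for my
use case? Please keep the rest of the format exactly the same. Please create one
evaluation rubric from 1 to 5 with no subcategories.

### Reference Answer (Score 5):
{reference_answer}

### Score Rubric:
[{criteria_description}]
Score 1: {score1_description}
Score 2: {score2_description}
Score 3: {score3_description}
Score 4: {score4_description}
Score 5: {score5_description}

Please provide a single, consolidated rubric for evaluating these criteria.""")


def build_rubric_request(use_case: str) -> str:
    """Return the rubric-generation prompt with the use case filled in."""
    return RUBRIC_REQUEST.substitute(use_case=use_case.strip())


if __name__ == "__main__":
    print(build_rubric_request(
        "To develop a secure and efficient internal HR chatbot for a financial "
        "institution that assists employees with HR-related queries while "
        "ensuring the protection of private personal information."
    ))
```

The printed text can be pasted into ChatGPT as-is.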

Example Response:

Here is an example response from ChatGPT. Simply pass the reference answer and score rubric to our get_best_prompt function, along with a few different prompts to evaluate.
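
Before handing the rubric over, it can help to assemble the criteria description and the five score descriptions into a single string in the layout used by the template. The following is a minimal sketch under stated assumptions: `format_score_rubric` is a hypothetical helper, the descriptions are placeholder text rather than real ChatGPT output, and the call to get_best_prompt is left out because its parameters are not described in this section.

```python
def format_score_rubric(criteria_description, score_descriptions):
    """Build the '[criteria] / Score 1..5' block used in the Prometheus prompt.

    Hypothetical helper for illustration; not part of Farsight.
    """
    if len(score_descriptions) != 5:
        raise ValueError("expected exactly five score descriptions")
    lines = [f"[{criteria_description}]"]
    # Scores are numbered from 1 to 5, in the order the descriptions are given.
    for score, description in enumerate(score_descriptions, start=1):
        lines.append(f"Score {score}: {description}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder descriptions for illustration only; use the text that
    # ChatGPT returns for your own use case.
    rubric = format_score_rubric(
        "Does the response answer the HR query without exposing private data?",
        [
            "placeholder description for score 1",
            "placeholder description for score 2",
            "placeholder description for score 3",
            "placeholder description for score 4",
            "placeholder description for score 5",
        ],
    )
    print(rubric)
```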
