Overview
Check out Farsight's Capabilities
Start evaluating now! Follow a few simple steps to improve your LLMs.
Want to integrate even faster? Try out Farsight AI in a Colab notebook here.
Note: While you can evaluate the outputs of any large language model (LLM), some of Farsight AI's evaluation functions rely on OpenAI. To use the package, you must have an OpenAI API key.
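As a minimal setup sketch, assuming the OpenAI-backed evaluation functions read the standard OPENAI_API_KEY environment variable (a common convention for OpenAI-based tools, not something confirmed here), you could configure the key like this:

```python
import os

# Assumption: the OpenAI-backed evaluation functions read the standard
# OPENAI_API_KEY environment variable. Set it before running any evaluations.
os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your own key
```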
Prompt Optimization
For prompt optimization, we offer two distinct approaches: one with manual oversight and one with full automation. Choose the one that aligns best with your use case, workflow, and anticipated functionality:
Step-by-Step Approach: Generate multiple system prompts for evaluation and testing purposes. Tailor them based on context and optional system guidelines.
Fully Automated Approach: Leverage our comprehensive automated prompt optimization function. This feature not only generates prompts but also evaluates and iteratively improves them, operating on your provided shadow traffic, evaluation rubric, and optional ground-truth outputs (see the sketch after this list).
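As a rough illustration of the fully automated approach, the flow might look like the following. The optimize_prompt name and its parameters are assumptions made for this sketch, not confirmed Farsight AI APIs; the step-by-step approach instead combines the prompt generation and evaluation functions described in the sections below.

```python
# Illustrative sketch only: `optimize_prompt` and its parameters are
# assumed names, not confirmed Farsight AI APIs.
from farsightai import optimize_prompt  # hypothetical import

# Shadow traffic: real or representative user inputs that the optimizer
# tests candidate prompts against.
shadow_traffic = [
    "My order arrived damaged. What should I do?",
    "How do I reset my account password?",
]

best_prompt = optimize_prompt(
    shadow_traffic=shadow_traffic,
    rubric="Answers must be concise, factual, and empathetic.",
    ground_truth=None,  # optional ground-truth outputs, if you have them
)
print(best_prompt)
```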
Prompt Generation
Our platform also includes a straightforward prompt generation function that helps you create system prompts tailored to your specific use case and any other relevant information (see the sketch below).
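For illustration only, a call to such a function might look like this; generate_system_prompts and its parameters are assumed names for the sake of the sketch, not the package's confirmed interface:

```python
# Illustrative sketch only: `generate_system_prompts` and its parameters
# are assumed names, not confirmed Farsight AI APIs.
from farsightai import generate_system_prompts  # hypothetical import

candidates = generate_system_prompts(
    use_case="Summarize customer-support tickets in two sentences.",
    guidelines=["Stay neutral in tone.", "Never invent ticket details."],  # optional
    num_prompts=5,
)

# Review the generated system prompts and pick candidates to test.
for i, prompt in enumerate(candidates, start=1):
    print(f"Candidate {i}: {prompt}")
```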
Prompt Evaluation
For prompt evaluation, we offer two methods (a sketch of both follows this list):
Rubric Evaluation: Define your rubric, and our system will assess your prompts, identifying the optimal one for your needs. This method involves generating multiple system prompts for evaluation and testing, considering context and optional system guidelines.
Metrics Evaluation: We provide both standard, off-the-shelf metrics (consistency, conciseness, factuality, quality) and customizable metrics. These metrics enable you to thoroughly evaluate your system prompts and ensure they meet your desired criteria.
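To make the two methods concrete, here is a sketch. The function names evaluate_with_rubric and evaluate_metrics, along with the exact calling convention, are assumptions for illustration rather than the package's confirmed API; only the metric names (consistency, conciseness, factuality, quality) come from the list above.

```python
# Illustrative sketch only: `evaluate_with_rubric` and `evaluate_metrics`
# are assumed names, not confirmed Farsight AI APIs.
from farsightai import evaluate_with_rubric, evaluate_metrics  # hypothetical imports

candidate_prompts = [
    "You are a support assistant. Answer in two sentences.",
    "You are a concise, factual support assistant.",
]

# Rubric evaluation: score each candidate against your own rubric and
# keep the best one.
best_prompt = evaluate_with_rubric(
    prompts=candidate_prompts,
    rubric="Answers must be concise, factual, and empathetic.",
)

# Metrics evaluation: score a prompt with off-the-shelf metrics;
# custom metrics can be added alongside them.
scores = evaluate_metrics(
    prompt=best_prompt,
    metrics=["consistency", "conciseness", "factuality", "quality"],
)
print(best_prompt, scores)
```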