# Standard Metrics

Metrics are the basis of our open-source starter SDK.

Farsight AI's evaluation suite consists of 4 standard metrics: **Factuality**, **Consistency**, **Quality**, and **Conciseness**.

You can test your LLM systems with these metrics either individually or in combination. See the examples below for details on setting up each metric.
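Because each metric is a method on the same `FarsightAI` client, combining them is just a matter of calling each in turn. The sketch below is illustrative: the `evaluate_all` helper and the pass/fail thresholds are not part of the SDK, and the per-metric call shapes follow the sections below.

```python
def evaluate_all(farsight, query, output, knowledge=None):
    """Run all four standard metrics on one query/output pair (illustrative helper)."""
    return {
        "factuality": farsight.factuality_score(query, output, knowledge),
        "consistency": farsight.consistency_score(query, output),
        "quality": farsight.quality_score(query, output),
        "conciseness": farsight.conciseness_score(query, output),
    }

def meets_thresholds(report, min_consistency=0.7, min_quality=3):
    """Example acceptance check; the thresholds are illustrative, not SDK defaults."""
    return (
        bool(report["factuality"])
        and report["consistency"] >= min_consistency
        and report["quality"] >= min_quality
    )

OPEN_AI_KEY = "<openai_key>"  # replace with your OpenAI key

if OPEN_AI_KEY != "<openai_key>":  # only runs once a real key is supplied
    from farsightai import FarsightAI

    farsight = FarsightAI(openai_key=OPEN_AI_KEY)
    report = evaluate_all(
        farsight,
        "Who is the president of the United States?",
        "Joe Biden is the President of the United States.",
    )
    print(report, meets_thresholds(report))
```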

{% hint style="info" %}
Make sure you have your [OpenAI API Key](https://platform.openai.com/account/api-keys) before you begin.
{% endhint %}

### Factuality

`factuality_score()`

Evaluate the factuality of your LLM's response and get a result of `true` (factual) or `false` (hallucinated or otherwise unsubstantiated).

<table><thead><tr><th width="155">Param</th><th width="303">Type</th><th>Description</th></tr></thead><tbody><tr><td>query</td><td>str</td><td>instruction given to LLM</td></tr><tr><td>output</td><td>str</td><td>response from LLM</td></tr><tr><td>knowledge</td><td>str</td><td><em>optional:</em> additional context to help evaluate factuality</td></tr></tbody></table>

<table><thead><tr><th width="156">Output Type</th><th>Output Definition</th></tr></thead><tbody><tr><td>boolean</td><td><p>factuality of given query based on knowledge </p><p>(True means output is correct)</p></td></tr></tbody></table>

<pre class="language-python"><code class="lang-python">from farsightai import FarsightAI

# Replace with your openAI credentials
OPEN_AI_KEY = "&#x3C;openai_key>"

query = "Who is the president of the United States?"
farsight = FarsightAI(openai_key=OPEN_AI_KEY)

# Replace this with the actual output of your LLM application
output = "As of my last knowledge update in January 2022, Joe Biden is the President of the United States. However, keep in mind that my information might be outdated as my training data goes up to that time, and I do not have browsing capabilities to check for the most current information. Please verify with up-to-date sources."
knowledge = None # optional param to include additional knowledge

factuality_score = farsight.factuality_score(query, output, knowledge)
print("output: ", factuality_score)
# output:  True
</code></pre>
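The optional `knowledge` parameter is useful when your application retrieves context (e.g. in a RAG pipeline): pass the retrieved text so factuality is judged against it rather than against the model's general knowledge. In this sketch, the `build_knowledge` helper is illustrative and not part of the SDK; the `factuality_score` call shape follows the parameter table above.

```python
def build_knowledge(passages):
    """Join retrieved passages into one knowledge string (illustrative helper)."""
    return "\n\n".join(p.strip() for p in passages if p.strip())

OPEN_AI_KEY = "<openai_key>"  # replace with your OpenAI key

if OPEN_AI_KEY != "<openai_key>":  # only runs once a real key is supplied
    from farsightai import FarsightAI

    farsight = FarsightAI(openai_key=OPEN_AI_KEY)
    knowledge = build_knowledge([
        "Joe Biden was inaugurated as the 46th U.S. president in January 2021.",
    ])
    score = farsight.factuality_score(
        "Who is the president of the United States?",
        "Joe Biden is the President of the United States.",
        knowledge,
    )
    print("output: ", score)
```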

***

### Consistency

`consistency_score()`

Evaluate your LLM's response consistency on a scale of zero to one, with zero being entirely inconsistent and one being entirely consistent.

<table><thead><tr><th width="155">Param</th><th width="84">Type</th><th>Description</th></tr></thead><tbody><tr><td>instruction</td><td>str</td><td>instruction given to LLM</td></tr><tr><td>response</td><td>str</td><td>response from LLM</td></tr><tr><td>num_samples</td><td>int</td><td><em>optional</em>: number of samples to evaluate the LLM's consistency against. Increasing num_samples increases latency. The default value is 3.</td></tr></tbody></table>

<table><thead><tr><th width="156">Output Type</th><th>Output Definition</th></tr></thead><tbody><tr><td>float</td><td>score from zero to one, with zero being entirely inconsistent and one being entirely consistent</td></tr></tbody></table>

<pre class="language-python"><code class="lang-python">from farsightai import FarsightAI

# Replace with your openAI credentials
OPEN_AI_KEY = "&#x3C;openai_key>"

query = "Who is the president of the United States?"
farsight = FarsightAI(openai_key=OPEN_AI_KEY)

# Replace this with the actual output of your LLM application
output = "As of my last knowledge update in January 2022, Joe Biden is the President of the United States. However, keep in mind that my information might be outdated as my training data goes up to that time, and I do not have browsing capabilities to check for the most current information. Please verify with up-to-date sources."

consistency_score = farsight.consistency_score(query, output)

print("score: ", consistency_score)
# score: 1.0
</code></pre>
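If you need a steadier estimate, raise `num_samples` (default 3): each additional sample adds another generation, so latency grows accordingly. The keyword usage below is assumed from the parameter table above, and the `label_consistency` helper and its threshold are illustrative, not part of the SDK.

```python
def label_consistency(score, threshold=0.8):
    """Map a 0-1 consistency score to a coarse label (illustrative threshold)."""
    return "consistent" if score >= threshold else "inconsistent"

OPEN_AI_KEY = "<openai_key>"  # replace with your OpenAI key

if OPEN_AI_KEY != "<openai_key>":  # only runs once a real key is supplied
    from farsightai import FarsightAI

    farsight = FarsightAI(openai_key=OPEN_AI_KEY)
    score = farsight.consistency_score(
        "Who is the president of the United States?",
        "Joe Biden is the President of the United States.",
        num_samples=5,  # more samples: steadier score, higher latency
    )
    print("score: ", score, label_consistency(score))
```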

***

### Quality

`quality_score()`

Evaluate your LLM's response quality on a scale of 1-5, with 1 being subpar / unusable quality and 5 being exemplary, human-like quality.

<table><thead><tr><th width="155">Param</th><th width="84">Type</th><th>Description</th></tr></thead><tbody><tr><td>instruction</td><td>str</td><td>instruction given to LLM</td></tr><tr><td>response</td><td>str</td><td>response from LLM</td></tr></tbody></table>

<table><thead><tr><th width="156">Output Type</th><th>Output Definition</th></tr></thead><tbody><tr><td>int</td><td>score from 1-5, 1 being subpar / unusable quality and 5 being exemplary, human-like quality</td></tr></tbody></table>

```python
from farsightai import FarsightAI

# Replace with your openAI credentials
OPEN_AI_KEY = "<openai_key>"

query = "Who is the president of the United States?"
farsight = FarsightAI(openai_key=OPEN_AI_KEY)

# Replace this with the actual output of your LLM application
output = "As of my last knowledge update in January 2022, Joe Biden is the President of the United States. However, keep in mind that my information might be outdated as my training data goes up to that time, and I do not have browsing capabilities to check for the most current information. Please verify with up-to-date sources."

quality_score = farsight.quality_score(query, output)
print("score: ", quality_score)
# score: 4
```

***

### Conciseness

`conciseness_score()`

Evaluate your LLM's response conciseness on a scale of 1-5, with 1 being extremely verbose and 5 being very concise. Note that conciseness captures not only word count but also how information-rich each incremental word is.

<table><thead><tr><th width="155">Param</th><th width="84">Type</th><th>Description</th></tr></thead><tbody><tr><td>query</td><td>str</td><td>instruction given to LLM</td></tr><tr><td>output</td><td>str</td><td>response from LLM</td></tr></tbody></table>

<table><thead><tr><th width="156">Output Type</th><th>Output Definition</th></tr></thead><tbody><tr><td>int</td><td>score from 1-5, 1 being extremely verbose and 5 being very concise</td></tr></tbody></table>

```python
from farsightai import FarsightAI

# Replace with your openAI credentials
OPEN_AI_KEY = "<openai_key>"

query = "Who is the president of the United States?"
farsight = FarsightAI(openai_key=OPEN_AI_KEY)

# Replace this with the actual output of your LLM application
output = "As of my last knowledge update in January 2022, Joe Biden is the President of the United States. However, keep in mind that my information might be outdated as my training data goes up to that time, and I do not have browsing capabilities to check for the most current information. Please verify with up-to-date sources."

conciseness_score = farsight.conciseness_score(query, output)
print("score: ", conciseness_score)
# score: 3
```
