This is the multi-page printable view of this section. Click here to print.
How Tos
- 1: Configuring RAG
- 2: Use Anthropic
- 3: Use Azure OpenAI
- 4: Use Ollama
- 5: BigQuery
- 6: Honeycomb
1 - Configuring RAG
What You’ll Learn
How to configure Learning in Foyle to continually learn from human feedback
How It Works
- As you use Foyle, the AI builds a dataset of examples (input, output)
- The input is a notebook at some point in time ,
t
- The output is one more or cells that were then added to the notebook at time
t+1
- Foyle uses these examples to get better at suggesting cells to insert into the notebook
Configuring RAG
Foyle uses RAG to improve its predictions using its existing dataset of examples. You can control
the number of RAG results used by Foyle by setting agent.rag.maxResults
.
foyle config set agent.rag.maxResults=3
Disabling RAG
RAG is enabled by default. To disable it run
foyle config set agent.rag.enabled=false
To check the status of RAG get the current configuration
foyle config get
Sharing Learned Examples
In a team setting, you should build a shared AI that learns from the feedback of all team members and assists all members. To do this you can configure Foyle to write and read examples from a shared location like GCS. If you’d like S3 support please vote up issue #153.
To configure Foyle to use a shared location for learned examples
Create a GCS bucket to store the learned examples
gsutil mb gs://my-foyle-examples
Configure Foyle to use the GCS bucket
foyle config set learner.exampleDirs=gs://${YOUR_BUCKET}
Optionally you can configure Foyle to use a local location as well if you want to be able to use the AI without an internet connection.
foyle config set learner.exampleDirs=gs://${YOUR_BUCKET},/local/training/examples
2 - Use Anthropic
What You’ll Learn
How to configure Foyle to use Anthropic Models
Prerequisites
- You need a Anthropic account
Setup Foyle To Use Anthropic Models
Get an API Token from the Anthropic Console and save it to a file
Configure Foyle to use this API key
foyle config set anthropic.apiKeyFile=/path/to/your/key/file
- Configure Foyle to use the desired Antrhopic Model
foyle config set agent.model=claude-3-5-sonnet-20240620
foyle config set agent.modelProvider=anthropic
How It Works
Foyle uses 2 Models
- A Chat model to generate completions
- An embedding model to compute embeddings for RAG
Anthropic doesn’t provide embedding models so Foyle continues to use OpenAI for the embedding models. At some, we may add support for Voyage AI’s embedding models.
3 - Use Azure OpenAI
What You’ll Learn
How to configure Foyle to use Azure OpenAI
Prerequisites
- You need an Azure Account (Subscription)
- You need access to Azure Open AI
Setup Azure OpenAI
You need the following Azure OpenAI resources:
Azure Resource Group - This will be an Azure resource group that contains your Azure OpenAI resources
Azure OpenAI Resource Group - This will contain your Azure OpenAI model deployments
You can use the Azure CLI to check if you have the required resources
az cognitiveservices account list --output=table Kind Location Name ResourceGroup ------ ---------- -------------- ---------------- OpenAI eastus ResourceName ResourceGroup
Note You can use the pricing page to see which models are available in a given region. Not all models are available an all regions so you need to select a region with the models you want to use with Foyle.
Foyle currently uses gpt-3.5-turbo-0125
A GPT3.5 deployment
Use the CLI to list your current deployments
az cognitiveservices account deployment list -g ${RESOURCEGROUP} -n ${RESOURCENAME} --output=table
If you need to create a deployment follow the instructions
Setup Foyle To Use Azure Open AI
Set the Azure Open AI BaseURL
We need to configure Foyle to use the appropriate Azure OpenAI endpoint. You can use the CLI to determine the endpoint associated with your resource group
az cognitiveservices account show \
--name <myResourceName> \
--resource-group <myResourceGroupName> \
| jq -r .properties.endpoint
Update the baseURL in your Foyle configuration
foyle config set azureOpenAI.baseURL=https://endpoint-for-Azure-OpenAI
Set the Azure Open AI API Key
Use the CLI to obtain the API key for your Azure OpenAI resource and save it to a file
az cognitiveservices account keys list \
--name <myResourceName> \
--resource-group <myResourceGroupName> \
| jq -r .key1 > ${HOME}/secrets/azureopenai.key
Next, configure Foyle to use this API key
foyle config set azureOpenAI.apiKeyFile=/path/to/your/key/file
Specify model deployments
You need to configure Foyle to use the appropriate Azure deployments for the models Foyle uses.
Start by using the Azure CLI to list your deployments
az cognitiveservices account deployment list --name=${RESOURCE_NAME} --resource-group=${RESOURCE_GROUP} --output=table
Configure Foyle to use the appropriate deployments
foyle config set azureOpenAI.deployments=gpt-3.5-turbo-0125=<YOUR-GPT3.5-DEPLOYMENT-NAME>
Troubleshooting:
Rate Limits
If Foyle is returning rate limiting errors from Azure OpenAI, use the CLI to check the rate limits for your deployments
az cognitiveservices account deployment list -g ${RESOURCEGROUP} -n ${RESOURCENAME}
Azure OpenAI sets the default values to be quite low; 1K tokens per minute. This is usually much lower than your allotted quota. If you have available quota, you can use the UI or to increase these limits.
4 - Use Ollama
What You’ll Learn
How to configure Foyle to use models served by Ollama
Prerequisites
- Follow [Ollama’s docs] to download Ollama and serve a model like
llama2
Setup Foyle to use Ollama
Foyle relies on Ollama’s OpenAI Chat Compatability API to interact with models served by Ollama.
Configure Foyle to use the appropriate Ollama baseURL
foyle config set openai.baseURL=http://localhost:11434/v1
- Change the server and port to match how you are serving Ollama
- You may also need to change the scheme to https; e.g. if you are using a VPN like Tailscale
Configure Foyle to use the appropriate Ollama model
foyle config agent.model=llama2
- Change the model to match the model you are serving with Ollama
You can leave the
apiKeyFile
unset since you aren’t using an API key with OllamaThat’s it! You should now be able to use Foyle with Ollama
5 - BigQuery
Why Use Foyle with BigQuery
If you are using BigQuery as your data warehouse then you’ll want to query that data. Constructing the right query is often a barrier to leveraging that data. Using Foyle you can train a personalized AI assistant to be an expert in answering high level questions using your data warehouse.
Prerequisites
Install the Data TableRenderers extension in vscode
- This extension can render notebook outputs as tables
- In particular, this extension can render JSON as nicely formatted, interactive tables
How to integrate BigQuery with RunMe
Below is an example code cell illustrating the recommended pattern for executing BigQuery queries within RunMe.
cat <<EOF >/tmp/query.sql
SELECT
DATE(created_at) AS date,
COUNT(*) AS pr_count
FROM
\`githubarchive.month.202406\`
WHERE
type = "PullRequestEvent"
AND
repo.name = "jlewi/foyle"
AND
json_value(payload, "$.action") = "closed"
AND
DATE(created_at) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY) AND CURRENT_DATE()
GROUP BY
date
ORDER BY
date;
EOF
export QUERY=$(cat /tmp/query.sql)
bq query --format=json --use_legacy_sql=false "$QUERY"
As illustrated above, the pattern is to use cat
to write the query to a file. This allows
us to write the query in a more human readable format. Including the entire SQL query in the code
cell is critical for enabling Foyle to learn the query.
The output is formatted as JSON. This allows the output to be rendered using the Data TableRenderers extension.
Before executing the code cell click the configure button in the lower right hand side of the cell and then uncheck the box under “interactive”. Running the cell in interactive mode prevents the output from being rendered using Data TableRenders. For more information refer to the RunMe Cell Configuration Documentation.
We then use bq query
to execute the query.
Controlling Costs
BigQuery charges based on the amount of data scanned. To prevent accidentally running expensive queries you can
use the --maximum_bytes_billed
to limit the amount of data scanned. BigQuery currently charges
$6.25 per TiB.
Troubleshooting
Output Isn’t Rendered Using Data TableRenderers
If the output isn’t rendered using Data TableRenderers there are a few things to check
Click the ellipsis to the left of the the upper left hand corner and select change presentation
- This should show you different mime-types and options for rendering them
- Select Data table
Another problem could be that
bq
is outputting status information while running the query and this is interfering with the rendering. You can work around this by redirecting stderr to/dev/null
. For example,bq query --format=json --use_legacy_sql=false "$QUERY" 2>/dev/null
Try explicitly configuring the mime type by opening the cell configuration and then
- Go to the advanced tab
- Entering “application/json” in the mime type field
6 - Honeycomb
Why Use Honeycomb with Foyle
Honeycomb is often used to query and visualize observability data.
Using Foyle and RunMe you can
- Use AI to turn a high level intent into an actual Honeycomb query
- Capture that query as part of a playbook or incident report
- Create documents that capture the specific queries you are interested in rather than rely on static dashboards
How to integrate Honeycomb with RunMe
This section explains how to integrate executing Honeycomb queries from RunMe.
Honeycomb supports embedding queries directly in the URL. You can use this feature to define queries in your notebook and then generate a URL that can be opened in the browser to view the results.
Honeycomb queries are defined in JSON. A suggested pattern to define queries in your notebook are
- Create a code cell which uses contains the query to be executed
- To make it readable the recommended pattern is to treat it as a multi-line JSON string and write it to a file
- Use the CLI hccli to generate the URL and open it in the browser
- While the CLI allows passing the query as an argument this isn’t nearly as human readable as writing it and reading it from a temporary file Here’s an example code cell
cat << EOF > /tmp/query3.json
{
"time_range": 604800,
"granularity": 0,
"calculations": [
{
"op": "AVG",
"column": "prediction.ok"
}
],
"filters": [
{
"column": "model.name",
"op": "=",
"value": "llama3"
}
],
"filter_combination": "AND",
"havings": [],
"limit": 1000
}
EOF
hccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true
- This is a simple query to get the average of the
prediction.ok
column for thellama3
model for the last week - Be sure to replace
YOURORG
andYOURDATASET
with the appropriate values for your organization and dataset- You can determine
base-url
just by opening up any existing query in Honeycomb and copying the URL
- You can determine
- When you execute the cell, it will print the query and open it in your browser
- You can use the share query feature to encode the query in the URL
Training Foyle to be your Honeycomb Expert
To train Foyle to be your Honeycomb you follow these steps
Make sure you turned on learning by following the guide
Create a markdown cell expressing the high level intent you want to accomplish
Ask Foyle to generate a completion
- The first time you do this Foyle will probably provide a widely inaccurate answer because it has no prior knowledge of your infrastructure
- Edit one of the generated code cells to contain the Honeycomb query and
hccli
command - Execute the cell to generate the URL
Each time you ask for a completion and edit and execute the cell you are providing feedback to Foyle
- Foyle uses the feedback to learn and improve the completions it generates
Important When providing feedback to Foyle its important to do so by editing a code cell that was generated by Foyle. If you create a new code cell Foyle won’t be able to learn from it. Only cells that were generated by Foyle are linked to the original query.
If you need help boostrapping some initial Honeycomb JSON queries you can use Honeycomb’s Share feature to generate the JSON from a query constructed in the UI.
Producing Reports
RunMe’s AutoSave Feature creates a markdown document that contains the output of the code cells. This is great for producing a report or writing a postmortem. When using this feature with Honeycomb you’ll want to capture the output of the query. There are a couple ways to do this
Permalinks
To generate permalinks, you just need to use can use start_time
and
end_time
to specify a fixed time range for your queries (see Honeycomb query API). Since the
hccli prints the output URL it will be saved in the session outputs generated
by RunMe. You could also copy and past the URL into a markdown cell.
Grabbing a Screenshot
Unfortunately a Honeycomb enterprise plan is required to access query results and graphs via API. As a workaround, hccli supports grabbing a screenshot of the query results using browser automation.
You can do this as follows
Restart chrome with a debugging port open
chrome --remote-debugging-port=9222
Login into Honeycomb
Add the
--out-file=/path/to/file.png
tohccli
to specify a file to save it tohccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true --out-file=/path/to/file.png
Warning Unfortunately this tends to be a bit brittle for the following reasons
You need to restart chrome with remote debugging enabled
hccli
will use the most recent browser window so you need to make sure your most recent browser window is the one with your Honeycomb credentials. This may not be the case if you have different accounts logged into different chrome sessions