Integrations

These pages describe how to use RunMe and Foyle to interact with various systems.

1 - BigQuery

BigQuery

Why Use Foyle with BigQuery

If you are using BigQuery as your data warehouse then you’ll want to query that data. Constructing the right query is often a barrier to leveraging that data. Using Foyle you can train a personalized AI assistant to be an expert in answering high-level questions using your data warehouse.

Prerequisites

Install the Data TableRenderers extension in VS Code

  • This extension can render notebook outputs as tables
  • In particular, this extension can render JSON as nicely formatted, interactive tables

How to integrate BigQuery with RunMe

Below is an example code cell illustrating the recommended pattern for executing BigQuery queries within RunMe.

cat <<EOF >/tmp/query.sql
SELECT
  DATE(created_at) AS date,
  COUNT(*) AS pr_count
FROM
  \`githubarchive.month.202406\`
WHERE
  type = "PullRequestEvent"
  AND 
  repo.name = "jlewi/foyle"
  AND
  json_value(payload, "$.action") = "closed"
  AND
  DATE(created_at) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 28 DAY) AND CURRENT_DATE()
GROUP BY
  date
ORDER BY
  date;
EOF

export QUERY=$(cat /tmp/query.sql)
bq query --format=json --use_legacy_sql=false "$QUERY"

As illustrated above, the pattern is to use cat to write the query to a file and then pass the file's contents to bq query, which executes the query. Writing the query to a file lets us format it in a more human-readable way. Including the entire SQL query in the code cell is critical for enabling Foyle to learn the query.

The output is formatted as JSON (--format=json). This allows the output to be rendered using the Data TableRenderers extension.

Before executing the code cell, click the configure button in the lower right-hand side of the cell and then uncheck the box under “interactive”. Running the cell in interactive mode prevents the output from being rendered using Data TableRenderers. For more information refer to the RunMe Cell Configuration Documentation.

Controlling Costs

BigQuery charges based on the amount of data scanned. To prevent accidentally running expensive queries, you can use the --maximum_bytes_billed flag to limit the amount of data a query may scan; a query that would exceed the limit fails instead of running. BigQuery currently charges $6.25 per TiB.
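
As a rough sketch of how the flag might be used, the cell below converts a dollar budget into a byte cap and passes it to bq query. The budget value and variable names are hypothetical, the $6.25 per TiB rate is simply the figure quoted above and may not match your pricing, and the query only runs if bq is on the PATH and QUERY has been set as in the earlier cell.

```shell
# Hypothetical per-query budget, in cents (50 = $0.50). Adjust to taste.
BUDGET_CENTS=50
# Assumed price: $6.25 per TiB, expressed in cents.
PRICE_CENTS_PER_TIB=625
TIB_BYTES=$((1024 * 1024 * 1024 * 1024))

# Integer arithmetic: the number of bytes the budget covers, rounded down.
MAX_BYTES=$((BUDGET_CENTS * TIB_BYTES / PRICE_CENTS_PER_TIB))
echo "maximum_bytes_billed=${MAX_BYTES}"

# Run the query with the cap, but only when bq is installed and QUERY is set.
if command -v bq >/dev/null 2>&1 && [ -n "${QUERY:-}" ]; then
  bq query --format=json --use_legacy_sql=false \
    --maximum_bytes_billed="${MAX_BYTES}" "$QUERY"
fi
```

With these numbers the cap works out to roughly 82 GiB, so a query that would bill more than that should fail rather than run.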

Troubleshooting

Output Isn’t Rendered Using Data TableRenderers

If the output isn’t rendered using Data TableRenderers there are a few things to check:

  1. Click the ellipsis to the left of the upper left-hand corner of the cell output and select change presentation

    • This should show you different mime-types and options for rendering them
    • Select Data table
  2. Another problem could be that bq is outputting status information while running the query and this is interfering with the rendering. You can work around this by redirecting stderr to /dev/null. For example,

       bq query --format=json --use_legacy_sql=false "$QUERY" 2>/dev/null
    
  3. Try explicitly configuring the mime type by opening the cell configuration and then:

    1. Go to the advanced tab
    2. Enter “application/json” in the mime type field
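
To check whether stray status output is the culprit (item 2 above), one option is to test whether what the cell emits is a single valid JSON document. The sketch below assumes python3 is available and uses a hypothetical helper named is_valid_json; the sample inputs, including the status line, are made up for illustration and are not real bq output.

```shell
# Hypothetical helper: succeeds only if stdin is a single valid JSON document.
is_valid_json() {
  python3 -m json.tool >/dev/null 2>&1
}

# Clean JSON, like the output expected once status messages are suppressed.
if echo '[{"date":"2024-06-01","pr_count":"3"}]' | is_valid_json; then
  echo "clean output: valid JSON"
fi

# JSON preceded by an illustrative (made-up) status line is not valid JSON,
# which is the kind of output that can break table rendering.
if ! printf 'Waiting on job ... DONE\n[{"date":"2024-06-01"}]\n' | is_valid_json; then
  echo "mixed output: not valid JSON"
fi
```

Against a real query, the same helper can be placed at the end of the pipeline, for example `bq query --format=json --use_legacy_sql=false "$QUERY" | is_valid_json && echo ok`.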

2 - Honeycomb

Honeycomb

Why Use Honeycomb with Foyle

Honeycomb is often used to query and visualize observability data.

Using Foyle and RunMe you can

  • Use AI to turn a high level intent into an actual Honeycomb query
  • Capture that query as part of a playbook or incident report
  • Create documents that capture the specific queries you are interested in rather than rely on static dashboards

How to integrate Honeycomb with RunMe

This section explains how to integrate executing Honeycomb queries from RunMe.

Honeycomb supports embedding queries directly in the URL. You can use this feature to define queries in your notebook and then generate a URL that can be opened in the browser to view the results.

Honeycomb queries are defined in JSON. A suggested pattern for defining queries in your notebook is:

  1. Create a code cell that contains the query to be executed
    • To make it readable, the recommended pattern is to treat it as a multi-line JSON string and write it to a file
  2. Use the CLI hccli to generate the URL and open it in the browser
    • While the CLI allows passing the query as an argument, this isn’t nearly as human readable as writing it to and reading it from a temporary file

Here’s an example code cell:

cat << EOF > /tmp/query3.json
{
  "time_range": 604800,
  "granularity": 0,  
  "calculations": [
      {
          "op": "AVG",
            "column": "prediction.ok"
      }
  ],
  "filters": [
      {
          "column": "model.name",
          "op": "=",
          "value": "llama3"
      }
  ],
  "filter_combination": "AND",
  "havings": [],
  "limit": 1000
}
EOF
hccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true
  • This is a simple query to get the average of the prediction.ok column for the llama3 model for the last week
  • Be sure to replace YOURORG and YOURDATASET with the appropriate values for your organization and dataset
    • You can determine base-url just by opening up any existing query in Honeycomb and copying the URL
  • When you execute the cell, it will print the query and open it in your browser
  • You can use the share query feature to encode the query in the URL

Training Foyle to be your Honeycomb Expert

To train Foyle to be your Honeycomb expert, follow these steps:

  1. Make sure you turned on learning by following the guide

  2. Create a markdown cell expressing the high level intent you want to accomplish

  3. Ask Foyle to generate a completion

    • The first time you do this Foyle will probably provide a wildly inaccurate answer because it has no prior knowledge of your infrastructure
    • Edit one of the generated code cells to contain the Honeycomb query and hccli command
    • Execute the cell to generate the URL
  4. Each time you ask for a completion and edit and execute the cell you are providing feedback to Foyle

    • Foyle uses the feedback to learn and improve the completions it generates

Important: When providing feedback to Foyle, it’s important to do so by editing a code cell that was generated by Foyle. If you create a new code cell Foyle won’t be able to learn from it. Only cells that were generated by Foyle are linked to the original query.

If you need help bootstrapping some initial Honeycomb JSON queries you can use Honeycomb’s Share feature to generate the JSON from a query constructed in the UI.

Producing Reports

RunMe’s AutoSave Feature creates a markdown document that contains the output of the code cells. This is great for producing a report or writing a postmortem. When using this feature with Honeycomb you’ll want to capture the output of the query. There are a couple of ways to do this: permalinks and screenshots.

To generate permalinks, use start_time and end_time to specify a fixed time range for your queries (see Honeycomb query API). Since hccli prints the output URL, it will be saved in the session outputs generated by RunMe. You could also copy and paste the URL into a markdown cell.
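
A minimal sketch of a permalink-style query follows. It reuses the query from earlier but swaps time_range for explicit start_time and end_time values, given as UNIX timestamps in seconds. The file name and the timestamps are hypothetical, and the hccli call (using the same flags shown earlier) only runs if hccli is installed.

```shell
# Hypothetical fixed window: the week ending 2024-06-20 00:00:00 UTC.
END_TIME=1718841600
START_TIME=$((END_TIME - 604800))   # 604800 seconds = 7 days

# Unquoted EOF so the shell substitutes the two timestamps into the JSON.
cat <<EOF > /tmp/query_fixed.json
{
  "start_time": ${START_TIME},
  "end_time": ${END_TIME},
  "granularity": 0,
  "calculations": [
    { "op": "AVG", "column": "prediction.ok" }
  ],
  "filters": [
    { "column": "model.name", "op": "=", "value": "llama3" }
  ],
  "filter_combination": "AND",
  "havings": [],
  "limit": 1000
}
EOF
cat /tmp/query_fixed.json

# Generate the URL only when hccli is available.
# Replace YOURORG and YOURDATASET with your own values first.
if command -v hccli >/dev/null 2>&1; then
  hccli querytourl --query-file=/tmp/query_fixed.json \
    --base-url=https://ui.honeycomb.io/YOURORG/environments/production \
    --dataset=YOURDATASET --open=true
fi
```

Because the window is pinned to fixed timestamps rather than "the last week", the generated URL should keep showing the same data when it is reopened later from a saved report.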

Grabbing a Screenshot

Unfortunately a Honeycomb enterprise plan is required to access query results and graphs via API. As a workaround, hccli supports grabbing a screenshot of the query results using browser automation.

You can do this as follows:

  1. Restart Chrome with a debugging port open

    chrome --remote-debugging-port=9222
    
  2. Log in to Honeycomb

  3. Add the --out-file=/path/to/file.png flag to hccli to specify a file to save the screenshot to

    hccli querytourl --query-file=/tmp/query3.json --base-url=https://ui.honeycomb.io/YOURORG/environments/production --dataset=YOURDATASET --open=true --out-file=/path/to/file.png
    

Warning: Unfortunately this tends to be a bit brittle for the following reasons:

  • You need to restart Chrome with remote debugging enabled

  • hccli will use the most recent browser window, so you need to make sure your most recent browser window is the one with your Honeycomb credentials. This may not be the case if you have different accounts logged into different Chrome sessions
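
One way to catch the first problem early is to confirm that Chrome is actually listening on the debugging port before running hccli. The sketch below assumes curl is installed and queries Chrome's /json/version endpoint on the port used above; the helper name chrome_debug_ready is hypothetical and is not part of hccli.

```shell
# Hypothetical helper: succeeds if a DevTools endpoint answers on the given port.
# The port defaults to 9222, matching --remote-debugging-port=9222 above.
chrome_debug_ready() {
  curl --silent --fail --max-time 2 \
    "http://localhost:${1:-9222}/json/version" >/dev/null 2>&1
}

if chrome_debug_ready 9222; then
  echo "Chrome remote debugging is reachable on port 9222"
else
  echo "No debugging endpoint on port 9222; restart Chrome with --remote-debugging-port=9222"
fi
```

This only checks that the port is open. It does not tell you which browser window hccli will pick, so the second caveat still needs a manual check.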

Reference