SQLite
This example and walkthrough has the following features:
- querychat interaction with a SQLite database using SQLAlchemy
- Reads in a data description file
- Reads in a greeting file
Data
This example uses the seaborn
library to load up the titanic
dataset, and then write the dataframe into a SQLite database, titanic.db
. It then uses SQLAlchemy to connect to the SQLite database.
If the titanic.db
file does not exist in the same directory as the app.py
file, it will create the SQLite database file.
Greeting file
Save this file as greeting.md
:
Hello! I'm here to assist you with analyzing the Titanic dataset.
Here are some examples of what you can ask me to do:
- **Filtering and Sorting:**
- Show only passengers who boarded in Cherbourg.
- Sort passengers by age in descending order.
- **Data Analysis:**
- What is the survival rate for each passenger class?
- How many children were aboard the Titanic?
- **General Statistics:**
- Calculate the average age of female passengers.
- Find the total fare collected from passengers who did not survive.
Data description file
Save this file as data_description.md
:
# Data Dictionary
- **survival**: Survival status
- 0 = No
- 1 = Yes
- **pclass**: Ticket class
- 1 = 1st class
- 2 = 2nd class
- 3 = 3rd class
- **sex**: Sex of the passenger
- **age**: Age in years
- **sibsp**: Number of siblings or spouses aboard the Titanic
- **parch**: Number of parents or children aboard the Titanic
- **ticket**: Ticket number
- **fare**: Passenger fare
- **cabin**: Cabin number
- **embarked**: Port of embarkation
- C = Cherbourg
- Q = Queenstown
- S = Southampton
## Variable Notes
- **pclass** is a proxy for socio-economic status (SES):
- 1st = Upper class
- 2nd = Middle class
- 3rd = Lower class
- **age**:
- If less than 1 year old, age is fractional.
- Estimated ages are represented as `xx.5`.
- **sibsp**: Family relations are defined as:
- Sibling = brother, sister, stepbrother, stepsister
- Spouse = husband, wife (mistresses and fiancés were ignored)
- **parch**: Family relations are defined as:
- Parent = mother, father
- Child = daughter, son, stepdaughter, stepson
- Some children traveled only with a nanny, so `parch = 0` for them.
The application
Our application will read the the greeting.md
and data_description.md
files and pass them along to the querychat.init()
function. Also, instead of passing in a dataframe object to the data_source
parameter in querychat.init()
, we pass in the database connection, along with the table in the database as table_name
.
Here is our SQLite example app, save the contents to app.py
.
This example does not use the default OpenAI model directly from OpenAI, which would require you to create an OpenAI API key and save it as an environment variable named OPENAI_API_KEY
. Instead we are using GitHub Models as a free way to access the latest LLMs, with a rate-limit. You can follow the instructions on the GitHub Docs or Axure AI Demo on creating a PAT.
We suggest you save your PAT into 2 environment variables: GITHUB_TOKEN
, and GITHUB_PAT
.
from pathlib import Path
import chatlas
from seaborn import load_dataset
from shiny import App, render, ui
from sqlalchemy import create_engine
import querychat as qc
# Load titanic data and create SQLite database
= Path(__file__).parent / "titanic.db"
db_path = create_engine("sqlite:///" + str(db_path))
engine
if not db_path.exists():
# For example purposes, we'll create the database if it doesn't exist. Don't
# do this in your app!
= load_dataset("titanic")
titanic "titanic", engine, if_exists="replace", index=False)
titanic.to_sql(
= (Path(__file__).parent / "greeting.md").read_text()
greeting = (Path(__file__).parent / "data_description.md").read_text()
data_desc
# 1. Configure querychat
def use_github_models(system_prompt: str) -> chatlas.Chat:
# GitHub models give us free rate-limited access to the latest LLMs
# you will need to have GITHUB_PAT defined in your environment
return chatlas.ChatGithub(
="gpt-4.1",
model=system_prompt,
system_prompt
)
= qc.init(
querychat_config
engine,"titanic",
=greeting,
greeting=data_desc,
data_description=use_github_models,
create_chat_callback
)
# Create UI
= ui.page_sidebar(
app_ui # 2. Place the chat component in the sidebar
"chat"),
qc.sidebar(# Main panel with data viewer
ui.card("data_table"),
ui.output_data_frame(=True,
fill
),="querychat with Python (SQLite)",
title=True,
fillable
)
# Define server logic
def server(input, output, session):
# 3. Initialize querychat server with the config from step 1
= qc.server("chat", querychat_config)
chat
# 4. Display the filtered dataframe
@render.data_frame
def data_table():
# Access filtered data via chat.df() reactive
return chat["df"]()
# Create Shiny app
= App(app_ui, server) app