SQLite

This example and walkthrough has the following features:

Data

This example uses the seaborn library to load up the titanic dataset, and then write the dataframe into a SQLite database, titanic.db. It then uses SQLAlchemy to connect to the SQLite database.

If the titanic.db file does not exist in the same directory as the app.py file, it will create the SQLite database file.

Greeting file

Save this file as greeting.md:

Hello! I'm here to assist you with analyzing the Titanic dataset.
Here are some examples of what you can ask me to do:

- **Filtering and Sorting:**
  - Show only passengers who boarded in Cherbourg.
  - Sort passengers by age in descending order.

- **Data Analysis:**
  - What is the survival rate for each passenger class?
  - How many children were aboard the Titanic?

- **General Statistics:**
  - Calculate the average age of female passengers.
  - Find the total fare collected from passengers who did not survive.

Data description file

Save this file as data_description.md:

# Data Dictionary

- **survival**: Survival status  
  - 0 = No  
  - 1 = Yes  

- **pclass**: Ticket class  
  - 1 = 1st class  
  - 2 = 2nd class  
  - 3 = 3rd class  

- **sex**: Sex of the passenger  

- **age**: Age in years  

- **sibsp**: Number of siblings or spouses aboard the Titanic  

- **parch**: Number of parents or children aboard the Titanic  

- **ticket**: Ticket number  

- **fare**: Passenger fare  

- **cabin**: Cabin number  

- **embarked**: Port of embarkation  
  - C = Cherbourg  
  - Q = Queenstown  
  - S = Southampton  

## Variable Notes

- **pclass** is a proxy for socio-economic status (SES):  
  - 1st = Upper class  
  - 2nd = Middle class  
  - 3rd = Lower class  

- **age**:  
  - If less than 1 year old, age is fractional.  
  - Estimated ages are represented as `xx.5`.  

- **sibsp**: Family relations are defined as:  
  - Sibling = brother, sister, stepbrother, stepsister  
  - Spouse = husband, wife (mistresses and fiancés were ignored)  

- **parch**: Family relations are defined as:  
  - Parent = mother, father  
  - Child = daughter, son, stepdaughter, stepson  
  - Some children traveled only with a nanny, so `parch = 0` for them.  

The application

Our application will read the the greeting.md and data_description.md files and pass them along to the querychat.init() function. Also, instead of passing in a dataframe object to the data_source parameter in querychat.init(), we pass in the database connection, along with the table in the database as table_name.

Here is our SQLite example app, save the contents to app.py.

GitHub Models and GitHub Personal Access Tokens

This example does not use the default OpenAI model directly from OpenAI, which would require you to create an OpenAI API key and save it as an environment variable named OPENAI_API_KEY. Instead we are using GitHub Models as a free way to access the latest LLMs, with a rate-limit. You can follow the instructions on the GitHub Docs or Axure AI Demo on creating a PAT.

We suggest you save your PAT into 2 environment variables: GITHUB_TOKEN, and GITHUB_PAT.

from pathlib import Path

import chatlas
from seaborn import load_dataset
from shiny import App, render, ui
from sqlalchemy import create_engine

import querychat as qc

# Load titanic data and create SQLite database
db_path = Path(__file__).parent / "titanic.db"
engine = create_engine("sqlite:///" + str(db_path))

if not db_path.exists():
    # For example purposes, we'll create the database if it doesn't exist. Don't
    # do this in your app!
    titanic = load_dataset("titanic")
    titanic.to_sql("titanic", engine, if_exists="replace", index=False)

greeting = (Path(__file__).parent / "greeting.md").read_text()
data_desc = (Path(__file__).parent / "data_description.md").read_text()

# 1. Configure querychat

def use_github_models(system_prompt: str) -> chatlas.Chat:
    # GitHub models give us free rate-limited access to the latest LLMs
    # you will need to have GITHUB_PAT defined in your environment
    return chatlas.ChatGithub(
        model="gpt-4.1",
        system_prompt=system_prompt,
    )

querychat_config = qc.init(
    engine,
    "titanic",
    greeting=greeting,
    data_description=data_desc,
    create_chat_callback=use_github_models,
)

# Create UI
app_ui = ui.page_sidebar(
    # 2. Place the chat component in the sidebar
    qc.sidebar("chat"),
    # Main panel with data viewer
    ui.card(
        ui.output_data_frame("data_table"),
        fill=True,
    ),
    title="querychat with Python (SQLite)",
    fillable=True,
)


# Define server logic
def server(input, output, session):
    # 3. Initialize querychat server with the config from step 1
    chat = qc.server("chat", querychat_config)

    # 4. Display the filtered dataframe
    @render.data_frame
    def data_table():
        # Access filtered data via chat.df() reactive
        return chat["df"]()


# Create Shiny app
app = App(app_ui, server)