Data Sources

querychat supports several different data sources, including:

  1. Any narwhals-compatible data frame.
  2. Any SQLAlchemy database.
  3. A custom DataSource interface/protocol.

The sections below describe how to use each type of data source with querychat.

Data frames

You can use any narwhals-compatible data frame as a data source in querychat. This includes popular data frame libraries like pandas, polars, pyarrow, and many more.

pandas-app.py
import pandas as pd
from querychat import QueryChat

mtcars = pd.read_csv(
    "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)

qc = QueryChat(mtcars, "mtcars")
app = qc.app()

polars-app.py
import polars as pl
from querychat import QueryChat

mtcars = pl.read_csv(
    "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)

qc = QueryChat(mtcars, "mtcars")
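app = qc.app()

pyarrow-app.py
import io
from urllib.request import urlopen

import pyarrow.csv as pv
from querychat import QueryChat

# pyarrow's CSV reader expects a local path or a file-like object, so the
# file is downloaded first. read_csv() already returns a pyarrow.Table.
url = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
with urlopen(url) as response:
    mtcars = pv.read_csv(io.BytesIO(response.read()))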

qc = QueryChat(mtcars, "mtcars")
app = qc.app()

If you’re building an app, you can read the queried data frame reactively with the df() method, which returns a narwhals.DataFrame. Call .to_native() on the result to get the underlying data frame in its original type (for example, a pandas or polars DataFrame).

Databases

You can also connect querychat directly to a table in any database supported by SQLAlchemy. This includes popular databases like SQLite, DuckDB, PostgreSQL, MySQL, and many more.

Assuming you have a database set up and accessible, you can pass a SQLAlchemy database URL to create_engine(), and then pass the resulting engine to querychat. Below are some examples for common databases.

pip install duckdb duckdb-engine
duckdb-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat

# Assumes my_database.duckdb is in the same directory as this script
db_path = Path(__file__).parent / "my_database.duckdb"
engine = create_engine(f"duckdb:///{db_path}")

qc = QueryChat(engine, "my_table")
app = qc.app()

sqlite-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat

# Assumes my_database.db is in the same directory as this script
db_path = Path(__file__).parent / "my_database.db"
engine = create_engine(f"sqlite:///{db_path}")

qc = QueryChat(engine, "my_table")
app = qc.app()

pip install psycopg2-binary
postgresql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat

engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()

pip install pymysql
mysql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat

engine = create_engine("mysql+pymysql://user:password@localhost:3306/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()
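
The PostgreSQL and MySQL URLs above embed a literal user name and password. As a minimal sketch, assuming credentials are kept in environment variables, the URL can be assembled at startup instead. The helper name build_db_url and the variable names DB_USER and DB_PASSWORD are hypothetical and not part of querychat or SQLAlchemy. The credentials are percent-encoded because characters such as "@" or ":" in a password would otherwise be read as URL delimiters.

```python
import os
from urllib.parse import quote_plus


def build_db_url(driver: str, host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy-style database URL from environment credentials."""
    # DB_USER and DB_PASSWORD are hypothetical names chosen for this sketch.
    user = os.environ.get("DB_USER", "user")
    password = os.environ.get("DB_PASSWORD", "password")
    # Percent-encode both parts so special characters do not break the URL.
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


url = build_db_url("postgresql+psycopg2", "localhost", 5432, "mydatabase")
```

The resulting string can be passed to create_engine() in place of the hard-coded URL in the examples above.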

If you don’t have a database set up, you can easily create a local DuckDB database from a CSV file using the following code:

create-duckdb.py
import duckdb

conn = duckdb.connect("my_database.duckdb")

conn.execute("""
    CREATE TABLE my_table AS
    SELECT * FROM read_csv_auto('path/to/your/file.csv')
""")

Or, if you have a pandas DataFrame, you can create the DuckDB database like so:

create-duckdb-from-pandas.py
import duckdb
from querychat.data import titanic

conn = duckdb.connect("my_database.duckdb")
# register() exposes the data frame to DuckDB under the name titanic_df
conn.register('titanic_df', titanic())
conn.execute("""
    CREATE TABLE titanic AS
    SELECT * FROM titanic_df
""")

Then you can connect to this database using the DuckDB example above, changing the table name as appropriate.
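
A comparable local database for the SQLite example can be sketched with only the Python standard library. The helper name csv_to_sqlite is hypothetical, and the sketch assumes the CSV file has a header row and that every row has the same number of fields. Unlike DuckDB's read_csv_auto, it performs no type inference: columns are created without declared types, so every value is stored as text.

```python
import csv
import sqlite3


def csv_to_sqlite(csv_path: str, db_path: str, table: str) -> int:
    """Load a CSV file with a header row into a new SQLite table.

    Returns the number of data rows inserted.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    def quote(name: str) -> str:
        # Quote identifiers so names with spaces or keywords still work.
        return '"' + name.replace('"', '""') + '"'

    columns = ", ".join(quote(name) for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
            conn.executemany(
                f"INSERT INTO {quote(table)} VALUES ({placeholders})", rows
            )
    finally:
        conn.close()
    return len(rows)
```

Calling csv_to_sqlite("path/to/your/file.csv", "my_database.db", "my_table") would produce a file matching the names used in the SQLite example above.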

Custom sources

If you have a custom data source that doesn’t fit into the above categories, you can implement the DataSource interface/protocol. This requires implementing methods for getting schema information and executing queries.
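
The shape of such a source can be illustrated without depending on querychat itself. The sketch below defines a hypothetical DataSourceLike protocol with two methods, get_schema and execute_query, matching the two responsibilities named above. These names and signatures are assumptions made for illustration; the actual DataSource interface may use different names or require additional methods, so check the querychat API reference before implementing one.

```python
import sqlite3
from typing import Any, Protocol


class DataSourceLike(Protocol):
    """Hypothetical stand-in for querychat's DataSource; names are illustrative."""

    def get_schema(self) -> str: ...

    def execute_query(self, query: str) -> list[dict[str, Any]]: ...


class SQLiteTableSource:
    """Serves a single table from an open SQLite connection."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self.conn = conn
        self.table = table

    def get_schema(self) -> str:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        quoted = '"' + self.table.replace('"', '""') + '"'
        info = self.conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        lines = [f"- {name} ({col_type or 'UNKNOWN'})" for _, name, col_type, *_ in info]
        return f"Table: {self.table}\nColumns:\n" + "\n".join(lines)

    def execute_query(self, query: str) -> list[dict[str, Any]]:
        cursor = self.conn.execute(query)
        # description is None for statements that return no rows.
        names = [d[0] for d in cursor.description or []]
        return [dict(zip(names, row)) for row in cursor.fetchall()]


# Small in-memory demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, mpg REAL)")
conn.execute("INSERT INTO cars VALUES ('Mazda RX4', 21.0)")
source: DataSourceLike = SQLiteTableSource(conn, "cars")
print(source.get_schema())
print(source.execute_query("SELECT name FROM cars WHERE mpg > 20"))
```

Returning the schema as plain text keeps the sketch simple. A real implementation would return whatever structure the DataSource interface specifies.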