Data Sources

querychat supports many types of data sources, including:

  1. Any narwhals-compatible data frame.
  2. Any SQLAlchemy database.
  3. A custom DataSource interface/protocol.

The sections below describe how to use each type of data source with querychat.

Data frames

You can use any narwhals-compatible data frame as a data source in querychat. This includes popular data frame libraries like pandas, polars, pyarrow, and many more.

pandas-app.py
import pandas as pd
from querychat import QueryChat

mtcars = pd.read_csv(
    "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)

qc = QueryChat(mtcars, "mtcars")
app = qc.app()

polars-app.py
import polars as pl
from querychat import QueryChat

mtcars = pl.read_csv(
    "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)

qc = QueryChat(mtcars, "mtcars")
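app = qc.app()

pyarrow-app.py
import io
import urllib.request

import pyarrow.csv as pv
from querychat import QueryChat

url = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"

# pyarrow.csv.read_csv() reads local paths or file-like objects, not URLs,
# so download the CSV first; it returns a pyarrow.Table directly
with urllib.request.urlopen(url) as response:
    mtcars = pv.read_csv(io.BytesIO(response.read()))

qc = QueryChat(mtcars, "mtcars")
app = qc.app()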

If you’re building an app, you can read the queried data frame reactively with the df() method, which returns a pandas.DataFrame by default.

Databases

You can also connect querychat directly to any database supported by SQLAlchemy. This includes popular databases like SQLite, DuckDB, PostgreSQL, MySQL, and many more.

Assuming you have a database set up and accessible, you can pass a SQLAlchemy database URL to create_engine(), and then pass the resulting engine to QueryChat. Below are some examples for common databases.

pip install duckdb duckdb-engine
duckdb-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat

# Assumes my_database.duckdb is in the same directory as this script
db_path = Path(__file__).parent / "my_database.duckdb"
engine = create_engine(f"duckdb:///{db_path}")

qc = QueryChat(engine, "my_table")
app = qc.app()

sqlite-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat

# Assumes my_database.db is in the same directory as this script
db_path = Path(__file__).parent / "my_database.db"
engine = create_engine(f"sqlite:///{db_path}")

qc = QueryChat(engine, "my_table")
app = qc.app()

pip install psycopg2-binary
postgresql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat

engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()

pip install pymysql
mysql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat

engine = create_engine("mysql+pymysql://user:password@localhost:3306/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()
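
The PostgreSQL and MySQL URLs above embed a literal user and password. As a minimal sketch of one alternative, the URL can be assembled from environment variables, percent-encoding the credentials so that characters such as @ or / do not break URL parsing. The helper name build_db_url and the DB_USER and DB_PASSWORD variable names are hypothetical and are not part of querychat or SQLAlchemy.

```python
import os
from urllib.parse import quote_plus


def build_db_url(driver, user, password, host, port, database):
    """Assemble a SQLAlchemy-style database URL (hypothetical helper)."""
    # Percent-encode credentials so "@", ":" or "/" inside them are not
    # mistaken for URL delimiters.
    return (
        f"{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# DB_USER and DB_PASSWORD are assumed environment variable names;
# the fallbacks mirror the placeholder values used in the examples above.
url = build_db_url(
    "postgresql+psycopg2",
    os.environ.get("DB_USER", "user"),
    os.environ.get("DB_PASSWORD", "password"),
    "localhost",
    5432,
    "mydatabase",
)
```

The resulting string can be passed to create_engine() in the same way as the hard-coded URLs.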

If you don’t have a database set up, you can easily create a local DuckDB database from a CSV file using the following code:

create-duckdb.py
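import duckdb

conn = duckdb.connect("my_database.duckdb")

# read_csv_auto() infers column names and types from the CSV file
conn.execute("""
    CREATE TABLE my_table AS
    SELECT * FROM read_csv_auto('path/to/your/file.csv')
""")

# Close the connection so the database file can be opened elsewhere
conn.close()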

Or, if you have a pandas DataFrame, you can create the DuckDB database like so:

create-duckdb-from-pandas.py
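import duckdb
from querychat.data import titanic

conn = duckdb.connect("my_database.duckdb")

# Expose the data frame to DuckDB under the name titanic_df
conn.register('titanic_df', titanic())

# Copy it into a table stored in the database file
conn.execute("""
    CREATE TABLE titanic AS
    SELECT * FROM titanic_df
""")

# Close the connection so the database file can be opened elsewhere
conn.close()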

Then you can connect to this database using the DuckDB example above, changing the table name as appropriate.
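
For the SQLite example, a comparable local database can be built from a CSV file using only the Python standard library. The following is a minimal sketch rather than a recommended loader: the function name csv_to_sqlite is hypothetical, the CSV is assumed to have a header row, and every value is stored as text because no column types are declared (unlike DuckDB's read_csv_auto, which infers them).

```python
import csv
import sqlite3


def csv_to_sqlite(csv_path, db_path, table):
    """Load a CSV file with a header row into a new SQLite table.

    Returns the number of data rows inserted.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        # Skip completely blank lines, which csv.reader yields as [].
        rows = [row for row in reader if row]

    # Quote identifiers so names with spaces or SQL keywords still work.
    def quote(name):
        return '"' + name.replace('"', '""') + '"'

    columns = ", ".join(quote(c) for c in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
        conn.executemany(
            f"INSERT INTO {quote(table)} VALUES ({placeholders})", rows
        )
        conn.commit()
    finally:
        conn.close()
    return len(rows)
```

Calling csv_to_sqlite("path/to/your/file.csv", "my_database.db", "my_table") would produce a file matching the name the SQLite example above expects.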

Custom sources

If you have a custom data source that doesn’t fit into the above categories, you can implement the DataSource interface/protocol. This requires implementing methods for getting schema information and executing queries.
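
As a rough illustration of those two responsibilities, here is a minimal sketch of a schema-plus-query interface backed by an in-memory SQLite database. The names DataSourceLike, SQLiteSource, get_schema, and execute_query are hypothetical and chosen for this example only; consult the querychat API reference for the actual DataSource method names and signatures before implementing one.

```python
import sqlite3
from typing import Protocol


class DataSourceLike(Protocol):
    """Hypothetical stand-in for the shape of a data source interface."""

    def get_schema(self) -> str: ...

    def execute_query(self, query: str) -> list[tuple]: ...


class SQLiteSource:
    """Example implementation over a sqlite3 connection (illustrative only)."""

    def __init__(self, conn: sqlite3.Connection, table: str):
        self.conn = conn
        self.table = table

    def get_schema(self) -> str:
        # PRAGMA table_info returns one row per column:
        # (cid, name, type, notnull, dflt_value, pk)
        info = self.conn.execute(f'PRAGMA table_info("{self.table}")').fetchall()
        lines = [f"- {name} ({ctype or 'UNKNOWN'})" for _, name, ctype, *_ in info]
        return f"Table: {self.table}\nColumns:\n" + "\n".join(lines)

    def execute_query(self, query: str) -> list[tuple]:
        # Run the SQL as given and return all result rows.
        return self.conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, mpg REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("Mazda RX4", 21.0), ("Datsun 710", 22.8)],
)

source: DataSourceLike = SQLiteSource(conn, "cars")
schema = source.get_schema()
rows = source.execute_query("SELECT name FROM cars WHERE mpg > 22")
```

Here the schema description is a plain string and query results are lists of tuples; a real implementation would return whatever types the querychat interface specifies.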