Data Sources
querychat supports several different data sources, including:
- Any narwhals-compatible data frame.
- Any SQLAlchemy database.
- A custom DataSource interface/protocol.
The sections below describe how to use each type of data source with querychat.
Data frames
You can use any narwhals-compatible data frame as a data source in querychat. This includes popular data frame libraries like pandas, polars, pyarrow, and many more.
pandas-app.py
import pandas as pd
from querychat import QueryChat
mtcars = pd.read_csv(
"https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)
qc = QueryChat(mtcars, "mtcars")
app = qc.app()
polars-app.py
import polars as pl
from querychat import QueryChat
mtcars = pl.read_csv(
"https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
)
qc = QueryChat(mtcars, "mtcars")
app = qc.app()
pyarrow-app.py
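from urllib.request import urlopen
import pyarrow as pa
import pyarrow.csv as pv
from querychat import QueryChat
url = "https://gist.githubusercontent.com/seankross/a412dfbd88b3db70b74b/raw/5f23f993cd87c283ce766e7ac6b329ee7cc2e1d1/mtcars.csv"
# pyarrow's CSV reader expects a local path or file-like object, so fetch the bytes first
with urlopen(url) as response:
    mtcars = pv.read_csv(pa.BufferReader(response.read()))  # already returns a pyarrow.Table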
qc = QueryChat(mtcars, "mtcars")
app = qc.app()
If you’re building an app, you can read the queried data frame reactively with the df() method, which returns a narwhals.DataFrame. Call .to_native() on the result to get the underlying pandas or polars DataFrame.
Databases
You can also connect querychat directly to a table in any database supported by SQLAlchemy. This includes popular databases like SQLite, DuckDB, PostgreSQL, MySQL, and many more.
Assuming you have a database set up and accessible, you can pass a SQLAlchemy database URL to create_engine(), and then pass the resulting engine to querychat. Below are some examples for common databases.
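One caveat for URLs like the ones in these examples: SQLAlchemy parses the URL, so a password containing characters such as @ or / generally has to be percent-encoded first. A minimal sketch using only the standard library; the user name and password are hypothetical placeholders:

```python
from urllib.parse import quote_plus

# Hypothetical credentials, for illustration only
user = "analyst"
password = "p@ss/word!"

# Percent-encode the password so "@" and "/" are not read as URL delimiters
safe_password = quote_plus(password)
url = f"postgresql+psycopg2://{user}:{safe_password}@localhost:5432/mydatabase"
print(url)
```

The encoded URL can then be passed to create_engine() in the usual way.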
pip install duckdb duckdb-engine
duckdb-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat
# Assumes my_database.duckdb is in the same directory as this script
db_path = Path(__file__).parent / "my_database.duckdb"
engine = create_engine(f"duckdb:///{db_path}")
qc = QueryChat(engine, "my_table")
app = qc.app()
sqlite-app.py
from pathlib import Path
from sqlalchemy import create_engine
from querychat import QueryChat
# Assumes my_database.db is in the same directory as this script
db_path = Path(__file__).parent / "my_database.db"
engine = create_engine(f"sqlite:///{db_path}")
qc = QueryChat(engine, "my_table")
app = qc.app()
pip install psycopg2-binary
postgresql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat
engine = create_engine("postgresql+psycopg2://user:password@localhost:5432/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()
pip install pymysql
mysql-app.py
from sqlalchemy import create_engine
from querychat import QueryChat
engine = create_engine("mysql+pymysql://user:password@localhost:3306/mydatabase")
qc = QueryChat(engine, "my_table")
app = qc.app()
If you don’t have a database set up, you can create a local DuckDB database from a CSV file using the following code:
create-duckdb.py
import duckdb
conn = duckdb.connect("my_database.duckdb")
conn.execute("""
CREATE TABLE my_table AS
SELECT * FROM read_csv_auto('path/to/your/file.csv')
""")
Or, if you have a pandas DataFrame, you can create the DuckDB database like so:
create-duckdb-from-pandas.py
import duckdb
import pandas as pd
from querychat.data import titanic
conn = duckdb.connect("my_database.duckdb")
conn.register('titanic_df', titanic())
conn.execute("""
CREATE TABLE titanic AS
SELECT * FROM titanic_df
""")
Then you can connect to this database using the DuckDB example above, changing the table name as appropriate.
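For the SQLite example above, a comparable database file can be built with Python's standard library alone. The sketch below is one possible approach and is not part of querychat: the csv_to_sqlite helper and the sample CSV contents are hypothetical, and every column is stored as TEXT for simplicity.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def csv_to_sqlite(csv_path, db_path, table):
    """Load a CSV file with a header row into a new SQLite table (all TEXT columns)."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Quote identifiers so column names containing spaces still work
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)

    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)


# Demo with a tiny sample CSV written to a temporary directory
workdir = Path(tempfile.mkdtemp())
csv_file = workdir / "file.csv"
csv_file.write_text("name,mpg\nMazda RX4,21.0\nDatsun 710,22.8\n")

db_file = workdir / "my_database.db"
n_rows = csv_to_sqlite(csv_file, db_file, "my_table")
print(f"Inserted {n_rows} rows into {db_file}")
```

The resulting file can then be opened with a sqlite:/// URL, as in the SQLite example.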
Custom sources
If you have a custom data source that doesn’t fit into the above categories, you can implement the DataSource interface/protocol. This requires implementing methods for getting schema information and executing queries.
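As a rough illustration of what such an implementation involves, the sketch below defines a stand-in protocol with one method for schema information and one for running queries, backed by an in-memory SQLite database. The names TabularSource, SQLiteSource, get_schema, and execute_query are hypothetical and are not querychat's actual DataSource API; consult the querychat reference for the real method names and signatures.

```python
import sqlite3
from typing import Any, Protocol


class TabularSource(Protocol):
    """Hypothetical stand-in for a data source interface (not querychat's API)."""

    def get_schema(self) -> str: ...

    def execute_query(self, query: str) -> list[tuple[Any, ...]]: ...


class SQLiteSource:
    """Example implementation backed by a sqlite3 connection."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self.conn = conn
        self.table = table

    def get_schema(self) -> str:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        info = self.conn.execute(f'PRAGMA table_info("{self.table}")').fetchall()
        columns = "\n".join(f"- {name} ({col_type})" for _, name, col_type, *_ in info)
        return f"Table: {self.table}\nColumns:\n{columns}"

    def execute_query(self, query: str) -> list[tuple[Any, ...]]:
        return self.conn.execute(query).fetchall()


# Small in-memory table to exercise both methods
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (name TEXT, mpg REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [("Mazda RX4", 21.0), ("Datsun 710", 22.8)],
)

source: TabularSource = SQLiteSource(conn, "cars")
print(source.get_schema())
print(source.execute_query("SELECT name FROM cars WHERE mpg > 22"))
```

Separating schema description from query execution in this way gives the model a description of the table to write SQL against, while the application keeps control of how each query is actually run.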