import pointblank as pb
small_table = pb.load_dataset()
pb.preview(small_table)
Load a dataset hosted in the library as the specified DataFrame type.
dataset : Literal['small_table', 'game_revenue'] = 'small_table'
The name of the dataset to load. Current options are "small_table" and "game_revenue".
tbl_type : Literal['polars', 'pandas', 'duckdb'] = 'polars'
The type of DataFrame to generate from the dataset. The named options are "polars", "pandas", and "duckdb".
Returns: FrameT | Any
The dataset for the Validate object. This could be a Polars DataFrame, a Pandas DataFrame, or a DuckDB table as an Ibis table.
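Because both parameters accept only a fixed set of string values, it can be convenient to check them before calling `load_dataset()`. The following is a minimal sketch under stated assumptions: the helper `checked_args` and the two tuples are hypothetical names, not part of pointblank, and the library may well perform its own validation with different error messages.

```python
# Option names as listed in the parameter descriptions above.
DATASETS = ("small_table", "game_revenue")
TBL_TYPES = ("polars", "pandas", "duckdb")


def checked_args(dataset="small_table", tbl_type="polars"):
    """Validate the two arguments and return them as keyword arguments.

    Hypothetical helper for illustration only; not a pointblank function.
    """
    if dataset not in DATASETS:
        raise ValueError(f"dataset must be one of {DATASETS}, got {dataset!r}")
    if tbl_type not in TBL_TYPES:
        raise ValueError(f"tbl_type must be one of {TBL_TYPES}, got {tbl_type!r}")
    return {"dataset": dataset, "tbl_type": tbl_type}
```

With pointblank imported as `pb`, the result can be unpacked directly into the call, for example `pb.load_dataset(**checked_args("game_revenue", "pandas"))`.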
There are two included datasets that can be loaded using the load_dataset() function:

- small_table: A small dataset with 13 rows and 8 columns. This dataset is useful for testing and demonstration purposes.
- game_revenue: A dataset with 2000 rows and 11 columns. Provides revenue data for a game development company. For the particular game, there are records of player sessions, the items they purchased, ads viewed, and the revenue generated.

The tbl_type= parameter can be set to one of the following:

- "polars": A Polars DataFrame.
- "pandas": A Pandas DataFrame.
- "duckdb": An Ibis table for a DuckDB database.

Load the small_table dataset as a Polars DataFrame by calling load_dataset() with its defaults:
Polars | Rows: 13 | Columns: 8

| | date_time (Datetime) | date (Date) | a (Int64) | b (String) | c (Int64) | d (Float64) | e (Boolean) | f (String) |
|---|---|---|---|---|---|---|---|---|
1 | 2016-01-04 11:00:00 | 2016-01-04 | 2 | 1-bcd-345 | 3 | 3423.29 | True | high |
2 | 2016-01-04 00:32:00 | 2016-01-04 | 3 | 5-egh-163 | 8 | 9999.99 | True | low |
3 | 2016-01-05 13:32:00 | 2016-01-05 | 6 | 8-kdg-938 | 3 | 2343.23 | True | high |
4 | 2016-01-06 17:23:00 | 2016-01-06 | 2 | 5-jdo-903 | None | 3892.4 | False | mid |
5 | 2016-01-09 12:36:00 | 2016-01-09 | 8 | 3-ldm-038 | 7 | 283.94 | True | low |
9 | 2016-01-20 04:30:00 | 2016-01-20 | 3 | 5-bce-642 | 9 | 837.93 | False | high |
10 | 2016-01-20 04:30:00 | 2016-01-20 | 3 | 5-bce-642 | 9 | 837.93 | False | high |
11 | 2016-01-26 20:07:00 | 2016-01-26 | 4 | 2-dmx-010 | 7 | 833.98 | True | low |
12 | 2016-01-28 02:51:00 | 2016-01-28 | 2 | 7-dmx-010 | 8 | 108.34 | False | low |
13 | 2016-01-30 11:23:00 | 2016-01-30 | 1 | 3-dka-303 | None | 2230.09 | True | high |
Note that small_table is a simple Polars DataFrame, and that the preview() function displays the table in an environment that can render HTML.
The game_revenue dataset can be loaded as a Pandas DataFrame by specifying the dataset name and setting tbl_type="pandas":
import pointblank as pb
game_revenue = pb.load_dataset(dataset="game_revenue", tbl_type="pandas")
pb.preview(game_revenue)
Pandas | Rows: 2000 | Columns: 11

| | player_id (object) | session_id (object) | session_start (datetime64[ns, UTC]) | time (datetime64[ns, UTC]) | item_type (object) | item_name (object) | item_revenue (float64) | session_duration (float64) | start_day (datetime64[ns]) | acquisition (object) | country (object) |
|---|---|---|---|---|---|---|---|---|---|---|---|
1 | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:31:27+00:00 | iap | offer2 | 8.99 | 16.3 | 2015-01-01 00:00:00 | Germany | |
2 | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:36:57+00:00 | iap | gems3 | 22.49 | 16.3 | 2015-01-01 00:00:00 | Germany | |
3 | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:37:45+00:00 | iap | gold7 | 107.99 | 16.3 | 2015-01-01 00:00:00 | Germany | |
4 | ECPANOIXLZHF896 | ECPANOIXLZHF896-eol2j8bs | 2015-01-01 01:31:03+00:00 | 2015-01-01 01:42:33+00:00 | ad | ad_20sec | 0.76 | 16.3 | 2015-01-01 00:00:00 | Germany | |
5 | ECPANOIXLZHF896 | ECPANOIXLZHF896-hdu9jkls | 2015-01-01 11:50:02+00:00 | 2015-01-01 11:55:20+00:00 | ad | ad_5sec | 0.03 | 35.2 | 2015-01-01 00:00:00 | Germany | |
1996 | NAOJRDMCSEBI281 | NAOJRDMCSEBI281-j2vs9ilp | 2015-01-21 01:57:50+00:00 | 2015-01-21 02:02:50+00:00 | ad | ad_survey | 1.332 | 25.8 | 2015-01-11 00:00:00 | organic | Norway |
1997 | NAOJRDMCSEBI281 | NAOJRDMCSEBI281-j2vs9ilp | 2015-01-21 01:57:50+00:00 | 2015-01-21 02:22:14+00:00 | ad | ad_survey | 1.35 | 25.8 | 2015-01-11 00:00:00 | organic | Norway |
1998 | RMOSWHJGELCI675 | RMOSWHJGELCI675-vbhcsmtr | 2015-01-21 02:39:48+00:00 | 2015-01-21 02:40:00+00:00 | ad | ad_5sec | 0.03 | 8.4 | 2015-01-10 00:00:00 | other_campaign | France |
1999 | RMOSWHJGELCI675 | RMOSWHJGELCI675-vbhcsmtr | 2015-01-21 02:39:48+00:00 | 2015-01-21 02:47:12+00:00 | iap | offer5 | 26.09 | 8.4 | 2015-01-10 00:00:00 | other_campaign | France |
2000 | GJCXNTWEBIPQ369 | GJCXNTWEBIPQ369-9elq67md | 2015-01-21 03:59:23+00:00 | 2015-01-21 04:06:29+00:00 | ad | ad_5sec | 0.12 | 18.5 | 2015-01-14 00:00:00 | organic | United States |
The game_revenue dataset is more realistic, with a mix of data types, and at 2000 rows and 11 columns it is significantly larger than small_table.
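The row and column counts quoted above can be checked against a loaded table. The sketch below is illustrative only: `DOCUMENTED_SHAPES`, `matches_documented_shape`, and `check_all` are hypothetical names, and the comparison assumes the table exposes a `.shape` attribute, which holds for Polars and Pandas DataFrames but is not assumed for the Ibis table returned with tbl_type="duckdb".

```python
# Dimensions as stated in this documentation: (rows, columns).
DOCUMENTED_SHAPES = {
    "small_table": (13, 8),
    "game_revenue": (2000, 11),
}


def matches_documented_shape(tbl, dataset):
    """Return True if `tbl.shape` equals the documented shape for `dataset`."""
    return tuple(tbl.shape) == DOCUMENTED_SHAPES[dataset]


def check_all(tbl_type="polars"):
    """Load each included dataset and compare its shape with the documented one.

    Intended for tbl_type="polars" or "pandas"; see the assumption about
    `.shape` in the text above.
    """
    # Imported lazily so the helper above has no hard dependency on pointblank.
    import pointblank as pb

    return {
        name: matches_documented_shape(
            pb.load_dataset(dataset=name, tbl_type=tbl_type), name
        )
        for name in DOCUMENTED_SHAPES
    }
```

Calling `check_all("pandas")` with pointblank installed returns a dictionary keyed by dataset name, with `True` wherever the loaded table has the documented dimensions.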