Data Pipeline
Build a complete ETL pipeline from raw data to validated output.
A step-by-step guide to building a data pipeline.
Step 1: Extract
raw = [100, None, 200, None, 300]
Step 2: Transform
from gdtest_sec_index_hero import process
clean = process(raw, strict=True) # [100, 200, 300]
Step 3: Load
from gdtest_sec_index_hero import summarize
output = summarize(clean)
print(output) # {'count': 3, 'sum': 600, 'mean': 200.0}
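The three steps can also be tried without installing gdtest_sec_index_hero. The sketch below defines local stand-ins for process and summarize. Their bodies are assumptions inferred only from the outputs shown in the comments above, not the package's actual implementation. In particular, the meaning of strict here (reject any remaining non-numeric value) is a guess made for illustration.

```python
# Hypothetical stand-ins for gdtest_sec_index_hero.process and
# gdtest_sec_index_hero.summarize; the real functions may behave differently.

def process(values, strict=False):
    """Drop None entries. ASSUMPTION: strict=True also rejects non-numeric values."""
    cleaned = [v for v in values if v is not None]
    if strict:
        for v in cleaned:
            # bool is a subclass of int, so exclude it explicitly
            if isinstance(v, bool) or not isinstance(v, (int, float)):
                raise TypeError(f"non-numeric value in input: {v!r}")
    return cleaned


def summarize(values):
    """Return count, sum, and mean of a list of numbers."""
    if not values:
        # ASSUMPTION: an empty input has no meaningful mean
        return {"count": 0, "sum": 0, "mean": None}
    total = sum(values)
    return {"count": len(values), "sum": total, "mean": total / len(values)}


# Step 1: Extract
raw = [100, None, 200, None, 300]

# Step 2: Transform
clean = process(raw, strict=True)

# Step 3: Load
output = summarize(clean)
print(output)  # {'count': 3, 'sum': 600, 'mean': 200.0}
```

Keeping the transform and the summary as separate functions means each step can be checked on its own before the pipeline is wired together.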