zarrwhals
DataFrame-agnostic Zarr storage powered by Narwhals
Bring your DataFrame to Zarr.
Currently works with:
- pandas DataFrames
- Polars (DataFrames & LazyFrames)
- Dask DataFrames
Quick Start
# pandas: write a DataFrame to a Zarr store, then read it back
import pandas as pd
import zarrwhals as zw

df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "value": [10.5, 20.3, 30.1],
    "category": pd.Categorical(["A", "B", "A"]),
})

zw.to_zarr(df, "data.zarr")
df_loaded = zw.from_zarr("data.zarr")

# Polars: the same round trip, requesting the Polars backend on read
import polars as pl
import zarrwhals as zw

df = pl.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "value": [10.5, 20.3, 30.1],
})

zw.to_zarr(df, "data.zarr")
df_loaded = zw.from_zarr("data.zarr", backend="polars")
Installation
pip install git+https://github.com/srivarra/zarrwhals.git@main
uv add git+https://github.com/srivarra/zarrwhals.git@main
How It Works
pandas / polars / dask → Narwhals → zarrwhals → Zarr Storage
zarrwhals uses Narwhals to serialize DataFrames into Zarr stores, recording type information in a library-neutral interchange format. As a result, a store can be read back as a pandas, Polars, or Dask DataFrame, regardless of which library wrote it.
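To make the idea of a type-preserving interchange format more concrete, here is a toy sketch in pure Python. It is not zarrwhals' actual on-disk layout or API: the names `write_store` and `read_store`, the `_meta.json` file, and the dtype labels are all hypothetical. It only illustrates the principle, which is that storing each column next to a backend-neutral dtype label lets any reader rebuild the data in its own representation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical backend-neutral dtype labels (not zarrwhals' real schema).
_CASTS = {"int64": int, "float64": float, "string": str}


def write_store(columns: dict, root: Path) -> None:
    """Write one file per column plus a metadata file recording each dtype.

    `columns` maps a column name to a (dtype_label, values) pair.
    """
    root.mkdir(parents=True, exist_ok=True)
    meta = {}
    for name, (dtype, values) in columns.items():
        if dtype not in _CASTS:
            raise ValueError(f"unsupported dtype: {dtype}")
        meta[name] = dtype
        (root / f"{name}.json").write_text(json.dumps(values))
    (root / "_meta.json").write_text(json.dumps(meta))


def read_store(root: Path, as_rows: bool = False):
    """Rebuild the data column-wise (dict of lists) or row-wise (list of dicts)."""
    meta = json.loads((root / "_meta.json").read_text())
    columns = {}
    for name, dtype in meta.items():
        raw = json.loads((root / f"{name}.json").read_text())
        # Cast using the stored label, so the reader never has to guess types.
        columns[name] = [_CASTS[dtype](v) for v in raw]
    if not as_rows:
        return columns
    n_rows = len(next(iter(columns.values()), []))
    return [{name: col[i] for name, col in columns.items()} for i in range(n_rows)]


# Write once, then read the same store back in two different shapes.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "data.store"
    write_store(
        {
            "id": ("int64", [1, 2, 3]),
            "name": ("string", ["Alice", "Bob", "Charlie"]),
            "value": ("float64", [10.5, 20.3, 30.1]),
        },
        store,
    )
    columnar = read_store(store)
    row_wise = read_store(store, as_rows=True)
```

The two return shapes stand in for different DataFrame libraries: the writer does not need to know which one the reader will pick, because the dtype labels travel with the data.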
Next Steps
- API Reference — Function documentation
- Architecture — How zarrwhals works
- Contributing — Get involved