DataFrame-agnostic Zarr storage powered by Narwhals

Bring your DataFrame to Zarr.

Currently works with:

  • pandas DataFrames
  • Polars (DataFrames & LazyFrames)
  • Dask DataFrames

Quick Start

import pandas as pd
import zarrwhals as zw

df = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "value": [10.5, 20.3, 30.1],
    "category": pd.Categorical(["A", "B", "A"])
})

zw.to_zarr(df, "data.zarr")
df_loaded = zw.from_zarr("data.zarr")

# The same round trip with Polars
import polars as pl
import zarrwhals as zw

df = pl.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "value": [10.5, 20.3, 30.1]
})

zw.to_zarr(df, "data.zarr")
df_loaded = zw.from_zarr("data.zarr", backend="polars")

Installation

pip install git+https://github.com/srivarra/zarrwhals.git@main
uv add git+https://github.com/srivarra/zarrwhals.git@main

How It Works

pandas / polars / dask  →  Narwhals  →  zarrwhals  →  Zarr Storage

zarrwhals uses Narwhals to serialize DataFrames into Zarr stores, recording each column's type information in a backend-neutral interchange format. Because the types travel with the data, a store can be read back as pandas, Polars, or Dask, regardless of which library wrote it.
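
The idea of a backend-neutral format can be illustrated with a toy model that uses only the Python standard library. This is a conceptual sketch, not zarrwhals's actual on-disk layout or API: the dtype tags, the dictionary layout, and the helper names (encode_column, decode_column, write_table, read_table) are all hypothetical. It shows only that storing a dtype tag next to the raw bytes lets any reader rebuild the column without knowing what produced it.

```python
# Toy model only: NOT the real zarrwhals format or API.
from array import array

# Hypothetical backend-neutral dtype tags mapped to array typecodes.
_TYPECODES = {"int64": "q", "float64": "d"}


def encode_column(values, dtype):
    """Pack one column into raw bytes plus the metadata needed to decode it."""
    if dtype == "string":
        # Strings are joined with a NUL separator (assumes no NUL in the data).
        payload = "\x00".join(values).encode("utf-8")
    else:
        payload = array(_TYPECODES[dtype], values).tobytes()
    return {"dtype": dtype, "length": len(values), "data": payload}


def decode_column(stored):
    """Rebuild a plain Python list using only the stored metadata."""
    if stored["dtype"] == "string":
        if stored["length"] == 0:
            return []
        return stored["data"].decode("utf-8").split("\x00")
    out = array(_TYPECODES[stored["dtype"]])
    out.frombytes(stored["data"])
    return out.tolist()


def write_table(columns):
    """columns maps a column name to a (values, dtype) pair."""
    return {name: encode_column(vals, dt) for name, (vals, dt) in columns.items()}


def read_table(store):
    """Decode every column; the reader never needs to know who wrote it."""
    return {name: decode_column(col) for name, col in store.items()}


store = write_table({
    "id": ([1, 2, 3], "int64"),
    "name": (["Alice", "Bob", "Charlie"], "string"),
    "value": ([10.5, 20.3, 30.1], "float64"),
})
table = read_table(store)
```

In this sketch the writer and reader share nothing except the stored dtype tags, which is the property that allows one library to write a store and another to read it.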


Next Steps