# allbirds.com — catalog pull

**Generated**: 2026-04-20  •  **Source**: public Shopify `/products.json` endpoint (no auth, no key)

**Method**: paginate through `/products.json?page=N&limit=250` and normalize each product to its first variant's price, its category, and its handle.
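
The method above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function names, the output keys, the `max_pages` guard, the stop-on-empty-page rule, and the `/products/<handle>` URL pattern are all assumptions made for the example.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen


def fetch_page(base_url: str, page: int, limit: int = 250) -> list[dict]:
    """Fetch one page of the public products feed (performs a network call)."""
    url = f"{base_url}/products.json?page={page}&limit={limit}"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp).get("products", [])


def iter_products(
    base_url: str,
    fetch: Callable[..., list[dict]] = fetch_page,
    max_pages: int = 100,  # hypothetical safety cap, not from the report
) -> Iterator[dict]:
    """Walk pages in order and stop at the first empty one (assumed stop rule)."""
    for page in range(1, max_pages + 1):
        products = fetch(base_url, page)
        if not products:
            break
        yield from products


def normalize(product: dict, base_url: str) -> dict:
    """Reduce one raw product to first-variant price, category, and handle."""
    variants = product.get("variants") or []
    raw_price = variants[0].get("price") if variants else None
    return {
        "external_id": str(product["id"]),
        "title": product.get("title", ""),
        # The feed returns prices as strings such as "98.00".
        "price": float(raw_price) if raw_price not in (None, "") else None,
        "category": product.get("product_type") or None,
        "handle": product["handle"],
        # Assumed URL pattern for the product page.
        "source_url": f"{base_url}/products/{product['handle']}",
    }
```

Passing `fetch` as a parameter keeps the pagination loop testable without network access: a stub that returns canned pages can stand in for `fetch_page`.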

---

## Headline numbers

- **Products**: 987
- **With price**: 987
- **Price band**: $2.0 – $160.0 (median $75.0)

## Price distribution (USD)

| Percentile | Price |
|---|---:|
| p10 | $22.0 |
| p25 | $49.0 |
| median | $75.0 |
| p75 | $105.0 |
| p90 | $135.0 |
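
For reference, a percentile table like the one above could be derived from the normalized prices along these lines. The report does not state which percentile definition was used, so the linear interpolation here is an assumption and may not reproduce the figures exactly; `percentile` and `price_summary` are hypothetical helper names.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile of a sorted list, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("no values to summarize")
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def price_summary(prices):
    """Return the five cut points shown in the table, skipping missing prices."""
    values = sorted(p for p in prices if p is not None)
    cuts = [("p10", 10), ("p25", 25), ("median", 50), ("p75", 75), ("p90", 90)]
    return {label: percentile(values, q) for label, q in cuts}
```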

## Top categories by product count

- `Shoes` — 813
- `Apparel` — 95
- `Socks` — 42
- `Accessories` — 34
- `Promotion` — 2
- `Underwear` — 1
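
A tally like the list above can be produced with a plain counter over the normalized rows. This is a sketch that assumes each row is a dict with a `category` key; the `(uncategorized)` label for empty values is an invented placeholder.

```python
from collections import Counter


def top_categories(rows, n=10):
    """Count products per category and return the n most common as (name, count)."""
    counts = Counter(row.get("category") or "(uncategorized)" for row in rows)
    return counts.most_common(n)
```

As a sanity check, the counts should sum to the product total; in this pull, 813 + 95 + 42 + 34 + 2 + 1 = 987.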

## What the pipeline does with this output

In a real product-research pilot, this CSV feeds:

1. **Review ingest** — pull public reviews per product for sentiment + pain-point extraction.
2. **Competition scoring** — cross-market margin benchmarking against comparable Shopify / Amazon / Lazada catalogs.
3. **Opportunity brief** — Claude-drafted summary per SKU cluster: pricing gaps, review themes, angle ideas.

Those downstream steps aren't run for this sample — the goal is to show the raw data shape the system produces on a real store.

## Caveats

- Only products exposed via `/products.json` are included — locked SKUs, wholesale-only items, and some draft products are hidden.
- `price` is the first variant's price; products with multiple variants (size/color) aren't broken out per variant in this sample.
- `category` is the store's own `product_type` field, which is inconsistent across stores but useful within a single catalog.

---

*Dataset schema: 7 columns — id, title, price, currency, category, source_url, external_id. Full variant + inventory data is preserved in the raw JSONB column in Postgres.*
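
To make the data shape concrete, here is a small sketch of writing rows in the seven-column layout named above. Only the column names and their order come from the schema note; the `write_rows` helper, and the choice to leave missing fields blank and drop extra keys, are assumptions for illustration.

```python
import csv

# Column order taken from the schema note.
COLUMNS = ["id", "title", "price", "currency", "category", "source_url", "external_id"]


def write_rows(rows, out):
    """Write dict rows to a file-like object as CSV with the 7-column header."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="", extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.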