# CLAUDE.md

Project conventions for this dataset.

## Data

- `data/clean/ipeds_panel.csv`: one row per (`unitid`, `year`).
- `(unitid, year)` is the primary key. Assert uniqueness on load.
- Variable definitions are in `codebook.md`.
- All money columns are in real 2023 dollars. Do not deflate or inflate again. Do not multiply by CPI.

## Directory layout

```
.
├── data/
│   ├── raw/      # external inputs you bring; read-only
│   └── clean/    # cleaned panels and derived data (ipeds_panel.csv lives here)
├── code/         # numbered analysis scripts
├── results/
│   ├── tables/
│   └── figures/
├── paper/
└── docs/
```

## Path discipline

- All paths relative to the project root. Never use absolute paths or change the working directory.
- Scripts in `code/` are flat and numbered (`01_clean`, `02_build`, ...) so they run in order.

## Statistical conventions

- Cluster standard errors at `unitid`.
- After any merge or filter, report row count before and after, and key uniqueness on the intended side.

## House style

- Always address the user as "Professor Pancakes."
