# Working directory layout

Astro stores run data, statistics, and persistent resolver stores under your pipeline directory.

## Pipeline directory

```text
pipeline_dir/
  pipeline.py
  .astro/
    stats.db              # SQLite statistics
  .persistent/
    {name}.parquet        # Canonical ID resolver stores
  .working/
    {run_id}/             # One directory per run
```

## Run directory

```text
.working/{run_id}/
  manifest.json           # Run metadata and step states
  astro.log               # Run-scoped log file
  ingested/
    {ingest_name}.parquet
  processed/              # Step outputs (author-defined subfolders)
  filtered/
    {step_id}/
      {ingest_name}.parquet
  snapshots/
    {step_id}/
      {ingest_name}.parquet
  quarantine/
    {step_id}/
      {ingest_name}.part-00001.parquet
```
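
The layout above can be resolved in code with plain `pathlib`. The following is a minimal sketch, not part of Astro's API: `run_paths` and `quarantine_parts` are hypothetical helper names, and the only thing taken from this page is the directory structure itself.

```python
from pathlib import Path


def run_paths(pipeline_dir, run_id):
    """Return the documented locations inside one run directory."""
    run_dir = Path(pipeline_dir) / ".working" / run_id
    return {
        "run_dir": run_dir,
        "manifest": run_dir / "manifest.json",
        "log": run_dir / "astro.log",
        "ingested": run_dir / "ingested",
        "processed": run_dir / "processed",
        "filtered": run_dir / "filtered",
        "snapshots": run_dir / "snapshots",
        "quarantine": run_dir / "quarantine",
    }


def quarantine_parts(pipeline_dir, run_id, step_id):
    """List the quarantine part files for one step, sorted by file name."""
    step_dir = run_paths(pipeline_dir, run_id)["quarantine"] / step_id
    return sorted(step_dir.glob("*.parquet"))
```

Sorting by file name keeps part files such as `{ingest_name}.part-00001.parquet` in numeric order because the part number is zero-padded.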

## manifest.json

`manifest.json` tracks the run status (one of `ingested`, `completed`, `quarantined`, or `failed`), the ingested file records, and per-step state in `step_states`.

Each step record includes `step_id`, `status` (`pending`, `complete`, `quarantined`, `failed`, `blocked`), and optional `detail`.
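
As an illustration, a manifest can be inspected with the standard `json` module. This sketch assumes `step_states` is a list of step records; the exact JSON shape is not specified on this page, so adjust the loop if your manifest keys the records differently. `steps_by_status` is a hypothetical helper, not an Astro function.

```python
import json
from collections import defaultdict
from pathlib import Path


def steps_by_status(manifest_path):
    """Group step IDs by their status, as read from a run's manifest.json.

    Assumption: "step_states" is a list of records, each with "step_id"
    and "status" keys. A missing "step_states" key yields an empty result.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    grouped = defaultdict(list)
    for record in manifest.get("step_states", []):
        grouped[record["status"]].append(record["step_id"])
    return dict(grouped)
```

Calling `steps_by_status(".working/{run_id}/manifest.json")` with a real run ID would then show at a glance which steps are `blocked` or `quarantined`.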

## Run IDs

Run IDs are 5-character lowercase alphanumeric strings assigned at ingest time.
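
The format is simple enough to check with a regular expression, which helps when scripts need to tell run directories apart from other folders under `.working/`. This is a sketch under one assumption: "alphanumeric" is read as ASCII `a-z` and `0-9`. The names `is_run_id` and `list_run_ids` are hypothetical.

```python
import re
from pathlib import Path

# Assumption: run IDs use only ASCII lowercase letters and digits
RUN_ID_PATTERN = re.compile(r"[a-z0-9]{5}")


def is_run_id(value):
    """Return True if value looks like a 5-character run ID."""
    return RUN_ID_PATTERN.fullmatch(value) is not None


def list_run_ids(pipeline_dir):
    """Return the sorted run IDs that have a directory under .working/."""
    working = Path(pipeline_dir) / ".working"
    if not working.is_dir():
        return []
    return sorted(
        entry.name
        for entry in working.iterdir()
        if entry.is_dir() and is_run_id(entry.name)
    )
```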

## Statistics database

`PipelineStore` persists statistics at `.astro/stats.db`:

- **`runs`** — run metadata
- **`ingest_files`** — per-file ingest records
- **`statistics`** — scoped metrics (run, file, step)

See {doc}`statistics` for the statistics API and built-in metrics.
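
Because the store is an ordinary SQLite file, it can also be inspected directly with Python's built-in `sqlite3` module. The sketch below only lists tables and counts rows, so it makes no assumptions about column names; `table_row_counts` is a hypothetical helper, not part of `PipelineStore`.

```python
import sqlite3
from pathlib import Path


def table_row_counts(pipeline_dir):
    """Return {table_name: row_count} for .astro/stats.db."""
    db_path = (Path(pipeline_dir) / ".astro" / "stats.db").resolve()
    # Open read-only so that inspection cannot modify the database
    conn = sqlite3.connect(db_path.as_uri() + "?mode=ro", uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Names come from sqlite_master; quote them as identifiers
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Opening the file in read-only mode raises `sqlite3.OperationalError` if the database does not exist yet, rather than silently creating an empty one.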

## Cleanup

Use `astro cleanup` to remove completed or failed run directories under `.working/` and delete their statistics records. Pass `--all` to also clear `.astro/stats.db` and `.persistent/`. See {doc}`cli` for `--dry-run` and confirmation options.

## Next steps

- {doc}`ingest` — what happens during ingest
- {doc}`quarantine` — snapshot and quarantine paths