# whychose-extractor
A ~500-line Node CLI that turns a ChatGPT or Claude export into a structured decision log. It is the same engine the hosted product uses. Read the source, run it locally, and keep your data on your laptop.
## Install & run — 60 seconds

```sh
curl -sL https://whychose.com/extractor/whychose-extractor-v1.0.0.tar.gz | tar -xz
cd whychose-extractor
node bin/extractor.js sample-chatgpt.json
```
No npm install. No build step. No native bindings. Node 18+ is the only requirement — if you have a modern laptop you're set.
Run it against your own export:
```sh
# ChatGPT: Settings → Data Controls → Export → unzip → conversations.json
node bin/extractor.js ~/Downloads/conversations.json > decisions.json

# Claude: Settings → Account → Export data → unzip → conversations.json
node bin/extractor.js ~/Downloads/claude-export/conversations.json --format=md > decisions.md
```
## Browse the source
Every file is plain text, served straight by the web server. Click to read — no login, no account.
## Flags

| Flag | Values | Default | What it does |
|---|---|---|---|
| `--sensitivity` | `normal`, `high` | `normal` | `high` also includes `confidence: low` records. More recall, more false positives. |
| `--format` | `json`, `jsonl`, `md` | `json` | Output shape. `jsonl` = one record per line. `md` = human-browsable markdown. |
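The `--format=jsonl` output is easy to post-process: each line is one complete JSON record. A minimal sketch of consuming it — the two sample lines and the second record's values are made up for illustration; only the field names come from the example record in this README:

```javascript
// Parse jsonl text (one DecisionRecord per line) into an array of objects.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Hypothetical sample data standing in for the file the CLI wrote.
const sample = [
  '{"id":"claude-20260203-d7cf9e98","chosen":"kubernetes","confidence":"medium"}',
  '{"id":"chatgpt-20260110-0a1b2c3d","chosen":"postgres","confidence":"low"}',
].join('\n');

const decisions = parseJsonl(sample);
// With --sensitivity=normal, confidence: low records don't appear at all;
// this filter mimics that after the fact.
const solid = decisions.filter((d) => d.confidence !== 'low');
console.log(solid.map((d) => d.chosen)); // [ 'kubernetes' ]
```

In practice you would read the text with `fs.readFileSync('decisions.jsonl', 'utf8')` instead of the inline sample.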
## Output — the DecisionRecord shape
Every record follows schema.json. One real example, extracted from the bundled Claude sample:
```json
{
  "id": "claude-20260203-d7cf9e98",
  "date": "2026-02-03",
  "source": "claude",
  "chat_title": "Kubernetes or Fly.io for the next deploy",
  "question": "kubernetes vs fly.io",
  "chosen": "kubernetes",
  "rejected": ["fly.io"],
  "trade_offs": [],
  "confidence": "medium",
  "snippet": "assistant: If the rest of infra is on k8s and CI is wired, ...\nuser: Agreed. Sticking with kubernetes. The scale-to-zero thing was cool but...",
  "tags": ["infra"]
}
```
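If you consume these records downstream, a cheap shape check catches malformed input early. This sketch derives its field list from the example above — `schema.json` in the tarball is the authoritative contract, and the `"high"` confidence value plus the `"chatgpt"` source value are assumptions not shown in the sample record:

```javascript
// Minimal sanity check that an object looks like a DecisionRecord.
// Field list taken from the example record; schema.json is authoritative.
function looksLikeDecisionRecord(r) {
  return (
    typeof r.id === 'string' &&
    /^\d{4}-\d{2}-\d{2}$/.test(r.date) &&
    ['chatgpt', 'claude'].includes(r.source) && // assumed source values
    typeof r.question === 'string' &&
    typeof r.chosen === 'string' &&
    Array.isArray(r.rejected) &&
    Array.isArray(r.trade_offs) &&
    ['low', 'medium', 'high'].includes(r.confidence) && // "high" is assumed
    Array.isArray(r.tags)
  );
}

const record = {
  id: 'claude-20260203-d7cf9e98',
  date: '2026-02-03',
  source: 'claude',
  question: 'kubernetes vs fly.io',
  chosen: 'kubernetes',
  rejected: ['fly.io'],
  trade_offs: [],
  confidence: 'medium',
  tags: ['infra'],
};

console.log(looksLikeDecisionRecord(record)); // true
```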
## Privacy — this CLI is local-only
The extractor makes zero network requests. You can verify this yourself:
```sh
grep -nE 'require\(|fetch|http\.|https\.' bin/extractor.js
```
Only `fs`, `path`, and `crypto` show up — all stdlib. No telemetry. No phone-home. The ChatGPT or Claude export you feed in never leaves your laptop.
If you then upload the extracted decision records (not the transcript) to the hosted product at whychose.com, that's 5–50 short strings per quarterly export — not the raw chat. See whychose.com/privacy for the full version.
## Known misses (v1)
Documented openly so you know what to expect:
- Long multi-turn decisions (20+ messages between the question and the commit). The commit-search window is 6 messages.
- Implicit decisions — "ok let me scaffold this" with no commit phrase.
- Non-English transcripts.
- Decisions phrased as statements, not questions — "I think we should use Postgres" followed by "yeah ok".
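To make the implicit-decision miss concrete, here is a toy commit-phrase matcher. These patterns are purely illustrative — they are not the extractor's actual pattern library:

```javascript
// Hypothetical commit phrases, for illustration only.
const COMMIT_PHRASES = [
  /\bsticking with\b/i,
  /\bgoing with\b/i,
  /\blet'?s use\b/i,
  /\bdecided on\b/i,
];

const hasCommitPhrase = (msg) => COMMIT_PHRASES.some((re) => re.test(msg));

console.log(hasCommitPhrase('Agreed. Sticking with kubernetes.')); // true
// An implicit decision slips through, as described above:
console.log(hasCommitPhrase('ok let me scaffold this')); // false
```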
If your export has a common case we're missing, send us a redacted snippet via @bitinvestigator on X. The pattern library gets tightened every time a real miss shows up.
## License
MIT. See LICENSE.
## What about the hosted product?
This CLI extracts decisions to stdout. The hosted product at whychose.com wraps the same engine with:
- A browser-side upload UI — no CLI, no tarball
- A searchable, filterable decision log
- Shared decision logs for teammates via a private link
- Export to Notion, Linear, or Obsidian
The CLI is strictly a subset of the hosted product. If you want a UI and multi-device sync, see pricing. If you want to keep everything on your laptop — you're already done.