How to Extract Decisions from Your ChatGPT Chats
You've had several hundred ChatGPT conversations. The architectural decisions you actually made are in there somewhere — behind the scratch thinking, the clarification questions, and the dead-end tangents. Here's how to pull them out.
TL;DR
Export your chats from chatgpt.com → Settings → Data Controls → Export data. Open the resulting `conversations.json` and run an extractor over it — either a regex pass looking for phrases like "we'll go with X over Y because…", or a small LLM pass that emits records matching a fixed JSON Schema. Typical result from a year of use: 20–80 durable decision records, each with the original chat snippet attached. The open-source CLI at whychose.com/extractor does this locally in ~500 lines of dependency-free Node. The hosted product wraps the same engine with team sharing and Notion/Linear export.
Why this matters
A senior engineer might use ChatGPT 15–30 times a week to think through trade-offs: stack choices, pricing, hiring, architecture migrations. Six months later a new teammate asks "why did we pick Postgres over Mongo?" — and the reasoning is gone. It lives somewhere in ~800 chats, across ~30,000 messages. Ctrl+F doesn't work because you don't remember the exact phrasing. ChatGPT's native search retrieves conversations, not the structured reasoning inside them. The value of a decision record is not the thinking itself — you already did that — it's having that thinking findable months later when someone asks.
How to approach it
- Export the archive. Settings → Data Controls → Export data. OpenAI emails a ZIP link within 30 minutes. The interesting file inside is `conversations.json` — an array of objects with a `mapping` field holding the message tree.
- Flatten conversations to linear text. Each conversation's `mapping` is a DAG (because of edits/branches). Walk from the root down each branch, keep only user + assistant messages, join them as plain text. This step is the one most hand-rolled scripts get wrong — you end up with duplicate messages from edit forks.
- Run the extractor. Two approaches work. (a) Regex-first: scan each flattened conversation for phrases like `"we'll go with"`, `"let's pick"`, `"the answer is"`, `"decided to"`, and `"vs"` within ~40 tokens of each other. Noisy but zero-cost. (b) LLM pass: feed each conversation into Claude or GPT-4o-mini with a JSON Schema asking for `{title, chose, rejected, rationale}[]`. More accurate, costs ~$0.001/conversation.
- Dedupe and tag. Most decisions get revisited 2–4 times over weeks. Group by normalized-topic string and keep only the last-reached conclusion. Add tags for downstream filtering: `stack`, `pricing`, `hiring`, `migration`.
- Link back to source. For every record, keep the conversation ID + message index so reviewers can click through to the original chat. Without the backlink, the extract is a nice summary; with it, it's a real audit trail.
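The flatten step is where most hand-rolled scripts go wrong, so here is a minimal sketch in Node. It assumes the node shape recent exports use (`parent`, `children`, `message.author.role`, `message.content.parts`); field names have drifted across export versions, and multimodal parts can be objects rather than strings, so treat this as a starting point, not a spec:

```javascript
// Flatten one exported ChatGPT conversation to linear text.
// Assumed node shape: { parent, children, message } keyed by id in `mapping`.
function flattenConversation(conv) {
  const nodes = conv.mapping;
  // The root is the single node with no parent.
  let id = Object.keys(nodes).find((k) => nodes[k].parent == null);
  const lines = [];
  while (id) {
    const node = nodes[id];
    const msg = node.message;
    if (msg && ["user", "assistant"].includes(msg.author?.role)) {
      const text = (msg.content?.parts || []).join("");
      if (text.trim()) lines.push(`${msg.author.role}: ${text}`);
    }
    // Edit forks create multiple children; following the LAST child
    // tracks the most recent branch and avoids duplicated messages.
    id = node.children && node.children.length
      ? node.children[node.children.length - 1]
      : null;
  }
  return lines.join("\n");
}
```

Taking the last child at each fork is what sidesteps the duplicated-message problem: earlier siblings are abandoned edit drafts.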
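The regex-first pass (approach (a) above) can be equally small. The phrase lists below are illustrative, and a fixed character window stands in for the ~40-token window — both are tuning knobs:

```javascript
// Flag spans where a decision phrase and a contrast marker appear close together.
const DECISION = /\b(we'll go with|let's pick|the answer is|decided to)\b/i;
const CONTRAST = /\b(vs\.?|over|instead of|rather than)\b/i;

function findDecisionSnippets(text, windowChars = 200) {
  const hits = [];
  const re = new RegExp(DECISION.source, "gi");
  let m;
  while ((m = re.exec(text)) !== null) {
    // Look ahead a fixed window for a contrast marker ("over", "vs", ...).
    const window = text.slice(m.index, m.index + windowChars);
    if (CONTRAST.test(window)) hits.push(window.trim());
  }
  return hits;
}
```

Expect false positives ("decided to go over the notes") — the point is a cheap candidate list you can skim or feed into the LLM pass.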
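And a sketch of the dedupe step, assuming each record carries a `topic` string and a sortable `timestamp` (both hypothetical field names — use whatever your extractor emits):

```javascript
// Keep only the last-reached conclusion per normalized topic.
function dedupeDecisions(records) {
  const latest = new Map();
  for (const r of records) {
    // Normalization here is just lowercase + collapsed whitespace.
    const key = r.topic.toLowerCase().replace(/\s+/g, " ").trim();
    const prev = latest.get(key);
    if (!prev || r.timestamp > prev.timestamp) latest.set(key, r);
  }
  return [...latest.values()];
}
```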
How WhyChose helps
WhyChose is the productized version of the above. You drop your `conversations.json` into the browser, extraction runs client-side (we never see your transcripts), and you get back a searchable decision log with the original chat snippets attached. Pro tier exports to Notion, Linear, and Obsidian; Team tier adds shared decision logs with per-teammate access. If you'd rather self-host, the extractor is MIT-licensed — download the tarball, run `node bin/extractor.js conversations.json`, keep your data off our servers entirely. The hosted and open-source paths run the same extraction engine; the difference is UI, storage, and team features.
Related questions
Does this work with Claude exports too?
Yes. Claude's export ships `conversations.json` with a flatter shape (no `mapping` DAG — messages live under `chat_messages[]` per conversation). Our extractor handles both formats and emits the same output schema. Claude export walkthrough →
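If you're hand-rolling instead, normalizing Claude's flatter shape is short. This sketch assumes `chat_messages[]` entries carry `sender` and `text` fields, which matches the exports we've seen but isn't guaranteed across versions:

```javascript
// Normalize a Claude-export conversation to a flat [{ role, text }] list,
// matching the roles the ChatGPT-side flattener emits.
function normalizeClaude(conv) {
  return (conv.chat_messages || []).map((m) => ({
    role: m.sender === "human" ? "user" : "assistant",
    text: m.text,
  }));
}
```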
What about Perplexity or Gemini?
Perplexity has no user-facing export as of April 2026. Gemini's Takeout export is partial and inconsistent. For now, WhyChose focuses on ChatGPT + Claude, where the exports are first-class. We track Perplexity's changelog and will add support when they ship a stable format.
Do you see my chat content?
In the hosted product, extraction runs in your browser — the raw transcript never crosses our network. We persist only the extracted records (short strings: title, chose, rejected, rationale, timestamp) and a hash of the source message for de-dup. In the open-source CLI, nothing leaves your machine.
Why not just let ChatGPT summarize it?
Asking ChatGPT "summarize my decisions" in a fresh chat doesn't work — it can only see the current context, not your full history. The export-and-extract pattern works because you control the corpus and run the same prompt over every conversation, so the output is uniform across chats and the whole run is repeatable.
Further reading
- How to export your ChatGPT history (2026 guide) — the step-by-step on the OpenAI side.
- How to export your Claude conversations — the Anthropic equivalent, with the JSON shape difference explained.
- ADR example: Postgres vs MongoDB — what a single extracted record looks like, fully written out.
- The open-source extractor — ~500 lines of Node, zero deps, MIT licensed.