How to Extract Decisions from Your ChatGPT Chats

You've had several hundred ChatGPT conversations. The architectural decisions you actually made are in there somewhere — behind the scratch thinking, the clarification questions, and the dead-end tangents. Here's how to pull them out.

TL;DR

Export your chats from chatgpt.com → Settings → Data Controls → Export data. Open the resulting conversations.json and run an extractor over it — either a regex pass looking for phrases like "we'll go with X over Y because…", or a small LLM pass that emits a JSON Schema. Typical result from a year of use: 20–80 durable decision records, each with the original chat snippet attached. The open-source CLI at whychose.com/extractor does this locally in ~500 lines of dependency-free Node. The hosted product wraps the same engine with team sharing and Notion/Linear export.

Why this matters

Many senior engineers use ChatGPT 15–30 times a week to think through trade-offs: stack choices, pricing, hiring, architecture migrations. Six months later a new teammate asks "why did we pick Postgres over Mongo?" and the reasoning is gone. It lives somewhere in ~800 chats, across ~30,000 messages. Cmd+F doesn't work because you don't remember the exact phrasing, and ChatGPT's native search retrieves conversations, not the structured reasoning inside them. The value of a decision record is not the thinking itself (you already did that); it's having the conclusion findable months later, when someone asks and you need an answer with a source.

How to approach it

  1. Export the archive. Settings → Data Controls → Export data. OpenAI emails a ZIP link within 30 minutes. The interesting file inside is conversations.json — an array of objects with a mapping field holding the message tree.
  2. Flatten conversations to linear text. Each conversation's mapping is a tree, not a flat list, because edits and regenerations create branches. Walk from the root down the branch you want, keep only user and assistant messages, and join them as plain text. This is the step most hand-rolled scripts get wrong: naive traversal emits duplicate messages from edit forks.
  3. Run the extractor. Two approaches work. (a) Regex-first: scan each flattened conversation for phrases like "we'll go with", "let's pick", "the answer is", "decided to", "vs" within ~40 tokens of each other. Noisy but zero-cost. (b) LLM pass: feed each conversation into Claude or GPT-4o-mini with a JSON Schema asking for {title, chose, rejected, rationale}[]. More accurate, costs ~$0.001/conversation.
  4. Dedupe and tag. Most decisions get revisited 2–4 times over weeks. Group records by a normalized topic string and keep only the last conclusion reached. Add tags for downstream filtering: stack, pricing, hiring, migration.
  5. Link back to source. For every record, keep the conversation ID + message index so reviewers can click through to the original chat. Without the backlink, the extract is a nice summary; with it, it's a real audit trail.
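Step 2 above can be sketched in a few lines of Node. The field names (mapping, parent, children, author.role, content.parts) match the ChatGPT export format described in step 1; the "follow the last child at each fork" heuristic is one reasonable way to keep only the final edit of a branched thread, not necessarily what the whychose.com extractor does.

```javascript
// Flatten one exported ChatGPT conversation's `mapping` tree into linear text.
function flattenConversation(conversation) {
  const mapping = conversation.mapping;
  // The root is the node with no parent.
  let nodeId = Object.keys(mapping).find((id) => !mapping[id].parent);
  const lines = [];
  while (nodeId) {
    const node = mapping[nodeId];
    const msg = node.message;
    // Keep only user + assistant turns; skip system/tool nodes and null messages.
    if (msg && ["user", "assistant"].includes(msg.author?.role)) {
      const parts = (msg.content?.parts || []).filter((p) => typeof p === "string");
      if (parts.length) lines.push(`${msg.author.role}: ${parts.join("\n")}`);
    }
    // At an edit fork, follow only the most recent branch. This is what
    // prevents the duplicate messages that hand-rolled scripts emit.
    nodeId = node.children?.length ? node.children[node.children.length - 1] : null;
  }
  return lines.join("\n\n");
}
```

Applied to a whole export, you would map this over every element of the conversations.json array before running extraction.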
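The regex-first pass in step 3(a) can be sketched like this. The cue phrases come from the step above; the capture logic and the fixed-size context window are illustrative choices (a character window standing in for the ~40-token one), not the shipped implementation.

```javascript
// Zero-cost regex pass: scan flattened conversation text for decision cues.
const DECISION_CUES = /(we'll go with|let's pick|the answer is|decided to)\s+([^.\n]+)/gi;

function extractDecisions(text, conversationId) {
  const records = [];
  let m;
  while ((m = DECISION_CUES.exec(text)) !== null) {
    // Keep surrounding context as the rationale snippet (step 5 backlink
    // still needs the conversation ID so reviewers can click through).
    const start = Math.max(0, m.index - 200);
    records.push({
      conversationId,
      cue: m[1].toLowerCase(),
      chose: m[2].trim(),
      snippet: text.slice(start, m.index + m[0].length + 200).trim(),
    });
  }
  return records;
}
```

Expect noise: this matches "decided to take a break" as readily as "decided to use Postgres", which is why the LLM pass in step 3(b) is more accurate.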

How WhyChose helps

WhyChose is the productized version of the above. You drop your conversations.json into the browser, extraction runs client-side (we never see your transcripts), and you get back a searchable decision log with the original chat snippets attached. Pro tier exports to Notion, Linear, and Obsidian; Team tier adds shared decision logs with per-teammate access. If you'd rather self-host, the extractor is MIT-licensed — download the tarball, run node bin/extractor.js conversations.json, keep your data off our servers entirely. The hosted and open-source paths run the same extraction engine; the difference is UI, storage, and team features.

Related questions

Does this work with Claude exports too?

Yes. Claude's export ships conversations.json with a flatter shape (no mapping DAG — messages live under chat_messages[] per conversation). Our extractor handles both formats and emits the same output schema. Claude export walkthrough →
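Normalizing the two shapes to a common turn list is straightforward. The Claude field names used here (chat_messages, sender, text) follow the flatter shape described above but should be treated as assumptions; verify them against your own export file.

```javascript
// Tell the two export shapes apart by their distinguishing top-level field.
function detectFormat(conversation) {
  if (conversation.mapping) return "chatgpt";      // message tree
  if (conversation.chat_messages) return "claude"; // flat message list
  return "unknown";
}

// Map Claude's flat message list to the same {role, text} turns a
// ChatGPT mapping-tree walk would produce.
function claudeTurns(conversation) {
  return (conversation.chat_messages || []).map((m) => ({
    role: m.sender === "human" ? "user" : "assistant",
    text: m.text || "",
  }));
}
```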

What about Perplexity or Gemini?

Perplexity has no user-facing export as of April 2026. Gemini's Takeout export is partial and inconsistent. For now, WhyChose focuses on ChatGPT + Claude, where the exports are first-class. We track Perplexity's changelog and will add support when they ship a stable format.

Do you see my chat content?

In the hosted product, extraction runs in your browser — the raw transcript never crosses our network. We persist only the extracted records (short strings: title, chose, rejected, rationale, timestamp) and a hash of the source message for de-dup. In the open-source CLI, nothing leaves your machine.

Why not just let ChatGPT summarize it?

Asking ChatGPT "summarize my decisions" in a fresh chat doesn't work: it sees only the current context, not your full history. The export-and-extract pattern works because you control the corpus and apply the same prompt to every conversation, which keeps the output consistent and the run reproducible.
