Topic: extract decisions from chatgpt

How to Extract Decisions from Your ChatGPT Chats

Q: Does this work with Claude exports too?

Yes. Claude's export also ships a conversations.json, but with a flatter shape: there is no mapping tree, and each conversation holds its messages in a chat_messages[] array. Our extractor handles both formats and emits the same output schema. Claude export walkthrough →
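
As a rough illustration of how the flatter Claude shape can be normalized into linear turns, here is a minimal Python sketch. Only chat_messages comes from the description above; the uuid, sender, and text field names are assumptions based on commonly seen exports, and flatten_claude is a hypothetical helper, not the extractor's actual code.

```python
def flatten_claude(conversation: dict) -> list[dict]:
    """Turn one Claude conversation into an ordered list of user/assistant turns."""
    turns = []
    for index, msg in enumerate(conversation.get("chat_messages", [])):
        # Assumed field name: "text" holds the message body.
        text = (msg.get("text") or "").strip()
        if not text:
            continue  # skip turns with no text body
        # Assumed field name: "sender" is "human" or "assistant".
        role = "user" if msg.get("sender") == "human" else "assistant"
        turns.append({
            "conversation_id": conversation.get("uuid"),  # assumed field name
            "message_index": index,  # position in chat_messages, kept as a backlink
            "role": role,
            "text": text,
        })
    return turns
```

Because the messages are already a flat list, no branch handling is needed here; the output uses the same role names as a flattened ChatGPT conversation so later passes can treat both sources alike.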

You've had several hundred ChatGPT conversations. The architectural decisions you actually made are in there somewhere — behind the scratch thinking, the clarification questions, and the dead-end tangents. Here's how to pull them out.

TL;DR

Export your chats from chatgpt.com → Settings → Data Controls → Export data. Open the resulting conversations.json and run an extractor over it: either a regex pass looking for phrases like "we'll go with X over Y because…", or a small LLM pass that returns JSON conforming to a JSON Schema you supply. Typical result from a year of use: 20–80 durable decision records, each with the original chat snippet attached. The open-source CLI at whychose.com/extractor does this locally in ~500 lines of dependency-free Node. The hosted product wraps the same engine with team sharing and Notion/Linear export.
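
To make the regex pass concrete, here is a minimal Python sketch that matches the single phrase shape quoted above and attaches the matched snippet to each record. The pattern and the find_decisions name are illustrative assumptions, not the CLI's actual rules; a real pass would need many more phrasings and would still be noisy.

```python
import re

# Matches "we'll go with X over Y because Z" (also "we will" / "let's").
# The rationale is cut at the first period or newline.
DECISION_RE = re.compile(
    r"(?:we['’]ll|we will|let['’]s)\s+go with\s+(?P<chose>.+?)"
    r"\s+over\s+(?P<rejected>.+?)"
    r"\s+because\s+(?P<rationale>[^.\n]+)",
    re.IGNORECASE,
)

def find_decisions(text: str) -> list[dict]:
    """Return one record per matched decision phrase, with the source snippet."""
    records = []
    for match in DECISION_RE.finditer(text):
        records.append({
            "chose": match.group("chose").strip(),
            "rejected": match.group("rejected").strip(),
            "rationale": match.group("rationale").strip(),
            "snippet": match.group(0),  # original text, kept for review
        })
    return records
```

Calling find_decisions on a flattened conversation returns an empty list when nothing matches, so it can be run over every conversation in the export and the results concatenated.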

Why this matters

Many senior engineers use ChatGPT 15–30 times a week to think through trade-offs: stack choices, pricing, hiring, architecture migrations. Six months later a new teammate asks "why did we pick Postgres over Mongo?" and the reasoning is gone. It lives somewhere in ~800 chats, across ~30,000 messages. Cmd+F doesn't work because you don't remember the exact phrasing. ChatGPT's native search retrieves conversations, not the structured reasoning inside them. The value of a decision record is not the thinking itself, which you already did; it's having that thinking findable in September when someone on the interview panel asks.

How to approach it

Export the archive. Settings → Data Controls → Export data. OpenAI emails a ZIP link within 30 minutes. The interesting file inside is conversations.json — an array of objects with a mapping field holding the message tree.
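Flatten conversations to linear text. Each conversation's mapping is a tree rather than a list, because edits and regenerations create branches. Start at the conversation's current_node, follow the parent pointers back to the root, reverse that path, keep only user and assistant messages, and join them as plain text. This is the step most hand-rolled scripts get wrong: walking every branch down from the root repeats the messages that edit forks share, so you end up with duplicates.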
Run the extractor. Two approaches work. (a) Regex-first: scan each flattened conversation for cue phrases like "we'll go with", "let's pick", "the answer is", "decided to", and "vs", and flag passages where several cues fall within ~40 tokens of each other. Noisy but zero-cost. (b) LLM pass: feed each conversation to Claude or GPT-4o-mini along with a JSON Schema describing an array of {title, chose, rejected, rationale} records. More accurate, costs ~$0.001/conversation.
Dedupe and tag. Most decisions get revisited 2–4 times over weeks. Group by normalized-topic string and keep only the last-reached conclusion. Add tags for downstream filtering: stack, pricing, hiring, migration.
Link back to source. For every record, keep the conversation ID + message index so reviewers can click through to the original chat. Without the backlink, the extract is a nice summary; with it, it's a real audit trail.
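
The flatten, dedupe, and backlink steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the extractor's implementation: current_node, parent, author.role, and content.parts are field names commonly seen in ChatGPT exports, while decided_at, flatten_chatgpt, normalize_topic, and keep_latest are hypothetical names introduced here. Following the parent pointers from the active leaf is one way to avoid the duplicate messages that edit forks otherwise produce.

```python
import re

def flatten_chatgpt(conversation: dict) -> list[dict]:
    """Return the active branch of one conversation as ordered user/assistant turns."""
    mapping = conversation.get("mapping") or {}
    node_id = conversation.get("current_node")  # assumed: last node of the active branch
    path, seen = [], set()
    while node_id in mapping and node_id not in seen:  # 'seen' guards against cycles
        seen.add(node_id)
        path.append(mapping[node_id])
        node_id = mapping[node_id].get("parent")
    path.reverse()  # root first, leaf last

    conv_id = conversation.get("conversation_id") or conversation.get("id")
    turns = []
    for node in path:
        msg = node.get("message")
        if not msg:
            continue  # the root node usually carries no message
        role = (msg.get("author") or {}).get("role")
        if role not in ("user", "assistant"):
            continue  # drop system and tool messages
        parts = (msg.get("content") or {}).get("parts") or []
        text = "\n".join(p for p in parts if isinstance(p, str)).strip()
        if not text:
            continue
        turns.append({
            "conversation_id": conv_id,   # backlink: which chat
            "message_index": len(turns),  # backlink: position in the flattened chat
            "role": role,
            "text": text,
        })
    return turns

def normalize_topic(title: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace to build a grouping key."""
    return " ".join(re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split())

def keep_latest(records: list[dict]) -> list[dict]:
    """Keep only the most recent record per normalized topic.

    Each record is assumed to carry 'title' and a sortable 'decided_at' timestamp.
    """
    latest = {}
    for rec in sorted(records, key=lambda r: r["decided_at"]):
        latest[normalize_topic(rec["title"])] = rec  # later records overwrite earlier ones
    return list(latest.values())
```

Grouping on an exact normalized string is deliberately crude: "Postgres vs. Mongo" and "postgres vs mongo" merge, but "database choice" would not, so a real pass may want fuzzier topic matching.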

How WhyChose helps

WhyChose is the productized version of the above. You drop your conversations.json into the browser, extraction runs client-side (we never see your transcripts), and you get back a searchable decision log with the original chat snippets attached. Pro tier exports to Notion, Linear, and Obsidian; Team tier adds shared decision logs with per-teammate access. If you'd rather self-host, the extractor is MIT-licensed — download the tarball, run node bin/extractor.js conversations.json, keep your data off our servers entirely. The hosted and open-source paths run the same extraction engine; the difference is UI, storage, and team features.

Get early access

Further reading

How to export your ChatGPT history (2026 guide) — the step-by-step on the OpenAI side.
How to export your Claude conversations — the Anthropic equivalent, with the JSON shape difference explained.
conversations.json field reference — the schema-level companion to this page; the leaf-walk JS lives here.
How to search your ChatGPT history — what to do before jumping straight to extraction; covers the lighter sidebar / ripgrep / jq levels.
ADR example: Postgres vs MongoDB — what a single extracted record looks like, fully written out.
How to extract decisions from your Claude conversations — the symmetric Anthropic-side guide; flat chat_messages shape, Artifact-as-decision framing, project_uuid grouping.
Gemini conversation export — the third-platform precondition; Gemini's HTML-only export normalized via the parsing script lets the same regex + LLM passes run on Google's chat history.
ChatGPT Projects export — the Project-scoped variant; preserve project_id as a first-class column in the decision log so per-Project decision queries are one filter, not a manual group-by; the custom instructions from each Project's README scope decision-relevance scoring to the Project's intent.
ChatGPT Team export — differences from Plus, workspace admin flow, and the Compliance API — workspace exports add created_by_user_id and participants[] as filterable columns alongside project_id; per-Project decision audits become one filter, per-member decision-history reviews become two filters, and the audit log integrates as a second source for "who had access to the rationale when this decision was made."
The open-source extractor — ~500 lines of Node, zero deps, MIT licensed.
ChatGPT Custom GPTs export — conversations vs configurations — Custom GPT conversations are filtered by gizmo_id in the extractor output; if you built a Custom GPT specifically for architecture or decision-making work, this companion page explains how to use the GPT Builder export to back up the configuration and how gizmo_id becomes a per-GPT decision journal filter.
Perplexity conversation export — how to save your AI research history — Perplexity research threads (often the research step before a ChatGPT decision conversation) have no batch export path; the extractor processes Perplexity content as plain-text paste input. If your decision workflow uses Perplexity for research and ChatGPT for synthesis, this page covers how the two layers fit together in a complete decision record.
Gemini Workspace export — Google Vault, admin console, and enterprise data portability — for decision workflows that include Gemini for Workspace: the Vault MBOX export path, the Python MBOX parsing script that extracts conversation text, and how the plain-text output integrates with the extractor as a paste input. Teams that use Gemini for research and ChatGPT for synthesis have Gemini history available in MBOX form — parse it, paste the conversation text, and the extractor identifies the decision-shaped content.
ChatGPT web search in conversations.json — tether content types, cited URLs, and extraction recipes — the extraction companion for conversations where ChatGPT cited web sources: the tether_browsing_display and tether_quote nodes contain the research evidence (query, cited URLs, quoted passages) behind the decisions the extractor surfaces. Understanding the tether node schema improves decision extraction quality — a record that captures "chose Valkey because of the Redis licensing change" is more defensible when the tether_quote node confirming the RSAL license is extractable alongside the decision text.
ChatGPT voice mode in the data export — transcripts, what's missing, and how to process them — voice conversations export as plain text in conversations.json; audio is never stored; how to identify voice turns, understand Whisper transcription quality, and extract decisions from voice-heavy sessions.
GitHub Copilot Chat export — why there isn't one, and where your history actually lives — for engineers who make architecture decisions in their IDE: Copilot Chat conversations are stored locally in VS Code's workspace storage, not exported by ChatGPT's data archive (different product). If the decision happened in Copilot Chat rather than a ChatGPT or Claude conversation, deliberate at-the-time capture is the only path — the extractor on this page only processes ChatGPT and Claude exports, not Copilot Chat's local storage.
ChatGPT Canvas export — documents in conversations.json and decision-drafting workflow — Canvas (the collaborative document editing mode) is a high-value surface for the extractor because Canvas sessions often contain the structured decision language the extractor targets: trade-off comparisons, Alternatives Considered sections, explicit Consequences enumerations. Canvas documents appear in conversations.json as long assistant messages in the Canvas conversation thread — the extractor processes them alongside regular chat conversations. This companion page explains the Canvas message structure, the jq recipes for identifying Canvas-heavy conversations before running the extractor, and why Canvas-drafted ADRs typically pass decision extraction on the first pass without the noise-filtering that informal chat conversations require.
