How to Extract Decisions from Your Claude Conversations

You spent the last year sparring with Claude on architecture, pricing, and stack choices. Most of those decisions are still in force; almost none of them made it into the repo. Here's how to pull them back out — flat chat_messages array, <antartifact> wrappers, project_uuid grouping, and the working code for each step.

TL;DR

Export your conversations from claude.ai → Settings → Privacy → Export data. Anthropic emails a ZIP within 10–30 minutes. The interesting file is conversations.json — an array of conversation objects, each with a flat chat_messages[] array (no DAG to walk, unlike ChatGPT). Run an extractor over it: a regex pass for the obvious phrases, an Artifact pass that treats every <antartifact> as a decision crystallization, then an optional LLM pass for precision. Typical yield from a year of heavy usage: 25–90 durable decision records, plus the Artifacts attached. The open-source CLI does all three passes locally in ~500 lines of Node. The hosted product wraps the same engine with team sharing and Notion / Linear export.

Why Claude is a richer extraction target than ChatGPT

Senior engineers at 5–50-person SaaS companies tend to use both ChatGPT and Claude, but the conversations look different. ChatGPT gets the quick "translate this regex" and "what's the syntax for X" exchanges. Claude gets the longer architecture deep-dives — the back-and-forth where you weigh Postgres vs MongoDB across four turns, or argue with yourself about whether to run a monolith or split into services. The signal-to-noise ratio for durable decisions is meaningfully higher on the Claude side, especially for engineers and CTOs who treat Claude as a thinking partner for hard calls rather than a faster Google.

Claude also has three structural features that make extraction sharper:

  1. Flat conversation shape. No mapping DAG to walk, no edit branches to deduplicate. chat_messages[] is already linear.
  2. Artifacts as first-class durable content. Schemas, code, plans, designs — anything Claude produces as an Artifact gets wrapped in <antartifact> tags with explicit attributes (identifier, type, language, title). When Claude produces an Artifact, that's almost always the moment a decision crystallized.
  3. Project boundaries preserved. Conversations that happened inside a Claude Project carry project_uuid, so per-initiative filtering is one column — not a manual group-by on conversation titles.
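
For orientation, here's the rough shape of one conversations.json entry. All values are invented and the listing is abridged to the fields this guide uses; real exports carry more fields than shown:

    [
      {
        "uuid": "b3f9c2...",
        "name": "Postgres vs MongoDB for the event log",
        "project_uuid": "71ad58...",
        "created_at": "2025-03-04T18:22:10Z",
        "chat_messages": [
          { "sender": "human", "created_at": "2025-03-04T18:22:10Z",
            "text": "We need to pick a store for the event log..." },
          { "sender": "assistant", "created_at": "2025-03-04T18:22:41Z",
            "text": "Three options worth weighing..." }
        ]
      }
    ]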

The five steps

  1. Export the archive. Settings → Privacy → Export data. Anthropic emails a ZIP link within 10–30 minutes. The file you care about is conversations.json; the ZIP also includes users.json (your account profile), projects.json (project definitions), and a files/ directory with attachments.
  2. Flatten conversations to linear text. Each conversation has chat_messages[], already in order. One jq command renders the whole export as Markdown, one section per conversation:
    jq -r '.[] | "# " + (.name // "untitled") + "\n\n" +
      ([.chat_messages[] |
        "## " + .sender + " — " + .created_at + "\n\n" + (.text // "")
      ] | join("\n\n---\n\n"))' \
      conversations.json > conversations-flat.md
    No DAG, no parent pointers, no current_node walks; the equivalent ChatGPT step needs roughly 30 lines of traversal code to produce the same linear text.
  3. Pull every Artifact out as a separate file. Artifacts are wrapped in <antartifact identifier="..." type="..." title="..." language="...">...</antartifact> inside chat_messages[].text. Treat each unique identifier as a candidate decision record. The full extraction recipe lives at the Artifacts export reference; the short version is a jq + regex pipe that writes artifacts/<identifier>.<ext> with the right extension per type (a minimal Node sketch of this pass follows the list). For most users this single pass recovers 40–70% of decisions on its own — every schema, plan, and design doc you asked Claude to produce ends up captured.
  4. Run the regex pass over the prose. The remaining decisions live in plain text — the back-and-forth where the trade-offs got worked out before any Artifact was produced. Scan each flattened conversation for the small set of phrases that almost always indicate a chosen option:
    const PATTERNS = [
      // "we'll go with X" / "let's use X"
      /(?:we'?ll|let'?s|i'?ll)\s+(?:go\s+with|pick|choose|use)\s+([A-Z][\w-]+)/gi,
      // "decided on X" / "went with X"
      /(?:decided|chose|picked|went)\s+(?:to|with|on)\s+([A-Z][\w-]+)/gi,
      // "X over Y" / "X vs Y" (case-sensitive so the capture favors proper nouns)
      /([A-Z][\w-]+)\s+(?:over|vs|instead\s+of)\s+([A-Z][\w-]+)/g,
      // "the answer is X" / "conclusion is X"
      /(?:the\s+answer|the\s+choice|conclusion)\s+is\s+([A-Z][\w-]+)/gi,
    ];
    These four patterns cover roughly 70% of decisions in a typical Claude conversation. They over-fire on tutorials and exploratory chats; you'll filter those out in step 5.
  5. (Optional) run an LLM pass for precision. Feed each candidate conversation into a JSON-schema-constrained Claude or GPT-4o-mini call:
    {
      "type": "object",
      "properties": {
        "decisions": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title": { "type": "string" },
              "chose": { "type": "string" },
              "rejected": { "type": "array", "items": { "type": "string" } },
              "rationale": { "type": "string" },
              "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
            },
            "required": ["title", "chose", "rationale"]
          }
        }
      },
      "required": ["decisions"]
    }
    At Haiku 4.5 prices, a year of typical chats costs roughly $0.30 to process (a sketch of the call appears after these steps). Skip this step if regex + Artifacts already gave you what you need — most users will have 25+ records before they get here.
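
Here's the minimal Node sketch promised in steps 3 and 4: one script that runs the Artifact pass and the regex pass over the raw export. The tag name and message fields follow the format described above; the extension map, file layout, and candidates.json output are illustrative choices, not anything the export dictates:

    // extract.mjs: Artifact pass (step 3) + regex pass (step 4) over conversations.json.
    import fs from "node:fs";
    import path from "node:path";

    const conversations = JSON.parse(fs.readFileSync("conversations.json", "utf8"));

    // Illustrative extension map; refine per Artifact type/language as needed.
    const EXT = { "text/markdown": "md", "text/html": "html" };

    const ARTIFACT_RE = /<antartifact\b([^>]*)>([\s\S]*?)<\/antartifact>/gi;
    const ATTR_RE = /(\w+)="([^"]*)"/g;
    const PATTERNS = [
      /(?:we'?ll|let'?s|i'?ll)\s+(?:go\s+with|pick|choose|use)\s+([A-Z][\w-]+)/gi,
      /(?:decided|chose|picked|went)\s+(?:to|with|on)\s+([A-Z][\w-]+)/gi,
      /([A-Z][\w-]+)\s+(?:over|vs|instead\s+of)\s+([A-Z][\w-]+)/g,
      /(?:the\s+answer|the\s+choice|conclusion)\s+is\s+([A-Z][\w-]+)/gi,
    ];

    fs.mkdirSync("artifacts", { recursive: true });
    const candidates = [];

    for (const conv of conversations) {
      for (const msg of conv.chat_messages ?? []) {
        const text = msg.text ?? "";

        // Artifact pass: write each tagged block to artifacts/<identifier>.<ext>.
        for (const m of text.matchAll(ARTIFACT_RE)) {
          const attrs = Object.fromEntries(
            [...m[1].matchAll(ATTR_RE)].map(a => [a[1], a[2]]),
          );
          const ext = EXT[attrs.type] ?? "txt";
          fs.writeFileSync(
            path.join("artifacts", `${attrs.identifier ?? "unnamed"}.${ext}`),
            m[2],
          );
        }

        // Regex pass: record each phrase hit with the metadata from the table below.
        for (const re of PATTERNS) {
          for (const m of text.matchAll(re)) {
            candidates.push({
              chose: m[1],
              conversation_uuid: conv.uuid,
              project_uuid: conv.project_uuid ?? null,
              created_at: msg.created_at,
              sender: msg.sender,
            });
          }
        }
      }
    }

    fs.writeFileSync("candidates.json", JSON.stringify(candidates, null, 2));
    console.log(`${candidates.length} candidate decisions -> candidates.json`);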

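And the step-5 call itself, sketched with the official @anthropic-ai/sdk. The model id and prompt wording are assumptions; swap in the schema above, or enforce it via tool-calling if you want hard constraints rather than prompt-level ones:

    // llm-pass.mjs: step 5, one JSON-constrained extraction call per conversation.
    // Assumes ANTHROPIC_API_KEY is set; the model id here is an assumption.
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic();

    export async function extractDecisions(conversationText) {
      const response = await client.messages.create({
        model: "claude-haiku-4-5",
        max_tokens: 1024,
        messages: [{
          role: "user",
          content:
            "Extract every durable decision from this conversation. " +
            "Reply with JSON only: {\"decisions\": [{\"title\", \"chose\", " +
            "\"rejected\", \"rationale\", \"confidence\"}]}.\n\n" +
            conversationText,
        }],
      });
      // The reply's first content block is the text block holding the JSON.
      return JSON.parse(response.content[0].text).decisions;
    }
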
The conversation-level metadata you should preserve

Most extraction guides flatten the export and lose the metadata. That metadata is what makes the resulting decision log queryable later. Keep these five fields on every record:

| Field | Source | Why it matters |
| --- | --- | --- |
| conversation_uuid | conversations[].uuid | Backlink to the original chat. The decision log is a summary; this lets a reviewer click through to the full reasoning. |
| project_uuid | conversations[].project_uuid | One-column filter for per-initiative views. "Every decision made inside Mobile rewrite Q2" → one query, not a manual group-by. |
| created_at | chat_messages[i].created_at of the decision turn | The decision date, not the conversation start date. Critical when a long chat spanned a sprint and the call got made on day 4, not day 1. |
| sender | chat_messages[i].sender | "Did the user decide, or did Claude propose?" Both are valid; both should be flagged. The user's commit is the load-bearing one. |
| artifact_id (optional) | Artifact identifier, if extracted from one | If the decision crystallized in an Artifact, link to the Artifact body file. Saves the reviewer one hop when the decision IS the Artifact. |

Every WhyChose decision record carries these five fields. The first two power the searchable view (filter by project, click through to original); the last three power the audit-trail view (when, who, what's the artifact).
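
Concretely, a finished record with all five metadata fields might look like this (all values invented for illustration):

    {
      "title": "Event store: Postgres over MongoDB",
      "chose": "Postgres",
      "rejected": ["MongoDB"],
      "rationale": "Joins against existing billing tables outweighed schema flexibility.",
      "conversation_uuid": "b3f9c2...",
      "project_uuid": "71ad58...",
      "created_at": "2025-03-04T19:02:33Z",
      "sender": "human",
      "artifact_id": "event-log-schema"
    }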

What gets missed (and how to recover)

Three categories of decisions slip past every extractor. Knowing the failure modes lets you backfill manually:

  1. Decisions that crossed multiple conversations. The trade-offs got weighed in chat A, the choice got made in chat B a week later. Each conversation alone reads as exploration; the actual decision is the diff between them. Extractors see them as two separate exchanges. Recover by sorting the rejected-as-exploratory list by conversation title and reading any clusters that share a topic — those are usually the cross-conversation decisions in pieces.
  2. Decisions framed as questions, not statements. "Should we use Postgres or Mongo?" with five turns of comparison and an implicit conclusion. The closing turn doesn't say "we'll use Postgres," it just stops talking about Mongo. Regex misses this entirely; the LLM pass catches it most of the time. The recovery pattern: if a conversation is >10 turns and the last 3 turns mention only one of the alternatives, infer that as the choice (a sketch of this check follows the list).
  3. Decisions that got reversed. A choice was made in March, reversed in June. Both turns get extracted; without supersession, the log shows two contradictory records. Mitigate by running a final pass that groups records by normalized title and flags clusters with conflicting chose values — a reviewer marks the latest as Active and the older as Superseded by the new conversation_uuid.
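
The recovery pattern in item 2 is mechanical enough to script. A sketch, assuming you already have each conversation as an array of message strings and know which two alternatives were being compared; both the 10-turn and 3-turn thresholds come straight from the heuristic above:

    // Infer an implicit choice: in a long chat, if the closing turns mention
    // only one of the two alternatives, treat that one as the pick.
    function inferImplicitChoice(messages, [optionA, optionB]) {
      if (messages.length <= 10) return null;          // only long chats qualify
      const tail = messages.slice(-3).join(" ").toLowerCase();
      const sawA = tail.includes(optionA.toLowerCase());
      const sawB = tail.includes(optionB.toLowerCase());
      if (sawA && !sawB) return optionA;
      if (sawB && !sawA) return optionB;
      return null;                                     // both or neither: no call
    }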

How WhyChose helps

WhyChose is the productized version of the above. You drop your Claude conversations.json into the browser, all three extraction passes (regex, Artifacts, LLM) run client-side, and you get back a searchable decision log with the original chat snippets attached and Artifacts linked. We never see your transcripts — only the short extracted records (title, chose, rejected, rationale, timestamp, conversation_uuid, project_uuid, optional artifact_id) get persisted. Pro tier exports to Notion, Linear, and Obsidian; Team tier adds shared decision logs scoped by Project membership. If you'd rather self-host, the extractor is MIT-licensed — download the tarball, run node bin/extractor.js conversations.json, keep your data off our servers entirely. Hosted and open-source paths run the same engine.

Related questions

How does Claude extraction differ from ChatGPT extraction?

Three real differences. (1) Shape: Claude ships a flat chat_messages[] array per conversation; ChatGPT ships a mapping DAG with edit branches you have to walk. Claude is the easier flatten — one pass over chat_messages[] and you have linear text. (2) Artifacts: Claude wraps schemas, code, and design docs in <antartifact> tags, so the durable output is structurally separated from the prose. ChatGPT inlines everything in message text. (3) Project context: Claude conversations carry project_uuid when they happen inside a Project, so per-initiative filtering is one column.

Should I use the regex pass or the LLM pass?

Regex first, LLM only when you need precision. Regex is free, runs in under a second, and recovers about 70% of decisions cleanly, but it misses softer language and over-fires on exploratory chats. The LLM pass at Haiku 4.5 or GPT-4o-mini prices runs roughly $0.0008 per conversation — about $0.30 for a year of typical use — and pushes precision into the high 90s. Run regex first to find the obvious decisions, then LLM-pass the regex hits you're unsure about, plus a sample of the rejected conversations to estimate the miss rate.

Do Artifacts always represent a decision?

Almost always. When Claude produces an Artifact, the user asked for a durable thing — a schema, a config, a plan, a code skeleton. The prose around it is the rationale; the Artifact body is the chosen option. The exceptions are exploratory Artifacts ("show me what this would look like") and throwaway variants of an existing Artifact. Simplest practical filter: an Artifact is decision-bearing if it was edited at least once after first creation (command="update") or if the conversation continued for more than three turns after it appeared. First-shot Artifacts with no follow-up at all are usually exploratory.
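
That filter is easy to mechanize once you've parsed the tags. A sketch, where artifact.commands (every command attribute seen for one identifier) and artifact.firstMessageIndex are hypothetical fields produced by your own parsing step:

    // Heuristic from above: decision-bearing if updated after creation, or if
    // the conversation ran on for more than three turns after it appeared.
    function isDecisionBearing(artifact, conversation) {
      const updated = artifact.commands.includes("update");
      const turnsAfter =
        conversation.chat_messages.length - 1 - artifact.firstMessageIndex;
      return updated || turnsAfter > 3;
    }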

Does this work with Claude Projects?

Yes — and Projects make extraction sharper, not harder. Every conversation inside a Project carries a project_uuid, so you can filter to one initiative with one column. The Project's system prompt and custom instructions are also exported, which gives you the framing under which decisions were made — useful context for the LLM pass. The one caveat: knowledge-base files attached to a Project are not in the export, so if a decision references a knowledge-base PDF you'll lose that context. See the Claude Project export reference for the full handling.
