Topic: ChatGPT Canvas export

ChatGPT Canvas Export — What's in Your ZIP and How to Extract Documents

ChatGPT Canvas is the collaborative document editing mode launched in October 2024: instead of the back-and-forth of ordinary chat, you and the AI co-edit a single document in a side panel. Engineers use it to draft architecture decision records, technical RFCs, runbooks, and design documents. Unlike Cursor Chat or GitHub Copilot Chat, Canvas content is recoverable — it appears in the standard ChatGPT ZIP export that you download from Settings. This page explains exactly how Canvas documents appear in conversations.json, how to extract them, and where the export story falls short.

TL;DR

Canvas documents are included in the standard ChatGPT ZIP export (Settings → Data Controls → Export Data → conversations.json). Canvas conversations appear alongside regular chat conversations in the conversations array. The document content is stored in the message parts of the Canvas conversation — the final document state is in the last substantial assistant message. Intermediate edits during the Canvas session may appear as earlier message nodes in the mapping. What is not in the export: UI-state annotations (highlights, inline comments), the interactive revision timeline, and code execution outputs from Canvas code blocks. For engineers who used Canvas to draft an ADR or RFC: the document text is recoverable. The WhyChose extractor can process the conversations.json and surface Canvas-drafted documents that contain architecture decision patterns.

What ChatGPT Canvas is (and isn't)

Canvas is a distinct mode within ChatGPT, not a separate product. To enter Canvas mode, use the "Canvas" option from the model selector or start a new conversation with a prompt that asks for a document or code artifact. ChatGPT switches to a split-panel view: chat on the left, the editable Canvas document on the right.

Key distinctions that affect the export story:

How Canvas content appears in conversations.json

The conversations.json file is a JSON array where each element is a conversation object. Canvas conversations are structurally similar to regular conversations with two identifying characteristics: (1) the conversation title often includes the Canvas document title, and (2) the message mapping includes one or more messages with the full document text as the content.

Canvas message structure

Each Canvas revision produces a new message node in the conversation mapping. A typical Canvas session produces this message sequence in the mapping:

  1. User prompt: "Draft an ADR for our decision to use PostgreSQL full-text search instead of Elasticsearch."
  2. Assistant message: the first draft of the Canvas document — full Markdown text of the ADR.
  3. User message (canvas edit comment): "Add a section on migration complexity."
  4. Assistant message: the revised Canvas document — full Markdown text with the new section.
  5. [repeat for each revision cycle]
  6. Final assistant message: the committed final state of the Canvas document.

Each assistant message in this sequence contains the full document text, not just the delta. So even if the intermediate revision history is sparse in your export (some intermediate revisions may be coalesced), the final document state is the content of the last document-length assistant message in the thread. A short closing chat reply may follow it, which is why the recipes below filter by length.
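Because the mapping is a keyed collection of nodes rather than an ordered list, "the last assistant message in the thread" has to be located by following links between nodes. The following is a minimal Python sketch of one way to do that. It assumes each conversation object has a current_node field and each mapping node has a parent field; neither is described above, so check them against your own export. The function names are hypothetical.

```python
def active_thread(conversation):
    """Return the message dicts from the root to the current node, oldest first."""
    mapping = conversation.get("mapping") or {}
    node_id = conversation.get("current_node")  # assumed field name
    thread, seen = [], set()
    while node_id and node_id in mapping and node_id not in seen:
        seen.add(node_id)  # guard against malformed parent cycles
        node = mapping[node_id]
        if node.get("message"):
            thread.append(node["message"])
        node_id = node.get("parent")  # assumed field name
    thread.reverse()
    return thread


def first_text_part(message):
    """Return parts[0] when it is a string, otherwise an empty string."""
    parts = (message.get("content") or {}).get("parts") or []
    return parts[0] if parts and isinstance(parts[0], str) else ""


def final_document(conversation, min_chars=500):
    """Return the last document-length assistant text in the active thread."""
    for message in reversed(active_thread(conversation)):
        role = (message.get("author") or {}).get("role")
        text = first_text_part(message)
        if role == "assistant" and len(text) >= min_chars:
            return text
    return None
```

To use it, load conversations.json with json.load and call final_document on each element. The min_chars threshold plays the same role as the length filters in the jq recipes below.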

Canvas message content_type

In the conversations.json schema, Canvas document content appears with content_type: "text" in the message parts — the same content type as a regular chat response, but the content is the full document text rather than a conversational reply. You can distinguish Canvas messages from regular chat messages by their length and structure: a Canvas draft of an ADR will be 500–3000 words in a single message part, not a paragraph of chat prose.

Some Canvas messages carry additional metadata that identifies the canvas mode. The metadata field on a Canvas message may include a canvas sub-object with the document title and format (text vs code). This metadata is not guaranteed to be present on all Canvas messages in all export versions, so length and content structure are the more reliable heuristics for identifying Canvas content.
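The two signals described here, optional metadata and length plus structure, can be combined into one check. This is a rough Python sketch, not a definitive classifier. Counting Markdown headings as the measure of "structure" is an assumption made for illustration, and the thresholds are arbitrary starting points.

```python
import re

# A Markdown heading: one to six '#' characters, whitespace, then text.
HEADING = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)


def looks_like_canvas(message, min_chars=1000, min_headings=2):
    """Heuristic: True if a message dict is probably a Canvas document."""
    metadata = message.get("metadata") or {}
    if metadata.get("canvas"):  # sub-object mentioned above; often absent
        return True
    parts = (message.get("content") or {}).get("parts") or []
    text = parts[0] if parts and isinstance(parts[0], str) else ""
    # Fallback: long text with several headings reads like a document, not a chat reply.
    return len(text) >= min_chars and len(HEADING.findall(text)) >= min_headings
```

Code-format Canvas documents usually have no Markdown headings, so for those you would set min_headings to 0 and rely on length alone.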

Identifying Canvas conversations in a large export

The most reliable way to identify Canvas conversations in a large export is by conversation title. Canvas conversations typically have titles that reflect the document subject rather than a conversational question. A regular chat conversation might be titled "Why PostgreSQL over Elasticsearch?"; a Canvas conversation on the same subject would more likely be titled "PostgreSQL Full-Text Search ADR" or "Architecture Decision: Search Infrastructure".

jq recipe to list all conversations and their titles for manual review:

jq 'map({id, title, message_count: (.mapping | length)}) | sort_by(-.message_count)' conversations.json

Canvas conversations tend to have moderate message counts (5–30 nodes) compared to long chat sessions (100+ nodes) or very short interactions (1–3 nodes), though this is a rough heuristic, not a reliable filter.
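The title and message-count heuristics can be applied together to shortlist conversations for manual review. A small Python sketch under the same caveats: the 5 to 30 node range comes from the paragraph above, and treating a trailing question mark as a sign of an ordinary chat title is an assumption for illustration.

```python
def canvas_candidates(conversations, min_nodes=5, max_nodes=30):
    """Yield (title, node_count) for conversations matching the rough heuristics."""
    for conv in conversations:
        title = (conv.get("title") or "").strip()
        node_count = len(conv.get("mapping") or {})
        # Document-style titles rarely end in a question mark.
        if min_nodes <= node_count <= max_nodes and not title.endswith("?"):
            yield title, node_count
```

Pass it the list loaded from conversations.json. Expect false positives and false negatives; the output is a starting point for review, not a filter to trust blindly.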

Extracting Canvas document content with jq

The following recipes assume the standard conversations.json format. Adjust the .content.parts[0] path if your export version uses a different parts schema.

Extract all long assistant messages (likely Canvas documents)

jq -r '
  .[] |
  .title as $title |
  .mapping |
  to_entries[] |
  select(.value.message.author.role == "assistant") |
  select((.value.message.content.parts[0] // "" | length) > 1000) |
  {
    conversation: $title,
    message_id: .key,
    length: (.value.message.content.parts[0] | length),
    preview: (.value.message.content.parts[0] | .[0:200])
  }
' conversations.json

This surfaces all assistant messages longer than 1000 characters — a threshold that captures full Canvas document drafts while excluding short chat responses. Adjust the threshold up (to 2000 or 3000) if your regular chat responses are verbose, or down (to 500) if your Canvas documents are short.

Extract Canvas document content from a specific conversation by title

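CANVAS_TITLE="PostgreSQL Full-Text Search ADR"

jq -r --arg title "$CANVAS_TITLE" '
  .[] |
  select(.title == $title) |
  # Collect every document-length assistant message in this conversation.
  [ .mapping[] |
    select(.message.author.role == "assistant") |
    select((.message.content.parts[0] // "" | length) > 500) |
    .message
  ] |
  # Order by timestamp and keep only the most recent one.
  sort_by(.create_time) |
  last |
  .content.parts[0] // empty
' conversations.json

Collecting the matches into an array, sorting by create_time, and taking last returns the final committed state of the document rather than every intermediate revision draft. The selection has to happen inside jq: a shell-level tail -1 would return only the last line of a multi-line document. The recipe assumes each message carries a create_time field.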

Extract all Canvas documents to separate files

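mkdir -p canvas-export

jq -r '
  .[] |
  .title as $title |
  (.create_time // 0 | strftime("%Y-%m-%d")) as $date |
  .mapping |
  to_entries[] |
  select(.value.message.author.role == "assistant") |
  select((.value.message.content.parts[0] // "" | length) > 1000) |
  [$date, ($title // "untitled"), .key, .value.message.content.parts[0]] |
  @tsv
' conversations.json | while IFS=$'\t' read -r date title id content; do
  # Slugify the title; append part of the message ID so revisions do not overwrite each other.
  slug=$(printf '%s' "$title" | tr ' /' '--' | tr -cd '[:alnum:]-')
  filename="canvas-export/${date}-${slug}-${id:0:8}.md"
  # @tsv escapes newlines, tabs, and backslashes; printf %b restores them.
  printf '%b\n' "$content" > "$filename"
  echo "Wrote: $filename"
done

This writes each long assistant message to a separate Markdown file named with the conversation date, the title, and the first eight characters of the message ID, so multiple revisions from the same conversation land in separate files. Because @tsv escapes newlines and tabs to keep each record on one line, printf '%b' is used to restore them when writing the file. The loop relies on bash features ($'\t' and ${id:0:8}). You will likely get both Canvas documents and a few long non-Canvas responses (detailed explanations, code blocks); review the output and delete the non-Canvas files manually.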

What Canvas content is NOT in the export

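Four categories of Canvas content are missing or degraded in the ZIP export, and Canvas document metadata is only partly preserved. The table lists what is and is not recoverable: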

Canvas feature | In export? | Notes
Document body text (Markdown or code) | Yes | Full text in message parts. Final state always present; intermediate revisions present as earlier message nodes.
Conversation title | Yes | In the title field of the conversation object.
Chat thread (prompts and responses) | Yes | Same as any regular conversation: all message nodes in the mapping.
UI-state annotations (highlights, inline comments) | No | Highlighting and suggestion markers are rendering state. Not persisted as message content. Not in export.
Interactive revision timeline | Partial | Multiple revisions may appear as sequential message nodes, but the interactive slider in the Canvas UI is not reconstructable from the export. Some intermediate revisions may be coalesced into a single message node.
Code execution outputs (Canvas code blocks) | Partial | The code in the Canvas document is present. Execution results (stdout, rendered charts) may not be fully captured in the message content parts.
Share link | No | Share links to Canvas documents are not listed in the export. The document content is recoverable; the share URL is not.
Canvas document metadata (format, version) | Partial | Some Canvas messages carry a metadata.canvas field with the document title and format. Not guaranteed across all export versions.
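Although the interactive slider cannot be rebuilt, the revisions that do survive as full-text message nodes can be compared pairwise to see what changed between them. A minimal Python sketch using the standard library's difflib; it assumes you have already collected the revision texts in chronological order, and the function name is hypothetical.

```python
import difflib


def revision_diffs(revisions):
    """Yield a unified diff for each consecutive pair of document revisions."""
    for index in range(1, len(revisions)):
        before = revisions[index - 1].splitlines(keepends=True)
        after = revisions[index].splitlines(keepends=True)
        yield "".join(difflib.unified_diff(
            before, after,
            fromfile=f"revision-{index}",
            tofile=f"revision-{index + 1}",
        ))
```

If the export coalesced several edits into one node, the diff for that step simply shows all of them at once.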

Canvas vs other ChatGPT features: export comparison

ChatGPT has several content modes beyond ordinary chat. Their export stories differ:

Feature | Content type | In conversations.json? | Recoverable?
Regular chat | Text messages | Yes (complete) | Full conversation text
Canvas (text documents) | Markdown documents | Yes (document body in message parts) | Final document state + revision history
Canvas (code) | Code artifacts | Yes (code text in message parts) | Code text; execution outputs partial
DALL-E image generation | Images (CDN URLs) | Partial (CDN URL preserved, binary not in ZIP) | URL expires ~30 days; prompt text permanent
Code Interpreter / Data Analysis | Code + execution results | Partial (code text preserved, execution outputs may be missing) | Code text; output rendering not guaranteed
File uploads (context documents) | User-uploaded files | Partial (file reference in message, not binary content) | File names visible; file content not in ZIP
GPT-4o native image generation | Generated images | Partial (internal file-service URI; not a public URL) | No download path from export

Canvas has the best export story among ChatGPT's non-standard content modes: the document text is preserved in full, in the standard message format, without requiring CDN URL resolution or binary file download.

Using Canvas for ADR and RFC drafting

Canvas is well-suited for structured technical writing — ADRs, RFCs, runbooks, design documents — because the document-centric editing model matches the artifact-centric nature of these documents better than chat mode does. Instead of extracting a document from a conversational thread, you start with a document frame and the AI fills it in collaboratively.

A practical ADR drafting workflow using Canvas:

  1. Open Canvas with a structured prompt. "Draft an ADR in Nygard format for the following decision: [describe the decision]. Include sections for Context, Decision, Consequences (positive, negative, neutral), and Alternatives Considered." ChatGPT creates a Canvas document with the requested structure.
  2. Iterate via chat comments. Use the left chat panel to refine: "The Alternatives Considered section is missing Elasticsearch with Postgres as a hybrid. Add it with the trade-offs." Each iteration produces a new Canvas revision.
  3. Export the document. When the draft is complete, copy the Canvas document text to your ADR file in the code repository. Alternatively, export your ChatGPT data (Settings → Data Controls → Export Data) and use the jq recipes above to extract the Canvas content from conversations.json.
  4. Run WhyChose on the conversation. If the same ChatGPT export contains the deliberation conversations that led up to the Canvas draft — the "which database should we use?" chat sessions — the WhyChose extractor surfaces those as structured decision records alongside the Canvas-drafted ADR. The conversation is the evidence; the Canvas document is the formalized position.
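If you want a quick check that an extracted document actually has the ADR structure requested in step 1, a heading scan is enough for a first pass. This Python sketch is an illustration only and is not how the WhyChose extractor works; the section names come from the prompt in step 1, and the function names and the three-section threshold are hypothetical.

```python
import re

# Section names taken from the example prompt in step 1; extend as needed.
ADR_SECTIONS = ("Context", "Decision", "Consequences", "Alternatives Considered")


def adr_sections_present(text):
    """Return the ADR section names that appear as Markdown headings in text."""
    found = []
    for name in ADR_SECTIONS:
        pattern = rf"^#{{1,6}}[ \t]+{re.escape(name)}\b"
        if re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE):
            found.append(name)
    return found


def looks_like_adr(text, min_sections=3):
    """True if at least min_sections of the expected headings are present."""
    return len(adr_sections_present(text)) >= min_sections
```

Running looks_like_adr over the files written by the export recipe above is one way to separate ADR drafts from other long assistant responses.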

The key advantage over Cursor Chat for this workflow: when you use Canvas in ChatGPT to draft an ADR, both the deliberation (the chat thread) and the output (the Canvas document) are preserved in the export. A Canvas-drafted ADR that was never committed to a code repository is still recoverable from the conversations.json export — it did not vanish when you closed the browser tab.


Related questions

Are ChatGPT Canvas documents included in the data export?

Yes. ChatGPT Canvas documents are preserved in the standard conversations.json export downloaded via Settings → Data Controls → Export Data. Canvas conversations appear alongside ordinary chat conversations in the conversations array. The Canvas document content is in the message parts of the Canvas conversation — the final document state is in the last substantial assistant message. Intermediate revisions appear as earlier message nodes in the conversation mapping. There is no separate "Canvas documents" export; all Canvas content lives inside conversations.json.

How does Canvas content appear in conversations.json?

Canvas content appears as assistant messages within the Canvas conversation's mapping. The document body is stored in message.content.parts[0] with content_type: "text". Each Canvas revision produces a new message node. The final committed document state is the last assistant message with document-length content. Some Canvas messages carry a metadata.canvas sub-object with the document title and format, but this metadata is not guaranteed across all export versions. Identifying Canvas conversations by title is more reliable than identifying them by message metadata.

Can I use ChatGPT Canvas for ADR or RFC drafting and still recover the document?

Yes — Canvas is one of the better ADR drafting surfaces in ChatGPT precisely because the output is recoverable. The Canvas interface gives you a structured document editing experience; the resulting Markdown document appears as the final message content in conversations.json. The WhyChose extractor processes conversations.json and can identify Canvas conversations containing decision-pattern language (trade-off comparisons, Alternatives Considered sections, Consequences enumerations) and surface them as structured ADR records. This is significantly better than Cursor Chat, where the deliberation is gone when you close the panel.

What ChatGPT Canvas content is NOT in the export?

Four things are missing or degraded: (1) UI-state annotations — highlights and suggestion markers are rendering state, not persisted content; (2) the interactive revision timeline — multiple revisions appear as message nodes but the slider is not reconstructable; (3) code execution results from Canvas code blocks — code text is present, execution outputs may not be; (4) share links — the share URL is not listed in the export, only the document content.
