Topic: ChatGPT image generation export
ChatGPT Image Generation in the Data Export — DALL-E Prompts, CDN URLs, and What Expires
Engineers and designers who use ChatGPT to generate UI mockups, architecture diagrams, brand assets, and design explorations often discover a painful fact when they export their conversation history: the image files are not in the ZIP. The conversations.json export preserves the prompt text and a CDN URL for each generated image — but the CDN URL expires roughly 30 days after the image was created, after which the image is gone from OpenAI's servers permanently. This page explains exactly how DALL-E 3 and GPT-4o native image generation appear in conversations.json, how to extract your image generation history, and what recovery options exist when the CDN URLs have already expired.
TL;DR
- DALL-E 3 images: the export preserves your original prompt, the DALL-E revised prompt (which may differ substantially), and a CDN URL that expires approximately 30 days post-generation. The image binary is not in the ZIP. DALL-E invocations appear as tool messages with author.name: "dalle.text2im" in the mapping DAG.
- GPT-4o native image generation: appears as content_type: "image_asset_pointer" with an internal file reference URI, which is not a public URL and cannot be downloaded from the export.
- Recovery after expiry: regenerate from the preserved prompt text (revised_prompt for DALL-E 3). The prompt is always preserved regardless of image URL status.
- Capture strategy: download images at generation time; don't rely on CDN URLs as an archive.
What the ChatGPT data export does and doesn't include for images
| Item | In conversations.json? | As a downloadable file? | Expiry? | Notes |
|---|---|---|---|---|
| Your original image prompt (what you typed) | Yes — in the user message content | n/a (text) | Never — permanent | Preserved in the parts array of the user message turn |
| DALL-E revised prompt (what DALL-E actually used) | Yes — in the dalle.text2im tool invocation | n/a (text) | Never — permanent | Often significantly different from your original prompt; contains the full revised instruction |
| DALL-E image CDN URL | Yes — in the dalle.text2im tool result | Not in the ZIP; must fetch the URL while it's live | ~30 days from generation | URL format: https://files.oaiusercontent.com/file-XXXXX?.... Returns HTTP 404 after expiry. |
| DALL-E image binary | No | No | ~30 days | OpenAI does not store image binaries indefinitely; only the CDN URL is preserved in the export |
| GPT-4o native image (content_type: image_asset_pointer) | Yes — as an internal asset_pointer URI | No — the URI is an internal reference, not a public URL | Unknown — may be shorter than DALL-E CDN | Worse export coverage than DALL-E 3; no public URL path exists |
| Image generation model version | Yes — as model_slug (e.g., gpt-4-gizmo-dalle) | n/a | Never | Identifies whether DALL-E 3, DALL-E 2, or GPT-4o native generation was used |
| Number of images generated per prompt | Partially — each image is a separate URL in the tool result | No | Per URL | When you ask for multiple images, each gets its own CDN URL in the tool result content |
The critical asymmetry: the text-based components of image generation (your prompt, the revised prompt, the conversation around the image) are preserved permanently in conversations.json. The image itself is ephemeral. If you need the image file, download it at generation time — do not assume the CDN URL will remain accessible.
How DALL-E 3 image generation appears in conversations.json
Understanding the schema is necessary before you can write the jq queries to extract image generation history. DALL-E 3 image generation produces three types of messages in the mapping DAG:
1. The user prompt message
A standard user message with your image description in the parts array:
{
  "id": "user-turn-uuid",
  "message": {
    "id": "msg-uuid",
    "author": { "role": "user" },
    "content": {
      "content_type": "text",
      "parts": ["Generate a diagram showing a three-tier web architecture with load balancer, application servers, and database cluster"]
    },
    "create_time": 1738234567.123
  },
  "parent": "parent-uuid",
  "children": ["tool-invocation-uuid"]
}
2. The DALL-E tool invocation message
This is the message that identifies image generation sessions. The key identifier is author.name: "dalle.text2im":
{
  "id": "tool-invocation-uuid",
  "message": {
    "id": "msg-uuid",
    "author": {
      "role": "tool",
      "name": "dalle.text2im"
    },
    "content": {
      "content_type": "tether_browsing_display",
      "result": "{\"size\":\"1792x1024\",\"prompt\":\"A professional technical diagram showing a three-tier web architecture. At the top, a load balancer distributes traffic to three application server nodes in the middle tier, shown as blue rectangles with server icons. At the bottom, a primary PostgreSQL database cluster flanked by two read replicas, shown as dark cylinders. Clean white background, minimal style, architecture diagram aesthetic.\",\"dalle_prompt\":\"...\"}",
      "parts": [
        {
          "content_type": "image_asset_pointer",
          "asset_pointer": "file-service://file-XXXXXXXXXXXXX",
          "size_bytes": 1234567,
          "width": 1792,
          "height": 1024
        }
      ]
    },
    "create_time": 1738234570.456
  },
  "parent": "user-turn-uuid",
  "children": ["assistant-followup-uuid"]
}
Note: the exact schema varies by ChatGPT version. Older DALL-E 3 sessions (pre-2025) may have the revised prompt in a text field inside a JSON string in the result field. Newer sessions use the parts array structure shown above. Both include the prompt; the field names differ.
3. The assistant follow-up message
The assistant's response after image generation — typically includes a Markdown image embed with the CDN URL and a text description of what was generated:
{
  "id": "assistant-followup-uuid",
  "message": {
    "id": "msg-uuid",
    "author": { "role": "assistant" },
    "content": {
      "content_type": "text",
      "parts": ["Here's your three-tier architecture diagram:\n\n![Generated image](https://files.oaiusercontent.com/file-XXXXX?...)\n\nThe diagram shows the load balancer at the top routing traffic to three application servers..."]
    }
  }
}
The CDN URL appears in the Markdown image syntax, ![alt text](URL), embedded in the assistant's text. This is a second place to find the URL, in addition to the tool invocation message.
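The three message types above are linked through the parent and children fields, so one way to collect a complete image generation turn is to start at each dalle.text2im node and follow those links. The Python sketch below assumes the layout shown in the examples (the tool node's parent is the user prompt and its children are the follow-up); real exports may place extra nodes in between, and the helper names are illustrative rather than part of any official tooling.

```python
def string_parts(node):
    """Return the plain-string entries of a node's content.parts."""
    message = (node or {}).get("message") or {}  # tolerate nodes whose message is null
    content = message.get("content") or {}
    return [p for p in content.get("parts") or [] if isinstance(p, str)]


def is_dalle_node(node):
    """True when the node's author.name marks a DALL-E tool message."""
    message = node.get("message") or {}
    return (message.get("author") or {}).get("name") == "dalle.text2im"


def image_turns(conversation):
    """Yield one dict per DALL-E node with the text of its parent and children."""
    mapping = conversation.get("mapping") or {}
    for node_id, node in mapping.items():
        if not is_dalle_node(node):
            continue
        parent = mapping.get(node.get("parent"))
        children = [mapping.get(c) for c in node.get("children") or []]
        yield {
            "node_id": node_id,
            "user_prompt": string_parts(parent),
            "followup": [text for child in children for text in string_parts(child)],
        }
```

To use it on a real export, load conversations.json with json.load and call image_turns on each conversation in the resulting list.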
jq recipes for image generation extraction
Count image generation sessions
jq '[
  .[] | select(
    any(.mapping[]; .message.author.name? == "dalle.text2im")
  )
] | length' conversations.json
Returns the number of conversations that contain at least one DALL-E image generation.
Extract all image prompts (what you asked for)
jq -r '
  [.[] |
    .mapping | to_entries[].value |
    select(.message.author.role? == "user") |
    .message.content.parts[]? |
    select(type == "string")
  ] | .[]
' conversations.json
This extracts the user-turn text from all conversations. To filter for conversations that contain image generation, chain with the DALL-E session filter.
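The chaining step is not spelled out above, so here is one possible way to combine the two filters, sketched in Python rather than jq. The function names are hypothetical; the field names (mapping, author.name, author.role, content.parts) are the ones used in the schema examples.

```python
def has_dalle(conversation):
    """True if any node in the mapping is a dalle.text2im message."""
    for node in (conversation.get("mapping") or {}).values():
        message = node.get("message") or {}
        if (message.get("author") or {}).get("name") == "dalle.text2im":
            return True
    return False


def user_prompts(conversation):
    """Yield every string part from user-authored messages."""
    for node in (conversation.get("mapping") or {}).values():
        message = node.get("message") or {}
        if (message.get("author") or {}).get("role") != "user":
            continue
        for part in (message.get("content") or {}).get("parts") or []:
            if isinstance(part, str):  # skip dict parts such as uploaded images
                yield part


def image_conversation_prompts(conversations):
    """Chain the filters: user prompts, but only from conversations with DALL-E."""
    return [p for c in conversations if has_dalle(c) for p in user_prompts(c)]
```

Note that this returns every user turn in a matching conversation, not only the turns that asked for an image.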
Extract all CDN URLs from the assistant follow-up messages
jq -r '
  [.[] |
    .mapping | to_entries[].value |
    select(.message.author.role? == "assistant" and
           (.message.author.name? != "dalle.text2im")) |
    .message.content.parts[]? |
    select(type == "string") |
    scan("https://files\\.oaiusercontent\\.com/[^)\"\\s]+")
  ] | .[]
' conversations.json
This scans assistant message text for CDN URLs matching the files.oaiusercontent.com pattern. Output is one URL per line, suitable for piping to a bulk-download script.
Full image generation inventory: prompt + URL + date
#!/usr/bin/env bash
# Produces a TSV: conversation_title | create_date | user_prompt | cdn_url
jq -r '
  .[] | . as $conv |
  .mapping | to_entries[] | . as $entry |
  select($entry.value.message.author.name? == "dalle.text2im") |
  {
    title: $conv.title,
    # create_time is a float and may be null; normalise before formatting
    date: ($entry.value.message.create_time // 0 | floor | todate),
    parent_id: $entry.value.parent
  } as $meta |
  # Get the user prompt from the parent node
  ($conv.mapping | to_entries[] |
    select(.key == $meta.parent_id) |
    .value.message.content.parts[]? |
    select(type == "string")) as $prompt |
  # Get the CDN URL from the children (assistant follow-up)
  ($conv.mapping | to_entries[] |
    select(.value.parent == $entry.key) |
    .value.message.content.parts[]? |
    select(type == "string") |
    scan("https://files\\.oaiusercontent\\.com/[^)\"\\s]+")
  ) as $url |
  [$meta.title, $meta.date, $prompt, $url] | @tsv
' conversations.json
Save this as extract-images.sh and run bash extract-images.sh > image-inventory.tsv. The result is a tab-separated file you can open in a spreadsheet application for review.
Bulk-check which CDN URLs are still live
#!/usr/bin/env bash
# Read CDN URLs from a file (one per line) and report live vs expired
while IFS= read -r url; do
  status=$(curl -s -o /dev/null -w "%{http_code}" --head --max-time 5 "$url")
  if [ "$status" = "200" ]; then
    echo "LIVE $url"
  else
    echo "EXPIRED($status) $url"
  fi
done < cdn-urls.txt
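Redirect the output of the CDN URL extraction recipe above into cdn-urls.txt, then run this script to identify which images are still downloadable and which have expired. The script labels every non-200 response as EXPIRED, including timeouts, so check the status code in parentheses before writing a URL off. For images still returning HTTP 200, download with curl -o output-filename.webp "URL" before they expire.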
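If you would rather script the download step than run curl by hand, the following Python sketch reads the LIVE lines produced by the bulk-check script and saves each image under a name derived from the file ID in the URL path. The .webp extension mirrors the curl example and is an assumption (the served format may differ), and both function names are hypothetical.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
import urllib.request


def filename_for(url, extension=".webp"):
    """Build a local filename from the last path segment (the file-XXXXX ID)."""
    file_id = PurePosixPath(urlparse(url).path).name or "image"
    return file_id + extension


def download_live(status_lines, out_dir="images"):
    """Download every URL that appears on a line of the form 'LIVE <url>'."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    saved = []
    for line in status_lines:
        fields = line.split(None, 1)  # status label, then the URL
        if len(fields) != 2 or fields[0] != "LIVE":
            continue
        url = fields[1].strip()
        target = Path(out_dir) / filename_for(url)
        with urllib.request.urlopen(url, timeout=30) as response:
            target.write_bytes(response.read())
        saved.append(target)
    return saved
```

A typical run would save the bulk-check output to a file and pass its lines to download_live; expired entries are skipped without a network request.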
GPT-4o native image generation
GPT-4o's native image generation capability (launched in 2025, distinct from the DALL-E 3 tool) produces images that appear differently in conversations.json than DALL-E 3 invocations. The key difference: GPT-4o native images are referenced via internal asset_pointer URIs rather than public CDN URLs.
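How to identify GPT-4o native image generation in your export. Because image_asset_pointer parts also appear in dalle.text2im tool messages and in user uploads, the query below excludes both and counts the remaining messages that carry at least one image part:
jq '[
  .[] | .mapping[] |
  select(
    (.message.author.role? != "user") and
    (.message.author.name? != "dalle.text2im") and
    any(.message.content.parts[]?;
        type == "object" and .content_type == "image_asset_pointer")
  )
] | length' conversations.json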
The asset_pointer field contains an internal URI like file-service://file-XXXXXXXXXXXXX. This is not a URL you can fetch with curl. There is no documented public endpoint that resolves these internal file references from outside ChatGPT's authenticated session. The image is referenced in your export data, but it is not accessible from the export alone.
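Since the pointers cannot be fetched, the most useful thing to do with them is inventory them. The sketch below lists every image_asset_pointer part together with the author of the message that holds it, then keeps the ones that are neither user uploads nor DALL-E tool output. That exclusion rule is a heuristic assumed here, based on the DALL-E example and the uploaded-files row elsewhere on this page, and not a documented marker for native generation.

```python
def asset_pointers(conversation):
    """Yield (author_role, author_name, asset_pointer) for every image part."""
    for node in (conversation.get("mapping") or {}).values():
        message = node.get("message") or {}
        author = message.get("author") or {}
        for part in (message.get("content") or {}).get("parts") or []:
            if isinstance(part, dict) and part.get("content_type") == "image_asset_pointer":
                yield author.get("role"), author.get("name"), part.get("asset_pointer")


def native_image_pointers(conversation):
    """Keep pointers that are neither user uploads nor DALL-E tool output."""
    return [
        pointer
        for role, name, pointer in asset_pointers(conversation)
        if role != "user" and name != "dalle.text2im"
    ]
```

Printing the full output of asset_pointers for a few conversations is a quick way to check whether the heuristic matches what your own export contains.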
The practical difference from DALL-E 3:
| Attribute | DALL-E 3 (tool invocation) | GPT-4o native image generation |
|---|---|---|
| Identifier in export | author.name: "dalle.text2im" | content_type: "image_asset_pointer" in parts array |
| Image URL type | Public CDN URL (files.oaiusercontent.com) | Internal file reference URI (file-service://file-XXX) |
| Downloadable from export? | Yes — while CDN URL is live (~30 days) | No — internal URI, no public endpoint |
| Prompt preserved? | Yes — both original and revised prompt | Yes — in the conversation turn where image was requested |
| Recovery path when image lost | Regenerate from preserved revised_prompt | Regenerate from original user prompt in conversation |
The CDN URL expiry timeline
Based on observed behaviour in conversations.json exports, DALL-E 3 CDN URLs follow this pattern:
- Day 0–7 (fresh): URL returns HTTP 200. Image is downloadable. The URL contains a signed expiry parameter (se=YYYY-MM-DDTHH%3A00%3A00Z in the query string) that shows the exact expiry timestamp.
- Day 8–30 (aging): URL may still return HTTP 200 but expiry is approaching. Check the se= parameter to confirm remaining lifetime.
- Day 31+ (expired): URL returns HTTP 404. OpenAI has deleted the stored image. The URL in your conversations.json is now a dead reference. The prompt text remains accessible.
The expiry timestamp is encoded in the CDN URL itself. Extract it with:
python3 -c "
from urllib.parse import urlparse, parse_qs
import sys
url = sys.stdin.read().strip()
params = parse_qs(urlparse(url).query)
print('Expires:', params.get('se', ['not found'])[0])
" <<< "https://files.oaiusercontent.com/file-XXXXX?se=2025-03-15T12%3A00%3A00Z&sp=r&..."
This prints the expiry date for any CDN URL you extract from conversations.json. Images generated more than 30 days before your export date will have already expired; images generated within 30 days of the export may still be downloadable if you act immediately after downloading the ZIP.
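To turn the expiry timestamp into a remaining lifetime, so you can sort URLs by urgency, a small helper is enough. This is a minimal sketch that assumes the se= value always has the YYYY-MM-DDTHH:MM:SSZ shape shown above; any other shape raises ValueError. The function names are illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs


def expiry_of(url):
    """Return the se= timestamp as an aware UTC datetime, or None if absent."""
    values = parse_qs(urlparse(url).query).get("se")
    if not values:
        return None
    # parse_qs has already decoded %3A to ':'; the trailing Z denotes UTC.
    parsed = datetime.strptime(values[0], "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def time_left(url, now=None):
    """Return a timedelta until expiry (negative once expired), or None."""
    expires = expiry_of(url)
    if expires is None:
        return None
    return expires - (now or datetime.now(timezone.utc))
```

Sorting your extracted URLs by time_left and downloading the smallest positive values first gives the best chance of saving images that are about to expire.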
Recovery strategy when images have already expired
The preserved prompt text is your primary recovery asset. DALL-E 3 preserves the revised_prompt — the full, detailed prompt that DALL-E actually used to generate the image — which is typically more detailed and useful for regeneration than the original short prompt you typed.
Regeneration workflow:
- Extract the revised_prompt from the DALL-E tool invocation using the jq recipes above.
- Submit the revised_prompt verbatim to DALL-E 3 in a new ChatGPT conversation. The result will not be pixel-identical (generative models produce different outputs on each run) but will be stylistically consistent with the original.
- If exact consistency is required, note that DALL-E 3 via ChatGPT does not expose a seed parameter in the public UI. Calling the OpenAI Images API directly (POST /v1/images/generations with model: dall-e-3) gives you scripted control over the prompt and size, but its documented request parameters do not include a seed either, so treat pixel-exact reproduction as unavailable.
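For the API route in the last step, the request can be sketched with the standard library alone. The example assumes the key is in an OPENAI_API_KEY environment variable and that the default size suits you (use the size recorded in the export if you want the same aspect ratio); no seed is set. The helper names are hypothetical, and the API may rewrite the prompt again, so expect a similar image, not an identical one.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/images/generations"


def build_request(prompt, size="1024x1024", api_key=None):
    """Build (but do not send) an image generation request for one prompt."""
    api_key = api_key or os.environ.get("OPENAI_API_KEY", "")
    body = json.dumps({"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def regenerate(prompt, size="1024x1024"):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(prompt, size), timeout=120) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect the payload for a batch of revised prompts before spending any API credit.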
For design work where the image represents a specific decision or specification, the revised_prompt itself may be more valuable than the image — it is an unambiguous, machine-readable specification of what was designed.
Extracting revised prompts for regeneration
#!/usr/bin/env bash
# Extract all DALL-E revised prompts from conversations.json
# Output: one dated block per prompt, separated by "---", suitable for batch regeneration
python3 <<'EOF'
import json
from datetime import datetime

with open("conversations.json", encoding="utf-8") as f:
    data = json.load(f)

prompts = []
for conv in data:
    for node in (conv.get("mapping") or {}).values():
        msg = node.get("message") or {}  # some nodes carry "message": null
        if (msg.get("author") or {}).get("name") != "dalle.text2im":
            continue
        content = msg.get("content") or {}
        entry = {
            "conversation": conv.get("title") or "",
            "date": msg.get("create_time") or 0,
        }
        # Try parts array (newer format)
        for part in content.get("parts") or []:
            if isinstance(part, dict) and "text" in part:
                prompts.append({**entry, "revised_prompt": part["text"]})
        # Try result field (older format): a JSON string that holds the prompt
        result = content.get("result") or ""
        if result:
            try:
                result_json = json.loads(result)
            except (json.JSONDecodeError, TypeError):
                continue
            if isinstance(result_json, dict) and "prompt" in result_json:
                prompts.append({**entry, "revised_prompt": result_json["prompt"]})

for p in sorted(prompts, key=lambda x: x["date"]):
    date_str = datetime.fromtimestamp(p["date"]).strftime("%Y-%m-%d")
    print(f"[{date_str}] {p['conversation']}")
    print(p["revised_prompt"])
    print("---")
EOF
This script handles both the pre-2025 and post-2025 DALL-E 3 schema variants and outputs a human-readable list of every revised prompt in your export, sorted by date.
Capture strategy: don't rely on CDN URLs
The architectural lesson from the CDN URL expiry is the same as the lesson from ChatGPT shared links: content that lives only on OpenAI's servers — rather than being embedded in the export — expires. The export is a snapshot of text and references; it is not a complete archive of all generated artifacts.
Practical capture strategy for engineers and designers who use ChatGPT image generation for work artifacts:
- At generation time: right-click each image in the ChatGPT UI and save it locally. The UI-displayed image is at full resolution and does not expire during the browser session. This 10-second action at generation time prevents hours of regeneration work later.
- For batch recovery (recent export): run the CDN URL extractor immediately after downloading your conversations.json ZIP, check which URLs are still live using the bulk-check script, and download live images before they expire. Images from the last 30 days are recoverable; older images are not.
- For prompt archival: run the revised_prompt extractor and save the output to a plaintext file (image-prompts.md). The prompts are permanently preserved and give you regeneration capability on demand, even after the CDN URLs expire. For design-decision documentation, the revised prompt is the specification: paste it into the decision record alongside the ADR or design doc it informs.
How image generation sessions compare to other ChatGPT export gaps
| Content type | In conversations.json? | As downloadable binary? | Permanently preserved? |
|---|---|---|---|
| Conversation text (all turns) | Yes — full text | n/a | Yes |
| DALL-E image prompt + revised prompt | Yes | n/a (text) | Yes |
| DALL-E image binary | No — CDN URL only | No (must fetch live URL) | No (~30 days) |
| GPT-4o native image | Internal URI only | No | No (no public download path) |
| Uploaded image files | Filename only (asset_pointer) | No | No (binary excluded from export) |
| Code Interpreter output files | Code text yes; generated file binaries no | No | No (only code text) |
| ChatGPT Memory entries | In memory.json (separate file in ZIP) | n/a (text) | Yes (until deleted) |
Image generation as a decision-capture surface
For engineers and product designers, ChatGPT image generation sessions often contain specification-quality content: the prompt you wrote to generate an architecture diagram, a UI mockup, or a brand asset is effectively a specification of what you were designing. "Generate a diagram showing event-driven data flow between three microservices using an event bus, with the order-service as producer and inventory-service plus notification-service as consumers" is a more precise description of the intended architecture than most informal ADRs.
The revised_prompt that DALL-E 3 produces from your original prompt is even more detailed — it expands the specification with rendering decisions (style, colour palette, composition) that reflect how the image was actually produced. For design documentation, preserving the revised_prompt alongside the image is equivalent to preserving the compiled output alongside the source code.
When you use the WhyChose extractor on your conversations.json, image generation turns are processed differently from text turns — the extractor identifies tool-invocation messages and flags them as potential specification artifacts rather than decision rationale turns. For sessions where image generation was the output of a design decision (e.g., you discussed options for an architecture diagram and then generated the winning design), the extractor surfaces the decision context from the text turns, and the image prompt documents the specification that resulted from that decision. Together they form a complete decision record: the reasoning that led to the design and the specification that captured the design itself.
Related questions
Are DALL-E images included in the ChatGPT data export?
Not as downloadable image files. The ChatGPT export preserves the original prompt, the DALL-E revised prompt, and a CDN URL for each generated image. The CDN URL expires approximately 30 days after generation and returns HTTP 404 after expiry. The image binary itself is not in the ZIP — you must download images during the live CDN window or regenerate them from the preserved prompt text.
How do I find image generation sessions in my conversations.json export?
DALL-E 3 invocations appear as tool messages with author.name: "dalle.text2im" in the mapping DAG. Run: jq '[.[] | select(any(.mapping[]; .message.author.name? == "dalle.text2im"))] | length' conversations.json to count conversations with image generation. For GPT-4o native images, look for content_type: "image_asset_pointer" in message parts arrays.
What happens to GPT-4o native image generation in the export?
GPT-4o native image generation (from 2025 onwards) appears as content_type: "image_asset_pointer" with an internal file-service://file-XXX URI. This is not a public URL — it cannot be fetched with curl or a browser outside an authenticated ChatGPT session. There is no supported download path for GPT-4o native images from the data export. Only the prompt text that produced the image is recoverable.
What is the revised_prompt field and why is it different from my original prompt?
DALL-E 3 automatically rewrites your prompt before generating images to improve quality and safety compliance. The revised_prompt in the export is what DALL-E actually used — often significantly more detailed than what you typed. For a 5-word original prompt, the revised prompt may be 100+ words. Both are valuable: the original reflects your intent (the requirement), the revised prompt reflects the specification that was built. The revised prompt is often more useful for regeneration because it contains the rendering instructions that produced the original output style.
Further reading
- ChatGPT conversations.json format — field reference (2026) — the complete schema reference for the mapping DAG, message content types, author roles, and tool invocation structures. Essential context for understanding where DALL-E tool messages appear in the node graph and how to walk the DAG to reconstruct full conversations including image generation turns.
- Uploaded files in ChatGPT exports — what's included, what's missing, and how to recover them — the parallel reference for file uploads. Covers the binary-exclusion rule that applies to uploaded PDFs and images, Code Interpreter output gaps, and how asset_pointer URIs work for files you uploaded vs. DALL-E images ChatGPT generated — the two categories have different schema representations but the same fundamental gap: binaries are not in the export.
- ChatGPT shared links — what persists, what expires, and how to archive conversations (2026) — the companion reference for expiry-by-design content in ChatGPT; shared links have the same expiry risk as image CDN URLs — they can 404 without warning when accounts are deleted or OpenAI changes URL schemes. The archival discipline for shared links (download at creation time) is the same as for image CDN URLs.
- ChatGPT web search in conversations.json — tether content types, what's stored, and how to extract citations (2026) — the reference for another category of tool-invocation messages in conversations.json. Web search tether_browsing_display and tether_quote nodes follow a similar pattern to DALL-E tool messages: tool-role author with a specific author.name value, structured content that most naive export scripts miss entirely.
- How to export your ChatGPT history (2026 guide) — the end-to-end export walkthrough: Settings → Data Controls → Export Data, the email confirmation, ZIP structure overview, and the common failure modes that result in incomplete exports.
- The open-source extractor — reads ChatGPT conversations.json and surfaces decision records as structured Markdown. For image generation sessions, the extractor identifies design decisions from the text context around image generation turns — the reasoning that led to the design is in the conversation text even when the image itself has expired.
- ChatGPT Canvas export — what's in your ZIP and how to extract documents — the contrasting case for non-standard ChatGPT content: Canvas documents (Markdown and code artifacts produced in collaborative document mode) have a significantly better export story than DALL-E images. Canvas document text is preserved permanently in message parts, does not depend on CDN URLs with expiry dates, and is fully present in the standard ZIP export. The comparison table on that page (Canvas vs DALL-E vs Code Interpreter vs file uploads) illustrates how the export completeness varies dramatically by content mode even within the same conversations.json file.