Topic: Gemini Deep Research export
Gemini Deep Research export — reports, citations, and what appears in Google Takeout
Gemini Deep Research produces multi-thousand-word structured research reports from a single question. These reports appear in your Google Takeout export as part of the Gemini Apps Activity data — but only the final output is preserved. The intermediate search queries, the research plan, and the inline citation-to-sentence mapping are not. Here is exactly what survives the export and how to use Deep Research reports as durable architecture decision context.
TL;DR
Gemini Deep Research reports are preserved in Google Takeout (Gemini Apps Activity export) as long model messages within ordinary conversation JSON. The full report text, the original research question, cited source URLs, and timestamps are all in the export. What is missing: the intermediate search queries Gemini ran internally, the research planning step, and the content of every web page fetched. The inline citation markers ([1], [2]) may or may not survive as text depending on export version. For decision documentation, treat the Deep Research report as the Context section of an ADR — the decision itself is captured from the follow-up ChatGPT or Claude session where you deliberated on the findings.
What is Gemini Deep Research
Gemini Deep Research is a feature available to Gemini Advanced subscribers (Google One AI Premium, $19.99/month as of 2026). Instead of generating a single response to your question, Gemini Deep Research treats your query as a research brief: it autonomously plans a multi-step investigation, executes many individual web searches over a period of five to thirty minutes, synthesizes the findings from multiple sources, and delivers a comprehensive structured research report. The report is displayed inline in the Gemini conversation thread once the research phase completes.
The practical difference from standard Gemini is large. A standard Gemini response to a complex question is typically two to four paragraphs with a handful of citations. A Deep Research report on the same question is typically ten to forty sections with sub-headings, 3000 to 10000 words, and a numbered source list at the bottom that can contain 20 to 60 cited URLs. Deep Research is designed for the kind of thorough background investigation that would take a human researcher several hours — evaluating technology options, surveying a technical landscape, comparing competing frameworks — making it particularly valuable as pre-decision research for architectural choices.
What Gemini Deep Research reports look like in Google Takeout
Google Takeout (takeout.google.com) is the standard path for exporting Google account data. To export your Gemini conversation history including Deep Research reports, go to Google Takeout, click "Deselect all", scroll down to find "Gemini Apps Activity", check that item, then proceed to create an export. The resulting ZIP archive contains a folder with JSON files covering your Gemini conversation history. Depending on the volume of your history, this may be one file or split across multiple files, but the structure is consistent.
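Once the archive is downloaded, a short script can list the JSON files inside it without unpacking everything. This is a minimal sketch that assumes only that the export is a ZIP containing .json files; the function name is illustrative, and the folder layout inside the archive may differ between exports.

```python
import zipfile


def list_json_members(archive_path):
    """Return the archive-relative names of all .json files in a ZIP archive."""
    with zipfile.ZipFile(archive_path) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.lower().endswith('.json')
        )
```

Calling list_json_members on your downloaded archive returns the paths you can then read with zipfile.ZipFile.open or extract to disk.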
Each conversation in the export is a JSON object containing a list of messages. Each message has a role field (user or model) and a text content field. A Deep Research conversation looks like a standard Gemini conversation in the JSON structure, with one key difference: the final model message is very long. While a standard Gemini model response might be 200 to 500 words, a Deep Research report model message is typically 3000 to 10000 words — the full report text, including all section headings, body paragraphs, and the numbered source list at the bottom. The report appears as a single text content field with Markdown-style formatting: ## headings, bullet lists, numbered lists, and bold text are encoded as plain Markdown syntax within the string value. The source list at the bottom of the report is encoded as plain text — a numbered list of source entries, each containing the source title and URL.
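Because the headings are plain Markdown inside the string, the report text can be split into sections with a few lines of code. The sketch below assumes "#"-style headings as described above; the function name is illustrative, and a line starting with "#" inside a code sample in the report would be misread as a heading.

```python
import re

HEADING_RE = re.compile(r'^(#{1,6})\s+(.*\S)\s*$')


def split_sections(report_text):
    """Split Markdown report text into (heading, body) pairs.

    Text that appears before the first heading is returned under an
    empty heading string.
    """
    sections = []
    heading, body = '', []
    for line in report_text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Close the previous section if it has a heading or any content.
            if heading or any(part.strip() for part in body):
                sections.append((heading, '\n'.join(body).strip()))
            heading, body = match.group(2), []
        else:
            body.append(line)
    if heading or any(part.strip() for part in body):
        sections.append((heading, '\n'.join(body).strip()))
    return sections
```

The last pair returned is usually the source list, which makes it easy to separate the evidence trail from the report body.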
What is in the Deep Research export
The table below summarizes what the Google Takeout Gemini Apps Activity export does and does not preserve for Deep Research conversations:
| Data element | In Takeout export? | Notes |
|---|---|---|
| Original research prompt | Yes | Preserved as the user message immediately before the report in the conversation message list |
| Full report text (all sections) | Yes | Complete report body including all section headings, sub-headings, paragraphs, and lists — as the model message text field |
| Cited source URLs | Yes | Preserved as plain text in the numbered source list at the bottom of the report body; not as a separate structured array |
| Conversation timestamp | Yes | Creation timestamp on the conversation object; message-level timestamps may vary by export version |
| Report Markdown formatting | Partially | Section headings and lists are preserved as Markdown syntax within the text string; rendering fidelity depends on the export version |
| Inline citation markers ([1], [2]) | Partially | In some export versions the superscript markers are preserved as [N] text; in others they are stripped. The source list at the bottom is consistently preserved regardless |
| Follow-up messages in same conversation | Yes | If you asked follow-up questions after the report, those are preserved as additional user/model message pairs in the same conversation object |
| Research plan (planning step) | No | The outline Gemini generates before beginning the research phase is transient UI state and is not stored in the conversation data model |
What is NOT in the Deep Research export
Four categories of data are generated during a Deep Research session but do not appear in the Takeout export:
- Intermediate search queries. During the research phase, Gemini internally executes many individual web searches — sometimes dozens for a single report. These queries are not surfaced in the UI (beyond a brief "searching the web" progress indicator) and are not stored in the conversation data model. Only the final output is stored. The cited source URLs give indirect evidence of some searches Gemini ran, but the complete search query log is not recoverable from any export path.
- Full content of fetched web pages. Gemini fetches and reads the content of many web pages during the research phase. Only the URLs of sources it chose to cite appear in the report's source list. The content of every page consulted — including pages Gemini read but did not cite — is not stored. This is analogous to a human researcher's browser history: only the sources they chose to reference end up in the bibliography.
- Structured citation-to-sentence mapping. Deep Research reports use superscript citation markers like [1] and [2] to indicate which source supports which claim. The source list at the bottom maps these numbers to URLs. In the export, the source list is plain text and any inline [N] markers that survive are also plain text within the prose. There is no machine-parseable data structure that maps "the claim in sentence X was supported by source Y" — only the prose with embedded markers and the list at the bottom. Reconstructing the precise citation-to-claim mapping from the export requires parsing the text.
- Research branching and abandoned sub-queries. If Gemini began a research direction and found it unproductive, that branch is not logged anywhere accessible. The report only reflects the synthesis of the research paths that produced usable findings. There is no audit trail of what Gemini tried and discarded during the investigation phase.
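The citation-to-claim mapping described above can be partly rebuilt by pairing each sentence's inline [N] markers with the numbered source list. This is a rough sketch that assumes the markers survived the export as [N] text and that each source line starts with its number; the function name is illustrative, and the sentence splitter is deliberately naive (it will break on decimals and abbreviations).

```python
import re

MARKER_RE = re.compile(r'\[(\d+)\]')
SOURCE_RE = re.compile(r'^(\d+)[\.\)]\s+(.*)$')
URL_RE = re.compile(r'https?://\S+')
# A sentence: text up to ., ! or ?, plus any citation markers that trail it.
SENTENCE_RE = re.compile(r'[^.!?]+(?:[.!?]+(?:\s*\[\d+\])*|$)')


def map_claims_to_sources(body_text, source_lines):
    """Pair each sentence that carries [N] markers with its source URLs."""
    sources = {}
    for line in source_lines:
        match = SOURCE_RE.match(line.strip())
        if not match:
            continue
        url = URL_RE.search(match.group(2))
        # Fall back to the raw entry text when no URL is present.
        sources[int(match.group(1))] = url.group(0) if url else match.group(2)

    mapping = []
    for sentence_match in SENTENCE_RE.finditer(body_text):
        sentence = sentence_match.group(0).strip()
        numbers = [int(n) for n in MARKER_RE.findall(sentence)]
        if numbers:
            mapping.append({
                'sentence': sentence,
                # None marks a marker with no matching source entry.
                'sources': [sources.get(n) for n in numbers],
            })
    return mapping
```

If your export stripped the inline markers, the function returns an empty list; only the source list at the bottom is then recoverable.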
How to extract citations from a Deep Research export
The Gemini Takeout JSON does not provide a separate citations array — the source list is embedded as plain text at the bottom of the report body. Extracting it programmatically requires finding the source list section within the report text. The following Python script reads a Gemini Takeout JSON file, identifies candidate Deep Research conversations by model message length, and extracts the citations section from each report:
```python
import json
import re
import sys

WORD_THRESHOLD = 2000  # minimum words in a model message to flag it as Deep Research

HEADING_RE = re.compile(r'^(#{1,3}\s+)?(Sources|References|Works Cited)\b', re.IGNORECASE)
ITEM_RE = re.compile(r'^\d+[\.\)]\s+')


def extract_citations(report_text):
    """Extract the numbered source list from a Deep Research report."""
    lines = report_text.splitlines()
    citation_lines = []
    in_citations = False
    # First pass: look for an explicit 'Sources' / 'References' heading.
    for line in lines:
        if HEADING_RE.match(line):
            in_citations = True
            continue
        if in_citations and ITEM_RE.match(line):
            citation_lines.append(line.strip())
        elif in_citations and line.strip() == '':
            continue  # allow blank lines within the citations block
        elif in_citations and citation_lines:
            break  # non-matching line after citations started: done
    if citation_lines:
        return citation_lines
    # Fallback: no heading found, so take the trailing numbered-list block.
    for line in reversed(lines):
        if ITEM_RE.match(line):
            citation_lines.append(line.strip())
        elif line.strip() == '':
            continue
        else:
            break
    return citation_lines[::-1]


def message_text(msg):
    """Return the message text, checking both known field names."""
    text = msg.get('text') or msg.get('content') or ''
    return text if isinstance(text, str) else ''


def find_deep_research(export_path):
    """Return one record per model message long enough to be a Deep Research report."""
    with open(export_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # The export may be a bare list of conversations or an object wrapping one.
    conversations = data if isinstance(data, list) else data.get('conversations', [])
    results = []
    for conv in conversations:
        messages = conv.get('messages', [])
        for idx, msg in enumerate(messages):
            if msg.get('role') != 'model':
                continue
            text = message_text(msg)
            word_count = len(text.split())
            if word_count < WORD_THRESHOLD:
                continue
            # Find the user message that preceded this one (the research question).
            prompt = ''
            for prior in reversed(messages[:idx]):
                if prior.get('role') == 'user':
                    prompt = message_text(prior)[:300]
                    break
            results.append({
                'conversation_id': conv.get('conversation_id', ''),
                'timestamp': conv.get('create_time', ''),
                'research_question': prompt,
                'word_count': word_count,
                'citations': extract_citations(text),
            })
    return results


if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else 'gemini_export.json'
    reports = find_deep_research(path)
    print(json.dumps(reports, indent=2))
```
Run it as python3 extract_deep_research.py MyActivity.json. The output is a JSON array of objects — one per candidate Deep Research report — with the research question, timestamp, word count, and extracted citation list. The 2000-word threshold is conservative; adjust it up (e.g. 3000) to reduce false positives from unusually long standard responses, or down to catch shorter reports. The citation extractor looks for a "Sources" heading or a trailing numbered list; Deep Research reports consistently end with one or the other.
Note that the Gemini Takeout JSON field names for message text and role may vary slightly across export versions (some use text, others use content). The script above checks both. If neither matches your export, inspect the raw JSON for the message structure with python3 -c "import json; d=json.load(open('MyActivity.json')); print(list(d[0]['messages'][0].keys()))" and adjust the field names accordingly.
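Before settling on a threshold, it can help to see how long the model messages in your export actually are. This is a small sketch that assumes the same messages / role / text-or-content layout the script above checks for; the function name is illustrative, and the field names may need adjusting for your export version.

```python
import json


def model_word_counts(export_path):
    """Return the word count of every model message, longest first."""
    with open(export_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    conversations = data if isinstance(data, list) else data.get('conversations', [])
    counts = []
    for conv in conversations:
        for msg in conv.get('messages', []):
            if msg.get('role') != 'model':
                continue
            text = msg.get('text') or msg.get('content') or ''
            if isinstance(text, str):
                counts.append(len(text.split()))
    return sorted(counts, reverse=True)
```

A visible gap in the sorted counts, for example a cluster of messages in the low hundreds and a few in the thousands, suggests where to place the cutoff.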
Decision-capture workflow for Deep Research reports
Deep Research reports are information-dense but ephemeral unless you act at the time of the session. The following three-step workflow keeps them recoverable and useful for long-term documentation:
- At the time of the research session: copy the share link and the citations list. The Gemini interface has a share icon for Deep Research reports; clicking it generates a persistent gemini.google.com/app/... URL that remains accessible until you delete the conversation. Copy this URL into your notes immediately. Also copy the numbered source list from the bottom of the report into a separate note: this is the evidence trail, and it is the piece most useful for the ADR Context section. Do not rely solely on Google Takeout retrieval: export timing is unpredictable, and capturing the link and source list immediately after each session is the only reliable approach.
- For long-term preservation: export to a local Markdown file within 30 days of critical research sessions. Google Takeout is the authoritative export path, but do not wait for an annual or quarterly export cycle for research that informs important decisions. Request a Gemini Apps Activity export from Google Takeout within 30 days of any research session whose output you want to preserve. From the export JSON, save the full report text as a local Markdown file with front-matter: include the date, the research question you submitted, the list of cited source URLs, and the share link. A simple front-matter block looks like:

  ```yaml
  ---
  date: 2026-06-02
  tool: Gemini Deep Research
  research_question: "Compare Postgres, CockroachDB, and PlanetScale for a multi-region SaaS with 200M rows"
  share_link: https://gemini.google.com/app/abc123
  sources:
    - https://www.cockroachlabs.com/docs/stable/multi-region-overview.html
    - https://planetscale.com/docs/concepts/sharding
    - https://www.postgresql.org/docs/current/ddl-partitioning.html
  ---
  ```

  This Markdown file is the durable artifact. It is not dependent on Google's servers, does not require re-export, and can be committed to the same repository as your ADRs.
- The ADR connection: use the report as the Context section. The ADR Context section asks: what situation, constraints, and evidence existed when this decision was made? A Deep Research report answers that question directly: it is a structured account of the technical landscape as it existed when you were evaluating your options. Paste a summary of the report's key findings (two to four paragraphs) into the ADR Context section, and link to the saved Markdown file for the full evidence trail. Use the report's "Alternatives Considered" or comparative sections to populate the ADR's Options section if your template includes one. The source list becomes the ADR's references section. The decision itself comes from the follow-up deliberation, not from the Deep Research report, which surveys the landscape without making the call for you.
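The second step above can be scripted. This is a minimal sketch that assumes you already have the report text, research question, and source URLs in hand (for instance from the extraction script earlier on this page); the function name and the front-matter keys simply mirror the example block and are not a fixed schema.

```python
import json
from pathlib import Path


def save_report_markdown(path, report_text, research_question, sources,
                         date, share_link=''):
    """Write a report to a Markdown file with YAML-style front-matter."""
    lines = [
        '---',
        f'date: {date}',
        'tool: Gemini Deep Research',
        # json.dumps produces a double-quoted, escaped string, which
        # YAML parsers generally accept as a quoted scalar.
        f'research_question: {json.dumps(research_question)}',
        f'share_link: {share_link}',
        'sources:',
    ]
    lines += [f'  - {url}' for url in sources]
    lines += ['---', '', report_text.rstrip(), '']
    target = Path(path)
    target.write_text('\n'.join(lines), encoding='utf-8')
    return target
```

Writing one file per report, named by date and topic, keeps the evidence trail easy to link from the corresponding ADR.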
Gemini Deep Research vs Perplexity Deep Research vs ChatGPT Browse
The three main AI research tools that produce long-form research output each have distinct export characteristics. The comparison matters because the tool you use for pre-decision research determines how much of your evidence trail is recoverable later:
| Feature | Gemini Deep Research | Perplexity Deep Research | ChatGPT GPT-4o with Browse |
|---|---|---|---|
| Typical report length | 3000–10000 words | 1500–3000 words | Varies — 500–3000 words per browsing-augmented response |
| Export in data download | Yes — full report text in Google Takeout (Gemini Apps Activity) | Partial — GDPR data request only; thread content coverage varies; no native export button | Yes — full conversation text in conversations.json via ChatGPT Settings → Data Controls → Export |
| Structured citations in export | Source list as plain text at the bottom of the report body; inline [N] markers may or may not survive export | Citations array available in API response JSON when return_citations: true; product-side export does not provide a separate citations array | Tether nodes in conversations.json encode browsed URLs per assistant message turn; not a clean citations array but extractable |
| Intermediate search queries | Not exported — only final citations preserved | Not exported — only final citations preserved | Not fully exported — tether_browsing_display nodes show per-turn query context in some export versions but not a complete search log |
| Native share link | Yes: gemini.google.com/app/... URL; persists until the conversation is deleted | Yes: thread link persists in Library until deleted | Yes: chatgpt.com/share/... URL; persists until manually revoked |
| Decision-capture path | Share link + local Markdown copy at session time; full report in Google Takeout export | Manual copy at session time (no native export); GDPR request for partial retroactive recovery | conversations.json via ChatGPT Settings export; comprehensive and reliable |
| Subscription required | Yes — Gemini Advanced (Google One AI Premium) | Yes — Perplexity Pro | No — available on free tier with limits; Plus for higher volume |
The headline finding: Gemini Deep Research and ChatGPT both provide reliable export paths — Google Takeout and ChatGPT's built-in export respectively — while Perplexity's export coverage is partial and slow. For research that will inform architecture decisions, prefer tools with reliable export paths, and capture share links immediately regardless of which tool you use.
Why this matters for decision capture
Deep Research reports are the most information-dense research artifact available in a standard AI-assisted engineering workflow. A single Deep Research session can survey a technical landscape at a depth and breadth that would take a human researcher half a day, pulling together benchmark data, documentation, GitHub issues, blog posts, and community discussions into a structured synthesis. For architectural decisions, that synthesis is invaluable: it captures what was true about the available options at the moment the decision was being considered, which is exactly what the ADR Context section asks for.
But Deep Research reports precede the decision. They provide the landscape; they do not make the call. The decision itself ("we chose Postgres over CockroachDB because our team has Postgres expertise and the operational complexity of global distribution is not justified at our scale") comes from the follow-up conversation where you deliberated on what the research found. That deliberation typically happens in ChatGPT or Claude, where you bring the research findings and reason through which option to take and why. It is extractable via the WhyChose extractor, which reads your conversations.json export and surfaces the decisions with their original reasoning, formatted as structured decision records.
The complete picture for a well-evidenced ADR requires both artifacts: the Deep Research report (Google Takeout) as the Context section, and the ChatGPT or Claude deliberation session (conversations.json) as the Decision and Consequences sections. Preserving only one gives you an incomplete record — either a thorough literature review with no decision, or a decision with no evidence trail. The workflow is: run Deep Research to build context, save the report immediately, deliberate in ChatGPT or Claude, extract the decision with WhyChose, paste the report summary into the ADR Context section. The result is a decision record that shows both what was decided and what evidence shaped it.
Related questions
Are Gemini Deep Research reports included in Google Takeout?
Yes. When you export your Google account data via Google Takeout and select "Gemini Apps Activity", the resulting ZIP archive includes JSON files covering your Gemini conversation history. Deep Research conversations appear in that JSON as ordinary conversations, but with the final model message being very long — typically 3000 to 10000 words — because it contains the full research report. The original research prompt, the complete report text, the cited source URLs listed at the bottom of the report, and conversation timestamps are all preserved. The intermediate search queries Gemini ran internally and the research planning step are not included.
How do I find Deep Research conversations in my Gemini export?
Deep Research conversations are not flagged with a special field in the Gemini Takeout JSON — they look like standard conversations structurally. The most reliable identification method is message length: Deep Research model responses are substantially longer than standard responses, typically 2000 words or more. Load the export JSON in Python, iterate over conversations, and flag any conversation where the model message text exceeds roughly 2000 words as a Deep Research candidate. You can also look for the source list at the end of the model message text — Deep Research reports consistently end with a numbered source list or a "Sources" heading that does not appear in standard responses. The research question that triggered the report is the user message immediately preceding the long model message.
Do Deep Research exports include the intermediate web search queries?
No. The intermediate search queries that Gemini runs during the Deep Research phase are not stored in, or exportable from, any path. During a Deep Research session, Gemini internally executes many individual web searches to gather the information it synthesizes into the report. These queries are not surfaced to the user in the UI beyond a brief progress indicator, and they are not stored in the conversation data model. Only the final output (the research question and the completed report) is in the export. The cited source URLs at the bottom of the report give indirect evidence of some searches that ran, but the complete internal search log is not recoverable. Perplexity Deep Research has the same limitation; ChatGPT Browse exports retain partial per-turn query context in some export versions, but not a complete search log.
How do I use a Gemini Deep Research report as ADR context?
A Deep Research report maps directly to the Context section of an Architecture Decision Record, which asks: what situation, constraints, and evidence existed when this decision was made? Export the report from Google Takeout or copy it to a local Markdown file with front-matter (date, research question, source URLs). When writing the ADR, paste a summary of the report's key findings into the Context section and link to the saved Markdown file for the full evidence trail. The report's comparative analysis maps to the ADR's options or alternatives section. The decision and its consequences come from the follow-up deliberation conversation in ChatGPT or Claude — not from the Deep Research report itself, which surveys the landscape but does not make the architectural call. That deliberation is extractable via WhyChose from the conversations.json export.
Further reading
- Gemini conversation export — how to download your Gemini history via Google Takeout — the base guide to the Google Takeout export path for all Gemini conversations: step-by-step instructions, the JSON structure for standard conversations, what the export includes and excludes, and how Gemini's export compares to ChatGPT and Claude. The Deep Research export described on this page is a specific case of the broader Gemini Takeout export covered there.
- Gemini for Google Workspace export — enterprise Gemini in Docs, Gmail, and Meet — the enterprise layer on top of consumer Gemini: Gemini in Google Workspace (formerly Duet AI) integrated into Docs, Gmail, Sheets, and Meet. Different data residency controls, Vault/eDiscovery export paths for admins, and the Workspace Data Export tool. If you use Gemini through a Google Workspace account rather than a personal Google account, the export architecture differs from the consumer Takeout path described on this page.
- Perplexity Spaces export — what team workspace data is recoverable — similar research tool, contrasting export story. Perplexity Spaces is the team workspace for the research tool most comparable to Gemini Deep Research in use case. It has the worst export coverage of any major team AI workspace — nothing is batch-exportable: no threads, instructions, or uploaded context files. The contrast with Gemini's Google Takeout path makes the relative export quality clear.
- Perplexity API vs Perplexity product — export gap and the stateless API — the API-vs-product split for Perplexity: the Perplexity API is stateless (no storage, no export path), the product stores Library threads (partial GDPR export only). The parallel structure to this page's Gemini Deep Research vs standard Gemini discussion — and the relevant comparison for engineers choosing between the two research tools for pre-decision investigation.
- The WhyChose extractor — reads ChatGPT and Claude conversations.json exports and surfaces the decisions inside as structured records. This is the tool for the deliberation side of the workflow: after using Gemini Deep Research to build context, bring the findings to ChatGPT or Claude for deliberation, export those conversations, and run them through the extractor to produce the Decision and Consequences sections of your ADR. The Deep Research report provides the Context; the extractor surfaces the Decision.
- Google NotebookLM export — what you can save and what you can't — the document-synthesis counterpart to Gemini Deep Research's web-research role. NotebookLM synthesizes sources you upload (PDFs, Drive docs, web URLs) rather than conducting autonomous web searches — complementary tools for the research phase of a decision. The critical difference: Deep Research reports appear in Google Takeout Gemini Apps Activity; NotebookLM outputs are not in Takeout at all and require manual capture at session time.
- Perplexity Deep Research export — saving reports before they disappear — the direct comparison product: Perplexity also offers a Deep Research mode that runs iterative web searches and produces a structured synthesis report. The critical export asymmetry: Gemini Deep Research reports can be exported to Google Docs and retrieved via Takeout; Perplexity Deep Research reports are not in Perplexity's data export at all and require manual PDF print or Google Docs save at the time of generation. Both tools serve the same research-synthesis role in an ADR workflow; this page documents the Perplexity-side gap.
- Gemini Advanced export — what the Google One subscription changes (and what it doesn't) — Deep Research is available only to Gemini Advanced (Google One AI Premium) subscribers. This page covers the export path for the Advanced subscription overall: Advanced is a model-tier upgrade, not a separate storage system; Deep Research reports appear in the same Takeout path as regular Gemini conversations; and what the Gems export includes vs excludes. If you're a Gemini Advanced subscriber, read this page for the full Advanced-tier export picture beyond Deep Research alone.