
Gemini Deep Research export — reports, citations, and what appears in Google Takeout

Gemini Deep Research produces multi-thousand-word structured research reports from a single question. These reports appear in your Google Takeout export as part of the Gemini Apps Activity data — but only the final output is preserved. The intermediate search queries, the research plan, and the inline citation-to-sentence mapping are not. Here is exactly what survives the export and how to use Deep Research reports as durable architecture decision context.

TL;DR

Gemini Deep Research reports are preserved in Google Takeout (Gemini Apps Activity export) as long model messages within ordinary conversation JSON. The full report text, the original research question, cited source URLs, and timestamps are all in the export. What is missing: the intermediate search queries Gemini ran internally, the research planning step, and the content of every web page fetched. The inline citation markers ([1], [2]) may or may not survive as text depending on export version. For decision documentation, treat the Deep Research report as the Context section of an ADR — the decision itself is captured from the follow-up ChatGPT or Claude session where you deliberated on the findings.

What is Gemini Deep Research

Gemini Deep Research is a feature available to Gemini Advanced subscribers (Google One AI Premium, $19.99/month as of 2026). Instead of generating a single response to your question, Gemini Deep Research treats your query as a research brief: it autonomously plans a multi-step investigation, executes many individual web searches over a period of five to thirty minutes, synthesizes the findings from multiple sources, and delivers a comprehensive structured research report. The report is displayed inline in the Gemini conversation thread once the research phase completes.

The practical difference from standard Gemini is large. A standard Gemini response to a complex question is typically two to four paragraphs with a handful of citations. A Deep Research report on the same question is typically ten to forty sections with sub-headings, 3000 to 10000 words, and a numbered source list at the bottom that can contain 20 to 60 cited URLs. Deep Research is designed for the kind of thorough background investigation that would take a human researcher several hours — evaluating technology options, surveying a technical landscape, comparing competing frameworks — making it particularly valuable as pre-decision research for architectural choices.

What Gemini Deep Research reports look like in Google Takeout

Google Takeout (takeout.google.com) is the standard path for exporting Google account data. To export your Gemini conversation history including Deep Research reports, go to Google Takeout, click "Deselect all", scroll down to find "Gemini Apps Activity", check that item, then proceed to create an export. The resulting ZIP archive contains a folder with JSON files covering your Gemini conversation history. Depending on the volume of your history, this may be one file or split across multiple files, but the structure is consistent.
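Because the history may be split across several files, it can help to merge them before doing any analysis. The following is a minimal sketch, not an official loader: the function name `load_conversations` is hypothetical, and it assumes each file is either a bare list of conversations or an object with a `conversations` key, the same shapes the extraction script later in this article accepts.

```python
import json
from pathlib import Path

def load_conversations(export_dir):
    """Merge conversations from every JSON file under an extracted Takeout folder."""
    conversations = []
    for path in sorted(Path(export_dir).rglob('*.json')):
        with open(path, 'r', encoding='utf-8') as f:
            data = json.load(f)
        # Assumed shapes: a bare list, or a dict with a 'conversations' key.
        if isinstance(data, list):
            conversations.extend(data)
        elif isinstance(data, dict):
            conversations.extend(data.get('conversations', []))
    return conversations
```

Point it at the folder you unzipped the archive into. If your export uses a different top-level layout, inspect one file first and adjust the two branches.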

Each conversation in the export is a JSON object containing a list of messages. Each message has a role field (user or model) and a text content field. A Deep Research conversation looks like a standard Gemini conversation in the JSON structure, with one key difference: the final model message is very long. While a standard Gemini model response might be 200 to 500 words, a Deep Research report model message is typically 3000 to 10000 words — the full report text, including all section headings, body paragraphs, and the numbered source list at the bottom. The report appears as a single text content field with Markdown-style formatting: ## headings, bullet lists, numbered lists, and bold text are encoded as plain Markdown syntax within the string value. The source list at the bottom of the report is encoded as plain text — a numbered list of source entries, each containing the source title and URL.
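To make that structure concrete, here is an illustrative sketch of one conversation and a helper that lists each message's role and word count. The sample data is invented, and the field names (`conversation_id`, `create_time`, `messages`, `role`, `text`) are assumptions that may differ in your export version.

```python
# Hypothetical shape of one exported conversation; field names are assumptions
# and may differ in your export version.
sample_conversation = {
    'conversation_id': 'example-id',
    'create_time': '2026-06-02T10:15:00Z',
    'messages': [
        {'role': 'user', 'text': 'Compare Postgres and CockroachDB for multi-region SaaS'},
        {'role': 'model', 'text': '## Overview\n...report body...\n## Sources\n1. Example title https://example.com'},
    ],
}

def summarize_messages(conversation):
    """Return (role, word_count) for each message so long reports stand out."""
    summary = []
    for msg in conversation.get('messages', []):
        # Some export versions may use 'content' instead of 'text'.
        text = msg.get('text') or msg.get('content') or ''
        summary.append((msg.get('role'), len(text.split())))
    return summary

for role, words in summarize_messages(sample_conversation):
    print(f'{role}: {words} words')
```

In a real export, a Deep Research conversation would show a model message whose word count is in the thousands.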

What is in the Deep Research export

The Google Takeout Gemini Apps Activity export preserves the following for Deep Research conversations:

| Data element | In Takeout export? | Notes |
| --- | --- | --- |
| Original research prompt | Yes | Preserved as the user message immediately before the report in the conversation message list |
| Full report text (all sections) | Yes | Complete report body including all section headings, sub-headings, paragraphs, and lists, stored as the model message text field |
| Cited source URLs | Yes | Preserved as plain text in the numbered source list at the bottom of the report body; not as a separate structured array |
| Conversation timestamp | Yes | Creation timestamp on the conversation object; message-level timestamps may vary by export version |
| Report Markdown formatting | Partially | Section headings and lists are preserved as Markdown syntax within the text string; rendering fidelity depends on the export version |
| Inline citation markers ([1], [2]) | Partially | In some export versions the superscript markers are preserved as [N] text; in others they are stripped. The source list at the bottom is consistently preserved regardless |
| Follow-up messages in same conversation | Yes | If you asked follow-up questions after the report, those are preserved as additional user/model message pairs in the same conversation object |
| Research plan (planning step) | No | The outline Gemini generates before beginning the research phase is transient UI state and is not stored in the conversation data model |
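Since inline markers survive only in some export versions, a quick check on a report's text tells you which case you have. This is a rough sketch: the helper name `citation_marker_report` is hypothetical, and it assumes surviving markers appear as bracketed integers such as [1].

```python
import re

# Matches bracketed integers such as [1] or [12].
MARKER = re.compile(r'\[(\d+)\]')

def citation_marker_report(report_text):
    """Count inline [N] markers and list which source numbers they reference."""
    # Caveat: a source list formatted as "[1] Title" would also be counted.
    numbers = [int(n) for n in MARKER.findall(report_text)]
    return {
        'markers_present': bool(numbers),
        'marker_count': len(numbers),
        'distinct_sources': sorted(set(numbers)),
    }
```

If `markers_present` is false, the report still has its source list, but you can no longer tell which sentence relied on which source.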

What is NOT in the Deep Research export

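Four categories of data are generated during a Deep Research session but do not appear in the Takeout export:

  1. The intermediate search queries Gemini ran internally during the research phase.
  2. The research plan Gemini outlines before the research phase begins.
  3. The content of the web pages Gemini fetched; only the cited URLs survive.
  4. The inline citation-to-sentence mapping, which is lost wherever the [N] markers are stripped.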

How to extract citations from a Deep Research export

The Gemini Takeout JSON does not provide a separate citations array — the source list is embedded as plain text at the bottom of the report body. Extracting it programmatically requires finding the source list section within the report text. The following Python script reads a Gemini Takeout JSON file, identifies candidate Deep Research conversations by model message length, and extracts the citations section from each report:

import json
import re
import sys

WORD_THRESHOLD = 2000  # minimum words in model message to flag as Deep Research

def extract_citations(report_text):
    """Extract the numbered source list from a Deep Research report."""
    lines = report_text.splitlines()
    heading = re.compile(r'^(#{1,3}\s+)?(Sources|References|Works Cited)\b', re.IGNORECASE)
    entry = re.compile(r'^\d+[\.\)]\s+')
    # Preferred path: numbered entries that follow the last 'Sources'-style heading.
    starts = [i for i, line in enumerate(lines) if heading.match(line.strip())]
    if starts:
        citation_lines = []
        for line in lines[starts[-1] + 1:]:
            stripped = line.strip()
            if entry.match(stripped):
                citation_lines.append(stripped)
            elif stripped and citation_lines:
                break  # non-matching line after citations started: done
        return citation_lines
    # Fallback: no heading found, so take the trailing numbered-list block.
    citation_lines = []
    for line in reversed(lines):
        stripped = line.strip()
        if entry.match(stripped):
            citation_lines.append(stripped)
        elif stripped:
            break  # first non-blank, non-numbered line ends the trailing block
    return citation_lines[::-1]

def find_deep_research(export_path):
    with open(export_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    conversations = data if isinstance(data, list) else data.get('conversations', [])
    results = []
    for conv in conversations:
        messages = conv.get('messages', [])
        for idx, msg in enumerate(messages):
            if msg.get('role') != 'model':
                continue
            # Fall back to '' so a missing or null field does not crash split()
            text = msg.get('text') or msg.get('content') or ''
            word_count = len(text.split())
            if word_count >= WORD_THRESHOLD:
                # Find the user message that preceded this (the research question);
                # enumerate() gives the true position even if two messages are identical
                prompt = ''
                for prior in reversed(messages[:idx]):
                    if prior.get('role') == 'user':
                        prompt = (prior.get('text') or prior.get('content') or '')[:300]
                        break
                citations = extract_citations(text)
                results.append({
                    'conversation_id': conv.get('conversation_id', ''),
                    'timestamp': conv.get('create_time', ''),
                    'research_question': prompt,
                    'word_count': word_count,
                    'citations': citations,
                })
    return results

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else 'gemini_export.json'
    reports = find_deep_research(path)
    print(json.dumps(reports, indent=2))

Run it as python3 extract_deep_research.py MyActivity.json. The output is a JSON array of objects, one per candidate Deep Research report, with the first 300 characters of the research question, the timestamp, the word count, and the extracted citation list. The 2000-word threshold is deliberately low so that shorter reports are not missed; raise it (e.g. to 3000) to reduce false positives from unusually long standard responses, or lower it to catch even shorter reports. The citation extractor looks for a "Sources" heading or a trailing numbered list; Deep Research reports consistently end with one or the other.

Note that the Gemini Takeout JSON field names for message text and role may vary slightly across export versions (some use text, others use content). The script above checks both. If neither matches your export, inspect the raw JSON for the message structure with python3 -c "import json; d=json.load(open('MyActivity.json')); print(list(d[0]['messages'][0].keys()))" and adjust the field names accordingly.
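Each extracted citation is still a single line of text containing a title and a URL. The sketch below splits such a line into its parts. The helper name `parse_citation` is hypothetical, and it assumes the entry starts with a number followed by "." or ")" and contains at most one http(s) URL; entries in your export may be formatted differently.

```python
import re

URL = re.compile(r'https?://\S+')

def parse_citation(line):
    """Split one numbered source line into number, title, and URL (if any)."""
    match = re.match(r'^(\d+)[\.\)]\s+(.*)$', line.strip())
    if not match:
        return None  # not a numbered source entry
    number, rest = int(match.group(1)), match.group(2)
    url_match = URL.search(rest)
    # Trim punctuation that often trails a URL in running text or Markdown links.
    url = url_match.group(0).rstrip('.,;)') if url_match else ''
    # Whatever remains after removing the URL is treated as the title.
    title = URL.sub('', rest).strip(' -:()[]')
    return {'number': number, 'title': title, 'url': url}
```

Applied to every line in a report's `citations` list, this yields the source URLs in a form you can paste directly into front-matter or an ADR references section.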

Decision-capture workflow for Deep Research reports

Deep Research reports are information-dense but ephemeral unless you act at the time of the session. The following three-step workflow keeps them recoverable and useful for long-term documentation:

  1. At the time of the research session: copy the share link and the citations list. The Gemini interface has a share icon for Deep Research reports; clicking it generates a persistent gemini.google.com/app/... URL that remains accessible until you delete the conversation. Copy this URL into your notes immediately. Also copy the numbered source list from the bottom of the report into a separate note: this is the evidence trail, and it is the piece most useful for the ADR Context section. Do not rely solely on retrieving the report from Google Takeout later: export timing is unpredictable, and capturing the link and source list immediately after each session is the only reliable approach.
  2. For long-term preservation: export to a local Markdown file within 30 days of critical research sessions. Google Takeout is the authoritative export path, but do not wait for an annual or quarterly export cycle for research that informs important decisions. Request a Gemini Apps Activity export from Google Takeout within 30 days of any research session whose output you want to preserve. From the export JSON, save the full report text as a local Markdown file with front-matter: include the date, the research question you submitted, the list of cited source URLs, and the share link. A simple front-matter block looks like:
    ---
    date: 2026-06-02
    tool: Gemini Deep Research
    research_question: "Compare Postgres, CockroachDB, and PlanetScale for a multi-region SaaS with 200M rows"
    share_link: https://gemini.google.com/app/abc123
    sources:
      - https://www.cockroachlabs.com/docs/stable/multi-region-overview.html
      - https://planetscale.com/docs/concepts/sharding
      - https://www.postgresql.org/docs/current/ddl-partitioning.html
    ---
    This Markdown file is the durable artifact. It is not dependent on Google's servers, does not require re-export, and can be committed to the same repository as your ADRs.
  3. The ADR connection: use the report as the Context section. The ADR Context section asks: what situation, constraints, and evidence existed when this decision was made? A Deep Research report answers that question directly — it is a structured account of the technical landscape as it existed when you were evaluating your options. Paste a summary of the report's key findings (two to four paragraphs) into the ADR Context section, and link to the saved Markdown file for the full evidence trail. Use the report's "Alternatives Considered" or comparative sections to populate the ADR's Options section if your template includes one. The source list becomes the ADR's references section. The decision itself comes from the follow-up deliberation — not from the Deep Research report, which surveys the landscape without making the call for you.
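Step 2 of the workflow above can be scripted once the report text and source URLs have been extracted. The following is a minimal sketch that renders a Markdown file with the same front-matter fields as the example block; the function name `build_report_markdown` and its parameters are hypothetical, and the front-matter is written by hand rather than with a YAML library.

```python
import json

def build_report_markdown(date, research_question, share_link, sources, report_text):
    """Render a report as Markdown with a front-matter block like the one above."""
    lines = [
        '---',
        f'date: {date}',
        'tool: Gemini Deep Research',
        # json.dumps yields a double-quoted string, which YAML also accepts.
        f'research_question: {json.dumps(research_question)}',
        f'share_link: {share_link}',
        'sources:',
    ]
    lines.extend(f'  - {url}' for url in sources)
    lines.append('---')
    lines.append('')
    lines.append(report_text.rstrip())
    return '\n'.join(lines) + '\n'
```

Write the returned string to a file next to your ADRs, for example with `pathlib.Path('research/db-comparison.md').write_text(markdown, encoding='utf-8')`, where the path is only an illustration.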

Gemini Deep Research vs Perplexity Deep Research vs ChatGPT Browse

The three main AI research tools that produce long-form research output each have distinct export characteristics. The comparison matters because the tool you use for pre-decision research determines how much of your evidence trail is recoverable later:

| Feature | Gemini Deep Research | Perplexity Deep Research | ChatGPT GPT-4o with Browse |
| --- | --- | --- | --- |
| Typical report length | 3000–10000 words | 1500–3000 words | Varies: 500–3000 words per browsing-augmented response |
| Export in data download | Yes: full report text in Google Takeout (Gemini Apps Activity) | Partial: GDPR data request only; thread content coverage varies; no native export button | Yes: full conversation text in conversations.json via ChatGPT Settings → Data Controls → Export |
| Structured citations in export | Source list as plain text at the bottom of the report body; inline [N] markers may or may not survive export | Citations array available in API response JSON when return_citations: true; product-side export does not provide a separate citations array | Tether nodes in conversations.json encode browsed URLs per assistant message turn; not a clean citations array but extractable |
| Intermediate search queries | Not exported: only final citations preserved | Not exported: only final citations preserved | Not fully exported: tether_browsing_display nodes show per-turn query context in some export versions but not a complete search log |
| Native share link | Yes: persists until conversation is deleted; gemini.google.com/app/... URL | Yes: thread link persists in Library until deleted | Yes: chatgpt.com/share/... URL; persists until manually revoked |
| Decision-capture path | Share link + local Markdown copy at session time; full report in Google Takeout export | Manual copy at session time (no native export); GDPR request for partial retroactive recovery | conversations.json via ChatGPT Settings export; comprehensive and reliable |
| Subscription required | Yes: Gemini Advanced (Google One AI Premium) | Yes: Perplexity Pro | No: available on free tier with limits; Plus for higher volume |

The headline finding: Gemini Deep Research and ChatGPT both provide reliable export paths — Google Takeout and ChatGPT's built-in export respectively — while Perplexity's export coverage is partial and slow. For research that will inform architecture decisions, prefer tools with reliable export paths, and capture share links immediately regardless of which tool you use.

Why this matters for decision capture

Deep Research reports are the most information-dense research artifact available in a standard AI-assisted engineering workflow. A single Deep Research session can survey a technical landscape at a depth and breadth that would take a human researcher half a day, pulling together benchmark data, documentation, GitHub issues, blog posts, and community discussions into a structured synthesis. For architectural decisions, that synthesis is invaluable: it captures the state of the available options at the moment the decision was being considered, which is exactly what the ADR Context section asks for.

But Deep Research reports precede the decision. They provide the landscape; they do not make the call. The decision itself — "we chose Postgres over CockroachDB because our team has Postgres expertise and the operational complexity of global distribution is not justified at our scale" — comes from the follow-up conversation where you deliberated on what the research found. That deliberation conversation typically happens in ChatGPT or Claude, where you bring the research findings and reason through which option to take and why. That deliberation conversation is extractable via the WhyChose extractor, which reads your conversations.json export and surfaces the decisions with their original reasoning, formatted as structured decision records.

The complete picture for a well-evidenced ADR requires both artifacts: the Deep Research report (Google Takeout) as the Context section, and the ChatGPT or Claude deliberation session (conversations.json) as the Decision and Consequences sections. Preserving only one gives you an incomplete record — either a thorough literature review with no decision, or a decision with no evidence trail. The workflow is: run Deep Research to build context, save the report immediately, deliberate in ChatGPT or Claude, extract the decision with WhyChose, paste the report summary into the ADR Context section. The result is a decision record that shows both what was decided and what evidence shaped it.


Related questions

Are Gemini Deep Research reports included in Google Takeout?

Yes. When you export your Google account data via Google Takeout and select "Gemini Apps Activity", the resulting ZIP archive includes JSON files covering your Gemini conversation history. Deep Research conversations appear in that JSON as ordinary conversations, but with the final model message being very long — typically 3000 to 10000 words — because it contains the full research report. The original research prompt, the complete report text, the cited source URLs listed at the bottom of the report, and conversation timestamps are all preserved. The intermediate search queries Gemini ran internally and the research planning step are not included.

How do I find Deep Research conversations in my Gemini export?

Deep Research conversations are not flagged with a special field in the Gemini Takeout JSON — they look like standard conversations structurally. The most reliable identification method is message length: Deep Research model responses are substantially longer than standard responses, typically 2000 words or more. Load the export JSON in Python, iterate over conversations, and flag any conversation where the model message text exceeds roughly 2000 words as a Deep Research candidate. You can also look for the source list at the end of the model message text — Deep Research reports consistently end with a numbered source list or a "Sources" heading that does not appear in standard responses. The research question that triggered the report is the user message immediately preceding the long model message.

Do Deep Research exports include the intermediate web search queries?

No. The intermediate search queries that Gemini runs during the Deep Research phase are not exportable from any path. During a Deep Research session, Gemini internally executes many individual web searches to gather the information it synthesizes into the report. These queries are not surfaced to the user in the UI beyond a brief progress indicator, and they are not stored in the conversation data model. Only the final output (the research question and the completed report) is in the export. The cited source URLs at the bottom of the report give indirect evidence of some searches that ran, but the complete internal search log is not recoverable. Perplexity Deep Research has the same limitation, and ChatGPT Browse largely shares it: some export versions show per-turn query context, but none contains a complete search log.

How do I use a Gemini Deep Research report as ADR context?

A Deep Research report maps directly to the Context section of an Architecture Decision Record, which asks: what situation, constraints, and evidence existed when this decision was made? Export the report from Google Takeout or copy it to a local Markdown file with front-matter (date, research question, source URLs). When writing the ADR, paste a summary of the report's key findings into the Context section and link to the saved Markdown file for the full evidence trail. The report's comparative analysis maps to the ADR's options or alternatives section. The decision and its consequences come from the follow-up deliberation conversation in ChatGPT or Claude — not from the Deep Research report itself, which surveys the landscape but does not make the architectural call. That deliberation is extractable via WhyChose from the conversations.json export.
