Topic: Gemini Deep Research export

Gemini Deep Research export — reports, citations, and what appears in Google Takeout

Q: Are Gemini Deep Research reports included in Google Takeout?

Yes. When you request a Google Takeout export and select 'Gemini Apps Activity', the resulting ZIP archive contains JSON files covering your Gemini conversation history. Deep Research conversations appear in that JSON as ordinary conversations, but the final model message is very long (typically 3000 to 10000 words) because it contains the full research report. The original research prompt, the complete report text (all sections and subsections), the cited source URLs listed at the bottom of the report, and conversation timestamps are all preserved. Three things are not preserved: the intermediate search queries Gemini ran internally during the research phase, the research plan that Gemini generated before executing, and the content of the web pages it fetched (the export holds only the cited URLs, not every page consulted).

Q: How do I find Deep Research conversations in my Gemini export?

Deep Research conversations are not marked with a special flag in the Gemini Takeout JSON. The most reliable proxy is message length: Deep Research model responses are substantially longer than standard Gemini responses — typically 2000 words or more. In Python, load the Gemini export JSON, iterate over conversations, and for each conversation check the length of the final model message text field. Any model message exceeding roughly 2000 words is a strong candidate for a Deep Research report. You can also look for a numbered source list near the end of the message text: Deep Research reports consistently end with a 'Sources' heading or a numbered list of citation URLs, which does not appear in standard Gemini responses. Once you find candidate conversations, the research question that triggered the report is the user message immediately preceding the long model message.
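The two heuristics described above (message length and a source list near the end) can be combined into one check. This is a minimal sketch, not a definitive detector: the names `looks_like_deep_research`, `SOURCES_HEADING`, and `tail_fraction` are illustrative, and the heading variants the pattern accepts are assumptions about how the report text is formatted in your export.

```python
import re

WORD_THRESHOLD = 2000  # rough cut-off from the text above; tune for your data

# Assumed heading variants: "Sources", "## Sources", "**Sources**", etc.
SOURCES_HEADING = re.compile(
    r'^\s*(#{1,6}\s*)?\**\s*(Sources|References|Works Cited)\b',
    re.IGNORECASE,
)

def looks_like_deep_research(text, tail_fraction=0.3):
    """Return True if text is long and has a sources heading near its end."""
    if len(text.split()) < WORD_THRESHOLD:
        return False
    lines = text.splitlines()
    # Only inspect the last part of the message, where the source list sits.
    start = int(len(lines) * (1 - tail_fraction))
    return any(SOURCES_HEADING.match(line) for line in lines[start:])
```

Requiring both signals trades recall for precision: a long standard response without a source list is skipped, but so is a report whose source heading uses wording the pattern does not cover.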

Q: How do I use a Gemini Deep Research report as ADR context?

A Deep Research report is ideal material for the Context section of an Architecture Decision Record. The Context section of an ADR asks: what situation existed when this decision was made, and what constraints and evidence shaped it? That is precisely what a Deep Research report contains. The workflow is: (1) export the report from Google Takeout or copy it to a local Markdown file immediately after the session; (2) add front-matter to the Markdown file with the date, the research question you asked, and the list of cited source URLs; (3) when writing the ADR, paste a summary of the report's key findings into the Context section and link to the saved Markdown file for the full evidence trail. The decision and its consequences then come from the follow-up ChatGPT or Claude session where you deliberated on what the research found — that deliberation conversation is extractable via WhyChose, and its output populates the Decision and Consequences sections of the same ADR. The Deep Research report is the Context; the ChatGPT or Claude deliberation is the Decision.
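As a rough illustration of step (3), the sketch below renders an ADR skeleton with the Context section filled from a report summary and the other sections left as placeholders. The function name `render_adr`, its parameters, and the section layout are assumptions for illustration; adapt them to whatever ADR template your team already uses.

```python
def render_adr(title, research_question, report_path, context_summary, sources):
    """Render a minimal ADR skeleton; Decision and Consequences stay as placeholders."""
    lines = [
        f"# {title}",
        "",
        "## Context",
        "",
        f"Research question: {research_question}",
        "",
        context_summary.strip(),
        "",
        # Link to the saved Markdown copy of the full report.
        f"Full report: [{report_path}]({report_path})",
        "",
        "## Decision",
        "",
        "TODO: fill in from the deliberation session.",
        "",
        "## Consequences",
        "",
        "TODO: fill in from the deliberation session.",
        "",
        "## References",
        "",
    ]
    # The report's cited URLs become the numbered reference list.
    lines += [f"{i}. {url}" for i, url in enumerate(sources, start=1)]
    return "\n".join(lines) + "\n"
```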

Gemini Deep Research produces multi-thousand-word structured research reports from a single question. These reports appear in your Google Takeout export as part of the Gemini Apps Activity data — but only the final output is preserved. The intermediate search queries, the research plan, and the inline citation-to-sentence mapping are not. Here is exactly what survives the export and how to use Deep Research reports as durable architecture decision context.

TL;DR

Gemini Deep Research reports are preserved in Google Takeout (Gemini Apps Activity export) as long model messages within ordinary conversation JSON. The full report text, the original research question, cited source URLs, and timestamps are all in the export. What is missing: the intermediate search queries Gemini ran internally, the research planning step, and the content of every web page fetched. The inline citation markers ([1], [2]) may or may not survive as text depending on export version. For decision documentation, treat the Deep Research report as the Context section of an ADR — the decision itself is captured from the follow-up ChatGPT or Claude session where you deliberated on the findings.

What is Gemini Deep Research

Gemini Deep Research is a feature available to Gemini Advanced subscribers (Google One AI Premium, $19.99/month as of 2026). Instead of generating a single response to your question, Gemini Deep Research treats your query as a research brief: it autonomously plans a multi-step investigation, executes many individual web searches over a period of five to thirty minutes, synthesizes the findings from multiple sources, and delivers a comprehensive structured research report. The report is displayed inline in the Gemini conversation thread once the research phase completes.

The practical difference from standard Gemini is large. A standard Gemini response to a complex question is typically two to four paragraphs with a handful of citations. A Deep Research report on the same question is typically ten to forty sections with sub-headings, 3000 to 10000 words, and a numbered source list at the bottom that can contain 20 to 60 cited URLs. Deep Research is designed for the kind of thorough background investigation that would take a human researcher several hours — evaluating technology options, surveying a technical landscape, comparing competing frameworks — making it particularly valuable as pre-decision research for architectural choices.

What Gemini Deep Research reports look like in Google Takeout

Google Takeout (takeout.google.com) is the standard path for exporting Google account data. To export your Gemini conversation history including Deep Research reports, go to Google Takeout, click "Deselect all", scroll down to find "Gemini Apps Activity", check that item, then proceed to create an export. The resulting ZIP archive contains a folder with JSON files covering your Gemini conversation history. Depending on the volume of your history, this may be one file or split across multiple files, but the structure is consistent.
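Because the folder layout inside the archive can differ between exports, it may help to list the JSON members before unpacking anything. The sketch below uses only Python's standard `zipfile` module; the helper names `list_takeout_json` and `read_takeout_member` are illustrative, and no particular folder or file name inside the archive is assumed.

```python
import zipfile

def list_takeout_json(zip_path):
    """Return the names of all .json members in a Takeout ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith('.json')
        )

def read_takeout_member(zip_path, member_name):
    """Read one archive member and return it decoded as UTF-8 text."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member_name).decode('utf-8')
```

Calling `list_takeout_json` on your downloaded archive shows whether the history arrived as one file or several, without extracting the whole ZIP to disk.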

Each conversation in the export is a JSON object containing a list of messages. Each message has a role field (user or model) and a text content field. A Deep Research conversation looks like a standard Gemini conversation in the JSON structure, with one key difference: the final model message is very long. While a standard Gemini model response might be 200 to 500 words, a Deep Research report model message is typically 3000 to 10000 words — the full report text, including all section headings, body paragraphs, and the numbered source list at the bottom. The report appears as a single text content field with Markdown-style formatting: ## headings, bullet lists, numbered lists, and bold text are encoded as plain Markdown syntax within the string value. The source list at the bottom of the report is encoded as plain text — a numbered list of source entries, each containing the source title and URL.
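Since the report arrives as one Markdown-formatted string, its section structure can be recovered by scanning for heading lines. The following is a small sketch under the assumption that headings use `#`-style Markdown syntax as described above; `report_outline` is an illustrative name, not part of any export tooling.

```python
import re

# One to six '#' characters, whitespace, then the heading title.
HEADING = re.compile(r'^(#{1,6})\s+(.*\S)\s*$')

def report_outline(report_text):
    """Return (level, title) tuples for each Markdown heading in the report."""
    outline = []
    for line in report_text.splitlines():
        match = HEADING.match(line)
        if match:
            outline.append((len(match.group(1)), match.group(2)))
    return outline
```

An outline like this is a quick way to confirm that a long model message is really a structured report, and to see which sections are worth summarizing.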

What is in the Deep Research export

The Google Takeout Gemini Apps Activity export preserves the following for Deep Research conversations:

Data element	In Takeout export?	Notes
Original research prompt	Yes	Preserved as the user message immediately before the report in the conversation message list
Full report text (all sections)	Yes	Complete report body including all section headings, sub-headings, paragraphs, and lists — as the model message text field
Cited source URLs	Yes	Preserved as plain text in the numbered source list at the bottom of the report body; not as a separate structured array
Conversation timestamp	Yes	Creation timestamp on the conversation object; message-level timestamps may vary by export version
Report Markdown formatting	Partially	Section headings and lists are preserved as Markdown syntax within the text string; rendering fidelity depends on the export version
Inline citation markers ([1], [2])	Partially	In some export versions the superscript markers are preserved as [N] text; in others they are stripped. The source list at the bottom is consistently preserved regardless
Follow-up messages in same conversation	Yes	If you asked follow-up questions after the report, those are preserved as additional user/model message pairs in the same conversation object
Research plan (planning step)	No	The outline Gemini generates before beginning the research phase is transient UI state and is not stored in the conversation data model

What is NOT in the Deep Research export

Four categories of data are generated during a Deep Research session but do not appear in the Takeout export:

Intermediate search queries. During the research phase, Gemini internally executes many individual web searches — sometimes dozens for a single report. These queries are not surfaced in the UI (beyond a brief "searching the web" progress indicator) and are not stored in the conversation data model. Only the final output is stored. The cited source URLs give indirect evidence of some searches Gemini ran, but the complete search query log is not recoverable from any export path.
Full content of fetched web pages. Gemini fetches and reads the content of many web pages during the research phase. Only the URLs of sources it chose to cite appear in the report's source list. The content of every page consulted — including pages Gemini read but did not cite — is not stored. This is analogous to a human researcher's browser history: only the sources they chose to reference end up in the bibliography.
Structured citation-to-sentence mapping. Deep Research reports use superscript citation markers like [1] and [2] to indicate which source supports which claim. The source list at the bottom maps these numbers to URLs. In the export, the source list is plain text and any inline [N] markers that survive are also plain text within the prose. There is no machine-parseable data structure that maps "the claim in sentence X was supported by source Y" — only the prose with embedded markers and the list at the bottom. Reconstructing the precise citation-to-claim mapping from the export requires parsing the text.
Research branching and abandoned sub-queries. If Gemini began a research direction and found it unproductive, that branch is not logged anywhere accessible. The report only reflects the synthesis of the research paths that produced usable findings. There is no audit trail of what Gemini tried and discarded during the investigation phase.
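Where inline [N] markers do survive, an approximate citation-to-claim mapping can be rebuilt by parsing the text. The sketch below is deliberately naive and rests on stated assumptions: markers appear as plain `[N]` text, they sit before the sentence's closing punctuation, and sentences are split on `.`, `!`, or `?` followed by whitespace (so abbreviations such as "e.g." will split early). The function names are illustrative.

```python
import re

MARKER = re.compile(r'\[(\d+)\]')
SOURCE_LINE = re.compile(r'^\s*(\d+)[.)]\s+(.*\S)')

def parse_sources(source_lines):
    """Map source number -> entry text from lines like '1. Title https://...'."""
    sources = {}
    for line in source_lines:
        match = SOURCE_LINE.match(line)
        if match:
            sources[int(match.group(1))] = match.group(2)
    return sources

def map_claims_to_sources(body_text, sources):
    """Return (sentence, [source entries]) for each sentence with [N] markers."""
    mapping = []
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    for sentence in re.split(r'(?<=[.!?])\s+', body_text.strip()):
        numbers = [int(n) for n in MARKER.findall(sentence)]
        if numbers:
            cited = [sources.get(n, '<missing source>') for n in numbers]
            mapping.append((sentence, cited))
    return mapping
```

If your export stripped the inline markers, `map_claims_to_sources` returns an empty list and only the source list itself is recoverable.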

How to extract citations from a Deep Research export

The Gemini Takeout JSON does not provide a separate citations array — the source list is embedded as plain text at the bottom of the report body. Extracting it programmatically requires finding the source list section within the report text. The following Python script reads a Gemini Takeout JSON file, identifies candidate Deep Research conversations by model message length, and extracts the citations section from each report:

import json
import re
import sys

WORD_THRESHOLD = 2000  # minimum words in model message to flag as Deep Research

def extract_citations(report_text):
    """Extract the numbered source list from a Deep Research report."""
    # Citations are expected after a 'Sources', 'References', or 'Works Cited'
    # heading. Collect the numbered entries that follow that heading.
    lines = report_text.splitlines()
    citation_lines = []
    in_citations = False
    for line in lines:
        if re.match(r'^(#{1,3}\s+)?(Sources|References|Works Cited)', line, re.IGNORECASE):
            in_citations = True
            continue
        if in_citations and re.match(r'^\d+[\.\)]\s+', line):
            citation_lines.append(line.strip())
        elif in_citations and line.strip() == '':
            continue  # allow blank lines within citations block
        elif in_citations and citation_lines:
            break  # non-matching line after citations started — done
    return citation_lines

def find_deep_research(export_path):
    with open(export_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    conversations = data if isinstance(data, list) else data.get('conversations', [])
    results = []
    for conv in conversations:
        messages = conv.get('messages', [])
        for idx, msg in enumerate(messages):
            if msg.get('role') != 'model':
                continue
            text = msg.get('text', '') or msg.get('content', '')
            word_count = len(text.split())
            if word_count >= WORD_THRESHOLD:
                # Find the user message that preceded this (the research question).
                # idx comes from enumerate(); messages.index(msg) would return the
                # first equal message, which is wrong when two messages are identical.
                prompt = ''
                for prior in reversed(messages[:idx]):
                    if prior.get('role') == 'user':
                        prompt = (prior.get('text', '') or prior.get('content', ''))[:300]
                        break
                citations = extract_citations(text)
                results.append({
                    'conversation_id': conv.get('conversation_id', ''),
                    'timestamp': conv.get('create_time', ''),
                    'research_question': prompt,
                    'word_count': word_count,
                    'citations': citations,
                })
    return results

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else 'gemini_export.json'
    reports = find_deep_research(path)
    print(json.dumps(reports, indent=2))

Run it as python3 extract_deep_research.py MyActivity.json. The output is a JSON array of objects (one per candidate Deep Research report) with the research question, timestamp, word count, and extracted citation list. The 2000-word threshold is conservative; adjust it up (e.g. 3000) to reduce false positives from unusually long standard responses, or down to catch shorter reports. The citation extractor looks for a "Sources", "References", or "Works Cited" heading and collects the numbered entries that follow it. If a report ends with a numbered source list that has no such heading, the script returns an empty citations list for it, so check those reports by hand.

Note that the Gemini Takeout JSON field names for message text and role may vary slightly across export versions (some use text, others use content). The script above checks both. If neither matches your export, inspect the raw JSON for the message structure with python3 -c "import json; d=json.load(open('MyActivity.json')); print(list(d[0]['messages'][0].keys()))" and adjust the field names accordingly.
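The citation lines returned by the script still mix titles and URLs. If you only need the bare URLs (for example, to list them in front-matter), a small helper like the one below can pull them out. This is a sketch: `urls_from_citations` is an illustrative name, and the pattern stops at whitespace, angle brackets, and closing brackets or parentheses, so URLs that legitimately contain parentheses will be truncated.

```python
import re

# Stop at whitespace, angle brackets, ')' or ']' (a simplifying assumption).
URL_PATTERN = re.compile(r'https?://[^\s<>)\]]+')

def urls_from_citations(citation_lines):
    """Return de-duplicated URLs from citation lines, in first-seen order."""
    seen = set()
    urls = []
    for line in citation_lines:
        for url in URL_PATTERN.findall(line):
            url = url.rstrip('.,;')  # trim trailing sentence punctuation
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls
```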

Decision-capture workflow for Deep Research reports

Deep Research reports are information-dense but ephemeral unless you act at the time of the session. The following three-step workflow keeps them recoverable and useful for long-term documentation:

At the time of the research session: copy the share link and the citations list. The Gemini interface has a share icon for Deep Research reports. Clicking it generates a persistent gemini.google.com/app/... URL that remains accessible until you delete the conversation. Copy this URL into your notes immediately. Also copy the numbered source list from the bottom of the report into a separate note: this is the evidence trail, and it is the piece most useful for the ADR Context section. Do not rely solely on retrieving the report from Google Takeout later: export timing is unpredictable, and copying the link and source list immediately after each session is the only reliable approach.
For long-term preservation: export to a local Markdown file within 30 days of critical research sessions. Google Takeout is the authoritative export path, but do not wait for an annual or quarterly export cycle for research that informs important decisions. Request a Gemini Apps Activity export from Google Takeout within 30 days of any research session whose output you want to preserve. From the export JSON, save the full report text as a local Markdown file with front-matter: include the date, the research question you submitted, the list of cited source URLs, and the share link. A simple front-matter block looks like:
```
---
date: 2026-06-02
tool: Gemini Deep Research
research_question: "Compare Postgres, CockroachDB, and PlanetScale for a multi-region SaaS with 200M rows"
share_link: https://gemini.google.com/app/abc123
sources:
  - https://www.cockroachlabs.com/docs/stable/multi-region-overview.html
  - https://planetscale.com/docs/concepts/sharding
  - https://www.postgresql.org/docs/current/ddl-partitioning.html
---
```
This Markdown file is the durable artifact. It is not dependent on Google's servers, does not require re-export, and can be committed to the same repository as your ADRs.
The ADR connection: use the report as the Context section. The ADR Context section asks: what situation, constraints, and evidence existed when this decision was made? A Deep Research report answers that question directly — it is a structured account of the technical landscape as it existed when you were evaluating your options. Paste a summary of the report's key findings (two to four paragraphs) into the ADR Context section, and link to the saved Markdown file for the full evidence trail. Use the report's "Alternatives Considered" or comparative sections to populate the ADR's Options section if your template includes one. The source list becomes the ADR's references section. The decision itself comes from the follow-up deliberation — not from the Deep Research report, which surveys the landscape without making the call for you.
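The front-matter block from the second step can be generated rather than typed by hand. The sketch below builds the Markdown file contents from values you already have; the function name `build_report_markdown` and its parameters are illustrative, and the field names simply mirror the example block above. The research question is written with `json.dumps`, which produces a double-quoted, escaped string that YAML parsers also accept.

```python
import json

def build_report_markdown(date, research_question, share_link, sources, report_text):
    """Return the report as Markdown with a YAML-style front-matter block."""
    front = [
        '---',
        f'date: {date}',
        'tool: Gemini Deep Research',
        # json.dumps escapes embedded quotes so the value stays one scalar.
        f'research_question: {json.dumps(research_question)}',
        f'share_link: {share_link}',
        'sources:',
    ]
    front += [f'  - {url}' for url in sources]
    front.append('---')
    return '\n'.join(front) + '\n\n' + report_text.strip() + '\n'
```

To save the result, pass the returned string to `pathlib.Path(...).write_text(..., encoding='utf-8')` with a file name of your choosing, then commit the file alongside your ADRs.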

Gemini Deep Research vs Perplexity Deep Research vs ChatGPT Browse

The three main AI research tools that produce long-form research output each have distinct export characteristics. The comparison matters because the tool you use for pre-decision research determines how much of your evidence trail is recoverable later:

Feature	Gemini Deep Research	Perplexity Deep Research	ChatGPT GPT-4o with Browse
Typical report length	3000–10000 words	1500–3000 words	Varies — 500–3000 words per browsing-augmented response
Export in data download	Yes — full report text in Google Takeout (Gemini Apps Activity)	Partial — GDPR data request only; thread content coverage varies; no native export button	Yes — full conversation text in conversations.json via ChatGPT Settings → Data Controls → Export
Structured citations in export	Source list as plain text at the bottom of the report body; inline [N] markers may or may not survive export	Citations array available in API response JSON when `return_citations: true`; product-side export does not provide a separate citations array	Tether nodes in conversations.json encode browsed URLs per assistant message turn; not a clean citations array but extractable
Intermediate search queries	Not exported — only final citations preserved	Not exported — only final citations preserved	Not fully exported — `tether_browsing_display` nodes show per-turn query context in some export versions but not a complete search log
Native share link	Yes — persists until conversation is deleted; `gemini.google.com/app/...` URL	Yes — thread link persists in Library until deleted	Yes — `chatgpt.com/share/...` URL; persists until manually revoked
Decision-capture path	Share link + local Markdown copy at session time; full report in Google Takeout export	Manual copy at session time (no native export); GDPR request for partial retroactive recovery	conversations.json via ChatGPT Settings export; comprehensive and reliable
Subscription required	Yes — Gemini Advanced (Google One AI Premium)	Yes — Perplexity Pro	No — available on free tier with limits; Plus for higher volume

The headline finding: Gemini Deep Research and ChatGPT both provide reliable export paths — Google Takeout and ChatGPT's built-in export respectively — while Perplexity's export coverage is partial and slow. For research that will inform architecture decisions, prefer tools with reliable export paths, and capture share links immediately regardless of which tool you use.

Why this matters for decision capture

Deep Research reports are the most information-dense research artifact available in a standard AI-assisted engineering workflow. A single Deep Research session can survey a technical landscape at a depth and breadth that would take a human researcher half a day, pulling together benchmark data, documentation, GitHub issues, blog posts, and community discussions into a structured synthesis. For architectural decisions, that synthesis is invaluable: it captures what was true about the state of the available options at the moment the decision was being considered, which is exactly what the ADR Context section asks for.

But Deep Research reports precede the decision. They provide the landscape; they do not make the call. The decision itself — "we chose Postgres over CockroachDB because our team has Postgres expertise and the operational complexity of global distribution is not justified at our scale" — comes from the follow-up conversation where you deliberated on what the research found. That deliberation conversation typically happens in ChatGPT or Claude, where you bring the research findings and reason through which option to take and why. That deliberation conversation is extractable via the WhyChose extractor, which reads your conversations.json export and surfaces the decisions with their original reasoning, formatted as structured decision records.

The complete picture for a well-evidenced ADR requires both artifacts: the Deep Research report (Google Takeout) as the Context section, and the ChatGPT or Claude deliberation session (conversations.json) as the Decision and Consequences sections. Preserving only one gives you an incomplete record — either a thorough literature review with no decision, or a decision with no evidence trail. The workflow is: run Deep Research to build context, save the report immediately, deliberate in ChatGPT or Claude, extract the decision with WhyChose, paste the report summary into the ADR Context section. The result is a decision record that shows both what was decided and what evidence shaped it.
