Topic: gemini conversation export

Gemini Conversation Export — How to Get Your Google Gemini Chats Out (2026)

Q: What's missing from the export that I'd expect to be there?

Five things. (1) Gem configurations — the system instructions you wrote for custom Gems are not in the export. (2) Per-conversation grouping — turns are flat-listed by timestamp, with no thread/session ID, so multi-turn conversations appear as adjacent entries you have to re-stitch by clock proximity. (3) Generated images and audio — the prompts that produced them are present, but the binary outputs are not included. (4) Workspace-account separation — Gemini for Google Workspace uses a separate admin export controlled by your workspace admin; personal Takeout returns nothing for those chats even if you used both accounts. (5) Tool use — when Gemini called a tool (search, code execution, Workspace integration), the tool inputs and outputs are summarized in the rendered HTML but not exposed as structured records.

Google Gemini does not ship an in-app export. The only path is Google Takeout, and the only output that preserves full turns is HTML rather than JSON, a meaningful step backward from what ChatGPT and Claude offer. Here's the request flow, what's missing, and the parsing recipes that make the archive usable.

TL;DR

Go to takeout.google.com → click Deselect all → scroll to My Activity → click All activity data included → deselect everything except Gemini Apps → click OK → choose HTML as the format → request a one-time export. Google emails a download link 4–48 hours later (longer than ChatGPT or Claude, which deliver in minutes). The archive is a directory containing one MyActivity.html file (or several, split by month) with prompts and responses interleaved as flat <div> entries. No usable JSON, no per-conversation grouping, no Gem configurations. Use the short parsing script below to convert it to a Claude-shaped JSON file (gemini-conversations.json) that the same loaders that handle ChatGPT and Claude exports can read.

Why this guide is shorter on glamour and longer on workarounds

If you came here from how to export your ChatGPT history or how to export your Claude conversations expecting a similar in-app button and a clean JSON file, the honest answer is: that's not what Google ships. Gemini's export story is a deliberate Google decision — the chat history is treated as an activity log, like Search and Maps and YouTube comments, rather than as a first-class application surface like Gmail (which exports as MBOX) or Drive (which exports as native file types). The result is that everything downstream — archiving, search, decision extraction — has an extra parsing step compared to the JSON-emitting platforms.

This page exists because the chat-history extraction cluster on whychose.com covers ChatGPT and Claude in depth (12 pages between them) and was missing the third platform that senior engineers actually use. Senior engineers who treat Gemini as a thinking partner — typically for Google Workspace integration, Vertex AI prototyping, or for the long-context use cases Gemini 2.5 handles well — get less back from the export than they got out of ChatGPT or Claude. Knowing the gap up front lets you plan around it.

How to request the export — the exact path

Open takeout.google.com in a browser. Sign in with the Google account whose Gemini history you want. If you used Gemini under both a personal account and a Workspace account, you have to request separate exports for each account.
Click "Deselect all." The default Takeout configuration includes 50+ products; selecting them all makes the export take days and ship 30+ GB of data you don't want.
Scroll to "My Activity" in the product list and check it.
Click "All activity data included" (the link below the My Activity checkbox). A modal appears listing every product whose activity is bundled.
Click "Deselect all" inside the modal, then check only Gemini Apps. (Do not also check "Search" or "Maps" unless you want those — they bloat the archive.) Click OK.
Click "Multiple formats" (still inside the My Activity row) and confirm the activity format is HTML. JSON is offered as a format choice, but for Gemini specifically the JSON output is shallower than the HTML — Gemini turns are emitted as one-line activity stubs in JSON whereas the HTML preserves the full prose. Counter-intuitive but true.
Click "Next step" at the bottom of the page.
Choose delivery method — "Send download link via email" is fine for one-off exports; "Add to Drive" if you want it to land in your Drive automatically. Pick a one-time export (not scheduled), file type ZIP, file size 50 GB.
Click "Create export." Google Takeout queues the request. The page updates to "Export in progress."
Wait 4–48 hours. Yes, really. Takeout's queue depth varies; for Gemini-only requests the typical turnaround is 4–8 hours, but enterprise Workspace accounts or accounts with years of activity can take a full day or two. Google emails when it's ready. The email comes from noreply-takeout@google.com with subject "Your Google data archive is ready."

What's actually in the archive

The ZIP unpacks to a directory tree:

Takeout/
└── My Activity/
    └── Gemini Apps/
        └── MyActivity.html

One MyActivity.html file containing every Gemini turn you've ever made. For accounts with significant history (50,000+ turns), Takeout splits this into per-month files (MyActivity-2024-12.html, etc.) — same shape, smaller per-file.
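If your archive came back split by month, the converter needs every file, not just the first. A minimal sketch of the file selection, assuming the split files follow the MyActivity-YYYY-MM.html naming described above; pickActivityFiles is a hypothetical helper, not something Takeout or the converter script provides:

```javascript
// Sketch: choose which files in the "Gemini Apps" folder to parse.
// Assumption: split files are named MyActivity-YYYY-MM.html, unsplit exports MyActivity.html.
const ACTIVITY_FILE = /^MyActivity(-\d{4}-\d{2})?\.html$/;

function pickActivityFiles(names) {
  // Keep only activity files; a plain string sort puts the per-month files in chronological order.
  return names.filter((name) => ACTIVITY_FILE.test(name)).sort();
}

// Example with hard-coded names; in practice pass in the directory listing.
const picked = pickActivityFiles([
  'MyActivity-2025-01.html',
  'notes.txt',
  'MyActivity-2024-12.html',
]);
console.log(picked);
```

In a real run you would feed it the result of readdirSync on the Gemini Apps folder and convert each returned file in turn.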

Each turn is a flat <div> with three things: the timestamp, the prompt or response prose, and a "details" link. The shape is roughly:

<div class="content-cell">
  Used Gemini Apps<br>
  <a href="...">Prompted with: "Should we adopt CockroachDB for the metrics service?"</a><br>
  Apr 28, 2026, 10:42:13 AM PDT
</div>
<div class="content-cell">
  Used Gemini Apps<br>
  Response: "CockroachDB is a strong fit for global multi-region writes..."<br>
  Apr 28, 2026, 10:42:24 AM PDT
</div>

Two things to notice. First, "Prompted with:" and "Response:" are the only role markers: there is no "sender": "user" or "role": "assistant" field, just the prefix in the prose. Second, there is no thread ID. Adjacent prompt/response pairs that sit close together in time (typically <30 seconds apart) belong to the same conversation, but you have to reconstruct that grouping yourself.
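The prefix-as-role-marker rule can be sketched as a small pure function. This is an illustration under assumptions: parseCell is a hypothetical helper, it expects the cell's visible text with the tags already stripped, and the timestamp pattern only covers the "Apr 28, 2026, 10:42:13 AM PDT" shape shown above.

```javascript
// Sketch: classify one cell by its prefix and pull out the prose and timestamp.
// Assumption: `text` is the cell's visible text, with <br> tags reduced to whitespace.
const PROMPT = /^Used Gemini Apps\s+Prompted with:\s*"?([\s\S]+?)"?\s+([A-Z][a-z]+ \d+, \d+, \d+:\d+:\d+ [AP]M [A-Z]+)$/;
const RESPONSE = /^Used Gemini Apps\s+Response:\s*"?([\s\S]+?)"?\s+([A-Z][a-z]+ \d+, \d+, \d+:\d+:\d+ [AP]M [A-Z]+)$/;

function parseCell(text) {
  const trimmed = text.trim();
  const prompt = trimmed.match(PROMPT);
  if (prompt) return { role: 'user', text: prompt[1], timestamp: prompt[2] };
  const response = trimmed.match(RESPONSE);
  if (response) return { role: 'assistant', text: response[1], timestamp: response[2] };
  return null; // neither a prompt nor a response cell
}

const sample = 'Used Gemini Apps\n  Response: "CockroachDB is a strong fit..."\n  Apr 28, 2026, 10:42:24 AM PDT';
console.log(parseCell(sample));
```

Because the trailing timestamp is the right-hand boundary, quotes inside the prose do not end the capture early.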

What's missing — five things you'd expect

Gem configurations. Custom Gems (Gemini's equivalent of OpenAI's custom GPTs) carry system instructions you wrote and a knowledge base of files you uploaded. The system instructions are not in the export. The knowledge-base files are not in the export. Conversations with a Gem appear as regular turns with no marker indicating which Gem they belong to. If your Gems are load-bearing for your workflow, document them out-of-band before relying on the export.
Per-conversation grouping. Turns are listed in reverse-chronological order by timestamp, full stop. Multi-turn conversations are adjacent in the file but not grouped — there is no conversation_id, no thread URL, no UUID. The reconstructed-by-clock-proximity heuristic works for 80% of conversations but breaks for: (a) conversations you re-opened hours later (gap > 30 minutes makes them look like separate threads); (b) parallel conversations you ran in different browser tabs simultaneously (their turns interleave by timestamp); (c) very fast multi-turn conversations where prompts and responses arrive within a second of each other (timing alone can't distinguish a four-turn conversation from two two-turn ones).
Generated images, audio, and video. If you used Gemini's image generation, audio output, or video output, the prompts that produced those outputs are in the export — the binary outputs are not. The HTML rows reference the generations by attribution text ("Created an image") but the files are absent. To preserve those, save them at generation time inside the Gemini UI; they cannot be recovered after the fact.
Workspace-account separation. Gemini for Google Workspace runs under your work account, governed by your Workspace admin. Personal Takeout requests against your @gmail.com account return nothing for chats you had under @company.com — even if both accounts use the same browser. To get Workspace-account chat history, your Workspace admin has to run an admin-side export through the Vault product, not Takeout. Most admins do not realize Vault holds Gemini history; you may need to point them at the https://support.google.com/vault documentation that confirms it.
Tool use details. When Gemini called a tool (Google Search, code execution, Workspace integration like Calendar or Gmail), the inputs and outputs are summarized in the rendered HTML — "Searched for X and found Y" — but the tool-call structure itself is not exposed. ChatGPT's export preserves tool-use as a distinct "role": "tool" message with structured input/output; Gemini collapses it into prose. For workflows that want to audit which tools the model relied on, this is a meaningful gap.

Parsing the HTML into something usable

The minimum viable conversion is HTML → flat array of turn objects, each with a timestamp, a role guess, and the prose. The short script below uses Node and cheerio for the HTML parse, groups the turns into conversations, then writes one JSON file in a shape close enough to Claude's that the same downstream tools can consume it.

#!/usr/bin/env node
// gemini-html-to-json.js: convert Takeout MyActivity.html → conversations.json
// Usage: node gemini-html-to-json.js MyActivity.html > gemini-conversations.json
// Requires: npm install cheerio, plus "type": "module" in package.json (or save as .mjs) for the imports.
import { readFileSync } from 'node:fs';
import * as cheerio from 'cheerio';

const html = readFileSync(process.argv[2], 'utf8');
const $ = cheerio.load(html);

// Both patterns use the trailing timestamp as the right-hand boundary; the closing quote is optional.
const PROMPT_RE = /^Used Gemini Apps\s+Prompted with:\s*"?([\s\S]+?)"?\s+([A-Z][a-z]+ \d+, \d+, \d+:\d+:\d+ [AP]M [A-Z]+)$/;
const RESP_RE   = /^Used Gemini Apps\s+Response:\s*"?([\s\S]+?)"?\s+([A-Z][a-z]+ \d+, \d+, \d+:\d+:\d+ [AP]M [A-Z]+)$/;

const turns = [];
$('.content-cell').each((_, el) => {
  const text = $(el).text().trim();
  // Only "Prompted with:" and "Response:" cells are turns; every other cell is skipped.
  const promptMatch = text.match(PROMPT_RE);
  const match = promptMatch ?? text.match(RESP_RE);
  if (!match) return;

  // Parsing strings like "Apr 28, 2026, 10:42:13 AM PDT" is engine-dependent, and an
  // unfamiliar zone abbreviation yields an Invalid Date. Report those instead of crashing.
  const when = new Date(match[2]);
  if (Number.isNaN(when.getTime())) {
    console.error(`Skipped turn with unparseable timestamp: ${match[2]}`);
    return;
  }
  turns.push({ role: promptMatch ? 'user' : 'assistant', text: match[1].trim(), ts: when.toISOString() });
});

// Reconstruct conversations: adjacent turns within 30 minutes of each other are one thread.
const THIRTY_MIN = 30 * 60 * 1000;
turns.sort((a, b) => new Date(a.ts) - new Date(b.ts));
const conversations = [];
let current = null;
for (const t of turns) {
  if (!current || new Date(t.ts) - new Date(current.last_ts) > THIRTY_MIN) {
    current = { uuid: `gemini-${t.ts}`, name: t.text.slice(0, 60), created_at: t.ts, last_ts: t.ts, chat_messages: [] };
    conversations.push(current);
  }
  current.chat_messages.push({ sender: t.role === 'user' ? 'human' : 'assistant', text: t.text });
  current.last_ts = t.ts;
}
process.stdout.write(JSON.stringify(conversations, null, 2));

Run it: node gemini-html-to-json.js MyActivity.html > gemini-conversations.json. The output shape mirrors Claude's chat_messages array intentionally — the WhyChose extractor's Claude loader can read this with no changes, which is why the script targets that shape rather than ChatGPT's mapping DAG. (ChatGPT's shape is more expressive but harder to write into; Claude's is the lowest-common-denominator we lean on for cross-platform normalization.)

The four edge cases the script handles imperfectly

Conversations split by long pauses. The 30-minute heuristic is the best you can do without thread IDs. For conversations where you stepped away for an hour and came back, the script will create two adjacent thread objects that you'd consider one. The downstream WhyChose extractor handles this gracefully: decision detection works per-turn, so split conversations don't silently lose decisions, they just get attributed to two threads instead of one. To merge more aggressively (fewer, longer threads), raise the threshold to 4 hours; to split more aggressively (more, shorter threads), drop it to 5 minutes.
Parallel browser tabs. If you ran two conversations in two tabs simultaneously, their turns interleave by timestamp and the script will merge them into one thread. There is no signal in the export to detect this. The only mitigation is human review of the reconstructed conversations.json — look for threads whose turns alternate between unrelated topics.
Special-character escapes in prompts. Quotes, ampersands, and Unicode in your prompt prose are HTML-entity-escaped in the export (&quot;, &amp;, &#39;). Cheerio's .text() de-escapes the basics, but the regex parsing on "Prompted with:" boundaries can misfire on prompts that contained literal quotes. The fix is the regex itself: make the closing quote optional (the script above already does this with "?) and rely on the timestamp as the right-hand boundary instead.
Tool-use prose interleaved with response prose. When Gemini ran a Google Search mid-response, the HTML emits a separate "Searched for X" cell adjacent to the response cell. The script ignores those (they don't start with "Used Gemini Apps Prompted with:" or "Used Gemini Apps Response:"), but this means tool-augmented responses come back to the script with the search context stripped out. For decision extraction this is usually fine — the model's reasoning about the search results is in the response prose anyway — but for forensic analysis ("what did Gemini search before recommending CockroachDB?") you'd need to extend the script to capture the search-cell text and inline it.
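The threshold trade-off from the first edge case is easier to see with the gap as a parameter. A sketch, assuming turns carry ISO ts strings like the ones the script produces; groupTurns is a hypothetical helper that returns plain arrays of turns rather than full conversation objects:

```javascript
// Sketch: start a new thread wherever the gap between consecutive turns exceeds maxGapMs.
function groupTurns(turns, maxGapMs = 30 * 60 * 1000) {
  const sorted = [...turns].sort((a, b) => Date.parse(a.ts) - Date.parse(b.ts));
  const threads = [];
  let previous = null;
  for (const turn of sorted) {
    // The first turn has no predecessor, so it always opens a thread.
    const gap = previous ? Date.parse(turn.ts) - Date.parse(previous.ts) : Infinity;
    if (gap > maxGapMs) threads.push([]);
    threads[threads.length - 1].push(turn);
    previous = turn;
  }
  return threads;
}

const HOUR = 60 * 60 * 1000;
const sampleTurns = [
  { role: 'user', ts: '2026-04-28T17:42:13.000Z' },
  { role: 'assistant', ts: '2026-04-28T17:42:24.000Z' },
  { role: 'user', ts: '2026-04-28T18:52:00.000Z' }, // resumed about 70 minutes later
];
console.log(groupTurns(sampleTurns).length);           // 2 threads with the 30-minute default
console.log(groupTurns(sampleTurns, 4 * HOUR).length); // 1 thread with a 4-hour gap
```

Neither setting helps with parallel tabs: interleaved turns are close together in time, so any gap-based rule will merge them.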

Three sanity checks for the converted JSON

Turn count matches. jq '[.[] | .chat_messages[]] | length' gemini-conversations.json should approximately equal the number of "Prompted with:" + "Response:" occurrences in the source HTML. A large discrepancy means the regex missed a turn shape.
Conversation count is sane. If you have ~1,000 turns and the script produces 50 conversations, the average conversation length is 20 turns — reasonable for a thinking-partner workflow. If it produces 800 conversations, the threshold is too tight; if it produces 5, too loose.
No empty conversations. jq '[.[] | select(.chat_messages | length == 0)] | length' gemini-conversations.json should be 0. The script creates a conversation only when it has a turn to add, so a non-zero count means the JSON was edited afterwards or produced by a different converter.
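If you'd rather not depend on jq, the same checks can be approximated in Node. A sketch with hypothetical helper names countMarkers and summarize; it assumes the Claude-style chat_messages shape the converter emits:

```javascript
// Sketch: the three sanity checks as plain functions.
function countMarkers(html) {
  // Count "Prompted with:" and "Response:" occurrences in the raw export text.
  return (html.match(/Prompted with:|Response:/g) ?? []).length;
}

function summarize(conversations) {
  const sizes = conversations.map((c) => c.chat_messages.length);
  const turns = sizes.reduce((sum, n) => sum + n, 0);
  return {
    conversations: sizes.length,
    turns,
    averageTurns: sizes.length ? turns / sizes.length : 0,
    empty: sizes.filter((n) => n === 0).length,
  };
}

// Tiny in-memory example; in practice read the HTML and JSON files from disk.
const report = summarize([
  { chat_messages: [{ sender: 'human', text: 'q' }, { sender: 'assistant', text: 'a' }] },
]);
console.log(report, countMarkers('Prompted with: "q" ... Response: "a"'));
```

Compare turns against countMarkers on the source HTML. The marker count is only approximate, since a response whose prose happens to contain "Response:" is counted twice.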

How WhyChose fits in

The conversion gets you to a normalized JSON; the extractor takes it from there. The WhyChose open-source extractor already supports the Claude-shape chat_messages array the script above emits — drop gemini-conversations.json into the uploader (or pipe it to the CLI) and the same regex + Artifact + LLM passes that work on ChatGPT and Claude exports run on the Gemini-derived data. Decision records are emitted in the same shape regardless of source, so a multi-platform decision audit (some conversations from ChatGPT, some from Claude, some from Gemini) produces one unified log. Per-conversation metadata is thinner on the Gemini side (no project_uuid, reconstructed conversation boundaries) but the extracted decisions themselves are platform-agnostic — the trade-offs you reasoned about and the choice you locked in don't depend on which model you reasoned with.
