<h1 align="center">
<a href="https://prompts.chat">
Generate panel images for investigations using Gemini in the browser via Chrome MCP extension. Uses the **multi-tab fire-and-harvest** approach for throughput with reliable single-tab download accuracy.
Use ONE model for an entire investigation. Mixing models creates visual inconsistency.
Default: Gemini. Superior quality, better artistic interpretation, better atmosphere.
Switch to ChatGPT ONLY when:
scene-plan.md exists with panel prompts
projects/{id}/panels/{style}/
Instead of generating one panel at a time (waiting 60-90s each for image + text response), open N tabs, fire a prompt into each one, wait once for all to finish, then go back and download from each tab.
PASS 1 — FIRE (fast, ~30s for 4 tabs):
Tab A: click input → prime → type prompt → send
Tab B: click input → prime → type prompt → send
Tab C: click input → prime → type prompt → send
Tab D: click input → prime → type prompt → send
WAIT (~60-90s for all 4 to generate images + text responses)
PASS 2 — HARVEST (fast, ~30s for 4 tabs):
Tab A: scroll down → hover image → click download icon
Tab B: scroll down → hover image → click download icon
Tab C: scroll down → hover image → click download icon
Tab D: scroll down → hover image → click download icon
MOVE FILES (~10s):
Check ~/Downloads for Gemini files
Move in timestamp order → panel-NN.png (oldest = first tab fired)
Effective throughput: ~30s per panel vs ~90s sequential. 3x faster.
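
The throughput estimate can be checked by summing the phase durations quoted above. This is a minimal arithmetic sketch; the constant and function names are illustrative, and the durations are the rough estimates from the text (which rounds the result to about 30s per panel), not measurements.

```python
# Phase durations in seconds for one 4-tab batch, taken from the estimates above.
TABS = 4
FIRE = 30                  # Pass 1: fire prompts into all tabs
WAIT = (60, 90)            # generation time (best, worst), shared by all tabs
HARVEST = 30               # Pass 2: download from all tabs
MOVE = 10                  # move files out of ~/Downloads
SEQUENTIAL_PER_PANEL = 90  # one panel at a time, worst case


def batch_seconds_per_panel() -> tuple[float, float]:
    """Return the (best, worst) per-panel time for one 4-tab batch."""
    fixed = FIRE + HARVEST + MOVE
    return (fixed + WAIT[0]) / TABS, (fixed + WAIT[1]) / TABS


best, worst = batch_seconds_per_panel()
print(f"batch: {best:.1f}-{worst:.1f}s per panel")
print(f"speedup vs sequential: {SEQUENTIAL_PER_PANEL / worst:.2f}x-{SEQUENTIAL_PER_PANEL / best:.2f}x")
```
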
1. Create N tabs (4 is the sweet spot — more creates context overhead):
tabs_create_mcp (repeat N-1 times, first tab already exists)
2. Navigate ALL tabs to Gemini simultaneously:
navigate(url: "https://gemini.google.com/app", tabId: each)
→ Can fire all navigate calls in parallel
3. Wait 5 seconds for all pages to fully load
4. WARM UP each tab by taking a screenshot:
screenshot(tabId: each)
→ This is CRITICAL. The Chrome extension doesn't properly connect to a tab
→ until it takes a screenshot of it. Without this, type() silently drops text.
→ This was the #1 cause of failed prompts on fresh tabs.
5. Record the tab-to-panel mapping:
Tab {tabId_A} → Panel {NN}
Tab {tabId_B} → Panel {NN+1}
Tab {tabId_C} → Panel {NN+2}
Tab {tabId_D} → Panel {NN+3}
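
The tab-to-panel mapping in step 5 can be kept as a small dictionary so the harvest pass knows which panel each tab holds. A minimal sketch, assuming tab IDs are plain integers as in the worked example further down; `map_tabs_to_panels` is a hypothetical helper, not part of the Chrome MCP extension.

```python
def map_tabs_to_panels(tab_ids: list[int], first_panel: int) -> dict[int, int]:
    """Assign consecutive panel numbers to tabs in the order they will be fired."""
    return {tab_id: first_panel + offset for offset, tab_id in enumerate(tab_ids)}


# Tab IDs copied from the worked example in this document.
mapping = map_tabs_to_panels(
    [2082987463, 2082987464, 2082987465, 2082987466], first_panel=15
)
for tab_id, panel in mapping.items():
    print(f"Tab {tab_id} -> Panel {panel:02d}")
```
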
For EACH tab, in sequence:
PER TAB:
1. Click the input field area (coordinate ~762, 286 for full viewport)
2. Wait 1 second
3. PRIME the input: type "x" then Ctrl+A → Backspace
→ This is CRITICAL. Fresh Gemini pages silently drop the first typed text.
→ Typing a single character "x" then clearing it primes the input to accept real text.
4. Type the prompt (ASCII only — no em dashes, curly quotes, or Unicode)
5. Click the send arrow (coordinate ~1143, 372 for full viewport)
→ If viewport is narrower (multiple tabs visible), coordinates shift:
1145px wide: send at ~955, 362
1524px wide: send at ~1143, 372
→ The active/focused tab gets full viewport width
6. Verify: URL should change from /app to /app/{hash}
→ If URL didn't change: the text or send click failed. Retry.
7. Move to next tab immediately — DON'T wait for generation
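
Two of the steps above lend themselves to small helpers: forcing a prompt to ASCII before typing it (step 4) and checking that the URL moved from /app to /app/{hash} after sending (step 6). The sketch below is illustrative only; `REPLACEMENTS`, `to_ascii_prompt`, and `prompt_was_sent` are hypothetical names, and the replacement table is an assumption that covers only the most common typographic characters.

```python
import unicodedata
from urllib.parse import urlparse

# Common typographic characters and ASCII stand-ins (illustrative, not exhaustive).
REPLACEMENTS = {
    "\u2014": "-",    # em dash
    "\u2013": "-",    # en dash
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2026": "...",  # ellipsis
}


def to_ascii_prompt(prompt: str) -> str:
    """Replace known typographic characters, then drop anything still non-ASCII."""
    for char, ascii_equivalent in REPLACEMENTS.items():
        prompt = prompt.replace(char, ascii_equivalent)
    # NFKD splits accented letters into base letter + combining mark; the
    # ASCII encode step then discards the mark and any other leftovers.
    decomposed = unicodedata.normalize("NFKD", prompt)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def prompt_was_sent(url: str) -> bool:
    """True if the URL path looks like /app/{hash} rather than bare /app."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    return len(parts) >= 2 and parts[0] == "app"
```

If `prompt_was_sent` returns False for the tab's current URL, the retry described in step 6 applies.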
Critical: The "x" prime trick.
Fresh Gemini pages have a bug where the first computer(type) action is silently dropped.
The input field appears to accept text but then clears it. Typing "x" → Ctrl+A → Backspace
forces the input into an active state. After priming, the real prompt types correctly.
Coordinate awareness: When you switch between tabs, the viewport width may change (Chrome resizes tabs). The input field and send button coordinates shift. Always use coordinates appropriate for the current viewport width. Screenshot to verify if unsure.
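
The two viewport widths noted in the fire steps can be kept in a lookup table. This is a rough sketch assuming only those two observations; `SEND_BUTTON_BY_WIDTH` and `send_button_for` are hypothetical names, and a screenshot remains the reliable check for any other width.

```python
# Send-button coordinates observed at two viewport widths (from the notes above).
SEND_BUTTON_BY_WIDTH = {
    1145: (955, 362),
    1524: (1143, 372),
}


def send_button_for(viewport_width: int) -> tuple[int, int]:
    """Return the coordinates recorded for the closest known viewport width."""
    closest = min(SEND_BUTTON_BY_WIDTH, key=lambda width: abs(width - viewport_width))
    return SEND_BUTTON_BY_WIDTH[closest]


print(send_button_for(1524))
print(send_button_for(1200))
```
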
Wait 60-90 seconds for all tabs to generate images + text responses.
Use: computer(wait, duration: 10) repeated 6-9 times.
Don't check individual tabs during this time — let them all cook.
The text responses are the bottleneck (~30-60s each), but they run in parallel.
For EACH tab, in sequence:
PER TAB:
1. Scroll down to find the generated image:
scroll(direction: down, amount: 10)
→ The image is below the prompt message, above the input field
2. Verify the image is fully generated:
→ Mic icon visible (not stop button) = generation complete
→ If stop button still visible: wait 10s more, or skip to next tab and come back
3. Hover over the image to reveal download icons:
hover(coordinate: ~700, 240)
→ Three icons appear in top-right corner: [share] [copy] [download ↓]
4. Click the download icon (rightmost):
click(coordinate: ~900, 135)
→ This is the arrow-into-tray icon
→ DO NOT click the image center — that opens the lightbox
→ Click the TOP-RIGHT area where the download icon appears
5. Move to next tab immediately — don't wait for download to complete
1. Wait 15 seconds after last download click (files are 8-10MB each)
2. List downloaded files by timestamp, oldest first:
ls -ltr ~/Downloads/Gemini_Generated_Image*.png
3. Map files to panels by timestamp order:
→ Oldest file = first tab's download (Panel NN)
→ Newest file = last tab's download (Panel NN+3)
→ This works because hover-downloads are triggered sequentially
4. Move each file:
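mv ~/Downloads/Gemini_Generated_Image_{hash}.png panels/{style}/panel-{NN}.png
→ Run one mv per file; a glob that matches several files cannot be moved to a single target name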
5. Verify count matches expected (should be N files for N tabs)
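
The move steps above can be sketched as a single function that sorts the downloads by modification time and renames them in that order. This is a minimal sketch, not the workflow's actual tooling: `move_downloads_to_panels` and its parameters are hypothetical, and it assumes no other Gemini downloads are sitting in the folder.

```python
import shutil
from pathlib import Path


def move_downloads_to_panels(downloads: Path, panel_dir: Path,
                             first_panel: int, expected: int) -> list[Path]:
    """Rename Gemini downloads to panel-NN.png, oldest file first."""
    files = sorted(downloads.glob("Gemini_Generated_Image*.png"),
                   key=lambda f: f.stat().st_mtime)
    if len(files) != expected:
        # Refuse to guess: a missing or extra file would shift every panel number.
        raise RuntimeError(f"expected {expected} files, found {len(files)}")
    panel_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for offset, src in enumerate(files):
        dest = panel_dir / f"panel-{first_panel + offset:02d}.png"
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


# Example call (the target path is a placeholder for panels/{style}):
# move_downloads_to_panels(Path.home() / "Downloads", Path("panels/my-style"),
#                          first_panel=15, expected=4)
```

Raising on a count mismatch mirrors step 5: if a download failed, it is safer to stop than to assign the wrong image to a panel number.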
The same tabs can be reused for the next batch of panels.
Each tab already has one conversation — the next prompt goes into the same chat.
PER TAB (for subsequent batches):
1. Click the input field (coordinate ~762, 491 — below the image)
2. Wait 1 second
3. Type the next prompt (no need to prime — only fresh pages need priming)
4. Click send (coordinate ~1143, 542 — send arrow shifts down when input has text)
5. Move to next tab
IMPORTANT: Wait for ALL tabs' previous responses to fully complete (mic icon visible)
before typing the next prompt. If Gemini is still generating text (stop button visible),
the input field is locked and typed text gets dropped.
1. DO NOT click the image (that opens the lightbox)
2. HOVER over the image thumbnail in the chat
3. Three small circular icons appear in the top-right corner:
[share] [copy] [download ↓]
4. Click the download icon (rightmost, arrow-into-tray)
5. File appears as: Gemini_Generated_Image_{hash}.png in ~/Downloads
Why not lightbox? The lightbox "Download full-sized image" button works ~30% of the time. The hover-icon method works consistently.
Why 15 seconds wait? Files are 8-10MB. They take time to write to disk. Checking at 3-5 seconds causes false negatives that waste time in retry loops.
No hard session limit observed. Sessions can handle many images without refreshing.
If text responses start taking >30s consistently, navigate to fresh /app for that tab.
| Panels remaining | Batch size | Why |
|---|---|---|
| 1-3 | 1-3 tabs | One tab per panel; opening unused tabs is not worth the setup overhead |
| 4-8 | 4 tabs | Sweet spot — manageable, good throughput |
| 9-16 | 4 tabs x 3-4 batches | Reuse tabs across batches |
| 17-25 | 4 tabs x 5-7 batches | Full investigation |
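
The batch sizes in the table follow from splitting the remaining panels into groups of at most four. A minimal sketch of that split, with `plan_batches` as a hypothetical helper name:

```python
def plan_batches(first_panel: int, last_panel: int, max_tabs: int = 4) -> list[list[int]]:
    """Split an inclusive panel range into batches of at most `max_tabs` panels."""
    panels = list(range(first_panel, last_panel + 1))
    return [panels[i:i + max_tabs] for i in range(0, len(panels), max_tabs)]


# Panels 15-25 remaining: two full batches and one partial batch.
for batch in plan_batches(15, 25):
    print(batch)
```
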

| Problem | Fix |
|---|---|
| Text vanishes on fresh page | Prime with "x" → Ctrl+A → Backspace |
| Send button doesn't click | Screenshot to find exact coordinates — they shift with viewport width |
| Image not visible after scrolling | Scroll more (down 10 ticks), or scroll up if overshot |
| Download icon click opens lightbox | You clicked the image center, not the icon. Hover first, then click top-right (~900, 135) |
| Multiple Gemini files, unsure which is which | Use timestamp order — oldest = first tab downloaded |
| Tab still generating when trying to type next batch | Wait for mic icon. Don't type during generation. |
| Merged prompts (two prompts concatenated) | Previous response was stopped, leaving stale text. Clear with Ctrl+A → Backspace. |
SETUP:
Tab 2082987463 → Panel 15
Tab 2082987464 → Panel 16
Tab 2082987465 → Panel 17
Tab 2082987466 → Panel 18
FIRE (30s):
Tab A: click input → x → Ctrl+A Backspace → type prompt 15 → click send ✓
Tab B: click input → x → Ctrl+A Backspace → type prompt 16 → click send ✓
Tab C: click input → x → Ctrl+A Backspace → type prompt 17 → click send ✓
Tab D: click input → x → Ctrl+A Backspace → type prompt 18 → click send ✓
WAIT (60-90s):
wait 10s x 6-9 times
HARVEST (30s):
Tab A: scroll down → hover → click download icon
Tab B: scroll down → hover → click download icon
Tab C: scroll down → hover → click download icon
Tab D: scroll down → hover → click download icon
MOVE (10s):
ls -lt ~/Downloads/Gemini*.png → 4 files
oldest → panel-15.png
next → panel-16.png
next → panel-17.png
newest → panel-18.png
TOTAL: ~2 minutes for 4 panels = 30s/panel
vs sequential: ~6 minutes for 4 panels = 90s/panel
panel-{NN}.png (zero-padded: 01-25)
After each batch, spot-check 1-2 panels:
Read projects/{id}/panels/{style}/panel-{NN}.png
Compare to scene-plan.md description
If it doesn't match: delete and regenerate in next batch
Full validation of all 25 panels happens after all batches complete.
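
Before the full validation pass, a quick existence check can confirm that every panel file is present under the zero-padded naming scheme. This is an assumed pre-check rather than part of the workflow described here; `missing_panels` is a hypothetical helper, and it only checks file names, not image content.

```python
from pathlib import Path


def missing_panels(panel_dir: Path, total: int = 25) -> list[int]:
    """Return panel numbers (1..total) with no panel-NN.png in panel_dir."""
    return [n for n in range(1, total + 1)
            if not (panel_dir / f"panel-{n:02d}.png").is_file()]


# Example call (the path is a placeholder for projects/{id}/panels/{style}):
# print(missing_panels(Path("projects/my-id/panels/my-style")))
```
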