<h1 align="center">
<a href="https://prompts.chat">
<a href="https://prompts.chat">
Extract MAXIMUM structured intelligence from ALL project source files — not just knowledge.json, but the raw production files that contain data the knowledge tagging step may have missed or compressed. This skill governs the TypeScript extraction module (knowledge-extractor.ts) that feeds the GatorSquare platform database and knowledge graph.
This is the quality backbone of the knowledge graph. It stays private — never exposed to users.
The production pipeline's Step 4.7 (knowledge-tagging.md) tells Claude what to extract INTO knowledge.json. This skill tells the platform how to INGEST from ALL sources and cross-reference them to catch everything.
Every complete GatorSquare project has up to 8 source files. The extractor reads ALL of them:
| # | File | Type | What It Contains | What It Adds to the Graph |
|---|---|---|---|---|
| 1 | investigation.json | JSON | Assembled panel data — narration, images, acts, metadata | Title, subtitle, panel count, art style, narration text |
| 2 | metadata.json | JSON | Project metadata — tags, scope, period, figures | Tags, systems_mapped, geographic_scope, key_figures |
| 3 | knowledge.json | JSON | Pre-extracted entities, causality, emotions (Step 4.7 output) | Entities (8 categories), causal links/chains/trees, cross-project links, themes, emotion arc, visual elements, panel logic |
| 4 | prompts.json | JSON | Image generation prompts — characters, environments per panel | Character names → entities.people, Environment names → entities.places |
| 5 | script.md | Markdown | Raw narration text per panel — dialogue, arguments, stage directions | Per-panel narration text, dialogue extraction |
| 6 | scene-plan.md | Markdown | Visual composition per panel — composition, mood, camera | Mood descriptions, composition details, camera angles, additional visual elements |
| 7 | brief.md | Markdown | Deep research — facts, dates, sources, historical context | Research summary (investigation hook text) |
| 8 | plot.md | Markdown | Narrative structure — act breakdowns, moment descriptions | Act structure with panel ranges and descriptions |
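Because a project has "up to" 8 files, the extractor presumably has to tolerate missing ones. A minimal sketch of such an inventory check follows; the file names come from the table above, while `SOURCE_FILES`, `SourceInventory`, and `inventorySources` are hypothetical names, not taken from `knowledge-extractor.ts`.

```typescript
// The eight source files from the table above, with their formats.
type SourceFormat = "json" | "markdown";

const SOURCE_FILES: Record<string, SourceFormat> = {
  "investigation.json": "json",
  "metadata.json": "json",
  "knowledge.json": "json",
  "prompts.json": "json",
  "script.md": "markdown",
  "scene-plan.md": "markdown",
  "brief.md": "markdown",
  "plot.md": "markdown",
};

interface SourceInventory {
  present: string[];
  missing: string[];
}

// Hypothetical helper: compare the files found in a project directory
// against the expected set, so a missing source is reported rather than fatal.
function inventorySources(filesInProject: string[]): SourceInventory {
  const found = new Set(filesInProject);
  const present: string[] = [];
  const missing: string[] = [];
  for (const name of Object.keys(SOURCE_FILES)) {
    (found.has(name) ? present : missing).push(name);
  }
  return { present, missing };
}

// Example: a project that only has three of the eight files.
const inventory = inventorySources(["investigation.json", "knowledge.json", "script.md"]);
```

Reporting the `missing` list instead of throwing keeps partial projects ingestible, which matches the "up to 8" wording.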
The prompts.json file contains characters[] and environments[] arrays per panel that are NOT captured by knowledge.json. These are structured entity data — named characters and specific locations — that the knowledge tagging step doesn't extract because it runs before prompt generation.
Example from prompts.json:
{
  "panel_id": "panel-05",
  "characters": ["John Maynard Keynes"],
  "environments": ["hotel desk", "private office"],
  "prompt_text": "Close-medium portrait of an older British economist..."
}
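A rough sketch of how per-panel arrays like these could be folded into `entities.people` and `entities.places`. The filtering heuristics are assumptions, not the extractor's actual rules: a name counts as a proper noun if every word starts with an uppercase letter, and abstract environments are dropped via a small blocklist (only "blueprint" is attested in this document). `PromptPanel`, `isProperName`, and `mergePromptEntities` are hypothetical names.

```typescript
// Shape of one prompts.json panel entry, following the example above.
interface PromptPanel {
  panel_id: string;
  characters: string[];
  environments: string[];
  prompt_text: string;
}

// Assumption: a proper-noun name has every word starting with an uppercase
// letter. This rejects generic labels such as "delegates".
function isProperName(name: string): boolean {
  const words = name.trim().split(/\s+/);
  return words.every((word) => /^[A-Z]/.test(word));
}

// Assumption: abstract environment terms are filtered with a blocklist.
const ABSTRACT_ENVIRONMENTS = new Set<string>(["blueprint"]);

function mergePromptEntities(panels: PromptPanel[]): { people: string[]; places: string[] } {
  const people = new Set<string>();
  const places = new Set<string>();
  for (const panel of panels) {
    for (const name of panel.characters) {
      if (isProperName(name)) people.add(name.trim());
    }
    for (const env of panel.environments) {
      const place = env.trim();
      if (place !== "" && !ABSTRACT_ENVIRONMENTS.has(place.toLowerCase())) {
        places.add(place);
      }
    }
  }
  // Sets keep insertion order, so output order follows panel order.
  return { people: Array.from(people), places: Array.from(places) };
}

const merged = mergePromptEntities([
  {
    panel_id: "panel-05",
    characters: ["John Maynard Keynes", "delegates"],
    environments: ["hotel desk", "private office", "blueprint"],
    prompt_text: "Close-medium portrait of an older British economist...",
  },
]);
```

The capitalization test is deliberately crude; names with lowercase particles (for example "de" or "van") would need a more careful rule.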
The extractor merges:
- characters[] → entities.people (only proper-noun names, not generic "delegates")
- environments[] → entities.places (filtered, no abstracts like "blueprint")

Two variants of the knowledge.json structure exist across projects:
project_knowledge variant:
- knowledge.project_knowledge.entities → merged into entities
- knowledge.project_knowledge.causal_chains → stored as causal_chains
- knowledge.project_knowledge.themes → stored as themes
- knowledge.project_knowledge.cross_project_links → stored as cross_project_links

project_level variant:
- knowledge.project_level.entities → merged into entities
- knowledge.project_level.causal_chains → stored as causal_chains
- knowledge.project_level.causal_trees → stored as causal_trees (branching cascades)
- knowledge.project_level.cross_project_links → stored as cross_project_links
- knowledge.project_level.emotion_arc → stored as emotion_arc
- knowledge.project_level.dominant_emotions → merged into emotions
The extractor handles both variants and deduplicates across them.
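A minimal sketch of what handling both variants might look like for three of the fields. Several things here are assumptions: the object shapes (`CausalChain` with a `chain` array, `CrossProjectLink` with `project` and `type`), the case-insensitive entity deduplication, and the idea that the `chain.join("|")` and project + type keys mentioned in this document apply to causal chains and cross-project links respectively. All type and function names are hypothetical, and the sample values are illustrative only.

```typescript
type Entities = Record<string, string[]>;
interface CausalChain { chain: string[] }
interface CrossProjectLink { project: string; type: string }

interface KnowledgeBlock {
  entities?: Entities;
  causal_chains?: CausalChain[];
  cross_project_links?: CrossProjectLink[];
}

// A knowledge.json file may use either top-level key.
interface KnowledgeFile {
  project_knowledge?: KnowledgeBlock;
  project_level?: KnowledgeBlock;
}

// Keep the first item seen for each key.
function dedupeBy<T>(items: T[], keyOf: (item: T) => string): T[] {
  const seen = new Set<string>();
  return items.filter((item) => {
    const key = keyOf(item);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

function normalizeKnowledge(file: KnowledgeFile): Required<KnowledgeBlock> {
  const blocks = [file.project_knowledge, file.project_level].filter(
    (block): block is KnowledgeBlock => block !== undefined,
  );
  const entities: Entities = {};
  const chains: CausalChain[] = [];
  const links: CrossProjectLink[] = [];
  for (const block of blocks) {
    const source = block.entities ?? {};
    for (const category of Object.keys(source)) {
      const combined = [...(entities[category] ?? []), ...(source[category] ?? [])];
      // Assumption: entity names are compared case-insensitively.
      entities[category] = dedupeBy(combined, (name) => name.toLowerCase());
    }
    chains.push(...(block.causal_chains ?? []));
    links.push(...(block.cross_project_links ?? []));
  }
  return {
    entities,
    causal_chains: dedupeBy(chains, (c) => c.chain.join("|")),
    cross_project_links: dedupeBy(links, (l) => `${l.project}|${l.type}`),
  };
}

// Illustrative input only; real files normally use one variant, not both.
const normalized = normalizeKnowledge({
  project_knowledge: {
    entities: { institutions: ["IMF", "World Bank"] },
    causal_chains: [{ chain: ["gold outflow", "dollar-gold peg ends"] }],
  },
  project_level: {
    entities: { institutions: ["imf", "WTO"] },
    causal_chains: [{ chain: ["gold outflow", "dollar-gold peg ends"] }],
    cross_project_links: [{ project: "opium", type: "shared_mechanism" }],
  },
});
```

Reading both keys through one code path means a project that mixes variants, or switches between them later, does not need a special case.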
Four scene-plan.md formats exist across projects:
| Format | Example | Mood Field |
|---|---|---|
| Standard | **Mood:** Weight. Scale. | **Mood:** (colon inside bold) |
| Opium variant | **Mood**: Solemn. Decisive. | **Mood**: (colon outside bold) |
| Banana-republic variant | **Palette:** Deep amber dawn + **Visual Metaphor:** Democracy under bombardment | Uses Palette and Visual Metaphor instead of Mood |
| Green-revolution variant | No Mood field — uses **Key elements:** | No mood extraction possible |
The extractor handles all four. When no Mood field exists, the Visual Metaphor is used as the mood fallback; when neither is present (as in the Green-revolution variant), no mood is extracted.
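The format handling above can be sketched with one regular expression per field that accepts the colon on either side of the closing bold marker. This is an illustration under stated assumptions, not the extractor's actual parser: `fieldPattern` and `extractMood` are hypothetical names, and the input is assumed to be the text of a single panel block with one field per line.

```typescript
// Builds a pattern matching "**Label:** text" (colon inside bold)
// and "**Label**: text" (colon outside bold), capturing the rest of the line.
function fieldPattern(label: string): RegExp {
  return new RegExp(`\\*\\*${label}(?::\\*\\*|\\*\\*:)\\s*(.+)`);
}

const MOOD_FIELD = fieldPattern("Mood");
const VISUAL_METAPHOR_FIELD = fieldPattern("Visual Metaphor");

// Returns the mood for one panel block of scene-plan.md. Falls back to the
// Visual Metaphor field, and returns null when neither field is present.
function extractMood(panelText: string): string | null {
  const match = MOOD_FIELD.exec(panelText) ?? VISUAL_METAPHOR_FIELD.exec(panelText);
  return match?.[1]?.trim() ?? null;
}

const standardMood = extractMood("**Mood:** Weight. Scale.");
const opiumMood = extractMood("**Mood**: Solemn. Decisive.");
const fallbackMood = extractMood(
  "**Palette:** Deep amber dawn\n**Visual Metaphor:** Democracy under bombardment",
);
const noMood = extractMood("**Key elements:** (no mood field in this format)");
```

Returning `null` rather than an empty string lets the caller distinguish "no mood field in this format" from "mood field present but blank".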
- from|relation|to composite key.
- chain.join("|") key.
- project + type composite key.

Tag categories are derived from entity names, grouped by the 8 entity categories:
people: ["John Maynard Keynes", "Harry Dexter White"]
institutions: ["IMF", "World Bank", "WTO"]
concepts: ["reserve currency", "conditionality", "exorbitant privilege"]
mechanisms: ["structural adjustment", "dollar-gold peg"]
systems: ["Bretton Woods System"]
commodities: ["gold", "oil"]
This is displayed as color-coded pills on the project page — a quick visual index of what the investigation covers.
After extraction, the stats should show:
| Metric | Minimum Expected | Red Flag If |
|---|---|---|
| Entities | 100+ for a 25-panel project | < 50 |
| Causal links | 40+ | < 20 |
| Causal chains | 3+ | 0 |
| Cross-project links | 3+ | 0 |
| Themes | 3+ | 0 |
| Emotions | 5+ unique | < 3 |
| Emotion arc | Same as panel count | Mismatched |
| Visual elements | 50+ | < 20 |
| Panel logic | Same as panel count | < 50% of panels |
| Narrations | Same as panel count | 0 (if script.md exists) |
| Scene directions | Same as panel count | 0 (if scene-plan.md exists) |
| Moods | 10+ unique | 0 (if scene-plan has Mood field) |
| Research summary | 200+ chars | 0 (if brief.md exists) |
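The "Red Flag If" column lends itself to an automated check after each extraction run. The sketch below covers only the unconditional rows; the rows that depend on whether a source file exists (narrations, scene directions, moods, research summary) are left out for brevity. `ExtractionStats`, its field names, and `redFlags` are hypothetical, and the sample numbers are made up for illustration.

```typescript
interface ExtractionStats {
  panelCount: number;
  entities: number;
  causalLinks: number;
  causalChains: number;
  crossProjectLinks: number;
  themes: number;
  uniqueEmotions: number;
  emotionArcLength: number;
  visualElements: number;
  panelLogic: number;
}

// Returns one message per red-flag condition from the table above.
function redFlags(stats: ExtractionStats): string[] {
  const flags: string[] = [];
  if (stats.entities < 50) flags.push("entities < 50");
  if (stats.causalLinks < 20) flags.push("causal links < 20");
  if (stats.causalChains === 0) flags.push("no causal chains");
  if (stats.crossProjectLinks === 0) flags.push("no cross-project links");
  if (stats.themes === 0) flags.push("no themes");
  if (stats.uniqueEmotions < 3) flags.push("unique emotions < 3");
  if (stats.emotionArcLength !== stats.panelCount) flags.push("emotion arc length mismatched");
  if (stats.visualElements < 20) flags.push("visual elements < 20");
  if (stats.panelLogic < stats.panelCount * 0.5) flags.push("panel logic < 50% of panels");
  return flags;
}

// Made-up stats for a 25-panel project with two problems.
const sampleFlags = redFlags({
  panelCount: 25,
  entities: 42,
  causalLinks: 45,
  causalChains: 3,
  crossProjectLinks: 0,
  themes: 4,
  uniqueEmotions: 6,
  emotionArcLength: 25,
  visualElements: 60,
  panelLogic: 25,
});
```

Collecting every flag instead of stopping at the first makes a single run enough to see everything that needs re-extraction.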
The extraction module is TypeScript, not a Claude production skill. It runs at build/seed time, not during content production. The relationship is:
[Production Pipeline]
  Step 4.7: knowledge-tagging.md skill
    → Claude extracts → knowledge.json

[Platform Ingestion]
  deep-knowledge-extraction (this skill)
    → TypeScript module reads ALL 8 files
    → Cross-references, deduplicates, normalizes
    → Writes to SQLite database
    → Feeds knowledge graph + project pages
The production skill generates one file. The extraction module reads everything. Both are non-negotiable. Both stay private.