Started as a joke about a 'Trump Dashboard' and evolved into a high-velocity experiment in market intelligence, bridging the gap between noise and actionable strategy.

A self-auditing AI market analyst: daily session predictions with self-grading, biweekly logic audits, and monthly 90-day strategic outlooks.
It started as a joke. I was talking to a friend about the latest Trump headline and said, "Maybe I should just get Claude to build me a 'Trump Dashboard' so we can see how the market reacts to his words." But I quickly realized I didn't want a project with an expiration date, one that would go stale the moment his term ends.
The idea pivoted into something more permanent: an institutional-grade premarket terminal, bootstrapped entirely on free tools and sources. The true problem was embarrassingly mundane: I kept showing up to the NY open underprepared. Not because I wasn't watching, but because premarket noise is genuinely hard to synthesize fast. You've got futures moving on overnight Nikkei action, tariff tweets still echoing from Asia hours, a Fed speaker at 9:00 AM ET, and three sector ETFs pointing in different directions. By the time you've assembled a coherent read, the opening bar has already printed and the trade is gone.
The original v1 was a Gemini notebook, a manual "run this, read that" flow. It worked fine but required me to actually do it. V2 was the real question: could I build something that woke itself up, gathered the data, formed a view, committed it to a dashboard, and then (this is the part I found genuinely interesting) graded itself against closing prices at end of day?
Then I found Antigravity on a whim. The realization of what I could actually do with it changed the scope instantly. Countless hours after that were a blur of iterating across different AI models, from Gemini Pro/Flash to Sonnet to GPT-5.4. 421 functional commits over 28 days; that’s the scale of the iteration once the pipeline was live, with change after change as I kept re-evaluating the analysis design and data sources.
The backbone is a four-workflow GitHub Actions pipeline, each with a distinct ownership boundary:
daily_update.yml fires on a dense polling schedule across a wide UTC window that covers the full pre-open range in both EDT and EST, with extra retry slots to survive the occasional GitHub scheduler miss. None of this relies on a DST-aware cron parser. A Python gatekeeper (check_market_time.py) runs first and emits should_run: true/false based on the current New York time, and it lets through at most one early pass and one preopen pass per session. The workflow then runs fetch_market_data.py (yfinance + RSS), fetch_macro_data.py, and finally build_dashboard.py --target premarket, which is where the actual intelligence work happens.
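A minimal sketch of how a gatekeeper like this could work. The window boundaries, the function names, and the in-memory `completed` set are assumptions for illustration; the text does not say where check_market_time.py draws its windows or how it remembers which passes have already run.

```python
from datetime import datetime, time, timezone
from typing import Optional
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")

# Hypothetical pass windows in New York local time (not taken from the project).
WINDOWS = {
    "early": (time(6, 0), time(8, 0)),
    "preopen": (time(8, 30), time(9, 25)),
}


def classify_pass(now_utc: datetime) -> Optional[str]:
    """Return the pass name for this moment, or None outside every window."""
    now_ny = now_utc.astimezone(NY)  # zoneinfo applies EDT or EST as needed
    if now_ny.weekday() >= 5:  # Saturday or Sunday
        return None
    for name, (start, end) in WINDOWS.items():
        if start <= now_ny.time() < end:
            return name
    return None


def should_run(now_utc: datetime, completed: set) -> bool:
    """Allow at most one run per (session date, pass name)."""
    name = classify_pass(now_utc)
    if name is None:
        return False
    key = (now_utc.astimezone(NY).date().isoformat(), name)
    if key in completed:
        return False
    completed.add(key)
    return True


if __name__ == "__main__":
    allowed = should_run(datetime.now(timezone.utc), set())
    print(f"should_run={str(allowed).lower()}")
```

In a real workflow the `completed` set would have to persist between cron firings (for example in a committed state file), since each GitHub Actions run starts from a clean process.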
The model layer runs through gemini_util.py, a small fallback orchestrator I built around the free-tier API constraints. The hierarchy is gemini-3-flash-preview → gemini-2.5-flash → gemini-3.1-flash-lite-preview. Each call auto-injects a thinking_config appropriate to the workflow type: narrative generation and meta-analysis get thinking_level: HIGH; grading gets MEDIUM; news pre-screening gets LOW. On a 429 or 503, it sleeps 2 seconds and falls back to the next model. Simple, but it holds up under daily operation.
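The fallback logic described above can be sketched roughly as follows. To avoid guessing at the Gemini SDK, the actual API call is passed in as a `call_model` function; that parameter, the `TransientModelError` class, and the workflow keys in `THINKING_LEVEL` are all hypothetical names, not the contents of gemini_util.py.

```python
import time
from typing import Callable

MODEL_CHAIN = [
    "gemini-3-flash-preview",
    "gemini-2.5-flash",
    "gemini-3.1-flash-lite-preview",
]

# Thinking level per workflow type, as described in the text (keys are assumed).
THINKING_LEVEL = {
    "narrative": "HIGH",
    "meta": "HIGH",
    "grading": "MEDIUM",
    "news_screen": "LOW",
}


class TransientModelError(Exception):
    """Hypothetical wrapper for an HTTP error returned by the model API."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def generate_with_fallback(
    prompt: str,
    workflow: str,
    call_model: Callable[[str, str, dict], str],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each model in order, falling back on 429 or 503."""
    config = {"thinking_level": THINKING_LEVEL[workflow]}
    last_error = None
    for model in MODEL_CHAIN:
        try:
            return call_model(model, prompt, config)
        except TransientModelError as err:
            if err.status not in (429, 503):
                raise  # anything else is not a rate-limit or overload problem
            last_error = err
            sleep(2)  # brief pause before trying the next model
    raise RuntimeError("all models in the fallback chain failed") from last_error
```

Injecting `call_model` and `sleep` keeps the orchestration testable without network access or real delays.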
narrative_config.json is the regime controller, and it's the architectural decision I'm most satisfied with. Instead of hardcoding "tariff sentiment" or "Fed pivot thesis" into the prompts, all narrative framing is driven by a JSON config that defines the active regime, primary actor, drivers, and sector sensitivities. Right now the active regime is Trump Policy Uncertainty. When the dominant market driver shifts (say, from trade policy to a credit event), I update one JSON file and the entire analysis pipeline re-frames itself. The AI reads the config at runtime; it doesn't need new prompts.
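To make the idea concrete, here is a sketch of what a config-driven prompt preamble might look like. The key names follow the four things the config is said to define, but the exact schema, the example values, and `build_regime_preamble` are assumptions rather than the real file.

```python
import json

# Illustrative stand-in for narrative_config.json (values are made up).
EXAMPLE_CONFIG = """
{
  "active_regime": "Trump Policy Uncertainty",
  "primary_actor": "Donald Trump",
  "drivers": ["tariffs", "trade policy"],
  "sector_sensitivities": {"industrials": "high", "utilities": "low"}
}
"""


def build_regime_preamble(config: dict) -> str:
    """Render the regime block that gets prepended to every analysis prompt."""
    sectors = ", ".join(
        f"{name} ({level})"
        for name, level in sorted(config["sector_sensitivities"].items())
    )
    drivers = ", ".join(config["drivers"])
    return (
        f"Active regime: {config['active_regime']}\n"
        f"Primary actor: {config['primary_actor']}\n"
        f"Drivers: {drivers}\n"
        f"Sector sensitivities: {sectors}"
    )


config = json.loads(EXAMPLE_CONFIG)
preamble = build_regime_preamble(config)
print(preamble)
```

Swapping the regime then means editing the JSON only; the prompt template and the rendering code stay untouched.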
The evening review (daily_review.yml) runs postmarket and ingests actual closing data via fetch_closing_data.py. It prompts Gemini to score the morning's predictions against realized price action across eight rubric dimensions: direction_score, session_direction_score, sector_score, narrative_score, watch_score, driver_relevance_score, archetype_score, and conviction_calibration_score. Scores land in track_record.json.
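A rough sketch of the write-back step, assuming track_record.json is a JSON list with one entry per session. That layout, the `date` and `scores` keys, and both function names are assumptions; only the eight dimension names come from the text.

```python
import json
import tempfile
from pathlib import Path

RUBRIC = (
    "direction_score",
    "session_direction_score",
    "sector_score",
    "narrative_score",
    "watch_score",
    "driver_relevance_score",
    "archetype_score",
    "conviction_calibration_score",
)


def validate_scores(scores: dict) -> dict:
    """Reject a scorecard that is missing a dimension or has unknown ones."""
    missing = [dim for dim in RUBRIC if dim not in scores]
    extra = [key for key in scores if key not in RUBRIC]
    if missing or extra:
        raise ValueError(f"bad scorecard: missing={missing} extra={extra}")
    return {dim: float(scores[dim]) for dim in RUBRIC}


def append_scorecard(path: Path, session_date: str, scores: dict) -> list:
    """Append one validated session to the track record and return the record."""
    record = json.loads(path.read_text()) if path.exists() else []
    record.append({"date": session_date, "scores": validate_scores(scores)})
    path.write_text(json.dumps(record, indent=2))
    return record


if __name__ == "__main__":
    demo_scores = {dim: 0.5 for dim in RUBRIC}  # placeholder values
    with tempfile.TemporaryDirectory() as tmp:
        record = append_scorecard(Path(tmp) / "track_record.json", "2025-07-07", demo_scores)
        print(len(record), "session(s) on record")
```

Validating the model's scorecard before it is written keeps a malformed response from polluting the history that later analysis depends on.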
The meta-analysis (biweekly_meta.yml) fires every two weeks and reads the accumulated track_record.json to identify failure patterns: which drivers are consistently over-called, whether the system is short-horizon biased (gets the open right, misses the close), and whether the current regime config is still valid or needs rotation. It proposes prompt patches and config amendments as structured JSON, which I review via accept_meta_proposals.py.
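In the project the pattern-finding is done by the model, but the kind of aggregate it works from can be sketched deterministically. The record layout, the 0.2 threshold, and the choice to read short-horizon bias as the gap between direction_score and session_direction_score are all assumptions made for this example.

```python
from statistics import mean


def dimension_means(track_record: list) -> dict:
    """Average each rubric dimension across all recorded sessions."""
    dims = sorted({dim for entry in track_record for dim in entry["scores"]})
    return {
        dim: mean(e["scores"][dim] for e in track_record if dim in e["scores"])
        for dim in dims
    }


def weakest_dimension(track_record: list) -> str:
    """Name the dimension with the lowest average score."""
    means = dimension_means(track_record)
    return min(means, key=means.get)


def short_horizon_bias(track_record: list, threshold: float = 0.2) -> bool:
    """Flag a record where the opening call scores well above the full-session call."""
    means = dimension_means(track_record)
    return means["direction_score"] - means["session_direction_score"] > threshold


if __name__ == "__main__":
    sample = [
        {"scores": {"direction_score": 1.0, "session_direction_score": 0.5, "archetype_score": 0.25}},
        {"scores": {"direction_score": 1.0, "session_direction_score": 0.5, "archetype_score": 0.25}},
    ]
    print(weakest_dimension(sample), short_horizon_bias(sample))
```

Numbers like these could be injected into the meta-analysis prompt so the model argues from computed aggregates instead of re-deriving them from raw scorecards.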
The monthly strategic outlook runs on the first Sunday of each month. Rather than analyzing individual sessions, it steps back and synthesizes accumulated macro data into a longer-horizon view: a 90-day forward read on regime trajectory, sector positioning, and structural risk. It's the part of the system that operates at a different clock speed from the daily grind.
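The text does not say how the "first Sunday" condition is enforced. In POSIX cron, restricting both day-of-month and day-of-week matches when either field matches, so one plausible approach (an assumption, not the project's confirmed method) is to schedule every Sunday and let a small Python check decide:

```python
from datetime import date


def is_first_sunday(day: date) -> bool:
    """True when the date is a Sunday falling within the first seven days of its month."""
    return day.weekday() == 6 and day.day <= 7


if __name__ == "__main__":
    print(is_first_sunday(date.today()))
```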
The whole output is a static HTML dashboard: the live premarket briefing, the evening scorecard, session history, the aggregate intelligence loop, a 90-day strategic outlook, and a glossary for the terminology and archetypes the system uses.
Hallucination Hysteresis. The first time Gemini generated a briefing with live news links, it fabricated about 30% of the URLs. Not obviously: the headlines were real and the domains were plausible (financialtimes.com, not fictionalnews.com), but the links were dead. The fix was to overhaul the news pipeline entirely: fetch_market_data.py now ingests from real RSS feeds, deduplicates by GUID, keyword-scores articles per active driver, and injects the full shortlist into the prompt as the only permitted source material. The prompt contains YOU MUST NOT INVENT OR FABRICATE URLS, and the pipeline backs that rule up: the model only has access to links it was handed.
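A simplified sketch of the dedupe-and-score step, using plain dicts in place of parsed RSS entries. The field names (`guid`, `title`, `summary`, `link`) and all function names are assumptions. The final `unknown_urls` helper is an extra post-generation check that the text does not describe; it is included only to show how the allow-list idea could be verified mechanically.

```python
import re


def dedupe_by_guid(articles: list) -> list:
    """Keep the first article seen for each GUID, preserving order."""
    seen, unique = set(), []
    for article in articles:
        if article["guid"] not in seen:
            seen.add(article["guid"])
            unique.append(article)
    return unique


def keyword_score(article: dict, keywords: list) -> int:
    """Count keyword occurrences in the title and summary."""
    text = f"{article['title']} {article.get('summary', '')}".lower()
    return sum(text.count(keyword.lower()) for keyword in keywords)


def shortlist(articles: list, keywords: list, limit: int = 5) -> list:
    """Return the highest-scoring unique articles that match at least one keyword."""
    scored = [(keyword_score(a, keywords), a) for a in dedupe_by_guid(articles)]
    scored = [(score, a) for score, a in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored[:limit]]


URL_PATTERN = re.compile(r"https?://[^\s\"'<>)]+")


def unknown_urls(model_output: str, allowed: list) -> list:
    """List URLs in the model output that were not in the shortlist it was given."""
    allowed_links = {a["link"] for a in allowed}
    found = [url.rstrip(".,;") for url in URL_PATTERN.findall(model_output)]
    return [url for url in found if url not in allowed_links]
```

A non-empty result from `unknown_urls` would be a reason to reject or regenerate the briefing before it reaches the dashboard.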
The Preserve Gate. Early on, if the market data fetch ran stale (say, the yfinance snapshot was timestamped from the previous session), build_dashboard.py would happily overwrite index.html with a broken briefing. I lost two mornings of live content this way. The fix was a two-layer gate: a stale-data detector that checks market_data["generated_at"] against the current session window, and a fallback path that re-renders the last valid session's content rather than writing a blank or corrupted page. The force-run flag bypasses both. I always keep that escape hatch.
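The two layers might look something like this. For simplicity the sketch treats "stale" as "older than a fixed maximum age", which is a stand-in for the session-window check described above; `max_age`, `choose_render_source`, and the `last_valid` argument are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def is_stale(market_data: dict, now: datetime, max_age: timedelta = timedelta(hours=6)) -> bool:
    """Layer 1: compare the snapshot timestamp against the current time."""
    generated_at = datetime.fromisoformat(market_data["generated_at"])
    return now - generated_at > max_age


def choose_render_source(
    market_data: dict,
    last_valid: Optional[dict],
    now: datetime,
    force_run: bool = False,
) -> Tuple[str, dict]:
    """Layer 2: render fresh data, or fall back to the last valid session."""
    if force_run or not is_stale(market_data, now):
        return "fresh", market_data
    if last_valid is not None:
        return "preserved", last_valid
    raise RuntimeError("stale data and no previous session to preserve")


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    snapshot = {"generated_at": now.isoformat()}
    print(choose_render_source(snapshot, None, now)[0])
```

Raising when there is nothing to preserve means the job fails loudly instead of overwriting the page with a blank briefing.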
DST is not a solved problem. GitHub Actions cron runs in UTC. US market open is 9:30 AM ET, which maps to 13:30 UTC in daylight time and 14:30 UTC in standard time. A naive cron schedule that targets one fixed UTC window will miss half the year. The solution was a polling window wide enough to span both seasons, combined with a Python gatekeeper that does the actual time-zone math and emits a single should_run boolean. Cron just knocks on the door repeatedly; the gatekeeper decides who gets in.
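The one-hour shift is easy to confirm with the standard library. This small check (the function name is just for illustration) converts the 9:30 AM New York open to UTC for a summer date and a winter date:

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def open_in_utc(session: date) -> time:
    """Return the UTC wall-clock time of the 9:30 AM New York open on a given date."""
    local_open = datetime.combine(session, time(9, 30), tzinfo=ZoneInfo("America/New_York"))
    return local_open.astimezone(timezone.utc).time()


print(open_in_utc(date(2025, 7, 7)))  # daylight time
print(open_in_utc(date(2025, 1, 6)))  # standard time
```

Because the conversion depends on the date, a fixed UTC cron expression cannot track the open year-round; the time-zone math has to happen at run time.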
Schema Drift. The morning briefing and evening review schemas evolved independently across dozens of commits: fields renamed, nested objects restructured, new rubric dimensions added. The evening scorer was regularly failing to find morning.opening_bias because I'd moved it three levels up in the payload in a previous build. I eventually added schema/config_schema.json and schema/meta_summary_schema.json with jsonschema validation gates in run_meta_analysis.py. Expensive to retrofit, but the runtime failures stopped.
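The project uses jsonschema for its validation gates. As a dependency-free illustration of the same idea, the sketch below checks that the dotted paths a consumer relies on are actually present in a payload; `missing_paths` and the example payload are hypothetical, and this is a simplified stand-in, not the jsonschema-based check itself.

```python
def missing_paths(payload: dict, required_paths: list) -> list:
    """Return every dotted path (such as 'morning.opening_bias') absent from the payload."""
    missing = []
    for path in required_paths:
        node = payload
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


if __name__ == "__main__":
    # opening_bias has drifted to the top level, so the old path no longer resolves.
    drifted = {"opening_bias": "bullish", "morning": {"archetype": "Gap and Go"}}
    print(missing_paths(drifted, ["morning.opening_bias", "morning.archetype"]))
```

Running a check like this before the evening scorer starts turns a confusing mid-run failure into an immediate, named error.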
Violent Expansion Days. The archetype selection logic (which classifies sessions as things like Gap and Go, Trap Open, or Slow Bleed) was tuned on moderate-volatility sessions. On days where ES futures were already down 2.5% pre-open (several times during the tariff shock period), the archetype logic would call Controlled Selloff when the actual session was a Capitulation Flush. The meta-analysis eventually caught this: archetype_score was the weakest rubric dimension on high-ATR days. The fix was adding market_internals.participation_profile to the intraday data block and updating the prompt to treat broad participation + large gaps as a separate classification branch.
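In the project this branch lives in the prompt, not in code, but the decision it encodes can be written out as a deterministic rule. The -2.0% gap threshold, the "broad" label for participation_profile, and the function itself are assumptions chosen to illustrate the logic.

```python
def classify_selloff(gap_pct: float, participation_profile: str, gap_threshold: float = -2.0) -> str:
    """Separate a large, broad-participation gap down from an ordinary down open."""
    if gap_pct <= gap_threshold and participation_profile == "broad":
        return "Capitulation Flush"
    if gap_pct < 0:
        return "Controlled Selloff"
    return "unclassified"


if __name__ == "__main__":
    print(classify_selloff(-2.5, "broad"))
    print(classify_selloff(-2.5, "narrow"))
```

The point of the separate branch is that gap size alone is not enough: the same 2.5% gap is classified differently depending on how broad the selling is.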
What it actually produces: every weekday morning, before the NYSE opens, a structured single-page briefing loads on the dashboard. It contains the session's expected archetype, the active narrative drivers, 3-5 live news stories with sector impact analysis, a scenario playbook with if/then branches for any high-impact economic releases, and a conviction score. That evening, the same system fetches closing data, grades every dimension of the morning call, and writes the scorecard back to the record.
Every two weeks, the meta-analysis reads the accumulated scorecards, finds the mistakes the morning briefing keeps repeating, and proposes targeted prompt changes to fix them. The system grades itself, then drafts changes to its own instructions for me to accept or reject. Once a month, it zooms out entirely and synthesizes everything into a 90-day strategic outlook that operates at a different tempo from the daily loop.
What makes this interesting isn't the data pipeline or the Gemini fallback hierarchy. Those are just plumbing. It's the closed feedback loop: a market intelligence system that's continuously calibrating against reality at the speed of daily market sessions, not at the speed of a developer manually reviewing outputs. The meta-analysis is the thing that makes V2 meaningfully different from any other LLM-wrapping-an-API project. It's the part that turns a data dashboard into something closer to a learning system.