Horizon

What's On

The Cincinnati area has a lot going on. History lectures at the Cincinnati Museum Center, archaeological tours through Fort Ancient, craft workshops at local makerspaces, film screenings, nature programs, WWII aviation days at the National Museum of the U.S. Air Force, earthworks tours at Hopewell Culture. The problem isn't a shortage of events. It's that they're scattered across 99 different websites, each with its own calendar format, update cadence, and degree of crawlability.

No single place aggregates them. The city's general event sites tend toward concerts and festivals. The niche stuff — the history talks, the guided walks, the lecture series at small museums — lives on individual institution websites and dies there.

Horizon is my attempt to fix that for myself.

How It Works

Every Sunday morning a GitHub Actions cron job kicks off the scraper. It works through a list of 99 sources: history museums, nature centers, archaeological sites, art venues, community theaters, and local institutions across Cincinnati, Northern Kentucky, and the Dayton area.

For each source it fetches the HTML. For sites that require JavaScript to render their calendars — which is most of them — it uses Playwright to get the fully rendered page first. That HTML goes to the Claude API, which extracts structured event data: title, date, time, location, categories, cost, and a short description.
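Model output is text, so it helps to check it before anything gets written to disk. Here is a minimal sketch of that check, assuming the API is asked to return a JSON array of events. The `HorizonEvent` type, its field names, and the `parseEvents` helper are illustrative guesses based on the fields listed above, not the actual schema.

```typescript
// Hypothetical shape for one extracted event. The field names mirror the
// fields listed above; the real schema may differ.
interface HorizonEvent {
  title: string;
  date: string; // assumed ISO format: YYYY-MM-DD
  time: string | null; // null when the source gives no time
  location: string;
  categories: string[];
  cost: string | null; // null when the source gives no price
  description: string;
}

const ISO_DATE = /^\d{4}-\d{2}-\d{2}$/;

// Type guard: true only when every expected field is present and well-typed.
function isHorizonEvent(value: unknown): value is HorizonEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.title === "string" &&
    v.title.trim() !== "" &&
    typeof v.date === "string" &&
    ISO_DATE.test(v.date) &&
    (typeof v.time === "string" || v.time === null) &&
    typeof v.location === "string" &&
    Array.isArray(v.categories) &&
    v.categories.every((c) => typeof c === "string") &&
    (typeof v.cost === "string" || v.cost === null) &&
    typeof v.description === "string"
  );
}

// Parse the raw response text and keep only the well-formed events.
// Anything that is not a JSON array yields an empty list.
function parseEvents(raw: string): HorizonEvent[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(data)) return [];
  return data.filter(isHorizonEvent);
}
```

Dropping malformed entries instead of throwing means one bad calendar page can't take down the whole weekly run. Whether the real scraper is that forgiving is an assumption here.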

Each event becomes an individual JSON file, committed back to the repository. Vercel detects the push and rebuilds the static site. The whole cycle takes about an hour.

Flat File Storage

No database. Every event is a JSON file in /events/ named by source, date, and title slug. The frontend reads them all at build time. Simple, portable, and easy to inspect.
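A rough sketch of how a file name could be built from source, date, and title. The `slugify` and `eventFilePath` helpers, the hyphen separator, and the ordering of the parts are assumptions for illustration; the text above only says which three pieces go into the name.

```typescript
// Turn free text into a lowercase, hyphen-separated slug.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Build a repo-relative path for one event file.
// The separator and part order are assumed, not taken from the project.
function eventFilePath(source: string, isoDate: string, title: string): string {
  return `events/${slugify(source)}-${isoDate}-${slugify(title)}.json`;
}
```

A deterministic name has a useful side effect: if the same event is scraped again the following Sunday, it maps to the same path and overwrites the old file instead of creating a duplicate.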

Tags

next.js, typescript, claude-api, playwright, github-actions, cincinnati