Case Study
JobHunter
AI-powered job search automation and scoring
Job search platform combining Claude API for intelligent job scoring, Playwright for scraping, and Notion as the database and user interface.
Overview
JobHunter is the system I built to run my own job search. It crawls four job sources daily (Indeed, LinkedIn, WTFJ, company career pages) using Playwright and BeautifulSoup, scores every role against a 6-dimension career goals framework using Claude API, and pushes the results into Notion so I can triage in a familiar UI. GitHub Actions runs the pipeline on a daily cron; I review scored jobs in Notion, apply to high-signal matches, and feed rejections and interview outcomes back into the scoring calibration.
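The daily loop described above can be sketched as a chain of three stages. The function names and stub bodies here are illustrative, not the real module layout:

```python
# Hypothetical sketch of the daily pipeline: crawl -> score -> push.
# Names and return shapes are assumptions, not JobHunter's actual code.

def crawl_sources(sources):
    """Pull raw listings from each configured job source (stub)."""
    return [{"source": s, "title": f"Example role from {s}"} for s in sources]

def score_jobs(jobs):
    """Attach a 0-100 fit score to each listing (stub for the Claude call)."""
    return [{**job, "score": 50} for job in jobs]

def push_to_notion(jobs):
    """Create one Notion page per scored job (stub for the Notion API call)."""
    return len(jobs)

def run_daily():
    """One scheduled run over the four sources named in the case study."""
    jobs = crawl_sources(["Indeed", "LinkedIn", "WTFJ", "career pages"])
    scored = score_jobs(jobs)
    return push_to_notion(scored)
```

In production each stub would be replaced by the real crawler, scorer, and Notion client, with GitHub Actions invoking `run_daily` on the cron schedule.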
Problem
The bottleneck in a modern job search is not finding jobs; it is filtering them. Aggregators surface thousands of loosely matched roles, and generic keyword filters miss the ones that actually fit. JobHunter solves my specific version of this problem: rather than trying to be a product for everyone, it encodes my goals, my constraints, and my current risk tolerance into the scoring framework, and keeps the noise manageable. It is also a useful stress test of my own engineering: a full-stack AI system that I dogfood every day.
Key Decisions and Trade-offs
- Claude API for scoring. Fine-grained scoring requires a nuanced understanding of career fit. Claude's reasoning quality and long-context handling beat cheaper models for the job scoring use case.
- Playwright for scraping. Job boards are JavaScript-heavy. Playwright handles dynamic content better than static HTTP scraping. Trade-off is speed; payoff is reliability.
- Notion as database and UI. Notion gives a familiar, filterable interface without building a custom dashboard. The Notion API is stable. Trade-off is limited customisation; payoff is simplicity.
- GitHub Actions for scheduling. Replaces a server: Actions runs the discovery pipeline on a daily cron. Zero cost, zero maintenance.
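The scoring decision above hinges on asking the model for structured per-dimension judgments. A minimal sketch of the prompt construction, using the six dimensions from the framework; the exact wording and JSON schema are assumptions, not the production prompt:

```python
import json

# The six dimensions from the scoring framework described in this case study.
DIMENSIONS = ["role_fit", "compensation", "growth", "location", "team", "mission"]

def build_scoring_prompt(job_title, job_description):
    """Build a prompt asking the model to score one listing per dimension.

    The phrasing and response schema here are illustrative assumptions,
    not JobHunter's real prompt.
    """
    schema = {dim: "<0-100 integer>" for dim in DIMENSIONS}
    return (
        "Score this job against my career goals.\n"
        f"Title: {job_title}\n"
        f"Description: {job_description}\n"
        f"Reply with JSON only, matching: {json.dumps(schema)}"
    )

prompt = build_scoring_prompt("Staff Engineer", "Remote-first AI startup...")
```

Requesting JSON-only output keeps the response machine-parseable, so the pipeline can validate and normalise the six scores before they reach Notion.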
Stack and Why
| Layer | Technology | Rationale |
|---|---|---|
| Job Discovery | Playwright, BeautifulSoup | Playwright handles JS-heavy sites; BeautifulSoup for static scraping. Site-specific selectors. |
| AI Scoring | Claude API | 6-dimension framework (role fit, compensation, growth, location, team, mission). Reasoning for nuance. |
| Database | Notion API | User-friendly interface, filtering, sorting. No custom backend needed. |
| Automation | GitHub Actions | Scheduled daily discovery. Cron-based. Free tier sufficient. |
| Language | Python | Rich ecosystem for scraping, API clients. Quick iteration. |
What Shipped
- Discovery crawlers: Four site-specific crawlers (Indeed, LinkedIn, WTFJ, career pages) pulling job listings daily.
- AI scoring module: Claude API integration using a 6-dimension framework: role fit, compensation, growth potential, location/relocation, team quality, mission alignment. Scores normalised to 0-100.
- Notion integration: Parsed jobs pushed to Notion with title, company, score, link, and dimension breakdowns.
- GitHub Actions workflow: Daily scheduled discovery and scoring. Runs at 06:00 UTC.
- CV tailoring workflow: LaTeX-based CV templating. Programmatic fill-in of role-specific keywords. Verification via page count.
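The Notion integration maps each scored job onto a page-creation payload for the Notion API. A sketch of that mapping; the property names ("Title", "Company", "Score", "Link") are assumptions about the database schema, not the actual column names:

```python
def notion_page_payload(database_id, job):
    """Map one scored job onto a Notion pages.create request body.

    Follows the Notion API's property shapes (title, rich_text, number,
    url); the specific property names are illustrative.
    """
    return {
        "parent": {"database_id": database_id},
        "properties": {
            "Title": {"title": [{"text": {"content": job["title"]}}]},
            "Company": {"rich_text": [{"text": {"content": job["company"]}}]},
            "Score": {"number": job["score"]},
            "Link": {"url": job["url"]},
        },
    }

payload = notion_page_payload(
    "db-id-placeholder",
    {"title": "Staff Engineer", "company": "Acme", "score": 87,
     "url": "https://example.com/job"},
)
```

The payload would then be POSTed to the Notion `pages` endpoint with an integration token; keeping the mapping in one pure function makes it easy to extend with the per-dimension breakdowns.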
Metrics
- 4 job sources crawled daily
- 500-1000 new jobs scored per day
- 6-dimension scoring framework with weighted importance
- Average discovery latency: < 5 minutes from job board to Notion
- Cost: Claude API $5-10 per day, GitHub Actions free
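The "weighted importance" in the metrics above collapses six per-dimension scores into the single 0-100 number shown in Notion. A sketch of that aggregation, with illustrative weights (not my real calibration):

```python
def overall_score(dim_scores, weights):
    """Collapse per-dimension 0-100 scores into one weighted 0-100 score.

    Weights are normalised by their sum, so they need not add up to 1.
    The weight values below are illustrative, not the production weights.
    """
    total_weight = sum(weights.values())
    return round(
        sum(dim_scores[d] * w for d, w in weights.items()) / total_weight
    )

weights = {"role_fit": 3, "compensation": 2, "growth": 2,
           "location": 1, "team": 1, "mission": 1}
scores = {"role_fit": 90, "compensation": 70, "growth": 80,
          "location": 100, "team": 60, "mission": 50}
```

Normalising by the weight sum keeps the output on the same 0-100 scale as the inputs, so re-weighting dimensions never changes the scale the Notion views filter on.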
What Is Next
Near-term, everything is in service of my live search. I'm refining the 6-dimension scoring weights against interview outcomes (which signals actually predict fit, and which ones I over-weighted), tracking application-to-interview conversion to ground-truth the scoring accuracy, and adding an interview prep module: company research automation, role-specific checklists, salary negotiation notes. I'm also extending source coverage to Wellfound and AngelList for startup roles, which matter for the founder-adjacent end of my target list. Productising is not the goal; the tool exists to solve my own problem, though the architecture (Claude for scoring, Notion as UI, GitHub Actions as cron) would generalise for anyone willing to encode their own goals.
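Ground-truthing scoring accuracy against application-to-interview conversion can be sketched as bucketing applications by score and measuring the conversion rate per bucket. The bucketing scheme and data shape here are illustrative assumptions:

```python
from collections import defaultdict

def conversion_by_bucket(applications, bucket_size=10):
    """Compute interview conversion per score bucket.

    Each application is a (score, got_interview) pair; bucketing by score
    is one illustrative way to check whether higher scores actually
    convert to interviews more often.
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [interviews, total]
    for score, got_interview in applications:
        bucket = (score // bucket_size) * bucket_size
        counts[bucket][1] += 1
        counts[bucket][0] += int(got_interview)
    return {b: interviews / total for b, (interviews, total) in counts.items()}

apps = [(85, True), (82, False), (60, False), (62, False), (91, True)]
```

If conversion rises monotonically with bucket, the weights are pulling in the right direction; a flat or inverted curve flags dimensions to recalibrate.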
Status: In active use for my Paris/London search.