
How I Built an AI Resume Matcher

A build-in-public breakdown of creating an AI-powered resume matching system. Covers the architecture, vector search, LLM scoring, and lessons learned.

Panda Coding School | May 2, 2026 | 3 min read

I built an AI resume matcher that scores how well a candidate's resume matches a job description. Here's the full breakdown — what worked, what failed, and what I'd do differently.

The Problem

Recruiters spend hours scanning resumes. Most applicant tracking systems (ATS) rely on keyword matching, which misses qualified candidates who describe their skills in different words.

I wanted to build something smarter — a system that understands semantic meaning, not just keywords.

Architecture Overview

The system has three core components:

Document Processor — Parses resumes and job descriptions into structured data
Vector Store — Embeds and stores documents for semantic search
Scoring Engine — Uses an LLM to generate match scores with explanations
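
To make the Scoring Engine's "score plus explanation" output concrete, here is a minimal sketch of how the LLM reply could be requested and validated. The MatchResult type, the 0-100 scale, the prompt wording, and the JSON reply shape are all assumptions for illustration, not the exact format I shipped. The actual LLM call is left out.

```python
import json
from dataclasses import dataclass


@dataclass
class MatchResult:
    """One scored resume/job pair. Field names are hypothetical."""
    resume_id: str
    score: int          # assumed 0-100 scale
    explanation: str


def build_scoring_prompt(resume_text: str, job_text: str) -> str:
    """Ask for a JSON object so the reply can be parsed mechanically."""
    return (
        "Score how well the resume matches the job description.\n"
        'Reply with JSON only: {"score": <integer 0-100>, '
        '"explanation": "<short paragraph>"}\n\n'
        f"JOB DESCRIPTION:\n{job_text}\n\nRESUME:\n{resume_text}"
    )


def parse_scoring_reply(resume_id: str, reply: str) -> MatchResult:
    """Validate the model's reply instead of trusting it blindly."""
    data = json.loads(reply)          # raises ValueError on malformed JSON
    score = int(data["score"])
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return MatchResult(resume_id, score, str(data["explanation"]))
```

Rejecting out-of-range or malformed replies up front means a bad generation turns into a retry rather than a wrong number shown to a recruiter.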

Tech Stack

Backend: Python + FastAPI
Vector DB: Pinecone
Embeddings: OpenAI text-embedding-3-small
LLM: GPT-4o for scoring
Frontend: Next.js + Tailwind
Queue: Redis + Celery for async processing

The Document Processor

The hardest part was handling the variety of resume formats — PDF, DOCX, images, and even LinkedIn profile URLs.

I used a combination of:

pymupdf for PDF extraction
python-docx for Word documents
A vision model for image-based resumes
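
A rough sketch of how the format dispatch could look. The function names and the EXTRACTORS table are hypothetical; the library calls (pymupdf.open, page.get_text, docx.Document) are the standard entry points of pymupdf and python-docx. Image-based resumes would go to the vision model instead and are not shown here.

```python
from pathlib import Path
from typing import Callable


def extract_pdf(path: str) -> str:
    """Concatenate the plain text of every page using PyMuPDF."""
    import pymupdf  # pip install pymupdf (older releases import as `fitz`)

    doc = pymupdf.open(path)
    try:
        return "\n".join(page.get_text() for page in doc)
    finally:
        doc.close()


def extract_docx(path: str) -> str:
    """Join paragraph text from a Word file. Text inside tables is skipped."""
    from docx import Document  # pip install python-docx

    return "\n".join(p.text for p in Document(path).paragraphs)


# Hypothetical lookup table: file suffix -> extractor function.
EXTRACTORS: dict[str, Callable[[str], str]] = {
    ".pdf": extract_pdf,
    ".docx": extract_docx,
}


def pick_extractor(path: str) -> Callable[[str], str]:
    """Choose an extractor by file suffix, failing loudly on unknown formats."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTRACTORS[suffix]
    except KeyError:
        raise ValueError(f"unsupported resume format: {suffix or '(none)'}") from None
```

Failing loudly on unknown formats is deliberate: a resume that silently parses to an empty string will still get embedded and scored, which is the "garbage in" case.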

Key lesson: Spend 50% of your time on data extraction. Garbage in, garbage out.

Vector Search Approach

Each resume gets embedded into a vector space. When a job description comes in, I:

Embed the job description
Find the top-N most similar resumes via cosine similarity
Pass the matches to the LLM for detailed scoring
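
The retrieval step above can be illustrated in plain Python. In the real system Pinecone runs this search; the sketch below is only an in-memory stand-in with toy 2-dimensional vectors (real embeddings are much higher-dimensional), and the function names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_n(job_vec, resume_vecs, n=3):
    """Return the n (resume_id, similarity) pairs closest to the job vector."""
    scored = [(rid, cosine_similarity(job_vec, vec))
              for rid, vec in resume_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Toy data: resume "a" points the same way as the job, "b" is orthogonal.
resumes = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
shortlist = top_n([1.0, 0.0], resumes, n=2)
```

Here shortlist contains "a" first and "c" second; "b" is dropped because it shares no direction with the job vector.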

This two-stage approach keeps costs manageable — the vector search is cheap, and the LLM only processes the top candidates.
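
A small sketch of why this bounds the cost: the number of LLM calls depends on top_n, not on how many resumes are stored. The two_stage_match function and the injected llm_score callable are hypothetical; in practice llm_score would wrap the GPT-4o request.

```python
from typing import Callable, Sequence


def two_stage_match(
    candidates: Sequence[tuple[str, float]],
    llm_score: Callable[[str], int],
    top_n: int,
) -> list[tuple[str, int]]:
    """candidates are (resume_id, similarity) pairs from the vector search."""
    # Stage 1: cheap cut by embedding similarity.
    shortlist = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_n]
    # Stage 2: expensive scoring, only for the shortlist.
    scored = [(rid, llm_score(rid)) for rid, _ in shortlist]
    return sorted(scored, key=lambda s: s[1], reverse=True)


# Usage with a fake scorer that records how often it is called.
calls = []

def fake_llm(resume_id: str) -> int:
    calls.append(resume_id)
    return {"a": 60, "b": 90, "c": 10, "d": 75}[resume_id]

cands = [("a", 0.91), ("b", 0.88), ("c", 0.40), ("d", 0.35)]
ranked = two_stage_match(cands, fake_llm, top_n=2)
```

With four candidates and top_n=2, the fake scorer is called twice. Note that the final order ("b" before "a") differs from the similarity order, and that "d" never gets scored even though the fake scorer would have rated it highly. That is the trade-off of a hard cut at stage 1.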

What Failed

Pure embedding similarity wasn't enough. Two resumes with similar embedding scores could have very different actual relevance. The LLM scoring stage was essential.
Resume parsing is a nightmare. Every format, every layout, every encoding issue. Parsing took three times as long as I had budgeted.
Initial latency was too high. Processing a single resume took 8 seconds. I had to add async processing and a queue system.
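
To show the shape of the queue fix without needing Redis or Celery installed, here is a standard-library stand-in using asyncio.Queue. It is not the production setup: process_resume is a placeholder for the slow parse-and-embed step, and the worker count is arbitrary.

```python
import asyncio


async def process_resume(resume_id: str) -> str:
    """Placeholder for the slow parse + embed step."""
    await asyncio.sleep(0.01)  # stands in for seconds of real work
    return f"{resume_id}:done"


async def worker(queue: asyncio.Queue, results: list) -> None:
    """Pull jobs off the queue until cancelled."""
    while True:
        resume_id = await queue.get()
        try:
            results.append(await process_resume(resume_id))
        finally:
            queue.task_done()


async def run_jobs(resume_ids, num_workers: int = 3) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    results: list = []
    for rid in resume_ids:
        queue.put_nowait(rid)
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(num_workers)]
    await queue.join()      # wait until every queued job has been processed
    for w in workers:
        w.cancel()          # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


done = asyncio.run(run_jobs(["r1", "r2", "r3", "r4"]))
```

The point is the same as with Celery: the upload request only enqueues a job and returns, while workers chew through the slow part in the background. Results can arrive in any order.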

What I'd Do Differently

Start with structured input instead of parsing free-form resumes
Use a smaller model for initial filtering, save GPT-4o for final scoring
Build the queue system from day one — don't bolt it on later
Add evaluation metrics early — without ground truth data, it's hard to know if your system is improving
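
One simple metric to start with is precision at k: of the top k resumes the system returns for a job, what fraction did a human label as a good match? A minimal sketch, assuming you have a hand-labeled set of relevant resume IDs per job (the IDs below are made up):

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked resumes that humans marked relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for rid in top_k if rid in relevant_ids)
    return hits / k


# System ranking for one job vs. a recruiter's labels for the same job.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}
p_at_2 = precision_at_k(ranked, relevant, k=2)
```

Here p_at_2 is 0.5, since only "a" of the top two is labeled relevant. Tracking this number across a few dozen labeled jobs is enough to tell whether a prompt or embedding change helped or hurt.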