Engineering

Senior AI/ML Engineer

San Francisco · full-time

Overview

You will build the LLM-powered bill parsing and consumption-anomaly detection systems at the core of AirBills. You will own the eval harness, ground-truth labeling, and model selection — and you will be measured on accuracy, latency, and cost.

What you’ll do

Build and operate LLM pipelines for utility-bill extraction (PDF, image, scanned, multi-utility).
Own the eval set, ground-truth labeling, and regression testing for parsing accuracy.
Build anomaly-detection models for usage and cost spikes across customer portfolios.
Drive model / provider selection, prompt strategy, and cost / latency tradeoffs.
Partner with backend engineers to productionize models behind reliable APIs.
Establish ML observability — track quality, drift, and unit economics.

What we’re looking for

4+ years of ML experience in production (not just research or notebooks).
1+ year working with LLMs in production: evals, prompt design, structured output.
Strong Python and modern ML tooling; comfortable in a TypeScript codebase too.
Pragmatic about model choice — knows when fine-tuning helps and when it does not.
Comfortable owning quality metrics and being on the hook for them.

Nice to have

Vision-LLM or document-extraction experience (Gemini, Claude, GPT-4o on PDFs / images).
Background in time-series anomaly detection.
Experience shipping an LLM-graded eval harness from scratch.

Compensation & location

Base salary (mid-point)

$200,000

Range: $175,000 – $235,000 base + equity

Location: San Francisco

Final offer depends on experience, location, and interview signal. Equity grants come with every offer. We also cover health, dental, vision, and a learning budget.

About the team

We are a remote-first team, and bill parsing is one of our highest-leverage AI bets. You will partner closely with PM, Platform, and Ops to ship models that hold up in production.