
The sports betting market had scaled supply but not intelligence. Most "expert picks" were closer to entertainment than insight. Building a credible AI betting product meant solving three problems simultaneously: engineering a predictive system that actually beats the vig, designing a product that earns trust through transparency, and creating a daily habit loop that keeps users coming back. There was no brand, no platform, no engine, no users—just a hypothesis that disciplined AI could outperform the crowd.
Autonomous Research System
Built an autonomous model optimization system on Karpathy's autoresearch framework, a disciplined architecture with strict separation of concerns. An immutable harness handles API integration and evaluation. An agent-modifiable experiment config is what the system iterates on. Git serves as the rollback mechanism: good experiments are committed and kept, and the rest are discarded with a hard reset. An untracked results ledger and artifacts directory survive those resets, providing a complete audit trail.
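The keep-or-discard step can be sketched as below. This is a minimal illustration under stated assumptions, not the project's actual harness: the function name `record_and_decide`, the config filename `experiment_config.py`, and the JSON-lines ledger format are all hypothetical. The function writes to the ledger first and then returns the git commands instead of running them, so the decision logic can be inspected on its own.

```python
import json
from pathlib import Path

# Hypothetical name for the agent-modifiable config; the real filename is not given.
CONFIG_FILE = "experiment_config.py"


def record_and_decide(ledger_path, experiment_id, score, best_score):
    """Log the experiment, then choose the git commands that keep or discard it.

    The ledger is assumed to be untracked by git, so it is written before any
    reset and is not touched by `git reset --hard`.
    """
    keep = score > best_score
    entry = {"experiment": experiment_id, "score": score, "kept": keep}
    with Path(ledger_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

    if keep:
        # Commit the modified config so it becomes the new baseline.
        commands = [
            ["git", "add", CONFIG_FILE],
            ["git", "commit", "-m", f"keep {experiment_id}: score={score:.4f}"],
        ]
    else:
        # Throw away tracked changes; untracked files are left in place.
        commands = [["git", "reset", "--hard", "HEAD"]]
    return keep, commands
```

A caller would pass each returned command to `subprocess.run(cmd, check=True)` inside the repository. Logging before the reset is what lets discarded experiments stay in the audit trail.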
Validation & Anti-Overfitting
Implemented three-window temporal validation with frozen July 1 boundaries (falling between seasons for all four sports) to prevent seasonal data leakage. The discovery window allows free iteration, the selection window is capped at 5 evaluations, and the audit window permits a single one-shot check. Market-aware evaluation accounts for American odds and the ~4.5% vig: on standard -110 lines the breakeven win rate is 110/210, or about 52.4%, so only accuracy above that threshold counts as real edge. Bootstrap lower confidence bounds handle small-sample uncertainty.
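The breakeven arithmetic and the bootstrap bound can be made concrete with a short sketch. It assumes flat one-unit stakes and a simple percentile bootstrap; the function names, resample count, and 5% level are illustrative choices, not details taken from the project. For reference, two sides priced at -110 imply probabilities that sum to about 104.8%, which corresponds to a hold of roughly 4.5%.

```python
import random


def breakeven_probability(american_odds):
    """Win probability at which a bet at these odds has zero expected profit."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)


def profit_per_unit(american_odds, won):
    """Profit on a 1-unit stake: a loss costs the stake, a win pays the odds."""
    if not won:
        return -1.0
    if american_odds < 0:
        return 100 / -american_odds
    return american_odds / 100


def bootstrap_lower_bound(profits, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap lower bound on mean profit per bet.

    Resamples the bets with replacement and returns the alpha-quantile of the
    resampled means, so a small or noisy sample yields a more cautious figure
    than the raw average.
    """
    rng = random.Random(seed)
    n = len(profits)
    means = sorted(
        sum(rng.choices(profits, k=n)) / n for _ in range(n_resamples)
    )
    return means[int(alpha * n_resamples)]


# 110 / 210, about 0.5238, matching the 52.4% breakeven on -110 lines.
BREAKEVEN_AT_MINUS_110 = breakeven_probability(-110)
```

Under this sketch, a strategy would count as having edge only when the lower bound on its mean profit, not just the point estimate, is above zero.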
ML Pipeline
The system progresses through structured phases: single-factor profiling, combination testing, weight optimization, cross-sport exploration, then graduation to local ML mode with XGBoost and LightGBM when API mode plateaus. Cross-validation is expanding-window only, never shuffled k-fold, so every model is trained strictly on games that precede the ones it is scored on. Explicit leakage-safe feature selection filters out result columns, target columns, and metadata.
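Expanding-window splitting and leakage-safe column filtering can be illustrated in plain Python. This is a sketch under assumptions: rows are taken to be sorted chronologically, and the marker substrings and metadata column names are hypothetical stand-ins for whatever the real schema uses.

```python
def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) over chronologically sorted rows.

    Each fold trains on everything before its test block, so the training
    set grows over time and never includes rows later than the test rows.
    """
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        # The last fold absorbs any remainder rows.
        test_end = train_end + fold_size if k < n_folds - 1 else n_samples
        yield list(range(train_end)), list(range(train_end, test_end))


# Hypothetical naming conventions; a real schema would need its own lists.
LEAKY_MARKERS = ("result", "score", "final", "target", "label")
METADATA_COLUMNS = {"game_id", "date", "team", "opponent"}


def select_safe_features(columns):
    """Drop metadata columns and any column whose name suggests an outcome."""
    safe = []
    for name in columns:
        lowered = name.lower()
        if lowered in METADATA_COLUMNS:
            continue
        if any(marker in lowered for marker in LEAKY_MARKERS):
            continue
        safe.append(name)
    return safe
```

For example, `select_safe_features(["rest_days", "home_score", "target_win", "game_id", "elo_diff"])` keeps only `rest_days` and `elo_diff`. Name-based filtering is deliberately conservative: it may drop a harmless column, which costs less than letting an outcome column leak into training.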
Product & Design
The interface is built for discipline: clean, daily, minimal. Each pick surfaces its edge, confidence score, and the reasoning behind it. The UX was designed around trust and daily habit—not information overload. Built on Next.js, Supabase, and Python.
72% win rate across all picks. 80%+ in NFL and NHL. 30% of users open the app daily. Retention is high. The methodology—not the marketing—is what makes those numbers credible. The autoresearch framework enforces the discipline that separates real predictive edge from overfitting: budget enforcement, temporal isolation, market-aware evaluation, and a complete experiment audit trail. Every claim is backed by a reproducible pipeline.
A predictive sports platform where the methodology makes the win rate credible.






