Automobile, AI, Computer Vision
Video Highlights & Scoring System
Event-driven backend that scores car-condition videos with YOLO and ElevenLabs.
Overview
Client Overview
An event-driven backend system built from scratch using FastAPI and PostgreSQL to score and highlight car-condition videos. The pipeline uses ElevenLabs for audio transcription and YOLO for object detection to identify car parts, rate their condition, and generate a final video report containing every detected highlight.
Industries
Automobile, AI, Computer Vision
Technologies
Python, FastAPI, YOLO, ElevenLabs, vLLM
Status
Live & Active
Challenges
The Challenges
1. Coordinating heavy video processing jobs in an event-driven backend.
2. Running YOLO accurately enough to rate car condition without manual review.
3. Synchronizing audio transcripts with detected video segments.
4. Generating a polished report video that combines all of the analysis output.
Solutions
Solutions & Strategies
01. Event-Driven Backend
- Built a FastAPI + PostgreSQL backend with event-driven job orchestration.
- Designed pipelines that scale across multiple video inputs without blocking.
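The orchestration described above can be pictured with a small in-process sketch. This is an illustration under assumptions, not the project's actual code: the event names (`video.uploaded`, `analysis.completed`), the `EventBus` class, and the placeholder handlers are all hypothetical, and a production system would use FastAPI endpoints with jobs persisted in PostgreSQL rather than an in-memory queue.

```python
import asyncio
from collections import defaultdict, deque


class EventBus:
    """Hypothetical in-memory bus; stands in for a persistent job queue."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._pending = deque()

    def subscribe(self, name, handler):
        self._handlers[name].append(handler)

    def publish(self, name, video_id):
        self._pending.append((name, video_id))

    async def drain(self):
        # Handlers for one event run concurrently and may publish follow-ups.
        while self._pending:
            name, video_id = self._pending.popleft()
            await asyncio.gather(*(h(video_id) for h in self._handlers[name]))


def build_pipeline(bus, log):
    done = defaultdict(set)  # video_id -> stages finished so far

    def mark_done(video_id, stage):
        done[video_id].add(stage)
        if done[video_id] == {"audio", "vision"}:
            bus.publish("analysis.completed", video_id)

    async def transcribe(video_id):
        await asyncio.sleep(0)  # placeholder for the transcription call
        log.append(("transcribed", video_id))
        mark_done(video_id, "audio")

    async def detect(video_id):
        await asyncio.sleep(0)  # placeholder for object detection
        log.append(("detected", video_id))
        mark_done(video_id, "vision")

    async def report(video_id):
        log.append(("report", video_id))

    bus.subscribe("video.uploaded", transcribe)
    bus.subscribe("video.uploaded", detect)
    bus.subscribe("analysis.completed", report)


async def process(video_ids):
    bus, log = EventBus(), []
    build_pipeline(bus, log)
    for video_id in video_ids:
        bus.publish("video.uploaded", video_id)
    await bus.drain()
    return log
```

Calling `asyncio.run(process(["a", "b"]))` returns a log in which each video's report entry appears only after both its audio and vision stages have finished, which is the ordering guarantee an event-driven design has to preserve.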
02. Vision & Audio Pipeline
- Used YOLO to detect car parts in each frame and score their condition.
- Used ElevenLabs to transcribe audio commentary and align it with video.
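One way to picture the alignment step is as interval overlap between word timestamps and detection segments. The sketch below assumes the detector yields `(frame_index, part_label)` pairs and the transcript provides per-word start and end times in seconds. The `Word` and `Segment` types, the `max_gap` parameter, and both function names are hypothetical, and no real YOLO or ElevenLabs calls are made.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float


@dataclass
class Segment:
    part: str  # label of the detected car part, e.g. "bumper"
    start: float
    end: float
    words: list = field(default_factory=list)


def group_frames(frame_labels, fps, max_gap=1):
    """Merge per-frame detections of the same part into time segments.

    frame_labels: (frame_index, part_label) pairs sorted by frame index.
    Detections of one part at most `max_gap` frames apart share a segment.
    """
    open_segments = {}  # part -> [first_frame, last_frame]
    finished = []
    for frame, part in frame_labels:
        current = open_segments.get(part)
        if current and frame - current[1] <= max_gap:
            current[1] = frame
        else:
            if current:
                finished.append((part, current))
            open_segments[part] = [frame, frame]
    finished.extend(open_segments.items())
    return sorted(
        (Segment(part, first / fps, (last + 1) / fps)
         for part, (first, last) in finished),
        key=lambda seg: seg.start,
    )


def align(words, segments):
    """Attach to each segment the words whose time span overlaps it."""
    for seg in segments:
        seg.words = [
            w.text for w in words if w.start < seg.end and w.end > seg.start
        ]
    return segments
```

A word that straddles a segment boundary is attached to every segment it overlaps. A stricter rule, such as assigning by word midpoint, is an equally reasonable choice, depending on how the commentary should appear in the report.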
03. Report Generation
- Combined detected highlights, condition ratings, and transcripts into one output video.
- Used vLLM for high-throughput LLM inference where needed in the pipeline.
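Combining highlights into a single video usually starts with a cut list. The sketch below assumes the report is assembled with ffmpeg's concat demuxer, which the source does not state. The function names and the `padding` parameter are hypothetical. It pads each highlight, merges overlapping spans, and renders the `file` / `inpoint` / `outpoint` entries that the demuxer reads.

```python
def merge_highlights(highlights, padding=0.5):
    """Pad each (start, end) highlight and merge the ones that overlap."""
    merged = []
    for start, end in sorted(highlights):
        start, end = max(0.0, start - padding), end + padding
        if merged and start <= merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a clip.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def concat_list(source_path, spans):
    """Render spans as an ffmpeg concat-demuxer list, one entry per clip."""
    lines = []
    for start, end in spans:
        lines += [
            f"file '{source_path}'",
            f"inpoint {start:.2f}",
            f"outpoint {end:.2f}",
        ]
    return "\n".join(lines)
```

The resulting text can be written to a file and passed to a command such as `ffmpeg -f concat -safe 0 -i list.txt out.mp4`. Overlaying condition ratings or transcript captions would be a separate rendering step not covered here.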
Results
The Results
✓ Key Achievements
- End-to-end video scoring pipeline shipped from scratch.
- Automated condition ratings driven by YOLO detections.
- Aligned audio transcription via ElevenLabs.
- Generated final report videos as a single deliverable.
★ Project Highlights
- Event-driven FastAPI + PostgreSQL backend.
- YOLO-powered object detection and scoring.
- ElevenLabs-driven audio transcription.
- Auto-generated report videos.
Tech Stack
Technologies Used
Backend
Python, FastAPI
AI / Vision
YOLO, ElevenLabs, vLLM
Database
PostgreSQL


