




Trelx
Catch and patch failing voice agents from live production transcripts before bad behavior scales.
Additional info
How was your experience building with Codex?
Insane but good. The constraints forced me to ship fast. Built a whole AI QA engine in like 5 hours.
Describe your experience using Loops House as the hackathon platform. What worked well, what challenges (if any) did you face, and what improvements would you like to see?
It did the job. Submitting was easy enough. Very clean and intuitive.
Tell us about your overall experience at Codex Community Hackathon Pune.
Electric vibe. Sleep deprived, stressed out, but getting to build and demo alongside other devs was totally worth it. Good shit.
What could Codex Community improve to create a better experience for participants?
Better Wi-Fi, I would say.
Team
2 members
- Rushil (Owner)
- Varad Adake
Overview
Trelx is a closed-loop evaluation engine for Ultravox voice agents. It automatically ingests live production calls via the Ultravox API, analyzes the transcripts with GPT-4o, maps each failure to a strict taxonomy, and isolates the exact transcript quote where the agent failed. It doesn't stop at reporting: Trelx also acts as an automated prompt engineer, generating safer prompt patches and letting teams push those fixes back to Ultravox instantly. It is built for high-volume workflows, where catching a hallucination-prone prompt early prevents the same failure from repeating across hundreds of future calls.
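To make the "strict taxonomy" and "exact transcript quote" steps concrete, here is a minimal Python sketch of how that stage could look. It is not Trelx's actual code: the `FailureType` categories, the `Turn` and `Finding` types, and the `judge` callable (a stand-in for the GPT-4o call) are all assumptions for illustration, and the Ultravox ingestion and patch-push steps are left out. The idea it shows is that a finding is kept only if its label belongs to the taxonomy and its quote appears verbatim in an agent turn.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class FailureType(Enum):
    # Hypothetical categories; the write-up does not list the real taxonomy.
    HALLUCINATION = "hallucination"
    POLICY_VIOLATION = "policy_violation"
    MISSED_INTENT = "missed_intent"


@dataclass
class Turn:
    role: str  # "agent" or "user"
    text: str


@dataclass
class Finding:
    failure_type: FailureType
    turn_index: int
    quote: str


def _normalize(text: str) -> str:
    # Collapse whitespace and lowercase so minor formatting differences
    # do not hide an otherwise verbatim quote.
    return " ".join(text.split()).lower()


def locate_quote(transcript: list[Turn], quote: str) -> Optional[int]:
    """Return the index of the first agent turn containing the quote, else None."""
    needle = _normalize(quote)
    if not needle:
        return None
    for i, turn in enumerate(transcript):
        if turn.role == "agent" and needle in _normalize(turn.text):
            return i
    return None


def evaluate_call(
    transcript: list[Turn],
    judge: Callable[[list[Turn]], list[dict]],
) -> list[Finding]:
    """Keep only judge results with a known label and a quote found in the call."""
    findings = []
    for raw in judge(transcript):
        try:
            failure_type = FailureType(raw["failure_type"])
        except (KeyError, ValueError):
            continue  # label missing or outside the taxonomy
        quote = raw.get("quote", "")
        turn_index = locate_quote(transcript, quote)
        if turn_index is None:
            continue  # quote not present in any agent turn; treat as unsupported
        findings.append(Finding(failure_type, turn_index, quote))
    return findings
```

In this sketch, `judge` would wrap the model call and return a list of dicts with `failure_type` and `quote` keys. Rejecting findings whose quote cannot be located is one way to keep the evaluator itself from inventing evidence before any prompt patch is generated from it.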