Uber Rosetta: Translations Service
Rosetta was Uber's internationalization (i18n) service, responsible for translating all user-facing content across Uber's global platform. As Technical Lead, I helped scale the service to handle the highest read throughput of any microservice at Uber while keeping p99 latency under 10ms.
Project Overview
As Technical Lead for Rosetta, I owned Uber's internationalization (i18n) service, which had the highest read throughput of all 2,000+ microservices at Uber. Rosetta powered all translations across Uber's ecosystem, enabling the platform to operate consistently across 70+ countries and 30+ languages.
The Challenge
Uber's rapid global expansion created extraordinary internationalization challenges:
- Supporting 400+ million monthly active users across diverse languages and locales
- Handling 1M+ translation requests per second during peak traffic
- Maintaining sub-10ms response time for critical user flows
- Supporting contextual translations for different user types (riders, drivers, eaters, restaurants)
- Managing a translation corpus that grew by thousands of strings weekly
These challenges were compounded by Uber's microservice architecture, where each service needed to access translations without introducing performance bottlenecks.
Technical Implementation
Architecture Design
Rosetta was designed as a high-performance, distributed translation system:
- Core Service - Go-based API with highly optimized lookup paths
- Translation Storage - Multi-tiered storage with hot path caching
- Admin Portal - React application for translation management
- Client Libraries - SDKs for every programming language used at Uber
- Integration Layer - Connectors for translation vendor APIs
This architecture enabled both performance at scale and flexibility for different use cases across Uber's platform.
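To make the client-library piece concrete, the sketch below shows roughly what the lookup surface of such an SDK could look like in Go. The package layout, type names, method signature, and fallback behavior are illustrative assumptions, not Uber's actual client code.

```go
// A minimal sketch of a translation-lookup SDK surface; names and
// signatures are hypothetical, not Uber's actual client library.
package main

import "fmt"

// Translator is the interface a calling service would program against.
type Translator interface {
	// Translate returns the localized string for key in locale, formatted
	// with args, falling back to a default locale when no entry exists.
	Translate(locale, key string, args ...interface{}) (string, error)
}

// bundleTranslator is a toy in-memory implementation showing the shape of
// the API; a real SDK would resolve lookups against cached translation bundles.
type bundleTranslator struct {
	defaultLocale string
	bundles       map[string]map[string]string // locale -> key -> format string
}

func (t *bundleTranslator) Translate(locale, key string, args ...interface{}) (string, error) {
	if msg, ok := t.bundles[locale][key]; ok {
		return fmt.Sprintf(msg, args...), nil
	}
	// Fall back to the default locale rather than failing the user flow.
	if msg, ok := t.bundles[t.defaultLocale][key]; ok {
		return fmt.Sprintf(msg, args...), nil
	}
	return "", fmt.Errorf("no translation for %q in %q", key, locale)
}

func main() {
	var tr Translator = &bundleTranslator{
		defaultLocale: "en-US",
		bundles: map[string]map[string]string{
			"en-US": {"rider.trip.eta": "Arriving in %d min"},
			"es-MX": {"rider.trip.eta": "Llega en %d min"},
		},
	}
	msg, _ := tr.Translate("es-MX", "rider.trip.eta", 4)
	fmt.Println(msg) // Llega en 4 min
}
```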
Performance Optimizations
To achieve the required throughput and latency:
- Implemented multi-level caching strategy (in-process, Redis, distributed cache)
- Developed client-side caching with efficient invalidation mechanisms
- Designed sharded storage for horizontal scaling
- Created custom serialization format for minimal memory footprint
- Built traffic throttling and circuit breakers for system stability
These optimizations allowed Rosetta to deliver translations with p99 latency under 10ms despite handling the highest request volume in Uber's ecosystem.
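The lookup order behind the multi-level caching strategy can be sketched as follows. The types are invented for illustration, with a simple in-memory loader standing in for the Redis and storage tiers; the real system added invalidation, sharding, and circuit breaking around this core pattern.

```go
// A minimal sketch of tiered lookup: check an in-process cache first,
// then fall through to a slower tier and warm the local cache on a hit.
package main

import (
	"fmt"
	"sync"
)

// Loader abstracts a slower tier (e.g. a shared cache or the storage layer).
type Loader interface {
	Load(key string) (string, bool)
}

// tieredCache consults an in-process map before falling through to the next tier.
type tieredCache struct {
	mu    sync.RWMutex
	local map[string]string
	next  Loader
}

func (c *tieredCache) Get(key string) (string, bool) {
	c.mu.RLock()
	v, ok := c.local[key]
	c.mu.RUnlock()
	if ok {
		return v, true // in-process hit: no network round trip
	}
	v, ok = c.next.Load(key)
	if ok {
		c.mu.Lock()
		c.local[key] = v // warm the local tier for subsequent reads
		c.mu.Unlock()
	}
	return v, ok
}

// mapLoader stands in for a Redis or storage tier in this sketch.
type mapLoader map[string]string

func (m mapLoader) Load(key string) (string, bool) { v, ok := m[key]; return v, ok }

func main() {
	cache := &tieredCache{
		local: map[string]string{},
		next:  mapLoader{"rider.home.greeting|es-MX": "Hola"},
	}
	v, _ := cache.Get("rider.home.greeting|es-MX") // local miss, next tier hit
	fmt.Println(v)
	v, _ = cache.Get("rider.home.greeting|es-MX") // now served from the local tier
	fmt.Println(v)
}
```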
ML Integration
A key innovation was integrating machine learning into the translation workflow:
- Partnered with Uber's AutoML (Michelangelo) team
- Implemented automatic translation suggestion system
- Created quality prediction models for human translator assistance
- Built context-aware ranking for translation alternatives
- Developed automated quality assurance workflows
This ML integration reduced translation costs by 35% while improving quality and reducing time-to-market for new languages.
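As a rough illustration of the context-aware ranking idea, the sketch below orders candidate translations by a model quality score with a boost for a matching context. The features, weight, and scoring function are hypothetical and greatly simplified relative to the production models.

```go
// A hypothetical sketch of ranking translation alternatives by quality
// score plus a context-match boost; weights are invented for illustration.
package main

import (
	"fmt"
	"sort"
)

// Candidate is one proposed translation with a model-predicted quality score.
type Candidate struct {
	Text    string
	Quality float64 // e.g. output of a quality-prediction model, 0..1
	Context string  // e.g. "driver", "rider", "eater"
}

// rank orders candidates by score, highest first, preserving input order on ties.
func rank(cands []Candidate, wantContext string) []Candidate {
	out := append([]Candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		return score(out[i], wantContext) > score(out[j], wantContext)
	})
	return out
}

func score(c Candidate, wantContext string) float64 {
	s := c.Quality
	if c.Context == wantContext {
		s += 0.2 // illustrative boost for a matching surface context
	}
	return s
}

func main() {
	ranked := rank([]Candidate{
		{Text: "Conductor", Quality: 0.80, Context: "rider"},
		{Text: "Socio conductor", Quality: 0.75, Context: "driver"},
	}, "driver")
	fmt.Println(ranked[0].Text) // the driver-context alternative wins
}
```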
Communications Platform
Beyond Rosetta, I led the integration of ML into Uber's Communications Platform:
- Personalized message content and timing based on user behavior patterns
- Optimized channel selection (push, SMS, email) using ML models
- Developed engagement prediction to minimize notification fatigue
- Created A/B testing framework for message effectiveness
This work resulted in a 50% increase in communication engagement and significant cost savings by reducing low-value notifications.
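The channel-selection logic can be sketched as follows, assuming an engagement-prediction model behind a simple interface. The channels, threshold, and cost ordering are illustrative assumptions; the key idea is that a message is suppressed entirely when no channel clears the engagement bar.

```go
// A hedged sketch of engagement-score-driven channel selection and suppression.
package main

import "fmt"

type Channel string

const (
	Push  Channel = "push"
	SMS   Channel = "sms"
	Email Channel = "email"
	None  Channel = "none" // suppress low-value notifications entirely
)

// Predictor stands in for an ML model that scores the probability a user
// engages with a message on a given channel.
type Predictor interface {
	EngagementProbability(userID string, ch Channel) float64
}

// chooseChannel picks the cheapest channel whose predicted engagement clears
// a minimum bar, and suppresses the message when nothing does.
func chooseChannel(p Predictor, userID string, minProb float64) Channel {
	// Ordered roughly from cheapest to most expensive to send.
	for _, ch := range []Channel{Push, Email, SMS} {
		if p.EngagementProbability(userID, ch) >= minProb {
			return ch
		}
	}
	return None
}

// constPredictor is a toy model used only to exercise the sketch.
type constPredictor map[Channel]float64

func (c constPredictor) EngagementProbability(_ string, ch Channel) float64 { return c[ch] }

func main() {
	model := constPredictor{Push: 0.12, Email: 0.31, SMS: 0.45}
	fmt.Println(chooseChannel(model, "user-123", 0.25)) // email clears the bar first
}
```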
Results & Impact
The Rosetta service delivered substantial business impact:
- Enabled Uber's expansion to 70+ countries with consistent user experience
- Reduced translation costs by 35% through ML-assisted workflows
- Shortened time-to-market for new languages from weeks to days
- Maintained 99.99% availability despite handling 1M+ requests per second
- Improved translation quality through context-aware processing
For the Communications Platform integration, we achieved:
- 50% increase in engagement with critical communications
- 30% reduction in unnecessary notifications
- Significant cost savings on SMS and other communication channels
- More effective driver incentives through personalized messaging
Technical Challenges Overcome
Scale & Performance
Handling the highest throughput at Uber required innovative approaches:
- Developed custom benchmark suite to identify performance bottlenecks
- Implemented adaptive rate limiting based on system load (see the sketch after this list)
- Created specialized instrumentation to track latency at sub-millisecond resolution
- Designed gradual rollout strategy for high-risk changes
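Below is a minimal sketch of the adaptive rate-limiting idea: the per-window request budget shrinks as an externally reported load signal rises. The window mechanics, the linear scaling curve, and the load-signal source are assumptions made for illustration.

```go
// A minimal sketch of adaptive rate limiting driven by a load signal.
package main

import (
	"fmt"
	"sync"
	"time"
)

// adaptiveLimiter grants up to maxPerWindow requests per window, scaled down
// by a load factor in [0, 1] fed from monitoring.
type adaptiveLimiter struct {
	mu           sync.Mutex
	maxPerWindow int
	window       time.Duration
	windowStart  time.Time
	granted      int
	loadFactor   float64 // 0 = idle, 1 = saturated
}

// SetLoad updates the load signal; callers would feed this from metrics.
func (l *adaptiveLimiter) SetLoad(f float64) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.loadFactor = f
}

// Allow reports whether a request may proceed under the current budget.
func (l *adaptiveLimiter) Allow(now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now.Sub(l.windowStart) >= l.window {
		l.windowStart, l.granted = now, 0 // start a fresh window
	}
	// Shrink the budget linearly as load approaches saturation.
	budget := int(float64(l.maxPerWindow) * (1 - l.loadFactor))
	if l.granted >= budget {
		return false
	}
	l.granted++
	return true
}

func main() {
	l := &adaptiveLimiter{maxPerWindow: 1000, window: time.Second, windowStart: time.Now()}
	l.SetLoad(0.5) // at 50% load, only half the normal budget is granted
	allowed := 0
	for i := 0; i < 1000; i++ {
		if l.Allow(time.Now()) {
			allowed++
		}
	}
	fmt.Println("allowed:", allowed) // 500 of 1000 attempts under the 0.5 load factor
}
```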
Global Deployment
Supporting global operations required addressing regional challenges:
- Built multi-region deployment with data replication strategies
- Implemented region-specific fallback mechanisms
- Created specialized handling for right-to-left languages
- Developed support for complex pluralization rules across languages (sketched below)
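The pluralization work can be illustrated with a simplified CLDR-style category selector. The sketch below covers only English and Russian cardinals and is deliberately simplified; the real service had to handle the full per-language rule set.

```go
// A small sketch of CLDR-style plural category selection, covering only
// simplified English and Russian cardinal rules for illustration.
package main

import "fmt"

type PluralCategory string

const (
	One   PluralCategory = "one"
	Few   PluralCategory = "few"
	Many  PluralCategory = "many"
	Other PluralCategory = "other"
)

// pluralCategory returns the cardinal plural category of n for the language.
func pluralCategory(lang string, n int) PluralCategory {
	switch lang {
	case "en":
		if n == 1 {
			return One
		}
		return Other
	case "ru":
		// Simplified Russian rules: 1, 21, 31... -> one; 2-4, 22-24... -> few;
		// everything else (including 11-14) -> many.
		mod10, mod100 := n%10, n%100
		switch {
		case mod10 == 1 && mod100 != 11:
			return One
		case mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14):
			return Few
		default:
			return Many
		}
	default:
		return Other
	}
}

func main() {
	// A message key maps each category to a separately translated form.
	forms := map[PluralCategory]string{One: "%d поездка", Few: "%d поездки", Many: "%d поездок"}
	for _, n := range []int{1, 3, 11, 21} {
		fmt.Printf(forms[pluralCategory("ru", n)]+"\n", n)
	}
}
```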
My Role & Contributions
As Technical Lead for Rosetta, I:
- Led the architecture design and performance optimization
- Managed a team of backend, frontend, and ML engineers
- Coordinated with stakeholders across all Uber product lines
- Pioneered the ML integration strategy with the Michelangelo team
- Established SLAs and monitoring for the high-throughput service
- Represented the team in architecture review committees
Technologies Used
- Languages: Go, Python, Java
- Databases: MySQL, Redis, Cassandra
- Infrastructure: Uber's internal cloud platform
- ML Frameworks: Michelangelo (Uber's ML platform)
- Monitoring: M3, Grafana, internal observability tools