Location: San Francisco or Croatia
Type: Full-time
About the Role
Daytona is looking for a Senior System Engineer to help us build the next-generation distributed system for AI. You’ll be responsible for scaling Daytona to support millions, and eventually billions, of sandboxes running simultaneously.
This is a systems-level engineering role that requires deep expertise in distributed architectures, observability, and large-scale reliability. You’ll design and implement infrastructure that ensures Daytona runs smoothly at unprecedented scale.
What You'll Be Doing
- Architect and scale a distributed system capable of running billions of AI agents
- Build and optimize observability stacks to monitor and maintain system health
- Solve challenges in scalability, reliability, and fault tolerance
- Collaborate with platform and AI engineers to design a smooth, performant runtime
- Contribute to frameworks and standards for high-availability distributed systems
- Continuously improve performance and efficiency of the runtime layer
You Might Be a Fit If You
- Have extensive experience with distributed systems, high availability, and scaling challenges
- Are comfortable with containerization, scheduling, and orchestration at scale
- Excel at debugging complex systems and building observability into everything you do
- Think in terms of resilience, failure domains, and graceful degradation