Senior Reliability Engineer, DGX Cloud — Nvidia | cvGO!

Nvidia · 2 Locations · Office

### About the Role

Senior Reliability Engineer for Nvidia's DGX Cloud, ensuring high availability and fault tolerance of AI-training infrastructure.

### Responsibilities

- Design and implement high-availability architectures for DGX Cloud.
- Build automated monitoring, alerting, and self-healing sys