Principal Software Engineer, At-Scale Reliability and Fleet Intelligence - CSP Engagements - Nvidia
Nvidia · US, CA, Santa Clara · Office
### About the Role

Nvidia seeks a Principal Software Engineer to architect and implement fleet intelligence and reliability systems for GPU accelerators at cloud provider (CSP) scale.

### Responsibilities

- Design and deploy monitoring, diagnostics, and predictive analytics for GPU cluster fleet int