Site Reliability Engineer - Hardware Infrastructure — Nvidia
Nvidia · US, CA, Santa Clara · Office
### About the Role

SRE for hardware infrastructure at Nvidia, ensuring reliability and scalability of GPU clusters and AI systems in data centers.

### Responsibilities

- Design, deploy, and maintain hardware (servers, networks, storage) for high-load GPU clusters.
- Monitor hardware health, diagnose