Posted May 6, 2026

Senior Hardware Engineer, GPU & PCIe

About the position

CoreWeave is seeking a highly skilled and motivated Infrastructure/Hardware Engineer, focusing on GPU and PCIe troubleshooting, to join our Hardware Engineering team, reporting to the Hardware Engineering Manager. In this role, you will play a crucial part in the design, development, troubleshooting, and optimization of our server hardware infrastructure. You will collaborate closely with cross-functional teams, external vendors, and stakeholders to ensure the successful delivery of highly performant and reliable hardware solutions.

Responsibilities

• Troubleshoot complex GPU- and PCIe-related failures.
• Partner with external vendors on failure analysis.
• Track component RMAs.
• Develop and maintain hardware/firmware management services.
• Automate all aspects of the server hardware lifecycle.
• Serve as the senior point of contact for hardware escalation and troubleshooting.
• Collaborate with cross-functional teams to define hardware requirements, specifications, system architecture, and playbooks for issue identification and resolution.
• Create and maintain accurate documentation of hardware designs, specifications, test procedures, and results.
• Analyze and optimize the performance of hardware systems, identify bottlenecks, and propose improvements for enhanced efficiency.
• Establish processes for internal hardware testing, deployment, performance optimization, and troubleshooting.

Requirements

• 5+ years of prior experience supporting and troubleshooting data center class GPUs (H100 or newer, including InfiniBand and NVLink).
• Proficiency in Ansible/Python and experience programmatically interacting with server BMCs using IPMI or Redfish (preferably Redfish).
• Experience using, integrating, and automating data center class GPU diagnostics and troubleshooting tools, including observability platforms like Prometheus and Grafana.
• In-depth knowledge of server hardware, components, and management technologies, particularly GPUs and PCIe devices.
• Proven ability to stay updated with the latest industry technologies and trends.
• Previous experience collaborating with hardware vendors to identify novel issues, generate operational playbooks, create alerts, and drive issue resolution to completion.
• Strong passion for automation, with a commitment to automating processes comprehensively.
• Excellent documentation skills and attention to detail.
• Strong analytical and problem-solving abilities.

Benefits

• Medical, dental, and vision insurance - 100% paid for by CoreWeave
• Company-paid life insurance
• Voluntary supplemental life insurance
• Short- and long-term disability insurance
• Flexible Spending Account
• Health Savings Account
• Tuition reimbursement
• Ability to participate in the Employee Stock Purchase Program (ESPP)
• Mental wellness benefits through Spring Health
• Family-forming support provided by Carrot
• Paid parental leave
• Flexible, full-service childcare support with Kinside
• 401(k) with a generous employer match
• Flexible PTO
• Catered lunch each day in our office and data center locations
• A casual work environment
• A work culture focused on innovative disruption