Company: AMD
Location: Markham, ON, Canada
Career Level: Mid-Senior Level
Industries: Technology, Software, IT, Electronics

Description

WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences – the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. 

AMD together we advance_

THE ROLE:

The Quality Returns Debug Team is looking for a GPU PCBA Debug and Failure Analysis Sr. Engineer who will work with our engineering functions to perform board-level (PCBA) failure analysis on customer and factory failures of GPU accelerators. In this role you will reproduce reported failures, isolate the cause of failure, and work closely with cross-functional teams, including design, validation, FW, and manufacturing, to drive root cause analysis and corrective actions. Your contributions will directly impact product quality, reliability, and customer satisfaction.

 

THE PERSON:

The ideal candidate is a self-starter with a strong aptitude for hands-on technical work and a passion for solving complex problems. They thrive in dynamic environments, quickly adapting to change and embracing new challenges. With a collaborative mindset and the ability to work independently with minimal supervision, they effectively manage multiple tasks and contribute meaningfully to team goals. Their analytical thinking and experience in system integration, particularly with GPUs and High Performance Computing systems, support their ability to deliver high-quality solutions. A natural problem-solver, they bring a proactive approach to failure analysis and repair, consistently driving progress through curiosity and technical insight.

 

KEY RESPONSIBILITIES:

  • Support internal and external requests to troubleshoot PCBA-level AMD GPU product failures for continuous yield & quality improvements, and customer quality support within expected timelines.
  • Execute DOEs (designs of experiments) that run targeted tests to reproduce and isolate hard-to-find failures.
  • Develop automation and tools to run tests and analyze results and logs.
  • Perform thorough incoming visual inspection and document condition of all units submitted for analysis.
  • Perform initial triage and communicate with the contract manufacturer and/or internal AMD teams (such as Design, BIOS, firmware, memory, I/O, display, diagnostics, Test Engineering, Board operations, etc.) as needed to converge on failure reproduction efforts and root cause identification.
  • Document all findings in the FA database and create a complete failure analysis report for customer consumption as needed.
  • Continuously improve failure analysis processes and techniques, and write procedures documenting the steps to follow.
  • Oversee the set-up of new products and test stations for Failure Analysis operations.

PREFERRED EXPERIENCE:

  • Expertise in GPU architecture, including debug, validation, and stress/functional test development.
  • Skilled in using lab equipment (oscilloscopes, logic analyzers) and custom test tools.
  • Strong understanding of PCBA diagnostics, failure analysis, and debug techniques.
  • Experience with BIOS/firmware configuration and knowledge of firmware-driver-hardware interactions.
  • Proficient in Python, shell scripting, and working across Windows and Linux environments.
  • Familiarity with PCBA manufacturing processes and IPC-A-610 quality standards.
  • Hands-on experience assembling, installing, and maintaining computer systems and servers.
  • Able to read schematics, interpret datasheets, identify components, and perform soldering/rework.
  • Knowledge of high-speed digital design, memory interfaces (HBM, GDDR), PCIe, and display outputs (DP, HDMI).
  • Strong documentation skills and proficiency in MS Excel.
  • Experience with GPU data center infrastructure and AI/ML technologies is a plus.

ACADEMIC CREDENTIALS:

Bachelor's degree in electrical or computer engineering or equivalent preferred.

 

LOCATION:

Markham, ON, Canada

 

 


Benefits offered are described in AMD benefits at a glance.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

