Company: AMD
Location: Beijing, China
Career Level: Mid-Senior Level
Industries: Technology, Software, IT, Electronics

Description



WHAT YOU DO AT AMD CHANGES EVERYTHING 

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.  Together, we advance your career.  



THE ROLE

 

We are looking for a hands‑on engineer to design, implement, and optimize AI model training and inference solutions for AMD platforms. The role focuses on end‑to‑end performance and accuracy improvements at the framework, model, and operator levels, with a strong emphasis on low‑bitwidth quantization, model compression, and real‑world deployment. You will work closely with AMD hardware and software teams, support customers, and contribute to open‑source projects and inference/training frameworks.

KEY RESPONSIBILITIES

 

* Design, implement, and optimize inference and training pipelines for AMD GPUs/accelerators at the framework, model, and operator levels.

* Lead research and development of model optimization algorithms: low‑bitwidth quantization, pruning/sparsity, compression, efficient attention mechanisms, and lightweight architectures.

* Implement and tune CUDA/ROCm/Triton kernels for critical operators; profile and eliminate performance bottlenecks.

* Integrate and optimize models for PyTorch/JAX and common distributed training/inference stacks (Torchtitan, Megatron, DeepSpeed, HF Transformers, etc.).

* Reduce latency and increase throughput for large‑model inference (e.g., batching strategies, caching, speculative decoding).

* Contribute to and/or maintain open‑source inference/training tools, ensuring production readiness and community adoption.

* Provide technical support and guidance to customers and internal teams to achieve target accuracy and performance on AMD platforms.

TECHNICAL REQUIREMENTS

 

* Strong software engineering skills in Python and C/C++.

* Practical experience with PyTorch/JAX and building/extending deep learning frameworks.

* Hands‑on CUDA and/or ROCm development; experience writing or optimizing GPU kernels.

* Experience with Triton (kernel development/optimization) is highly desired.

* Proven experience with model optimization techniques, especially low‑bitwidth quantization and other compression methods.

* Familiarity with GenAI inference engines and optimizations (e.g., vLLM, SGLang, xDiT, continuous batching, speculative decoding).

* Skilled at profiling and performance debugging across stack layers (operator → model → framework → hardware).

PREFERRED QUALIFICATIONS

 

* Publications or contributions in model optimization / ML systems are a strong plus.

* Experience with distributed training/inference frameworks (e.g., Torchtitan, Megatron, DeepSpeed, HF Accelerate, vLLM, SGLang, xDiT).

* Background in post-training quantization (PTQ) and quantization-aware training (QAT) algorithms, efficient attention variants, or low-bitwidth/sparse kernels.

* Familiarity with real‑world deployment constraints and performance validation.

PERSONAL ABILITIES

 

* Excellent problem‑solving skills and comfort with low‑level performance work.

* Fast learner who keeps pace with new algorithms and hardware capabilities.

* Strong communicator, able to collaborate across software, hardware, and customer teams.

* Proactive open‑source mindset and willingness to represent the team in community projects.

If you want to push the boundaries of model performance on AMD hardware and enjoy both research and engineering at the system level, we'd like to hear from you.




Benefits offered are described in AMD benefits at a glance.

 

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law.   We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.

