NVIDIA
Senior Solutions Architect, CSP System
Company: NVIDIA
Role: Senior Solutions Architect, CSP System
Location: China
Job type: Full time
Posted: 3 hours ago
Job description
As a Senior CPU Expert focusing on Cloud Service Providers (CSPs) in China, you will be a core technical pillar in NVIDIA’s CSP team, responsible for driving CPU-related technical strategy, solution optimization, and customer engagement. You will work closely with major Chinese CSPs to address their CPU-centric needs in AI data centers, accelerate the deployment of NVIDIA’s integrated CPU-GPU-DPU platforms, and ensure optimal performance of customer workloads. This role requires deep expertise in CPU architecture, performance optimization, and data center infrastructure, as well as the ability to bridge technical requirements between CSP customers and NVIDIA’s global engineering teams.
What you'll be doing:
Work with the Sales, BD, and CPM teams to introduce NVIDIA technologies into assigned accounts and grow the business accordingly.
Serve as the primary technical authority on CPU technologies for NVIDIA’s Chinese CSP customers, providing expert consultation on CPU selection, architecture design, and integration with NVIDIA’s AI infrastructure (including Grace/Vera CPUs and NVL72 platforms).
Lead CPU-focused technical engagements with CSPs, collaborating with their R&D, infrastructure, and AI teams to understand workload requirements (e.g., AI data preprocessing, HPC, distributed computing) and design optimized CPU-GPU integrated solutions.
Drive CPU performance optimization for CSP workloads, conducting in-depth analysis of bottlenecks, implementing tuning strategies (including SIMD instruction set optimization and low-level intrinsics), and delivering reference implementations to unlock full platform potential.
Act as a liaison between CSP customers and NVIDIA’s global engineering, product, and R&D teams, advocating for customer-specific CPU requirements, providing feedback on product roadmaps, and ensuring alignment with NVIDIA’s technical strategy and export compliance guidelines.
Lead technical workshops, training sessions, and proof-of-concept (PoC) projects for CSPs, demonstrating the value of NVIDIA’s CPU-integrated solutions and enabling customer teams to effectively leverage these technologies.
Monitor industry trends in CPU technology, data center architectures, and CSP workload evolution, providing strategic insights to internal teams to enhance NVIDIA’s CPU-related products and solutions for the Chinese market.
Mentor junior technical team members, share CPU expertise, and drive best practices in CSP technical engagement and solution delivery.
What we need to see:
Bachelor’s/Master’s/PhD degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field; equivalent industry experience will be considered.
8+ years of hands-on experience in CPU architecture, performance optimization, or data center infrastructure, with a focus on high-performance computing (HPC) or AI workloads.
Deep expertise in CPU microarchitecture (e.g., x86, ARM), performance analysis tools, and optimization methodologies; proven track record of CPU benchmarking and bottleneck-driven tuning.
Strong programming proficiency in C/C++ and/or Python, with experience in low-level software optimization, compiler toolchains, or performance libraries.
Proven experience working with major Chinese CSPs or global hyperscalers, with a deep understanding of their technical requirements, infrastructure, and workload characteristics.
Excellent technical communication and presentation skills, with the ability to convey complex CPU and system-level concepts to both technical and non-technical stakeholders (customers, executives, internal teams).
Strong cross-functional collaboration skills, with the ability to work effectively in a matrixed global team and manage multiple priorities in a fast-paced environment.
Familiarity with NVIDIA’s data center products (GPUs, DPUs, CPUs) and software stacks is a significant plus; understanding of AI factory concepts and large-scale data center deployment is preferred.
Hands-on technical ability is mandatory.
Ways to stand out from the crowd:
Experience with NVIDIA Grace/Vera CPUs or other ARM-based high-performance CPUs, and hands-on experience with integrated CPU-GPU-DPU platforms.
Experience with the role of CPUs in agentic AI and post-training workloads.
Background in AI/ML workload optimization, particularly in data preprocessing, distributed training, or inference pipelines on CPU platforms.
Contributions to open-source performance tools, HPC frameworks, or CPU optimization libraries.
Experience leading technical programs or cross-functional initiatives for CSP customers, including PoC delivery and large-scale deployment support.
With competitive salaries and a generous benefits package, we are widely considered to be one of the world’s most desirable employers! We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our best-in-class engineering teams are rapidly growing. If you're a creative and autonomous person with a real passion for technology, we want to hear from you.