Job Information
Amazon Sr. Technical Program Manager, Global Data Center Operations (Central Operations) in Seattle, Washington
Description
The Central Operations team within Amazon Web Services (AWS) Infrastructure is seeking a Senior Technical Program Manager to drive the health, stability, and operational excellence of new hardware deployments across our global data center fleet. This role uniquely blends technical program management with strategic account management to ensure our GenAI and high-performance computing infrastructure delivers maximum value to customers.
As a Sr. TPM, you will be the technical advocate and strategic advisor for operational support of new AI/ML hardware platforms. You will serve as the central owner of operational health (failure rate, repair efficacy, repair dwell time, break/fix process improvement) while driving cross-functional initiatives to improve these key performance indicators. You will work at the intersection of hardware engineering, data center operations, and service teams like EC2—translating complex technical data into actionable insights and leading programs that accelerate capacity delivery while maintaining the highest standards of operational health.
This is not a sales role, but rather an opportunity to be the 'voice of the customer' and the 'voice of operations' for critical infrastructure that powers AWS's most demanding workloads. You will craft and execute strategies to optimize new hardware deployments, proactively identify and remediate stability issues, and establish best practices that scale across AWS's global infrastructure.
Key job responsibilities
Hardware Health & Stability Leadership
Own the end-to-end health and stability metrics for new AI/ML hardware platforms, establishing KPIs and routines that provide real-time visibility into operational performance
Lead deep-dive analyses of hardware failures to identify root causes and drive systematic improvements
Lead cross-functional investigations, experiments, and post-mortem processes, ensuring lessons learned translate into preventive measures and design improvements
Develop and maintain hardware health scorecards that inform leadership decisions on deployment readiness, capacity planning, and risk mitigation
Technical Program Management
Manage complex, multi-phase infrastructure projects involving hardware engineering, supply chain, data center operations, and software teams across multiple time zones
Establish and maintain program schedules, budgets, and resource plans, proactively identifying and mitigating risks to delivery timelines
Facilitate technical deep-dive sessions to troubleshoot diagnostic and repair issues, remove blockers, and accelerate project delivery
Design and implement processes that eliminate non-value-add activities and optimize deployment velocity without compromising quality
Strategic Account Management
Serve as the primary operational point of contact for new platforms across software and hardware teams, summarizing platform operational status and path-to-green
Build trusted advisor relationships with data center operations, hardware engineering, and service teams to understand their operational needs and technical challenges
Translate operational feedback and customer requirements into engineering priorities and roadmaps for hardware and process improvement
Provide strategic technical guidance on AI/ML deployment strategies, best practices, and operational procedures
Advocate for operational excellence, ensuring that hardware health considerations are integrated into capacity planning and service delivery decisions
Cross-Functional Collaboration & Influence
Partner with hardware engineering teams to influence design decisions based on operational data and field performance
Collaborate with new product introduction and hardware engineering teams to ensure quality gates are met before launch
Work with monitoring and automation teams to implement appropriate signals so that customer commitments are met
Drive alignment across diverse stakeholders including engineering, operations, finance, and executive leadership
Present technical assessments and recommendations to senior leadership, clearly articulating trade-offs, risks, and business impact
Basic Qualifications
5+ years of technical product or program management experience
7+ years of experience working directly with engineering teams
Experience in root cause analysis and error correction, identifying changes to procedures and systems to implement long-term fixes and avoid repeating issues
Experience leading process improvements
Strong written and verbal communication skills, with experience communicating with technical and non-technical audiences, including senior leadership
Preferred Qualifications
Experience in technical account management, business relationship management, or consulting
Knowledge of Six Sigma tools, Lean techniques, PMP, or similar standards
Experience with server technologies such as thermal, mechanical, power, and signal integrity
Experience managing UltraServer, high-performance computing, or AI/ML infrastructure deployments
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 148,700.00 - 201,200.00 USD annually