Job Information
Amazon Principal Systems Development Engineer, Managed Operations, Region Services Operations (RSO) in Berlin, Germany
Description
Lead the transformation of AWS's largest sovereign cloud operation while pioneering operational excellence across three European sites—tackling intrinsically hard problems that will shape the future of cloud infrastructure at unprecedented scale.
This Principal Systems Development Engineer role offers you the opportunity to drive technical, business, and cultural change within the AWS European Sovereign Cloud (ESC) Managed Operations organization, which now serves customers across Europe and beyond. You'll be embedded directly with operational teams while creating solutions that influence global AWS operations, balancing hands-on engineering with strategic leadership to reduce operational load and improve the reliability, performance, and efficiency of one of the world's largest cloud providers.
Key job responsibilities
Lead technical, business, and cultural transformation by identifying automation opportunities, designing solutions that reduce operational toil, and pioneering approaches that tackle intrinsically hard problems beyond conventional methods
Balance hands-on engineering with strategic leadership, working directly with local operational teams while collaborating with senior management to shape the strategic direction for AWS operations
Design and implement long-term engineering solutions that reduce operational load across all three European sites and influence global AWS operations standards
Analyze operational metrics across the EU MO team to identify efficiency gaps and automation opportunities that deliver measurable reductions in operational burden
Lead cross-site collaboration by conducting design reviews with peer Principal Engineers and technical leaders, building consensus on solutions that scale across multiple locations
Influence investment decisions by working with senior management to secure resources and commitment for operational excellence initiatives
Prototype and develop automation systems that have the highest impact on reducing operational burden, continuously learning and acquiring new expertise as needed
Educate and inspire operational teams by advocating for best practices, pioneering new approaches, and demonstrating what's possible in cloud operations at scale
A day in the life
You'll balance your time between operating production systems and making long-term improvements to the reliability, availability, and performance of those systems. An example week could look like this: On Monday, you analyze operational metrics across the EU MO team to identify automation opportunities. On Tuesday, you identify a major efficiency gap in operational processes and design a solution that reduces toil by 20%. On Wednesday, you lead the design review with peer PEs and technical leaders across Berlin, Dublin, and Madrid, reaching consensus on a path forward. On Thursday, you persuade senior management to invest in achieving operational excellence goals. On Friday, you begin prototyping the part of that automation system that would have the most impact on reducing operational burden across all sites.
• Requirement to participate in On-Call rotation.
• Fluency in written and spoken English is required.
• EU citizenship is required for this role.
• Successful applicants must have the legal right to work in Germany, Ireland, or Spain.
• Amazon will provide relocation support for successful applicants relocating within the EU.
Basic Qualifications
10+ years of experience in software development or a related field
Experience operating and troubleshooting reliable, scalable software systems
Proficient in at least one modern programming language such as Java, TypeScript, Python, or Ruby
Able to troubleshoot at all levels, from network to operating systems to software applications
This role requires you to be a national of an EU member state
Preferred Qualifications
• Highly proficient in operating 24x7 high-availability distributed software applications
• Proven track record of diving deep into, and finding opportunities to improve, the reliability, availability, and performance of distributed software systems
• Experience influencing and leading strategic efforts requiring work from multiple teams
• Experience actively mentoring individual engineers and managers
• Experience performance tuning software applications and optimizing fleet utilization
• Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding)
• Proficient with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar)
• Proficient with operating services in AWS
• Experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic, or similar)
• Experience scripting operating system tasks in Bash, Python, etc.
• Experience with sovereign cloud, regulated industries, or data residency and compliance requirements
• Experience driving operational excellence and efficiency improvements at scale
Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build. Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice ( https://www.amazon.jobs/en/privacy_page ) to learn more about how we collect, use, and transfer the personal data of our candidates.
m/w/d
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.