
Akamai Technologies, Inc. Senior Manager Site Reliability Engineering in Poland

Do you thrive on building reliability into AI infrastructure from the ground up?

Prepare to lead SRE initiatives for AI solutions, optimizing GPU clusters and serverless inference, while ensuring global-scale performance and reliability.

Join the Akamai AI Team!

Akamai's Cloud Technology Group delivers AI infrastructure at global scale. Our GPU compute platform provides customers with dedicated GPU resources, from single GPUs to full clusters, for training, simulation, inference, and any workload they choose to run. Site Reliability Engineering is embedded from the start to ensure production-grade reliability and performance.

Partner with the best

As Senior Manager, you will lead the team responsible for reliability across Akamai's AI compute and platform services. You will also build the team, owning hiring strategy, candidate evaluation, and interview coordination for AI SRE roles. This is a hands-on leadership role that requires partnering with product engineering teams to embed reliability into products that are moving fast.

As a Senior Manager of SRE, you will be accountable for:

  • Fostering and growing the AI SRE team by recruiting, guiding, and supporting career development, elevating SRE expertise throughout the organization.

  • Defining and implementing SRE practices for Akamai's AI compute and platform services, encompassing SLOs, error budgets, capacity planning, and fault management.

  • Ensuring operational readiness for AI products by establishing quality gates, on-call rotations, runbooks, and escalation paths for AI infrastructure failure modes.

  • Partnering with product engineering teams to embed reliability into the development lifecycle, influencing architecture and deployment decisions

  • Scaling operations through software and automation, reducing toil and driving the team toward programmatic solutions over manual intervention

  • Owning incident management integration for AI workloads, including post-incident analysis and driving systemic improvements that prevent recurrence

Do what you love

To be successful in this role you will:

  • Have extensive experience in SRE, infrastructure, or platform engineering, with expertise in leading SRE teams.

  • Have a track record of building SRE teams and practices, ideally in an environment where SRE was new or being established.

  • Demonstrate expertise in SLOs/SLIs, observability tools, and large-scale incident management while ensuring operational efficiency.

  • Demonstrate expertise with Kubernetes and containerization in large-scale environments.

  • Demonstrate expertise in Python or Go automation and tooling, while possessing knowledge of Linux systems and networking fundamentals.

  • Manage CI/CD pipelines, implement deployment safety measures, and utilize infrastructure-as-code tools like Terraform or similar alternatives.

  • Build relationships with product engineering teams while effectively communicating SRE value in terms relevant to engineering partners.

Work in a way that works for you

FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.

Learn what makes Akamai a great place to work: https://www.akamai.com/careers

Connect with us on social and see what life at Akamai is like!

We power and protect life online by solving the toughest challenges, together.

At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.

Working for you

At Akamai, we will provide you with opportunities to grow, flourish, and achieve great things. Our benefit options are designed to meet your individual needs for today and in the future. We provide benefits surrounding all aspects of your life:

  • Your health

  • Your finances

  • Your family

  • Your time at work

  • Your time pursuing other endeavors

Our benefit plan options are designed to meet your individual needs and budget, both today and in the future.

About us

Akamai powers and protects life online. Leading companies worldwide choose Akamai to build, deliver, and secure their digital experiences, helping billions of people live, work, and play every day. With the world's most distributed compute platform, from cloud to edge, we make it easy for customers to develop and run applications, while we keep experiences closer to users and threats farther away.

Join us

Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!

Akamai Technologies is an Affirmative Action, Equal Opportunity Employer that values the strength that diversity brings to the workplace. All qualified applicants will receive consideration for employment and will not be discriminated against on the basis of gender, gender identity, sexual orientation, race/ethnicity, protected veteran status, disability, or other protected group status.

If no date is displayed, applications are being accepted on an ongoing basis until the job is filled.
