Job Information
IBM Site Reliability Engineer in Guadalajara, Mexico
Introduction
At IBM Infrastructure & Technology, we design and operate the systems that keep the world running. From high-resiliency mainframes and hybrid cloud platforms to networking, automation, and site reliability, our teams ensure the performance, security, and scalability that clients and industries depend on every day. Working in Infrastructure & Technology means tackling complex challenges with curiosity and collaboration. You’ll work with diverse technologies and colleagues worldwide to deliver resilient, future-ready solutions that power innovation. With continuous learning, career growth, and a supportive culture, IBM provides the opportunities to build expertise and shape the infrastructure that drives progress.
Your role and responsibilities
We’re seeking a Junior Site Reliability Engineer to support the availability, performance, and day‑to‑day operations of our services and platforms. The engineer in this role will apply SRE best practices—automation, observability, Kubernetes, CI/CD—while developing technical depth under the guidance of senior engineers. Responsibilities include system maintenance, tooling improvements, participation in on‑call, and contributing to the reliability and scalability of services.
Key Responsibilities
Operations & Reliability
Participate in an on‑call rotation with mentorship and established runbooks
Perform operational tasks: log reviews, rollouts, restarts, configuration updates, certificate renewals
Maintain and update runbooks, dashboards, diagrams, and documentation
Monitoring & Observability
Build or update dashboards and alerts using Prometheus, Grafana, and Loki
Tune alerts to reduce noise and improve signal quality
Apply the golden signals and the RED and USE methods under guidance
Automation & Tooling
Develop automation scripts with Python, Bash, or Go to eliminate repetitive tasks
Contribute to CI/CD pipelines (linting, gates, templates)
Cloud & Platform
Support deployment and operation of workloads on Docker, Kubernetes, and OpenShift
Contribute to infrastructure changes using Terraform and Ansible, subject to review
Assist with basic cloud provisioning tasks
Networking & Security
Apply foundational networking concepts (TCP/IP, DNS, routing, HTTP, TLS) in troubleshooting
Follow least‑privilege and proper secrets‑management practices
Collaboration & Process
Participate in Agile ceremonies (standups, planning, retros)
Contribute to blameless post‑incident reviews
Collaborate with cross‑functional teams and use standard Git workflows
Required technical and professional expertise
Less than a year of experience in SRE/DevOps/Platform Engineering or related fields
Strong Linux fundamentals: CLI, processes, permissions, logs, troubleshooting
Proficiency in at least one scripting language (Python, Bash, or Go)
Experience with Git and GitHub workflows
Familiarity with Docker and Kubernetes basics
Understanding of CI/CD fundamentals
Basic networking knowledge
Advanced English proficiency is a must
Preferred technical and professional experience
OpenShift experience
Hands‑on exposure to Terraform and Ansible
Experience with Prometheus, Grafana, Loki, Thanos, or OpenTelemetry
Cloud platform fundamentals (IBM Cloud, AWS, Azure, or GCP)
Experience with JavaScript or TypeScript
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.