Job Information

Intercontinental Exchange (ICE) Senior Site Reliability Engineer in Pune, India

Job Description

Job Purpose

The Senior Site Reliability Engineer (SRE) assists with day-to-day activities supporting SRE services related to incidents. The role builds actionable alerts and automation for preventing incidents, detecting performance bottlenecks, and identifying maintenance activities.

Responsibilities

  • Employ deep troubleshooting skills to improve the availability, performance, and security of IMT Services

  • Coding and Automation of Applications on Cloud Platform

  • Implement automated tests, automated deployments, and operational tools

  • Collaborate with Product and Support teams to plan and deploy product releases

  • Work with Cloud Platform and Operations leaders to develop narratives, backlog grooming, epic planning, and overall sprint planning processes

  • Work with Engineering leadership to build shared services that meet the requirements and needs of the platform and application teams

  • Ensure services are designed with 24/7 availability and operational readiness and rigor

  • Implementation of proactive monitoring, alerting, trend analysis and self-healing systems

  • Define non-functional requirements as part of the product lifecycle to influence the new designs, standards, and methods for scalable, highly available distributed systems

  • Contribute to product development / engineering as needed to ensure Quality of Service of Highly Available services

  • Identify, evaluate, and execute preventive measures to minimize or avoid impact to the customer experience (proactive vs. customer-escalated)

  • Resolution of product/service defects or design changes, infrastructure changes, or operational changes

  • Partner with other SREs and lead by example: be a contributor more than a delegator

Knowledge and Experience

  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience

  • 7+ years of Systems/Applications automation in 24x7 Production support services environments

  • Fluency with one or more current-generation scripting languages (Python/Shell/Perl/PHP/Ruby) and/or Java development and/or .NET

  • Excellent troubleshooting skills, utilizing a systematic problem-solving approach

  • Demonstrated experience in designing, analysing, and diagnosing large-scale distributed systems, plus Windows Server and/or Linux systems internals (system libraries, file systems, client-server protocols)

  • Experience with elastic scalability, fault tolerance, and other cloud architecture patterns

  • Experience operating on AWS (both PaaS and IaaS offerings)

  • Experience in both Windows (2k8R2+) and Linux

  • Experience with Continuous Integration and Continuous Delivery concepts

  • Hands-on experience with Infrastructure as Code tools like Terraform, CloudFormation and/or Chef, Salt Stack, Ansible, Puppet

  • Good to have: experience with containerization technologies like Docker

  • Proven strength in SaaS services, experience in massive scale web operations