Job Information
BlackRock Vice President, DevOps Engineer, Lead Engineer in Mumbai, India
About this role
Team Overview:
Data is at the core of the Aladdin platform, and increasingly, our ability to consume, store, analyze, and gain insight from data is a key component of our competitive advantage. The Data Engineering team is responsible for the data ecosystem within BlackRock. We engineer high performance data pipelines, provide a fabric to discover and consume data, and continually evolve our data storage capabilities. We believe in writing small, testable code with a focus on innovation. We are committed to open source, and we regularly contribute our work back to the community.
We are seeking top-tier Cloud Native DevOps Platform Engineers to augment our Enterprise Data Platform team. Our objective is to extend our data lifecycle management practices to include structured, semi-structured, and unstructured data. This role requires a breadth of individual technical capabilities and competencies; most important, though, is a willingness and openness to learn new things across multiple technology disciplines. This role is for practitioners, not researchers.
Position Summary:
As a Data Platform Cloud/DevOps Engineer in the Data Engineering team, you will design, build, and maintain the cloud-native infrastructure that powers Aladdin's Enterprise Data Platform. You will enable data engineers, AI engineers, and application developers by providing scalable, reliable, and cost-efficient infrastructure for data processing, AI/ML workloads, and analytics services.
Key Responsibilities:
Infrastructure and Cloud Engineering
Design, deploy, and manage cloud-native infrastructure across AWS, Azure, and private clouds
Implement Infrastructure as Code (IaC) using Terraform, Ansible, and CloudFormation for repeatable, auditable deployments
Manage Kubernetes clusters for scalable, reliable, and secure application and data workloads
Deploy and configure service mesh, HashiCorp Vault, cert-manager, and other Kubernetes-native frameworks
Design and implement network architectures including VPCs, load balancers, and ingress/egress controls
Deploy and configure LLM serving platforms such as MCP/agent orchestrators, chatbots, vector embedding services, and secured API gateways for generative AI applications
CI/CD and Automation
Build and maintain CI/CD pipelines using ArgoCD, Azure DevOps, Jenkins, and GitHub Actions
Implement GitOps workflows for automated, auditable infrastructure and application deployments
Automate repetitive operational tasks using Python and Bash to improve team efficiency and reduce manual errors
Develop self-service infrastructure provisioning capabilities for engineering teams
Maintain version control best practices and collaborative development workflows
Build and maintain MLOps CI/CD pipelines for automated model deployment to production environments
Site Reliability Engineering (SRE)
Implement monitoring, logging, and observability solutions using Prometheus, Grafana, ELK Stack, and Datadog
Define and track Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets for data platform services
Build automated alerting systems to proactively detect infrastructure issues and performance degradation
Perform capacity planning and performance tuning for production infrastructure
Conduct reliability analysis and implement preventive measures to improve system uptime
Collaborate with operational teams on incident escalation and system reliability improvements
Implement chaos engineering practices to test infrastructure resilience and fault tolerance
Cloud Cost Optimization and FinOps
Monitor and optimize cloud infrastructure costs across AWS, Azure, and private cloud environments
Right-size compute, storage, and networking resources based on utilization metrics and cost-performance analysis
Develop cost dashboards and reports to provide visibility into infrastructure spending trends
Collaborate with finance and engineering teams on cloud budget planning and forecasting
Evaluate and recommend cost-effective architectural alternatives (e.g., spot instances, reserved capacity, serverless options)
Desired Skills
Cloud and Infrastructure
Expert-level experience with AWS, Azure, or GCP cloud platforms and services
Proficiency with Infrastructure as Code tools (Terraform, Ansible, CloudFormation)
Templating with Helm, ArgoCD, Ansible, and Terraform
Deep knowledge of Kubernetes (K8s) APIs, controllers, operators, and stateful workloads
Understanding of the K8s Operator Pattern, with the comfort and courage to wade into (predominantly Go-based) operator implementation codebases
Comfortable building atop K8s-native frameworks including service mesh (Istio), secrets management (cert-manager, HashiCorp Vault), log management (Splunk), and observability (Prometheus, Grafana, Alertmanager)
CI/CD and Automation
Hands-on experience with CI/CD platforms (ArgoCD, Azure DevOps, Jenkins, GitHub Actions)
Proficiency in scripting languages (Python, Bash) for automation and infrastructure tooling
Experience implementing GitOps principles and workflows
Version control expertise (Git, branching strategies, collaborative development)
Site Reliability Engineering (SRE)
Experience implementing monitoring and observability solutions (Prometheus, Grafana, ELK Stack, Datadog)
Knowledge of SRE principles including SLOs, SLIs, error budgets, and reliability engineering
Experience implementing and operating telemetry-based monitoring, alerting, and incident response systems
Performance tuning and capacity planning experience for production systems
Experience with chaos engineering and reliability testing
FinOps and Cost Management
Experience with cloud cost optimization strategies and FinOps practices
Ability to analyze infrastructure costs and identify optimization opportunities
Knowledge of cost allocation, tagging strategies, and chargeback mechanisms
Familiarity with cloud cost management tools (AWS Cost Explorer, Azure Cost Management, CloudHealth)
Nice to have skills
Certifications: AWS Certified DevOps Engineer, CKA (Certified Kubernetes Administrator), HashiCorp Terraform Associate, FinOps Certified Practitioner
Experience with AI/ML infrastructure platforms (Kubeflow, MLflow, Ray, model serving frameworks)
Understanding of natural language processing (NLP) and large language models (LLMs)
Experience with basic prompt engineering, LLM fine-tuning, and chatbot implementations using modern Python SDKs such as LangChain and/or Transformers
Familiarity with policy-as-code tools (Open Policy Agent, Kyverno)
Experience with multi-cloud and hybrid cloud architectures
We are looking for candidates with 8-12 years of hands-on experience in Data Platform DevOps/Cloud or related Engineering practices.
Our benefits
To help you stay energized, engaged and inspired, we offer a wide range of benefits including a strong retirement plan, tuition reimbursement, comprehensive healthcare, support for working parents and Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about.
Our hybrid work model
BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all. Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week. Some business groups may require more time in the office due to their roles and responsibilities. We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation. As a new joiner, you can count on this hybrid model to accelerate your learning and onboarding experience here at BlackRock.
About BlackRock
At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being. Our clients, and the people they serve, are saving for retirement, paying for their children’s educations, buying homes and starting businesses. Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress.
This mission would not be possible without our smartest investment – the one we make in our employees. It’s why we’re dedicated to creating an environment where our colleagues feel welcomed, valued and supported with networks, benefits and development opportunities to help them thrive.
For additional information on BlackRock, please visit http://careers.blackrock.com/ | Twitter: @blackrock (https://twitter.com/blackrock) | LinkedIn: www.linkedin.com/company/blackrock
BlackRock is proud to be an Equal Opportunity Employer. We evaluate qualified applicants without regard to age, disability, family status, gender identity, race, religion, sex, sexual orientation and other protected attributes at law.