Job Information
Insight Global Data Operations Engineer in Cupertino, California
Job Description
As a Data Operations Engineer, you will collaborate with infrastructure, platform, product, engineering, and data science teams to identify requirements that drive the creation of sensible operations for building and managing data analytics foundations. The ideal candidate is a self-motivated teammate with strong problem-solving and communication skills, the ability to adapt and learn quickly, deliver results with limited direction, and make sound operational decisions.
KEY RESPONSIBILITIES
• Lead day-to-day operations to ensure organizational delivery of quality results.
• Provision, enable, scale, and maintain our team’s data, analytics, and ML infrastructure for batch and real-time systems, including pipelines, frameworks, tools, and services in a hybrid cloud.
• Shepherd a zero-downtime deployment process through continuous delivery practices, rapidly releasing features that provide critical, faster insights to business users.
• Collaborate with the platform team to build the right tools for observability, monitoring, alerting, and self-healing to support day-to-day management of analytics foundations.
• Debug complex problems in distributed environments, run production incidents efficiently, and follow up with post-incident reviews.
• Develop self-service tools and automation to improve engineering efficiency and service quality.
• Excellent verbal and written communication skills.
• Experience implementing security controls, governance processes, compliance validation, and infrastructure cost analysis and optimization.
• Knowledge of the analytics and applied ML stack, including Apache Spark, Trino/Pinot, Iceberg, Atlas, Flink, Airflow/Luigi, Tableau, Snowflake, Databricks, MLflow, data catalogs, Jupyter Notebooks, vector databases, and Cassandra.
• Experience scaling operations in fast-paced, dynamic environments and working in agile or evolving product settings.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
• Self-starter with a forward-thinking mindset, strong execution track record, and accountability for business priorities.
• Hands-on experience with CI/CD pipelines and cloud environments (e.g., GitLab, Spinnaker, Docker, Kubernetes).
• Experience with observability and monitoring tools such as Splunk, Grafana, or Datadog.
• Proficiency in one or more programming languages (Python, Go, Rust, Java, or Scala) for automation and API integrations.
• Solid understanding of Infrastructure as Code (IaC) tools and practices (e.g., Terraform).
• Knowledge of analytics and applied ML ecosystems, including technologies such as Spark, Flink, Airflow/Luigi, Snowflake, Databricks, MLflow, Tableau, data catalogs, and distributed databases (e.g., Cassandra, vector databases).
• Experience operating and scaling systems in distributed, production-grade environments.