Job Information

IBM Spark Product Manager in San Jose, California

Introduction

At IBM Software, we transform client challenges into solutions, building the world’s leading AI-powered, cloud-native products that shape the future of business and society. Our legacy of innovation creates endless opportunities for IBMers to learn, grow, and make an impact on a global scale. Working in Software means joining a team fueled by curiosity and collaboration. You’ll work with diverse technologies, partners, and industries to design, develop, and deliver solutions that power digital transformation. With a culture that values innovation, growth, and continuous learning, IBM Software places you at the heart of IBM’s product and technology landscape. Here, you’ll have the tools and opportunities to advance your career while creating software that changes the world.

We are looking for a Product Manager to drive critical work across our Apache Spark engine, spanning Spark SQL, Structured Streaming, the Catalyst optimizer, and lakehouse integrations. This role requires someone who understands the engine well enough to drive real improvements, not just prioritize tickets.

You will work closely with Spark platform engineering, data engineering, and ML infrastructure teams. Diagnosing slow stages and reading query plans are part of the job.

Your role and responsibilities

As a Technical Product Manager, you will drive product strategy, leveraging your deep domain expertise to identify target markets and opportunities. Your primary responsibilities will include:

Roadmap: Drive the Spark engine roadmap across the Catalyst optimizer, adaptive query execution (AQE), dynamic partition pruning, Spark Connect, and native acceleration.

Lakehouse Integration: Define the strategy for Spark on Delta Lake, Apache Iceberg, and Apache Hudi, covering read/write optimization, column statistics, Z-order clustering, and snapshot isolation.
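For context on the Z-order clustering mentioned above: the core idea is to interleave the bits of several column values into a single sort key, so rows with nearby values in any clustered column land in the same files and per-file min/max statistics can prune reads. A minimal plain-Python sketch of that idea (illustrative only, not a Spark or table-format API):

```python
def z_order_key(x, y, bits=8):
    """Interleave the bits of two integer column values into a Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bit i of x goes to even position
        key |= ((y >> i) & 1) << (2 * i + 1)  # bit i of y goes to odd position
    return key

# Rows sorted by this key cluster nearby (x, y) pairs together, so a range
# predicate on either column touches fewer files.
rows = [(3, 5), (0, 0), (7, 7), (1, 2)]
clustered = sorted(rows, key=lambda r: z_order_key(*r))
```

Real implementations handle arbitrary types and column counts, but the bit-interleaving principle is the same.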

Structured Streaming: Drive investments in watermarking, stateful processing at scale, and latency/cost tradeoffs.
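The watermarking tradeoff above follows the semantics documented for Structured Streaming: the watermark trails the maximum observed event time by an allowed-lateness threshold, and records older than the watermark can be dropped from stateful aggregations. A small plain-Python sketch of that rule (not the Spark API itself):

```python
from datetime import datetime, timedelta

def advance_watermark(max_event_time, delay_threshold):
    """Watermark = max observed event time minus the allowed lateness."""
    return max_event_time - delay_threshold

# Events arrive out of order; the engine tracks the max event time seen so far.
events = [
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 10),
    datetime(2024, 1, 1, 12, 5),
]
watermark = advance_watermark(max(events), timedelta(minutes=10))

# A record with event time older than the watermark is too late to update state.
late_record = datetime(2024, 1, 1, 11, 55)
is_dropped = late_record < watermark
```

A larger delay threshold tolerates more out-of-order data but holds state longer, which is exactly the latency/cost tradeoff named above.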

Cost & Efficiency: Partner with infrastructure on shuffle optimization, spill reduction, spot-instance resilience, and dynamic resource allocation. Track cost-per-query and job success SLOs.
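As a rough illustration of the cost-per-query and job-success SLO metrics named above (the formulas and figures here are hypothetical examples, not IBM's definitions):

```python
def cost_per_query(cluster_hours, hourly_rate, queries_completed):
    """Blended cost-per-query: total compute spend over successfully completed queries."""
    return (cluster_hours * hourly_rate) / queries_completed

def job_success_rate(succeeded, total):
    """Fraction of jobs finishing successfully, tracked against an SLO target."""
    return succeeded / total

# Hypothetical numbers for a month of cluster usage.
spend = cost_per_query(cluster_hours=400, hourly_rate=3.50, queries_completed=20_000)
slo = job_success_rate(succeeded=9_940, total=10_000)
```

Tracking these per workload (rather than only in aggregate) is what makes shuffle or spill regressions visible as cost movements.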

Cross-Functional Alignment: Align data engineers, platform SREs, and ML teams around shared engine investments.

Required technical and professional expertise

Domain Experience: 10+ years of experience in the data infrastructure ecosystem, which may include roles in product management, solutions engineering, sales engineering, developer advocacy, or technical architecture within query engines, lakehouse platforms, or distributed data systems.

Product Thinking: Demonstrated ability to translate customer needs, system constraints, and performance insights into clear product requirements and roadmap priorities.

Open-Source & SQL: Experience with open-source data engines and SQL-based platforms, and familiarity with open-source community dynamics, upstream contribution, and fork management.

Spark Internals: Working knowledge of Spark internals: DAG scheduler, Catalyst, Tungsten execution, exchange protocols. Able to read a query plan and diagnose a skewed shuffle.
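To make the "diagnose a skewed shuffle" expectation concrete: skew typically shows up as one straggler partition dominating a stage, and a common mitigation is salting the hot key before the join. A plain-Python sketch of both ideas (function names are illustrative, not Spark APIs):

```python
import random

def skew_ratio(partition_sizes):
    """Largest shuffle partition over the median: a common skew signal,
    visible in the Spark UI as one straggler task in an otherwise fast stage."""
    sizes = sorted(partition_sizes)
    median = sizes[len(sizes) // 2]
    return max(sizes) / max(median, 1)

def salt_key(key, num_salts):
    """Spread a hot key across num_salts sub-partitions by appending a random salt;
    the other side of the join is replicated across the same salt range."""
    return (key, random.randrange(num_salts))

# One hot key dominates the shuffle: a classic skewed join.
partition_rows = [1_000, 1_200, 900, 950_000, 1_100]
ratio = skew_ratio(partition_rows)  # far above ~2x, worth salting or AQE skew handling
```

In practice AQE's skewed-join handling can split oversized partitions automatically, but recognizing the pattern in a query plan is still the first step.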

JVM & Performance: Familiarity with JVM performance: GC pressure, off-heap memory, and the tradeoffs of native acceleration.

Lakehouse Experience: Experience with Delta Lake, Apache Iceberg, or Apache Hudi, and a solid understanding of lakehouse architecture, including its implications for Spark read/write paths, query planning, and table format interoperability.

Quantitative Instincts: Comfortable with SQL, query plan interpretation, and translating benchmark data into product decisions.

Preferred technical and professional experience

Spark Community: Committer or contributor history on Apache Spark, or experience with a major Spark distribution (Databricks Runtime, EMR, Dataproc).

Spark Connect: Experience with Spark Connect or decoupled client/server Spark architectures.

ML Feature Engineering: Background in ML feature pipelines and the intersection of Spark with training data preparation.

Observability: Experience defining instrumentation strategy for distributed job platforms.

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.