Job Information
IBM Senior Engineer - Data, Schema & Knowledge Systems in Lowell, Massachusetts
Introduction
At IBM Software, we transform client challenges into solutions, building the world’s leading AI-powered, cloud-native products that shape the future of business and society. Our legacy of innovation creates endless opportunities for IBMers to learn, grow, and make an impact on a global scale. Working in Software means joining a team fueled by curiosity and collaboration. You’ll work with diverse technologies, partners, and industries to design, develop, and deliver solutions that power digital transformation. With a culture that values innovation, growth, and continuous learning, IBM Software places you at the heart of IBM’s product and technology landscape. Here, you’ll have the tools and opportunities to advance your career while creating software that changes the world.
Your role and responsibilities
We are seeking a Senior Software Engineer to own and evolve core platform systems spanning knowledge ingestion, memory architecture, evaluation infrastructure, and gateway data management. The role carries deep architectural ownership across Rust- and Go-based services and directly shapes search quality, model evaluation, compliance, and platform extensibility.
What You’ll Own
Knowledge Base & Memory (Rust)
Design and evolve the data schema and ingestion pipeline supporting large-scale documentation corpora, including document extraction, segmentation, and hybrid search (BM25 + vector; see the fusion sketch after this list).
Improve corpus quality through deduplication, relevance tuning, quality scoring, source-of-truth tracking, and versioned corpus management.
Own the memory architecture across working, semantic, and observational memory tiers, designing retrieval that is context-aware and token-budget-conscious.
Evolve federated search capabilities, including multi-KB querying, relevance tuning, embedding model selection, and quality metrics.
Build and scale an evaluation curation system for an LLM-as-judge framework, including versioned eval datasets, regression baselines, and authoring tooling.
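To make the hybrid-search item concrete, here is a minimal sketch of one common way to fuse BM25 and vector rankings: reciprocal rank fusion (RRF). It is illustrative only; the knowledge-base services themselves are Rust (Go is used here to match the other sketches in this posting), the function names are hypothetical, and k = 60 is the constant from the original RRF paper, not a value taken from this codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges two ranked doc-ID lists (best first) with Reciprocal
// Rank Fusion: score(d) = sum over lists of 1/(k + rank). The constant k
// damps the top ranks so neither retriever dominates the fused order.
func rrfFuse(bm25, vector []string, k float64) []string {
	scores := map[string]float64{}
	for rank, id := range bm25 {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	for rank, id := range vector {
		scores[id] += 1.0 / (k + float64(rank+1))
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	bm25 := []string{"doc-a", "doc-b", "doc-c"}   // lexical ranking
	vector := []string{"doc-c", "doc-a", "doc-d"} // embedding ranking
	fmt.Println(rrfFuse(bm25, vector, 60))        // [doc-a doc-c doc-b doc-d]
}
```

Rank-based fusion like this sidesteps score normalization between BM25 and cosine similarity, which is one reason it is a common starting point before relevance tuning.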
Gateway Data Management (Go / Rust)
Design and implement a schema-driven entity registry with YAML-defined schemas, enabling new infrastructure connectors without code changes (see the schema sketch after this list).
Own declarative state machine configuration decoupled from hardcoded logic.
Design a domain-agnostic evidence model to support audit and compliance requirements (e.g., PCI-DSS, SOX).
Formalize metadata and provenance tracking across entities, including import/export and multi-connector support.
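For illustration, a schema-driven registry of this kind might load entity definitions like the YAML below at startup, so a new connector ships as configuration rather than code. Every name here (kind, fields, states) is hypothetical; the real schema language is internal to the platform. The sketch uses gopkg.in/yaml.v3.

```go
package main

import (
	"fmt"

	"gopkg.in/yaml.v3"
)

// FieldSpec and EntitySchema are hypothetical shapes for a YAML-defined
// entity type; they stand in for whatever the registry actually uses.
type FieldSpec struct {
	Name     string `yaml:"name"`
	Type     string `yaml:"type"` // e.g. string, int, timestamp
	Required bool   `yaml:"required"`
}

type EntitySchema struct {
	Kind   string      `yaml:"kind"`
	States []string    `yaml:"states"` // declarative state machine states
	Fields []FieldSpec `yaml:"fields"`
}

const def = `
kind: storage-volume
states: [provisioning, active, retired]
fields:
  - {name: id, type: string, required: true}
  - {name: capacity_gb, type: int, required: true}
`

func main() {
	var s EntitySchema
	if err := yaml.Unmarshal([]byte(def), &s); err != nil {
		panic(err)
	}
	// A registry keyed by kind: adding an entity type means adding YAML.
	registry := map[string]EntitySchema{s.Kind: s}
	fmt.Printf("%+v\n", registry["storage-volume"])
}
```

Declaring the states alongside the fields is one way the state machine configuration stays in config rather than hardcoded logic, per the item above.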
Evaluation Infrastructure (Go)
Extend evaluation frameworks for end-to-end coverage across composable pipelines.
Design eval schemas, dataset management tooling, and regression thresholds (see the sketch after this list).
Partner with other teams on shared benchmarks, test corpora, and multi-model evaluation strategy.
Track and report model quality metrics to support production deployment decisions.
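As a sketch of what a regression baseline check might look like (the metric names and thresholds below are made up for illustration, not the team's actual rubric):

```go
package main

import "fmt"

// regressions compares a run's metric scores against a stored baseline
// and flags any metric that drops by more than its allowed threshold,
// or that is missing from the current run entirely.
func regressions(baseline, current, threshold map[string]float64) []string {
	var failed []string
	for metric, base := range baseline {
		cur, ok := current[metric]
		if !ok || base-cur > threshold[metric] {
			failed = append(failed, metric)
		}
	}
	return failed
}

func main() {
	baseline := map[string]float64{"answer_relevance": 0.82, "faithfulness": 0.91}
	current := map[string]float64{"answer_relevance": 0.84, "faithfulness": 0.86}
	threshold := map[string]float64{"answer_relevance": 0.02, "faithfulness": 0.02}
	// faithfulness dropped 0.05 against a 0.02 threshold, so this run fails.
	fmt.Println(regressions(baseline, current, threshold)) // [faithfulness]
}
```

A gate like this is one way tracked metrics feed production deployment decisions: a run that regresses past its threshold blocks promotion.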
What the First 90 Days Look Like
Month 1: Onboard onto the Rust and Go services. Understand the knowledge base ingestion pipeline end-to-end: document download, extraction, chapter splitting, indexing, federated search. Run the eval framework, review existing eval cases, understand the LLM-as-judge scoring rubric. Identify quality issues in the current documentation corpus.
Month 2: Ship the entity registry refactor — dynamic entity registration with YAML schema definitions. Design the eval curation system — dataset versioning, case authoring tooling, regression baseline management. Begin expanding eval corpus coverage.
Month 3: Ship the evidence model schema. Implement eval curation tooling. Begin knowledge base quality improvements: deduplication, source-of-truth tracking, relevance tuning. Establish an eval quality dashboard with cross-model comparison.
Required technical and professional expertise
· Data modeling instincts. You think naturally about schemas, entity relationships, state machines, and how data evolves over time. You’ve designed data models that other engineers build against.
· Information retrieval or search experience. You’ve worked with search indexing, document processing, corpus management, or similar — you understand how to make unstructured data findable and useful.
· You can ship across languages. This role works in both Rust and Go. You don’t need to be an expert in both, but you need to be productive in at least one and willing to learn the other.
· Quality measurement mindset. You’ve built or worked with evaluation systems, quality metrics, regression detection, or A/B testing infrastructure. You understand how to measure whether something is getting better or worse.
Preferred technical and professional experience
You don’t need all of these coming in. The team will bring you up to speed:
· IBM Z domain knowledge — the documentation sets, infrastructure concepts, and operational patterns that the knowledge base serves
· LLM evaluation methodology — rubric-based scoring, LLM-as-judge patterns, baseline regression, multi-model comparison
· Our knowledge base ingestion pipeline (document extraction, chunking, vector + full-text indexing)
IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.