Job Information
Stanford University Storage Architect / Senior Storage Systems Administrator (Research & HPC Data Platforms) in Stanford, California
Business Affairs: University IT (UIT), Stanford, California, United States
Information Technology Services
Post Date 13 hours ago
Requisition # 108795
Reports to: Research & HPC Data Platforms Team Lead
Please note: Visa Sponsorship is not provided for this position.
The Opportunity
Stanford’s Research Computing team is seeking a storage expert to join our Research & HPC Data Platforms group. This is a flexible-level posting; we are seeking either a Storage Architect or a Senior Storage Systems Administrator to maintain and expand our current world-class infrastructure.
You will work directly with our research and HPC data platforms team lead to manage a diverse environment of more than 100PB and 5 billion files, including high-speed Lustre, MinIO object storage, and Lustre HSM, among other platforms.
Why Stanford? You aren't just managing a storage cluster; you are part of a data and storage ecosystem that supports Nobel-caliber research across all disciplines.
Storage Architect
Key Responsibilities
Architecture: Adapt and evolve the technology designs of existing systems to meet the needs of future computing platforms and research aims.
Platform Management: Deliver on the scaling, reliability, security, compliance, operations, and lifecycle management of our primary research computing storage platforms, including those that hold high-risk data.
Tiered Storage Architecture: Oversee the integration of Lustre HSM on the Elm platform, managing data movement policies between parallel filesystems and MinIO object storage.
Performance Engineering: Tune I/O for large-scale High Performance Computing and AI workloads.
Community Stewardship: Represent Stanford within the Lustre community and other key community groups, contributing to the upstream roadmap and maintaining a vendor-neutral storage strategy.
Required Qualifications
Education: Bachelor’s degree and eight years of relevant experience, or a combination of education and relevant experience.
Expertise at Scale: 8+ years of hands-on experience architecting, building, and managing Lustre and ZFS or similar filesystems at the 20PB+ scale.
Object Storage & HSM: Deep technical fluency in MinIO and Lustre HSM (copytools, policy engines like RobinHood) or similar tools.
Kernel & Network Mastery: Expert-level knowledge of the Linux kernel and large-scale InfiniBand/Ethernet fabric tuning.
In-depth Troubleshooting Experience: Must be capable of leading the debugging of issues such as kernel panics, LNet congestion, and metadata bottlenecks.
Leadership: Proven experience mentoring junior admins and leading large-scale migration projects without data loss.
Communication: Strong written and verbal communication skills.
Senior Storage Systems Administrator
Key Responsibilities
Platform Management: Contribute to the scaling, reliability, security, compliance, operations, and lifecycle management of our primary research computing storage platforms, including those that hold high-risk data.
Operational Excellence: Perform complex filesystem upgrades, kernel patches, and hardware refreshes with minimal downtime.
Monitoring & Telemetry: In collaboration with others, build and maintain sophisticated observability stacks for real-time I/O tracking and trend analysis.
User Support: Act as an escalation point for researchers struggling with complex I/O patterns, job failures, or data access issues.
Maintenance: Manage the physical and logical health of the storage fleet, including RMA processes, firmware updates, and disk replacement cycles.
Required Qualifications
Experience: 5+ years of Linux Systems Administration, with 3+ years specifically in an HPC or large-scale data environment.
Technical Stack: Strong hands-on experience with Lustre, ZFS, MinIO, and/or similar technologies.
Scripting: Advanced proficiency in scripting languages for automating routine storage tasks and parsing system logs.
Hardware Mastery: Comfortable with the physical aspects of the role—diagnosing hardware failures and understanding power/cooling requirements for high-density storage.
Communication: Strong written and verbal communication skills.
Physical Requirements*:
Constantly perform desk-based computer tasks.
Frequently sit, grasp lightly/fine manipulation.
Occasionally stand/walk, write by hand.
Rarely use a telephone, lift/carry/push/pull objects that weigh up to 10 pounds.
* Consistent with its obligations under the law, the University will provide reasonable accommodations to applicants and employees with disabilities. Applicants requiring a reasonable accommodation for any part of the application or hiring process should contact Stanford University Human Resources by submitting a contact form.
Working Conditions:
May work extended hours, evenings, and weekends.
Work Standards:
Interpersonal Skills: Demonstrates the ability to work well with Stanford colleagues and clients and with external organizations.
Promote Culture of Safety: Demonstrates commitment to personal responsibility and value for safety; communicates safety concerns; uses and promotes safe behaviors based on training and lessons learned.
Subject to and expected to comply with all applicable University policies and procedures, including but not limited to the personnel policies and other policies found in Stanford's Administrative Guide, http://adminguide.stanford.edu.
The expected pay range for this position is $150,289 to $171,674 per annum.
Stanford University provides pay ranges representing its good faith estimate of the salary or hourly wage the university reasonably expects to pay for a position upon hire. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, internal equity, geographic location and external market pay for comparable jobs.
At Stanford University, base pay represents only one aspect of the comprehensive rewards package. The Cardinal at Work website (https://cardinalatwork.stanford.edu/benefits-rewards) provides detailed information on Stanford’s extensive range of benefits and rewards offered to employees. Specifics about the rewards package for this position may be discussed during the hiring process.
The job duties listed are typical examples of work performed by positions in this job classification and are not designed to contain or be interpreted as a comprehensive inventory of all duties, tasks, and responsibilities. Specific duties and responsibilities may vary depending on department or program needs without changing the general nature and scope of the job or level of responsibility. Employees may also perform other duties as assigned.
Stanford is an equal employment opportunity and affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law.
Additional Information
Schedule: Full-time
Job Code: 4833
Employee Status: Regular
Grade: K
Requisition ID: 108795
Work Arrangement: Hybrid Eligible, On Site