XCEL ENGINEERING INC Senior HPC Linux Systems Engineer in Oak Ridge, Tennessee

COMPANY OVERVIEW

XCEL Engineering, Inc. is an award-winning small business that provides trusted information technology, engineering, consulting and project management solutions and services to federal agencies and organizations. Originally founded in 1971 by professional engineers at the University of Tennessee, XCEL was acquired in 2003 by U.S. Army and Navy veterans and in 2023 became a MartinFed company.

XCEL Engineering is a part of IT Lab Partners (ITLP), which was created to support a leading research facility in the East Tennessee region in recruiting the best and brightest technical talent. Consider joining our impressive team today!

JOB OVERVIEW

XCEL Engineering is seeking a qualified applicant for a Senior HPC Linux Systems Engineer position with the National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL). The NCCS, which hosts several of the world's most powerful computer systems, is seeking a highly qualified individual to play a key role in improving the security, performance, and reliability of its computing environments. This includes supporting one of the fastest supercomputers in the world, Frontier, along with numerous commodity clusters and specialized programs and partnerships. Frontier is one of the scientific research community's most powerful computational instruments for exploring solutions to some of today's most challenging problems.

ESSENTIAL FUNCTIONS

  • Install, integrate, and administer HPC Linux clusters and high-speed networks
  • Diagnose system operational problems quickly and effectively
  • Coordinate with vendors to resolve hardware and software problems
  • Recommend, plan, and coordinate hardware and software changes with customer participation using change management processes
  • Port and write system management tools
  • Document system administration procedures for routine and complex tasks
  • Participate in a 24-hour, 7-day on-call support rotation and off-hours maintenance windows
  • Implement and integrate systems into the NCCS environment and support systems performance
  • Lead system deployment, integration, and troubleshooting of a large-scale computer
  • Engage on relevant systems topics with the internal and external community of peers, contributing experiences and solutions
  • Mentor junior-level staff as they join the
  • Deliver ORNL's mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service.

BASIC QUALIFICATIONS

  • Bachelor's Degree in a scientific or technical field
  • 8+ years of Linux systems experience is required
  • An equivalent combination of education and experience will be considered

DESIRED QUALIFICATIONS

  • Experience managing Linux operating systems in a large-scale system environment
  • Solid understanding of networked computing environment concepts
  • Experience with Linux Cluster Administration
  • Ability to develop and maintain programs and scripts that aid in the operation and automation of administrative tasks using various shell and scripting languages (bash, Python, Go)
  • Experience with Lustre and GPFS file systems
  • Experience with batch schedulers (particularly SLURM)
  • Experience deploying and maintaining automated configuration management software such as Puppet
  • Strong interpersonal and commu
