Job Information
SciTec MCS Site Reliability Engineer in Aurora, Colorado
Responsibilities
- Support the availability, reliability, and performance of IaaS services that underpin mission systems
- Monitor infrastructure health using metrics, logs, and alerts; respond to and resolve incidents
- Perform root-cause analysis for infrastructure and service outages; implement corrective and preventative actions
- Improve system reliability through automation, standardization, and proactive engineering
- Support capacity planning, performance analysis, and scaling of infrastructure services
- Maintain and enhance monitoring, logging, and alerting solutions
- Participate in incident response, on-call rotations (as required), and post-incident reviews
- Collaborate with network, systems, platform, and application teams to resolve cross-stack issues
- Support infrastructure lifecycle activities including upgrades, patches, and configuration changes
- Apply security best practices and support compliance requirements in a regulated environment
- Develop and maintain runbooks, procedures, and operational documentation
- Contribute to CI/CD and Infrastructure-as-Code workflows supporting IaaS services
- Participate in Agile ceremonies and operational planning activities
- Perform other duties as assigned
Requirements
- 5+ years of professional experience in systems engineering, SRE, DevOps, or infrastructure operations
- Strong experience administering Linux systems
- Experience supporting on-prem, cloud, or hybrid infrastructure environments
- Hands-on experience with monitoring, logging, and alerting systems
- Strong troubleshooting skills across compute, storage, networking, and OS layers
- Experience scripting or automating tasks using Bash, Python, or similar languages
- Familiarity with Infrastructure-as-Code concepts and tooling
- Strong verbal and written communication skills
- Detail-oriented, self-motivated, and able to own issues through resolution
- Ability to obtain and maintain a DoD security clearance
- Ability to work on-site at the customer location
Candidates who have any of the following skills will be preferred:
- Experience working on an IaaS or platform operations team
- Experience with virtualization platforms (e.g., VMware vSphere)
- Experience supporting container platforms (Kubernetes, OpenShift)
- Experience with cloud environments (AWS, Azure, or GovCloud)
- Familiarity with SRE concepts such as SLIs, SLOs, error budgets, and toil reduction
- Experience with configuration management or automation tools (Ansible, Terraform)
- Experience with CI/CD pipelines (GitLab CI, Jenkins, or similar)
- Experience operating systems in government or secure environments
- Experience with incident management and operational readiness reviews
Benefits
SciTec offers a highly competitive salary and benefits package, including:
- 4% Safe Harbor 401(k) match
- 100% company-paid HSA Medical insurance, with a choice of 2 buy-up options
- 80% company-paid Dental insurance
- 100% company-paid Vision insurance
- 100% company-paid Life insurance
- 100% company-paid Long-term Disability insurance
- Short-term Disability insurance
- Annual Profit-Sharing Plan
- Discretionary Performance Bonus
- Paid Parental Leave
- Generous Paid Time Off, including Holiday, Vacation, and Sick Pay
- Flexible work hours
The pay range for this position is $146,000 - $175,000 / year. SciTec considers several factors when extending an offer of employment, including but not limited to the role and associated responsibilities, a candidate's work experience, education/training, and key skills. This is not a guarantee of compensation.
SciTec is proud to be an Equal Opportunity employer. VET/Disabled.