
EOP - Site Reliability Engineer - TS/SCI Required

Job description

cFocus Software seeks a Site Reliability Engineer to join our program supporting the United States Secret Service (USSS). This position is remote. This position requires a TS/SCI clearance.
Qualifications:
  • Bachelor’s degree in Computer Science, Engineering, or related technical field (or equivalent experience).
  • Minimum of 2 years of experience in systems engineering, DevOps, or Site Reliability Engineering roles.
  • Strong proficiency with Linux/Unix operating systems.
  • Experience with scripting and automation using Python, Bash, or similar languages.
  • Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or equivalent.
  • Experience supporting CI/CD tools such as GitLab, Jenkins, or ArgoCD.
  • Experience with containerization and orchestration platforms including Docker and Kubernetes.
  • Understanding of SRE principles including SLIs, SLOs, and error budgets.
  • Strong troubleshooting, problem-solving, and documentation skills.
Duties:
  • Monitor system health, availability, and performance using centralized monitoring and logging tools.
  • Respond to, troubleshoot, and resolve incidents in production environments and provide root cause analysis.
  • Conduct after-action reporting and post-incident reviews to improve system resilience.
  • Automate repetitive operational tasks including deployments, monitoring, and incident response.
  • Administer user accounts, access controls, and authentication mechanisms.
  • Maintain and configure workflow templates, user fields, and application configurations.
  • Maintain test environments that mirror production and support pre-deployment testing.
  • Design and maintain backup, high availability (HA), and disaster recovery (DR) solutions.
  • Develop and maintain incident response and disaster recovery plans for supported applications.
  • Configure and support integrations with complementary enterprise systems.
  • Architect, build, and maintain on-premises and cloud infrastructure supporting applications.
  • Administer production, staging, and development environments.
  • Manage system logs and monitor for security and operational events.
  • Maintain and improve CI/CD pipelines and DevSecOps processes.
  • Apply configuration management disciplines including patching, hardening, and documentation.
  • Create and maintain dashboards, SLIs, SLOs, and service health metrics.
  • Support operational readiness boards and weekly service reviews.
  • Provide on-call support for outages, upgrades, and emergency maintenance as required.
  • Support surge activities, including Presidential Transition-related data analysis if required.

 
