
Site Reliability Engineer


As a Site Reliability Engineer (SRE) at Circonus, you will be responsible for keeping Circonus SaaS and on-premise customers up and running, as well as improving the automation, scalability, and performance of systems. This is an unparalleled opportunity to grow on a small, collaborative, and friendly team with established leadership in the field of SRE. A successful candidate communicates effectively across multiple departments and with customers, can shift gears at a moment’s notice, and enjoys the challenges of supporting enterprise clients. This is a client-facing role where presentation skills are important, and the successful candidate will also take part in a support rotation. This position is 100% remote and will remain so post-COVID. It supports customers in the Pacific Time Zone, regardless of the candidate’s location.

Job Responsibilities

  •         Install, upgrade and manage systems powering customer infrastructure running Circonus software
  •         Troubleshoot availability and performance issues
  •         Diagnose production issues and perform front-line remediation
  •         Communicate with management and customers regarding aberrant system behavior
  •         Influence software and architecture design based on system and architecture observations related to performance and reliability
  •         Participate in an on-call schedule

Job Requirements

  •         Linux (RHEL, CentOS, Ubuntu)
  •         Experience working with cloud service providers such as AWS, Azure, or GCP
  •         Ansible, Chef or similar configuration system
  •         HAProxy, PostgreSQL, Apache or similar technologies
  •         Strong networking knowledge: firewalls, TCP & UDP, DNS, SSL/TLS
  •         Strong understanding of monitoring principles
  •         Familiarity leveraging REST and REST-like APIs for operations tasks
  •         UNIX troubleshooting skills: tcpdump, strace, bpftrace, etc.
  •         Fluency in one or more of the Git, Subversion or Mercurial version control systems

Preferred Experience

  •         7+ years’ experience in the technology industry
  •         Experience and/or senior technical knowledge of monitoring and analytics solutions
  •         Experience with Docker, Kubernetes and containers
  •         The right person will be highly technical and analytical, much like the company itself

Circonus is a software company that is changing the way the world monitors both IT infrastructure and the business it powers. Our SaaS and on-premise solutions enable companies to combine monitoring, alerting, event processing, and predictive analytics into a unified solution. Visualize any data, in any application, from any system, in real time. Circonus scales from a single team to a worldwide organization that tracks thousands of devices and analyzes millions of metrics. API-driven automation empowers developers and makes operational teams incredibly efficient, while analytics drive insights that improve organization-wide performance.

We enjoy a global reach, but our customers primarily cluster on the East Coast, in California, and, to a lesser degree, in Europe. Our success stems from (a) delivering an industry-leading product and (b) an obsession with customer satisfaction. Culturally, we operate like a startup: small, agile teams making quick decisions on short, iterative cycles. We relish our core values of respect, integrity, value and growth, among others. This is probably the kind of place where you want to work.

All of our positions include a discretionary PTO policy, generous employer-covered health and dental insurance, an employer-matched 401(k) plan and more. Compensation will consist of a base salary of $90K to $125K and will be commensurate with experience.
