Senior Operations Engineer
Current Labelbox customers are transforming industries within insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises such as Allstate, John Deere, Bayer, and Stanley Black & Decker, as well as leading AI-focused companies such as FLIR Systems, Caption Health, and Cape Analytics. We are backed by leading investors including Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), and Kleiner Perkins.
What you'll be doing
- Executing on an infrastructure roadmap, collaborating with team members across engineering, product, and design
- Taking ownership of Operations Engineering as the first official Operations Engineer
- Responding to day-to-day interrupt requests from business, support, and software development teams
- Supporting developers and support engineers in solving blocking issues
- Writing and maintaining documentation and playbooks for common or known problems
- Acting as a mentor to support engineers to help them become more self-sufficient in their day-to-day responsibilities
- Deploying, maintaining and troubleshooting instances in development and production environments, both in the cloud and on-premises
- Performing root cause analysis and providing potential solutions when problems arise
- Working closely with DevOps Engineers and SREs to develop permanent solutions to problems
- Working with database technologies such as PostgreSQL, MySQL, or other RDBMS
- Working with other open source technologies such as Redis, Elasticsearch, and RabbitMQ
- Identifying and measuring key performance metrics for our infrastructure and defining service-level objectives (SLOs)
- Participating in our on-call rotation
We're looking for someone with
- 4+ years of relevant experience in an Operations, SRE or DevOps role
- Experience with modern Linux systems and running services in production
- Experience managing infrastructure in a major public cloud (AWS, GCP, Azure)
- Experience with Kubernetes or other container orchestration systems
- Experience with and an understanding of complex distributed systems
- Experience with database technologies such as PostgreSQL, MySQL, or other RDBMS
- Experience with other open source technologies such as Redis, Elasticsearch, and RabbitMQ
- Experience working under Agile / Scrum methodologies
- Shell scripting or Python skills
Bonus
- Experience with CI/CD tools and technologies such as Codefresh, Jenkins, and TeamCity
- Log and metrics collection experience using tools such as the Elastic Stack and Datadog
- Experience with automation tools and technologies such as Terraform and Helm
- Coding skills in languages such as Java or Golang
- Experience with SOC 2, FedRAMP, HIPAA, and other compliance-related programs
- Experience managing multiple Kubernetes clusters / clusters spanning multiple cloud providers
- Advanced knowledge of infrastructure management in GCP
At Labelbox, we’re building a platform to accelerate the development of this future. Rather than requiring companies to create their own expensive and incomplete homegrown tools, we’ve created a training data platform that acts as a central hub for humans to interface with AI. When humans have better ways to input and manage data, machines have better ways to learn.