Senior Operations Engineer


Labelbox’s mission is to build the best products for humans to advance artificial intelligence. Labelbox has created the world’s leading training data platform for machine learning applications. Our vision is to become the default software for data scientists to manage data and train neural networks in the same way that GitHub or text editors are defaults for software engineers.

Current Labelbox customers are transforming industries across insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises such as Allstate, John Deere, Bayer, and Stanley Black & Decker, as well as leading AI-focused companies such as FLIR Systems, Caption Health, and Cape Analytics. We are backed by leading investors including Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), and Kleiner Perkins.

What you'll be doing

    • Executing on an infrastructure roadmap, collaborating with team members across engineering, product, and design
    • Taking ownership of Operations Engineering as the first official Operations Engineer
    • Responding to day-to-day interrupt requests from business, support, and software development teams
    • Supporting developers and support engineers in solving blocking issues
    • Writing and maintaining documentation and playbooks for common or known problems
    • Acting as a mentor to support engineers to help them become more self-sufficient in their day-to-day responsibilities
    • Deploying, maintaining and troubleshooting instances in development and production environments, both in the cloud and on-premises
    • Performing root cause analysis and providing potential solutions when problems arise
    • Working closely with DevOps Engineers and SREs to develop permanent solutions to problems
    • Working with database technologies such as PostgreSQL, MySQL, or other RDBMS
    • Working with other open source technologies such as Redis, Elasticsearch, and RabbitMQ
    • Identifying and measuring key performance metrics for our infrastructure and defining service-level objectives (SLOs)
    • Participating in our on-call rotation

We're looking for someone with

    • 4+ years of relevant experience in an Operations, SRE or DevOps role
    • Experience with modern Linux systems and running services in production
    • Experience managing infrastructure in a major public cloud (AWS, GCP, Azure)
    • Experience with Kubernetes or other container orchestration systems
    • Experience with and an understanding of complex distributed systems
    • Experience with database technologies such as PostgreSQL, MySQL, or other RDBMS
    • Experience with other open source technologies such as Redis, Elasticsearch, and RabbitMQ
    • Experience working under Agile / Scrum methodologies
    • Shell scripting or Python skills

Bonus

    • Experience with CI/CD tools and technologies such as Codefresh, Jenkins, and TeamCity
    • Log and metrics collection experience using tools such as Elastic Stack and Datadog
    • Experience with automation tools and technologies such as Terraform and Helm
    • Coding skills in languages such as Java or Golang
    • Experience with SOC 2, FedRAMP, HIPAA, and other compliance-related programs
    • Experience managing multiple Kubernetes clusters / clusters spanning multiple cloud providers
    • Advanced knowledge of infrastructure management in GCP

We believe that AI has the power to transform every aspect of our lives, from healthcare to agriculture. The exponential impact of artificial intelligence will mean that mammograms can happen quickly and cheaply, irrespective of the limited number of radiologists in the world, and that growers will know the instant disease hits their farm without even being there.

At Labelbox, we’re building a platform to accelerate the development of this future. Rather than requiring companies to create their own expensive and incomplete homegrown tools, we’ve created a training data platform that acts as a central hub for humans to interface with AI. When humans have better ways to input and manage data, machines have better ways to learn.

Apply for this job [1]

  1. https://jobs.lever.co/labelbox/54b6c479-24ec-4d93-b109-3079b8ebfc03/apply
