Senior Software Engineer
We're building the Data Platform of the Future
Join us if you want to rethink the way organizations interact with data. We are a developer-first company, committed to building around open protocols and delivering the best experience possible for data consumers and publishers.
Splitgraph is a seed-stage, venture-funded startup hiring its initial team. The two co-founders are looking to grow the team to five or six people. This is an opportunity to make a big impact on an agile team while working closely with the founders.
Splitgraph is a remote-first organization. The founders are based in the UK, and the company is incorporated in both the USA and the UK. Candidates are welcome to apply from any geography. We want to work with the most talented, thoughtful and productive engineers in the world.
Data Engineers welcome! The job titles have "Software Engineer" in them, but at Splitgraph there's a lot of overlap between data and software engineering. We welcome candidates from all engineering backgrounds.
What is Splitgraph?
Open Source Toolkit
Splitgraph's open source toolkit is a tool for building, versioning and querying reproducible datasets. It's inspired by Docker and Git, so it feels familiar. And it's powered by PostgreSQL, so it works seamlessly with existing tools in the Postgres ecosystem. Use Splitgraph to package your data into self-contained data images that you can share with other Splitgraph instances.
Splitgraph Cloud is a platform for data cataloging, integration and governance. Users can upload data, connect live databases, or "push" versioned snapshots to it. We give them a unified SQL interface to query that data, a catalog to discover and share it, and tools to build, push and pull it.
Learn More About Us
Listen to our interview on the
Watch our co-founder Artjoms present
Read our HN/Reddit posts
Read the slides from our early (2018) presentations
Explore the catalog where we index 40k+ datasets
How We Work: What does our stack look like?
We prioritize developer experience and productivity. We resent repetition and inefficiency, and we never hesitate to automate the things that cause us friction. Here's a sampling of the languages and tools we work with:
- Python for the backend. Our core tech is written in Python (with to make it more interesting), as is most of our backend code. The Python code powers everything from authentication routines to database migrations. We use the latest version and tools like , and to help us write quality software.
- TypeScript for the web stack. We use TypeScript throughout our web stack. On the frontend we use with . For data fetching we use with fully-typed GraphQL queries auto-generated by based on the schema that Postgraphile creates by introspecting the database.
- PostgreSQL for the database, because of course. Splitgraph is a company built around Postgres, so naturally we use it for our own database. In fact, we have three databases: auth-db for storing sensitive data, registry-db which acts as a so users can push Splitgraph images to it using , and cloud-db, where we store the schemata that Postgraphile uses to autogenerate the GraphQL server.
- and for stored procedures. We define a lot of core business logic directly in the database as stored procedures, which are ultimately . We find this to be a surprisingly productive way of developing, as it eliminates the need to manually maintain an API layer between data and code. It presents challenges for testing and maintainability, but we've built tools to help with database migrations and rollbacks, and an end-to-end testing framework that exercises the database routines.
- for auto-generating a REST API for every repository. We use this excellent library (written in ) to expose an -compatible REST API for every repository on Splitgraph.
- Lua (5.x), C, and for scripting PgBouncer. Our main product, the "data delivery network" (DDN), is a single SQL endpoint where users can query any data on Splitgraph. Under the hood, it's a layer of PgBouncer instances that orchestrate temporary Postgres databases and proxy queries to them; in those databases we load and cache the data necessary to respond to a query. We've added scripting capabilities to enable things like query rewriting, column masking, authentication, ACL, orchestration and firewalling.
- Docker for packaging services. Our CI pipeline builds every commit into about a dozen different Docker images, one for each of our services. A production instance of Splitgraph can run over 60 different containers (including replicas).
- and for development. We use and docker-compose so that developers can easily spin up a stack that mimics production in every way, while keeping it easy to hot reload, run tests, or add new services or configuration.
- Nomad for deployment and Terraform for provisioning. We use Nomad to manage deployments and background tasks. Along with Terraform, it lets us spin up a Splitgraph cluster on AWS, GCP, Scaleway or Azure in just a few minutes.
-for job orchestration. We use it to run and monitor jobs that maintain our catalog of , or ingest other public data into Splitgraph.
-, , , and for monitoring and metrics. We believe it's important to self-host fundamental infrastructure like our monitoring stack. We use this to keep tabs on important metrics and the health of all Splitgraph deployments.
- Mattermost for company chat. We think it's absolutely bonkers to pay a company like Slack to hold your company communication hostage. That's why we self-host an instance of Mattermost for our internal chat. And of course, we can deploy it and update it with Terraform.
- Matomo for web analytics. We take privacy seriously, and we try to avoid including any third-party scripts on our web pages (currently we include zero). We self-host our analytics because we don't want to share our user data with third parties.
-and for BI and . We use Metabase as a frontend to a Splitgraph instance that connects to Postgres (our internal databases), MySQL (Matomo's database), and ElasticSearch (where we store logs and DDN analytics). We use this as a chance to dogfood our software and produce fancy charts.
- The occasional best-of-breed SaaS tool for organization. As a privacy-conscious, independent-minded company, we try to avoid SaaS tools as much as we can. But we still find ourselves unable to resist some of the better products out there. For organization we use tools like Zoom for video calls, for brainstorming, for documentation (you're on it!), , for ticketing, and .
- Other fun technologies including, , , and bash. We don't touch them much because they do their job well and rarely break.
Life at Splitgraph
We are a young company building the initial team. As an early contributor, you'll have a chance to shape our initial mission, growth and company values.
We think that remote work is the future, and that's why we're building a remote-first organization. We chat on Mattermost and have video calls on Zoom. We brainstorm with and organize with .
We try not to take ourselves too seriously, but we are goal-oriented with an ambitious mission.
We believe that as a small company, we can out-compete incumbents by thinking from first principles about how organizations interact with data. We are very competitive.
Flexible working hours
Generous compensation and equity package
Opportunity to make high-impact contributions to an agile team
How to Apply? Questions?
If you have any questions or concerns, feel free to email us.