(Lead) NLP Data Scientist / ML Engineer, RegBrain


AI at CUBE

CUBE uses AI and NLP to machine read the regulatory internet, at global scale. We collect, clean, standardise, translate, monitor, classify, and enrich regulatory data across 180 countries in over 60 languages. All in near real-time.

We've even built our own ontology of regulation—machine-driven and continuously refined by a team of subject matter experts.

On a high level, CUBE uses AI to transform regulatory data into regulatory intelligence. And this is exactly where RegBrain comes in.

RegBrain

It's always a great time to become a CUBER, but there literally could not be a better time than now. This year, we are building out the core RegBrain team. RegBrain leverages the 10 years of global regulatory data that our existing AI teams have collected, cleaned, standardised, translated, and classified.

The mission: to create the ultimate semantic map of global regulatory data, and to take CUBE's AI to the next level through data learning.

The RegBrain team will be responsible for the end-to-end research, design, and development of both the semantic map and a suite of AI-driven capabilities—including recommendation systems, prediction, and task automation.

As such, the team will be split into two core areas: research & data science and ML & data engineering. All with an NLP flavour, of course.

⚠️ Please note: While we're hiring across a wide range of experience levels over the next 4-6 months, the most immediate open roles are team lead positions (there will be one lead for each subteam). The leads will directly influence the hiring process for the rest of the team. If you are not interested in a lead role but think you'd be a great fit for RegBrain, you can still fill out the application. It's designed to be versatile.

Here are the core responsibilities of each RegBrain subteam. Note that the responsibilities are extremely complementary, to reflect how closely the subteams will work together.

Research & data science

Core mission: Design ML & NLP prototypes for each RegBrain use case, and own the semantic map of CUBE's regulatory data.

  • Prepare, maintain, and refine the semantic map (knowledge graph) of CUBE's regulatory data.
  • Develop, test, and improve optimal ML & NLP models for each RegBrain use case.
  • Present information using data visualisation techniques (especially important for the semantic map).
  • Determine additional data sources and how to include them in the pipeline (another team will help with actually adding them).
  • Stay up-to-date with ML & NLP research, and experiment with new models and techniques.

ML & data engineering

Core mission: Develop the ML & NLP prototypes from the data science team, resulting in APIs that can be consumed by CUBE's core platform.

  • Determine the cloud architecture strategy and overall ML & data systems for RegBrain.
  • Work closely with other AI engineering and data teams to ingest data from our core platform, our transformation engine, and other sources.
  • Improve the efficiency, performance, and scalability of ML & NLP models (this includes data quality, ingestion, loading, cleaning, and processing).
  • Improve the efficiency, performance, and scalability of the semantic map.
  • Verify that the quality of results in production meets the requirements.

Core competencies

Just as the responsibilities of the RegBrain subteams overlap, the core competencies we're looking for overlap too. The good news for you is that we will use your preferences and the interview process to collaboratively determine which side of the spectrum you should sit on. The strongest candidates have competencies across both sides (and are as modular as CUBE's core product!).

  • End-to-end ML model design and development experience (design is more relevant for the data science team; deploying models to production and performance monitoring are especially important for the engineering team)
  • Experience with cloud infrastructure for data pipelining and model deployment (more relevant for engineering) ☁️
  • Experience with ML platforms, frameworks, and libraries
  • Experience analysing vast volumes of textual data
  • Strong familiarity with SQL and NoSQL/graph databases
  • Solid understanding of data structures, data modelling, and software architecture
  • Ability to write clear, robust, and testable code, especially in Python
  • Strong grasp of data visualisation techniques (for dashboarding, reporting, etc.)
  • A systems thinking [1] approach
  • A mathematically and statistically oriented brain
  • A healthy sense of humour (you're going to need it... don't say we didn't warn you)

Experience matters. But demonstrated proficiency (through GitHub profiles, online portfolios, and the interview process itself) is more important than the raw number of years of experience. Bonus points for Stack Overflow and Kaggle contributions!

Why you'll love RegBrain (& CUBE)

If there is a best time to join RegBrain, it's now. Here are the many reasons why.

Immediate global impact. CUBE is a well-established player in regtech (we were around before regtech was even a thing!), and our category-defining product is used by leading financial institutions around the world (including Revolut, Citi, and HSBC). We have an audience across 150 countries, and they love CUBE.

Freedom & flexibility. Think of RegBrain as a fully-funded startup within a scaleup. The first to join will have a blank canvas, a tabula rasa. You'll be able to choose your own tech stack. GCP or AWS or Azure? To Spark or not to Spark? PyTorch or TensorFlow? You decide. As long as you can justify your choices, the rings of Saturn are the limit.

Quantity & quality of data. The stage has literally been set: over the past 10 years, the five engineering teams at CUBE have built solid foundations for data collection, transformation, and classification. The RegBrain team will focus solely on learning from this mountain of structure.

A rich & complex dataset. The main dataset is not only already structured, but also longitudinal and multilingual. We've tracked changes to regulation over time and built in-house translation models for 60+ languages.

Always learning. Part of your job is to stay up-to-date with the latest research, and share your learning with the RegBrain team and other AI teams at CUBE. You'll have a training budget and a conference budget. In the medium to long term, we're aiming to collaborate with universities.

⚖️ Responsible AI. We will proactively address the inevitable biases that emerge for any AI system. Our Head of Product was trained at the Oxford Internet Institute [2] and has direct connections with ethicists who are influencing the future of AI regulation.

Employee-first work-life policy. CUBE went fully remote before the pandemic even hit, because we wanted to define the future of work. As a CUBER, you'll be able to design your home office and choose your own work equipment. Unable to work from home one week, or desperate for in-person interaction with colleagues? No problem: book a room in a coworking space.

Sustainable, customer-driven growth. We are a bootstrapped company funded by customers and strategic private investment. This means that growth is sustainable, and product development is very closely aligned with customer needs.

Visa sponsorship if required. We know every single nuance of Skilled Worker visas.

Extremely bespoke hiring process. At CUBE, we're trying to flip hiring on its head: the objective of the process is to create a personalised job description (and title). This page sets the general context. We'll collaboratively determine the best role for you, given your interests, CUBE's needs, and other members of the team.

⏱️ Hiring timeline

We know how insufferably long and complicated hiring processes can be. We've been there before.

That's why at CUBE, we aim to compress the hiring timeline to between 5 and 10 days (from the first-round interview to the final round). There's no HR screen, culture fit interview, or coding on a whiteboard. Just high-quality infoflow in both directions.

Here's what will happen:

  • Online application (link below)
  • First-round video interview with RegBrain's Head of Product (30-45m)
  • Second-round video interview with our CTO (30-45m)
  • Take-home challenge (it'll be fun, we promise, and we won't ask for more than a few hours of your time)
  • Final-round panel interview, again over video (45-60m)

If you have any questions at this stage, feel free to use the live chat widget on the application page. Otherwise: what are you waiting for? This is your once-in-a-lifetime opportunity to define the future of regulation. The clock is already ticking.


  1. https://en.wikipedia.org/wiki/Systems_theory
  2. https://www.oii.ox.ac.uk/
