Archive - Machine Learning Engineer

Full-time
Remote
Flexible working hours (be ready to join a daily meeting with the team at 18:00-19:00 GMT+2)

We're looking for a highly skilled and self-driven ML Engineer to join an American startup that is working on medicinal chemistry problems with the help of ML. In this role, you’ll be instrumental in handling enormous datasets, orchestrating cloud-based computing resources, and training a multitude of advanced machine-learning models.

The company builds models to predict molecular and protein interactions, aiming to revolutionize the field of medicinal chemistry. The team has an unparalleled ability to generate and analyze vast datasets, directly contributing to groundbreaking advancements in drug development.

Key Responsibilities:

Manage and optimize data processing workflows for large-scale datasets, using approaches similar to those used for language data.
Scale and maintain machine learning model training processes, with a focus on cloud environments (primarily Google Cloud, with flexibility to use other platforms).
Collaborate closely with ML researchers, data scientists, and lab automation teams to ensure seamless integration of lab data and ML model training.
Innovate and iterate on our existing technology stack, taking the initiative to solve problems and improve our ML operations.
Act as a self-sufficient project manager, overseeing your projects from conception to completion.

About You:

Strong experience in machine learning engineering, including data handling, model training, and scaling in cloud environments.
Comfortable building ML infrastructure.
Experience working with large amounts of text data, NLP, or training LLMs.
Demonstrated capability to make informed decisions, take ownership of solutions, and drive projects forward in a startup environment.
Excellent collaboration skills, with the ability to work effectively with cross-functional teams.

Requirements:

Successful candidates will have demonstrated the ability to do something interesting with datasets containing at least 1 million data points.

Preferred Qualifications:

Familiarity with common MLOps tooling (e.g., Dagster, Prefect, Airflow, Docker, MLflow, Kubeflow, W&B, Ray).
Experience with BERT or similar language models in PyTorch.
Experience or interest in biology, chemistry, or related fields is a plus.
Looking forward to your reply!