Apheris is on a mission to collaboratively solve the largest problems our planet is facing by enabling organizations to securely build and deploy data applications and AI across boundaries. Our product enables governed, private, secure data access for ML and analytics. As model architectures become increasingly commoditized, data becomes an organization’s key differentiator. However, businesses need to safeguard their data assets and IP while leveraging them for ML. The Apheris Compute Gateway ensures only approved computations can be launched on that data and enables federation across organizational boundaries, allowing ML-powered insights with no need to centralize data, while ensuring compliance with data privacy, security, and governance obligations. Founded in 2019, Apheris is backed by top investors and tech industry innovators, including Octopus Ventures, Heal Capital, LocalGlobe, another.vc, MuleSoft founder Ross Mason (Dig Ventures), and Twitter board chairman and former CFO of Google, Patrick Pichette.

About the role

At Apheris, we power federated data networks in life sciences to address the data bottleneck in training highly performant ML models. Publicly available molecular datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by hosting networks where biopharma organizations collaboratively train higher-quality models on their combined data. The Apheris product is a set of drug discovery applications enriched with the proprietary data of network participants. Our federated computing infrastructure, with built-in governance and privacy controls, ensures that data IP and ownership always stay with the data custodians.

As we are doubling down on ADMET (absorption, distribution, metabolism, excretion, and toxicity) use cases as a focus area within our drug discovery work, we are looking for a Senior Data Engineer to help us build great ADMET models. This is a hands-on, high-impact role focused on advancing the state of the art in applying foundational models to drug discovery problems. You’ll work closely with our ADMET team and will serve as the technical authority on data preparation, data harmonization, and data pipelines in this domain.

You should bring deep expertise in data infrastructure and data preparation, combined with domain knowledge in pharmacokinetics and toxicity, with a focus on ADMET modelling and related tasks. You must also understand the application of these models within industrial drug discovery workflows.

If you want to be part of a mission-driven team building cutting-edge AI systems for life sciences – and you know what it takes to leverage domain-specific data – this role is for you.

What you will do

ADMET Data Pipeline Development: Design, build and maintain scalable pipelines for ingesting, processing, and harmonizing diverse ADMET datasets from public sources (e.g., ChEMBL, PubChem) and proprietary assays.
Data Harmonization: Standardize heterogeneous ADMET data formats (e.g., in vitro assays, in silico predictions) across network participants so that the data is ready for modelling.
Model-Ready Dataset Curation: Preprocess raw ADMET data (e.g., normalizing units, handling missing values) to support AI/ML model training for a variety of endpoints (such as bioavailability, hERG inhibition, or CYP450 interactions).
Data Quality Assurance: Implement and automate validation checks to ensure ADMET data integrity.
Cross-Functional Integration: Work with computational chemists to optimize data structures for AI-driven ADMET models (e.g., graph-based representations for metabolic pathways)
Work with our customers and potentially academic partners to define data preprocessing, selection, and benchmarking strategies for novel training tasks involving ADMET data, including leveraging and harmonizing assay data from different sources.
Collaborate cross-functionally to ensure data and resulting models address real-world drug discovery needs.
Mentor and guide team members on a content level, supporting the planning and breakdown of complex ADMET data preparation.
Influence strategic decisions on data infrastructure and data quality assurance.
Contribute to publications or open-source contributions where relevant.

What we expect from you

By month 3: Develop a deep technical understanding of the Apheris product and how it maps to the current ADMET use cases we are working on. Take ownership of an ADMET data preparation stream. Build relationships with product and engineering leadership. Develop a roadmap and experiment plan for preparing data and adapting models to one high-value use case.
By month 12: Lead multiple data preparation efforts in ADMET and demonstrate measurable progress in model performance and real-world impact. Mentor colleagues and set strategic direction for the domain.

You should apply if

You have a background in computational chemistry, cheminformatics, computational biology, bioinformatics, data engineering or computer science, and a track record of preparing data for ML models addressing real-world drug discovery problems.
You have deep experience in pharma/biotech ADMET data pipelines for machine learning.
You have deep experience in ADMET data, including an understanding of assay protocols and how to map protocols to each other.
You’re comfortable navigating complex technical landscapes and can break down and drive execution on ambitious modeling plans.
You understand how ADMET data and models are used in the drug discovery lifecycle and can align your work to practical use cases.

Bonus points if

You have experience in federated learning, privacy-preserving ML, or secure model training.
You have experience in benchmarking predictive models against standardized datasets.
You have experience working with ML and MLOps systems at scale, including CI/CD, model versioning, Docker, Kubernetes, cloud platforms, and orchestration tools.
You’ve contributed to open-source data or cheminformatics tooling.
You have hands-on experience working with ADMET assays and DMPK stakeholders.
You have experience guiding technical direction in a fast-paced, research-oriented environment.

What we offer you

Industry-competitive compensation, incl. early-stage virtual share options
Remote-first working – work where you work best, whether from home or a co-working space near you
Great suite of benefits, including a wellbeing budget, mental health benefits, a work-from-home budget, a co-working stipend and a learning and development budget
Regular team lunches and social events
Generous holiday allowance
Quarterly All Hands meet-up at our Berlin HQ or a different European location
A fun, diverse team of mission-driven individuals with a drive to see AI and ML used for good
Plenty of room to grow personally and professionally and shape your own role

About Apheris

Apheris powers federated life sciences data networks, addressing the critical challenge of accessing proprietary data locked in silos due to IP and privacy concerns. Publicly available datasets are insufficient to train high-quality ML models that meet industry requirements. Our product addresses this by enabling life sciences organizations to collaboratively train higher quality models on complementary data from multiple parties. We are now doubling down on two key areas of interest: structural biology and ADMET.

Logistics

Our interview process is split into three phases:

Initial Screening: If your application matches our requirements, we invite you to an initial video call to explore the fit. In this 30-45 minute interview, you will get to know us and the role. The interviewer will be interested in your relevant experience and skills, and will answer any questions you may have about the company and the role.
Deep Dive: In this phase, a domain expert from our team will assess your skills and knowledge required for the role by asking you about meaningful experiences or your solutions for specific scenarios in line with the role we are staffing.
Final Interview: Finally, we invite you for up to three hours of targeted sessions with our founders, talking about our culture and meeting future co-workers on the ground.

Senior Data Engineer – ADMET
