
Senior Site Reliability Engineer (GPU & ML Infrastructure)

Company: Criteo

Location: Paris, London

Posted: June 03, 2026

Position Details

What You'll Do:
At Criteo, the Platform Core group builds the foundational infrastructure powering our global advertising platform. We design and operate large-scale, resilient systems supporting real-time decision-making and data processing across thousands of services.
As we expand our distributed computing and ML infrastructure capabilities, we are building a new team focused on GPU platforms and high-performance model serving technologies.
As a Site Reliability Engineer in the GPU team, you will help design, operate, and scale the infrastructure powering machine learning training and inference workloads.
You will work on technologies such as:
Ray on Kubernetes
Build and operate scalable Ray clusters running on Kubernetes.
Develop reliable self-service distributed computing platforms for ML workloads.
Improve provisioning, observability, reliability, and operational efficiency of Ray-as-a-service ...
        

