SearchLondonJobs.co.uk

🏛️ London's Premier Job Portal

Site Reliability Engineer – GenAI Platform

Company: Astra North Infoteck Inc.

Location: Mirabel, London

Posted: March 17, 2026

Position Details

Job Description
  • Experience: 8+ years as a Site Reliability Engineer or in a similar role, with hands-on experience supporting IaaS platforms and working knowledge of networking and systems engineering.

  • Roles and Responsibilities:

    • Operate, monitor, and maintain the infrastructure supporting GenAI applications (training, inference, feature store, data ingestion, model serving)

    • Design and build automation for core platform capabilities, reducing manual toil

    • Develop and maintain infrastructure-as-code (IaC) for provisioning and managing compute, storage, networking, GPU clusters, container orchestration (e.g. Kubernetes), and related resources

    • Establish, monitor, and enforce SLOs/SLIs/SLAs, error budgets, alerting, and dashboards