Lead Inference Platform Support Engineer - AI I

Company: PowerToFly

Location: Toronto, London

Posted: May 25, 2026

Position Details

About the Role

As a Lead Inference Platform Engineer, you will:
Optimize LLMs and ML models for high-performance inference using techniques such as quantization, pruning, distillation, and hardware-specific tuning
Deploy and scale inference workloads on GPUs across AWS, Azure, GCP, and internal Kubernetes clusters, ensuring predictable performance during peak traffic, especially business hours
Implement routing and failover strategies for OpenAI/Anthropic/Vertex AI traffic
Integrate models into production-grade APIs supporting TR products and enterprise workflows
Develop highly optimized environments and eliminate performance bottlenecks to reduce latency
Collaborate with Platform Engineering teams (Landing Zones, Network, Storage, Compute, AI) to ensure inference workloads align with TR’s cloud-native patterns (AWS, Azure, GCP, OCI)
Build and optimize containerized inference pipelines...
        
