Position Details
What We’re Looking For
Strong experience designing and operating production systems on AWS.
Deep understanding of distributed systems, cloud architecture, scalability, reliability, and service design.
Hands-on experience with infrastructure-as-code, CI/CD, Docker, Kubernetes, and production deployment workflows.
Experience building or supporting production ML, AI, data, or high-scale backend systems.
Strong system design skills, including the ability to reason about tradeoffs, failure modes, data flow, service boundaries, and operational complexity.
Ability to communicate clearly across data science, ML engineering, backend engineering, platform engineering, product, and leadership stakeholders.
Nice to Have
Experience with SageMaker, Bedrock, ECS, EKS, Lambda, S3, RDS, OpenSearch, Aurora, EventBridge, Step Functions, or related AWS services.
Experience with model serving, batch inference, embedding pipelines, vector databases, RAG systems, or LL...