We are looking for an engineer to join an already established SRE team for the SAP Business Technology Platform. As a Site Reliability Engineer, you will operate and support business‑critical cloud services, proactively monitor service behavior, and develop tools for monitoring and troubleshooting cloud services based on open‑source and SAP technologies, following SRE principles.
Responsibilities
- Act as the technical expert during live‑site incidents, investigating and resolving them at a deep technical level.
- Drive root cause analysis and follow‑up improvements to prevent issues from recurring.
- Perform in‑depth troubleshooting and log analysis to identify and solve complex issues in accordance with internal and external SLAs.
- Build software‑based solutions to improve service reliability and stability.
- Enhance infrastructure and platform monitoring by gathering system metrics (the four golden signals: latency, traffic, errors, and saturation) and implementing recovery tools....