Position Details
<ul> <li>Monitoring and alert/event management (acknowledge, triage, investigate, resolve/escalate).</li> <li>End-to-end incident management (including major incident communications and stakeholder engagement).</li> <li>Request fulfilment and operational queries (where agreed).</li> <li>Batch/job monitoring and failure recovery, including periodic processing (EOD/EOM/EOQ/EOY where relevant).</li> <li>Production hygiene and maintenance activities (housekeeping, service restarts, certificate tracking/renewal coordination, routine checks).</li> <li>Change support (pre-checks, implementation support, post-change monitoring, PVT/verification where applicable).</li> <li>Problem management support (RCA coordination, action tracking, technical debt backlog support).</li> <li>Knowledge management (runbooks, SOPs, KEDB articles, continuous refresh).</li> <li>Improve observability/mon...