Senior Site Reliability Engineer

Location: Kuala Lumpur
Job ID: OHW34

Specialization: IT OR COMPUTER NETWORK OR SYSTEM OR DATABASE ADMIN

Job description:

Lead OpenShift Cluster Management:
Design, deploy, and maintain Red Hat OpenShift clusters on Azure with a focus on scalability, availability, and security.

Automate Everything:
Build and maintain infrastructure-as-code (IaC) and automation workflows using Terraform, Ansible, and ArgoCD for provisioning, upgrades, monitoring, and self-healing.

Ensure Platform Resilience:
Collaborate with architects, cloud, security, and network teams to enforce best practices in disaster recovery, scaling, and compliance (PCI, SOX, SOC 2).

Monitor & Respond:
Use tools such as Prometheus, Grafana, and Dynatrace to track performance, respond to incidents, and lead root cause analysis and remediation.

DevOps Leadership:
Maintain CI/CD pipelines, manage container security, support 24/7 operations, and mentor junior engineers to drive a culture of reliability and automation.

