Job Description

Are you a passionate engineer dedicated to building robust, scalable, and highly available systems? Mindteck is seeking a talented Site Reliability Engineer (SRE) to join our growing team in Cyberjaya. In this pivotal role, you will bridge the gap between development and operations, ensuring our mission-critical services perform optimally under pressure.
You will be instrumental in defining how our applications are deployed, managed, and monitored. We look for individuals who treat operations as a software engineering problem. You will not only maintain system health but also proactively automate manual processes, minimize toil, and champion a culture of reliability across our engineering organization.
If you thrive in a fast-paced environment and are obsessed with system performance, latency optimization, and automated infrastructure, we want to hear from you.

Responsibilities

Design, implement, and maintain scalable infrastructure to support high-traffic production environments.
Monitor system availability, latency, and overall health using industry-standard observability tools.
Automate manual operational tasks to increase efficiency and reduce 'toil'.
Drive incident response processes, conduct blameless post-mortems, and implement long-term fixes to prevent recurrence.
Collaborate with development teams to ensure software is designed with performance, scalability, and reliability in mind.
Participate in an on-call rotation to ensure 24/7 service availability.
Manage cloud infrastructure using Infrastructure as Code (IaC) principles.

Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field.
3+ years of experience in SRE, DevOps, or Systems Engineering roles.
Strong proficiency in scripting languages such as Python, Bash, or Go.
Solid experience with cloud platforms (AWS, Azure, or GCP).
Hands-on experience with containerization and orchestration tools like Docker and Kubernetes.
Expertise in monitoring and logging stacks (e.g., Prometheus, Grafana, ELK, or Datadog).
Strong understanding of CI/CD pipelines and version control systems (Git).
Excellent problem-solving skills and the ability to thrive in high-pressure environments.

Site Reliability Engineer
