Information & Communication Technology | Full Time | Verified

Site Reliability Engineer – Remote (US Shift)

KMC Solutions
Metro Manila
Estimated Salary
PHP 80,000 – PHP 120,000
Posted Date
4 May 2026
Application Deadline
4 May 2027

Job Description

Join KMC Solutions as a Site Reliability Engineer (SRE) and play a pivotal role in ensuring the reliability, availability, and performance of our production environments. In this remote position, you’ll work a US shift schedule, collaborating with global teams to support product launches on AWS.

You’ll be responsible for designing and maintaining scalable infrastructure, implementing automation, and providing rapid incident response. By leveraging AWS Organizations and best‑in‑class monitoring tools, you’ll drive continuous improvement and help us deliver a seamless experience to our customers.

We value a proactive mindset, a love for solving complex problems, and the ability to communicate across time zones. If you thrive in a fast‑paced, collaborative environment and are passionate about reliability engineering, this is the perfect opportunity for you.

Working with us means you’ll have the chance to expand your skill set on cutting‑edge cloud technologies, contribute to high‑visibility projects, and grow your career in a supportive, innovation‑driven culture.

We offer competitive compensation, benefits, and flexible remote work options, ensuring you can balance professional growth with personal well‑being.

As part of our SRE team, you will define and enforce reliability standards, create runbooks, and conduct regular game days to test resilience. Your insights will directly influence our product roadmap and help us achieve industry‑leading uptime.

In addition to technical challenges, you’ll enjoy a collaborative culture that encourages continuous learning, regular knowledge‑sharing sessions, and access to certifications and training programs to keep your expertise at the forefront of cloud technology.

Responsibilities

  • Design, implement, and maintain CI/CD pipelines and automation scripts to streamline deployments and reduce manual toil.
  • Monitor system health, performance, and availability using tools such as CloudWatch, Prometheus, Grafana, and the ELK stack.
  • Respond to and resolve production incidents, performing root‑cause analysis and documenting lessons learned.
  • Collaborate with development teams to define SLIs, SLOs, and error budgets that align with business objectives.
  • Manage and optimize AWS resources across multiple accounts and regions using Terraform, CloudFormation, and AWS Organizations.
  • Conduct capacity planning, cost optimization, and security reviews to ensure scalable, cost‑effective infrastructure.
  • Participate in on‑call rotation and contribute to the continuous improvement of runbooks and operational procedures.

Qualifications

  • Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role with hands‑on AWS experience.
  • Strong proficiency in scripting languages such as Python, Bash, or PowerShell.
  • Experience with containers and container orchestration (Docker, Kubernetes, ECS) and infrastructure as code (Terraform, CloudFormation).
  • Solid understanding of networking, DNS, VPN, and security best practices in cloud environments.
  • Excellent problem‑solving skills and a data‑driven approach to monitoring and incident response.
  • Ability to work a US‑based shift schedule (e.g., 9 am – 6 pm PST) while collaborating with teams in the Philippines.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent practical experience).

Required Skills

AWS, Python, Bash, Docker, Kubernetes, ECS, Terraform, CloudFormation, CI/CD, CloudWatch, Prometheus, Grafana, ELK, Linux, Monitoring, Incident Management, Automation, AWS Organizations

