Full-Time - Staff Site Reliability Engineer (Remote)
Job Description
Join a world-class automation and cloud engineering team!
We’re hiring a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of mission-critical systems for a global automation platform.
If you thrive in high-impact environments where automation meets innovation, this is your chance to make a difference from anywhere in the Philippines.
Role Summary:
As a Site Reliability Engineer (SRE), you’ll be responsible for maintaining the performance, uptime, and reliability of complex distributed systems.
You’ll design monitoring frameworks, optimize infrastructure performance, and prevent downtime before it happens. This is a hands-on, automation-driven role for someone passionate about reliability, scalability, and continuous improvement.
Key Responsibilities:
- Ensure reliability and uptime across production systems through proactive monitoring and automation.
- Design and maintain CI/CD pipelines to improve deployment speed and stability.
- Manage infrastructure in AWS or Azure using Infrastructure-as-Code (AWS CDK, Terraform, etc.).
- Use observability tools (Prometheus, Grafana, OpenTelemetry, or SigNoz) to detect and resolve performance bottlenecks.
- Work with engineering teams to implement best practices in scalability, fault tolerance, and system health.
- Troubleshoot incidents, drive root-cause analysis, and document solutions for long-term prevention.
- Automate processes end to end, from testing and deployment to scaling and recovery.
- Contribute to continuous improvement through documentation, dashboards, and performance reviews.
Must-Have Qualifications:
- Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent experience).
- 5+ years of experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role.
- Strong proficiency with PostgreSQL (design, tuning, and performance optimization).
- Proven experience with AWS or Azure infrastructure management.
- Skilled in Docker and Kubernetes for containerization and orchestration.
- Proficient with Python or TypeScript for automation scripting.
- Experienced in building and managing CI/CD pipelines.
- Strong debugging skills for complex, distributed systems.
- Excellent communication and collaboration skills in a global, remote-first team.
Nice-to-Have Skills:
- Experience working in startups or fast-paced tech environments.
- Familiarity with low-code platforms or robotic process automation.
- Knowledge of advanced backend concepts (state machines, distributed systems, or network protocols).
- Contributions to open-source SRE/DevOps projects.
- Certifications such as AWS Certified DevOps Engineer - Professional or Google Cloud Professional Cloud DevOps Engineer.



