The Site Reliability Engineer is responsible for designing, building, maintaining, and troubleshooting business solutions that use modern technologies, including but not limited to Kubernetes, OpenShift, GitLab CI/CD, Ansible, and Docker containers.
Job Duties:
- Build and maintain automation pipelines.
- Write and demonstrate operational procedures to ensure continuity across systems.
- Apply patches and updates to various systems through automated processes.
- Maintain security, backup, and redundancy strategies.
- Deploy and maintain container orchestration platforms.
- Create custom scripts to automate various tasks and workflows using Python, Bash, Ansible, GitLab CI, and Helm charts.
- Perform troubleshooting on various systems.
- Create monitoring scripts to enable more accurate alerting.
- Respond to system alerts and developer requests.
- Resolve tickets in the trouble ticket queue in a timely manner.