Cobre
Senior Platform Engineer
Salary
Job description
What we are looking for: The Cobre Infrastructure team is responsible for designing, operating, and continuously evolving the cloud platform that powers our financial services. As a Senior Platform Engineer, you will focus on maintaining and improving the reliability of our infrastructure and developer platform while building scalable CI/CD pipelines and automation systems. This role combines platform operations, CI/CD engineering, infrastructure and test automation, and AI-assisted operations, helping development teams ship faster and more safely while maintaining high reliability standards. You will collaborate closely with the Development and Cybersecurity engineering teams to streamline delivery workflows, reduce operational overhead, and continuously improve our platform capabilities.
What would you be doing: Design, maintain, and evolve CI/CD pipelines used by engineering teams to build, test, and deploy services.
Improve pipeline performance, reliability, and observability. Standardize deployment workflows and automate release processes. Build reusable pipeline components and templates to simplify developer workflows.
Develop automation to reduce manual operational tasks across infrastructure and delivery pipelines. Leverage AI-driven tooling and automation frameworks to improve operational efficiency (incident analysis, remediation automation, pipeline optimization). Explore and implement AIOps capabilities, such as automated runbooks, intelligent alert triage, and self-healing infrastructure. Support teams in defining the infrastructure that underpins the solution architecture. Support all infrastructure (AWS services and K8s clusters) and company products. Foster a culture of zero-downtime deployments. Assist with troubleshooting application issues and incidents related to infrastructure services across environments. Perform periodic load and scalability testing to establish baselines, detect drift, and inform capacity planning. Design and implement peak readiness reviews for anticipated high-volume periods. Contribute to incident management, postmortems, and reliability reviews.
What do you need: At least 5 years of proven experience in SRE and/or DevOps roles, with a strong focus on highly available and scalable environments, cloud infrastructure, observability, and incident management. In-depth technical knowledge of microservices architecture and cloud platforms and tooling (e.g., AWS, Kubernetes, Argo CD). Proven experience building and maintaining CI/CD pipelines (CircleCI, Jenkins, GitHub Actions, or similar).
Hands-on experience with Infrastructure as Code tools such as Terraform or Pulumi. Strong mindset for automation and continuous improvement, with a keen interest in AIOps and AI-driven auto-remediation (n8n, AWS Bedrock, Python scripting…). Understanding of secure-by-design infrastructure principles. Exposure to GitOps and declarative configuration patterns. Proven experience troubleshooting, mitigating, and resolving issues in distributed systems. Ability to define and execute the SRE strategy, aligning it with company goals and driving the adoption of SRE practices across multiple teams.


