Xtb

Senior Site Reliability Engineer

Job type

-

Found on Mokaru

14 hours ago

Salary

Not disclosed by employer

Job description

XTB is a global financial-industry company focused on online trading of financial instruments. We are the largest FinTech in Poland and a leader in Central and Eastern Europe, and our operations span several countries, including markets in Asia and South America. At XTB, we focus on the development of our employees, giving them opportunities to gain knowledge and skills in various fields and offering a number of training and development programs. If you are looking for challenges and want to gain valuable experience in an international business environment, XTB is the right place for you.

We are a certified Great Place to Work company.

We are seeking a Senior Site Reliability Engineer to define and drive the reliability of XTB systems at the scale of millions of clients. In this role, you will strengthen SRE practices and shape the resilience of our entire technology stack through high-impact observability, ensuring our systems remain robust and scalable.

Responsibilities

  • Observability Platform Engineering: Develop a standardized observability ecosystem. Implement a conscious telemetry model, focused on structured events, distributed tracing, and intelligent sampling strategies, that provides deep, actionable insights into system behavior.
  • Reliability Enablement: Act as a strategic partner to product engineering teams, providing the platform, standards, and data they need to own service reliability. Use error budgets and alerting as the primary language for balancing feature velocity with stability.
  • Proactive Resilience & Protection: Enhance detection capabilities to identify issues before they impact the customer. Leverage early-warning systems and AI/ML for automated anomaly detection and intelligent data analysis to continuously verify and strengthen system resilience.
  • Operations & Tooling: Build internal automation and tooling that streamlines SRE workflows, automates routine operational tasks, and enhances efficiency across the technology stack.
  • Incident Management & On-Call Rotation: Participate in an on-call rotation to provide incident management, ensuring rapid incident resolution, effective communication, and post-incident analysis to drive continuous improvement.

Requirements

  • Professional Background: At least 5 years of professional experience in SRE, Infrastructure, or DevOps roles managing high-scale, distributed environments.
  • Technical Engineering: Advanced programming skills in Python, with a strong focus on building scalable automation, internal tooling, and robust scripts.
  • Cloud & Orchestration: Hands-on expertise in managing production-grade Kubernetes environments, configuration management tools like Ansible, and designing resilient infrastructure architectures within Azure Kubernetes Service and on-prem environments.
  • Observability Engineering: Deep proficiency in building standardized telemetry ecosystems. You have mastered self-hosted open-source tools for observability data collection, storage, and visualization, such as Prometheus, Grafana, the ELK Stack, Tempo, Thanos, Jaeger, and similar tools.
  • Operational & Soft Skills: Ability to drive incident management, conduct thorough post-incident analysis, and foster a culture of reliability and shared ownership.
  • AI & Automation: Ability to leverage AI/ML techniques for SRE tasks, such as AIOps, automated anomaly detection, log analysis, and optimizing reliability workflows.
  • Bonus Tech: While we prioritize open-source standards, experience with commercial observability and APM solutions (e.g., Datadog, Splunk, New Relic) or chaos engineering frameworks is highly valued.

What we offer

  • Real influence on the development of the company and the product.
  • Work in an experienced team that is happy to share its knowledge.
  • A clear vision of development thanks to regular feedback and clear career paths.
  • Regular team-building meetings.

Benefits

  • A training budget for courses and conferences that interest you.
  • An extra day off on your birthday.
  • An extra day off for parents.
  • Equipment tailored to your needs.
  • Private medical care and group insurance.
  • Access to a benefits platform and an e-learning platform for learning English.
  • Access to a wellbeing platform and the opportunity to take advantage of workshops and private therapy sessions.
  • Work remotely, from the office in Warsaw, or from a coworking space in your city.