Site Reliability Engineering (SRE) Manager

Xebia
100% remote from anywhere in the world
Full-time

Job description

Hello, let's meet!

We are Xebia - a place where experts grow. For nearly two decades now, we've been developing digital solutions for clients from many industries and places across the globe. Among the brands we've worked with are UPS, McLaren, Aviva, Deloitte, and many, many more.

We're passionate about Cloud-based solutions. So much so that we have partnerships with three of the largest Cloud providers in the business - Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). We even became the first AWS Premier Consulting Partner in Poland.

Formerly we were known as PGS Software. In 2021, we joined Xebia Group - a family of interlinked companies driven by the desire to make a difference in the world of technology.

Xebia stands for innovation, talented team members, and technological excellence. Xebia means worldwide recognition and thought leadership, which regularly gives us the opportunity to work on global, innovative projects.

Our mission can be captured in one word: Authority. We want to be recognized as the authority in our field of expertise.

What makes us stand out? It's the little details, like our attitude, dedication to knowledge, and belief in people's potential - emphasizing every team member's development. Obviously, these things are not easy to present on paper - so make sure to visit us to see it with your own eyes!

Now, we've talked a lot about ourselves - but we'd love to hear more about you.

Send us your resume to start the conversation and join #Xebia.

You will be:

  • recruiting, developing, and mentoring the SRE team, including setting goals and tracking their achievement,
  • supporting engineers' skill development through coaching and clear expectation setting,
  • defining and implementing SRE best practices, standards and processes, including Service Level Objectives (SLOs), to ensure service reliability and performance,
  • delivering Terraform-based automation in Google Cloud, including project creation, user management, and service enablement, while optimizing cloud costs,
  • designing secure IAM roles, permissions, and monitoring systems to enhance security, user experience, and proactive issue detection,
  • collaborating with development and security teams to ensure reliability, system security, and compliance, while proactively addressing potential issues,
  • prioritizing a customer-focused approach, delivering exceptional user experiences for infrastructure services with clear and effective communication,
  • analyzing system metrics to identify performance bottlenecks and opportunities for improvement, and implementing capacity planning strategies for resilience under high load,
  • continuously monitoring and optimizing system performance.

Requirements

Your profile:

  • 8 years of experience with data structures or algorithms,
  • 5 years of experience with software development in one or more programming languages,
  • 3 years of experience managing people or teams, leading projects, and designing, analyzing, and troubleshooting distributed systems,
  • excellent problem-solving and analytical skills,
  • strong understanding of software development lifecycle (SDLC) and DevOps principles,
  • deep technical expertise in cloud computing platforms (GCP preferred),
  • proficiency with Infrastructure-as-Code (IaC) tooling, such as Terraform,
  • proven experience with monitoring tools (Prometheus, Datadog, New Relic),
  • experience with automation frameworks (Ansible, Puppet, Chef),
  • fluent in English (B2-C2),
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience.

Working from within the European Union and holding a valid work permit are required.

Nice to have:

  • Google Cloud, Azure or Kubernetes certifications.

Recruitment Process:

CV review - HR call - Interview - Client Interview - Decision

Originally posted on Himalayas

Category

Job role: DevOps
Knowledge/skills: IT

Employment type

Full-time, 100% remote.

Location

Anywhere in the world.
