Azure Site Reliability Engineer 

Job Location: Belgium
Job Category: Infrastructure
Job Type: Full Time

Service description:

We are seeking an experienced Azure Site Reliability Engineer to join our Engineering chapter team. You will play a critical role in ensuring the reliability, scalability, monitoring, and performance of our cloud-based services in the Consumer Centricity product organization. Your responsibilities will include designing and managing our infrastructure and implementing best practices. You will work within cross-functional teams to improve systems and processes and to ensure uptime and efficiency.

Requirements:

Key Responsibilities: 

  • Automation and CI/CD: Design, create, and maintain automation frameworks for deploying, scaling, and managing production environments.
  • System Monitoring and Maintenance: Implement and manage monitoring tools to ensure system health and performance. Proactively identify and fix issues before they impact users. 
  • Incident Management: Respond to and resolve incidents in a timely manner, perform root cause analysis, and implement measures to prevent recurrence. 
  • Performance Optimization: Analyze system performance and implement improvements to ensure scalability and efficiency. 
  • Capacity Planning: Conduct capacity planning assessments to predict system needs and ensure resources are in place to handle growth. 
  • Collaboration: Work closely with development teams to integrate system reliability into the development lifecycle through continuous integration and deployment practices.
  • Documentation: Create and maintain comprehensive documentation related to systems architecture, configuration, and operational procedures. 
  • Tool Development: Develop and maintain internal tools to streamline processes and improve system reliability. 
  • Security: Ensure that security controls are implemented, monitored, and maintained across all systems. 
  • Service Level Objectives (SLOs): Define and track SLOs to ensure reliability metrics meet business requirements.
  • On-call Support: Participate in on-call rotations to provide 24/7 support for critical systems and infrastructure. 

Qualifications: 

  • Experience: Minimum of 5 years in a Site Reliability Engineer or DevOps role with extensive experience in Microsoft Azure. 

Technical Skills: 

  • Proficiency in scripting and command-line tooling (Python, PowerShell, Azure CLI).
  • Experience with containerization technologies (Docker, Kubernetes). 
  • Proficiency in Azure Cloud services (VMs, Storage, Networking, etc.). 
  • Experience in Infrastructure as Code (IaC) tools such as Terraform, ARM templates, or Bicep to automate secure provisioning and configuration of Azure resources. 
  • Strong experience with monitoring, logging, and alerting tools such as Azure Monitor, Application Insights, or Log Analytics, and with third-party solutions like Grafana, Splunk, or Elastic Stack.
  • Strong understanding of cloud networking, hybrid cloud, and virtual networking concepts (e.g., VPNs, subnets, NSGs, load balancing, hub-and-spoke).
  • Experience in Azure governance and cost management using Azure Cost Management, Azure Policy, and management groups.

Soft Skills: 

  • Excellent problem-solving and analytical abilities. 
  • Strong communication and collaboration skills. 
  • Ability to work in a fast-paced environment and manage multiple priorities. 

Languages: English (C1). 

Not a must, but advantageous: 

  • Microsoft Azure certifications, such as Azure Solutions Architect Expert or Azure DevOps Engineer Expert. 
  • Experience with the following technologies: Kong, Event Hubs, Dapr.
  • Extra Languages: French (B1), Dutch (B1)