Site Reliability Engineering (SRQ159374)


Service description:
We are looking for a Site Reliability Engineering service for our Engineering chapter team. The goal is to ensure the reliability, scalability, monitoring, and performance of our on-premises services in the ERA product organization. Responsibilities include designing and managing our infrastructure and implementing best practices. The role involves working within cross-functional teams to improve systems and processes and to ensure uptime and efficiency.

  • Design and maintain monitoring infrastructure
  • Create custom dashboards, alerts, and visualization solutions
  • Implement distributed tracing and log aggregation systems
  • Establish monitoring best practices and SLI/SLO frameworks
  • Maintain security compliance for on-premises monitoring tools
  • Automate deployment and configuration management
  • Collaborate with development teams on application instrumentation
  • Participate in on-duty rotations

Requirements:

  • Core Technologies: advanced Grafana, Prometheus (PromQL), OpenTelemetry, Elasticsearch
  • Programming: Python, Bash, or Go for automation
  • Experience: 3+ years in monitoring/observability, 2+ years running Grafana/Prometheus in production, strong Linux system administration experience, proven track record with on-premises infrastructure solutions
  • Security: enterprise security practices, compliance requirements
  • Ability to balance technical trade-offs with business needs and prioritize effectively.
  • Participation in on-duty rotations (24/7 incident support)
  • Reduced MTTD/MTTR through effective monitoring
  • Comprehensive observability across all systems
  • Automated monitoring, deployment, and management
  • Security-compliant monitoring practices

Additional information:
Location: Brussels (Empereur)
Onsite presence: By default, a physical presence on site is required for 2 days per week.
Work regime: full-time

Question: Is the consultant willing to participate in on-duty rotations (24/7 incident support)?
