(EN) Monitoring Nomad with Prometheus and Icinga
Infrastructure as Code, Service Discovery and Config Management have helped us to quickly build and rebuild infrastructure, but we haven’t spent nearly enough time training ourselves to review, monitor and respond to outages. Does our platform degrade gracefully? What does a high CPU load really mean? What can we learn from level 1 outages so that we can run our platforms more reliably? We all love infrastructure as code, we automate everything™. However, making sure all of our infrastructure assets are monitored effectively can be a slow and resource-intensive multi-stage process. During this talk we will investigate how to set up and monitor a scalable cloud native container platform built on HashiCorp’s Consul for service discovery, Nomad for container scheduling, and Traefik as an edge router. The talk will focus on making sure we have alerts and metrics in this quickly changing infrastructure landscape. We’re going to show how to integrate Icinga 2 with Consul and Nomad. To finish off, we’ll show how to visualize the Prometheus data in a way that resembles Netflix’s Vizceral, using freely available Grafana dashboards and plugins.
Bram spent the first part of his career as a molecular biologist. He then moved on to supporting his peers by building tools and platforms for them with many Open Source technologies, after which he joined Inuits to focus on helping more people deliver their software with Open Source tools.