James Shubin – (EN) Next Generation Config Mgmt: Monitoring

Mgmt is a next gen config management tool that takes a fresh look at existing automation problems.

Three of the main design features of the tool include:
* Parallel execution
* Event driven mechanism
* Distributed architecture

The tool has two main parts: the engine, and the language.
This presentation will demo both and include many examples showing how monitoring is built into each resource, and how events can cause the system to react and fix a problem before your pager even goes off.
Finally, we’ll talk about some of the future designs we’re planning and show how new users can easily get involved and help shape the project.

A number of blog posts on the subject are available. https://ttboj.wordpress.com/?s=mgmtconfig Attendees are encouraged to read some before the talk if they want a preview!

Kris Buytaert – (EN) Groovy There is a Docker in my Dashing Pipeline

Dashing, or rather Smashing, is an awesome monitoring dashboard, but it’s a PITA to deploy. This talk will document the efforts we went through to make the deployment of both Dashing and the dashboards fully automated. It will also show how we test these dashboards using Docker and how we build these pipelines with the JenkinsDSL.

Alba Ferri Fitó – (EN) With a little help from…the community

When we think of Open Source, we can’t forget the fundamental essence that sustains it: the community. From my point of view, Icinga2 has shown that it has a great and generous community, which has helped me customize the monitoring platform of Vodafone Spain in a fantastic and optimal way. I would like to share with you the different approaches we have chosen to monitor such a heterogeneous environment. These are some of the integrations I want to talk about:

  • MQ series
    We have created very nice monitoring scripts for the new IBM V.8 MQ Series.
    This was pretty challenging because the “black box” has a non-*nix-like OS.
    The only way to access it is ssh, but we cannot exchange ssh keys… It doesn’t have SNMP traps either… The solution was found using part of these… github.com/ibm-messaging/mq-appliance
  • Vcenter API integration
    Historically, we have always used SNMP traps for the VMware platform…until the SDK was released!
    We changed the whole monitoring approach, from passive to active, with check_vmware_esx.
  • Console certs @windows
    The Biztalk people asked us to monitor the certificates that were installed on the Biztalk servers.
    Searching the internet, I found the command that shows that info with PowerShell.
    I created the script with thresholds…
  • Check_logfiles
    We had a classic monitor to check logfiles.
    The behavior was to read the log, search for a pattern and, if no error was found, save the last line for the next execution. If an error is found, it exits with the corresponding severity, but…Icinga2 being a state-based monitoring solution, if no error is found in the log on the next execution, the state changes to OK, BUT maybe the error is still there!
    Depending on the nature of the problem, it may or may not keep writing to the log file… We have had several issues with this, because operators had missed the error. I was searching for different solutions…so I ended up finding https://labs.consol.de/nagios/check_logfiles/
    It fits perfectly for what we need.
  • API Icinga2 vRealize
    We have a vRealize solution, that we use to serve VMs on demand.
    With the help of the Icinga2 API, we have put a curl call from vRA to Icinga to create and delete the VMs in Icinga when someone asks for a VM.
  • VF theme
    I’ve changed the IcingaWeb2 theme to customize it for VF-ES.
    Since I know that VF Czech Republic uses Icinga2, I shared my experience with them.

It’s not too much of a technical thing, but it’s a good example of collaboration.
(some more in the pocket…)
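The Check_logfiles problem described above, where a check that only scans new log lines flips back to OK on the next run, can be illustrated with a small sketch. This is not how check_logfiles itself is implemented; the function name, the layout of the `state` dictionary and the `sticky` flag are assumptions made up for this illustration.

```python
import re

OK, CRITICAL = 0, 2  # exit codes used by Nagios-style plugins


def scan_log(lines, state, pattern=r"ERROR", sticky=False):
    """Scan only the lines added since the previous run.

    `state` is a dict carried between runs (hypothetical layout):
    'offset' is the number of lines already read, 'alarm' remembers
    an earlier match when `sticky` is enabled.
    """
    new_lines = lines[state.get("offset", 0):]
    state["offset"] = len(lines)
    found = any(re.search(pattern, line) for line in new_lines)
    if sticky:
        # Keep the alarm raised until someone clears state['alarm'].
        state["alarm"] = state.get("alarm", False) or found
        found = state["alarm"]
    return CRITICAL if found else OK


log = ["service started", "ERROR disk full"]
plain, held = {}, {}
first_plain = scan_log(log, plain)
first_held = scan_log(log, held, sticky=True)
log.append("heartbeat")  # the next run sees no new error line
second_plain = scan_log(log, plain)
second_held = scan_log(log, held, sticky=True)
print(first_plain, second_plain)  # 2 0: the plain check recovers by itself
print(first_held, second_held)    # 2 2: the sticky check keeps alerting
```

The plain variant reproduces the behavior operators kept missing, while the sticky variant holds the CRITICAL state until it is reset on purpose.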

Jochen Schalanda – (EN) Dig in the dirt

Now that you have your logs centralized within your IT Infrastructure, what’s next?
With an increased sophistication of cyber security attacks, it’s imperative to have measures set in place to detect suspicious activity. In this talk, we will demonstrate how to analyze DNS data to build dashboards, streams, and alerts using Graylog, an open source log management tool, in unison with open source log shippers.
DNS, as one of the most overlooked protocols, gives you deep insights into activities inside your network without the need for active agents on every device or deep packet inspection.

Christian Stein – (DE) Windows Monitoring – Einrichtung und Prüfung mit Icinga 2

Even with an open source monitoring solution one cannot avoid proprietary systems completely. Windows monitoring is covered by Icinga 2 with its own agent, but how can it be deployed to hundreds of systems without actually touching each system? How does Icinga 2 assist me, and how do I monitor my Windows systems? This talk will show deployment possibilities for Icinga 2 agents on Windows, and give examples of how to improve their monitoring.

Carsten Köbke & Michael Friedrich – (EN) Ops and dev stories: Integrate everything into your monitoring stack

There are many tools and possibilities available in the Icinga monitoring ecosystem. Set up Icinga 2 within a distributed architecture, put Icinga Web 2 on top and visualise alerts. Further, you’ve already set up your preferred graphing solution (Graphite, InfluxDB, etc.) and Grafana greets you with a shiny metrics dashboard. Log events are processed and Elastic Stack or Graylog add more possibilities to correlate them with monitoring alerts.

Having so many tools requires you to know how to connect them, or simply integrate an existing data source or frontend into your all-in-one user dashboard. Be it graphs in your detail view, additional log events on a critical service view or even a global map for location-based alerts. Maybe you’ll also want to provide as many details in your notifications as possible, think of fancy Grafana graphs.

This talk dives into existing and possible integrations, and explains why you sometimes just write your own integration. Carsten is the author of the Icinga Web 2 Grafana module while Michael focusses on Dashing and log processing. We’ll go from war stories (“meh, nothing exists”) to hero stories (“hooray, users love my integration”) and hope to motivate everyone out there to do the same. We will continue with practical development on Friday during the OSMC hackathon.

Dave Kempe – (EN) Icinga 2 in a 24/7 Broadcast Environment

I will present some war stories and implementation details from our Icinga2 deployments into television broadcast environments. From plugins we needed to develop to challenges in effecting change in staff practices, I will walk through the projects and share my experiences along the way.
This will be a useful talk for anyone looking to run a monitoring project and wondering how to get management and general staff on board.
Then we will cover the implementation of distributed monitoring in Icinga2 with strict firewalls, building dashboards using Nagvis and the integration of Opsgenie for alerting.
In addition, the process of training staff and using the Windows Agent installer to deploy Icinga to various Windows servers will also be covered.

Jochen Lillich – (EN) Monitoring with Sensu — it’s the sensuble thing to do

Well, I guess if I don’t get beaten up by the folks at Netways for presenting an alternative to Icinga, I certainly will be for the title of this talk. But my #monitoringlove for this software is just too strong! After suffering from Nagios for too many years, discovering Sensu saved what was left of my mental health. It also allowed us at freistil IT to grow our web hosting infrastructure without worries about check delays and scalability nightmares.

Sensu just went into its sixth year and counts big names like Yelp, GE, GoDaddy, T-Mobile and OpenTable among its large user community. In this update of my OSMC talk from 2014, I’m going to explain the basics of the Sensu monitoring framework and the advantages of its distributed architecture. Attendees of my talk will not only learn how easy Sensu is to use for health checks and metrics collection, but also what its current limitations are and what’s coming with Sensu 2.0 (aka “sensu-go”).

Michael Medin – (EN) Extending NSClient++

A beginner’s guide to extending the NSClient++ monitoring agent using the REST API and Python scripts (I really hate Lua so I won’t cover that). Some basic Python and REST knowledge is probably good, but apart from that we will start at the basics. The goal is to have some audience live hacking during the presentation, so please feel free to bring your computer (with curl installed) along, and some great ideas on how to botch my demos! Also please note that, given the nature of demos, this session might sadly end up with some Lego(TM)-free sections, but I will do my best to include as many Lego(TM) images as possible as backgrounds in terminal windows instead…

Falk Stern – (EN) Network Monitoring with LibreNMS and Icinga

This talk will give a brief overview of LibreNMS, network monitoring and the ecosystem that grew around LibreNMS. It will explain how to integrate LibreNMS with your current Icinga monitoring to ease network monitoring while keeping alarms in the same place. Also, you will gain some insights into metric-based monitoring through Graphite.

Toshaan Bharvani – (EN) Icinga 2 Multi Zone HA Setup using Ansible

This presentation demonstrates how to use Ansible to deploy Icinga2 in a multi-zone, distributed, highly available setup. It shows how to install a virtual machine as an HA master system, and how to install Icinga2 as a zone master with all features available on the zone masters. The last step is to install Icinga2 as an agent on the end nodes it needs to monitor.

Tobias Kempf & Michael Kraus – (DE) Hochautomatisiertes Warenlogistik-Monitoring bei Europas größtem Handelsunternehmen

In cooperation with ConSol Consulting & Solutions Software GmbH, a worldwide distributed, highly automated monitoring of logistics centres based on OMD has been developed over the last few years. It now consists of more than 200 cascaded, self-sufficient units that provide the administrators on site and in the central control with valuable information about their infrastructure as well as reporting data. In addition to automatically generated infrastructure and server checks, detailed business process and end-to-end monitoring is used to create detailed visualizations in addition to notifications. The system, which has run smoothly for several years, is currently being migrated to the OMD Labs Edition to ensure that it can handle future challenges such as container monitoring. Furthermore, all of the approximately 11,000 branches of the subsidiaries are to be connected to the system.

Julien Pivotto – (EN) Monitoring MySQL with Prometheus and Grafana

Database monitoring is not a new topic, so what can we still improve? With Prometheus, you can collect a lot of data at a high frequency and decide later which metrics are useful. Grafana, with Percona graphs, offers a very efficient dashboard solution. We will see how to glue everything together and find the best way to monitor your databases using open source tools only.

Rob Hassing – (EN) SNMP explained

An in-depth overview of the possibilities of SNMP and of how to monitor your environment using SNMP.
Learn within one hour what you can do with SNMP and what SNMP can do for you. Most aspects of SNMP are addressed: getting the information and setting values, but also how the information is presented and the difference between OIDs and MIBs.
In this presentation I’m trying to make SNMP “simple” again and understandable for everybody.
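As a rough illustration of the OID/MIB distinction mentioned above: an OID is a numeric path in a tree, and a MIB is the definition file that gives the nodes on that path readable names. The sketch below is not taken from the talk; it uses a hand-written excerpt of the SNMPv2-MIB system group in place of real MIB files, and the `translate` helper is a hypothetical name.

```python
# A tiny excerpt of what a MIB provides: names for numeric OID prefixes.
# The entries come from the SNMPv2-MIB system group; real tools load
# MIB files instead of a hand-written table like this one.
MIB_EXCERPT = {
    "1.3.6.1.2.1.1.1": "sysDescr",
    "1.3.6.1.2.1.1.3": "sysUpTime",
    "1.3.6.1.2.1.1.5": "sysName",
}


def translate(oid):
    """Return 'name.instance' for the longest known prefix, else the OID unchanged."""
    parts = oid.strip(".").split(".")
    for cut in range(len(parts), 0, -1):
        prefix = ".".join(parts[:cut])
        if prefix in MIB_EXCERPT:
            suffix = ".".join(parts[cut:])
            return MIB_EXCERPT[prefix] + ("." + suffix if suffix else "")
    return oid


print(translate("1.3.6.1.2.1.1.5.0"))    # sysName.0
print(translate("1.3.6.1.4.1.99999.1"))  # unknown OIDs stay numeric
```

An agent answers queries for numeric OIDs either way; without the matching MIB, the client simply cannot show the readable name.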

Team Icinga – (EN) Current State of Icinga

Current state of the Icinga project.

Richard Huber – (DE) MoTMa (Monitoring Ticketing Manager)

Reliable communication between monitoring and ITSM is a must, also from an SLA point of view. ITSM tools offer various interfaces. A web service is ideal, because it enables bidirectional communication.

This raises several challenges:

– If the web service is not working, who is responsible for delivering an incident?
– Several monitoring tools are in use
– The monitoring sends multiple incidents for the same event
– Events/incidents must be prioritized differently

MoTMa acts as an interface in between and addresses these requirements.

https://github.com/RealStuff/MoTMa

Markus Thiel – (DE) Monitoring – dos and don’ts

Who among those responsible for monitoring does not know these or similar questions? How could a CRM outage remain undetected for hours despite thorough monitoring? Why do we have 3000 events in the console even though everything works? In monitoring projects there are always multi-layered problems and challenges, and the causes may be of a technical or non-technical nature. The goal of this talk is to present field-tested approaches and tips on how to identify the causes by analysing the environment, and how to derive countermeasures and strategies from them.

Walter Heck – (EN) Monitoring and Alerting for logs

Many of us are using the Elastic Stack with Logstash as a way to gather logs in a central place and parse them into understandable information. Throw in Kibana for root cause analysis and Grafana for beautiful dashboards and the picture is almost complete. But there has been one thing missing: monitoring logs for issues and taking action on them in Icinga. This has recently been made possible by the Logstash output for Icinga (https://github.com/Icinga/logstash-output-icinga). This not only allows us to raise alerts, it also allows us to do things like schedule downtimes and add comments to hosts. In this session we’ll explore the possibilities brought on by this new Logstash output and show you some examples of what you can do with it.
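To give an idea of what “schedule downtimes” means at the API level, here is a minimal sketch that builds, but does not send, a request to the `schedule-downtime` action of the Icinga 2 REST API. It calls the API directly and does not show the Logstash output’s own configuration. The endpoint URL, host name, author and comment are made-up values, authentication is left out, and the field names should be checked against the Icinga 2 API documentation before use.

```python
import json
import urllib.request


def downtime_request(base_url, host, start, end, author, comment):
    """Build (but do not send) a schedule-downtime call for one host."""
    payload = {
        "type": "Host",
        "filter": 'host.name=="%s"' % host,
        "start_time": start,  # Unix timestamps
        "end_time": end,
        "author": author,
        "comment": comment,
    }
    return urllib.request.Request(
        base_url + "/v1/actions/schedule-downtime",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Accept": "application/json",
                 "Content-Type": "application/json"},
        method="POST",
    )


req = downtime_request(
    "https://icinga.example.org:5665",  # hypothetical API endpoint
    "web01", 1510000000, 1510003600, "logstash", "deploy detected in logs",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data)["filter"])
```

Building the request separately from sending it keeps the payload easy to inspect; a real setup would add API user credentials and TLS verification before passing it to `urllib.request.urlopen`.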

 

Anthony Goddard – (EN) Monitoring Challenges in a World of Automation

Public and private cloud infrastructures promise to make fully dynamic infrastructure a reality: compute instances can be provisioned and terminated at a moment’s notice, all in response to customer demand. Though “auto-scaling” was once held as the pinnacle of infrastructure automation, it is now considered table stakes. And while this has relieved certain operational burdens (developers can now have access to “on-demand” compute!), it has also created new challenges.

Bodo Schulz – (DE) Automatisiertes und verteiltes Monitoring in einer CI Umgebung

Integrating black-box monitoring into a fully automated Continuous Integration / Deployment environment can be challenging.
The talk shows which techniques are used and sketches out and explains the underlying structure.
I would also like to present my own solutions, which are available as open source, such as an automated Icinga2 master/satellite setup or a service discovery developed for Java applications.

Thomas Gelf – (DE) Automated Monitoring in heterogeneous environments

The world out there is neither perfect nor uniform, and that’s good as it is. You’re using VMware, but also running KVM. A little bit of AWS is a must, and something has been deployed to Azure. Evaluation projects for Mesos/Marathon and Kubernetes are under way, some of them already running in production. A lot of information is in your Active Directory, but some departments are only half-way in. A lot of orphaned entries are to be found. Some use Puppet, experiments with other tools are going on, and quite a few things are still under manual control. There are three CMDBs, but none of them is complete. There is an Excel sheet for IP address reservations. Oh, and by the way, network people are of course using their very own tool-chain.

In environments like these, Icinga Director comes into its own. Using concrete implementations from daily practice, this presentation shows how to build a fully automated monitoring system based on varying data sources. Optionally, you can have different degrees of automation to accommodate varying speeds within individual teams.

This is not meant to be an introduction to Director. Instead, dedicated solutions for specific problems in real projects will show what the software can do.

Rihards Olups – (EN) How is Zabbix doing – an outside look

Zabbix is an open source monitoring tool that has been rapidly evolving during the last few years. We will talk about the growth of the product and look at it from several perspectives:

  • technical – how Zabbix has developed functionally, important decisions made
  • project management – which processes help to improve the software quality and which ones help less
  • community – how open Zabbix is and how that has changed over the years, both towards more and towards less openness

The talk will illustrate the points made with examples from the Zabbix community as well as from extensive Zabbix use at Nokia.

Marianne Spiller – (DE) Ich sehe was, was du nicht siehst (… und das ist CRITICAL!)

If a department spends most of its time reacting to incoming catastrophes, this is mostly due to non-existent or suboptimal monitoring. But what does “perfect” monitoring consist of? What must be considered in the run-up? And does the benefit justify the effort? This talk will also be a report on my experience with Icinga 2, not only about the integration of hardware, but also about the integration of the team. And about interaction and the Icinga Director, the charm of graphs and of contributing to the community, about taking a trip into the world of home automation, and why “finished” never means “really finished”.

Ronny Trommer – (EN) Another year with OpenNMS

Built on an event-driven architecture, OpenNMS monitors applications and networks at an enterprise scale. Developers can configure monitoring workflows and data easily using the ReST API. OpenNMS is developed under the AGPLv3, so it’s completely open source.
In this talk, I will discuss improvements we’ve made in the last 3 major releases of OpenNMS Horizon and what is in the development pipeline for the next major release.

Martin Schurz – (EN) Building a monitoring solution for modern applications

Modern applications require modern monitoring solutions that can react quickly to changes in the monitored applications (think of autoscaling, updates). After many years, our old monitoring system, based on Nagios and Cacti, was not holding up anymore. This talk tells the story of our journey from our old system through defining our requirements and multiple tool evaluations (Zabbix, Prometheus, Icinga2) to our current implementation based on Icinga2. I will also show some of our implementation details and how we solved problems in our deployment.

Lennart Betz & Janina Tritschler – (DE) Verteilte Icinga 2-Umgebungen realisieren und automatisieren mit Puppet

Building and automating distributed Icinga 2 environments with Puppet.
The talk contains an introduction to distributed monitoring with Icinga 2, the integration of Icinga with Puppet, and automating the monitoring of hosts as well as different services.

Kevin Honka – (EN) Icinga 2 + Director, flexible Thresholds with Ansible

An introduction to the use of flexible thresholds and how to adapt your checks to an ever-changing environment.
Central points:
– Where can I use flexible thresholds?
– What are the advantages and disadvantages?
– Is it worth it?

Thomas Widhalm – (EN) Troubleshooting Icinga 2

What do I do if Icinga 2 stops working?
How can I find out what’s wrong and how do I fix it?
Where can I find help and what information should I provide?
And foremost, how do I know *if* Icinga 2 is doing what I want?
If you ask yourself at least one of these questions regularly, then this talk is for you.