(EN) On the Bleeding Edge of OpenTelemetry
Tracing and telemetry are popular topics right now, but development moves so quickly that it also causes confusion:
- Starting with OpenTracing, then W3C Trace-Context, and now OpenTelemetry, there are plenty of standards, but what do they cover, and what don’t they?
- How do the Cloud Native Computing Foundation (CNCF) and its projects like Jaeger play into that?
- Where is OpenTelemetry headed, and how can projects tie into it?
This talk gives an overview of standards, projects, and how they all tie together.
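To make one of the standards named above concrete: W3C Trace Context propagates a trace between services in a `traceparent` HTTP header of the form `version-traceid-parentid-flags`. The following is a minimal sketch, not taken from the talk, of parsing that header in Python. The names `TraceParent` and `parse_traceparent` are hypothetical, and the sketch assumes only version `00` headers need to be accepted.

```python
import re
from typing import NamedTuple, Optional

# Four lowercase-hex fields separated by dashes:
# 2-char version, 32-char trace-id, 16-char parent-id, 2-char trace-flags.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the parsed header fields."""
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is not a valid version-00 value."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Simplification (assumption): only version 00 is accepted here.
    if version != "00":
        return None
    # All-zero trace and parent IDs are defined as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # The lowest bit of trace-flags is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))
```

A service that receives a valid header would keep the trace ID and create a new parent ID for its own outgoing calls; a header that fails to parse is typically ignored and a new trace is started.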
Jacob Baungård Hansen
(EN) Scaling Naemon deployments to Kubernetes with Merlin
Merlin is a module that adds redundancy and load balancing to Naemon. With Merlin it is possible to horizontally scale your monitoring deployment as your monitoring estate grows. In this presentation we’ll give an overview of Merlin and its latest features. We’ll demonstrate how Merlin can be used to scale a Naemon deployment to multiple servers, including new functionality that allows deployment to Kubernetes and makes use of the Kubernetes autoscaling feature.
(EN) inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
If you want to trace or extract specific data from a Java application with OpenTelemetry, you usually have to modify the application’s code. However, this is often not possible, especially with bought-in software. We would like to show how the open source inspectIT Ocelot Java agent can be used to dynamically inject OpenTelemetry code at runtime to extract specific application and business data – all without having to adapt the application itself.
(EN) Contributing to Open Source with the example of Icinga
Have you ever contributed to an open source project? There are tonnes of different ways to help out, and we want to show you how, from GitHub workflows and general contributions to more specific Icinga-related topics. We at Icinga have been working on some guidelines for getting started with development on our projects – contributing to the Icinga project has never been easier! That could be working on a plugin or a web module, fixing bugs in Icinga Web 2 or Icinga 2, adding features to the Director or simply adapting the documentation.
(EN) Advanced MySQL optimization and troubleshooting using PMM 2
Optimizing MySQL performance and troubleshooting MySQL problems are two of the most critical and challenging tasks for MySQL DBAs. The databases powering your applications need to be able to handle changing traffic workloads while remaining responsive and stable so that you can deliver an excellent user experience. Further, DBAs are also expected to find cost-efficient means of solving these issues. In this presentation, we will demonstrate the advanced options of PMM version 2, which is built on free and open-source software, that enable you to solve these challenges. We will look at specific, common MySQL problems and review them.
Ignite | (EN) Icinga-Installer – the easy way to your Icinga
This presentation shows you how the Icinga-Installer can be used: ranging from an easy Single-Icinga-Installation with agents to integrating Satellites and using it in HA-Environments.
Ignite | (EN) Pipeline your Dashboards as Code
Manage your Dashboards
Reproducible dashboards for the masses, as code,
Reproducible dashboards for the masses, in different environments,
Reproducible dashboards for the masses, reusable.
Ignite | (EN) Overengineering your personal website; the hold my beer edition
Let’s be honest, whether consciously or not, we all do it. In this talk we’ll discuss how far down the rabbit hole one can go while serving only a single static HTML page: from the humble beginnings as a Markdown file to several layers of monitoring and automation. We are now 2 years on from the last time we discussed this, so it’s time to show how many more layers of stupid can be added.
(EN) Monitoring Open Infrastructure Logs – With Real Life Examples
This session is a mix of discussion & live demo topics:
- Intro to OpenInfra/OpenStack (Why you need your own Cloud)
- What Service Logs to gather and how to format and filter them
- Optimizing data as time series indices
- Visualizing large quantities of logs – what’s important?
- Demo Scenario: Response Times – maintaining your SLAs
- Demo Scenario: Tracking Storage growth over time – predicting when to expand
- Demo Scenario: Identifying priority service problems
- Demo of building custom visualizations
(EN) Gamification of Observability
Athletes, firemen and doctors train every day to be the best at their chosen profession. As engineers, we spend much of our time getting stuff to production and making sure our infrastructure doesn’t burn down outright. However, we spend very little time learning to understand and respond to outages. Things like Infrastructure as Code, Service Discovery and Config Management can and have helped us to quickly build and rebuild infrastructure, but we haven’t spent nearly enough time training ourselves to review, monitor and respond to outages. Does our platform degrade gracefully, and what does a high CPU load really mean? What can we learn from level 1 outages to be able to run our platforms more reliably? In this talk we’ll discuss the need for and the options for creating a game day culture, where we as engineers not only write, maintain, and operate our software platforms but actively pursue ways to learn and predict their (non-functional) behaviour. We’ll look at tools like Prometheus, Loki, Tempo and toxiproxy for ways to prepare teams to tweak their testing and monitoring setup and work instructions to quickly observe, react to and resolve problems.
(EN) Use OpenSource monitoring for an Enterprise Grade Platform
There are many tools and frameworks for monitoring. Usually, when you think of an open source solution, you don’t think of implementing it in a COTS product. Nevertheless, this session will show how you can implement tools such as Prometheus, Grafana and ELK in such an enterprise application platform. Monitoring performance, throughput and error rate is important to be in control of your transactions. If you use a Service Bus or SOA/BPM suite product, there are a lot of out-of-the-box diagnostics waiting for you. The puzzle is how to get them out in a useful way. Besides the many commercial solutions, open source tools can also help you out with this. You can export runtime diagnostics out of the Diagnostics framework, monitor your SOA Composites and trace down Service Bus statistics using Prometheus and Grafana. The session will elaborate on how to set up proper monitoring using these tools, also in a proactive way, where automated monitoring is a must for every application environment.
(EN) Handling 250K flows per second with OpenNMS: a case study
What does it take to go from no flow support, to handling huge volumes of heterogeneous flow data in a 100% open-source monitoring stack, in a real-world environment? Expect a brief refresher on flows, an overview of the customer environment, and discussion of the engineering challenges faced. A medium dive follows into the movement of flow data from ingest to query and display, the solution architecture as it exists today, and lessons learned and their application to the project roadmap.
Stephan Schmidt – Tobias Berdin
(EN) Thola – A tool for monitoring and provisioning network devices
Thola is a new open source tool, written in Go, for reading, monitoring and provisioning network devices. This talk covers the current state of development as well as planned features, including reading out inventory, configuring network devices, support for other monitoring systems like Prometheus, and many more. Thola serves as a unified interface for communication with network devices and features a check mode which complies with the monitoring plugins development guidelines and is therefore compatible with Nagios, Icinga, Zabbix, Checkmk, etc.
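For readers unfamiliar with the monitoring plugins development guidelines mentioned above: a compliant check prints one status line, optionally followed by performance data after a `|`, and reports its state through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below illustrates that convention in Python. It is not Thola's code (Thola is written in Go), the function name `evaluate` is hypothetical, and the plain "greater or equal" thresholds are a simplification of the range syntax the guidelines define.

```python
from typing import Optional, Tuple

# Exit codes defined by the monitoring plugins development guidelines.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
_LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(
    name: str,
    label: str,
    value: Optional[float],
    warn: float,
    crit: float,
    uom: str = "",
) -> Tuple[int, str]:
    """Return (exit_code, output_line) for a single measured value.

    Assumption: higher values are worse, and thresholds are simple
    lower bounds rather than full guideline range expressions.
    """
    if value is None:
        return UNKNOWN, f"{name} UNKNOWN - no data for {label}"
    if value >= crit:
        state = CRITICAL
    elif value >= warn:
        state = WARNING
    else:
        state = OK
    text = f"{name} {_LABELS[state]} - {label} is {value}{uom}"
    # Performance data format: label=value[UOM];warn;crit
    perfdata = f"{label}={value}{uom};{warn};{crit}"
    return state, f"{text} | {perfdata}"
```

A real plugin would print the returned line and pass the code to `sys.exit()`, which is what lets Nagios, Icinga and similar systems consume the same check unchanged.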
(EN) Secure Password Vaults with Naemon
Monitoring systems are a preferred target for malicious intruders because they have access to many interesting things in your network. They also store a great many passwords for websites, SNMP credentials for network devices, and so on. Usually these are stored in plain text, and they are not hard to find either.
So, to make things a bit harder for black hats, Naemon introduces the Vault API for secure storage of passwords and other things you would not want to store in plain text.
This talk introduces the API and gives some examples of how to use it.
(EN) Monitoring Open Source Hardware
As part of a new initiative to enable open source hardware, multiple manufacturers, including IBM, HPE, and others, offer machines with open source hardware, firmware, and software. This provides more opportunities for monitoring, with both agentless and agent-based data. This presentation will show some of this open source hardware and how you can control and monitor it using Icinga2.
(EN) pg_stat_monitor: A cool extension for better database (PostgreSQL) monitoring
pg_stat_monitor is a statistics collection tool based on PostgreSQL’s contrib module pg_stat_statements. pg_stat_statements provides only basic statistics, which is sometimes not enough. Its major shortcoming is that it accumulates all the queries and statistics but does not provide aggregated statistics or histogram information, so a user needs to calculate the aggregates, which is quite expensive. pg_stat_monitor provides pre-calculated aggregates: it collects and aggregates data on a bucket basis. The size and number of buckets can be configured using GUC (Grand Unified Configuration) parameters. The talk will cover the usage of pg_stat_monitor and how it improves on pg_stat_statements.
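The idea of bucket-based aggregation can be illustrated outside the database. The sketch below is not how the extension is implemented (it runs inside PostgreSQL); it only shows, in Python, what "pre-calculated aggregates per bucket, with a histogram" means. The bucket length, the histogram edges, and the names `QueryStats` and `aggregate` are all assumptions made for this illustration.

```python
import bisect
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple

BUCKET_SECONDS = 60                  # hypothetical bucket length
HISTOGRAM_EDGES_MS = [1, 10, 100, 1000]  # hypothetical histogram boundaries


@dataclass
class QueryStats:
    """Running aggregates for one query within one time bucket."""
    calls: int = 0
    total_ms: float = 0.0
    # One slot per range: <1, 1-10, 10-100, 100-1000, >=1000 ms.
    histogram: List[int] = field(
        default_factory=lambda: [0] * (len(HISTOGRAM_EDGES_MS) + 1)
    )

    def add(self, duration_ms: float) -> None:
        self.calls += 1
        self.total_ms += duration_ms
        self.histogram[bisect.bisect_right(HISTOGRAM_EDGES_MS, duration_ms)] += 1

    @property
    def mean_ms(self) -> float:
        return self.total_ms / self.calls if self.calls else 0.0


def aggregate(
    samples: Iterable[Tuple[float, str, float]]
) -> Dict[Tuple[int, str], QueryStats]:
    """Group (timestamp, query, duration_ms) samples by (bucket, query)."""
    buckets: Dict[Tuple[int, str], QueryStats] = defaultdict(QueryStats)
    for timestamp, query, duration_ms in samples:
        bucket_id = int(timestamp // BUCKET_SECONDS)
        buckets[(bucket_id, query)].add(duration_ms)
    return dict(buckets)
```

Because the aggregates are updated as each sample arrives, reading the mean or the latency distribution for a bucket is a lookup rather than a scan over raw rows, which is the cost the abstract says users otherwise pay themselves.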
(EN) Open Source API-HUB – Connect Icinga2, Zabbix, CheckMK and more with OpenCelium
How can you use a smart service bus system with a good web GUI to synchronize data from one system to another? With a loosely coupled architecture, combined with the newest technologies and a smart backend core system. We use the standard descriptions found in API documentation, such as WSDL, to integrate other systems into OpenCelium. How do we define the communication? We provide an overview in which you can set up the order of calls and use operators like iterations and conditions, all based on open source tools. I will explain the architecture and show the overview with real examples, such as synchronizing hosts between Icinga2, Zabbix and CheckMK. So please don’t miss it.
(EN) Open Source Application Performance Monitoring in the Enterprise
I will show our journey of the implementation/integration of an Open Source Application Performance Monitoring solution, based on
– inspect-IT (http://inspectit.rocks)
– OpenCensus (http://opencensus.io)
– Jaeger (http://jaegertracing.io)
– InfluxDB (http://influxdata.com)
– Grafana (http://grafana.com)
We are instrumenting more than 1,000 JVMs and more than 100 applications. We use JVM instrumentation and JS/browser/end-user monitoring to measure the performance of our applications. I will cover the pitfalls and successes of the implementation, and how it can help with application performance monitoring.
(EN) Observability is More than Logs, Metrics & Traces
You know the drill: DevOps is using tool(s) X. So obviously, observability can be solved by throwing some tools together as well, generally logs, metrics, and traces, often called the trifecta of observability. But observability is not a tool; it is a property of a system. It means moving from many small black boxes to a more holistic view of your system. It includes tools, but not exactly three distinct features (especially if your solution happens to support those). For example, if half your user base cannot access your service because of some bad DNS settings and external health checks are not part of your trifecta, you are none the wiser. This is not (just) a rant, but a look at the actual value to be added and some approaches to it, like turning your logs into richer events that align with your business, which is not solved by fancy tools alone.
(DE) SNMP Monitoring with Prometheus / Selecting OIDs Dynamically and Keeping Them Under Control
SNMP and Prometheus have become a well-rehearsed team when it comes to collecting metrics via SNMP. But before the colorful dashboards comes the challenge of finding the right set of metrics or OIDs for each device. With a mix of very different vendors, each with various products, this is no easy task.
In this talk I will show how an SNMP discovery can be used to determine the interesting OIDs, and how all the configuration required for Prometheus can then be generated and deployed automatically.
Daniel Uhlmann – Sebastian Gumprich
(EN) Still directing the director… and more!
To monitor our systems, we make extensive use of Icinga, its Director, and the business process monitoring module. We also make broad use of automation (at least we try to!). In this talk we would like to tell you how we automated the monitoring of our services using our self-written Ansible collections. We will cover how we developed the Ansible components and how we use them. We’ll also show you what we plan to do with them in the future.
(EN) ITSM by Asterix and friends
You can only monitor systems that you know!
GLPI is a very successful open source ITSM solution. The project follows a modular approach and can therefore be extended by many very useful plugins. And yes … GLPI is mainly “French” 😊 !
In this very short introduction, I’ll give you a quick overview of how to:
- automate your IT inventory to manage PCs, servers, VMs, VMware, …
- add printers and network components via SNMP
- add special assets like databases, appliances, URLs, lines, racks, datacenters…
- add additional information to all these components
- add people from your LDAP / AD
- add plugins to GLPI
- build reports
- import / export your data
- handle tickets, problems, changes, or projects
In my second presentation, “Monitoring @ G&D”, I will later show you how we’ve automated our monitoring with the help of GLPI, some DB views and Python scripts.
(EN) Observability will not fix your broken Monitoring, or Culture
Plenty of people are jumping on the new hype, Observability, and lots of them are replacing their “legacy” monitoring stack. Not all of them achieve the goals they set. This talk covers the pitfalls of adopting new technologies the wrong way and will teach you how to improve your monitoring by adapting your culture and then maybe your tooling, based on some real-life stories.
(EN) Monitoring @ G&D
At G&D we have one ICINGA system specialized in monitoring our complex SAP environment. To keep ICINGA “up to date” the “Config Build” is automated with the help of GLPI.
All technical information is collected by GLPI’s “Fusioninventory” plugin; some custom ICINGA fields are added with the “Fields” plugin to our server, database and SAP objects.
To build the ICINGA configuration we use various database views (GLPI’s MySQL) and some Python scripts … but it would be possible to use the “Icinga Director” as well.
Finally, we are informed if the monitoring configuration would change due to system changes detected by GLPI. This means that we can adjust our monitoring fully- or semi-automatically.
(EN) Icinga for Windows – Evolution
Icinga for Windows has grown over the past years, and its popularity and usage have increased considerably. In this talk we will discuss the current state of Icinga for Windows, future plans, and the new features and improvements that will follow with v1.7.0. In addition, we will provide a short summary of how Icinga for Windows helps developers to easily write their own plugins with minimal effort.
(EN) Robotmk: You don’t run IT – you deliver services!
Business applications have to be available, performant and functioning. Full stop. Even with thousands of infrastructure monitoring checks, you won’t be able to even begin to monitor the end-user’s perspective. The fact is: you monitor your IT, but you can only hope that your services will work. Time to change that. Time to use a framework. Time to use Robot Framework. My presentation will show you the demand for End2End-Monitoring and why Robot Framework is an excellent choice for automated application tests. You will also get to know Robotmk, the link between Robot Framework and Checkmk. It dovetails both tools extremely closely and gives your infrastructure monitoring a holistic approach. It is used by companies across diverse industries, as well as by public authorities and governments. And once you have discovered the KubernetesLibrary, DataDriver, RequestsLibrary and all the many more libraries, you will not want to put Robot Framework down again. But that’s another story…