Monitoring in the Era of LLMs: How to Make Your Monitoring Work for You
As cloud environments grow in scale and complexity, traditional monitoring approaches increasingly overwhelm engineers with data while providing limited diagnostic insight. Large Language Models (LLMs) offer a new paradigm: transforming raw telemetry into actionable understanding. This session explores how on-premises LLMs can be integrated into cloud monitoring pipelines to augment observability, accelerate troubleshooting, and reduce cognitive load for operators. We examine practical techniques for using LLMs to correlate metrics, logs, and events; summarize anomalous behavior; and assist in root-cause analysis during live incidents. Through real-world examples and architectural patterns, attendees will learn how LLMs can “work for you” by acting as an intelligent monitoring assistant rather than another tool to manage. The talk emphasizes pragmatic adoption, highlighting design considerations, limitations, and operational trade-offs when deploying LLM-powered monitoring in production environments.
Speaker
Mohamed Elsakhawy, Western University
Mohamed Elsakhawy received his B.Sc. from Alexandria University, Egypt, and his M.E.Sc. and Ph.D. in Computer Science from Western University, Canada. He served as the Operational Lead of Compute Canada’s Cloud National Team between 2017 and 2021.
He served as the Vice-Chair/Chair of the OpenStack user committee between 2019 and 2020. Between 2022 and 2023, he was an Adjunct Professor in the Electrical and Computer Engineering department at the University of British Columbia, Canada.