Overview
Two speakers will share their experience working with AI observability and multimodal RAG systems.
Course Description & Learning Outcomes
6 pm - Drinks and snacks
7 pm - AI Observability with Envoy AI Gateway: From Black Box to Glass Box by Adrian Cole
When AI applications break, traditional metrics fail. This session introduces Envoy AI Gateway, the CNCF project co-founded by Tetrate and Bloomberg to solve the real-world chaos of scattered LLM APIs, credential sprawl, and runaway costs. Built on the Kubernetes Gateway API specification, it provides a unified, cloud-native control plane for all LLM traffic.
But taming traffic is just the first step. This session reveals how to transform AI black boxes into debuggable systems using OpenTelemetry standards, specifically tuned for AI workloads with OpenInference conventions.
Key points include:
AI-Specific Metrics: Capture metrics that matter—Time to First Token, inter-token latency, and token usage—and export them to tools like Prometheus for real-time cost and performance debugging.
Distributed Tracing with Phoenix: See how OpenInference enriches traces with prompts and model responses, enabling powerful 'LLM-as-a-Judge' evaluations of your production traffic.
Kubernetes-Native Integration: Leverage standard Gateway API resources for seamless integration with existing cloud-native tooling.
A live demo will showcase not just built-in gateway metrics, but also how to configure and visualize evaluations for correctness, relevance, and hallucination detection.
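As a taste of the metrics discussed above, here is a minimal sketch of how Time to First Token and inter-token latency can be derived from token arrival timestamps in a streaming LLM response. The function names and shapes are illustrative, not the gateway's actual API.

```python
# Illustrative sketch: deriving AI-specific latency metrics (Time to First
# Token, mean inter-token latency, token count) from a stream of token
# arrival timestamps. Hypothetical helper, not Envoy AI Gateway's API.

def stream_metrics(request_start: float, token_times: list[float]) -> dict:
    """request_start and token_times are wall-clock timestamps in seconds."""
    if not token_times:
        return {"ttft": None, "mean_itl": None, "tokens": 0}
    ttft = token_times[0] - request_start            # Time to First Token
    gaps = [b - a for a, b in zip(token_times, token_times[1:])]
    mean_itl = sum(gaps) / len(gaps) if gaps else 0.0  # inter-token latency
    return {"ttft": ttft, "mean_itl": mean_itl, "tokens": len(token_times)}

# Example: request sent at t=0.0, tokens arrive at 0.4s, 0.45s, 0.5s
m = stream_metrics(0.0, [0.4, 0.45, 0.5])
```

Metrics like these are what the gateway would export to Prometheus for real-time cost and performance debugging.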
8 pm - Beyond Text: Building Multimodal RAG Systems with ColPali for Visually Rich Documents by Eng-Hwa
Retrieval-Augmented Generation (RAG) has revolutionized how we search and reason over enterprise knowledge, but traditional systems fall short when documents are visually rich—filled with charts, tables, diagrams, or intricate layouts. Enter ColPali: an innovative multimodal retrieval framework that encodes entire document pages—text and visuals together—as embeddings, enabling accurate, context-aware retrieval no matter how complex the source.
This talk will unveil a practical pipeline for building a multimodal RAG system using ColPali.
Key highlights:
Why ColPali? Unlike classic RAG which relies on error-prone OCR and fragmented text extraction, ColPali leverages vision-language models to treat pages as images—preserving layout, figures, and context. This allows us to find, for example, not just the table with last quarter’s revenue but the exact page as visually presented.
Modern System Architecture: We'll demonstrate a streamlined stack using ColPali for multimodal embedding and Qdrant for ultra-efficient vector storage, followed by a live demo of a LangGraph agent using ColPali for agentic vision RAG.
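The retrieval idea behind ColPali can be sketched in a few lines: a query is a set of token embeddings, each page a set of patch embeddings, and a page's score sums, over query tokens, the best-matching patch similarity (late interaction, often called "MaxSim"). This is a toy NumPy sketch of that scoring, not the actual ColPali implementation; all data here is random illustration.

```python
# Toy sketch of late-interaction ("MaxSim") scoring as used by
# ColPali-style retrievers. Embeddings are random stand-ins.
import numpy as np

def normalize(x: np.ndarray) -> np.ndarray:
    """L2-normalize each row so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def maxsim_score(query: np.ndarray, page: np.ndarray) -> float:
    """query: (n_tokens, d); page: (n_patches, d); rows L2-normalized."""
    sims = query @ page.T                  # (n_tokens, n_patches) similarities
    return float(sims.max(axis=1).sum())   # best patch per token, summed

rng = np.random.default_rng(0)
query = normalize(rng.normal(size=(4, 8)))                       # 4 query tokens
pages = [normalize(rng.normal(size=(16, 8))) for _ in range(3)]  # 3 page indexes
best = max(range(3), key=lambda i: maxsim_score(query, pages[i]))
```

In a real pipeline the per-page patch embeddings would come from the ColPali vision-language model and be stored as multivectors in Qdrant, with this kind of late-interaction scoring ranking whole pages rather than extracted text chunks.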
Schedule
Date: 04 Sep 2025, Thursday
Time: 6:00 PM - 9:00 PM (GMT +8:00) Kuala Lumpur, Singapore
Location: Red Hat Singapore, 88 Market Street, Level 45 CapitaSpring, 048948
Skills Covered
PROFICIENCY LEVEL GUIDE
Beginner: Introduces the subject matter; no prerequisites required.
Proficient: Requires learners to have prior knowledge of the subject.
Expert: Involves advanced and more complex understanding of the subject.
- Artificial Intelligence (Proficiency level: Proficient)
Partners
