Flink Forward Barcelona 2025
Apache Flink® & Everything Streaming Data

The future of AI is real-time

Agenda

This year, Flink Forward Barcelona 2025 offers a four-day conference schedule.

The first two days are dedicated to optional in-person, expert-led training courses. The main conference follows with two days packed with content, including sessions selected by the Program Committee from across the streaming data community.

Flink Forward Barcelona 2025 is sure to provide a vibrant social environment, with plenty of opportunities to network with expert speakers, fellow attendees, and partners in the expo hall. To close out the conference, join the community at Flink Fest, the party you won't want to miss.

Day    Date               Morning           Afternoon   Evening
Day 1  Monday, Oct 13     Training          Training
Day 2  Tuesday, Oct 14    Training          Training    Speakers Dinner
Day 3  Wednesday, Oct 15  Keynote sessions  Conference  Sponsors Cocktail
Day 4  Thursday, Oct 16   Conference        Conference  Flink Fest

Bootcamp Program

The Ververica Bootcamp Program is an intensive initiative that transforms Apache Flink users into proficient data processing professionals. By translating complex Flink concepts into practical exercises rooted in real-world scenarios, we empower participants to tackle their toughest data challenges. Leveraging Ververica technology, participants gain a deep understanding of Flink and learn to optimize the scalability and efficiency of their solutions. This program is not just about learning; it is about mastering Apache Flink and leading the future of data processing.

Level Up Your Stream Processing Skills

This intensive, 2-day face-to-face program is designed for Apache Flink users with 1-2 years of experience who want to take their skills to the intermediate level. We'll delve into advanced Flink concepts and techniques, empowering you to build and deploy highly scalable and efficient real-time data processing pipelines. Leveraging Ververica technology, you'll gain a deeper understanding of Flink and explore best practices for production deployments.

Target Audience

Apache Flink users with a minimum of 1-2 years of experience who are comfortable with core concepts and want to become proficient in advanced functionalities.

Key Topics

  • Advanced Windowing Operations
  • Time Management Strategies
  • State Management Techniques
  • Serialization Optimization
  • Exactly-Once Processing
  • Fault Tolerance
  • Enrichment Techniques
  • Scalability Optimization
  • Flink SQL Functions
  • Table API Features
  • Workflow Design
  • Using Paimon Effectively

Master Advanced Windowing Operations in Apache Flink:

  • Understand and implement session windows, tumbling/sliding windows with triggers, and time management strategies (Event Time, Processing Time, Ingestion Time).
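A minimal sketch of an event-time session window in the DataStream API; the (userId, count, eventTimeMillis) tuples, the 5-second out-of-orderness bound, and the 15-minute gap are all illustrative choices, not part of the course material:

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class SessionWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Illustrative events: (userId, count, eventTimeMillis).
        env.fromElements(
                Tuple3.of("alice", 1L, 1_000L),
                Tuple3.of("alice", 1L, 2_000L),
                Tuple3.of("bob",   1L, 1_500L))
            // Event time, tolerating up to 5 seconds of out-of-orderness.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy
                    .<Tuple3<String, Long, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((event, ts) -> event.f2))
            .keyBy(t -> t.f0)
            // A session closes after 15 minutes with no events for that key.
            .window(EventTimeSessionWindows.withGap(Time.minutes(15)))
            .sum(1)
            .print();

        env.execute("Session window sketch");
    }
}
```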

Optimize State Management for High Performance in Flink Applications:

  • Apply advanced state management techniques including state partitioning and RocksDB integration.
  • Optimize state size and access patterns for enhanced performance.
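As a sketch of these ideas, the hypothetical job below keeps a per-key running count in partitioned ValueState and enables the RocksDB state backend so state can grow beyond the JVM heap (the flink-statebackend-rocksdb dependency is assumed to be on the classpath):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class StateSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // RocksDB keeps keyed state on local disk, so state is not bounded by heap size.
        env.setStateBackend(new EmbeddedRocksDBStateBackend());

        env.fromElements("a", "b", "a")
           .keyBy(s -> s)
           .process(new RunningCount())
           .print();

        env.execute("Keyed state sketch");
    }

    /** Counts events per key; state is automatically partitioned by key across subtasks. */
    public static class RunningCount extends KeyedProcessFunction<String, String, Long> {
        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(String value, Context ctx, Collector<Long> out) throws Exception {
            long next = (count.value() == null ? 0L : count.value()) + 1;
            count.update(next); // Stored in RocksDB when that backend is enabled.
            out.collect(next);
        }
    }
}
```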

Improve Workflow Performance via Advanced Serialization Techniques:

  • Learn how to reduce the time spent serializing and deserializing data at sources and sinks (connectors) and over the network.
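One common lever, sketched below as an assumption about how such a lesson might look: model records as proper POJOs so Flink's efficient built-in serializer applies, and disable generic types so any accidental fallback to the slower Kryo path fails fast when the job graph is built (the Click type is illustrative):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SerializationSketch {
    // A valid Flink POJO: public class, public no-arg constructor, public fields
    // (or getters/setters). POJOs use Flink's built-in serializer instead of Kryo.
    public static class Click {
        public String userId;
        public long timestampMillis;
        public Click() {}
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Fail fast if any type in the pipeline would be serialized with generic Kryo.
        env.getConfig().disableGenericTypes();
        // ... build the pipeline using POJO types like Click ...
    }
}
```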

Deep Dive into Exactly-Once Processing and Failure Recovery:

  • Understand the differences between at-least-once, exactly-once, and exactly-once end-to-end. Learn how to effectively use exactly-once processing when faced with bad data, infrastructure failures, and workflow bugs.
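A minimal configuration sketch of the distinction: exactly-once checkpointing covers Flink's internal state, while end-to-end exactly-once additionally depends on replayable sources and transactional or idempotent sinks (the 60-second interval is an arbitrary example):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60s with exactly-once barriers: on failure, Flink rolls
        // back to the last completed checkpoint, making internal state exactly-once.
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);

        // End-to-end exactly-once also requires replayable sources (e.g. Kafka) and
        // transactional or idempotent sinks; at-least-once trades that rollback
        // guarantee for lower latency.
    }
}
```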

Develop Complex Real-Time Pipelines:

  • Build a workflow that processes a continuous stream of events to generate both dashboard and analytics results.
  • Learn how best to enrich data from a variety of data sources.
  • Optimize complex workflows using pre-filtering, pruning, async I/O, broadcast streams, parallel partial enrichments, and other techniques.
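As one illustration of these techniques, here is a sketch of the async I/O enrichment pattern; ProfileLookup and fetchProfileAsync are hypothetical stand-ins for a real non-blocking client:

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncEnrichmentSketch {
    /** Enriches each user id with a profile fetched from a hypothetical external service. */
    public static class ProfileLookup extends RichAsyncFunction<String, String> {
        @Override
        public void asyncInvoke(String userId, ResultFuture<String> resultFuture) {
            CompletableFuture
                .supplyAsync(() -> fetchProfileAsync(userId))
                .thenAccept(profile ->
                    resultFuture.complete(Collections.singleton(userId + ":" + profile)));
        }

        private String fetchProfileAsync(String userId) {
            return "profile-of-" + userId; // placeholder for a real lookup call
        }
    }

    public static DataStream<String> enrich(DataStream<String> userIds) {
        // Up to 100 in-flight requests per subtask; results may arrive out of order,
        // which keeps latency low when lookup times vary.
        return AsyncDataStream.unorderedWait(userIds, new ProfileLookup(), 30, TimeUnit.SECONDS, 100);
    }
}
```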

Use Flink SQL & Table APIs to Implement Workflows:

  • Utilize the advanced functionalities of Flink SQL, including UDFs and Table Functions, and master the Flink Table API for unified data transformations and real-time analytics.
  • Compare and contrast the resulting workflow with the Java API.
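A small sketch combining the two: a scalar UDF registered once and invoked from SQL. The MASK function, the datagen-backed users table, and its schema are all illustrative, not part of the course:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class SqlUdfSketch {
    /** A scalar UDF, callable from both SQL and the Table API. */
    public static class Mask extends ScalarFunction {
        public String eval(String s) {
            return (s == null || s.length() < 4) ? "***" : s.substring(0, 4) + "***";
        }
    }

    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        tEnv.createTemporarySystemFunction("MASK", Mask.class);

        // Illustrative table; in practice this would be backed by a real connector.
        tEnv.executeSql(
            "CREATE TEMPORARY TABLE users (card STRING) WITH ('connector' = 'datagen')");

        // Streams masked results to stdout (runs until cancelled).
        tEnv.executeSql("SELECT MASK(card) AS masked_card FROM users").print();
    }
}
```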

Designing Optimized Workflows:

  • Learn about situations where splitting a workflow into multiple components improves efficiency and reduces operational complexity.
  • Learn how to use Paimon as an efficient and low-overhead data bridge between workflows.
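A sketch of the Paimon-as-bridge idea, assuming the paimon-flink connector jar is on the classpath; the warehouse path, table name, and schema are placeholders. The upstream job writes into the table, and a downstream job reads the same table as a changelog stream, decoupling the two workflows without an extra message broker:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PaimonBridgeSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Register a Paimon catalog backed by a local warehouse path (illustrative).
        tEnv.executeSql(
            "CREATE CATALOG paimon WITH ('type' = 'paimon', 'warehouse' = 'file:/tmp/paimon')");
        tEnv.executeSql("USE CATALOG paimon");

        // The upstream workflow writes here...
        tEnv.executeSql(
            "CREATE TABLE IF NOT EXISTS orders_enriched (" +
            "  order_id BIGINT, total DOUBLE, PRIMARY KEY (order_id) NOT ENFORCED)");

        // ...and a downstream workflow tails the same table as a changelog stream
        // (blocks, streaming results to stdout).
        tEnv.executeSql(
            "SELECT * FROM orders_enriched /*+ OPTIONS('scan.mode' = 'latest') */").print();
    }
}
```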

Prerequisites

Programming Skills

  • 2+ years Java experience
  • Basic Java/Python knowledge
  • SQL proficiency

Apache Flink Experience

  • Hands-on experience with Flink APIs
  • Ability to deploy and manage Flink jobs
  • Understanding of event time and state concepts

System Knowledge

  • Stream processing fundamentals
  • Distributed systems experience
  • Basic cloud platform understanding
  • ETL and data pipeline concepts

Workshop Program

Flink Ecosystem - Building Pipelines for Real-Time Data Lakes

This intensive, 2-day face-to-face program is designed for practitioners with 2-4 years of experience in data lakes, stream processing, or Apache Flink. Through real-world use cases and hands-on exercises, we'll explore how Flink and its surrounding ecosystem (Flink CDC, Paimon, Fluss, and Streamhouse) are used to build and deploy real-time data lake pipelines.

Target Audience

Intermediate to Advanced; 2-4 years of experience working with data lakes, stream processing, or Apache Flink

Why this workshop?

Flink helps unlock the full potential of data lakes by providing the necessary stream and batch processing capabilities to transform raw data into actionable insights. This course will:

  • Focus on the relationship between Flink and data lakes, rooted in Flink's ability to process and analyze the large-scale data that often resides in them.
  • Highlight Ververica/Alibaba expertise with the technology behind the Flink ecosystem.

What will this workshop cover?

  • Data Lakes as a Source: Data lakes are often built on distributed storage systems such as Hadoop or cloud-based solutions (e.g., AWS S3, Azure Data Lake).
  • Flink for Real-time Processing: In the context of a data lake, Flink processes and analyzes the data stored there in real-time or batch mode.
  • Data Integration: Flink can integrate with data lakes to perform complex transformations, aggregations, and analytics on data as it flows in. This enables real-time analytics, insights, and decision-making based on data that is continuously ingested into the lake.
  • Long-Term Storage and Retrieval: While Flink processes and analyzes data, the results are often stored back in the data lake, making it a crucial part of the ecosystem for both storing processed data and enabling further downstream analytics.
  • Unified Data Processing: Flink provides a unified framework for both stream and batch processing, which is important for managing and processing data at scale in data lakes. It helps provide consistency in how data is processed and ensures that data workflows are optimized.
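As a minimal illustration of that unification, the same DataStream program below can run as a bounded batch job or an unbounded streaming job just by switching the runtime mode (the data and job name are placeholders):

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnifiedProcessingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // One program, two execution modes: BATCH for bounded inputs,
        // STREAMING for unbounded ones (AUTOMATIC lets Flink decide).
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        env.fromElements("a", "b", "a")
           .map(String::toUpperCase)
           .print();

        env.execute("Unified batch/stream sketch");
    }
}
```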

How do the Flink ecosystem technologies fit into the workshop?

We will use Flink ecosystem technologies in real-world use cases and hands-on exercises. These technologies (Flink CDC, Paimon, Fluss, and Streamhouse) are integral to different parts of the Flink ecosystem when dealing with large-scale data in data lakes.

  • Flink CDC allows businesses to keep data in the lake up-to-date with changes from operational databases, enabling near real-time updates for continuous analytics.
  • Paimon helps enhance the data lake architecture by turning it into a lakehouse, providing both efficient storage and scalable analytics that can work in both batch and streaming modes. It connects the best features of data lakes (large-scale storage) and data warehouses (structured analytics). By integrating seamlessly with Flink, it can process real-time and batch data and store the results back in the lake, supporting more structured and queryable data formats for analytics.
  • Fluss supports low-latency stream processing, ideal for real-time analytics on data that is being fed into the lake or processed within the lake. It enables high-performance event processing, making it easier to handle data that is continuously ingested and stored in a lake.
  • Streamhouse simplifies stream processing by providing a serverless, scalable framework that can handle real-time data processing and seamlessly integrate with data lakes. It allows businesses to process data from various sources, including data lakes, without worrying about managing infrastructure, and directly feeds processed data back into the lake for further use or analysis.
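Tying two of these together, here is a sketch of the Flink CDC-to-lake pattern: a mysql-cdc changelog source continuously upserted into a Paimon table. Connection details, schemas, and names are placeholders, and the flink-connector-mysql-cdc and paimon-flink jars are assumed to be available:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CdcIntoLakeSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Changelog source mirroring an operational MySQL table; host, credentials,
        // and table names are placeholders.
        tEnv.executeSql(
            "CREATE TABLE orders_src (" +
            "  order_id BIGINT, total DOUBLE, PRIMARY KEY (order_id) NOT ENFORCED" +
            ") WITH (" +
            "  'connector' = 'mysql-cdc', 'hostname' = 'db.example.com', 'port' = '3306'," +
            "  'username' = 'reader', 'password' = 'secret'," +
            "  'database-name' = 'shop', 'table-name' = 'orders')");

        // Paimon catalog as the lake-side destination (as in the earlier sketch).
        tEnv.executeSql(
            "CREATE CATALOG lake WITH ('type' = 'paimon', 'warehouse' = 'file:/tmp/paimon')");
        tEnv.executeSql(
            "CREATE TABLE IF NOT EXISTS lake.`default`.orders (" +
            "  order_id BIGINT, total DOUBLE, PRIMARY KEY (order_id) NOT ENFORCED)");

        // Continuously applies inserts, updates, and deletes from MySQL to the lake table,
        // keeping the lake in near real-time sync for continuous analytics.
        tEnv.executeSql("INSERT INTO lake.`default`.orders SELECT * FROM orders_src");
    }
}
```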

Interested in being a partner for the next Flink Forward conference?