No Pandas Were Harmed: Elegant and Efficient Data Analytics with Polars
Workshop (INTERMEDIATE level)
ws2
Data analytics doesn't have to be slow or cumbersome. With Polars, a next-generation DataFrame library built for performance and simplicity, you can process massive datasets faster than ever. In this hands-on workshop, you will discover how to leverage Polars to support your data workflows from the fundamentals to the more advanced analytics scenarios.
In the first part of the workshop, we will introduce Polars as a modern library for data analytics in Python, covering its data model and engine design. We will also compare it with other analytics tools: Pandas, which is effective for smaller datasets, and Apache Spark, which scales horizontally but is inefficient for smaller datasets because of the overhead of its distributed infrastructure.
The second part will dive into a real-world dataset of NYC taxi trips, containing hundreds of millions of rows. We will walk through efficient data ingestion and cleaning, and then explore how to perform lightning-fast queries and groupings directly in a notebook environment. You will also learn how to visualize your findings with simple yet powerful plotting techniques.
Finally, we will tackle more complex and realistic analytics scenarios, such as identifying the top-earning drivers over a specific time window, and visualizing their most profitable trips and neighborhoods. Through these scenarios, we will showcase how Polars empowers you to move seamlessly from raw data to meaningful insights.
Andrea Mocci
Software Institute, Università della Svizzera italiana
Andrea is a Junior Group Leader at CodeLounge, the software engineering R&D group at the Software Institute. His main responsibilities include serving as tech lead for the CodeLounge team and its projects. In the context of the REFLEX project, he has fun with machine learning and natural language processing to support economists and social scientists in their research.
Passionate about programming languages, Andrea has a particular love for functional programming. While he doesn’t publish as often these days, he still enjoys speaking at developer conferences, including Scala Days Seattle (2023), Voxxed Days Zurich (2024), and Voxxed Days Thessaloniki (2025). In the past, Andrea was a postdoctoral researcher at MIT and USI Lugano. His alma mater is Politecnico di Milano, where he earned a PhD advised by Prof. Carlo Ghezzi.
Jesper Findahl
CodeLounge, Università della Svizzera italiana
Jesper Findahl is a Senior R&D Software Engineer at CodeLounge, a software research and development center at the Software Institute, Università della Svizzera italiana (USI) in Switzerland. Since joining CodeLounge in 2018, Jesper has worked across the full stack — from frontend and backend engineering to data visualization, CI/CD automation, and software analysis.
In recent years, his focus has expanded toward data analytics and machine learning, exploring how modern tools like Polars can make data pipelines faster, simpler, and more enjoyable. Jesper is passionate about software design, developer productivity, and discovering new technologies that push the boundaries of what’s possible.
Marco D'Ambros
USI
Marco D’Ambros is the director of CodeLounge, the R&D Center of the Software Institute at Università della Svizzera italiana. CodeLounge blends academic knowledge and industry experience to pursue both fundamental and applied research in software engineering. Marco earned his PhD in 2010, specializing in mining software repositories. After that, he joined Palantir Technologies, a leading Silicon Valley data analytics & AI firm, where he helped government agencies and major enterprises analyze large-scale, fragmented data, leading technical project execution worldwide. Since returning to academia, Marco has led numerous research and development projects at CodeLounge. His work spans several domains including artificial intelligence, natural language processing, software quality assessment, and software engineering. Marco’s contributions to the research community have earned him international recognition. Notably, he received the Distinguished Paper Award at ESEC/FSE 2022 (International Conference on the Foundations of Software Engineering), and the Most Influential Paper Award at MSR 2020 (International Conference on Mining Software Repositories).