The O'Reilly Author Showcase 2024

O'Reilly authors have been a part of Data Day Texas since the very beginning. Some of those who've presented over the years are: Gwen Shapira, Holden Karau, Jay Kreps, Emil Eifrem, Wes McKinney, Sandy Ryza, Russell Jurney, Ted Dunning, Eric Sammer, Josh Wills, Julia Silge, Sean Owen, Amy Hodler, Andy Petrella, Ryan Mitchell, Joey Echeverria, Denise Gosnell, Matthias Broecheler, Matthew Russell, Eric Lubow, Eli Bressert, Matthew Kirk, Carl Anderson, Ed Capriolo, Charity Majors, Jeff Carpenter, Tim Berglund, Ryan Boyd, David Robinson, Hadley Wickham, Laine Campbell, Hari Shreedharan, Dean Wampler, Mark Grover, Bonny McClain, Patrick McFadin, and many more.

Based on the enthusiastic response from our attendees, we’re bringing back the O'Reilly Author Showcase for Data Day Texas 2024. We’ll be hosting book signings, office hours, and Ask Me Anything sessions. Here’s your chance to connect with the authors in person.

The O'Reilly Author Showcase from Data Day Texas 2015.

Confirmed O'Reilly Authors for 2024

Susan Shu Chang, Machine Learning Interviews

In Machine Learning Interviews, author Susan Shu Chang shows how to tackle the ML hiring process.
As tech products become more prevalent today, the demand for machine learning professionals continues to grow. But the responsibilities and skill sets required of ML professionals still vary drastically from company to company, making the interview process difficult to predict. Susan will take you through the highly selective recruitment process by sharing hard-won lessons she learned along the way. You'll quickly understand how to successfully navigate your way through typical ML interviews.

Hubert Dulay, Streaming Data Mesh

In Streaming Data Mesh, co-authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you'll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.

Ron Itelman, Unifying Business, Data, and Code

In Unifying Business, Data, and Code, co-authors Ron Itelman and Juan Cruz Viotti show how to collaborate more effectively and design intelligent systems without having to become a data scientist. Map your team, objectives, data, actions, and outcomes as a holistic network and discover connections that may not always be obvious. You'll learn how to reveal hidden root problems and explain how information flows across your organizational networks in order to innovate better, faster.

Ryan Mitchell, Web Scraping with Python, 3rd ed.

In the thoroughly updated 3rd edition of Web Scraping with Python, author Ryan Mitchell not only introduces you to web scraping but also offers a comprehensive guide to scraping almost every type of data from the modern web.
Part I focuses on web scraping mechanics: using Python to request information from a web server, performing basic handling of the server's response, and interacting with sites in an automated fashion. Part II explores a variety of more specific tools and applications to fit any web scraping scenario you're likely to encounter - including how to avoid scraping traps and bot blockers.

Alex Merced, Apache Iceberg: The Definitive Guide

In Apache Iceberg: The Definitive Guide, Alex Merced and his co-authors show how Apache Iceberg provides the capabilities, performance, scalability, and savings that fulfill the promise of an open data lakehouse. By following the lessons in this book, you'll be able to achieve interactive, batch, machine learning, and streaming analytics with this lakehouse.
Alex will be hosting an Apache Iceberg: Ask Me Anything session at Data Day Texas.

Holden Karau, Scaling Python with Dask

In Scaling Python with Dask, authors Holden Karau and Mika Kimmins show you how to use Dask computations in local systems and then scale to the cloud for heavier workloads. This practical book explains why Dask is popular among industry experts and academics and is used by organizations that include Walmart, Capital One, Harvard Medical School, and NASA.

Adi Polak, Scaling Machine Learning with Spark

Scaling Machine Learning with Spark, by author Adi Polak, examines various technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, PyTorch, and Petastorm. The book covers data ingestion, preprocessing, feature engineering, training models, and bridging Spark and deep learning frameworks.

Hala Nelson, Essential Math for AI

In Essential Math for AI, author Hala Nelson walks you through the math necessary to thrive in the AI field, focusing on real-world applications rather than dense academic theory. Engineers, data scientists, and students alike will examine mathematical topics critical for AI--including regression, neural networks, optimization, backpropagation, convolution, Markov chains, and more--through popular applications such as computer vision, natural language processing, and automated systems. Supplementary Jupyter notebooks shed light on examples with Python code and visualizations. Whether you're just beginning your career or have years of experience, this book gives you the foundation necessary to dive deeper in the field.

Joe Reis / Matt Housley, Fundamentals of Data Engineering

In Fundamentals of Data Engineering, currently a category bestseller at Amazon, Joe Reis and Matt Housley walk you through the data engineering lifecycle and show you how to stitch together a variety of cloud technologies to serve the needs of downstream data consumers. You'll understand how to apply the concepts of data generation, ingestion, orchestration, transformation, storage, and governance that are critical in any data environment regardless of the underlying technology.

Matthias Broecheler, The Practitioner's Guide to Graph Data

In The Practitioner's Guide to Graph Data, authors Denise Koessler Gosnell and Matthias Broecheler show data engineers, data scientists, and data analysts how to solve complex problems with graph databases. You’ll explore templates for building with graph technology, along with examples that demonstrate how teams think about graph data within an application.

Amy Hodler, Graph Algorithms

In what has been called the book on Graph Algorithms, co-authors Amy Hodler and Mark Needham explain how graph algorithms describe complex structures and reveal difficult-to-find patterns—from finding vulnerabilities and bottlenecks to detecting communities and improving machine learning predictions. The book provides hands-on examples that show how to use graph algorithms in Apache Spark and Neo4j, two of the most common choices for graph analytics.

More authors to come...

Do you have a favorite O'Reilly author you'd like us to invite to Data Day Texas? Send us a note and let us know!

O'Reilly author Denise Gosnell leading a session on graph thinking at Data Day Texas 2020.

O'Reilly author Andy Petrella leading a session on observability at Data Day Texas 2022.

O'Reilly authors Matt Housley, Holden Karau, Andy Petrella, Patrick McFadin, Gwen Shapira, Adi Polak, and Joe Reis hanging out the night before Data Day Texas 2023.

Data Day Texas: it grew out of a bookstore...

In the early 90s, a small foreign language bookstore situated on the southeast corner of the University of Texas was adopted and transformed by the Austin open source / hacker community. Over the course of a decade, through several incarnations, the shop came to be known as the guerrilla computer book store. In its final years, the store's subterranean lounge became home to weekly Friday afternoon gatherings of the tech community. The gatherings were referred to as GeekAustin. When the store finally closed in 2000, the community carried on and grew, hosting hackathons, happy hours, training, and community benefit events such as Linux Against Poverty. In 2009, former bookstore owner and GeekAustin organizer Lynn Bender launched the first MongoDB Day; in 2010, the first Cassandra Summit; and in 2011, Data Day Texas.

I first became aware of O'Reilly in 1990, when I took over management of a bookstore across the street from the University of Texas. No sooner had I gotten behind the counter than I started hearing requests for animal books. Several months later, the store had a whole case devoted to them. A few years later, it was a whole wall. Except for a few classics, we didn't have a computer book section - we had an O'Reilly section. This endeared us to the hackers and CS students who frequented the store. It wasn't long before I was reading the books myself. O'Reilly provided my initial CS education. When other publishers were printing 1000-page "bibles" - the HTML Bible, the Bash Bible - O'Reilly was publishing inexpensive, right-sized books with just the information you need - and they were always first with the new technologies. -Lynn Bender, Data Day Texas