Who spoke at Data Day Texas 2026

We are continuing to announce confirmed speakers. However, speaking proposals are now closed. For the latest news, follow us on Linkedin.

Lena Hall (Seattle)

Lena Hall is an expert in practical AI adoption, data engineering, cloud and pragmatic architecture, driving strategic AI integration at scale combined with extensive experience leading large, high-performing technical teams. She is the Founder and CEO of Droid AI, where she helps businesses get real results from AI, defining Data + AI strategies, integrating LLMs into complex systems, connecting AI solutions with custom data and business tools, and optimizing outcomes through proven and innovative architectures. Lena has 15+ years of deeply technical background as a solution architect and technical leader in large-scale data, analytics, machine learning, and cloud computing. Prior to Droid AI, Lena worked as Head of Developer Experience, North America at Amazon Web Services and led Big Data DevRel at Microsoft. She frequently shares practical knowledge on her YouTube Channel and at industry conferences as an international keynote speaker.

Lena will be presenting the Saturday Data Day Texas session:
Context > Prompts: Context Engineering Deep Dive

Shachar Meir (London)

Shachar Meir is a Data Executive with 20+ years of experience scaling data teams and driving growth at global organisations. He has built and led high-performing data and analytics teams at both high-growth startups (WorldRemit, Pontis) and industry leaders (PayPal, Meta). As Director of Data Engineering at Meta, Shachar led Trust and Safety data initiatives overseeing critical infrastructure that protected billions of users, and served as the London Data Engineering site lead. Today as a Data Advisor, Shachar helps companies unlock growth through data via strategic initiatives, org development, and transformation programs. Outside of work, Shachar enjoys quality time with his family, cooking, and flying airplanes and helicopters.

Shachar will be presenting the Saturday Data Day Texas session:
The $1M Data Professional

Shachar will also host the following Sunday Data Discussion:
How to Elevate your Data and Analytics Teams

Alexandra Pasi (Salt Lake City)

Alexandra (Lexi) Pasi (Linkedin) is the CEO and co-founder of Lucidity Sciences. A mathematician and logician by training, Alexandra has applied that knowledge in a variety of business and technical domains, establishing herself as a dynamic strategist, data leader, and ML innovator. She completed her PhD in Mathematics at Baylor University, receiving the Outstanding Dissertation Award for her interdisciplinary work in logic and formal systems. Her work in machine learning began in 2011, with theoretical and applied research in kernel learning methods. With a group of collaborators, she expanded on that work to develop a novel, general-purpose, highly-scalable ML approach - one which significantly and consistently outperforms state-of-the-art ensemble solutions in accuracy, latency, and compactness. With her team at Lucidity Sciences, she has brought that same technology to market with successful applications in various industries from medicine to finance. Check out Alexandra's 2024 interview on The Joe Reis Show (spotify), her recent interview on The Applied AI Podcast (spotify / youtube), and her recent interview on Working / Broken (spotify). Lexi will be appearing as part of the #SLCdata invasion.

Alexandra will be presenting the following Saturday Data Day Texas session:
Learning Beyond Language: A New Geometric Paradigm for Better ML

Patrick McFadin (California)

Patrick McFadin is Principal Technical Strategist at IBM, where he works on distributed databases and production AI systems. An Apache Cassandra committer, PMC member, and Apache Software Foundation member, he's been building scale infrastructure for over two decades. He co-authored "Managing Cloud Native Data on Kubernetes" for O'Reilly and previously served as VP of Developer Relations at DataStax, helping organizations build some of the largest Cassandra deployments in production.
I first met Patrick McFadin around 2011 - in the early days of Cassandra and DataStax. He was still at Hobsons then, and gave a presentation comparing Cassandra to Oracle that had the room both learning and laughing—his metaphors were that good. DataStax co-founder Matt Pfeil worked hard to recruit him, and succeeded. Over the years I've watched Patrick work behind the scenes—smoothing over disputes in the Cassandra community, helping people with good ideas bring them to fruition. No one better personifies the DeMarco and Lister "Peopleware" idea of a catalyst than Patrick.
—Lynn Bender

Patrick will be presenting the Saturday Opening Session:
The Skills that Matter When Everything Changes

Hannes Mühleisen (Amsterdam)

Hannes Mühleisen (Linkedin) is a Professor of Data Engineering at Radboud University Nijmegen and a senior researcher at the Centrum Wiskunde & Informatica (CWI) in Amsterdam - the same place where Python was invented by Guido van Rossum. Hannes is also Co-founder and CEO of DuckDB Labs - and creator of the DuckDB database management system - for which he earned the 2025 Dutch Prize for ICT Research. DuckDB is named after Hannes' pet duck "Wilbur." Hannes had chosen a duck as a pet because he lives on a historic sailing ship in the Amsterdam city center with his family. Wilbur has since flown away to start a duckie family.

Hannes will be presenting the following Saturday Database Keynote:
The Joy of SQL - If Properly Implemented

Kierra Dotson (Austin)

By day, Kierra Dotson (Linkedin) wrangles Kubernetes clusters at IBM's Cloud VPC platform. By night, she's on a mission to help professionals across all disciplines discover data as their secret weapon for career growth and organizational impact. With a diverse background spanning roles as an HR Data Architect at Charles Schwab and Chewy, a BI Developer at Acxiom, a Data Engineer at Visa, and a cloud and DevOps engineer at a major cloud company, Kierra works to bridge the gap between infrastructure and data/AI engineering.
As the CEO and founder of The Data Bloq, Kierra provides cutting-edge data and GenAI strategy consulting, empowering organizations to harness the full potential of their data assets. Her impact extends far beyond corporate environments; Kierra has conducted over 150 tech consultations and resume revamps via The Data Bloq, helping professionals secure over $250,000 in salary increases. Follow Kierra at The Data Conversationalist.

Kierra will be presenting the following Data Day Texas session:
The Engineer’s Guide to AI Strategy: Bridging the Gap Between Business and Technical Reality

Mark Freeman (Sacramento)

Mark Freeman (Linkedin) is a data scientist turned data engineer with a deep obsession for data quality. As the Tech Lead at Gable, Mark builds internal systems and data products that drive go-to-market strategies, leveraging his extensive experience in creating robust, scalable data solutions. He is also the first employee at Gable where he aims to help bring a data contract solution to market. Mark is co-author of the upcoming O’Reilly book: Data Contracts, in which he shares insights and best practices on ensuring reliable, high-quality data flows within organizations. With a passion for turning complex data challenges into actionable solutions, Mark is committed to advancing the field of data engineering and fostering a culture of trust in data across the industry. Check out Mark's courses on Linkedin Learning.

Mark will be presenting the following Saturday Software Engineering Keynote:
Code: The Untapped Metadata Source Driving Most Data Failures

Mark will also lead the following Sunday Workshop:
Implementing Your First Data Contract

Clair Sullivan (Breckenridge, Colorado) @cjlovesdata1

Dr. Clair Sullivan is currently the Founder and CEO of Clair Sullivan and Associates, a company dedicated to providing data science consulting services. Prior to starting her company, she was the Director of Data Science at Vail Resorts leading a team of data scientists and machine learning engineers providing production models for operations and marketing. Previously she was a data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015. Check out Clair's GraphRAG Essential Training on Linkedin Learning

Clair will be presenting the following Saturday Data Day Texas session:
Your Skills, Your Business: Layoff-Proof Your Career through Solopreneurship

Clair will also lead the following Sunday Workshop:
Build Your Solopreneur Roadmap: A Workshop for Data Professionals

Data Governance Keynote
Winfried Etzel (Stavanger, Norway)

An increasingly prominent voice in the global data community, Winfried Etzel champions Data Governance, Data Strategy, and organizational design. Through his #MetaDAMA podcast and his work building the Nordic data community, Winfried drives professional development across the region. His upcoming book, «Data Governance in the Wild», explores how Data Governance must evolve for distributed landscapes, automation, and AI.
Catch Winfried's insights on the Data Democracy Podcast, Catalog and Cocktails, and A Journey around the World of Data with Joe Reis.

Winfried will be presenting the following session:
Existence over Essence? Data Governance in times of AI

Thais Cooke (Raleigh-Durham-Chapel Hill)

After transitioning from dentistry into data analytics in 2021, Thais Cooke (Linkedin) quickly established herself as a versatile data professional, currently serving as a Senior Data Analyst. Her unique background bridges clinical expertise with analytical thinking, bringing fresh perspectives to data challenges across industries. Thais is the creator of the LinkedIn Learning course: SQL for Healthcare Professionals and an upcoming course on Data Governance with a focus on healthcare.
Thais has shared insights on popular industry podcasts including Data Podcast for Nerds!, Mavens of Data, and How to Get an Analytics Job, and has been featured as a speaker at MDS Fest. Thais is dedicated to making data accessible, focusing on practical solutions that connect technical concepts with real-world applications. When not immersed in datasets and visualizations, Thais can be found buried in books, writing, experimenting with new recipes in the kitchen, or enjoying family time.

Thais will be presenting the following Saturday Data Day Texas session:
The Human Layer of Data: Why Trust Lives in the Work Behind the Numbers

Bill Inmon (Castle Rock, Colorado)

Bill Inmon (Wikipedia / LinkedIn) is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine and was the first to offer classes in data warehousing. Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions. Bill is among the most prolific and well-known authors in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession.

Bill will be presenting the following Saturday Data Day Texas session:
Generative AI and Business Value: Why Corporate Deployment Falls Short

Bill will also join Joe MF Reis for the following Sunday Data Discussion:
A Bill Inmon Ask Me Anything - with Joe MF Reis
(You must have a Sunday ticket for this)

Sarah McKenna (NYC)

Sarah McKenna (Linkedin) is the CEO of Sequentum, a leading web data extraction company that provides high-quality, custom data solutions to Fortune 500 companies, financial institutions, and government agencies. With a 20-year career in New York's tech ecosystem, Sarah has established herself as a thought leader in the data extraction industry, serving as the Technical Advisor for the Web Data Collection Considerations published by the SIIA Financial Information Standards Division (FISD) Alt Data Council.
A Georgetown University graduate, Sarah began her career in finance before pivoting to technology, where she developed deep expertise in quality assurance and large-scale, data-driven automated operations. Her leadership has been instrumental in Sequentum's evolution from a web scraping tool provider to a comprehensive data solutions company, culminating in the recent launch of Sequentum Cloud, their next-generation PaaS platform for web data extraction.
Sarah has guided several notable data technology companies to successful acquisitions, including Summit Systems (acquired by Misys), Vitech (acquired by CVC Capital Partners), and Massive Incorporated (acquired by Microsoft). Her focus on establishing ethical standards and best practices in web data collection has helped shape the rapidly evolving alternative data marketplace.

Sarah will be co-presenting the following Saturday Data Day Texas session:
Web Scraping's 25-Year War: From HTML Parsing to AI That Builds Itself

Jenna Jordan (Asheville)

Data Librarian turned Data Engineer, Jenna Jordan (Linkedin) learned to code while earning her Master of Science in Library and Information Science at the University of Illinois. No surprise that her data engineering practice is informed by and grounded in information science principles. As a senior analytics engineer with Ratio PBC, Jenna helps build data systems that support public health & human services organizations, as well as advises clients on dbt best practices.
During her time as a senior consultant at Analytics8, Jenna developed particular expertise in dbt Mesh architecture and the governance strategies that accompany it. Her experience working with dbt Mesh led to a peer exchange session at Coalesce 2024, where she helped attendees explore governance challenges through a role-playing simulation game. Jenna spearheaded the adoption of dbt at the City of Boston Analytics Team, where she architected and built the project from scratch, and reorganized the data warehouse.
An occasional blogger on topics like analytical data warehouses and data engineering best practices, Jenna is also a passionate community builder and founder of the City Analytics Exchange, a network for data analytics practitioners in local government. When not transforming data, she's a knitter, board gamer, and dog mom.

Jenna will be co-presenting the following Saturday session:
Think Like a Librarian: Fresh Perspectives for Data Teams from a Time-Honored Tradition

MF Joe Reis (Salt Lake City) @joereis

MF Joe Reis (Linkedin) is a “recovering data scientist,” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have ranged from statistical modeling, forecasting, machine learning, data engineering, and data architecture to everything else in between. Joe was co-host of the popular Monday Morning Data Chat (Spotify / Apple) and is currently the host of the Joe Reis Show (Apple / Spotify). Joe is also co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah.

Joe and Matt Housley will co-host their annual Data Town Hall at 5pm Saturday

Dylan Anderson (London)

Dylan Anderson is the Head of Data Strategy at Atombit and a leading voice in the Data Strategy space. As an experienced consultant, he focuses on helping large and small companies bridge the gap between data and strategy. Over his career, he has worked for Deloitte, Accenture, and multiple boutique consultancies, giving him experience helping over 40 clients across dozens of industries understand how to approach data from a more strategic, value-led perspective.
Dylan’s new fascination is with popularising the concept of the Data Ecosystem, bringing it beyond its previous technologically-focused definition and reframing it to include all the considerations data teams need to keep in mind. This has led to a rapidly growing Substack newsletter exploring each domain of the Data Ecosystem, with a holistic outlook and a focus on the ‘so what’ implications that data professionals often gloss over. With this viewpoint, Dylan doesn’t focus on one data domain, but on all of them, trying to explain the interdependencies and strategic value of taking a more generalised approach. Dylan has also spoken at conferences and maintains a significant presence on LinkedIn, with ~50k followers. He also wears a lot of bow ties and loves data memes!

Dylan will be presenting the following Data Day Texas (Saturday) session:
Beyond the Tech Stack: Navigating the Complete Data Ecosystem

Matthew Mullins (Raleigh-Durham-Chapel Hill)

Matthew Mullins (Linkedin) has spent over 20 years at the intersection of government and enterprise, building and delivering data solutions that drive informed decision-making. Currently CTO and co-founder at Coginiti, the collaborative data operations platform, Matthew leads the product and engineering teams. With deep expertise in data strategy, analytics, and technology implementation, he has helped organizations harness the power of data to solve complex challenges. If he’s not solving data challenges, you’ll find him floating or standing in a river fly fishing.

Matthew will be presenting the following session:
DataOps Is Culture, Not a Toolchain

Chris Brousseau (Salt Lake City)

Chris Brousseau (Linkedin) is a self-confessed word nerd. Starting in Slavic Languages and linguistics at Brigham Young, he migrated into linguistically informed NLP, and then MLOps and LLMs. Currently VP of AI at Veox (founded by Jepson Taylor), Chris is co-author (with Matthew Sharp) of LLMs in Production (Manning). Chris is a down-to-earth guy with actionable tips to take your production to the next level. Check out Chris’ and Matt’s 2023 Interview with Joe Reis, his 2023 MLOps.Community presentation, and his interview with Tech Bros Mary and Lauren. Chris will be appearing as part of the #SLCdata invasion.

Chris will be presenting the following Data Day Texas session:
Local AI Saves People (Not Clickbait)

Russell Spitzer (New Orleans)

Russell Spitzer (Linkedin) received his Ph.D from UCSF in 2013 after performing a lot of comparisons of protein binding sites. Following that, he became deeply invested in distributed computing and involved in several Apache projects. While working at DataStax he was a key contributor to the DataStax Spark-Cassandra Connector and also worked on many Apache projects. After leaving DataStax he worked at Apple growing the then nascent Apache Iceberg project, where he worked on data file management and advancing the table format. Currently Russell is working on OSS software at Snowflake and is a PMC member of the Apache Iceberg project and a PPMC member of the Apache Polaris (Incubating) project.
#lakehouse #systems

Russell will be presenting the following Data Day Texas session:
What Apache Iceberg is Bad At …. For Now

Paul Blankley (Denver)

Paul Blankley (Linkedin) has a master’s degree from Harvard in AI and is Co-founder and CTO of Zenlytic. He has over nine years of experience in data & AI.
Paul and his co-founder Ryan Janssen started building Zenlytic in 2020, before ChatGPT existed, betting that large language models represented a fundamental platform shift in analytics. They set out to solve the last-mile analytics problem by creating an AI that could genuinely understand business context and answer questions like a mid-to-senior level data analyst. The company has raised $14M across three rounds. Most recently, M13 Ventures led their Series A, with backing from Bain Capital Ventures, Primary, and others.
When not building the future of analytics, he's rock climbing and snowboarding in the mountains around Denver, Colorado.

Paul will be presenting the following Data Day Texas Saturday session:
Agents are eating the semantic layer

Matthew Sharp (Salt Lake City)

Co-author of LLMs in Production (Manning), Matthew Sharp (Linkedin) is a seasoned expert in the world of machine learning and artificial intelligence. With over 10 years of experience, Matthew has worked across the entire ML/AI spectrum—from data science to MLOps infrastructure—deploying models to production and building the tools and platforms to support them. Currently, Matthew is an AI Engineer at Flexion, where he leads the advancement of Flexion’s AI application development, with a focus on Generative AI. He also teaches a graduate-level course on the development, deployment, optimization, and real-world applications of large language models (LLMs) at Utah State University. Matthew will be appearing as part of the #SLCdata invasion.

Matthew will be presenting the following Data Day Texas session:
How to Hack An Agent in 10 Prompts: and other true stories from 2025

Arvind Prabhakar (SF Bay) @aprabhakar

Arvind Prabhakar is the co-founder and CEO of Tabsdata. He previously co-founded StreamSets, one of the earliest data ingestion platforms, and was an early employee at Cloudera during the formative years of the modern data ecosystem. Earlier in his career, he worked at Informatica, contributing to large-scale enterprise data integration systems. He is a member of the Apache Software Foundation and the ACM, and has spent his career building and operating data infrastructure used in production by enterprise teams. Across multiple generations of the data stack, Arvind has seen how pipelines, orchestration, and downstream tooling evolved to compensate for deeper architectural gaps. His work today focuses on rethinking data platforms around data state, dependency, and correctness rather than continuous reprocessing.

Arvind will present the following Data Engineering session: I Built Pipelines. I Don’t Trust Them Anymore.

Sanjeev Mohan (San Francisco)

Sanjeev Mohan (Linkedin) is an established thought leader in the areas of cloud, modern data architectures, analytics, and AI. He researches and advises on changing trends and technologies and is the author of Data Product for Dummies. Until recently, he was a Gartner vice president known for his prolific and detailed research, while setting the research direction for data and analytics. He has been a principal at SanjMo for over two years where he provides technical advisory to elevate category and brand awareness. He has helped several clients in areas like data governance, generative AI, DataOps, data products, and observability. He regularly presents on topics pertaining to end-to-end data pipelines and helps businesses maximize their data assets.

Sanjeev will be presenting the following Data Day Texas session:
2026 Trends: Building Foundations That Endure

Adriano Vlad-Starrabba (London)

Adriano Vlad-Starrabba is a researcher and entrepreneur working at the intersection of AI, data infrastructure, and logic-based reasoning. He is the Co-Founder & CEO of Prometheux and guest-lectures the Knowledge Graphs course at Oxford University. Originally from Rome and now based in London, Adriano focuses on building the data foundation layer to connect fragmented data and enable organizations to define shared business concepts, track value-level lineage, and eliminate duplicate logic across teams in pharma, financial institutions and more.

Adriano will be presenting the following Saturday Data Day Texas session:
The Enterprise of the Future Runs on Ontologies: Making AI Agents Actually Work

Data Visualization Keynote
Christian Miles (Vancouver Island)

Well known for his widely-read graph visualization newsletter source/target (2021-2024), Christian Miles (Linkedin) specializes in graph database visualization and analytics. Since completing his Masters in Mathematics and Computer Science from the University of Bristol, his work has spanned fraud detection, cybersecurity, and law enforcement at BAE Systems, Wynyard Group, and Cambridge Intelligence.
Christian recently joined the red-hot graph visualization company G.V() - answering the question many in the graph community had been asking (where's Christian going). Christian is a big picture guy with a reputation for simplifying the complex. If your work intersects with either graph or visualization - don't just attend his talk. Reach out and schedule time to meet with him in person.

Christian will be presenting the Saturday Data Visualization Keynote:
"Who Needs a Chart When You Can Just Chat?" - the role of data visualization in a post-LLM world

Glauber Costa (Dallas)

Glauber Costa (Linkedin) first spoke at Data Day Texas in 2018, and people have been asking us to invite him back ever since. Currently the founder and CEO of Turso, Glauber is leading the effort to build the next evolution of SQLite - in Rust. Before founding Turso, Glauber spent over 20 years in systems programming: as a Staff Software Engineer at Datadog (where he authored the Glommio Rust async executor), Distinguished Engineer at ScyllaDB (designing core database features during its evolution from concept to production-grade Cassandra alternative), and as a core contributor to the Linux Kernel at Red Hat, focusing on virtualization, storage, and containers.
At Turso, Glauber also leads the development of libSQL—an open-contribution fork of SQLite with 12,000+ GitHub stars—and its namesake database, Turso, a complete rewrite of SQLite in Rust with deterministic simulation testing built in from the ground up. Turso powers production workloads for Astro DB, Val.town, and installations like U2's immersive experience at The Sphere in Las Vegas. The company has raised $7M from Norwest Venture Partners and is backed by a roster of infrastructure-focused investors.
Glauber is known for making complex distributed systems concepts accessible and for his pragmatic approach to evolving foundational technology without breaking compatibility.

Glauber will be presenting the following Data Day Texas session:
We're Rewriting SQLite in Rust. Here's Why That's Not Crazy.

Trey Blalock (Portland)

A highly respected Chief Information Security Officer and security researcher, Trey Blalock (Linkedin) has performed extensive work in almost every security domain for some of the world's largest corporations and governments. Trey has trained thousands of people on advanced security topics, and has taught security classes at many organizations - including AT&T, BCBS, BECU, CIA, CISA, DHS, DIA, FBI, IBM, NSA, RCMP, T-Mobile, U.S. Air Force, U.S. Army, U.S. Marines, U.S. Navy, and U.S. Secret Service. He has served as a Computer Forensic Expert Witness for the U.S. Department of Justice on multiple cases, including handling all aspects of computer forensics on some high-profile cases such as "Donald Vance vs. Donald Rumsfeld," "John Doe vs. Donald Rumsfeld" and "American Boat Company vs. United States."
As Chief Information Security Officer for Coinstar, Trey managed several teams across multiple projects during a major overhaul of the company's infrastructure, architecting significant changes to protect over 25,000 kiosks and data operations on several cloud platforms, reducing the attack surface by more than 95%.
Trey also specializes in defending large-scale systems from advanced threat actors, and currently serves on several forensic, red teaming, and penetration testing advisory boards. Through his consulting practice, Verification Labs, Trey has managed hundreds of security events for companies, including dozens of ransomware events, security breaches, denial-of-service attacks, and over one hundred forensic incidents.

Trey will be presenting the following Data Day Texas session:
The Weaponization of AI, Its Impact on Organizations, and How to Respond.

Weimo Liu (Sunnyvale)

Weimo Liu is the CEO and co-founder of PuppyGraph, bringing his expertise in databases and query engines from his time at Google, where he worked on the F1 team developing the unified SQL analytic engine that supports most data formats/sources and serves billions of queries per day. Before his tenure at Google, he excelled as a research scientist at TigerGraph, creating a query language for parallel distributed graph databases and a compiler to translate queries into executable C++ code. With a PhD in Computer Science from George Washington University and a role as a program committee member and reviewer for top conferences and journals in the database area (e.g., TKDE, KDD, and SIGSPATIAL), Weimo is recognized as a distinguished expert in his field.

Weimo will present the following Saturday Data Day Texas session:
Context Trace/Graph: Observability for AI Agents

Arthur Bigeard (Glasgow, Scotland)

A few years ago, Arthur Bigeard was tasked with putting together a small proof of concept using JanusGraph - a fork of Titan. Feeling that he lacked integrated development environment tools like those he had come to know as a professional working with relational databases, Arthur set out to create a graph database client for people who needed a tool to show their work to their team and stakeholders, as well as an IDE for those who wanted to learn about graph databases.
And so G.V() was born.

Arthur will present the following Saturday Data Day Texas session:
Building G.V(): Why Graph Databases Desperately Need Better Tools

Aaron Black (Indianapolis)

Aaron Black has spent the last 25+ years building the learning products you’ve probably used to level up—leading content innovation at O’Reilly, Springer Nature, Wiley, and Pearson. As a content strategist and platform builder, he’s helped bring thousands of technical books, courses, and videos to market, working directly with engineers, data scientists, and subject matter experts to translate deep expertise into impactful learning experiences.
If you’ve ever thought about writing, teaching, or scaling your knowledge for a broader audience, Aaron’s the person to talk to. He knows how to turn real-world experience into publishable, teachable, high-impact content. And he’s always scouting for the next great voice.

Aaron will be presenting the Saturday Data Day Texas session:
How Ideas Become Books: Inside O'Reilly and Data Thought Leadership

Shane Gibson (London)

Shane Gibson is the creator of the Information Product Canvas and the author of the Agile Data Guide series, beginning with An Agile Data Guide To Information Product Canvas. With over 35 years of experience in data strategy, architecture, and consulting, and more than a decade coaching data teams, Shane is a recognized coach in helping data teams and organisations change how they work. Known for his practical, no-nonsense approach, Shane’s focus is on simplifying complex data challenges and bridging the gap between technical data teams and stakeholders. He empowers data professionals by providing actionable patterns and pattern templates, enabling them to deliver value faster and reduce waste while having more fun. Shane’s extensive background includes roles as an Agile Data Coach, Consultant, Enterprise Architect, Product Manager, and founder of multiple data companies. He is also the host of the AgileData Podcast and has authored courses focused on practical data practices, including the popular “Gather data requirements in 30 minutes using a shared language”. His work, drawing inspiration from the Agile Data Way of Working, provides a proven toolkit for navigating the complexities of modern data delivery. Shane is passionate about helping others succeed in data and analytics and sharing patterns that allow teams to become “data and business translators” and build trust with stakeholders, ensuring data work actually gets used. More details are also available on Shane’s personal website, https://shagility.com/

Shane will be presenting the Saturday Data Day Texas session:
How to gather data requirements in 30 minutes or less - the Information Product Canvas

Jon Haddad (Redondo Beach)

Jon Haddad is one of the top performance go-to guys on the planet for all things Cassandra. As an Apache Cassandra PMC member and committer, Jon has over a decade of experience working with some of the world's largest Cassandra deployments, including Netflix and Apple. Previously at The Last Pickle, Jon worked with over a hundred customers across finance, streaming media, cloud providers, and other mission-critical systems, developing deep expertise in performance analysis, observability, and data modeling. As founder of RustyRazorblade Consulting, Jon specializes in performance tuning and optimization for distributed systems. Jon created and maintains several open-source tools including easy-cass-lab and easy-cass-stress, which are widely used by the Cassandra community for testing and benchmarking. He is recognized for his contributions to pushing the boundaries of what's possible with Cassandra and provides training through his "Operator Excellence" program for engineers seeking to master Cassandra operations.

Jon will be presenting the following Data Day Texas session:
Stop Guessing, Start Measuring: A Decade of Database Experimentation and Tuning

Jonathan Ellis (Austin) @spyced

Jonathan Ellis became involved with Apache Cassandra in 2008 when Rackspace hired him to build their next-generation database infrastructure. As the first Cassandra PMC chair, and later as co-founder of DataStax, Jonathan was largely responsible for leading Cassandra through its first decade of development. More recently, Jonathan was working on Vector Search to facilitate its integration with Cassandra for the next generation of AI applications.

Seeing the promise of coding with AI while building JVector, and experiencing the frustration of seeing the same tools completely fail with Cassandra (a 10x larger codebase), Jonathan was motivated to create Brokk - a tool to tame large codebases for AI.

Jonathan will be presenting the following Saturday session:
Brokk: Context Engineering for Large Codebases

Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is a Principal Fundamental Researcher at ServiceNow, joining through the acquisition of data.world. He holds a PhD in Computer Science from The University of Texas at Austin. Juan’s research and industry work has been on the intersection of data and AI, with the goal to reliably create knowledge from inscrutable data, specifically designing and building Knowledge Graphs for enterprise data and metadata management. Juan is the co-author of the book “Designing and Building Enterprise Knowledge Graphs” and the co-host of Catalog and Cocktails, an honest, no-bs, non-salesy data podcast. Juan serves as a strategic and innovation bridge across Product/Engineering, Marketing, Sales, and Customer, ensuring technical insights are aligned with business value.

Juan has researched and developed technology on semantic data virtualization, graph data modeling, schema mapping and data integration methodologies. He pioneered technology to construct knowledge graphs from relational databases starting in the mid 2000s, resulting in W3C standards, research awards, patents, software and his startup Capsenta, acquired by data.world in 2019. Juan strives to build bridges between academia and industry as former co-chair of the LDBC Property Graph Schema Working Group, member of the LDBC Graph Query Languages task force, and standards editor at the World Wide Web Consortium (W3C). Juan continues to be an active member of the scientific community through academic research partnerships, advising students, and membership on data and AI scientific conference committees.

Juan will be presenting the following Saturday Data Day Texas session:
Scar Tissue: Lessons from 20 Years of Building Ontologies and Knowledge Graphs

Alex Merced (Winter Park, FL) @alexmerced

Alex Merced (Linkedin) has a history of creating content to enable developers of all types through his personal projects like DevNursery.com, The Web Dev 101 Podcast, and the DataNation podcast. Currently Head of DevRel at Dremio, Alex has held positions with companies like Crossfield Digital, CampusGuard, GenEd Systems and others along with being an Instructor for General Assembly Bootcamps. Alex is co-author of Apache Iceberg: The Definitive Guide and the upcoming Apache Polaris: The Definitive Guide, both from O'Reilly.

Alex will be presenting the following Saturday Data Day Texas session:
Designing an Apache Iceberg Lakehouse: From Requirements to a Stakeholder-Ready Architecture

Jean-Georges Perrin (Albany, New York) @jgp

Jean-Georges Perrin (Wikipedia / LinkedIn) is an IT software engineer, lecturer, and serial entrepreneur from Alsace, France. The first French citizen to become an IBM Champion in 2009, he became a Lifetime IBM Champion in 2021, and a PayPal champion in 2024. Formerly Principal Enterprise Data Architect at Expedia, and Group Intelligence Platform Lead at PayPal, Jean-Georges is currently Senior Product Manager at Actian. In addition, Jean-Georges is Chair of the Technical Steering Committee for Bitol, which works toward an Open Data Contract Standard.
Jean-Georges is author of Spark in Action from Manning, and co-author of Implementing Data Mesh from O'Reilly. Check out his thoughts on Data Mesh on YouTube.

Jean-Georges will be presenting the following Saturday Data Day Texas session:
Hands-on Data Product: let's build a data product in 30 minutes

Matthew Housley (Salt Lake City)

Matthew Housley, “Recovering Data Scientist”, is Co-Founder / CTO of Ternary Data. Also a “Reformed Academic,” Matthew holds a PhD in Math and dual Masters degrees in both Math and Physics. It was only natural that he began his career in Academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. Matt's STEM background in combination with his knack for teaching makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah. Matt is co-host of the popular Monday Morning Data Chat (Spotify / Apple) and co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Matthew will be appearing as part of the #SLCdata invasion.

Matt and Joe Reis will co-host their annual Data Town Hall at 5pm Saturday

Jonathan Mugan (Austin)

Jonathan Mugan (Linkedin), Principal Scientist at De Umbra, is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis was centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he recently also spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Jonathan will be presenting the following Data Day Texas session:
LLMs Expand Computer Programs by Adding Judgment

Prashanth Rao (Toronto)

Prashanth Rao is an AI Engineer at LanceDB, where he works in a role at the intersection of data infra tooling, AI/ML/LLMs and developer education. Most recently, he was an AI engineer at Kùzu Inc., an embedded graph database startup in Ontario.
In recent years, Prashanth has worked on a variety of data engineering, data science, and machine learning problems and has thought deeply about databases and data modeling paradigms. He has two master's degrees: one in Aerospace engineering from the University of Michigan, and another in Computer Science from Simon Fraser University in Vancouver. Prashanth’s primary interests include Natural Language Processing (NLP), information extraction, graph theory and database systems. In his spare time, Prashanth enjoys hiking, biking, trying out new cuisines, engaging with the AI developer community, and blogging about all things data at thedataquarry.com. Check out his most recent blog post, Why I'm excited to work at LanceDB.

Tim Berglund (Mountain View) @tlberglund

Tim Berglund is a teacher, author, and technology leader with Confluent, where he serves as the Vice President of Developer Relations. For almost two decades, Tim has been a first-call speaker at conferences around the world. You can find Tim all over YouTube, where he has a reputation for explaining complex technology topics in an accessible way. He tweets as @tlberglund, blogs every few years at timberglund.com, and lives in Mountain View, California. He has three grown children and two grandchildren, a fact about which he is rather excited. Check out Tim's latest interview on The Joe Reis Show.

Tim will be presenting the following Saturday Data Day Texas session:
Streams of Future Past

Casey O'Neill (Denver)

Casey O'Neill (Linkedin) is the Executive Vice President of AI & Engineering at Sequentum. Casey brings nearly two decades of expertise in software development, data analytics and enterprise solutions. Since joining Sequentum, Casey has led the AI and engineering team in delivering innovative, high-performance solutions that are the foundation of the groundbreaking Sequentum Cloud platform.
Casey has a proven track record of success in leadership, entrepreneurial ventures and product development. Earlier in his career he founded Barreled, an innovative online community and app platform that utilized data analytics to provide insights for whiskey enthusiasts and industry experts. Following this success, he founded Altitude Development Group, a consulting firm specializing in enterprise-grade data acquisition, analytics, and high-availability web services.

Casey will be co-presenting the following Saturday Data Day Texas session:
Web Scraping's 25-Year War: From HTML Parsing to AI That Builds Itself

Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science - as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database, AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and Foreign governments. Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.

Jans will be co-presenting the following Saturday Data Day Texas session:
The future cognitive OS uses a semantic layer knowledge graph.

Vaibhav Gupta (Seattle)

Across nearly a decade in software engineering, Vaibhav Gupta (Linkedin) has built predictive pipelines at D. E. Shaw, Augmented Reality systems at Google, and real-time 3D reconstruction at Microsoft HoloLens. His tenure at Google included leading performance optimizations for ARCore and the Pixel 4 face unlock, along with significant contributions to depth algorithms for the Pixel Visual Core. He also founded LifePlusPlus, a computer science bootcamp for non-traditional tech hires.
Currently, Vaibhav is CEO and Co-Founder of Boundary (YC W23), creator of BAML - the first domain-specific programming language designed specifically for structured data extraction from LLMs. BAML achieves state-of-the-art results in function-calling with GPT 3.5 over all other models and techniques, including OpenAI's new strict structured outputs. What makes BAML revolutionary is its Schema-Aligned Parsing (SAP) algorithm - instead of constraining LLM outputs upfront (which often fails), BAML fixes broken JSON like trailing commas, unquoted keys, unescaped quotes, new lines, and even fractions in milliseconds post-generation, making cheaper models perform like expensive ones.
Vaibhav holds a BS in Computer Science and Electrical Engineering from UT Austin.