We are just now beginning to announce speakers. Are you doing something cool with data / AI and would like to share it at Data Day Texas? Send us a proposal. For the latest speaker / session updates, follow us on LinkedIn.
AI Engineering Keynote
Chip Huyen (San Francisco) @chipro
Chip Huyen (LinkedIn) is a writer and computer scientist, currently at Voltron Data, where she works on GPU-native data processing and open data standards (Ibis, Apache Arrow, Substrait). Previously, Chip built machine learning tools at NVIDIA, Snorkel AI, and Netflix. She also founded Claypot AI, which was acquired. Chip graduated from Stanford University, where she taught CS 329S: Machine Learning Systems Design. Her lectures became the foundation for the book Designing Machine Learning Systems, which, after two years, continues to be a #1 bestseller in multiple Amazon categories. Advance copies of her upcoming book, AI Engineering, also from O'Reilly, will be available at Data Day Texas for your perusal. In her free time, Chip travels, writes, and reads. Follow her on GoodReads.
Xinran Waibel (SF Bay Area)
Xinran Waibel (LinkedIn) is a Senior Data Engineer at Netflix, where she builds batch and event-streaming data systems to enable personalization. Prior to Netflix, she was a Data Engineer at Confluent and Target, where she leveraged big data technologies to enable data-driven decision making in the marketing and membership space. An active writer and blogger known for her commitment to data education, Xinran is founder of the 5000+ member Data Engineer Things community. Check out their YouTube channel.
Vin Vashishta (Reno)
With over 25 years' experience in technology, Vin Vashishta, Founder and technical strategist at V Squared, is a recognized AI thought leader. Named a LinkedIn Top Voice, Gartner Ambassador, and an IBM and SAP insider, Vin has built a community of over 200K followers across social media, including tech leaders: Uber, Microsoft, Salesforce, NVIDIA, and Intel. His recently published book, From Data to Profit, is considered the go-to playbook for monetizing data and AI. Check out his YouTube channel, The High ROI Data Scientist.
Joe MF Reis (Salt Lake City)
Joe MF Reis (LinkedIn), Co-Founder and CEO of Ternary Data, is a “recovering data scientist,” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have ranged from statistical modeling, forecasting, and machine learning to data engineering, data architecture, and everything else in between. Joe is co-host of the popular Monday Morning Data Chat (Spotify / Apple) as well as the newly launched Joe Reis Show (Apple / Spotify). Joe is also co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah.
Adam Sroka (Edinburgh)
Adam Sroka (LinkedIn) is co-founder and director of Hypercube data & AI consultancy, host of the energy, utilities and trading Hypercube Podcast, board member of the data science & AI innovation centre The Data Lab, author of the Beyond Data Community & Newsletter with 3000+ weekly subscribers and a LinkedIn Top Voice for data strategy, leadership and management. With an MSc in Photon Science and PhD in Engineering high-power peak lasers for defence applications, Adam has built deep expertise in machine learning and data strategy with a strong mathematical background. He has over 10 years of industry experience, specialising in solving complex technical problems and building high-performing data teams in the energy sector.
Adam will be presenting the Energy Session:
Optimisation Platforms for Energy Trading
Lisa Cao (San Francisco)
Lisa Cao (LinkedIn) is a former data analyst, now data engineer and software engineer interested in observability, validation, and reliability in data systems. She is a Google Women TechMakers Ambassador, Linux Foundation LiFT recipient for Women in Open Source, founder and chair of the Vancouver Datajam, and lead maintainer of the BiocSwirl project. Currently, Lisa makes her home in the San Francisco Bay Area where she leads project management at DataStrato and is a co-organizer at Data Engineer Things.
Lisa will be leading the Data Discussion on Data Catalogs.
Leann Chen (Minneapolis)
Leann Chen (LinkedIn / YouTube / GitHub) is a Generative AI Developer Advocate at Diffbot, where she specializes in creating educational content to leverage the power of knowledge graphs to improve LLM-based systems. Leann first hit our radar when she created a knowledge graph powered RAG to use as a recommendation assistant for Data Day Texas 2024 (video). Check out Leann's recent interview on the Neo4j channel, and her latest video: Reliable Graph RAG with Neo4j and Diffbot.
Serg Masís (Raleigh-Durham-Chapel Hill)
Serg Masís (LinkedIn) is a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making. Serg is author of Interpretable Machine Learning with Python, now in its 2nd edition. Serg is also working on two upcoming titles: DIY AI: Step-By-Step Artificial Intelligence Projects for Makers and Hackers, and Building Responsible AI with Python. Learn more at Serg.ai.
Data Mesh Keynote
Jean-Georges Perrin (Albany, New York)
Jean-Georges Perrin (Wikipedia / LinkedIn) is an IT software engineer, lecturer, and serial entrepreneur from Alsace, France. The first French citizen to become an IBM Champion in 2009, he became a Lifetime IBM Champion in 2021, and a PayPal champion in 2024. Formerly Intelligence Platform Lead at PayPal, Jean-Georges is currently co-founder and Chief Innovation Officer at Abea Data. Jean-Georges is author of Spark in Action from Manning, and co-author of the upcoming Implementing Data Mesh from O'Reilly. Check out his thoughts on Data Mesh on YouTube.
Jess Haberman is Director of Product Content at Anaconda, where she leads content strategy and education. Previously, Jess was an acquisitions editor at O’Reilly Media, collaborating with tech industry leaders to develop instructional books and online content in data science and data engineering. She has presented at and facilitated technology conferences (O’Reilly’s Strata and Data Superstreams, PyCon US, Scale by the Bay, DataCon LA), webinars, live training courses, podcasts, publishing seminars, and writing retreats. Jess earned her BA in English Literature from Denison University and spent 14 years in nonfiction book publishing.
Jess will be presenting the following session:
The Future of Data Education
Jonathan Mugan (Austin) @jmugan
Jonathan Mugan (LinkedIn), Principal Scientist at De Umbra, is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he also recently spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.
Bill Inmon (Castle Rock, Colorado)
Bill Inmon (Wikipedia / LinkedIn) is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine and was the first to offer classes in data warehousing. Inmon created the accepted definition of what a data warehouse is: a subject-oriented, nonvolatile, integrated, time-variant collection of data in support of management's decisions. Bill is among the most prolific and well-known authors in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession.
Susan Shu Chang (Toronto) @susan_shuc
Susan Shu Chang (LinkedIn) is currently Principal Data Scientist at Elastic. Originally trained in Economics, Susan is a 5x PyCon speaker, founder of indie game studio Quill Game Studios and organizer of the 3700+ member Toronto Women's Data Group. Susan is also author of the upcoming O'Reilly book: Machine Learning Interviews. To learn how she finds time for all this and more, check out her personal site, susanshu.com, for her writings on focus optimization and daily routines.
Michelle Yi (SF Bay Area) @YulleYi
Michelle Yi is a technology leader who specializes in machine learning and cloud computing. She has 15 years of experience in the technology industry, contributed to the original IBM Watson showcased on Jeopardy, and enjoys building and leading teams that develop and deploy AI solutions to solve real-world problems. Michelle is passionate about diversity and STEM education/careers for our minority communities, and serves both on the board of Women in Data and as an avid volunteer for Girls Who Code.
Amy Hodler (Kettle Falls, Washington) @amyhodler
Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, Cray, and Relational AI. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O'Reilly book: Graph Algorithms, as well as co-author of an upcoming volume on the history of graph analytics.
Clair Sullivan (Breckenridge, Colorado) @ProfCJSullivan
Dr. Clair Sullivan is currently the Founder and CEO of Clair Sullivan and Associates, a company dedicated to providing data science consulting services. Prior to starting her company, she was the Director of Data Science at Vail Resorts leading a team of data scientists and machine learning engineers providing production models for operations and marketing. Previously she was a data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015.
Jessica Talisman (Santa Cruz)
Jessica Talisman is a taxonomist, ontologist, information architect, and professional data wrangler. Over her 25 years of experience in the information & data architecture world, Jessica has worked in galleries, libraries, museums, the federal government, e-commerce, and tech, and is currently the Information Architect for Sellers Platform at Amazon. Jessica holds a Master of Library and Information Science and a Master's in Teaching. Check out Jessica’s recent interviews on the Monday Morning Data Chat and Discovering Data.
Dr. Juan Sequeda is the Principal Scientist at data.world. He joined through the acquisition of Capsenta, a company he founded as a spin-off from his PhD research in Computer Science at The University of Texas at Austin. His goal is to reliably create knowledge from inscrutable data. His research and industry work have focused on designing and building Knowledge Graphs for enterprise data and metadata management.
Juan has researched and developed technology on semantic data virtualization, graph data modeling, schema mapping and data integration methodologies. He pioneered technology to construct knowledge graphs from relational databases, resulting in W3C standards, research awards, patents, software and his startup Capsenta, which was acquired by data.world in 2019. Juan strives to build bridges between academia and industry as past co-chair of the LDBC Property Graph Schema Working Group, member of the LDBC Graph Query Languages task force, standards editor at the World Wide Web Consortium (W3C) and member of organizing committees of scientific conferences, including serving as general chair of The Web Conference 2023.
David Hughes (Seattle)
David Hughes is the Principal Graph Consultant for Graphable. He has 10 years of experience designing and building graph solutions which surface meaningful insights. His background includes clinical practice, medical research, software development, and cloud architecture. David has worked in healthcare and biotech within the intensive care, interventional radiology, oncology, cardiology, and proteomics domains. He enjoys endurance running, hiking, and spending time with his family in the outdoors when he is not enabling clients to have data epiphanies from their complex data.
Sean Robinson (Charlotte)
Sean Robinson is a versatile data scientist with several years of experience optimizing data processes and building intelligent data systems. He specializes in using graph data science and Neo4j to abstract complex systems within a domain into highly dimensional, interconnected knowledge graphs, uncovering novel insights that would otherwise remain dormant in other data structures. Sean currently serves as Lead Data Scientist at Graphable and also creates and teaches new network science courses in the University of North Carolina at Charlotte’s Data Science graduate program, where he instructs the next generation of data scientists on how to integrate graph data science into their toolkit.
Chris Tabb (London)
Chris Tabb, co-founder of LEIT DATA, started his career in the Business Intelligence/Analytics domain 30 years ago. He began at Cognos in the 90’s, working in the back office before becoming an expert in all their products, and left to become an independent BI consultant in 1998. Chris has followed the evolution of the analytics industry, working hands-on with all the technologies in the ecosystems: Databases, ETL/ELT, BI/OLAP/Visualisation Tools, Big Data Technologies, and Infrastructure (On premises / Cloud) across many vendors, some old, some new. Recently, with a focus on the Modern Data Stack Evolution, Chris has started many movements centered on Business Value, using a number of hashtags to raise awareness (#bringbackdatamodelling / #bringbackdatamodeling, #bringbackdocumention) under the umbrella of #meandatastreets, which focuses on simplifying Data Platform architecture and on Business Value.
Matthew Housley (Salt Lake City)
Matthew Housley, “Recovering Data Scientist,” is Co-Founder / CTO of Ternary Data. Also a “Reformed Academic,” Matthew holds a PhD in Math and dual master's degrees in Math and Physics. It was only natural that he began his career in academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. Matt's STEM background, in combination with his knack for teaching, makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah. Matt is co-host of the popular Monday Morning Data Chat (Spotify / Apple) and co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering.