Who will speak at Data Day Texas 2023

We're just now sending invites and beginning to confirm speakers for the upcoming edition in January. If you'd like to join us as a speaker, take a look at our Proposals page.

Data Engineering Keynote
Adi Polak (Israel) @AdiPolak

As Vice President of Developer Experience at Treeverse, Adi Polak shapes the future of data & ML technologies for hands-on builders. She also contributes to lakeFS, an open-source project that provides a git-like interface for object stores. In her work, she brings her vast industry research and engineering experience to bear in educating and helping teams design, architect, and build cost-effective data systems and machine learning pipelines that emphasize scalability, expertise, and business goals. Adi is a frequent worldwide presenter and the author of O'Reilly's upcoming book, Machine Learning With Apache Spark. She is regularly invited to serve on program committees and as an advisor for conferences like Data & AI Summit, Scale by the Bay, and others. Previously, she was a senior manager for Azure at Microsoft, where she focused on building advanced analytics systems and modern architectures. When Adi isn’t building data pipelines or thinking up new software architecture, you can find her on the local cultural scene or at the beach.

NLP Keynote
John Bohannon (SF Bay Area) @bohannon_bot

John Bohannon (Wikipedia / LinkedIn) is currently Director of Science at Primer, an artificial intelligence company headquartered in San Francisco. Bohannon is widely known as a science journalist, most notably for his "Gonzo Scientist" online series at Science Magazine and his creation of the annual Dance Your PhD contest. Bohannon is involved in the effective altruism movement. In July 2015 he became a member of Giving What We Can, an organization whose members pledge to give at least 10% of their income to effective charities. Bohannon completed his Doctor of Philosophy degree in molecular biology at the University of Oxford in 2002, supervised by Paul Rainey.
To see why we invited John to be our NLP Keynote for 2023, check out the following interviews:
Data Exchange Podcast (Episode 144): John Bohannon,
Multimodal, Multi-Lingual NLP at Hugging Face with John Bohannon and Douwe Kiela,
Talk with John Bohannon, Director of Science at Primer,
Trends in NLP with John Bohannon,
John Bohannon Interview - Taming arXiv with Natural Language Processing.

GIS Keynote
Bonny McClain (Greensboro, North Carolina) @datamongerbonny

Dr. Bonny McClain is a geospatial analyst and self-described human geographer and social anthropologist. Dr. McClain applies advanced data analytics, including data engineering and geo-enrichment, to poverty, race, and gender discussions. Her research targets judgments about structural determinants, racial equity, and elements of intersectionality to illuminate the confluence of metrics contributing to poverty. Moving beyond ZIP codes to explore apportioned socioeconomic data based on underlying population data leads to discovering novel variables based on location to build more context to complex data questions. Bonny is a member of the National Press Club, 500 Women Scientists, the Urban and Regional Information Systems Association (URISA), and Investigative Reporters and Editors, and a former member of the Tableau Speaker’s Bureau, allowing access to a wide variety of health policy and health economic discussions. Bonny is the author of the upcoming O'Reilly publication Python for Geospatial Data Analysis: Theory, Tools, and Practice for Location Intelligence.
Bonny will be presenting the GIS Keynote: "one ant, one bird, one tree"...

Hala Nelson (Alexandria, Virginia)

Hala Nelson (LinkedIn) is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in Mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to her work at James Madison University, she was a postdoctoral Assistant Professor at the University of Michigan-Ann Arbor. Her research is in the areas of Materials Science, Statistical Mechanics, Inverse Problems, and the Mathematics of Machine Learning and Artificial Intelligence. Her favorite subjects are Optimization, Numerical Algorithms, Mathematics for AI, Mathematical Analysis, Numerical Linear Algebra and Probability Theory. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well, or is trying to show off. Other facts: Hala Nelson grew up in Lebanon, during the time of its brutal civil war. She lost her hair at a very young age in a missile explosion. This event and many that followed shaped her interests in human behavior, the nature of intelligence, and AI. Her father taught her Math, at home and in French, until she graduated high school. Her favorite quote from her father about math is, "It is the one clean science".
Hala is the author of the upcoming O'Reilly book Essential Math for AI.

Ryan Mitchell (Boston) @Kludgist

An expert in web scraping, web security, and data science, Ryan Mitchell has hosted workshops and spoken at many events, including Data Day and DEF CON. She teaches web programming and data science and has taught and designed courses at Northeastern University and Olin College of Engineering. Ryan holds a master’s degree in software engineering from Harvard University Extension School and is currently a senior software engineer at the Gerson Lehrman Group where she creates data science tools. Ryan is the author of Web Scraping with Python (O’Reilly) and Instant Web Scraping with Java (Packt Publishing), as well as multiple Linkedin courses, including: Python Data Structures with Trees and Web Scraping with Python.
Ryan will be presenting the following workshop: Hands-on Introduction to Web Scraping with Python 2023

Amy Hodler (Kettle Falls, Washington) @amyhodler

Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, and Cray. At RelationalAI, she’s the Graph Evangelist and Sr. Director of Product Marketing. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O’Reilly books, Graph Algorithms: Practical Examples in Apache Spark and Neo4j, and Knowledge Graphs: Data in Context for Responsive Businesses.

Heather Hedden (Boston) @hhedden

Heather Hedden (LinkedIn) has been a taxonomist for over 26 years in various organizations and as an independent consultant. She is currently a data and knowledge engineer on the professional services team of Semantic Web Company, vendor of PoolParty software. She previously worked as a taxonomist at Cengage Learning, Gale, Viziant, First Wind, and Project Performance Corporation. Heather has designed and developed taxonomies, ontologies, and metadata schema for internal and externally published content. She gives workshops on taxonomy creation at conferences, as corporate training, and through an independently offered online course. Heather is the author of The Accidental Taxonomist.
Heather will host the following workshop:
Introduction to Taxonomies for Data Scientists (workshop)

Sanghamitra Deb (SF Bay) @sangha_deb

Sanghamitra Deb is a Data Scientist at Chegg, where she works on problems related to school and college education to sustain and improve the learning process. Her work involves recommendation systems, graph modeling, deep NLP analysis, data pipelines and machine learning. Previously, Sanghamitra was a data scientist at Accenture, where she worked on a wide variety of problems related to data modeling, architecture and visual storytelling. Sanghamitra is active in Data Science outreach and believes in applying analytics to a range of domains such as pharma, HR, customer support, market research, etc. Prior to being a data scientist, she was an astrophysicist who studied the structure of the universe by modeling galaxy clusters.

Janet Six (DFW) @janetmsix

Janet Six is a Product Manager at Tom Sawyer Software, where she helps companies design easier-to-use products within their financial, time, and technical constraints. For her research in information visualization, Janet was awarded the University of Texas at Dallas Jonsson School of Engineering Computer Science Dissertation of the Year Award. She was also awarded the prestigious IEEE Dallas Section 2003 Outstanding Young Engineer Award. Her work has appeared in the Journal of Graph Algorithms and Applications and the Kluwer International Series in Engineering and Computer Science. The proceedings of conferences on Graph Drawing, Information Visualization, and Algorithm Engineering and Experiments have also included the results of her research.

Tomás Sabat (London, England) @tasabat

Tomás Sabat is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.

Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science, as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and foreign governments. Jans recently authored an IEEE article on “Enterprise Knowledge Graphs”.
Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.

Tim Berglund (Denver) @tlberglund

Tim Berglund is a teacher, author, and technology leader with StarTree, where he serves as the Vice President of Developer Relations. For over a decade, Tim has been a first-call speaker at conferences around the world. He can also be found on YouTube, where he has a reputation for explaining complex technology topics in an accessible way. He tweets as @tlberglund, blogs every few years at timberglund.com, and lives in Littleton, CO, USA. He has three grown children and two grandchildren, a fact about which he is rather excited.

Jeff Carpenter (Scottsdale, Arizona) @jscarp

Jeff Carpenter (LinkedIn), co-author of Cassandra: The Definitive Guide (3rd edition available soon!), has worked on large-scale systems in the defense and hospitality industries. Jeff leads the Developer Advocate team at DataStax, where he uses his background in system architecture, microservices and Apache Cassandra to help empower developers and operations engineers to build distributed systems that are scalable, reliable, and secure.

Rosaria Silipo (Zürich) @DMR_Rosaria

Rosaria Silipo (LinkedIn), Head of Data Science Evangelism at KNIME, has been a researcher in applications of Data Science and Machine Learning for over a decade. Her field of experience traverses many domains, including biomedical systems, IoT, customer intelligence, financial services, social media, cybersecurity, and automatic speech processing.
Rosaria is the author of 50+ technical publications, including her most recent book Practicing Data Science: A Collection of Case Studies. She holds a doctorate in bio-engineering from Università degli Studi di Firenze.

Yue Cathy Chang (Sunnyvale) @yuec

Yue Cathy Chang is an executive recognized for thought leadership and execution in digital transformation. She is passionate about addressing business challenges and often finds herself and her team "parachuting" into situations to tackle challenging and meaningful data needs. Cathy has led teams and functions at blue-chip enterprises as well as startups, across financial services and high-tech industries, working with leaders of centralized and distributed data teams, all betting the next product differentiation on data. She is currently an AVP in banking and financial services at an American multinational technology corporation.
Cathy holds MS and BS degrees in electrical and computer engineering from Carnegie Mellon University, MBA and MS degrees from MIT, and two granted US patents. She's a co-author, with Jike Chong, of the Manning publication How to Lead in Data Science.

Jike Chong (Sunnyvale) @jikechong

Jike Chong is an executive who nurtures teams and crafts cultures to produce billion-dollar business impacts. He built and grew multiple high-performing data functions in public and private companies and nurtured dozens of ambitious individual contributor data scientists into leaders; some have gone on to lead teams of more than 70 data scientists. Jike was part of the executive team that took Yiren Digital Ltd public on NYSE. He also expanded and led the data team as the chief data scientist at Acorns, designed and executed a project predicting venture investment risks at Silver Lake, and led the Hiring Marketplace Data Science team at LinkedIn, serving a business line with $4B a year in revenue.
Jike received his bachelor’s and master’s degrees in electrical and computer engineering from Carnegie Mellon University and a PhD in electrical engineering and computer science from the University of California, Berkeley. He's a co-author, with Yue Cathy Chang, of the Manning publication How to Lead in Data Science.

Shirshanka Das (Santa Clara) @shirshanka

Shirshanka Das (LinkedIn) is co-founder and CEO of Acryl Data, the company which is commercializing the open source DataHub project, a real-time metadata platform used by LinkedIn, Expedia, Saxo Bank, Klarna, Viasat, and many others. Prior to founding Acryl, he was the overall architect for Big Data at LinkedIn from 2010 to 2020, and responsible for creating the metadata and data management strategy at the company. As part of this, he founded the DataHub project and shaped its evolution to a metadata platform that powers DataOps, MLOps, productivity, and governance use cases at LinkedIn. He is also a PMC and committer on the Apache Gobblin project which manages 100PB+ of data assets at rest at LinkedIn, and is deployed in production at other large companies like Verizon, PayPal etc. Prior to LinkedIn, Shirshanka worked on high-performance serving systems at Yahoo and PayPal. Shirshanka has a Ph.D. in Computer Science from UCLA.

Brian Hall (Austin) @brian_w_hall

Brian Hall leads engineering and development as VP of Cyber Applications at Qomplx, where they build high-capacity stream processing applications to detect potential attacks and secure corporate networks against them. Prior to that, Brian led the Graph and Analytics Practice at Expero, utilizing a wide array of graph engines including JanusGraph, DataStax, Neo4j, Tigergraph and Neptune. Brian has been developing software and consulting for over 25 years and holds a B.S. in Computer Science from Vanderbilt University and an M.S. in Computer Science from DePaul University. In their free time, Brian and his wife, Nicole, enjoy watching their kids slowly turn into the adults they will become, traveling, eating good food with good wine, and staying active in Austin with all the outdoor activities and great live music.
Brian will present the following Streaming Data session: Protecting Against Ransomware Attacks using Kafka, Flink and Graph.

Joey Jablonski (Austin) @jrjablo

Joey Jablonski (LinkedIn) is VP of Analytics at Pythian, where he leads strategic engagements assisting customers in developing their data strategy, defining and executing on data governance programs and building analytical models to power the modern data-driven organization. Prior to Pythian, Joey was VP of Product at Manifold, where he brought a product mindset to all engagements, allowing for quick delivery of value in any project and building over time to drive adoption of new data-centric capabilities in an organization. Joey led engagements across industries including high tech, pharmaceuticals and the federal government. Before Manifold, Joey held executive leadership positions at Northwestern Mutual, iHeartMedia and Cloud Technology Partners. He brings 20+ years of experience in software engineering, high performance computing, cyber security, data governance and data engineering.

Corey Lanum (Boston) @corey_lanum

Corey Lanum (LinkedIn) has a distinguished background in graph visualization. Over the last 15 years he has managed technical and business relationships with dozens of the largest defense and intelligence agencies in North America, in addition to working with many security and anti-fraud organizations in private industry. Prior to joining Cambridge Intelligence as their US Manager, Corey helped the customers of i2 (now IBM) and SS8 solve their most complex graph data challenges.
Corey is the author of Visualizing Graph Data from Manning Publications.

William Lyon (SF Bay) @lyonwj

William Lyon (LinkedIn / blog) is a software developer at Neo4j. As an engineer on the Developer Relations team, he works primarily on integrating Neo4j with other technologies, building demo apps, helping other developers build applications with Neo4j, and writing documentation. Prior to joining Neo4j, William worked as a software developer for several startups in the real estate software, quantitative finance, and predictive API fields. William holds a master's degree in Computer Science from the University of Montana. William is the author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j and co-host of the GraphStuff.FM podcast.

William will lead the following 90 minute workshop: Hands-On Introduction To GraphQL For Data Scientists & Developers

Dave McComb (Ft Collins) @semanticarts

Dave McComb (LinkedIn) is the President and co-founder of Semantic Arts. He and his team help organizations uncover the meaning in the data from their information systems. Dave is also the author of "The Data-Centric Revolution", "Software Wasteland" and "Semantics in Business Systems". For 20 years, Semantic Arts has helped firms of all sizes in this endeavor, including Amgen, DuPont, Procter & Gamble, Goldman Sachs, Schneider Electric, LexisNexis, Dun & Bradstreet, and Morgan Stanley. Prior to Semantic Arts, Dave co-founded Velocity Healthcare, where he developed and patented the first fully model-driven architecture. Prior to that, he was a part of the integration problem.
Dave will present the following Data Integration session: Zero Copy Integration

Patrick McFadin (SF Bay) @patrickmcfadin

Patrick McFadin (LinkedIn) is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of DataStax products successful. He has also worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (Linkedin) is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses in the area of deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis was centered in developmental robotics, which is an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he recently also spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Andy Petrella (Liège, Belgium) @noootsab

Andy Petrella is an entrepreneur with a Mathematics and Distributed Data background. Andy is an early evangelist of Apache Spark in the data community and the creator of the Spark Notebook. He is also the O'Reilly author of “What is Data Observability” and “What is Data Governance”, and a trainer for “Distributed Data Science”, “Data Lineage Essentials”, and “Machine Learning Model Monitoring”. Andy is also the founder and CEO of Kensu, a data observability solution implementing the Data Observability Driven Development (DODD) method.

Brent Schneeman (Austin) @schnee

Brent Schneeman swipes right for science and seeks to strengthen the scientific method muscle in whatever group he finds himself. Operating from a “lead by example” mindset, Brent frequently rolls up his sleeves and writes code to help bring predictive models to business problems. Passionate about building great teams and cultures, he’s pretty sure that a “servant leadership” posture is the right posture in his personal and professional lives.
Professionally, he tends to look after teams of data- and machine-learning-oriented contributors (analysts, scientists, and engineers) who collaborate on diverse sets of machine learning projects such as continuous optimization, customer churn prediction, fraud detection, and applying diverse techniques to unstructured data. Brent has worked at Vrbo, PayPal, Visa, and other small and large companies in individual contributor or management roles, mostly in product development organizations. He currently is attempting to make the world safe for machine learning with Alegion.
A storyteller, Brent has presented at the UT McCombs School, South By Southwest, NLP Day, multiple Data Days, and various meetups. He has one degree in Mathematics and another in Electrical Engineering and lives in Austin, Texas with his wife, three kids, two cats and one dog. While he spends most of his free time mowing the lawn, he enjoys making photographs and running around downtown, and occasionally tries to make sense of neural network architectures.

Michael Berthold (Konstanz)

Michael Berthold is currently president of KNIME.com AG and co-creator of KNIME (wikipedia entry), the open analytics platform used by thousands of data experts around the world. Since August 2003, Michael has been the Nycomed-Chair for Bioinformatics and Information Mining at Konstanz University, Germany where his research focuses on using machine learning methods for the interactive analysis of large information repositories in the Life Sciences. Previously he held positions in both academia (Carnegie Mellon, UC Berkeley) and industry (Intel, Tripos).
Michael is Past President of the North American Fuzzy Information Processing Society, Associate Editor of several journals and the President of the IEEE Systems, Man, and Cybernetics Society. He has been involved in the organization of various conferences, most notably the IDA series of symposia on Intelligent Data Analysis and the conference series on Computational Life Science. Together with David Hand he co-edited the textbook Intelligent Data Analysis: An Introduction, which has recently appeared in a completely revised second edition. He is also co-author of Guide to Intelligent Data Analysis (Springer Verlag), which appeared in summer 2010. When time permits, Michael still writes code.

Sean Robinson (Charlotte)

Sean Robinson is a versatile data scientist with several years of experience optimizing data processes and building intelligent data systems. Specifically, he specializes in the use of graph data science and Neo4j to abstract complex systems within a domain into highly dimensional, interconnected knowledge graphs to uncover novel insights which would otherwise remain dormant in other data structures. Sean currently serves as Lead Data Scientist at Graphable and also creates and teaches new network science courses in the Data Science graduate program at the University of North Carolina at Charlotte, where he instructs the next generation of data scientists on how to integrate graph data science into their toolkit.
Sean will be presenting the following workshop: Intro to Graph Data Science for Python Developers.

Michael Uschold (Seattle, WA) @UscholdM

Michael Uschold, Senior Ontology Consultant at Semantic Arts, has over twenty-five years’ experience in developing and transitioning semantic technology from academia to industry. He pioneered the field of ontology engineering, co-authoring the first paper and giving the first tutorial on the topic in 1995 in the UK.
As a senior ontology consultant at Semantic Arts since October 2010, Michael trains and guides clients to better understand and leverage semantic technology using knowledge graphs. He has built commercial enterprise ontologies in digital asset management, finance, healthcare, legal research, consumer products, electrical devices, manufacturing and corporation registration. More recently he has focused on semantic application development using SPARQL for application code and R2RML for converting relational data into a knowledge graph.
During 2008-2009, Uschold worked at Reinvent on a team that developed a semantic advertising platform that substantially increased revenue. As a research scientist at Boeing from 1997-2008 he defined, led and participated in numerous projects applying semantic technology to enterprise challenges. He is a frequent invited speaker and panelist at national and international events, and serves on the editorial board of the Applied Ontology Journal. He received his Ph.D. in AI from Edinburgh University in 1991 and an MSc. from Rutgers University in Computer Science in 1982.
Michael will lead the following 90 minute workshop: Ontology for Data Scientists

Weidong Yang (San Francisco) @wdyang

Weidong Yang is the founder and CEO of Kineviz. He holds a doctorate in Physics and a master's in Computer and Information Science. After conducting theoretical and experimental research on quantum dots, Weidong worked for 10 years as a product manager and R&D scientist in the semiconductor industry, where he invented Diffraction-based Overlay technology to improve the manufacturing precision of silicon wafers. He has been awarded 11 US patents and has contributed to 20+ peer-reviewed publications.
Weidong also co-founded Kinetech Arts, a non-profit organization that brings dancers and engineers together to explore the creative potential of making art via new technologies.