Who will speak at Data Day Texas 2022

We still have discount rooms at the AT&T. If you are coming from out of town, this is where all the action is. For the best selection, book a room now.

We're just now beginning to announce invited speakers. If you are doing something cool with data and want to share it at Data Day Texas, check out our proposals page.

Genevera Allen (Houston) @genevera_allen

Genevera Allen (LinkedIn / Google Scholar) is an Associate Professor of Electrical and Computer Engineering, Statistics and Computer Science at Rice University and an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital and Baylor College of Medicine. She is also the Founder and Faculty Director of the Rice Center for Transforming Data to Knowledge, informally called the Rice D2K Lab.
Dr. Allen's research focuses on developing statistical machine learning tools to help scientists make reproducible data-driven discoveries. Her work lies in the areas of interpretable machine learning, optimization, data integration, modern multivariate analysis, and graphical models with applications in neuroscience and bioinformatics. Dr. Allen is the recipient of several honors including a National Science Foundation CAREER Award, the George R. Brown School of Engineering's Research and Teaching Excellence Award at Rice University, and in 2014, she was named to the "Forbes '30 under 30': Science and Healthcare" list. Dr. Allen received her PhD in statistics from Stanford University (2010), under the mentorship of Prof. Robert Tibshirani, and her bachelor's, also in statistics, from Rice University (2006).

Paul Azunre (Austin) @pazunre

Paul Azunre (LinkedIn) holds a Ph.D. in Computer Science from MIT and has served as a Principal Investigator on several DARPA research programs. He has helped develop scientific software in key roles at established organizations such as Oracle and Dun & Bradstreet, as well as a variety of startups. He founded Algorine Inc., a Research Lab dedicated to advancing AI/ML and identifying scenarios where they can have a significant social impact. Paul also co-founded Ghana NLP, an open-source initiative focused on using NLP and Transfer Learning with Ghanaian and other low-resource languages. He frequently contributes to major peer-reviewed international research journals and serves as a program committee member at top conferences in the field.
In his spare time, under the alias Dr. Pushkin, Paul is part of the underground hip-hop, R&B and Afrobeats/Afropop/Afrohiphop group - Isolirium (Spotify / Soundcloud).
Paul is also the author of the recently published Transfer Learning for Natural Language Processing from Manning.

Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science, as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and foreign governments. Jans recently authored an IEEE article on “Enterprise Knowledge Graphs”.
Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.

Dave Bechberger (Anchorage) @bechbd

Dave Bechberger is a Sr. Graph Architect on the AWS Neptune service team. A longtime graph and distributed data practitioner, Dave has spent over 20 years in full stack software development and specializes in building data architectures in complex data domains such as bioinformatics, oil and gas, and supply chain management. Dave has previously spoken at a variety of national and international technical conferences including NDC Oslo and NDC London, as well as previous GraphDay conferences in Texas, San Francisco and Seattle. He is a co-author of Graph Databases in Action by Manning Publications.
Dave will present the following session: A gentle introduction to using graph neural networks on knowledge graphs.

Michael Berthold (Berlin)

Michael Berthold is currently president of KNIME.com AG and co-creator of KNIME (Wikipedia entry), the open analytics platform used by thousands of data experts around the world. Since August 2003, Michael has been the Nycomed-Chair for Bioinformatics and Information Mining at Konstanz University, Germany, where his research focuses on using machine learning methods for the interactive analysis of large information repositories in the Life Sciences. Previously he held positions in both academia (Carnegie Mellon, UC Berkeley) and industry (Intel, Tripos).
Michael is Past President of the North American Fuzzy Information Processing Society, Associate Editor of several journals and the President of the IEEE Systems, Man, and Cybernetics Society. He has been involved in the organization of various conferences, most notably the IDA-series of symposia on Intelligent Data Analysis and the conference series on Computational Life Science. Together with David Hand he co-edited the textbook Intelligent Data Analysis: An Introduction, which has recently appeared in a completely revised second edition. He is also co-author of Guide to Intelligent Data Analysis (Springer Verlag), which appeared in summer 2010. When time permits, Michael still writes code.

Dr. Matthias Broecheler (Seattle) @mbroecheler

Dr. Matthias Broecheler (LinkedIn) is the inventor of the Titan graph database and co-founder of Aurelius, the original company behind the Apache TinkerPop graph framework, acquired by DataStax in 2015. A sought-after speaker, he introduced Titan at the 2012 Cassandra Summit and gave the keynote at the first Graph Day Texas in 2016 (interview). Matthias is co-author of The Practitioner's Guide to Graph Data, published by O'Reilly. Matthias received his PhD in Computer Science at the University of Maryland, College Park.

Dr. Ying Ding (Austin)

Dr. Ying Ding is the Bill & Lewis Suit Professor of Information Technology at the University of Texas School of Information. Before that, she was a professor and director of graduate studies for the data science program at the School of Informatics, Computing, and Engineering at Indiana University, where she led the effort to develop the online data science graduate program. She also worked as a senior researcher at the Department of Computer Science, University of Innsbruck (Austria), and the Free University of Amsterdam (the Netherlands). She has been involved in various NIH, NSF and European Union funded projects. She has published 240+ papers in journals, conferences, and workshops, and served as a program committee member for 200+ international conferences. She is the co-editor of the book series Semantic Web Synthesis, published by Morgan & Claypool, the co-editor-in-chief of Data Intelligence, published by MIT Press and the Chinese Academy of Sciences, and an editorial board member for several top journals in Information Science and Semantic Web. She is the co-founder of Data2Discovery, a company advancing cutting-edge AI technologies in drug discovery and healthcare. Her current research interests include data-driven science of science, AI in healthcare, Semantic Web, knowledge graphs, data science, scholarly communication, and the application of Web technologies.

Heather Hedden (Boston) @hhedden

Heather Hedden (LinkedIn) has been a taxonomist for over 26 years in various organizations and as an independent consultant. She is currently a data and knowledge engineer on the professional services team of Semantic Web Company, vendor of PoolParty software. She previously worked as a taxonomist at Cengage Learning, Gale, Viziant, First Wind, and Project Performance Corporation. Heather has designed and developed taxonomies, ontologies, and metadata schema for internal and externally published content. She gives workshops on taxonomy creation at conferences, as corporate training, and through an independently offered online course. Heather is the author of The Accidental Taxonomist.
Heather will host the following two sessions:
Introduction to Taxonomies for Data Scientists (workshop)
The Future of Taxonomies - Linking data to knowledge (presentation).

Amy Hodler (Kettle Falls, Washington) @amyhodler

Amy Hodler is the AI evangelist for Fiddler Labs, educating data scientists on the use of continuous monitoring for accuracy and bias as well as creating more explainable ML and ultimately more trustworthy AI. Previously, she was AI and Graph Analytics Program Manager at Neo4j, where she promoted the use of graph analytics to reveal structures within real-world networks and predict dynamic behavior. Amy is the co-author of the O’Reilly book, Graph Algorithms: Practical Examples in Apache Spark and Neo4j, co-author of Knowledge Graphs: Data in Context for Responsive Businesses, and a contributor to the upcoming book, AI on Trial.
Amy will host the following two sessions:
- Continuous ML Improvement: Automated Monitoring with Built-In Explainability
- 4 Types of ML Drift and How to Catch Them (Or “Why your AI is wrong, eventually”)

Joey Jablonski (Austin) @jrjablo

Joey Jablonski (LinkedIn) is VP of Product at Manifold, where he builds trusted relationships with customers so that the company can more fully understand their needs and how it can help. Joey partners with customers to ensure that a product mind-set is part of all engagements, allowing for delivery of value quickly in any project and building over time to drive adoption of new data-centric capabilities in an organization. He has extensive experience delivering innovative solutions to customers.
Prior to Manifold, Joey was VP of Core Data at Northwestern Mutual, where he delivered high quality and compliant data products to facilitate decisions, automation, and new product launches. While there, he built the first data product management team to enable better engineering prioritization, more effective product value definition, and technology simplification. Prior to Northwestern Mutual, Joey was VP of Data Engineering and Analytics at iHeartMedia. There, he led a team delivering data science, data engineering, broadcast engineering, and attribution capabilities to the largest audio media company in the world.

Corey Lanum (Boston) @corey_lanum

Corey Lanum (LinkedIn) has a distinguished background in graph visualization. Over the last 15 years he has managed technical and business relationships with dozens of the largest defense and intelligence agencies in North America, in addition to working with many security and anti-fraud organizations in private industry. Prior to joining Cambridge Intelligence as their US Manager, Corey was helping the customers of i2 (now IBM) and SS8 to solve their most complex graph data challenges.
Corey is the author of Visualizing Graph Data from Manning Publications.
Corey will present the following session: Visual timeline analytics: applying concepts from graph theory to timeline and time series data

William Lyon (SF Bay) @lyonwj

William Lyon (LinkedIn / blog) is a software developer at Neo4j. As an engineer on the Developer Relations team, he works primarily on integrating Neo4j with other technologies, building demo apps, helping other developers build applications with Neo4j, and writing documentation. Prior to joining Neo, William worked as a software developer for several startups in the real estate software, quantitative finance, and predictive API fields. William holds a master's degree in Computer Science from the University of Montana. William is the author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j and co-host of the GraphStuff.FM podcast.
William will host a 90-minute hands-on session: Intro to GraphQL for Developers and Data Scientists.

Ryan Mitchell (Boston) @Kludgist

An expert in web scraping, web security, and data science, Ryan Mitchell has hosted workshops and spoken at many events, including Data Day and DEF CON. She teaches web programming and data science and has taught and designed courses at Northeastern University and Olin College of Engineering. Ryan holds a master’s degree in software engineering from Harvard University Extension School and is currently a senior software engineer at the Gerson Lehrman Group where she creates data science tools. Ryan is the author of Web Scraping with Python (O’Reilly) and Instant Web Scraping with Java (Packt Publishing), as well as two LinkedIn courses: Python Data Structures with Trees and Web Scraping with Python.
Ryan will be presenting the following session: What is Truth? - Strategies for managing semantic triples in large complex systems

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (LinkedIn) is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis was centered in developmental robotics, which is an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he recently also spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.
Jonathan will be presenting the following session: A Path to Strong AI

Jacqueline Nolis (Seattle) @skyetetra

Dr. Jacqueline Nolis (LinkedIn / GitHub) is a data science leader with over 15 years of experience in managing data science teams and projects at companies ranging from DSW to Airbnb. She is currently the Head of Data Science at Saturn Cloud, where she helps design products for data scientists. Jacqueline has a PhD in Industrial Engineering and is co-author, with Emily Robinson, of the Manning publication Build a Career in Data Science.

Paige Roberts (Austin) @RobertsPaige

With two decades in the data management industry, Paige Roberts (LinkedIn) has worked as an engineer, a trainer, a marketer, a product manager, and a consultant. Now, as Open Source Relations Manager at Vertica, she promotes understanding of MPP data processing, open source, and how the analytics revolution is changing the world. Paige is a contributor to the upcoming O'Reilly publication 97 Things Every Data Engineer Should Know.
Paige is a total geek who is into role-playing games, LARP’ing in the SCA, Doctor Who, superheroes, space exploration, comics, Tolkien, etc. Paige writes and publishes fantasy and science fiction stories under her maiden name Paige E. Ewing. She won the Kennedy Space Center’s global Space Apps Challenge three years ago for coming up with an idea for growing food on Mars. And she's a pretty mean shot with a recurve, crossbow, or long bow.

Jörg Schad (Berlin / San Francisco) @joerg_schad

Jörg Schad (LinkedIn / GitHub) is CTO at ArangoDB where he splits his time between Berlin and San Francisco. Prior to ArangoDB, he was Technical Community Lead and Distributed Systems Engineer at Mesosphere, and Big Data Engineer at SAP. Jörg received his Ph.D. at Universität des Saarlandes for research around distributed databases and data analytics.
Jörg will host a 90-minute hands-on workshop: Graph Powered Machine Learning for Python Developers

Joshua Shinavier (San Francisco) @joshsh

As a co-founder of what is now Apache TinkerPop, Joshua Shinavier contributed to the first common APIs for graph databases, the original TinkerPop query language which influenced Gremlin, and the first tools which aligned the property graph and RDF data models, starting with neo4j-rdf-sail in 2008. As a Research Scientist at Uber, he led development of the Dragon data integration platform. Joshua is host of The Graph Show, and co-organizer of the Bay Area Category Theory meetup. Joshua holds a PhD in computer science from RPI's Tetherless World Constellation, where he focused on combining knowledge graphs with augmented reality.

Rosaria Silipo (Zürich) @DMR_Rosaria

Rosaria Silipo (LinkedIn), Principal Data Scientist at KNIME, is the author of 50+ technical publications, including her most recent book “Practicing Data Science: A Collection of Case Studies”. She holds a doctorate degree in bio-engineering and has spent 25+ years working on data science projects for companies in a broad range of fields, including IoT, customer intelligence, the financial industry, and cybersecurity.

Dr. Clair Sullivan (Breckenridge) @CJLovesData1

Dr. Clair Sullivan (LinkedIn / GitHub) is currently a graph data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015.
Clair will host a 90-minute hands-on workshop: Intro to Graph Data Science for Python Developers.

Ryan Wisnesky (Cambridge, Massachusetts)

Ryan Wisnesky (LinkedIn) obtained B.S. and M.S. degrees in mathematics and computer science from Stanford University and a Ph.D. in computer science from Harvard University, where he studied the design and implementation of provably correct software systems. Previously, he was a postdoctoral associate in the MIT department of mathematics, where he developed the categorical query language CQL. He currently leads open-source and commercial development of CQL as CTO of Conexus AI. He maintains an active collaboration with the information-integration department of IBM Research, where he contributed to the Clio, Orchid, and HIL projects.

Weidong Yang (San Francisco) @wdyang

Weidong Yang is the founder and CEO of Kineviz. He holds a doctorate in Physics and a master's in Computer and Information Science. After conducting theoretical and experimental research on quantum dots, he worked for 10 years as a product manager and R&D scientist in the semiconductor industry. He has been awarded 11 US patents and contributed to 20+ peer-reviewed publications.
Weidong is also co-founder of Kinetech Arts, a non-profit organization that brings dancers and engineers together to explore the creative potential of new technologies in making art.
Weidong will host the following talk: What Data Visualization can learn from Dance.

Take advantage of earlybird prices and buy your ticket now