Who will speak at Data Day Texas 2023

There are just a few rooms at the conference hotel left. We have a heavily discounted rate, and this is where the action is. Book a room now.

Plenary Keynote
Zhamak Dehghani (San Francisco) @zhamakd

Zhamak Dehghani (LinkedIn) is the CEO and founder of a stealth tech startup reimagining the future of data platforms with Data-Mesh-native technologies to get value from data rapidly, sustainably and at scale. She founded the concept of Data Mesh in 2018 and has since been implementing the concept and evangelizing it to the wider industry. She is the author of the O'Reilly book Data Mesh.
Zhamak serves on multiple tech advisory boards. She has worked as a technologist for over 24 years and has contributed to multiple patents in distributed computing communications. She is an advocate for the decentralization of all things, including architecture, data, and ultimately power.
Zhamak will be presenting the Plenary Keynote.

Data Engineering Keynote
Adi Polak (Israel) @AdiPolak

As Vice President of Developer Experience at Treeverse, Adi Polak shapes the future of data & ML technologies for hands-on builders. She also contributes to lakeFS, an open-source project that provides a Git-like interface for object stores. In her work, she brings her extensive industry research and engineering experience to bear in educating and helping teams design, architect, and build cost-effective data systems and machine learning pipelines that emphasize scalability, expertise, and business goals. Adi is a frequent worldwide presenter and the author of the upcoming O'Reilly book, Machine Learning With Apache Spark. She is regularly invited to serve on program committees and as an advisor for conferences like Data & AI Summit, Scale by the Bay, and others. Previously, she was a senior manager for Azure at Microsoft, where she focused on building advanced analytics systems and modern architectures. When Adi isn’t building data pipelines or thinking up new software architecture, you can find her on the local cultural scene or at the beach.

Database Keynote
Gwen Shapira (SF Bay Area) @gwenshap

For the last year or so, Gwen Shapira has been working on something amazing, which she will be able to share soon. Previously, Gwen was an engineering leader at Confluent, where, among other things, she managed the Cloud-Native Kafka team. She has almost two decades of experience working with code and customers to build scalable data architectures, integrating microservices, relational and big data technologies. Gwen is an author of two O'Reilly books: "Kafka: The Definitive Guide" and "Hadoop Application Architectures". Gwen is also a committer on the Apache Kafka and Apache Sqoop projects. When Gwen isn't coding or arguing about protocols, you can find her pedaling her bike, exploring the roads and trails of California and beyond.

Gwen will be giving the Database Keynote:
Things Databases Don't Do… But Actually Should

Scaling Python Keynote
Holden Karau (San Francisco) @holdenkarau

Holden Karau (Wikipedia / LinkedIn) is a queer transgender Canadian, Apache Spark committer, Apache Software Foundation member, and an active open source contributor. As a software engineer, she’s worked on a variety of distributed computing, search, and classification problems at Apple, Google, IBM, Alpine, Databricks, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics in computer science. Outside of software, she enjoys playing with fire, welding, riding scooters, eating poutine, and dancing. Holden is the author of multiple O'Reilly publications, including Learning Spark, High Performance Spark, Kubeflow for Machine Learning, as well as the upcoming Scaling Python with Dask and Scaling Python with Ray.

Geospatial Keynote
Bonny McClain (Greensboro, North Carolina) @datamongerbonny

Dr. Bonny McClain is a geospatial analyst and self-described human geographer and social anthropologist. Dr. McClain applies advanced data analytics, including data engineering and geo-enrichment, to poverty, race, and gender discussions. Her research targets judgments about structural determinants, racial equity, and elements of intersectionality to illuminate the confluence of metrics contributing to poverty. Moving beyond ZIP codes to explore apportioned socioeconomic data based on underlying population data leads to discovering novel location-based variables that bring more context to complex data questions. Bonny is a member of the National Press Club, 500 Women Scientists, the Urban and Regional Information Systems Association (URISA), and Investigative Reporters and Editors, and a former member of the Tableau Speaker’s Bureau, allowing access to a wide variety of health policy and health economic discussions. Bonny is the author of the upcoming O'Reilly publication Python for Geospatial Data Analysis: Theory, Tools, and Practice for Location Intelligence.

Bonny will be presenting the Geospatial Keynote:
"one ant, one bird, one tree"...

Data Engineering - They wrote the book.

Joe Reis and Matt Housley of Ternary Data are co-hosts of the popular Monday Morning Data Chat (Spotify / Apple) as well as the Data Nerd Herd podcast. They are also co-authors of the bestselling O'Reilly book: Fundamentals of Data Engineering.

Joe Reis (Salt Lake City)

Joe Reis (LinkedIn), Co-Founder and CEO of Ternary Data, is a “recovering data scientist” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have spanned statistical modeling, forecasting, machine learning, data engineering, data architecture, and everything in between. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah.

Matt Housley (Salt Lake City)

Co-Founder / CTO of Ternary Data as well as fellow “Recovering Data Scientist,” Matt Housley is also a “Reformed Academic,” holding a PhD in Math and dual master's degrees in Math and Physics. It was only natural that he began his career in academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. His STEM background in combination with his knack for teaching makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah.

Data Quality Keynote
Chad Sanderson (Seattle)


If you follow online discussions on Data Quality or Data Products, you’ve no doubt come across Chad Sanderson. He’s a featured speaker at many conferences as well as a frequently requested interviewee on data podcasts. Most recently Chad was Head of Data at Convoy, where he oversaw the end-to-end data platform team (including data engineering, machine learning, experimentation, and data pipelines) as well as a multitude of other teams, all in service of helping thousands of carriers ship freight more efficiently. Chad has built everything from feature stores, experimentation platforms, metrics layers, and streaming platforms to analytics tools, data discovery systems, and workflow development platforms. He’s implemented open source and SaaS products (early and late-stage) and has built cutting-edge technology from the ground up.
Chad’s current initiative is the Data Quality Camp — the details of which he will be sharing while at Data Day Texas.

Healthcare Data Keynote
Andrew Nguyen (SF Bay Area)

Dr. Andrew Nguyen has been working at the intersection of healthcare data and AI for more than a decade. He quickly discovered graph databases and has been using them to harmonize disparate data sources for nearly as long. He has worked for a variety of organizations ranging from academia to startups. Andrew is currently a medical informatics architect and leads the Data Architecture and Informatics capability for real-world data at one of the largest biopharma companies in the world, where he is designing scalable solutions to harmonize healthcare RWD sources for all levels of analytics from statistics to machine learning. Prior to his current role, he served as chair of the Department of Health Professions, and director of the MS in Health Informatics program at the University of San Francisco. He also taught classes in medical informatics, semantic interoperability, machine learning, clinical natural language processing, and biosignal/time series data analysis. Andrew holds a PhD in biological and medical informatics from the University of California, San Francisco (UCSF) and a BS in electrical and computer engineering from the University of California, San Diego (UCSD). In his spare time, he enjoys photography, hiking/backpacking, and SCUBA diving, and serves as the technical rescue coordinator for his local Search and Rescue team. Andrew is the author of the recently published O'Reilly book Hands-On Healthcare Data.

NLP Keynote
John Bohannon (SF Bay Area) @bohannon_bot

John Bohannon (Wikipedia / LinkedIn) is currently Director of Science at Primer, an artificial intelligence company headquartered in San Francisco. Bohannon is widely known as a science journalist, most notably for his "Gonzo Scientist" online series at Science Magazine and his creation of the annual Dance Your PhD contest. Bohannon is involved in the effective altruism movement. In July 2015 he became a member of Giving What We Can, an organization whose members pledge to give at least 10% of their income to effective charities. Bohannon completed his Doctor of Philosophy degree in molecular biology at the University of Oxford in 2002, supervised by Paul Rainey.
To see why we invited John to be our NLP Keynote for 2023, check out the following interviews:
Data Exchange Podcast (Episode 144): John Bohannon,
Multimodal, Multi-Lingual NLP at Hugging Face with John Bohannon and Douwe Kiela,
Talk with John Bohannon, Director of Science at Primer,
Trends in NLP with John Bohannon,
John Bohannon Interview - Taming arXiv with Natural Language Processing.

Data Lakehouse Keynote
Bill Inmon (Castle Rock, Colorado)

Bill Inmon (Wikipedia / LinkedIn) is an American computer scientist, recognized by many as the father of the data warehouse. Inmon wrote the first book, held the first conference, wrote the first column in a magazine and was the first to offer classes in data warehousing. Inmon created the accepted definition of what a data warehouse is - a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions. Bill is among the most prolific and well-known authors in the big data analysis, data warehousing and business intelligence arena. In addition to authoring more than 50 books and 650 articles, Bill has been a monthly columnist with the Business Intelligence Network, EIM Institute and Data Management Review. In 2007, Bill was named by Computerworld as one of the “Ten IT People Who Mattered in the Last 40 Years” of the computer profession.

Data Modeling Keynote
Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is the co-founder of Capsenta, a spin-off from his research, and the Senior Director of Capsenta Labs. He holds a PhD in Computer Science from the University of Texas at Austin. His research interests lie at the intersection of Logic and Data, in particular between the Semantic Web and Relational Databases for data integration, ontology-based data access and semantic/graph data management. Juan is the recipient of the NSF Graduate Research Fellowship, received 2nd Place in the 2013 Semantic Web Challenge for his work on ConstituteProject.org, Best Student Research Paper at the 2014 International Semantic Web Conference and the 2015 Best Transfer and Innovation Project awarded by the Institute for Applied Informatics. Juan is the General Chair of AMW 2018, was the PC chair of the ISWC 2017 In-Use track, is on the Editorial Board of the Journal of Web Semantics, is a member of multiple program committees (ISWC, ESWC, WWW, AAAI, IJCAI) and is co-creator of the Consuming Linked Data Workshop series. Juan is a member of the Graph Query Languages task force of the Linked Data Benchmark Council (LDBC) and has also been an invited expert member and standards editor at the World Wide Web Consortium (W3C).

Hala Nelson (Alexandria, Virginia)

Hala Nelson (LinkedIn) is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in Mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to her work at James Madison University, she was a postdoctoral Assistant Professor at the University of Michigan-Ann Arbor. Her research is in the areas of Materials Science, Statistical Mechanics, Inverse Problems, and the Mathematics of Machine Learning and Artificial Intelligence. Her favorite subjects are Optimization, Numerical Algorithms, Mathematics for AI, Mathematical Analysis, Numerical Linear Algebra and Probability Theory. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well, or is trying to show off. Other facts: Hala Nelson grew up in Lebanon, during the time of its brutal civil war. She lost her hair at a very young age in a missile explosion. This event and many that followed shaped her interests in human behavior, the nature of intelligence, and AI. Her father taught her Math, at home and in French, until she graduated high school. Her favorite quote from her father about math is, "It is the one clean science."
Hala is author of the upcoming O'Reilly book: Essential Math for AI.

Amy Hodler (Kettle Falls, Washington) @amyhodler

Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, and Cray. At RelationalAI, she’s the Graph Evangelist and Sr. Director of Product Marketing. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O'Reilly books, Graph Algorithms: Practical Examples in Apache Spark and Neo4j, and Knowledge Graphs: Data in Context for Responsive Businesses.

Heather Hedden (Boston) @hhedden

Heather Hedden (LinkedIn) has been a taxonomist for over 26 years in various organizations and as an independent consultant. She is currently a data and knowledge engineer on the professional services team of Semantic Web Company, vendor of PoolParty software. She previously worked as a taxonomist at Cengage Learning, Gale, Viziant, First Wind, and Project Performance Corporation. Heather has designed and developed taxonomies, ontologies, and metadata schema for internal and externally published content. She gives workshops on taxonomy creation at conferences, as corporate training, and through an independently offered online course. Heather is the author of The Accidental Taxonomist.
Heather will host the following workshop:
Introduction to Taxonomies for Data Scientists (workshop)

Sanghamitra Deb (SF Bay) @sangha_deb

Sanghamitra Deb is a Data Scientist at Chegg, where she works on problems related to school and college education to sustain and improve the learning process. Her work involves recommendation systems, graph modeling, deep NLP analysis, data pipelines and machine learning. Previously, Sanghamitra was a data scientist at Accenture, where she worked on a wide variety of problems related to data modeling, architecture and visual storytelling. Sanghamitra is active in Data Science outreach and believes in applying analytics to a range of domains such as pharma, HR, customer support, market research, etc. Prior to being a data scientist, she was an astrophysicist who studied the structure of the universe by modeling galaxy clusters.

Sanghamitra will present the following session:
Computer Vision Landscape at Chegg: Present and Future

Janet Six (DFW) @janetmsix

Janet Six is a Product Manager at Tom Sawyer Software, where she helps companies design easier-to-use products within their financial, time, and technical constraints. For her research in information visualization, Janet was awarded the University of Texas at Dallas Jonsson School of Engineering Computer Science Dissertation of the Year Award. She was also awarded the prestigious IEEE Dallas Section 2003 Outstanding Young Engineer Award. Her work has appeared in the Journal of Graph Algorithms and Applications and the Kluwer International Series in Engineering and Computer Science. The proceedings of conferences on Graph Drawing, Information Visualization, and Algorithm Engineering and Experiments have also included the results of her research.
Janet will present the following sessions:
Visualizing Connected Data as It Evolves Over Time
Where Is the Graph? Best practices for extracting data from unstructured data sources for effective visualization and analysis
#graphday #visualization

Dean Wampler (Chicago) @deanwampler

Dean Wampler is an expert in data engineering for scalable, streaming data systems and applications of machine learning and artificial intelligence (ML/AI). He is the Director of Engineering for the Accelerated Discovery Platform at IBM Research. Previously, he worked at Domino Data Lab on their data science platform, worked on scalable ML with Ray at Anyscale, and led an engineering team at Lightbend developing distributed streaming data systems using Apache Spark, Apache Kafka, Kubernetes, and other tools. Dean is the author of several books, reports, and videos for O'Reilly Media, including Programming Scala, Third Edition, Fast Data Architectures for Streaming Applications, What Is Ray?, Functional Programming for Java Developers, and Programming Hive (coauthor). He is a contributor to several open source projects and a frequent conference speaker and co-organizer. Dean has a Ph.D. in Physics from the University of Washington.
Dean will present the following session: Reinforcement Learning with Ray RLlib

Ryan Boyd (Boulder) @ryguyrg

Ryan Boyd (LinkedIn) is a Boulder-based software engineer, data + authNZ geek and technology executive. He's currently a co-founder at MotherDuck, where they're making data analytics fun, frictionless and ducking awesome. He previously led developer relations teams at Databricks, Neo4j and Google Cloud. He's the author of O'Reilly's Getting Started with OAuth 2.0. Ryan advises B2B SaaS startups on growth marketing and developer relations as a Partner at Hypergrowth Partners. Prior to leading the Google Cloud Developer Relations team, he spent 7 years at Google working on 20+ different developer products and was the co-founder of Google Code Labs, which aimed to improve quality and stability of Google's developer products. Ryan graduated with a degree in Computer Science from Rochester Institute of Technology (RIT), where he later worked full-time building web applications + APIs and architecting the central web hosting platform.

Ryan will present the following session:
Your laptop is faster than your data warehouse. Why wait for the cloud? (DuckDB)
#database

Haikal Pribadi (London, England) @haikalpribadi

Haikal Pribadi is the Founder and CEO of Vaticle, dedicated to building a strongly-typed database, TypeDB. His spark for computer science began at the Monash Intelligent Systems Lab, where he worked on robotics systems that were later adopted by NASA's JPL. He furthered his postgraduate studies in Advanced Computer Science at the University of Cambridge. In industry, Haikal became the youngest Algorithm Expert behind Quintiq’s Optimisation Technology, powering some of the world’s most challenging optimisation problems. He now works at Vaticle where he builds TypeDB, a strongly-typed database that empowers engineers to solve complex problems. TypeDB was awarded Product of the Year 2017 by the University of Cambridge Computer Lab.

Haikal will present the following session:
Introducing a Strongly-typed Database: TypeDB & TypeQL

Tomás Sabat Stöfsel (London, England) @tasabat

Tomás Sabat is the Chief Operating Officer at Vaticle. He works closely with TypeDB's open source and enterprise users who use TypeDB to build applications in a wide number of industries including financial services, life sciences, cyber security and supply chain management. A graduate of the University of Cambridge, Tomás has spent the last seven years founding and building businesses in the technology industry.
Tomás will present the following sessions:
What You Can't do With Graph Databases
Enabling the Computational Future of Biology
#typedbday

Sivaram Arabandi (Houston) @ontomd

Sivaram Arabandi is a surgeon and an informaticist working at the intersection of clinical ontologies, semantic web and healthcare AI. He started ONTOPRO in 2013 and provides expertise in clinical data standards such as SNOMED, LOINC, RxNorm and ICD; building and using semantic models, text mining applications, as well as data integration and interoperability strategy to leverage structured and unstructured data for advanced big data analytics. He is currently involved with Optum Health's Clinical Decision Support (CDS) project to operationalize published clinical guidelines using ontologies and open data standards such as FHIR and CPG for point-of-care decision support. Prior to this, Sivaram was the Director for Smart Content Strategy at Elsevier and headed Clinical Terminology services responsible for developing EMMeT (Elsevier Merged Medical Taxonomy) at the core of the ClinicalKey product. He worked on Mayo Clinic's MayoExpertAdvisor (MEA) knowledge delivery tool and Knowledge Enriched Data (KED) projects integrating longitudinal clinical data spanning multiple years to decades. He was a visiting scientist at the National Library of Medicine (NLM), co-chair of the 5th International Conference on Biomedical Ontology (ICBO’14) and current co-chair for ICBO-2022. He collaborates with other researchers on the development of open clinical ontologies in areas of general medicine (OGMS), infectious diseases (IDO), newborn screening (ONSTR) and sleep medicine (SDO). He served on the External Review Board for Mayo Clinic's Knowledge Content Management System (KCMS) initiative and was a scientific advisor for Emory University’s Newborn Screening Follow-up Data Integration Collaborative (NBSDC).
Sivaram will present the following session: Ontology in Healthcare: a survey

Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science, as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and foreign governments. Jans recently authored an IEEE article on “Enterprise Knowledge Graphs”.
Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.

Tim Berglund (Denver) @tlberglund

Tim Berglund is a teacher, author, and technology leader with StarTree, where he serves as the Vice President of Developer Relations. For over a decade, Tim has been a first-call speaker at conferences around the world. He can also be found on YouTube, where he has a reputation for explaining complex technology topics in an accessible way. He tweets as @tlberglund, blogs every few years at timberglund.com, and lives in Littleton, CO, USA. He has three grown children and two grandchildren, a fact about which he is rather excited.

Tim will be presenting the following session: An Introduction to Apache Pinot

Jeff Carpenter (Scottsdale, Arizona) @jscarp

Jeff Carpenter (LinkedIn), co-author of Cassandra: The Definitive Guide (3rd edition available soon!), has worked on large-scale systems in the defense and hospitality industries. Jeff leads the Developer Advocate team at DataStax, where he uses his background in system architecture, microservices and Apache Cassandra to help empower developers and operations engineers to build distributed systems that are scalable, reliable, and secure.

Zachary Carrico (Austin)

Zachary is currently developing a machine learning platform for blood-based cancer diagnostics company Freenome. Before joining Freenome, Zac did machine learning engineering at job-search engine company Indeed. He is passionate about simplifying and extending the use of machine learning in biotechnology. He received his PhD from Berkeley for work on bioengineering, and loves finding ways to make machine learning research fast and reproducible.

Rosaria Silipo (Zürich) @DMR_Rosaria

Rosaria Silipo (LinkedIn), Head of Data Science Evangelism at KNIME, has been a researcher in applications of Data Science and Machine Learning for over a decade. Her field of experience traverses many domains, including biomedical systems, IoT, customer intelligence, financial services, social media, cybersecurity, and automatic speech processing.
Rosaria is the author of 50+ technical publications, including her most recent book Practicing Data Science: A Collection of Case Studies. She holds a doctorate degree in bio-engineering from Università degli Studi di Firenze.

Brandon Baylor (Houston)

Brandon Baylor is a Systems Engineer at Chevron, where he brings the principles and processes of composition to transform the energy industry. His multi-disciplinary experience working with international business units spans design, operations, safety, software, and human systems. As a longtime engineer, he is exploring ways to design and build systems that can deal with complexity in an integrated way and at a global scale. Brandon received his B.S. in Petroleum Engineering from Marietta College and his M.S. from Massachusetts Institute of Technology, where he is a System Design & Management Fellow. Brandon is also an author. His upcoming book, A Categorical Defense of Our Future, is set to be launched this August. In it, he and his co-author imagine a new foundation for engineering and point the way toward the complete paradigm shift that is required in order to save us from ourselves. It is a firsthand story of the difficulties of living in harmony with the systems we create.
Brandon will be presenting a session in the Oil/Gas/Energy track.

Max De Marzi (Chicago) @maxdemarzi

Max De Marzi (LinkedIn) is addicted to graphs. You may consider him a graph database enthusiast. He spent 8 years at Neo4j and recently made the switch to AWS Neptune. He is a blogger and an open source contributor, both activities that stem from his passion: teaching people about graphs. He is always open to talk graphs, always learning, and nothing thrills him more than finding easy graph solutions to hard relational problems. He has been helping people get to the "graph epiphany" for over a decade. He is an avid graph database modeler, leveraging his knowledge of mechanical sympathy and experience to deliver dozens of graph use cases over the years.
Max will present the following session: Outrageous ideas for Graph Databases
#graphday

Yue Cathy Chang (Sunnyvale) @yuec

Yue Cathy Chang is an executive recognized for thought leadership and execution in digital transformation. She is passionate about addressing business challenges and often finds herself and her team "parachuting" into situations to tackle challenging and meaningful data needs. Cathy has led teams and functions at blue-chip enterprises as well as startups, across financial services and high-tech industries, working with leaders of centralized and distributed data teams, all betting the next product differentiation on data. She is currently an AVP in banking and financial services at an American multinational technology corporation.
Cathy holds MS and BS degrees in electrical and computer engineering from Carnegie Mellon University, MBA and MS degrees from MIT, and two granted US patents. She's a co-author, with Jike Chong, of the Manning publication How to Lead in Data Science.

Cathy will co-present the following Data Science sessions:
For the overwhelmed data professionals: What to do when there is so much to do?
Data Professional's Career: Techniques to Practice Rigor and Avoid Ten Mistakes?

Jike Chong (Sunnyvale) @jikechong

Jike Chong is an executive who nurtures teams and crafts cultures to produce billion-dollar business impacts. He built and grew multiple high-performing data functions in public and private companies and nurtured dozens of ambitious individual contributor data scientists into leaders; some have gone on to lead teams of more than 70 data scientists. Jike was part of the executive team that took Yiren Digital Ltd public on NYSE. He also expanded and led the data team as the chief data scientist at Acorns, designed and executed a project predicting venture investment risks at Silver Lake, and led the Hiring Marketplace Data Science team at LinkedIn, serving a business line with $4B a year in revenue.
Jike received his bachelor’s and master’s degrees in electrical and computer engineering from Carnegie Mellon University and a PhD in electrical engineering and computer science from the University of California, Berkeley. He's a co-author, with Yue Cathy Chang, of the Manning publication How to Lead in Data Science.

Jike will co-present the following Data Science sessions:
For the overwhelmed data professionals: What to do when there is so much to do?
Data Professional's Career: Techniques to Practice Rigor and Avoid Ten Mistakes?

Shirshanka Das (Santa Clara) @shirshanka

Shirshanka Das (LinkedIn) is co-founder and CEO of Acryl Data, the company which is commercializing the open source DataHub project, a real-time metadata platform used by LinkedIn, Expedia, Saxo Bank, Klarna, Viasat, and many others. Prior to founding Acryl, he was the overall architect for Big Data at LinkedIn from 2010 to 2020, and responsible for creating the metadata and data management strategy at the company. As part of this, he founded the DataHub project and shaped its evolution to a metadata platform that powers DataOps, MLOps, productivity, and governance use cases at LinkedIn. He is also a PMC and committer on the Apache Gobblin project which manages 100PB+ of data assets at rest at LinkedIn, and is deployed in production at other large companies like Verizon, PayPal etc. Prior to LinkedIn, Shirshanka worked on high-performance serving systems at Yahoo and PayPal. Shirshanka has a Ph.D. in Computer Science from UCLA.

Shirshanka will present the following session: In Search of the Control Plane for Data.

Brian Hall (Austin) @brian_w_hall

Brian Hall leads engineering and development as VP of Cyber Applications at Qomplx, where they build high-capacity stream processing applications to secure corporate networks and detect potential attacks. Prior to that, Brian led the Graph and Analytics Practice at Expero, utilizing a wide array of graph engines including JanusGraph, DataStax, Neo4j, TigerGraph and Neptune. Brian has been developing software and consulting for over 25 years and holds a B.S. in Computer Science from Vanderbilt University and an M.S. in Computer Science from DePaul University. In their free time, Brian and his wife, Nicole, enjoy watching their kids slowly turn into the adults they will become, traveling, eating good food with good wine, and staying active in Austin with all the outdoor activities and great live music.

Brian will present the following Streaming Data session: Protecting Against Ransomware Attacks using Kafka, Flink and Graph.

James Hansen (Houston)

James Hansen works as a Wells Engineer in the Upstream function at Chevron where he employs his hybrid digital and engineering skillset to transform traditional workflows by creating tools that unite disparate processes, conducted by cross-functional teams across the globe. He leads Chevron’s Systems Engineering CoP’s book club, currently exploring the advantages of categorical systems to address complexity at scale.
Before holding the title of Wells Engineer, James held the title of Lead Field Engineer on the Deepwater Bigfoot TLP Project as well as Global Performance Engineer. During his time as a Performance Engineer, he was responsible for several initiatives that are now standard practice in the O&G industry. They include probabilistic cost and time estimation, multi-level abstraction and normalization, real-time data ingestion and analysis, business intelligence visualization, and project lifecycle management solutions. James received his B.S. in Mechanical Engineering from Texas Tech University, where he competed in Formula SAE & NASA’s TSGC Design Competition. James is from, and currently resides in, Houston, Texas, where he lives with his dog Scout.
James will be presenting a session in the Oil/Gas/Energy track.

David Hughes (Seattle)

David Hughes is the Principal Graph Consultant for Graphable. He has 10 years of experience designing and building graph solutions which surface meaningful insights. His background includes clinical practice, medical research, software development, and cloud architecture. David has worked in healthcare and biotech within the intensive care, interventional radiology, oncology, cardiology, and proteomics domains. He enjoys endurance running, hiking, and spending time with his family in the outdoors when he is not enabling clients to have data epiphanies from their complex data.
David will be presenting the following session: Clinical trials exploration: surfacing a clinical application from a larger Bio-Pharma KnowledgeGraph
#graphday

Joey Jablonski (Austin) @jrjablo

Joey Jablonski (LinkedIn) is VP of Analytics at Pythian, where he leads strategic engagements assisting customers in developing their data strategy, defining and executing on data governance programs, and building analytical models to power the modern data-driven organization. Prior to Pythian, Joey was VP of Product at Manifold, where he brought a product mindset to all engagements, allowing for quick delivery of value in any project and building over time to drive adoption of new data-centric capabilities in an organization. Joey led engagements across industries including high tech, pharmaceuticals, and the federal government. Before Manifold, Joey held executive leadership positions at Northwestern Mutual, iHeartMedia, and Cloud Technology Partners. He brings 20+ years of experience in software engineering, high performance computing, cyber security, data governance, and data engineering.

Corey Lanum (Boston) @corey_lanum

Corey Lanum (LinkedIn) has a distinguished background in graph visualization. Over the last 15 years he has managed technical and business relationships with dozens of the largest defense and intelligence agencies in North America, in addition to working with many security and anti-fraud organizations in private industry. Prior to joining Cambridge Intelligence as their US Manager, Corey helped the customers of i2 (now IBM) and SS8 solve their most complex graph data challenges.
Corey is the author of Visualizing Graph Data from Manning Publications.

William Lyon (SFBay) @lyonwj

William Lyon (LinkedIn / blog) is a software developer at Neo4j. As an engineer on the Developer Relations team, he works primarily on integrating Neo4j with other technologies, building demo apps, helping other developers build applications with Neo4j, and writing documentation. Prior to joining Neo4j, William worked as a software developer for several startups in the real estate software, quantitative finance, and predictive API fields. William holds a Master's degree in Computer Science from the University of Montana. William is the author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j and co-host of the GraphStuff.FM podcast.

William will lead the following 90-minute workshop: Hands-On Introduction To GraphQL For Data Scientists & Developers

Dave McComb (Ft Collins) @semanticarts

Dave McComb (Linkedin) is the President and co-founder of Semantic Arts. He and his team help organizations uncover the meaning in the data from their information systems. Dave is also the author of "The Data-Centric Revolution", "Software Wasteland" and "Semantics in Business Systems". For 20 years, Semantic Arts has helped firms of all sizes in this endeavor, including Amgen, DuPont, Procter & Gamble, Goldman Sachs, Schneider Electric, LexisNexis, Dun & Bradstreet, and Morgan Stanley. Prior to Semantic Arts, Dave co-founded Velocity Healthcare, where he developed and patented the first fully model-driven architecture. Prior to that, he was a part of the integration problem.

Dave will present the following Data Integration session:
Zero Copy Integration

Patrick McFadin (SF Bay) @patrickmcfadin

Patrick McFadin (Linkedin) is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of DataStax products successful. He has also worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.

Alex Merced (Winter Park, FL) @alexmerced

Alex Merced (Linkedin) is a Developer Advocate for Dremio with a history of creating content to enable developers of all types through his personal projects like DevNursery.com, The Web Dev 101 Podcast, and the DataNation podcast. Alex Merced has been a developer with companies like Crossfield Digital, CampusGuard, GenEd Systems and others along with being an Instructor for General Assembly Bootcamps.
Alex will present the following sessions:
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg and the Right to Be Forgotten

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (Linkedin) is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he also recently spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Jonathan will be presenting the following session: How to build someone we can talk to.

Andy Petrella (Liège, Belgium) @noootsab

Andy Petrella is an entrepreneur with a Mathematics and Distributed Data background. Andy is an early evangelist of Apache Spark and is known in the data community as the creator of the Spark Notebook. He is also the author of the O'Reilly books “What is Data Observability” and “What is Data Governance”, and a trainer for “Distributed Data Science”, “Data Lineage Essentials”, and “Machine Learning Model Monitoring”. Andy is also the founder and CEO of Kensu, a data observability solution implementing the Data Observability Driven Development (DODD) method.

Brent Schneeman (Austin) @schnee

Brent Schneeman swipes right for science and seeks to strengthen the scientific method muscle in whatever group he finds himself. Operating from a “lead by example” mindset, Brent frequently rolls up his sleeves and writes code to help bring predictive models to business problems. Passionate about building great teams and cultures, he’s pretty sure that a “servant leadership” posture is the right posture in his personal and professional lives.
Professionally, he tends to look after teams of data- and machine-learning-oriented contributors (analysts, scientists, and engineers) who collaborate on diverse sets of machine learning projects such as continuous optimization, customer churn prediction, fraud detection, and applying diverse techniques to unstructured data. Brent has worked at Vrbo, PayPal, Visa, and other small and large companies in individual contributor or management roles, mostly in product development organizations. He is currently attempting to make the world safe for machine learning with Alegion.
A storyteller, Brent has presented at the UT McCombs School, South By Southwest, NLP Day, multiple Data Days, and various meetups. He has one degree in Mathematics and another in Electrical Engineering and lives in Austin, Texas with his wife, three kids, two cats and one dog. While he spends most of his free time mowing the lawn, he enjoys making photographs, running around downtown, and occasionally trying to make sense of neural network architectures.

Michael Berthold (Konstanz)

Michael Berthold is currently president of KNIME.com AG and co-creator of KNIME (wikipedia entry), the open analytics platform used by thousands of data experts around the world. Since August 2003, Michael has been the Nycomed-Chair for Bioinformatics and Information Mining at Konstanz University, Germany, where his research focuses on using machine learning methods for the interactive analysis of large information repositories in the Life Sciences. Previously he held positions in both academia (Carnegie Mellon, UC Berkeley) and industry (Intel, Tripos).
Michael is Past President of the North American Fuzzy Information Processing Society, Associate Editor of several journals and the President of the IEEE Systems, Man, and Cybernetics Society. He has been involved in the organization of various conferences, most notably the IDA-series of symposia on Intelligent Data Analysis and the conference series on Computational Life Science. Together with David Hand he co-edited the textbook Intelligent Data Analysis: An Introduction, which has recently appeared in a completely revised second edition. He is also co-author of Guide to Intelligent Data Analysis (Springer Verlag), which appeared in summer 2010. When time permits, Michael still writes code.

Sean Robinson (Charlotte)

Sean Robinson is a versatile data scientist with several years of experience optimizing data processes and building intelligent data systems. Specifically, he specializes in the use of graph data science and Neo4j to abstract complex systems within a domain into highly dimensional, interconnected knowledge graphs, uncovering novel insights that would otherwise remain dormant in other data structures. Sean currently serves as Lead Data Scientist at Graphable and also creates and teaches new network science courses in the University of North Carolina at Charlotte’s Data Science graduate program, where he instructs the next generation of data scientists on how to integrate graph data science into their toolkit.
Sean will be presenting the following workshop: Intro to Graph Data Science for Python Developers.

Michael Uschold (Seattle, WA) @UscholdM

Michael Uschold, Senior Ontology Consultant at Semantic Arts, has over twenty-five years’ experience in developing and transitioning semantic technology from academia to industry. He pioneered the field of ontology engineering, co-authoring the first paper and giving the first tutorial on the topic in 1995 in the UK.
As a senior ontology consultant at Semantic Arts since October 2010, Michael trains and guides clients to better understand and leverage semantic technology using knowledge graphs. He has built commercial enterprise ontologies in digital asset management, finance, healthcare, legal research, consumer products, electrical devices, manufacturing and corporation registration. More recently he has focused on semantic application development using SPARQL for application code and R2RML for converting relational data into a knowledge graph.
During 2008-2009, Uschold worked at Reinvent on a team that developed a semantic advertising platform that substantially increased revenue. As a research scientist at Boeing from 1997-2008 he defined, led and participated in numerous projects applying semantic technology to enterprise challenges. He is a frequent invited speaker and panelist at national and international events, and serves on the editorial board of the Applied Ontology Journal. He received his Ph.D. in AI from Edinburgh University in 1991 and an MSc. from Rutgers University in Computer Science in 1982.
Michael will lead the following 90-minute workshop: Ontology for Data Scientists

Weidong Yang (San Francisco) @wdyang

Weidong Yang is the founder and CEO of Kineviz. He holds a doctorate in Physics and a Master's in Computer and Information Science. After conducting theoretical and experimental research on quantum dots, Weidong worked for 10 years as a product manager and R&D scientist in the semiconductor industry, where he invented Diffraction-based Overlay technology to improve the manufacturing precision of silicon wafers. He has been awarded 11 US patents and has contributed to 20+ peer-reviewed publications.
Weidong also co-founded Kinetech Arts, a non-profit organization that brings dancers and engineers together to explore the creative potential of making art via new technologies.