Who spoke at Data Day Texas 2024

Plenary Keynote
Sol Rashidi (NYC)

With eight patents issued and awards that include “Top 10 Women in Data & Applied AI”, “50 Most Powerful Women in Tech”, “Global 100 Data Power List”, “Top 20 CDOs”, “Top 100 Innovators in Data & Analytics” and “Forbes AI Maverick of the 21st Century”, Sol Rashidi (LinkedIn) is an energetic business leader and a goal-oriented technologist. She is skilled at coupling her technical acumen with storytelling to articulate business value for early-stage startups and enterprises that are leaning into data, AI, and technology as a competitive advantage while wanting to preserve the legacy upon which they were founded.
Sol is the former CDAO for Merck, EVP & CDO for Sony Music, CDO for Royal Caribbean, AI, Data & Analytics Partner at E&Y, and Chief Information & Digital Officer for Soli, and was a key player in launching IBM's Gen-1 “Watson”, pioneering IBM’s early advances in Enterprise Information Management.

Sol will be presenting the Plenary Keynote:
Practitioner turned Executive; lessons I learned about how decisions are really made with data ecosystems.

MLOps Keynote
Mikiko Bazeley (San Francisco) @BazeleyMikiko

Mikiko Bazeley (LinkedIn / YouTube / Substack / GitHub) is currently Head of AI Developer Relations at Labelbox. Most recently, she was Head of MLOps at Featureform. Mikiko has worked as an engineer, data scientist, and data analyst for companies like Mailchimp (Intuit), Teladoc, Sunrun, and Autodesk, as well as a handful of early-stage startups. Mikiko leverages her knowledge and experiences as a practitioner, mentor, and strategist to contribute MLOps & production ML content through LinkedIn, YouTube, & Substack, as well as partnering with companies in the ML ecosystem like Nvidia. Her main goals are to help data scientists deploy better models faster, to help ML platform engineers develop robust & scalable ML systems & stacks without breaking the bank, and to bring the delight back into building ML products.

Mikiko will be presenting the MLOps Keynote:
MLOps: Where do we go from here?

Machine Learning Keynote
Susan Shu Chang (Toronto) @susan_shuc

Susan Shu Chang (LinkedIn) is currently Principal Data Scientist at Elastic. Originally trained in Economics, Susan is a 5x PyCon speaker, founder of the indie game studio Quill Game Studios, and organizer of the 3700+ member Toronto Women's Data Group. Susan is also author of the upcoming O'Reilly book: Machine Learning Interviews. To learn how she finds time for all this and more, check out her personal site, susanshu.com, for her writings on focus optimization and daily routines. O'Reilly Author

Susan will be presenting the Machine Learning Keynote session:
Distilling the meaning of language: How vector embeddings work
#machine-learning

Data Architecture Keynote
Jessica Talisman (Santa Cruz)

Jessica Talisman is a taxonomist, ontologist, information architect, and professional data wrangler. Over her 25 years of experience in the information & data architecture world, Jessica has worked in galleries, libraries, museums, the federal government, e-commerce, and tech, and is currently the Information Architect for Sellers Platform at Amazon. Jessica holds a Master of Library and Information Science and a Master's in Teaching. Check out Jessica’s recent interviews on the Monday Morning Data Chat and Discovering Data.

Jessica will be presenting the Data Architecture Keynote session:
What Data Architects and Engineers can learn from Library Science

Data Engineering Keynote
Jesse Anderson (Lisbon) @jessetanderson

One of our perennially requested speakers, Jesse Anderson (Linkedin) is author of the oft-cited APress book Data Teams. As managing director of the Big Data Institute, Jesse works with companies ranging from startups to Fortune 100 companies. As an expert trainer known for his novel teaching practices, Jesse has taught over 30,000 people the skills to become successful data engineers. Jesse is published on O’Reilly and Pragmatic Programmers. He has been covered in prestigious publications such as The Wall Street Journal, CNN, BBC, NPR, Engadget, and Wired. Check out Jesse's new deep dive podcast: Unapologetically Technical, and learn more about Jesse at Jesse-Anderson.com.

Jesse will present the Data Engineering Keynote:
The State of Data Engineering ... and not repeating history

Clair Sullivan (Breckenridge, Colorado) @ProfCJSullivan

Dr. Clair Sullivan is currently the Founder and CEO of Clair Sullivan and Associates, a company dedicated to providing data science consulting services. Prior to starting her company, she was the Director of Data Science at Vail Resorts leading a team of data scientists and machine learning engineers providing production models for operations and marketing. Previously she was a data science advocate at Neo4j, working to expand the community of data scientists and machine learning engineers using graphs to solve challenging problems. She received her doctorate degree in nuclear engineering from the University of Michigan in 2002. After that, she began her career in nuclear emergency response at Los Alamos National Laboratory where her research involved signal processing of spectroscopic data. She spent 4 years working in the federal government on related subjects and returned to academic research in 2012 as an assistant professor in the Department of Nuclear, Plasma, and Radiological Engineering at the University of Illinois at Urbana-Champaign. While there, her research focused on using machine learning to analyze the data from large sensor networks. Deciding to focus more on machine learning, she accepted a job at GitHub as a machine learning engineer while maintaining adjunct assistant professor status at the University of Illinois. In 2021 she joined Neo4j as a Graph Data Science Advocate. Additionally, she founded a company, La Neige Analytics, whose purpose is to provide data science expertise to the ski industry. She has authored 4 book chapters, over 20 peer-reviewed papers, and more than 30 conference papers. Dr. Sullivan was the recipient of the DARPA Young Faculty Award in 2014 and the American Nuclear Society's Mary J. Oestmann Professional Women's Achievement Award in 2015.

Clair will be presenting the following session:
Ensuring Success for your Data Team

Greg Coquillo (Seattle)

Greg Coquillo (LinkedIn) is a passionate Data Professional and two-time LinkedIn Top Voice in Data Science, AI, Technology and Innovation. He works in tech as a Product Manager who owns the roadmap for AI-powered, human-in-the-loop systems that automate and scale multimedia processing, such as Document and Image classification. He also led a team of Data Scientists, Data Engineers and Software Engineers to build value-based pricing models that optimized margin for a global portfolio of chemical products. He considers his background non-traditional for tech: he is an Industrial Engineer with a Master's in Engineering Management from the University of Florida. However, his curiosity pushes him to be at the forefront of the AI revolution.

Greg will be presenting the following Data / AI Products session:
I am an AI Product Manager, am I not?

Jess Haberman (Boston) @JessHaberman

Jess Haberman is Director of Product Content at Anaconda, where she leads content strategy and education. Previously, Jess was an acquisitions editor at O’Reilly Media, collaborating with tech industry leaders to develop instructional books and online content in data science and data engineering. She has presented at and facilitated technology conferences (O’Reilly’s Strata and Data Superstreams, PyCon US, Scale by the Bay, DataCon LA), webinars, live training courses, podcasts, publishing seminars, and writing retreats. Jess earned her BA in English Literature from Denison University and spent 14 years in nonfiction book publishing.

Jess will be presenting the following Career Development session:
Ten Simple Rules for Writing a Technical Book

Closing Town Hall
Joe Reis (Salt Lake City)

Joe Reis (LinkedIn), Co-Founder and CEO of Ternary Data, is a “recovering data scientist” and a business-minded data nerd who’s worked in the data industry for 20 years. His responsibilities have ranged from statistical modeling and forecasting to machine learning, data engineering, data architecture, and everything in between. Joe is co-host of the popular Monday Morning Data Chat (Spotify / Apple) as well as the newly launched Joe Reis Show (Apple / Spotify). Joe is also co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering. Joe also teaches at the University of Utah and runs several meetups, including The Utah Data Engineering Meetup and SLC Python. When he’s not busy running a company, teaching, or creating content, Joe often finds himself DJing/making music, rock climbing, or trail running in the mountains around Salt Lake City, Utah. O'Reilly Author

Joe will also be co-presenting the following session with Sol Rashidi:
Bridging the Gap: Enhancing Collaboration Between Executives and Practitioners in Data-Driven Organizations

Amy Hodler (Kettle Falls, Washington) @amyhodler

Amy Hodler is an evangelist for graph analytics, network science, and responsible AI. Amy has decades of experience in emerging tech at companies such as Microsoft, Hewlett-Packard (HP), Hitachi IoT, Neo4j, Cray, and Relational AI. Amy has a love for science history and a fascination for complexity studies. Amy is the co-author of the O'Reilly book: Graph Algorithms, as well as co-author of an upcoming volume on the history of graph analytics. O'Reilly Author

Amy will be presenting the Graph Analytics Keynote session:
Patterns of Power: Uncovering control points to influence outcomes
Amy will also be co-leading the following 90 minute hands-on AI workshop:
Causality: The Next Frontier of GenAI Explainability

Database Keynote
Peter Boncz (Amsterdam) @peterabcz

Peter Boncz (Wikipedia / LinkedIn / homepage) has been active in the database community during the past four decades, making him a veteran. He leads the Database Architectures research group at research institute CWI in Amsterdam and has been involved in six startup companies so far. He was recently appointed ACM Fellow for his contributions to modern database architectures, and is also a professor at VU University in Amsterdam, specializing in analytical databases. He is also the founder and chairman of the graph database organization Linked Data Benchmark Council (LDBC), though this year he is on leave from the latter function, during his sabbatical stay at MotherDuck.

Peter will be presenting the Database Keynote session:
An abridged history of DuckDB: database tech from Amsterdam
#database

Michelle Yi (SF Bay Area) @YulleYi

Michelle Yi is a technology leader who specializes in machine learning and cloud computing. She has 15 years of experience in the technology industry, contributed to the original IBM Watson showcased on Jeopardy, and enjoys building and leading teams that develop and deploy AI solutions to solve real-world problems. Michelle is passionate about diversity and STEM education/careers for our minority communities, and serves both on the board of Women in Data and as an avid volunteer for Girls Who Code.

Michelle will be presenting the following AI session:
Building Generative AI Applications: An LLM Case Study
Michelle will also be co-leading the following 90 minute hands-on AI workshop:
Causality: The Next Frontier of GenAI Explainability

Knowledge Graph Keynote
Jans Aasman (SF Bay)

Jans Aasman (Wikipedia / LinkedIn) is a Ph.D. psychologist and expert in Cognitive Science, as well as CEO of Franz Inc., an early innovator in Artificial Intelligence and provider of the graph database, AllegroGraph. As both a scientist and CEO, Dr. Aasman continues to break ground in the areas of Artificial Intelligence and Knowledge Graphs as he works hand-in-hand with numerous Fortune 500 organizations as well as US and foreign governments. Dr. Aasman spent a large part of his professional life in telecommunications research, specializing in applied Artificial Intelligence projects and intelligent user interfaces. He gathered patents in the areas of speech technology, multimodal user interaction, and recommendation engines while developing precursor technology for tablets and personal assistants. He was also a professor in the Industrial Design department of the Technical University of Delft. Dr. Aasman is a noted conference speaker at such events as Smart Data, NoSQL Now, International Semantic Web Conference, GeoWeb, AAAI, Enterprise Data World, Text Analytics, and TTI Vanguard, to name a few.

Jans will present the following Knowledge Graphs Keynote:
Beyond Human Oversight: The Rise of Self-Building Knowledge Graphs in AI

Roopa Tangirala (SF Bay Area)

Roopa Tangirala is a seasoned engineering executive with over two decades of experience steering large-scale data platforms, specializing in Database as a Service (DBaaS) and adept at cultivating and scaling high-performing teams to achieve exceptional results. In her current role, she is a Senior Engineering Director at ClickHouse, where her teams focus on building the foundational infrastructure components for ClickHouse Cloud. Before joining ClickHouse, Roopa dedicated 14.5 years to Netflix, where a substantial part of her tenure was focused on shaping the database landscape. Her contributions included work on Netflix’s cloud migration, where she spearheaded the adoption of Cassandra, and a key role in evolving and optimizing the Netflix data platform by leveraging abstractions on top of Cassandra, ElasticSearch and other database solutions.

Roopa will present the following Database session:
From Open Source to SaaS: The ClickHouse Odyssey

Shane Murray (Brooklyn)

Shane Murray is the field chief technology officer at Monte Carlo, partnering with data leaders on their data strategy and operations to realize the maximal value from their data observability and data quality initiatives. Prior to Monte Carlo, Shane was the senior vice president of data & insights at The New York Times, leading 150+ employees across data science, analytics, governance and data platforms. Under his leadership, the team expanded into areas like applied machine learning, experimentation and data privacy, delivering research & insights that improved the Times' ability to draw and retain a large audience and scale the digital subscription business, which grew 10-fold to 8 million subscriptions over his tenure.

Shane will present the following Data Product session:
Data's Product Pivot: The Reliability Imperative

Roy Hasson (Wrentham, Massachusetts)

Roy Hasson is the Head of Product at Upsolver, where he works with customers globally to simplify how they build, manage and deploy data pipelines to deliver high-quality data as a product. Previously, he was Sr. Manager of Global Business Development for Analytics and Data Lakes at Amazon Web Services. Roy serves as an expert resource on big data architectures, data lakes and machine learning. Prior to AWS, Roy spent 15 years working with tier 1 service providers to design and deploy large data and telephone network systems.

Roy will present the following two Data Engineering sessions:
ELT and ETL are not products, technologies, or features.
Battle of the warehouses, lakehouses and streaming DBs, choose your platform wisely

Dipankar Mazumdar (Toronto) @Dipankartnt

Dipankar Mazumdar (LinkedIn / GitHub) is currently a Staff Data Advocate at ONEHOUSE, where he focuses on open-source projects such as Apache Hudi and OneTable to help engineering teams build and scale robust data analytics platforms. Before this, he worked on critical open-source projects such as Apache Iceberg and Apache Arrow at Dremio. For most of his career, Dipankar worked at the intersection of Data Visualization and Machine Learning. He also holds a Master's in Computer Science with a research area focused on Explainable AI.

Dipankar will present the following Data Engineering session:
OneTable: Interoperate between Apache Hudi, Delta Lake & Apache Iceberg

Lindsay Murphy (Toronto)

Lindsay Murphy, a data expert with 12 years of industry experience, currently serves as the Head of Data at Secoda, a data search and discovery tool. She's led data functions at Toronto startups Maple and BenchSci, co-founded the 2000+ Toronto Modern Data Stack Meetup, and created and teaches an Advanced dbt course at Uplimit.

Lindsay will be presenting the following Data Engineering session:
Cost containment: Scaling your data function on a budget

Holden Karau (San Francisco) @holdenkarau

Holden Karau (Wikipedia / LinkedIn) is a queer transgender Canadian, Apache Spark committer, Apache Software Foundation member, and an active open source contributor. As a software engineer, she’s worked on a variety of distributed computing, search, and classification problems at Apple, Google, IBM, Alpine, Databricks, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics in computer science. Outside of software, she enjoys playing with fire, welding, riding scooters, eating poutine, and dancing.
Holden is the author of multiple O'Reilly publications, including Learning Spark, Kubeflow for Machine Learning, Scaling Spark with Ray, as well as two upcoming titles: Scaling Spark with Dask, and the 2nd edition of High Performance Spark. O'Reilly Author

Holden will present the following LLM / Data Engineering session:
Using LLMs to Fight Health Insurance Denials: From Data Synthesis to Production

Santona Tuli (Washington DC)

Santona Tuli, PhD, started working with data through fundamental physics, analyzing massive event data from particle collisions at CERN. Since then, she has worked as a machine learning engineer in the NLP sector, and on product engineering for the programmatic data workflow orchestration tool Airflow. Currently at Upsolver, she works on a framework for authoring data pipelines declaratively in SQL. Dr. Tuli is passionate about building and empowering others to build end-to-end data and machine learning pipelines scalably. She has also been featured in the 3D IMAX movie Secrets of the Universe, which showcases real scientists pushing the frontiers of knowledge. In her STEM outreach work, she emphasizes representation, equity, advocacy and empowerment.

Santona will be presenting the following Data Quality session:
Data Quality Skepticism

Hubert Dulay (New York) @hkdulay

Hubert Dulay (LinkedIn), currently a developer advocate at StarTree, is a veteran engineer with over 20 years of experience in big and fast data and MLOps. Previously, he held positions at Decodable, Confluent, and Cloudera. Hubert is co-author of the recently released Streaming Data Mesh from O'Reilly, and is currently working on his second book, Streaming Databases, also from O'Reilly. O'Reilly Author

Hubert will present the following Data Engineering session:
Introduction to the Streaming Plane: Bridging the Operational and Analytical Data Planes

Malcolm Hawker (London)

Former Gartner Analyst and Profisee Head of Data Strategy, Malcolm Hawker is a recognized thought leader and one of the industry’s foremost authorities on the topics of data strategy, data governance, and master data management. As the co-author of the last three Gartner MDM Magic Quadrant™ documents, Malcolm has consulted with thousands of CDOs and other data leaders from across the globe on their biggest data-related challenges. In a career that spans three decades, Malcolm has held executive-level IT and Product leadership roles at F500 companies, and has a unique combination of experience as a leader, implementer, vendor, and consultant for enterprise-class data solutions. Having lived in Austin for a big portion of his professional life, Malcolm has deep ties to Texas and the amazing data professionals that call it home.

Malcolm will be presenting the following Data Products session:
Data Product Chaos

Ron Itelman (Denver) @ron_itelman

Ron Itelman (Linkedin / intelligence.ai / Medium) is passionate about creating systems that leverage human and machine learning to augment efficiency and give users delightful experiences. Ron has served as product designer, product owner, UX designer, full-stack developer, startup-founder, and business manager. His specialty is working with data scientists, machine learning engineers, organizational leaders, and end-users to increase productivity while giving users experiences that feel magical. Ron is co-author of the upcoming O'Reilly book: Unifying Business, Data, and Code. O'Reilly Author

Ron will be presenting the following Data Products session:
Data Products: The Value of Simplicity

David Hughes (Seattle)

David Hughes is the Principal Graph Consultant for Graphable. He has 10 years of experience designing and building graph solutions which surface meaningful insights. His background includes clinical practice, medical research, software development, and cloud architecture. David has worked in healthcare and biotech within the intensive care, interventional radiology, oncology, cardiology, and proteomics domains. He enjoys endurance running, hiking, and spending time with his family in the outdoors when he is not enabling clients to have data epiphanies from their complex data.

David will present the following two Graph Day sessions:
Accelerating Insights with Memgraph and GraphXR: A Unified Approach to Graph Database Analytics and Visualization.
Advancing Graph Data Insights through a Graph Query Engine: PuppyGraph

William Lyon (SF Bay) @lyonwj

William Lyon (LinkedIn / blog) is a Developer Relations Engineer at Wherobots, where he helps developers and data scientists build spatial analytics applications and make sense of geospatial data. Previously he worked at Neo4j as a software engineer. Prior to joining Neo4j, William worked as a software developer for several startups in the real estate software, quantitative finance, and predictive API fields. William holds a Master's degree in Computer Science from the University of Montana. William is author of the Manning publication Full Stack GraphQL Applications With React, Node.js, and Neo4j.

William will be presenting the following Data Products session:
Geospatial Analytics With Apache Sedona In Python, R & SQL

Data Leadership Keynote
Aaron Wilkerson (Detroit)

Aaron Wilkerson is the Sr. Manager of Data Strategy and Governance at Carhartt, where he is responsible for developing and delivering the company's enterprise data governance strategy. Aaron’s expertise lies in building and delivering data platforms that provide valuable insights to organizational leaders. His career spans over 15 years working in technical capacities across various industries, including Manufacturing, Financial Technology Services, Automotive, and Healthcare. Aaron is a frequent guest on well-known data podcasts, most recently Catalog and Cocktails, the Super Data Brothers, and the Tech Bros.

Aaron will be presenting the Data Leadership session:
You need to be more strategic - The mantra for data leader career growth.

Veronika Durgin (Boston)

Veronika Durgin is Vice President of Data at Saks, the premier luxury ecommerce platform. In her role she is responsible for the data strategy at Saks from driving enterprise digital transformation and governing enterprise data, to enabling data efficiency and supporting analytics and reporting of the full customer shopping journey. Prior to joining Saks, she held various data engineering and management roles at tech-enabled sustainable agriculture company, Indigo, and Sonos, Inc. Veronika started her career as a database administrator focusing on performance tuning and optimization. Over the past two decades she has developed skills across database administration, data engineering, platform architecture, data modeling, and analytics and insights. Veronika is a Certified Data Vault Practitioner and a Snowflake Data Superhero. She is passionate about her profession and sharing knowledge with others. Veronika earned a master’s degree in computer software engineering from Brandeis University and a bachelor’s degree in biology from the University of Massachusetts, Boston. She lives in Massachusetts with her husband, 2 boys, and a Rhodesian Ridgeback.

Veronika will be presenting the Data Engineering session:
On the Data Highway: Is Data Vault Speedy or Reckless?

Hala Nelson (Alexandria, Virginia)

Hala Nelson (LinkedIn) is an Associate Professor of Mathematics at James Madison University. She has a Ph.D. in Mathematics from the Courant Institute of Mathematical Sciences at New York University. Prior to her work at James Madison University, she was a postdoctoral Assistant Professor at the University of Michigan-Ann Arbor. Her research is in the areas of Materials Science, Statistical Mechanics, Inverse Problems, and the Mathematics of Machine Learning and Artificial Intelligence. Her favorite subjects are Optimization, Numerical Algorithms, Mathematics for AI, Mathematical Analysis, Numerical Linear Algebra and Probability Theory. She likes to translate complex ideas into simple and practical terms. To her, most mathematical concepts are painless and relatable, unless the person presenting them either does not understand them very well, or is trying to show off. Other facts: Hala Nelson grew up in Lebanon, during the time of its brutal civil war. She lost her hair at a very young age in a missile explosion. This event and many that followed shaped her interests in human behavior, the nature of intelligence, and AI. Her father taught her Math, at home and in French, until she graduated high school. Her favorite quote from her father about math is, "It is the one clean science." Hala is author of the recent O'Reilly book: Essential Math for AI.
#oreilly-showcase

Hala will host the following AI session:
My Attempt To Build a Foundation for AI and Data

Adi Polak (Israel) @AdiPolak

Adi Polak brings her vast industry research and engineering experience to bear in educating and helping teams design, architect, and build cost-effective data systems and machine learning pipelines that emphasize scalability, expertise, and business goals. Adi is a frequent worldwide presenter and the author of the recent O'Reilly book, Machine Learning With Apache Spark. She is regularly invited to serve as a member of program committees and as an advisor for conferences like Data & AI Summit, Scale by the Bay, and others. Previously, she was a senior manager for Azure at Microsoft, where she focused on building advanced analytics systems and modern architectures. When Adi isn’t building data pipelines or thinking up new software architecture, you can find her on the local cultural scene or at the beach.
#oreilly-showcase

Adi will host the following LLM / NLP session:
Demystifying the next advancements of LLMs: RAG: Retrieval-augmented generation

Janet Six (DFW) @janetmsix

Janet Six is a Product Manager at Tom Sawyer Software, where she helps companies design easier-to-use products within their financial, time, and technical constraints. For her research in information visualization, Janet was awarded the University of Texas at Dallas Jonsson School of Engineering Computer Science Dissertation of the Year Award. She was also awarded the prestigious IEEE Dallas Section 2003 Outstanding Young Engineer Award. Her work has appeared in the Journal of Graph Algorithms and Applications and the Kluwer International Series in Engineering and Computer Science. The proceedings of conferences on Graph Drawing, Information Visualization, and Algorithm Engineering and Experiments have also included the results of her research.

Janet will host the following Business Intelligence / Visualization session:
Discover Insights in a Large Multi-Decade Life Sciences Database Through Data Visualization and Analysis

Paige Roberts (Hamilton, Texas) @RobertsPaige

With two and a half decades in the data management industry, Paige Roberts (LinkedIn) has worked as an engineer, trainer, support technician, technical writer, marketer, product manager, and consultant. She has built data engineering pipelines and architectures, documented and tested open source analytics implementations, worked with different industries, and questioned a lot of assumptions. She's worked for companies like Pervasive, the Bloor Group, Actian, Hortonworks, Syncsort, and Vertica. She contributed to “97 Things Every Data Engineer Should Know,” and co-authored “Accelerate Machine Learning with a Unified Analytics Architecture” and “Up and Running with Aerospike”, all from O’Reilly publishing. She promotes understanding of distributed data processing, high scale data engineering architecture, and how the analytics revolution is changing the world. In her free time, she’s been known to hurl axes, pull a bow, and write fantasy books under her maiden name, Paige E. Ewing.

Paige will host the following Data Engineering session:
When linear scaling is too slow – strategies for high scale data processing

Megan Lieu (Washington DC)

Bitten early in life by the math bug, Megan Lieu began her career in finance working on transaction advisory and business valuations. Currently a Data Advocate at Deepnote, Megan is the author of two Linkedin Learning courses: Choose the Right Tool for Your Data and SQL for Finance Professionals.

Megan will host the following Data Science session:
Evolving as a Data Scientist in the age of AI

Ryan Boyd (Boulder) @ryguyrg

Ryan Boyd (LinkedIn) is a Boulder-based software engineer, data + authNZ geek and technology executive. He's currently a co-founder at MotherDuck, where they're making data analytics fun, frictionless and ducking awesome. He previously led developer relations teams at Databricks, Neo4j and Google Cloud. He's the author of O'Reilly's Getting Started with OAuth 2.0. Ryan advises B2B SaaS startups on growth marketing and developer relations as a Partner at Hypergrowth Partners. Prior to leading the Google Cloud Developer Relations team, he spent 7 years at Google working on 20+ different developer products and was the co-founder of Google Code Labs which aimed to improve quality and stability of Google's developer products. Ryan graduated with a degree in Computer Science from Rochester Institute of Technology (RIT) where he later worked full-time building web applications + APIs and architecting the central web hosting platform.

Ryan and Peter Boncz will co-present the following session:
DuckDB - Ask me anything

Juha Korpela (Helsinki)

Juha Korpela, one of the leading data modeling experts in the Nordics, currently provides services at Datakor Consulting. Most recently, he was Chief Product Officer at Ellie.ai. Juha is a strong advocate for always staying business-driven when designing new data architectures and believes that the future of data is in the hands of business users. Over the last decade, Juha has worked in multiple industries (such as manufacturing, banking, and the public sector) in a variety of high-profile data management roles. In addition to data modeling, Juha's main areas of expertise are data warehousing, information architecture management, data mesh, and agile methodologies.

Juha will present the following data modeling session:
Conceptual Modeling - a practical way to capture business needs for data products

Ryan Mitchell (Boston) @Kludgist

An expert in web scraping, web security, and data science, Ryan Mitchell is a frequently requested speaker at data and security conferences. She has also taught and designed courses at Northeastern University and Olin College of Engineering. Ryan holds a master’s degree in software engineering from Harvard University Extension School and is currently a senior software engineer at the Gerson Lehrman Group where she creates data science tools. Ryan is author of the O'Reilly book Web Scraping with Python, soon to be in its third edition, as well as multiple Linkedin courses including: Python Data Structures with Trees and Web Scraping with Python.
#oreilly-showcase

Ryan will be presenting the following AI / knowledge graph session:
Managing Competing AI Decisions in Large Ontologies.

Marko Budiselic (London)

Marko Budiselic (LinkedIn) is the co-founder and CTO (Chief Technology Officer) at Memgraph. As the CTO, Marko’s role typically involves overseeing the company’s technical aspects and technology strategy, including developing the Memgraph graph database platform. He is responsible for guiding the research and development teams, ensuring that the product aligns with the company’s strategic objectives, and helping to make crucial technical decisions to improve the platform.

Marko will be presenting the following graph session:
Querying a Graph Through an LLM.

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (Linkedin), Principal Scientist at De Umbra, is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he also recently spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.

Jonathan will be presenting the following two sessions:
1) Practical Large Language Models: Using LLMs in Your Business With Python.
2) Survey of recent progress in intelligent robotics.

Jonathan Ellis (Austin) @spyced

Jonathan Ellis became involved with Apache Cassandra in 2008 when Rackspace hired him to build their next-generation database infrastructure. As its first PMC chair, and later as co-founder of DataStax, Jonathan is largely responsible for leading Cassandra through its first decade of development. Most recently, Jonathan has been working with Vector Search to facilitate its integration with Cassandra for the next generation of AI applications.

Jonathan will be presenting the following Database session:
Under the hood of vector search with JVector

Matthias Broecheler (Seattle) @mbroecheler

Dr. Matthias Broecheler (Linkedin) is the inventor of the Titan graph database (acquired by DataStax in 2015) and co-founder of Aurelius, the original company behind the Apache TinkerPop graph framework. A sought-after speaker, Matthias introduced Titan at the 2012 Cassandra Summit and gave the keynote at the first Graph Day Texas in 2016 (interview). Most recently, Matthias has been developing DataSQRL: a compiler and build tool for streaming data pipelines to build data APIs. Matthias is co-author of the O'Reilly book: The Practitioner's Guide to Graph Data. Matthias received his PhD in Computer Science at University of Maryland, College Park.
#oreilly-showcase

Matthias will be presenting the following Database session:
We decomposed the database - now what?

Brian Greene (Chicago)

Brian Greene (Linkedin / Substack) has spent his career building software teams across BI, middleware, enterprise architecture, and cloud data management ecosystems. As an architect at SnapLogic he's involved in new product development. As the inventor of Neuron Sphere, Brian has been helping to create a graph-based model-driven-architecture platform engineering toolkit - bringing the breadth of software engineering discipline and capability to building data collection and interchange ecosystems.

Brian will be presenting the following Data Modeling session:
How polyglot storage cost me a job, almost killed data modeling, (and started my quest for one data model to rule them all)

Matthew Housley (Salt Lake City)

Co-Founder / CTO of Ternary Data as well as fellow “Recovering Data Scientist,” Matthew Housley is also a “Reformed Academic,” holding a PhD in Math and dual Masters degrees in both Math and Physics. It was only natural that he began his career in Academia as a Professor of Mathematics, before joining one of the largest e-commerce companies as a data scientist. Matt's STEM background in combination with his knack for teaching makes him a mastermind at overhauling processes, improving teamwork, and incorporating engineering best practices so that real value is delivered to companies. While making the journey from data scientist to data engineer, Matt began to focus more on data & cloud engineering, working extensively with Amazon Web Services, Google Cloud Platform, Containers, Apache Airflow and GPUs, among other technologies. Matt (or should we say, “Dr. Housley”) is an adjunct faculty member in the David Eccles School of Business at The University of Utah. Matt is co-host of the popular Monday Morning Data Chat (Spotify / Apple) and co-author of the bestselling O'Reilly book: Fundamentals of Data Engineering.
#oreilly-showcase

Chris Tabb (London)

Chris Tabb, co-founder of LEIT DATA, started his career in the Business Intelligence/Analytics domain 30 years ago. He began at Cognos in the 90’s, working in the back office before becoming an expert in all their products, and left to become an independent BI consultant in 1998. Chris has followed the evolution of the analytics industry, working hands-on with all the technologies in the ecosystems: Databases, ETL/ELT, BI/OLAP/Visualisation Tools, Big Data Technologies, and Infrastructure (On premises / Cloud) across many vendors, some old, some new. Recently, with a focus on the Modern Data Stack Evolution, Chris has started many movements centered on Business Value, using a number of hashtags to raise awareness (#bringbackdatamodelling / #bringbackdatamodeling, #bringbackdocumention) under the umbrella of #meandatastreets, which focuses on simplifying the Data Platform architecture and on Business Value.

Chris will be presenting the following Business session:
Data-Driven Transformation: Building a Business Value Machine.

Andy Petrella (Liège, Belgium) @noootsab

Andy Petrella is an entrepreneur with a Mathematics and Distributed Data background. Andy is an early evangelist of Apache Spark and the Spark Notebook creator in the data community. Andy is the founder and CEO of Kensu, a data observability solution implementing the Data Observability Driven Development (DODD) method. Andy is also author of the O'Reilly book: Fundamentals of Data Observability. #oreilly-showcase

Ryan Dolley (Detroit)

Ryan Dolley is a data consultant specializing in BI and analytics, author of the Super Data Blog, and one half of the Super Data Brothers. Check out his discussion on the evolution of BI and moving beyond dashboards on a recent episode of the Joe Reis Show.

Sean Robinson (Charlotte)

Sean Robinson is a versatile data scientist with several years of experience optimizing data processes and building intelligent data systems. Specifically, he specializes in the use of graph data science and Neo4j to abstract complex systems within a domain into highly dimensional, interconnected knowledge graphs, uncovering novel insights that would otherwise remain dormant in other data structures. Sean currently serves as Lead Data Scientist at Graphable and also creates and teaches new network science courses in the University of North Carolina at Charlotte’s Data Science graduate program, where he instructs the next generation of data scientists on how to integrate graph data science into their toolkit.

Sean will be presenting the following session:
LLMs for Enhanced ETL into Graph
Sean will also be leading the following 90 minute hands-on graph workshop:
Introduction to Graph Data Science in Python: Leveling Up Your Data Science Toolbelt

Alex Merced (Winter Park, FL) @alexmerced

Alex Merced (Linkedin) is a Developer Advocate at Dremio with a history of creating content to enable developers of all types through his personal projects like DevNursery.com, The Web Dev 101 Podcast, and the DataNation podcast. Alex Merced has been a developer with companies like Crossfield Digital, CampusGuard, GenEd Systems and others along with being an Instructor for General Assembly Bootcamps. Alex is co-author of the upcoming O'Reilly book: Apache Iceberg: The Definitive Guide.
#oreilly-showcase

Alex will be presenting the following Iceberg session:
1. Exploring the Apache Iceberg Ecosystem
Alex will also be presenting the following data lakehouse session:
2. The Ins & Outs of Data Lakehouse Versioning at the File, Table, and Catalog Level

Ryan Wisnesky (Cambridge, Massachusetts)

Ryan Wisnesky (LinkedIn) obtained B.S. and M.S. degrees in mathematics and computer science from Stanford University and a Ph.D. in computer science from Harvard University, where he studied the design and implementation of provably correct software systems. Previously, he was a postdoctoral associate in the MIT department of mathematics, where he developed the categorical query language CQL. He currently leads open-source and commercial development of CQL as CTO of Conexus AI. He maintains an active collaboration with the information-integration department of IBM Research, where he contributed to the Clio, Orchid, and HIL projects.

Ryan will present the following two sessions:
1. Expert Systems are Generative AIs
2. Computational Trinitarianism

Tom Zeppenfeldt (Netherlands)

Tom Zeppenfeldt is the founder and CEO of Graphileon. His mission is to provide mid-size and larger organizations with low-code tools to remove bottlenecks and solve challenges by leveraging the synergy between graph databases, vector indexes and generative AI. Originally an agricultural engineer, Tom now applies the lessons learnt from his international experience across many sectors, including food, pharma, fintech, architectural building and engineering, as well as fraud detection.

Tom will present the following session:
Enhancing low-code with graph, vector and AI

Juan Sequeda (Austin) @juansequeda

Dr. Juan Sequeda is the Principal Scientist at data.world. He joined through the acquisition of Capsenta, a company he founded as a spin-off from his PhD research in Computer Science from The University of Texas at Austin. His goal is to reliably create knowledge from inscrutable data. His research and industry work has been on designing and building Knowledge Graphs for enterprise data and metadata management.
Juan has researched and developed technology on semantic data virtualization, graph data modeling, schema mapping and data integration methodologies. He pioneered technology to construct knowledge graphs from relational databases, resulting in W3C standards, research awards, patents, software and his startup Capsenta, acquired by data.world in 2019. Juan strives to build bridges between academia and industry as past co-chair of the LDBC Property Graph Schema Working Group, member of the LDBC Graph Query Languages task force, standards editor at the World Wide Web Consortium (W3C) and member of organizing committees of scientific conferences, including being the general chair of The Web Conference 2023.

Juan will be presenting the following data engineering session:
1. Past, Present and Future of Data Catalogs
Juan will also be presenting the LLM Keynote session:
2. The Role of Knowledge Graphs for LLM accuracy in the Enterprise

Vish Puttagunta (Dallas-Fort Worth)

Vish Puttagunta is a Senior Analytics leader and Investor with a proven track record of applying Data, Artificial and Business Intelligence techniques to generate tangible and measurable ROI in Business, with a firm focus on Food Manufacturing and Food Packaging. As the CEO of Power Central, he helps fellow small and mid-sized Food Manufacturing and Packaging companies scale their operations profitably while complying with FDA regulations.

Vish will present the following session:
Building a Resilient Food Supply Chain with Neo4j and Graphileon

Satoru Hayasaka (Austin) @sathayas42

Dr. Satoru Hayasaka was trained in statistical analysis of various types of biomedical data. He has taught several courses on data analysis geared toward non-experts and beginners. Prior to joining KNIME, Satoru taught introductory machine learning courses to graduate students from different disciplines. Based in Austin, TX, he is part of the KNIME Evangelism team, and he continues teaching machine learning and data mining using KNIME software. Satoru received his PhD in biostatistics from the University of Michigan, and completed his post-doctoral training at University of California, San Francisco.

Satoru will present the following 90 minute Deep Dive: LLMs In a Low-Code Environment – Is it Possible?

Scott Fincher (Austin)

As an experienced data scientist, Scott Fincher routinely teaches, presents, and leads group workshops covering topics such as the KNIME Analytics Platform, Machine Learning, and the broad Data Science umbrella. He enjoys assisting other data scientists with general best practices and model optimization. For Scott, this is not just an academic exercise. Prior to his work at KNIME, he worked for almost 20 years as an environmental consultant, with a focus on numerical modeling of atmospheric pollutants. Scott holds an MS in Statistics and a BS in Meteorology, both from Texas A&M University.

Scott will present the following Low-Code Data Science session:
Low-Code Data Analysis with the KNIME Analytics Platform