Who will speak at Data Day Texas 2018?

Take advantage of our discount room block at the official conference hotel.
Use the following link to book your room: http://datadaytexas.com/2018/book-a-hotel-room

We have just begun to announce the first wave of speakers for 2018. Is there someone you want to see? Some tool, technology, or project you want to see covered? Send us your thoughts at suggestions@datadaytexas.com

John Akred (SF Bay) @BigDataAnalysis

John Akred is the Founder and CTO of Silicon Valley Data Science. In the business world, John Akred likes to help organizations become more data driven. He has over 15 years of experience in machine learning, predictive modeling, and analytical system architecture. His focus is on the intersection of data science tools and techniques; data transport, processing and storage technologies; and the data management strategy and practices that can unlock data driven capabilities for an organization. A frequent speaker at the O'Reilly Strata Conferences, John is host of the perennially popular workshop: Building A Data Platform.
John will be giving the following presentation: Machine Learning: From The Lab To The Factory

Joey Echeverria (SF Bay) @fwiffo

Joey Echeverria is the platform technical lead at Splunk, where he builds applications for scaling IT operations built on the Apache Hadoop platform. Joey is a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a software engineer at Cloudera, where he contributed to several ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase. Joey is also a coauthor of Hadoop Security, published by O'Reilly Media.


Holden Karau (San Francisco) @holdenkarau

Holden Karau is a transgender Canadian, Apache Spark committer, an active open source contributor, and co-author of Learning Spark & High Performance Spark. When not in San Francisco working as a software development engineer at IBM’s Spark Technology Center, Holden talks internationally on Spark and holds office hours at coffee shops at home and abroad. She makes frequent contributions to Spark, specializing in PySpark and Machine Learning. Prior to IBM she worked on a variety of distributed, search, and classification problems at Alpine, Databricks, Google, Foursquare, and Amazon. She graduated from the University of Waterloo with a Bachelor of Mathematics in Computer Science. Outside of computers she enjoys dancing & playing with fire.

Alex Korbonits (Seattle) @korbonits

Alex Korbonits is a Data Scientist at Remitly, Inc., where he works extensively on feature extraction and putting machine learning models into production. Outside of work, he loves Kaggle competitions, is diving deep into topological data analysis, and is exploring machine learning on GPUs. Alex is a graduate of the University of Chicago with degrees in Mathematics and Economics.
Alex gave one of the highest rated talks at Data Day Texas 2017: Distilling Dark Knowledge from Neural Networks.
Alex will be speaking as part of AI Weekend.

Jonathan Mugan (Austin) @jmugan

Jonathan Mugan (LinkedIn) is a researcher specializing in artificial intelligence, machine learning, and natural language processing. His current research focuses on deep learning for natural language generation and understanding. Dr. Mugan received his Ph.D. in Computer Science from the University of Texas at Austin. His thesis centered on developmental robotics, an area of research that seeks to understand how robots can learn about the world in the same way that human children do. Dr. Mugan also held a post-doctoral position at Carnegie Mellon University, where he worked at the intersection of machine learning and human-computer interaction. One of the most requested speakers at the Data Day Texas conferences, he also recently spoke on the topic of NLP at the O’Reilly AI conference, and is the creator of the O’Reilly video course Natural Language Text Processing with Python. Dr. Mugan is also the author of The Curiosity Cycle: Preparing Your Child for the Ongoing Technological Explosion.
Jonathan Mugan will be speaking as part of AI Weekend.

Haikal Pribadi (London) @haikalpribadi

Haikal Pribadi is the Founder and CEO of GRAKN.AI, the database for AI. His interest in the field began at the Monash Intelligent Systems Lab, where he built an open source driver for the Parallax Eddie Robot, which was then adopted by NASA. He then completed a master's degree in AI at the University of Cambridge. Haikal was also the youngest Algorithm Expert working on Quintiq’s Optimisation Technology, which is behind some of the world’s largest supply chain systems in transportation, retail and logistics. He now works on GRAKN.AI, a distributed knowledge base that uses machine reasoning to handle and interpret complex data. GRAKN.AI was recently awarded Product of the Year 2017 by the University of Cambridge Computer Lab.
Haikal will be speaking as part of AI Weekend.

R User Day Speakers

Check out everything going on with R User Day.

Mara Averick (Boston) @dataandme

Mara Averick (LinkedIn / GitHub / Medium) is a polymath and self-confessed data nerd. With a strong background in research, she has a breadth of experience in data analysis, visualization, and applications thereof. Currently, by day, she’s a Consultant at TCB Analytics. By night, you’ll find her sharing dope R-related stuff on Twitter and translating heavily technical subject matter into easy reading for a non-technical audience. When she’s not talking data, she’s diving into NBA stats, exploring weird and wonderful words, and/or indulging in her obsession with all things Archer. (Thanks to Mango Solutions for bio.)

Jasmine Dumas (Connecticut) @jasdumas

Jasmine Dumas (LinkedIn / GitHub) is a Data Scientist at Simple Finance, where she is focused on experimentation and data product development. She earned a B.S.E. in Biomedical Engineering from the University of Hartford and has experience in Aerospace Manufacturing, Medical Devices and Financial Technology. She is an active member of the R programming community, has developed the open source packages shinyGEO, ttbbeer, shinyLP, and gramr, and has participated in Google Summer of Code, NASA Datanauts, R-Ladies, and Forwards. She is currently developing a course on shiny with DataCamp and co-organizing the regional Noreast'R Conference.

Alex Engler (Washington, D.C.) @alexcengler

Alex Engler (LinkedIn / GitHub / Urban Institute / Georgetown / Johns Hopkins) is the Program Director and Lecturer for the M.S. in Computational Analysis and Public Policy program at the University of Chicago. He is also a contributing data scientist to the Urban Institute, where he worked before UChicago. Alex also previously taught visualization and data science for policy analysis at Georgetown University and Johns Hopkins University.
Alex will be presenting the following workshop: Introduction to SparkR in AWS EMR, as part of R User Day.

Chester Ismay (Portland) @old_man_chester

Chester Ismay (LinkedIn / GitHub) is Curriculum Lead at DataCamp. He was formerly an Adjunct Professor of Sociology at Pacific University and an Instructional Technologist and Consultant for Data Science, Statistics, and R at Reed College. He obtained his PhD in statistics from Arizona State University and has taught courses and led workshops in statistics, data science, mathematics, computer science, and sociology. He is the co-author of the fivethirtyeight R data package and is the author of the thesisdown R package. He is also a co-author of an open source textbook entitled ModernDive: An Introduction to Statistical and Data Sciences via R.
Chester will be speaking as part of R User Day.

Albert Y. Kim (Amherst) @rudeboybert

Albert Y. Kim (LinkedIn / GitHub) is a Lecturer in Statistics in the Mathematics & Statistics Department at Amherst College. Born in Montreal, Quebec, he earned his BSc in Mathematics and Computer Science from McGill University in 2004 and his PhD in Statistics from the University of Washington in 2011. Prior to joining Amherst College, he was a Decision Support Engineering Analyst in the AdWords division of Google Inc, a Visiting Assistant Professor of Statistics at Reed College, and an Assistant Professor of Statistics at Middlebury College.

Jared Lander (NYC) @jaredlander

Jared Lander (LinkedIn) is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City; the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference; and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fund raising to finance and humanitarian relief efforts.
Jared specializes in data management, multilevel models, machine learning, generalized linear models, and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R programming geared toward data scientists and non-statisticians alike, and is creating a course on glmnet with DataCamp.
Jared will be speaking as part of R User Day.

Lucy McGowan (Nashville) @LucyStats

Lucy D'Agostino McGowan (LinkedIn / GitHub) is a Biostatistics PhD candidate at Vanderbilt University where her research focuses on observational studies, large-scale inference, and methods for quantifying and estimating the effect of unmeasured confounding. She is the co-founder of R-Ladies Nashville and is enthusiastic about learning from and uplifting other women in the R and STEM communities.
Lucy will be speaking as part of R User Day.

Jessica Minnier (Portland) @datapointier

Jessica Minnier (LinkedIn / GitHub) is an Assistant Professor of Biostatistics at Oregon Health & Science University. She is a faculty member of the OHSU-PSU School of Public Health with appointments in the Knight Cardiovascular Institute and Knight Cancer Institute Biostatistics Shared Resource. Her statistical research interests include risk prediction with high dimensional data sets and the analysis of genetic and other omics data. She is also interested in statistical computing (mostly in R), reproducible research and open science.
Jessica teaches Mathematics/Statistics II, a statistical inference course for the MS in Biostatistics program at the OHSU-PSU School of Public Health. Jessica has an A.M. and Ph.D. in Biostatistics from Harvard University and a B.A. in Mathematics with a minor in Computer Science from Lewis & Clark College.

Jonathan Nolis (Seattle) @skyetetra

Jonathan Nolis (LinkedIn / GitHub) is the Director of Insights & Analytics at Lenati, and is the lead of the Customer Insights & Analytics team. He has over a decade of experience in solving business problems using data science. Jonathan has provided insights and strategic advice in industries such as retail, manufacturing, aerospace, health care, and e-commerce. Jonathan helps create proprietary technology for Lenati including the Loyalty Program ROI Simulator – a tool that uses big data to predict the value of a loyalty program. He has a PhD in industrial engineering, and has several academic publications in the field of applied optimization. Prior to joining Lenati, Jonathan was a Lead of Advanced Analytics at Promontory Financial Group, a regulatory compliance consulting firm.

Hilary Parker (San Francisco) @hspter

Hilary Parker (LinkedIn / GitHub) is a Data Scientist at Stitch Fix and co-host of the Not So Standard Deviations podcast. She is an R and statistics enthusiast determined to bring rigor to analysis wherever she goes. At Stitch Fix she works on teasing apart correlation from causation, with a strong dose of reproducibility. Formerly a Senior Data Analyst at Etsy, she received a PhD in Biostatistics from the Johns Hopkins Bloomberg School of Public Health.
Hilary will be speaking as part of R User Day.

Gabriela de Queiroz (San Francisco) @gdequeiroz

Gabriela de Queiroz (LinkedIn / GitHub) is the Lead Data Scientist at SelfScore. Formerly Gabriela was data scientist at Sharethrough, where she developed statistical models from concept creation to production, designed, ran, and analyzed experiments, and employed a variety of techniques to derive insights and drive data-centric decisions. Gabriela is the founder of R-Ladies, an organization created to promote diversity in the R community, which now has over 25 chapters worldwide. Currently, she is developing an online course on machine learning in partnership with DataCamp.
Gabriela will be speaking as part of R User Day.

David Robinson (NYC) @drob

David Robinson (LinkedIn / GitHub) is a data scientist at Stack Overflow with a PhD in Quantitative and Computational Biology from Princeton University. He enjoys developing open source R packages, including broom, gganimate, fuzzyjoin and widyr, as well as blogging about statistics, R, and text mining on his blog, Variance Explained.
David will be speaking as part of R User Day.

Emily Robinson (NYC) @robinson_es

Emily Robinson (LinkedIn / GitHub) works as a Data Analyst at Etsy with the search team to design, implement, and analyze experiments on the ranking algorithm, UI changes, and new features. Emily earned her master’s in Organizational Behavior from INSEAD in 2016 and her bachelor’s in Decision Sciences from Rice University (where she took classes from Hadley Wickham). She’s a co-organizer of the NYC chapter of R-Ladies, a global organization that promotes gender diversity in the R community. She enjoys blogging about A/B testing, conferences, and data science projects on her blog, Hooked on Data.

Julia Silge (Salt Lake City) @juliasilge

Julia Silge (LinkedIn / GitHub) is a data scientist at Stack Overflow. She enjoys making beautiful charts, the statistical programming language R, black coffee, red wine, and the mountains of her adopted home here in Utah. She has a PhD in astrophysics and an abiding love for Jane Austen. Her work involves analyzing and modeling complex data sets while communicating about technical topics with diverse audiences.
Julia will be speaking as part of R User Day.

David Smith (Chicago) @revodavid

David Smith is the R Community Lead at Microsoft. With a background in data science, he writes daily about applications of predictive analytics at the Revolutions blog (blog.revolutionanalytics.com), and is a co-author of Introduction to R.
David will be speaking as part of R User Day.

Nick Strayer (Nashville) @NicholasStrayer

Nick Strayer (LinkedIn / GitHub) has worked in many different realms, including as a journalist at the New York Times, a data scientist at Dealer.com in Vermont, and a "data artist in residence" at tech startup Conduce in California. Currently, he is a PhD student in biostatistics at Vanderbilt University and also an intern at the Johns Hopkins Data Science Lab. Recently (May '15), he graduated from the University of Vermont, where he majored in mathematics and statistics and minored in computer science.
Nick likes data. Manipulating it, modeling it, making it (simulation), visualizing it and yes, even cleaning it. He does these things with some combination of R, Python and Javascript (d3.js in particular). Most recently he has been fascinated with conveying complex statistical topics and methods using intuitive and interactive graphics.
Nick's current research interests include data gathering, extracting inference from machine learning, data visualization and scientific communication. When not in "school mode" Nick loves to bike places, read science fiction and wander around gardens and museums.

Daniel Woodie (Austin) @DanielWoodie5

Daniel Woodie is founder and lead scientist of Bamboo Analytics, a data science services firm. He originally trained as a statistician and has worked on applications ranging from systems neuroscience to global supply chains. With Bamboo Analytics he offers analytical consulting and training to early stage startups and Fortune 500 companies alike.
Daniel will be emcee for R User Day at Data Day Texas.



Emil Eifrem of Neo4j describing the evolution of the property graph model.

Rob McDaniel (Seattle) of LiveStories was one of the highest rated speakers at Data Day Texas 2017