The following speakers have been confirmed for Data Day Texas 2016, with many more to come.
We are still accepting submissions for Data Day 2016. Details can be found on our proposals page.
John Akred (SF Bay)
John Akred is the Founder and CTO of Silicon Valley Data Science. In the business world, John Akred likes to help organizations become more data driven. He has over 15 years of experience in machine learning, predictive modeling, and analytical system architecture. His focus is on the intersection of data science tools and techniques; data transport, processing and storage technologies; and the data management strategy and practices that can unlock data driven capabilities for an organization. A frequent speaker at the O'Reilly Strata Conferences, John is host of the perennially popular workshop: Building A Data Platform.
John will also be hosting office hours at Data Day Texas.
Carl Anderson (NYC)
Carl Anderson is the Director of Data Science at Warby Parker in New York overseeing data engineering, data science, supporting the broader analytics org, and creating a data-driven organization. He has had a broad-ranging career, mostly in scientific computing, covering areas such as healthcare modeling, data compression, robotics, and agent based modeling. He holds a Ph.D. in mathematical biology from the University of Sheffield, UK.
He is the author of "Creating a Data-Driven Organization" (O'Reilly, 2015).
While at Data Day Texas, Carl will also be signing his new book, Creating a Data-Driven Organization.
Lukas Biewald (SF Bay)
Lukas Biewald is the founder and CEO of CrowdFlower. Founded in 2007, CrowdFlower provides Labor-on-Demand to help companies outsource high-volume, repetitive tasks to a massively-distributed global workforce.
Before founding CrowdFlower, Lukas was a senior scientist and manager within the Ranking and Management Team at Powerset, Inc., acquired by Microsoft in 2008. He led the Search Relevance Team for Yahoo! Japan after graduating from Stanford University with a B.S. in Mathematics and an M.S. in Computer Science. Recently, Lukas won the Netexplorateur Award for GiveWork – a collaboration with Samasource that brings digital work to refugees worldwide. Lukas is also an expert level Go player.
While at Data Day Texas, Lukas will also be hosting office hours.
Trey Blalock (Seattle)
Trey Blalock (GIAC-GWAPT, GIAC-GPEN, GIAC-GCFA, CISA, CISM, CISSP, SSCP, NSA-IAM) has served as Manager of Global Security Operations / Security Architect for one of the world's largest financial transaction hubs (S1 Corporation), overseeing all aspects of security for hundreds of web-banking environments, ATM networks, and point-of-sale transaction networks worldwide.
He currently serves on the National Board of Information Security Examiners (NBISE) Operational Security Testing Panel, designing comprehensive testing solutions to evaluate the skill levels of commercial penetration testers as well as military red team and blue team technicians. These tests are primarily intended for government and military use in identifying above-average talent in these areas.
He has over ten years of experience providing penetration testing and assessment services to hundreds of clients in the financial, government, retail, chemical, oil & gas, medical, educational, legal, telecom, and law enforcement sectors. See his full bio on the following page.
While in Austin, Trey will be offering an encore presentation of his Pentesting 101 course.
Kurt Brown (SF Bay)
Kurt Brown leads the Data Platform team at Netflix. His group architects and manages the technical infrastructure underpinning the company’s analytics. The Netflix data infrastructure includes various big data technologies (Hadoop, Hive, and Pig), Netflix open-sourced applications and services (Lipstick and Genie), and traditional BI tools (Teradata and MicroStrategy).
Following his presentation at Data Day, Kurt will be holding office hours.
Laine Campbell (Las Vegas)
Laine Campbell specializes in database architecture and operations, particularly MySQL and Cassandra. Laine is currently CTO of OrderWithMe. Most recently, Laine was a co-founder at Pythian, where she led the open source database practice. Prior to that, Laine founded and led PalominoDB, then Blackbird for 8 years, where her team of DBAs supported many of the most exciting database infrastructures in the industry. Before that, she designed, built and supported the Travelocity databases for 8 years with a remarkable team. She lives in Las Vegas, and travels extensively. Laine has been around the block.
Laine is passionate about supporting members of underserved populations to gain experience, skills and jobs in technology. She is an advocate of bringing women, people of color and LGBTQ individuals into the world of technology, and supporting them in their careers. She is also passionate about open source technologies and the commoditization of IT, and how it can support communities and the general welfare of the individual.
Ed Capriolo (NYC)
Ed Capriolo is a Data Architect at the Huffington Post. Previously, he was a software developer at Media6Degrees. Ed is the organizer of the NYC Cassandra Meetup group, as well as an Apache Hive PMC member and committer. Ed is the author of multiple books, including the Cassandra High Performance Cookbook and Programming Hive.
While at Data Day Texas, Ed will also be signing the soon-to-be-released second edition of his O'Reilly book, Programming Hive.
Michelle Casbon (San Antonio / SF Bay)
Michelle Casbon is a Senior Data Science Engineer at Idibon, where she is contributing to the goal of bringing language technologies to all the world’s languages. Her development experience spans a decade across various industries, including media, investment banking, healthcare, retail, and geospatial services. Michelle completed a Masters at the University of Cambridge, focusing on NLP, speech recognition, speech synthesis, and machine translation. She loves working with open source technologies and has had a blast contributing to the Apache Spark project. Holding technical conversations and learning from the people she meets is her favorite part of Data Day Texas.
Ted Dunning (SF Bay)
Ted Dunning is Chief Applications Architect at MapR Technologies, a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects, and a mentor for Apache Storm. He contributed to Mahout clustering, classification, and matrix decomposition algorithms and helped expand the new version of the Mahout Math library. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has been issued 24 patents to date. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. Ted is co-author, along with Ellen Friedman, of the recent O'Reilly Media publications Practical Machine Learning: Innovations in Recommendation and Practical Machine Learning: A New Look at Anomaly Detection. By the way, Ted bought the beer at the first Hadoop meetup.
Chris Fregly (SF Bay)
Chris Fregly is Principal Data Solutions Engineer for the IBM Spark Technology Center. Chris is an Apache Spark contributor, the organizer of the Bay Area Advanced Spark Meetup, and author of the upcoming books Advanced Spark and Spark Streaming in Action. Chris has 15+ years of distributed big data systems experience across many domains including media/entertainment, banking, insurance, and travel. Previously, Chris was an engineer at Databricks, Streaming Data Engineer at Netflix, Platform/Data Engineer at Playboy Enterprises, and a Distributed Systems Engineer at BEA Systems.
Twitter: @cfregly
Ellen Friedman (SF Bay)
Ellen Friedman is a solutions consultant, scientist and author, currently writing about a variety of open source and big data topics including being co-author of Mahout in Action (Manning), the Practical Machine Learning series from O’Reilly, and the newest title, Time Series Databases (O’Reilly). She is a committer on the Apache Mahout project, a contributor to Apache Drill and has been an invited speaker at Berlin Buzzwords 2013, the Philly ETE 2014 conference and keynote speaker for NoSQL Matters 2014 in Barcelona. With a Ph.D. in biochemistry and years of work writing on a variety of scientific and computing topics, she is an experienced communicator. She’s also co-author of a book of magic-themed cartoons, A Rabbit Under the Hat.
Luca Garulli (London, UK)
Luca Garulli is the CEO and Founder of Orient Technologies, and the original author of OrientDB. Luca started working with storage algorithms in 1998 and created the first production-ready version of OrientDB in early 2010 after 17 years of experience working with other DBMSs. Luca is a member of the Sun Microsystems JDO 1.0 and 2.0 Expert Groups that wrote the JDO standard. He has also published various tech articles in Technet, Computer Programming, IoProgrammer, and Week.it magazines.
Luca will be holding office hours at Data Day. He will also be speaking at Graph Day.
Jonathan Gray (SF Bay)
Jonathan Gray, founder and CEO of Cask, is an entrepreneur and software engineer with a background in startups, open source, and all things data. Prior to founding Cask, Jonathan was a software engineer at Facebook where he drove HBase engineering efforts, including Facebook Messages and several other large-scale projects, from inception to production.
An open source evangelist, Jonathan was responsible for helping build the Facebook engineering brand through developer outreach and refocusing the open source strategy of the company. Prior to Facebook, Jonathan founded Streamy.com, where he became an early adopter of Hadoop and HBase and is now a core contributor and active committer in the community.
Jonathan holds a Bachelor’s degree in Electrical and Computer Engineering and Business Administration from Carnegie Mellon University.
Joel Grus (Seattle)
Joel Grus is a software engineer at Google. Before that he worked as a data scientist at multiple startups. He lives in Seattle, where he regularly attends data science happy hours.
While at Data Day Texas, Joel will also be signing his recently released O'Reilly book, Data Science From Scratch.
Sarah Guido (NYC)
Sarah Guido is a data scientist at Bitly and is interested in all things Python, data, and machine learning. She is a co-organizer of the PyGotham conference and the NYC Python meetup. Excited to share her passion for data with others, she has spoken at conferences such as PyCon, OSCON and PyData, and is writing an O'Reilly book on machine learning. Prior to joining Bitly, she worked in a few other startups and graduated from the University of Michigan’s School of Information.
While at Data Day Texas, Sarah will also be signing her new book, Introduction to Machine Learning with Python.
Russell Jurney (SF Bay)
Russell Jurney is founder and CEO of Relato. Russell has over a decade of experience building analytic applications, from casino gaming to inbox analytics. Russell is passionate about graphs and sees networks in the world around him. Mapping markets to achieve a deeper understanding of how they work is exciting work.
Prior to Relato, Russell was a Data Scientist in Residence at The Hive, where he helped launch E8 Security as their first engineer. Before that he was Evangelist at Hortonworks, after being Senior Data Scientist in product analytics at LinkedIn. Russell is the author of the recently released O'Reilly book Agile Data Science, as well as co-author of the soon-to-be-released O'Reilly book Big Data for Chimps. Russell is originally from Atlanta, GA. He lives in Pacifica, California with Bella the Data Dog.
While at Data Day, Russell will be holding office hours and signing copies of Big Data for Chimps. He will also be speaking at Graph Day on Sunday, January 18.
Jay Kreps (SF Bay)
Jay Kreps is the original author of multiple well-known projects, including Apache Kafka, Apache Samza, Voldemort, and Azkaban. Formerly Principal Staff Engineer at LinkedIn, Jay is co-founder and CEO of Confluent, a company built around realtime data streams and the open source messaging system Apache Kafka. Jay is the author of the upcoming O'Reilly book, I Heart Logs: Event Data, Stream Processing, and Data Integration.
While at Data Day Texas, Jay will also be signing his new book, I Heart Logs.
Patrick McFadin (SF Bay)
Patrick McFadin is regarded as one of the foremost experts on Apache Cassandra and data modeling techniques. As the Chief Evangelist for Apache Cassandra and consultant for DataStax, he has helped build some of the largest deployments in the world. Prior to DataStax, he was Chief Architect at Hobsons, an education services company. There, he spoke often on web application design and performance.
Wes McKinney (SF Bay)
Wes McKinney is a software engineer at Cloudera. Prior to that, Wes was co-founder of DataPad, and CTO and co-founder of Lambda Foundry, Inc. From 2010 to 2012, he served as a Python consultant to hedge funds and banks while developing pandas, a widely used Python data analysis library. From 2007 to 2010, he researched global macro and credit trading strategies at AQR Capital Management. He graduated from MIT with an S.B. in Mathematics. He is on leave from the Duke University Ph.D. program in Statistics. Wes is the author of the O'Reilly book Python for Data Analysis.
Wes McKinney's Blog
Ryan Mitchell (Somerville, MA)
Ryan Mitchell is a Software Engineer at LinkeDrive in Boston, where she develops their API and data analysis tools. She is a graduate of Olin College of Engineering, and is a master's degree student at Harvard University School of Extension Studies. Prior to joining LinkeDrive, she was a Software Engineer working on web scraping and data analysis at Abine. Ryan is the author of Instant Web Scraping with Java and the upcoming O'Reilly book Web Scraping with Python.
Twitter: @Kludgist
While at Data Day Texas, Ryan will also be signing her new book, Web Scraping with Python.
Christopher Moody (SF Bay)
Chris Moody loves high-performance computing, high dimensions & high fashion. He loves learning the beautiful symmetries between physics, data, and analytics. Went to Caltech, did astrostats & supercomputing and now Data Labs at Stitch Fix. Currently enjoying coding up word2vec, Gaussian Processes, Deep RNNs and t-SNE.
Robert Munro (San Francisco)
Robert Munro is the CEO of Idibon, founded with the goal of bringing language technologies to all the world’s languages. He is a world leader in applying big data analytics to human communications, having worked in many diverse environments, from Sierra Leone, Haiti and the Amazon to London, Sydney and San Francisco. He completed a PhD in Computational Linguistics as a Graduate Fellow at Stanford University. Outside of work, he has learned about the world’s diversity by cycling more than 20,000 kilometers across 20 countries, mostly through the mountains.
Diego Oppenheimer (Seattle)
Diego Oppenheimer is a data geek with a passion for sports and cooking. He has worked in multiple industries in different capacities around business intelligence and data analytics. Prior to founding Algorithmia, where he serves as CEO, he spent over five years at Microsoft, where he had the chance to deliver some of the most widely used data analysis software in the world, including Excel, SQL Server, and Power Pivot. He received his Bachelor's in Information Systems Management and Master's degree in IS - Business Intelligence and Data Analytics from Carnegie Mellon University.
Diego will also be hosting office hours at Data Day Texas.
Claudia Perlich (NYC)
Prior to joining Dstillery (formerly Media6Degrees), Claudia Perlich spent five years working at the Data Analytics Research group at the IBM T.J. Watson Research Center, concentrating on research in data analytics and machine learning for complex real-world domains and applications. She has published over 30 scientific papers and holds multiple patents in the area of machine learning. Claudia has won many data mining competitions, including the prestigious 2007 KDD CUP on movie ratings, the 2008 KDD CUP on breast-cancer detection, and the 2009 KDD CUP on churn and propensity predictions for telecommunication customers. Claudia received her Ph.D. in Information Systems from Stern School of Business, New York University in 2005, and holds a Master of Computer Science from Colorado University.
Claudia will also be hosting office hours at Data Day Texas.
Ben Reiter (SF Bay)
Ben Reiter is a Senior Engineer on the Architecture team at Vungle. His responsibilities include planning and implementing a new shared data pipeline throughout Vungle. His focus for the past year has been on designing and implementing the way data flows through and is processed within Vungle's architecture.
Ben will be walking through Vungle's Spark use case.
Eric Schmidt (Seattle)
Twitter: @DJ Rhythma
Fangjin Yang (SF Bay)
Fangjin Yang is one of the main committers to the open source Druid project and one of the first developers at Metamarkets, a San Francisco-based data startup. Fangjin previously worked on diagnostic optimization algorithms at Cisco Systems. He holds a BASc in Electrical Engineering and a MASc in Computer Engineering from the University of Waterloo, Canada.