The following speakers have been confirmed for Data Day Texas 2016, with many more to come.
We are currently accepting submissions for Data Day 2016. Details can be found on our proposals page.
John Akred (SF Bay)
John Akred is the Founder and CTO of Silicon Valley Data Science. In the business world, John Akred likes to help organizations become more data driven. He has over 15 years of experience in machine learning, predictive modeling, and analytical system architecture. His focus is on the intersection of data science tools and techniques; data transport, processing and storage technologies; and the data management strategy and practices that can unlock data driven capabilities for an organization. A frequent speaker at the O'Reilly Strata Conferences, John is host of the perennially popular workshop: Building A Data Platform.
John will also be hosting office hours at Data Day Texas.
Carl Anderson (NYC)
Carl Anderson is the Director of Data Science at Warby Parker in New York, where he oversees data engineering and data science, supports the broader analytics org, and works to create a data-driven organization. He has had a broad-ranging career, mostly in scientific computing, covering areas such as healthcare modeling, data compression, robotics, and agent-based modeling. He holds a Ph.D. in mathematical biology from the University of Sheffield, UK.
He is the author of "Creating a Data-Driven Organization" (O'Reilly, 2015).
While at Data Day Texas, Carl will also be signing his new book, Creating a Data-Driven Organization.
Trey Blalock (Seattle)
Trey Blalock (GIAC-GWAPT, GIAC-GPEN, GIAC-GCFA, CISA, CISM, CISSP, SSCP, NSA-IAM) has served as Manager of Global Security Operations / Security Architect for one of the world's largest financial transaction hubs (S1 Corporation), overseeing all aspects of security for hundreds of web-banking environments, ATM networks, and point-of-sale transaction networks worldwide.
He currently serves on the National Board of Information Security Examiners (NBISE) Operational Security Testing Panel, designing comprehensive testing solutions to evaluate the skill levels of commercial penetration testers as well as military red team and blue team technicians. These tests are primarily intended for use by government and military organizations to identify above-average talent in these areas.
He has over ten years of experience providing penetration testing and assessment services to hundreds of clients in the financial, government, retail, chemical, oil & gas, medical, educational, legal, telecom, and law enforcement sectors. See his full bio on the following page.
While in Austin, Trey will be offering an encore presentation of his Pentesting 101 Course.
Kurt Brown (SF Bay)
Kurt Brown leads the Data Platform team at Netflix. His group architects and manages the technical infrastructure underpinning the company’s analytics. The Netflix data infrastructure includes various big data technologies (Hadoop, Hive, and Pig), Netflix open sourced applications and services (Lipstick and Genie), and traditional BI tools (Teradata and MicroStrategy).
Following his presentation at Data Day, Kurt will be holding office hours.
Laine Campbell (Las Vegas)
Laine Campbell specializes in database architecture and operations, particularly MySQL and Cassandra. Laine is currently CTO of OrderWithMe. Most recently, Laine was a co-founder at Pythian, where she led the open source database practice. Prior to that, Laine founded and led PalominoDB, then Blackbird for 8 years, where her team of DBAs supported many of the most exciting database infrastructures in the industry. Before that, she designed, built and supported the Travelocity databases for 8 years with a remarkable team. She lives in Las Vegas, and travels extensively. Laine has been around the block.
Laine is passionate about supporting members of underserved populations to gain experience, skills and jobs in technology. She is an advocate of bringing women, people of color and LGBTQ individuals into the world of technology, and supporting them in their careers. She is also passionate about open source technologies and the commoditization of IT, and how it can support communities and the general welfare of the individual.
While at Data Day Texas, Laine will also be signing her soon to be released O'Reilly book, Databases at Scale.
Ed Capriolo (NYC)
Ed Capriolo is a Data Architect at the Huffington Post. Previously, he was a software developer at Media 6 degrees. Ed is the organizer of the NYC Cassandra Meetup group, as well as an Apache Hive PMC committer / member. Ed is the author of multiple books, including the Cassandra High Performance Cookbook and Programming Hive.
While at Data Day Texas, Ed will also be signing the soon to be released second edition of his O'Reilly book, Programming Hive.
Michelle Casbon (San Antonio / SF Bay)
Michelle Casbon is a Senior Data Science Engineer at Idibon, where she is contributing to the goal of bringing language technologies to all the world’s languages. Her development experience spans a decade across various industries, including media, investment banking, healthcare, retail, and geospatial services. Michelle completed a Master’s degree at the University of Cambridge, focusing on NLP, speech recognition, speech synthesis, and machine translation. She loves working with open source technologies and has had a blast contributing to the Apache Spark project. Holding technical conversations and learning from the people she meets is her favorite part of Data Day Texas.
Ted Dunning (SF Bay)
Ted Dunning is Chief Applications Architect at MapR Technologies, a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects, and a mentor for Apache Storm. He contributed to Mahout clustering, classification, and matrix decomposition algorithms and helped expand the new version of the Mahout Math library. Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has been issued 24 patents to date. Ted has a PhD in computing science from the University of Sheffield. When he’s not doing data science, he plays guitar and mandolin. Ted is co-author, along with Ellen Friedman, of the recent O'Reilly Media publications Practical Machine Learning: Innovations in Recommendation and Practical Machine Learning: A New Look at Anomaly Detection. By the way, Ted bought the beer at the first Hadoop meetup.
Chris Fregly (SF Bay)
Chris Fregly is Principal Data Solutions Engineer for the IBM Spark Technology Center. Chris is an Apache Spark contributor, the organizer of the Bay Area Advanced Spark Meetup, and author of the upcoming books Advanced Spark and Spark Streaming in Action. Chris has 15+ years of distributed big data systems experience across many domains including media/entertainment, banking, insurance, and travel. Previously, Chris was an engineer at Databricks, Streaming Data Engineer at Netflix, Platform/Data Engineer at Playboy Enterprises, and a Distributed Systems Engineer at BEA Systems.
Twitter: @cfregly
Ellen Friedman (SF Bay)
Ellen Friedman is a solutions consultant, scientist and author who currently writes about a variety of open source and big data topics. She is co-author of Mahout in Action (Manning), the Practical Machine Learning series from O’Reilly, and the newest title, Time Series Databases (O’Reilly). She is a committer on the Apache Mahout project and a contributor to Apache Drill, and has been an invited speaker at Berlin Buzzwords 2013 and the Philly ETE 2014 conference and a keynote speaker at NoSQL Matters 2014 in Barcelona. With a Ph.D. in biochemistry and years of work writing on a variety of scientific and computing topics, she is an experienced communicator. She’s also co-author of a book of magic-themed cartoons, A Rabbit Under the Hat.
Jonathan Gray (SF Bay)
Jonathan Gray, founder and CEO of Cask, is an entrepreneur and software engineer with a background in startups, open source, and all things data. Prior to founding Cask, Jonathan was a software engineer at Facebook where he drove HBase engineering efforts, including Facebook Messages and several other large-scale projects, from inception to production.
An open source evangelist, Jonathan was responsible for helping build the Facebook engineering brand through developer outreach and refocusing the open source strategy of the company. Prior to Facebook, Jonathan founded Streamy.com, where he became an early adopter of Hadoop and HBase and is now a core contributor and active committer in the community.
Jonathan holds a Bachelor’s degree in Electrical and Computer Engineering and Business Administration from Carnegie Mellon University.
Joel Grus (Seattle)
Joel Grus is a software engineer at Google. Before that he worked as a data scientist at multiple startups. He lives in Seattle, where he regularly attends data science happy hours.
While at Data Day Texas, Joel will also be signing his recently released O'Reilly book, Data Science From Scratch.
Sarah Guido (NYC)
Sarah Guido is a data scientist at Bitly and is interested in all things Python, data, and machine learning. She is a co-organizer of the PyGotham conference and the NYC Python meetup. Excited to share her passion for data with others, she has spoken at conferences such as PyCon, OSCON and PyData, and is writing an O'Reilly book on machine learning. Prior to joining Bitly, she worked in a few other startups and graduated from the University of Michigan’s School of Information.
While at Data Day Texas, Sarah will also be signing her new book, Introduction to Machine Learning with Python.
Russell Jurney (SF Bay)
Russell Jurney is founder and CEO of Relato. Russell has over a decade of experience building analytic applications, from casino gaming to inbox analytics. Russell is passionate about graphs and sees networks in the world around him. Mapping markets to achieve a deeper understanding of how they work is exciting work.
Prior to Relato, Russell was a Data Scientist in Residence at The Hive, where he helped launch E8 Security as their first engineer. Before that he was Evangelist at Hortonworks, after being Senior Data Scientist in product analytics at LinkedIn. Russell is the author of the recently released O'Reilly book Agile Data Science as well as co-author of the soon to be released O'Reilly book Big Data for Chimps. Russell is originally from Atlanta, GA. He lives in Pacifica, California with Bella the Data Dog.
While at Data Day, Russell will be holding office hours and signing copies of Big Data for Chimps. He will also be speaking at Graph Day on Sunday, January 18.
Jay Kreps (SF Bay)
Jay Kreps is the original author of multiple well-known projects including Apache Kafka, Apache Samza, Voldemort, and Azkaban. Formerly Principal Staff Engineer at LinkedIn, Jay is co-founder and CEO at Confluent, a company built around realtime data streams and the open source messaging system Apache Kafka. Jay is the author of the upcoming O'Reilly book, I Heart Logs: Event Data, Stream Processing, and Data Integration.
While at Data Day Texas, Jay will also be signing his new book, I Heart Logs.
Patrick McFadin (SF Bay)
Patrick McFadin is regarded as one of the foremost experts on Apache Cassandra and data modeling techniques. As the Chief Evangelist for Apache Cassandra and consultant for DataStax, he has helped build some of the largest deployments in the world. Prior to DataStax, he was Chief Architect at Hobsons, an education services company. There, he spoke often on web application design and performance.
Ryan Mitchell (Somerville, MA)
Ryan Mitchell is a Software Engineer at LinkeDrive in Boston, where she develops their API and data analysis tools. She is a graduate of Olin College of Engineering and is a Master's degree student at Harvard University School of Extension Studies. Prior to joining LinkeDrive, she was a Software Engineer working on web scraping and data analysis at Abine. Ryan is the author of Instant Web Scraping with Java and the upcoming O'Reilly book Web Scraping with Python.
Twitter: @Kludgist
While at Data Day Texas, Ryan will also be signing her new book, Web Scraping with Python.
Christopher Moody (SF Bay)
Chris Moody loves high-performance computing, high dimensions & high fashion. He loves learning the beautiful symmetries between physics, data, and analytics. Went to Caltech, did astrostats & supercomputing and now Data Labs at Stitch Fix. Currently enjoying coding up word2vec, Gaussian Processes, Deep RNNs and t-SNE.
Robert Munro (San Francisco)
Robert Munro is the CEO of Idibon, founded with the goal of bringing language technologies to all the world’s languages. He is a world leader in applying big data analytics to human communications, having worked in many diverse environments, from Sierra Leone, Haiti and the Amazon to London, Sydney and San Francisco. He completed a PhD in Computational Linguistics as a Graduate Fellow at Stanford University. Outside of work, he has learned about the world’s diversity by cycling more than 20,000 kilometers across 20 countries, mostly through the mountains.
Eric Schmidt (Seattle)
Twitter: @DJ Rhythma
The following speakers have been confirmed for Graph Day on Sunday:
Marko Rodriguez (Santa Fe)
Dr. Marko A. Rodriguez is Director of Engineering at DataStax as well as a co-founder of and contributor to TinkerPop. Prior to joining DataStax, Marko was co-founder of Aurelius, LLC, the company behind the Titan graph database. Marko has focused his academic and commercial career on graph theory, network science, and graph-system architecture and development. He serves as the lead developer of the Gremlin graph traversal language and Faunus graph analytics engine. Marko received his Bachelor’s in Cognitive Science from UC San Diego and his Master’s and Ph.D. in Computer Science from UC Santa Cruz, and was a Director’s Fellow at the Center for Nonlinear Studies of the Los Alamos National Laboratory.
Matthias Broecheler (SF Bay)
Dr. Matthias Broecheler, best known as lead developer of the Titan distributed graph database, is Director of Engineering responsible for DSE graph at DataStax. Most recently, Matthias was managing partner of Aurelius. Matthias has researched large scale graph database systems for more than 5 years. His award-winning research includes high performance index structures and query answering algorithms for graph structured data. In addition, he developed the Probabilistic Similarity Logic (PSL) machine learning framework to analyze and reason about multi-relational data. Matthias holds a Ph.D. in Computer Science from the University of Maryland.