Some of the topics covered at Data Day

This is a partial, not complete, list of topics and talks covered at Data Day 2020.
We will be adding more throughout Tuesday and Wednesday.

90 minute workshops

Intro to RDF/SPARQL and RDF*/SPARQL* - Thomas Cook - Cambridge Semantics
Ontology for Data Scientists - Michael Uschold - Semantic Arts
Graph Feature Engineering for More Accurate Machine Learning - Amy Hodler / Justin Fine - Neo4j

Data Science / Machine Learning

Working Together as Data Teams - Jesse Anderson - Big Data Institute
Data Governance and FATTER AI - Devangana Khokhar - Thoughtworks
Shining a Light on Dark Documents - Dale Markowitz - Google
Bigger data vs. better math: which is most effective in ML? - Brent Schneeman - Alegion
Practicing data science: A collection of case studies - Rosaria Silipo - KNIME
Ontology for Data Scientists - Michael Uschold - Semantic Arts
MLflow: An open platform to simplify the machine learning lifecycle - Corey Zumar - Databricks
Responsible AI Requires Context and Connections - Amy Hodler - Neo4j
Creating Explainable AI with Rules - Jans Aasman - Franz Inc.
Graph Feature Engineering for More Accurate Machine Learning - Amy Hodler / Justin Fine - Neo4j
AI/ML Model Serving using Apache Pulsar Functions - Karthik Ramasamy - Streamlio

Human in the Loop Machine Learning

Information Extraction with Humans in the Loop - Dr. Anna Lisa Gentile - IBM Research Almaden
Human Centered Machine Learning - Robert Munro - Author / Serial Founder
Cost-Optimized Data Labeling Strategy - Jennifer Prendki - Alectio
Humans, machines and disagreement: Lessons from production at Stitch Fix - Brad Klingenberg - Stitch Fix
Learning sequential tasks from human feedback - Brad Knox - Bosch USA
How to trust your Human-In-The-Loop (HITL) data annotations - Emanuel Ott - iMerit
From Stanford to Startup: making academic human-in-the-loop technology work in the real world - Abraham Starosta / Tanner Gilligan

Data Engineering / Ops / Architecture (see also pipelines)

Architecting Production IoT Analytics - Paige Roberts - Vertica
Where’s my lookup table? Modeling relational data in a denormalized world - Rick Houlihan - Amazon Web Services
Serverless Data Integration - Gaja Krishna Vaidyanatha - DBPerfMan LLC

Data / Databases / Theory

mm-ADT A Multi-Model Abstract Data Type - Marko Rodriguez - RRedux
Category Theory for the Working Database Programmer - Heidi Waterhouse - LaunchDarkly
The Death of Data: 2020 - Ryan Wisnesky - Conexus

Apache Cassandra

The Next Five Years in Databases - Jonathan Ellis - DataStax
Cassandra 4.0 In Action - Jeffrey Carpenter - DataStax
Scaling Your Cassandra Cluster For Fluctuating Workloads - Brian Hall - Expero

Apache Kafka

Managing your Kafka in an explosive growth environment - Alon Gavra - AppsFlyer
Stateful Streaming Application Integration - Leveraging Kafka Streams Processor API with KSQL DB - Clay Lambert - Expero
Apache Kafka - Ask Me Anything - Jesse Anderson

Data Pipelines

Defeating pipeline debt with Great Expectations - Abe Gong - Superconductive Health
How Declarative Configurations and Automation Can Prevent A Data Mutiny - Sean Knapp - Ascend.io
MLflow: An open platform to simplify the machine learning lifecycle - Corey Zumar - Databricks
Immutable Data Pipelines for Fun and Profit - Rob McDaniel - Sigma IQ
How to start your first computer vision project - Sanghamitra Deb - Chegg
Moving Your Machine Learning Models to Production with TensorFlow Extended - Jonathan Mugan - DeUmbra
Machine Learning Counterclockwise - Shawn Rutledge - Sigma IQ
Using interactive Querying of Streaming Data for Anomaly Detection - Karthik Ramasamy - Streamlio

Time Series Data

Deep Learning and the Analysis of Time Series Data - Dr. Bivin Sadler - Southern Methodist University
Time-Series analysis in healthcare: A practical approach - Sanjay Joshi - Dell
JGTSDB: A JanusGraph/TimescaleDB Mashup - Ted Wilmes - Expero
Modeling, Querying, and Seeing Time Series Data within a Self-Organizing Mesh Network - Denise Gosnell - DataStax

Graph Databases

GQL: Get Ready for a Standard Graph Query Language - Stefan Plantikow - Neo4j
Automated Encoding of Knowledge from Unstructured Natural Language Text into a Graph Database - Chris Davis - Lymba
Intro to RDF/SPARQL and RDF*/SPARQL* - Thomas Cook - Cambridge Semantics
Managing Relationships in the Healthcare Industry with Graphileon: A CHG Healthcare Use Case - Tyler Glaittli - CHG Healthcare
Graph Feature Engineering for More Accurate Machine Learning - Amy Hodler / Justin Fine - Neo4j
Query Processor of GraphflowDB and Techniques for the Graph Databases of 2020s - Semih Salihoglu - University of Waterloo
Responsible AI Requires Context and Connections - Amy Hodler - Neo4j

Knowledge Graphs

TinkerPop 2020 - Joshua Shinavier - Uber
A Brief History of Knowledge Graph's Main Ideas - Juan Sequeda - data.world
Knowledge Graph for drug discovery - Dr. Ying Ding - University of Texas at Austin

Graphs - Apache TinkerPop

Modeling, Querying, and Seeing Time Series Data within a Self-Organizing Mesh Network - Denise Gosnell - DataStax
Building a Graph User-Interface for Malware-Analysis - Stefan Hausotte - G DATA / Ethan Hasson - Expero
TinkerPop 2020 - Joshua Shinavier - Uber
JGTSDB: A JanusGraph/TimescaleDB Mashup - Ted Wilmes - Expero

Data / AI in Health

Knowledge Graph for drug discovery - Dr. Ying Ding - University of Texas at Austin
Time-Series analysis in healthcare: A practical approach - Sanjay Joshi - Dell
Managing Relationships in the Healthcare Industry with Graphileon: A CHG Healthcare Use Case - Tyler Glaittli - CHG Healthcare