Apache Cassandra at Data Day Texas
At the upcoming edition of Data Day, we will be hosting multiple tracks devoted to Apache Cassandra and its ecosphere. We are still accepting proposals for talks and mini-workshops in the Cassandra track. Are you doing something cool with Apache Cassandra? Share it with the community at Data Day Texas!
Confirmed Sessions
Cassandra and the Cloud
Jonathan Ellis - DataStax
Is Apache Cassandra still relevant in an era of hosted cloud databases? DataStax CTO Jonathan Ellis will discuss Cassandra’s strengths and weaknesses relative to Amazon DynamoDB, Microsoft CosmosDB, and Google Cloud Spanner.
Cassandra and Kubernetes
Ben Bromhead - Instaclustr
Kubernetes has become the most popular container orchestration and management API with cloud-native support from AWS, GCP, Azure and a growing enterprise support ecosystem. Leveraging Kubernetes to provide tested, repeatable deployment patterns that follow best practices is a win for both developers and operators.
In this talk, Ben Bromhead, CTO of Instaclustr, will introduce the Cassandra Kubernetes Operator, a Cassandra controller that provides robust, managed Cassandra deployments on Kubernetes. By adopting Kubernetes and Cassandra, you can rapidly and easily provide DBaaS-like services to the rest of your team and gain a simple on-ramp to true multi-cloud capabilities for your environment.
Cassandra Architecture FTW!
Jeff Carpenter - DataStax
In this talk we’ll take a deep dive into the architecture of Apache Cassandra to learn why it succeeds at scales where other databases fail. We’ll introduce the key distributed system design elements that Cassandra is built on, the problems that Cassandra solves especially well, and how to pair Cassandra with complementary technologies to build even more powerful systems. If you’ve heard about Cassandra and wondered if it was right for your use case, this talk is for you.
Cassandra Pluggable Storage Engine
Dikang Gu - Facebook / Pengchao Wang - Facebook
Instagram runs one of the largest Cassandra deployments. This year, the Cassandra team at Instagram has been working on a project to make Apache Cassandra's storage engine pluggable and to implement a new RocksDB-based storage engine for Cassandra. The new storage engine can improve the performance of Apache Cassandra significantly.
In this talk, we will describe the motivation, the different approaches we considered, the high-level design of the solution we chose, and the performance metrics from benchmark and production environments.
Cassandra Performance Tuning and Crushing SLAs
Jon Haddad - The Last Pickle
In an ideal world, everything would just be fast out of the box. Unfortunately, we’re not quite there yet. Getting the best performance out of a database means understanding your entire system, from the hardware and OS to the database’s internals. In this talk, Jon Haddad will discuss a wide range of performance tuning techniques. We’ll start by examining how to measure and interpret the statistics from the different components on our machines. Once we understand how to identify what exactly is holding our performance back, we can take the necessary steps to address the problem and move to the next issue. We’ll examine common pitfalls and problems, learning how to tune counters, compaction, garbage collection, compression, and more. If you’re working on a low-latency, high-throughput system, you won’t want to miss this talk.
What have we done!? 10 years of Cassandra
Patrick McFadin - DataStax
Ten years ago, a couple of engineers at Facebook put up a project on Google Code and a legend was born. The project has grown, and its users have seen enormous success. Are we ready to say Apache Cassandra has won and have a party? Let me present the evidence and we can decide as a group. No other database has delivered on the initial promises of being a reliable, performant, multi-datacenter source of record for important data. No other project, vendor or cloud has done as well or, I would argue, ever will.
I will highlight the main use cases and data models that have put Apache Cassandra ahead of its peers. If you are new to Apache Cassandra, come learn how you are lied to by every other database that makes this claim. If you are a veteran, let me revive some of the thinking that got you here in the first place and give you some fresh reasons to love this database of ours.
Performance Data Modeling at Scale
Aaron Ploetz - Target
The most important aspect of backing your application with Cassandra is building a good data model. In addition to designing a query-based model that distributes well, performance at scale should also be a prime consideration. After all, you want good things to happen when your application gets a sudden 10x increase in traffic. At Target, the holiday season hits our infrastructure hard, and engineering to withstand that 10x increase is our reality.
In this presentation, we will examine real-world use cases and data processing scenarios. We will cover Cassandra data modeling techniques and considerations for both high performance and large scale. Performance engineering of existing models will also be discussed, along with ways to shave off that extra bit of latency.
Intended audience: Cassandra DBAs, developers, and data modelers.
Go big or go home! Does it still make sense to do Big Data with Small Nodes?
Glauber Costa - ScyllaDB
In the world of Big Data, scaling out is the norm. The prospect of running massive computation on commodity hardware is enticing, but what does "commodity hardware" really mean? The usual 8-core setup people have been deploying can now be found on phones, and every cloud provider makes boxes with 32 cores and up available at the click of a button. And still, a lot of Big Data deployments are trapped in a sea of small-box clusters.
With the advent of scalable platforms like ScyllaDB, node performance is no longer an issue, and doubling the size of the nodes will usually double the available storage, memory, and processing power. So what other reasons stop people from going big in the Cloud Native world? This talk will explore some of the conventional wisdom on the subject and examine which parts of it are true and which aren't.
Confirmed Speakers
Ben Bromhead (SF Bay)
Ben will be giving the following presentation: Cassandra and Kubernetes.
Jeff Carpenter (Scottsdale, Arizona) @jscarp
Jeff will be giving the following Cassandra presentation: Cassandra Architecture FTW!
Jonathan Ellis (Austin) @spyced
Jonathan will be presenting the following Cassandra session: Cassandra and the Cloud.
Jon Haddad (Los Angeles) @rustyrazorblade
Jon will be giving the following Cassandra presentation: Cassandra Performance Tuning and Crushing SLAs.
Patrick McFadin (SF Bay) @patrickmcfadin
Patrick will be giving the following Cassandra presentation: What have we done!? 10 years of Cassandra.
Dikang Gu (San Francisco) @dikanggu
Dikang will be co-presenting the following Cassandra session: Cassandra Pluggable Storage Engine.
Aaron Ploetz (Minneapolis) @APloetz
Aaron will be presenting the following Cassandra session: Performance Data Modeling at Scale.
Glauber Costa (Toronto) @glcst
Before ScyllaDB, Glauber worked on virtualization in the Linux kernel for 10 years, with contributions ranging from the Xen and KVM hypervisors to all sorts of guest functionality and containers.
Glauber will be presenting the following session: Go big or go home! Does it still make sense to do Big Data with Small Nodes?