Archived Events

Nov 14 2013
Realtime and Batch Data Processing @ Twitter (Dmitriy Ryaboy)
Speaker: Dmitriy Ryaboy
System: Summingbird

Summingbird, an open-source project recently released by Twitter, allows engineers to easily build data processing pipelines that work both in a streaming context provided by Twitter Storm and in an offline batch context through Apache Hadoop. This talk will cover the practical motivation for building such a system and explain the core Summingbird architecture and components.

Oct 24 2013
Beyond Bigtable: Challenges in Cassandra 2.0 (Jonathan Ellis)
Speaker: Jonathan Ellis
System: Cassandra

Cassandra has fulfilled its original design as a Bigtable/Dynamo hybrid and is moving beyond it with the release of 2.0 last month. (I have annotated the original Cassandra LADIS paper with comparisons to modern Apache Cassandra at [1].) I will talk about integrating Paxos with an eventually consistent system, using cardinality estimation to improve compaction efficiency, and improving request isolation...

Oct 10 2013
The New Era of Data Management: MongoDB and Document Databases (Dwight Merriman)
Speaker: Dwight Merriman
System: MongoDB

SQL databases have been the dominant form of data management since the late 1970s. However, computing and technology have evolved, leading to an influx of data and new systems to manage it. For this new generation, "Document Databases" quickly emerged as a popular solution and became widely used among companies of all sizes. Using MongoDB as the example, we'll look...