Archived Events

May 12 2014
DB Seminar: Atreyee Maiti
Speaker: Atreyee Maiti

Transactions in on-line transaction processing (OLTP) workloads typically have the following characteristics: (1) they are short-lived, (2) they work on a small subset of the data, and (3) they are repetitive. Traditional disk-based database management systems (DBMS) incur too much overhead for OLTP datasets that could simply be memory resident. This is because of the presence of heavyweight concurrency control and...

May 5 2014
DB Seminar: Jay-Yoon Lee
Speaker: Jay-Yoon Lee

How can we visualize billion-scale graphs? How to spot outliers in such graphs quickly? Visualizing graphs is the most direct way of understanding them; however, billion-scale graphs are very difficult to visualize since the amount of information overflows the resolution of a typical screen. In this paper we propose NET-RAY, an open-source package for visualization-based mining on billion-scale...

Apr 24 2014
Mike Cafarella (University of Michigan)
Speaker: Mike Cafarella

Trained systems that apply machine learning to very large datasets, such as web search and IBM's Watson question-answering system, are among the most important and sophisticated software systems being constructed today. Such trained systems are frequently based on supervised learning tasks that require features, signals extracted from the data that distill complicated raw data objects into a small number of...

Apr 21 2014
Database Group Meeting (April 21, 2014)
Speaker: Vagelis Papalexakis

How can we correlate the neural activity in the human brain as it responds to typed words with properties of these terms (like 'edible', 'fits in hand')? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization...

Dec 5 2013
Datomic (Rich Hickey)
Speaker: Rich Hickey
System: Datomic

Proponents of functional programming tout its many benefits, most of which are available only within a particular process, or afforded by a particular programming language feature. Anything outside of that is considered I/O, dangerous and difficult to reason about. But real systems almost always cross process and language boundaries, and most require, crucially, a very gnarly bit of shared state...

Nov 14 2013
Realtime and Batch Data Processing @ Twitter (Dmitriy Ryaboy)
Speaker: Dmitriy Ryaboy
System: Summingbird

Summingbird, an open-source project recently released by Twitter, allows engineers to easily build data processing pipelines that work both in a streaming context provided by Twitter Storm, and in offline batch context through Apache Hadoop. This talk will cover the practical motivation for building such a thing, and explain the core Summingbird architecture and components.

Oct 24 2013
Beyond Bigtable: Challenges in Cassandra 2.0 (Jonathan Ellis)
Speaker: Jonathan Ellis
System: Cassandra

Cassandra has fulfilled its original design as a Bigtable/Dynamo hybrid and is moving beyond it with the release of 2.0 last month. (I have annotated the original Cassandra LADIS paper with comparisons to modern Apache Cassandra at [1].) I will talk about integrating Paxos with an eventually consistent system, using cardinality estimation to improve compaction efficiency, and improving request isolation...

Oct 10 2013
The New Era of Data Management: MongoDB and Document Databases (Dwight Merriman)
Speaker: Dwight Merriman
System: MongoDB

SQL databases have been the dominant form of data management since the late 1970s. However, computing and technology have evolved, leading to an influx of data and new systems to manage it. For this new generation, "Document Databases" quickly emerged as a popular solution and became widely used among companies of all sizes. Using MongoDB as the example, we'll look...