News & Events
[DB Seminar] Fall 2017: Ben Darnell (CockroachDB)
Distributed consensus algorithms like Paxos and Raft provide an important building block for distributed systems, but there's a lot more that goes into a resilient and scalable distributed database. CockroachDB's key-value layer is built on many independent and overlapping Raft consensus groups. In this talk I'll explain why we built it this way, and some of the expected and unexpected challenges we had to overcome along the way. Read More
[DB Seminar] Fall 2017: Joy Arulraj
For the first time in 25 years, a new non-volatile memory (NVM) category is being created that is expected to be 1000 times faster than current durable storage devices. The advent of NVM will fundamentally change the dichotomy between memory and durable storage in database systems (DBMSs). These new NVM devices are almost as fast as DRAM, but all writes to them are potentially persistent even after power loss. Existing DBMSs are unable to take full advantage of this technology Read More
Alex and Vagelis both win KDD-dissertation distinction!
Dr. Alex Beutel and Prof. Evangelos (Vagelis) Papalexakis each received the 'runner-up' distinction for the prestigious SIGKDD doctoral dissertation award. SIGKDD is the flagship venue for data mining. Alex's dissertation, titled 'User Behavior Modeling with Large-Scale Graph Analysis', focused on anomaly detection, by modeling normal and abnormal users, as well as on the design of scalable algorithms for large graphs. Vagelis' dissertation, titled 'Mining Large Multi-aspect Data: Algorithms and Applications', bridged signal and tensor analysis with data mining at scale, and Read More
CMU Database Post-Doctoral Researcher Position (Fall 2017)
The Carnegie Mellon Database Group has an opening for a fully-funded post-doctoral position on database management systems at its Pittsburgh, Pennsylvania campus. The position is to assist with the research and development of CMU's in-memory HTAP DBMS (Peloton) as part of the Intel Science and Technology Center for Visual Cloud Systems. The post-doctoral researcher will be expected to develop their own research agenda within the scope of the position, design and implement novel analysis techniques, conduct experiments, supervise students on Read More
[DB Seminar] Spring 2017: Yingjun Wu
The emergence of large main memories and massively parallel processors has triggered the development of multi-core main-memory database management systems (DBMSs). Although the reduction of disk accesses results in low single-thread transaction execution time, scaling these systems on multi-core machines remains notoriously difficult. In particular, the concurrent processing of a large number of transactions can bring about significant performance bottlenecks. In this talk, I will discuss the potential for improving DBMS performance through program analysis. The intuition is that Read More
Four CMU-DB Talks @ SIGMOD 2017
Members of the Carnegie Mellon Database Group are presenting four times at SIGMOD 2017, held in Chicago, IL:
Tutorial: Joy Arulraj & Andy Pavlo, "How to Build a Non-Volatile Memory Database Management System"
Keynote: Andy Pavlo, "What Are We Doing With Our Lives? Nobody Cares About Our Research on Transactions"
Research Talk: Dana Van Aken, "Automatic Database Management System Tuning Through Large-scale Machine Learning"
Research Talk: Andy Pavlo, "Online Deduplication for Databases"
Videos of these talks Read More
Alicia Klinvex (Sandia National Labs)
As parallel computing tends toward the exascale, scientific data produced by simulations are growing increasingly massive, sometimes resulting in terabytes of data. By viewing this data as a dense tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving impressive compression ratios with negligible loss in accuracy. We present recent improvements in our distributed-memory parallel implementation of the Tucker decomposition, whose key computations correspond to parallel linear algebra operations. To demonstrate the compression and accuracy of Read More
Pedro Ribeiro (University of Porto)
One way of understanding the design principles of complex networks is to look at how they are organized at the subgraph level. In this talk I will describe how subgraphs can be seen as fundamental structural units and how they can provide a powerful and very flexible framework for characterizing and comparing networks. I will focus on two concepts geared around this idea, namely network motifs and graphlets/orbits. At the core of these methodologies lies the ability to search and count Read More
[DB Seminar] Spring 2017: Priya Govindan
The structure of real-world complex networks has long been an area of interest, and one common way to describe the structure of a network has been with the k-core decomposition. The core number of a node can be thought of as a measure of its centrality and importance, and is used by applications such as community detection, understanding viral spreads, and detecting fraudsters. However, we observe that the k-core decomposition suffers from an important flaw: namely, it is calculated globally, Read More
Dhivya Eswaran and Zongge Liu (SDM 2017 dry run)
Dhivya and Zongge will give dry runs of their SDM 2017 talks. Dhivya's talk information:
Title: The Power of Certainty: A Dirichlet Multinomial Model for Belief Propagation
Abstract: Given a friendship network, how certain are we that Smith is a progressive (vs. conservative)? How can we propagate these certainties through the network? While belief propagation marked the beginning of principled label propagation to classify nodes in a graph, its numerous variants proposed in the literature fail to take into account uncertainty during the propagation Read More