Archived Events

Apr 25 2017
Dhivya Eswaran and Zongge Liu (SDM 2017 dry run)
Speaker:
Dhivya Eswaran and Zongge Liu

Dhivya and Zongge will give dry runs of their SDM 2017 talks. Dhivya's talk information:

Title: The Power of Certainty: A Dirichlet Multinomial Model for Belief Propagation

Abstract: Given a friendship network, how certain are we that Smith is a progressive (vs. conservative)? How can we propagate these certainties through the network? While belief propagation marked the beginning of principled label propagation to classify...

Apr 24 2017
[DB Seminar] Spring 2017: Dana Van Aken
Speaker:
Dana Van Aken
System:
OtterTune

Database management system (DBMS) configuration tuning is an essential aspect of any data-intensive application effort. But this is historically a difficult task because DBMSs have hundreds of configuration "knobs" that control everything in the system, such as the amount of memory to use for caches and how often data is written to storage. The problem with these knobs is that...

Apr 17 2017
[DB Seminar] Spring 2017: Hyeontaek Lim
Speaker:
Hyeontaek Lim

Multi-core in-memory databases promise high-speed online transaction processing. However, the performance of individual designs suffers when the workload characteristics miss their small sweet spot of a desired contention level, read-write ratio, record size, processing rate, and so forth. Cicada is a single-node multi-core in-memory transactional database with serializability. To provide high performance under diverse workloads, Cicada reduces overhead and contention...

Apr 10 2017
[DB Seminar] Spring 2017: Mohammad Hammoud
Speaker:
Mohammad Hammoud

Relational join is a fundamental data management operation, which highly influences the performance of almost every database query. In this talk, I will show that different workload characteristics and hardware configurations necessitate different main-memory hash join models. Subsequently, I will identify four effective models by which any hash-based join algorithm can be executed. I will characterize the relative merits of...

Apr 6 2017
[PDL/SDI/ISTC] Derek Murray (Google)
Speaker:
Derek Murray

TensorFlow is an open-source machine learning system, originally developed by the Google Brain team, which operates at large scale and in heterogeneous environments. TensorFlow trains and executes a variety of machine learning models at Google, including deep neural networks for image recognition and machine translation. The system uses dataflow graphs to represent stateful computations, and achieves high performance by mapping...

Apr 3 2017
[DB Seminar] Spring 2017: Prashanth Menon
Speaker:
Prashanth Menon

In-memory database management systems (DBMSs) are a key component of modern on-line analytic processing (OLAP) applications, since they provide low-latency access to large volumes of data. Because disk accesses are no longer the principal bottleneck in such systems, the focus in designing query execution engines has shifted to optimizing CPU performance. Recent systems have revived an older technique of using...

Mar 27 2017
[DB Seminar] Spring 2017: Viktor Leis
Speaker:
Viktor Leis

Managing data sets that are larger than RAM has always been one of the most important tasks for database systems. Traditional systems cache fixed-size pages in an in-memory buffer pool that has complete knowledge of all page accesses and transparently loads/evicts pages from/to disk. While this approach is effective at minimizing the number of I/O operations, it is also one...

Mar 24 2017
Dan Ports (University of Washington)
Speaker:
Dan Ports

Today's most popular applications are deployed as massive-scale distributed systems in the datacenter. Keeping data consistent and available despite server failures and concurrent updates is a formidable challenge. Two well-known abstractions, strongly consistent replication and serializable transactions, can free developers from these challenges by transparently masking failures and treating complex updates as atomic units. Yet the conventional wisdom is that...

Mar 21 2017
[MLD Seminar] Jure Leskovec (Stanford University)
Speaker:
Jure Leskovec

Evaluating whether machines improve on human performance is one of the central questions of machine learning. However, there are many domains where the data is selectively labeled in the sense that the observed outcomes are themselves a consequence of the existing choices of the human decision-makers. For instance, in the context of judicial bail decisions, the outcome of whether a...

Mar 21 2017
[HCII Seminar] Michael Franklin (University of Chicago)
Speaker:
Mike Franklin

The "P" in AMPLab stands for "People" and an important research thrust in the lab was on integrating human processing into analytics pipelines. Starting with the CrowdDB project on human-powered query answering and continuing into the more recent SampleClean and AMPCrowd/Clamshell projects, we have been investigating ways to maximize the benefit that can be obtained through involving people in data...