Archived Events

Apr 25 2016
[DB Seminar] Spring 2016: Miguel Araujo
Speaker:
Miguel Araujo

Miguel will give a practice talk on his thesis proposal. Abstract: The identification of anomalies and communities of nodes in real-world graphs has applications in widespread domains, from the automatic categorization of Wikipedia articles or websites to bank fraud detection. While recent and ongoing research is supplying tools for the analysis of simple unlabeled data, it is still a challenge to find patterns and anomalies in large labeled...

Apr 18 2016
[DB Seminar] Spring 2016: Dana Van Aken
Speaker:
Dana Van Aken

Database management system (DBMS) configuration tuning is an essential aspect of any data-intensive application effort. But this is historically a difficult task because DBMSs have hundreds of configuration "knobs" that control everything in the system, such as the amount of memory to use for caches and how often data is written to storage. The problem with these knobs is that...

Apr 15 2016
Monte Zweben (Splice Machine)
Speaker:
Monte Zweben
System:
Splice Machine

This talk describes the Splice Machine RDBMS designed to power today’s new class of modern applications that require high scalability and high availability while simultaneously executing OLTP and OLAP workloads. Splice Machine is a full ANSI SQL database that is ACID compliant, supports secondary indexes, constraints, triggers, and stored procedures. It uses a unique, distributed snapshot isolation algorithm that preserves transactional...

Apr 14 2016
Yi Pan (Apache Samza @ LinkedIn)
Speaker:
Yi Pan
System:
Samza

This talk will provide an overview of LinkedIn's distributed stream processing platform, including Samza/Kafka/Databus. It will first cover the high-level scenarios for stream processing at LinkedIn, followed by detailed requirements around scalability, re-processing, accuracy of results, and easy programmability; then we will focus on the requirements on stateful stream processing applications and explain how Samza's state management allows us...

Apr 11 2016
[DB Seminar] Spring 2016: Vladimir I. Zadorozhny
Speaker:
Vladimir I. Zadorozhny

Information fusion deals with reconstructing objects from multiple, possibly incomplete and inconsistent observations. The task of scalable information fusion is critical for interdisciplinary research where a comprehensive picture of the subject requires large amounts of data from disparate data sources. Despite its increasing availability, making sense of such data is not trivial. In this talk I will elaborate on challenges...

Apr 4 2016
[DB Seminar] Spring 2016: Srijan Kumar
Speaker:
Srijan Kumar

The web enables transmission of knowledge at a speed and breadth unprecedented in human history, which has had tremendous positive impact on the lives of billions of people. While benign users try to keep the web safe and usable, malicious users add and spread harmful content, manipulate information and twist things in their favor. Having malicious users and their content...

Mar 28 2016
[DB Seminar] Spring 2016: Yingjun Wu
Speaker:
Yingjun Wu

Today’s main-memory databases can support very high transaction rates for OLTP applications. However, when a large number of concurrent transactions contend on the same data records, the system performance can deteriorate significantly. This is especially the case when scaling transaction processing with optimistic concurrency control (OCC) on multicore machines. In this paper, we propose a new concurrency-control mechanism, called transaction...

Mar 21 2016
[DB Seminar] Spring 2016: Pengtao Xie
Speaker:
Pengtao Xie

Matrix-parametrized models, including multiclass logistic regression and sparse coding, are used in machine learning (ML) applications ranging from computer vision to computational biology. When these models are applied to large-scale ML problems starting at millions of samples and tens of thousands of classes, their parameter matrix can grow at an unexpected rate, resulting in high parameter synchronization costs that greatly...

Mar 14 2016
[DB Seminar] Spring 2016: CSD Open House Event
Speakers:
Joy Arulraj, Evangelos Papalexakis, DB group members

This week, one student each from Professor Christos Faloutsos' group and Professor Andy Pavlo's group will give a short talk on their ongoing research. We will then hold a round-table discussion of the ongoing work of the other DB group members with the visiting students attending the CSD Open House.

Feb 29 2016
[DB Seminar] Spring 2016: Hao Zhang
Speaker:
Hao Zhang

We propose a dynamic topic model for monitoring temporal evolution of market competition by jointly leveraging tweets and their associated images. For a market of interest (e.g. luxury goods), we aim at automatically detecting the latent topics (e.g. bags, clothes, luxurious) that are competitively shared by multiple brands (e.g. Burberry, Prada, and Chanel), and tracking temporal evolution of the brands'...