Archived Events

Oct 22 2014
Pitt/CMU DB Meetup – Spyros Blanas (Ohio State)
Speaker:
Spyros Blanas

Web data are commonly processed using thousands of CPU cores, and large-scale scientific simulations are quickly approaching the one million CPU core mark. At this scale, the barrier to efficient data analysis is commonly the limited bandwidth to the disk. The growing main memory capacities allow data to be intelligently reduced, analyzed and transformed in situ, before being written to...

Oct 22 2014
Impala: A Modern, Open-Source SQL Engine for Hadoop (Ippokratis Pandis)
Speaker:
Ippokratis Pandis
System:
Impala

The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of fast SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. With Impala, the academic and Hadoop communities now have an open-sourced codebase that helps query data stored in HDFS and Apache HBase in real time, using familiar SQL syntax. In contrast with...

Oct 20 2014
DB Seminar [Fall 2014]: Neil Shah
Speaker:
Neil Shah

Abstract: How can we detect suspicious users in large online networks? Online popularity of a user or product (via follows, page-likes, etc.) can be monetized on the premise of higher ad click-through rates or increased sales. Web services and social networks which incentivize popularity thus suffer from a major problem of fake connections from link fraudsters looking to make a...

Oct 17 2014
Jimeng Sun (Georgia Institute of Technology)
Speaker:
Dr. Jimeng Sun

Predictive modeling plays an important role in biomedical research. Thanks to the explosion of Electronic Health Records (EHR), interest in building predictive models using EHR data has skyrocketed in recent years. However, the methodologies for developing a predictive model are still labor intensive and ad hoc. Such rudimentary approaches have hindered the quality and throughput of healthcare and biomedical research....

Oct 16 2014
State-of-the-Art Database Index Maintenance (Bradley C. Kuszmaul)
Speaker:
Bradley C. Kuszmaul
System:
Tokutek
Video:
YouTube

This talk will discuss how B-trees, Log-Structured Merge Trees and Streaming B-trees operate, and what their asymptotic performance is. Part of the "Seven Databases in Seven Weeks" Seminar Series: http://db.cs.cmu.edu/seminar2014

Oct 13 2014
DB Seminar [Fall 2014]: Thomas Marshall
Speaker:
Thomas Marshall

Big data processing can be expensive and slow, a problem made worse when your data set keeps changing, forcing you to reanalyze it repeatedly. Incremental computation can speed things up by minimizing the work that must be done to update output in response to changing input, but many previous efforts at incremental computation have been limited in the algorithms they...

Oct 9 2014
MongoDB: Reinventing the Database Landscape (Andrew Morrow)
Speaker:
Andrew Morrow
System:
MongoDB
Video:
YouTube

MongoDB is the next-generation database that helps businesses transform their industries by harnessing the power of data. The world’s most sophisticated organizations, from cutting-edge startups to the largest companies, use MongoDB to create applications never before possible at a fraction of the cost of traditional databases. This talk will review the feature set of MongoDB and the history and rationale...

Oct 6 2014
DB Seminar [Fall 2014]: Zhanpeng Fang
Speaker:
Zhanpeng Fang

Online gaming is one of the largest industries on the Internet, generating tens of billions of dollars in revenues annually. One core problem in online games is to find and convert free users into paying customers, which is of great importance for the sustainable development of almost all online games. Although much research has been conducted, there are still several...

Oct 2 2014
Current and Future Challenges in Data Management (Seth Proctor)
Speaker:
Seth Proctor (CTO, NuoDB)
System:
NuoDB
Video:
YouTube

NuoDB is a relational, transactional database built around a fundamentally new architecture. The distributed nature of the system means that in addition to tackling traditional database problems it’s well-suited to support some new problems emerging in enterprise deployments. This talk will start with a short introduction to what makes NuoDB different, what the motivation was for the architecture and what...

Sep 29 2014
DB Seminar [Fall 2014]: Alex Beutel
Speaker:
Alex Beutel

As we record growing amounts of increasingly detailed user actions and complex interactions, how can we understand and make use of the vast amount of user data? In order to make use of this growing user data, there are a number of technical hurdles: we must be able to understand and model our users, we must be able to handle...