News & Events
[DB Seminar] Fall 2015: Round table
Seeing as I've not been able to get a speaker for today, I think we can default to a round table discussion. I think it would be nice to briefly discuss what we are all working on now, after the WWW/SDM deadlines. We can additionally discuss the focus of some of the works we submitted to these conferences, if there is interest in that. Next week, we plan to have Prashanth speak about some work he did during his MS. Read More
[Databaseology 2015] Lauren Foutz (Oracle)
In 1991, graduate students at the University of California, Berkeley created an improved database engine library for Unix that they named Berkeley DB (BDB). When the up-and-coming web browser company Netscape requested that the authors extend and improve the library, Sleepycat Software was born to maintain BDB. In the decades since, BDB has been deployed millions of times and used in commercial and open-source applications, including Amazon, Subversion, and MySQL. This talk details the architecture of Oracle Berkeley Read More
[DB Seminar] Fall 2015: Alex Beutel
Which seems more suspicious: 5,000 tweets from 200 users on 5 IP addresses, or 10,000 tweets from 500 users on 500 IP addresses but all with the same trending topic and all in 10 minutes? The literature has many methods that try to find dense blocks in matrices, and, recently, tensors, but no method gives a principled way to score the suspiciousness of dense blocks with different numbers of modes and rank them to draw human attention accordingly. Dense blocks Read More
Charlie Swanson (MongoDB)
From "How many documents are in my collection?" to "What state has the highest percentage of people living in its most populous city?", there are many questions MongoDB can answer about your data. In this talk, we'll see what sorts of questions can be asked, and how MongoDB finds the most efficient way to answer them. Determining the best way to answer a query is a challenging and complex subject. We'll show how to express your questions using MongoDB's query Read More
CMU/SCS team wins ICDM ’10-year highest impact’ paper award.
CMU/SCS team and alumni win the ICDM 2015 10-year highest impact paper award, for the paper Fast Random Walk with Restart and Its Applications by Hanghang Tong, Christos Faloutsos, and Jia-Yu Pan. ICDM is one of the top data mining conferences. The paper had also won the 'best research paper' award that year. It shows how to quickly compute the so-called 'personalized random walk with restarts', to estimate node proximity, with applications in image captioning and co-authorship graphs. The paper Read More
[Databaseology 2015] Ivan T. Bowman (SAP)
SQL Anywhere is an embedded SQL database engine designed from its first release in 1992 to give good performance "out of the box" in a range of environments from small devices (Raspberry Pi and handhelds) up to server class machines supporting databases of hundreds of gigabytes and thousands of users. From the beginning, SQL Anywhere was designed to offer self-management features supporting deployment as an embedded database system where the database administrator cannot immediately connect and diagnose and solve problems. Read More
DB Seminar [Fall 2015]: more WIN discussion
We'll continue last week's theme of discussing more cool talks from WIN. Other WIN attendees -- please bring some notes about the talks you liked/didn't like or thought-provoking questions from the workshop. Read More
DB Seminar [Fall 2015]: Gene Davis (Splice Machine)
In this talk, Gene will discuss Splice Machine, a full RDBMS built on top of Hadoop. He will describe what makes Splice Machine work and what their team is working on now to make it even more performant on mixed database workloads. Read More
DB Seminar [Fall 2015]: Kijung Shin
Given a large graph, how can we calculate the relevance between nodes quickly and accurately? Random walk with restart (RWR) provides a good measure for this purpose and has been applied to diverse data mining applications including ranking, community detection, link prediction, and anomaly detection. Since calculating RWR from scratch takes a long time, various preprocessing methods, most of which are related to inverting adjacency matrices, have been proposed to speed up the calculation. However, these methods do not scale to large Read More
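For readers unfamiliar with RWR, the "from scratch" computation mentioned in the abstract can be sketched as plain power iteration. This is only an illustrative baseline, not the method presented in the talk or in any of the papers listed here; the function name `random_walk_with_restart`, its parameters, and the toy graph are all hypothetical.

```python
def random_walk_with_restart(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Approximate RWR scores relative to `seed` by power iteration.

    `adj` maps each node to a list of its out-neighbors; every neighbor
    must itself be a key of `adj`. All names here are illustrative.
    """
    nodes = list(adj)
    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0  # the walker starts at the seed node

    for _ in range(max_iter):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            neighbors = adj[u]
            walk_mass = (1.0 - restart) * scores[u]
            if not neighbors:
                # Assumption: a node with no out-edges returns its mass to the seed.
                nxt[seed] += walk_mass
                continue
            share = walk_mass / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        # With probability `restart`, the walker jumps back to the seed.
        nxt[seed] += restart

        diff = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if diff < tol:
            break
    return scores


# Toy undirected path graph a - b - c - d (hypothetical data).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = random_walk_with_restart(graph, "a")
```

On this toy graph the scores sum to 1, and nodes farther from the seed receive lower scores (b above c above d). Each iteration touches every edge once, which is why repeating it per query becomes expensive on large graphs.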
[Databaseology 2015] Howard Chu (LMDB)
The Lightning Memory-Mapped Database (LMDB) was introduced at LDAPCon 2011 and has been enjoying tremendous success in the intervening time. LMDB was written for the OpenLDAP Project and has proved to be the world's smallest, fastest, and most reliable transactional embedded data store. It has cemented OpenLDAP's position as the world's fastest directory server, and its adoption outside the OpenLDAP Project continues to grow, with a wide range of applications including big data services, crypto-currencies, machine learning, and many others. The Read More