DB Seminar [Spring 2015]: Vagelis Papalexakis (Thesis Proposal dry run)
Given a Knowledge Base that records millions of relations of the form “Barack Obama is the president of USA”, how can we automatically learn new synonyms and enhance the Knowledge Base?
Now imagine measuring a person's brain activity while they read words that appear in this Knowledge Base: how can we relate information processing in the brain to information found on the World Wide Web? Can we use both sources of data to enhance knowledge extraction in both scenarios?
In a third, seemingly unrelated application, consider having different “views” of a social network, e.g. observing who calls whom, who sends emails to whom, and who texts whom; can we use this rich information for community and anomaly detection? What if we also have demographic information about the people in the network? Can we further enhance our analysis?
The key underlying theme of all the above applications is the multiaspect nature of the data, with the ultimate question being: how can we take advantage of all the different aspects? Can we analyze sets of multiaspect data jointly? Finally, can we automatically, and in a mostly unsupervised setting, filter out aspects of the data that are redundant or not beneficial for the task at hand?
In this thesis, we work towards answering the above questions along two thrusts:
1) Algorithms & Models: we develop multiaspect analysis models and scalable algorithms, with specific emphasis on Tensor Analysis, that efficiently extract knowledge from multiaspect data.
2) Applications: we apply our algorithms to a variety of multiaspect data problems, with specific emphasis on linking knowledge extraction from the Web and the brain.
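
To make the tensor view of multiaspect data concrete, here is a minimal sketch, assuming a plain CP (PARAFAC) decomposition fitted by alternating least squares. This is one common form of tensor analysis and is not necessarily the model developed in the thesis. The helper names (triples_to_tensor, cp_als, and so on) and the toy index triples are hypothetical and chosen only for illustration.

```python
import numpy as np

def triples_to_tensor(triples, shape):
    """Build a binary 3-mode tensor from (subject, relation, object) index triples."""
    X = np.zeros(shape)
    for s, r, o in triples:
        X[s, r, o] = 1.0
    return X

def khatri_rao(B, C):
    """Column-wise Kronecker product: (J x R) and (K x R) -> (J*K x R)."""
    R = B.shape[1]
    return (B[:, None, :] * C[None, :, :]).reshape(-1, R)

def unfold(X, mode):
    """Matricize X along `mode`; the remaining modes are flattened in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def reconstruct(factors):
    """Rebuild the 3-mode tensor from its three factor matrices."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def cp_als(X, rank, n_iter=100, seed=0):
    """Fit a rank-`rank` CP model to a 3-mode tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Fix the other two factors and solve a linear least-squares problem.
            P, Q = [factors[m] for m in range(3) if m != mode]
            gram = (P.T @ P) * (Q.T @ Q)  # equals khatri_rao(P, Q).T @ khatri_rao(P, Q)
            factors[mode] = unfold(X, mode) @ khatri_rao(P, Q) @ np.linalg.pinv(gram)
    return factors

# Toy usage: 3 subjects x 2 relations x 3 objects (indices are made up).
toy_triples = [(0, 0, 1), (1, 0, 1), (2, 1, 0), (2, 1, 2)]
X_toy = triples_to_tensor(toy_triples, shape=(3, 2, 3))
A, B, C = cp_als(X_toy, rank=2)
relative_error = np.linalg.norm(X_toy - reconstruct([A, B, C])) / np.linalg.norm(X_toy)
```

Each column of A, B and C then describes one latent component across the three aspects of the data. In a (subject, relation, object) tensor, for example, subjects with similar rows in A behave similarly across relations and objects. The same construction applies to the multi-view social network example by stacking one adjacency matrix per communication type along the third mode.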