Doctoral Programme in Informatics Engineering

I have been a PhD student at FEUP since September 2005.

My advisor is Cristina Ribeiro and my co-advisor is Gabriel David.

My Steering Committee includes Mark Sanderson and Mário J. Silva.

Information Retrieval on Time-Dependent Collections

Web Information Retrieval (WebIR) is the application of Information Retrieval concepts to the World Wide Web. The most successful approaches in this field model the web's structure as a directed graph and exploit that structure in different ways. HITS and PageRank are two of the best-known algorithms in this line of research. Much of this research has its origins in citation analysis. Nevertheless, although time is an important dimension in the citation analysis literature, it has not been explored in depth within WebIR. Recent studies show that the web is a highly dynamic environment, with significant changes occurring weekly.

Baeza-Yates et al. (2004) draw an analogy between web crawling and the task of an astronomer watching the sky. What users of web search engines see is not the current state of the web but an image of what the crawler captured at a specific time. Typically, the dynamic nature of the web is not taken into account in ad-hoc web retrieval.

In my PhD work, I intend to explore the impact of the time dimension on information retrieval in time-dependent collections. I will use the web as the collection, since time has a strong impact on it, both on the individual documents and on the link structure that connects them.


