TweeProfiles: Detecting spatio-temporal patterns on Twitter

Private data

  • Author name: Tiago Daniel Sá Cunha
  • Supervisor: Carlos Soares (PhD)
  • Co-supervisor: Eduarda Mendes Rodrigues (PhD)
  • Work developed at the SAPO laboratory at FEUP

Abstract

  • Online social networks are valuable sources of information about their users and their interests. Data Mining researchers around the world have studied this information extensively in order to discover users’ behaviours and patterns. There has also been investment in platforms for continuous information extraction and for visualizing the resulting data.
  • This dissertation aims to identify tweet profiles by analysing multiple types of data: spatial, temporal, social and content. Its goals are to develop an information extraction approach to validate the combination of dimensions, and to develop a visualization tool capable of displaying the patterns found using state-of-the-art representations.
  • The data mining process consists of computing, normalizing and combining dissimilarity matrices. Each dissimilarity matrix is then subjected to a clustering algorithm that retrieves the information. This dissertation studies in depth the appropriate distance functions for the different types of data, the normalization and combination methods available for different dimensions, and the existing clustering algorithms.
  • The visualization platform is designed for dynamic and intuitive use, and aims to reveal the patterns discovered in the data mining process in an understandable and interactive manner. To accomplish this, various visualization patterns were studied and widgets were chosen to best represent the information retrieved.
  • The case study to which it will be applied is the geo-referenced data from TwitterEcho, although it will be developed to work with any geo-referenced tweets extracted from Twitter itself.
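
The data mining process described in the abstract (dissimilarity matrix computation, normalization, combination, then clustering) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: the haversine and absolute time difference distance functions, min-max normalization, the weighted average used for combination, the threshold-based grouping that stands in for a real clustering algorithm, and the sample tweets. Only the spatial and temporal dimensions are shown; the social and content dimensions would need their own distance functions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def dissimilarity_matrix(items, dist):
    """Full pairwise dissimilarity matrix for one dimension."""
    return [[dist(x, y) for y in items] for x in items]


def min_max_normalize(matrix):
    """Rescale all entries to [0, 1] so that dimensions become comparable."""
    values = [v for row in matrix for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def combine(matrices, weights):
    """Weighted average of normalized matrices (one possible combination)."""
    total = sum(weights)
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights)) / total
             for j in range(n)] for i in range(n)]


def threshold_clusters(matrix, threshold):
    """Group items linked by a chain of pairs with dissimilarity <= threshold.

    This is a simple stand-in for a real clustering algorithm.
    """
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] <= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical geo-referenced tweets: (lat, lon) position and time in seconds.
tweets = [
    {"pos": (41.15, -8.61), "t": 0},
    {"pos": (41.16, -8.60), "t": 600},
    {"pos": (38.72, -9.14), "t": 86400},
    {"pos": (38.73, -9.15), "t": 87000},
]

spatial = min_max_normalize(
    dissimilarity_matrix([tw["pos"] for tw in tweets], haversine_km))
temporal = min_max_normalize(
    dissimilarity_matrix([tw["t"] for tw in tweets], lambda a, b: abs(a - b)))
combined = combine([spatial, temporal], weights=[0.5, 0.5])
labels = threshold_clusters(combined, threshold=0.2)
print(labels)
```

Changing the weights passed to combine shifts the emphasis between the spatial and temporal dimensions, which is one simple way to compare different dimensional combinations on the same set of tweets.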

Final product website