The Project

This dissertation was proposed by INESC Porto UTM. Its ambition is to incorporate new techniques for the analysis and annotation of A/V content, thereby increasing the capabilities of existing distribution services such as IPTV services, social networks and local applications that store and manage A/V content.

The Objectives

The main objective of this dissertation is to develop tools to classify A/V content and to build an initial knowledge base of A/V content categories. These will then be used to represent and categorize A/V content in different application scenarios. The end goal is to identify and suggest similar content, based on these categorizations, within different contexts.

Since the ambition is to increase the capabilities of existing services, three main areas of interest will be taken into account:

  • IPTV services: The objective is to analyze and categorize the programs or movies that the user is currently watching and then give suggestions based on those results.
  • Social networks: The objective is to analyze and categorize the content published by a user, search for similar content, and then give suggestions based on those results.
  • Storage and management applications for A/V content: In this case, the user must categorize the content. To reduce the subjectivity of this process, the content should be analyzed and categorized automatically. The automated process can be used to enhance or refine existing descriptions previously inserted by a human. Conversely, once generated, the resulting keywords may assist a human when inserting additional descriptive data. This can be useful in professional environments such as A/V post-production, helping to create accurately annotated archives that are easily searchable.
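
All three scenarios above share one step: once content has been categorized, similar content is suggested from those categories. A minimal sketch of that step follows, assuming each item is described by a set of category labels and that similarity is measured with the Jaccard index. The function names, the catalog and its labels are hypothetical and chosen only for illustration; the dissertation does not prescribe this particular measure.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two category sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_similar(
    query: str,
    catalog: Dict[str, Set[str]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rank every other item in the catalog by category overlap with `query`."""
    query_categories = catalog[query]
    scored = [
        (title, jaccard(query_categories, categories))
        for title, categories in catalog.items()
        if title != query
    ]
    # Keep only items that share at least one category, best match first;
    # ties are broken alphabetically so the ordering is deterministic.
    scored = [pair for pair in scored if pair[1] > 0.0]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical catalog: titles and category labels are made up for illustration.
CATALOG = {
    "match_highlights": {"sports", "football", "outdoor"},
    "cup_final": {"sports", "football", "live"},
    "cooking_show": {"food", "studio"},
    "tennis_open": {"sports", "outdoor"},
}

if __name__ == "__main__":
    for title, score in suggest_similar("match_highlights", CATALOG):
        print(f"{title}: {score:.2f}")
```

In this toy catalog, the item with no shared category is never suggested. In a real system the category sets would come from the automatic analysis rather than being entered by hand, and a weighted or learned similarity measure could replace the plain set overlap.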


Nowadays, finding desired or useful information can be extremely hard, as the amount of information available is enormous and grows continuously every year. Currently available search mechanisms often return incorrect results, mostly because content descriptions are unavailable or because the available tags are incorrect or inconsistent.

The most common search engines are based on keywords attached to the available resources. However, these tags are usually created manually, which leads to subjective analysis. There is therefore a need for alternative ways of generating and inserting descriptive tags, preferably based on intrinsic content information, removing the subjectivity and human interference from the process.

There has been an effort to develop a knowledge base containing the most important features of A/V content, but the results so far remain incomplete in some areas.

Describing, categorizing and suggesting A/V content in an automated or assisted way, and thereby bringing more accuracy to the process, is challenging and innovative and, given the A/V content distributors currently on the market, necessary.