Task 5

Name: Robust real-time glottal pulse estimation from running singing

Coordination: FEUP/FMUP

Duration: 13 months

Task description

The objective of TASK5 is to develop a computational procedure that reliably and non-invasively estimates the glottal pulse from running singing in real time. The glottal pulse is important because it conveys information about the physiological structure of the glottis and the vibration pattern of the vocal folds [Ros07, Fou00, Wal07, Leh07]. In turn, these aspects determine the quality of phonation, from the perspective of either artistic/aesthetic quality or healthy/non-healthy voice quality.

The objective of this task is ambitious because, to our knowledge, no solutions have yet been developed that use running singing [Sun03, Wal07]. Also, the existing solutions are quite sensitive to the fundamental frequency of the voice, which indicates that estimation in singing is likely to be more problematic than in speech. In order to estimate the glottal pulse from the acoustic signal (i.e., in a non-invasive way), an inverse filtering strategy is required [Wal07, Leh07]. Inverse filtering presumes the source-filter model of speech production (from Fant [Ros07]) and requires a reliable estimate of the vocal tract filter and of the lip radiation filter [Leh07]. This estimation presents practical challenges that are difficult to overcome with real sustained speech and even more difficult to address with pathological voice.

Some good results have nevertheless been achieved with an iterative procedure that starts from a parametric model of the glottal pulse (for example, the Liljencrants-Fant model or the Rosenberg model [Ros07]). From this model, an estimate of the vocal tract and radiation filters is obtained, which is then used to obtain an improved estimate of the glottal pulse. This approach is promising; it will be investigated with singing and adapted for real-time operation with non-stationary singing or speech.

The task is expected to yield significant innovations that can be extended to other application scenarios of considerable economic value. For example, the results of the research carried out in this task pave the way for the automatic remote assessment of voice quality, for instance when a patient calls a hospital or clinic [Rei04]. This scenario justifies filing an international patent application.
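As an illustration of the inverse filtering principle described above, the following minimal Python sketch builds a synthetic signal with the source-filter model and then undoes the lip radiation and vocal tract filters to recover the glottal pulse. It is only a sketch under strong assumptions, not the procedure to be developed in this task: the sampling rate, pitch, pulse shape parameters, single-resonance vocal tract, and radiation coefficient are all hypothetical values, and the filters are assumed to be known exactly. With real singing, both filters must be estimated from the signal, which is the hard part of the problem.

```python
import math

def rosenberg_pulse(n_open, n_close, n_period):
    """One period of a Rosenberg-style trigonometric glottal flow pulse."""
    pulse = []
    for n in range(n_period):
        if n < n_open:                      # opening phase: rises from 0 to 1
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:          # closing phase: falls from 1 towards 0
            pulse.append(math.cos(0.5 * math.pi * (n - n_open) / n_close))
        else:                               # closed phase: no flow
            pulse.append(0.0)
    return pulse

def fir(x, b):
    """FIR filter: y[n] = sum_k b[k] * x[n-k], zero initial state."""
    return [sum(bk * x[n - k] for k, bk in enumerate(b) if n - k >= 0)
            for n in range(len(x))]

def all_pole(x, a):
    """All-pole filter 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * y[n - k]
        y.append(acc)
    return y

# --- Hypothetical settings (assumptions, not values from this proposal) ---
FS = 8000                    # sampling rate in Hz
PERIOD = 80                  # samples per glottal cycle (100 Hz pitch at 8 kHz)
R, F1 = 0.95, 500.0          # pole radius and frequency of a single resonance
THETA = 2.0 * math.pi * F1 / FS
A_VT = [-2.0 * R * math.cos(THETA), R * R]   # vocal tract denominator A(z)
ALPHA = 0.98                 # lip radiation modelled as L(z) = 1 - ALPHA z^-1

# Forward model: glottal source -> vocal tract filter -> lip radiation.
glottal_true = rosenberg_pulse(32, 16, PERIOD) * 3
speech = fir(all_pole(glottal_true, A_VT), [1.0, -ALPHA])

# Inverse filtering: undo the lip radiation (leaky integration), then apply
# the inverse vocal tract filter A(z) as an FIR filter.
integrated = all_pole(speech, [-ALPHA])
glottal_est = fir(integrated, [1.0] + A_VT)

max_err = max(abs(e - t) for e, t in zip(glottal_est, glottal_true))
print(f"max reconstruction error: {max_err:.2e}")
```

Because the inverse filters here exactly match the filters used to synthesize the signal, the glottal pulse is recovered up to rounding error. In the iterative procedure discussed above, the known coefficients would be replaced by estimates that are refined from one iteration to the next.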

Although the objective is to develop a non-invasive procedure, invasive methods will be used to obtain data that are central to complementing the acoustic data in the definition of accurate models of the glottal pulse for different singing or spoken voice registers and health conditions. In particular, electroglottographic (EGG), laryngoscopic, and stroboscopic information will be captured in addition to the acoustic signal. This will be possible thanks to the participation in this task of researchers from FMUP who are also ORL doctors, since Portuguese law allows only ORL doctors to perform these exams. Engineers from FEUP will also be involved in this task.

The results of this task, in addition to the results of TASK4, are decisive for the success of TASK6.

Expected results

The expected outcomes of this task are two reports, software models, and a patent application.

Human resources: 17.25 person-months.