VITA 2010

  Vienna Talk 2010 on Music Acoustics
"Bridging the Gaps"
      September 19–21


Music performance and playing technique (W. Goebl, G. Widmer)

The past decades have yielded a wealth of research that cuts across the various aspects of music performance: from the printed score to the actual performance, from CD recordings to symbolic data, from automatic performance rendering to individual style analysis, from the brain to the actual body movements. Advances in technology and computing power accompanied and potentially supported this development. Many of these findings could have a direct impact on the performing musician, but only in rare cases were such findings applied to actual music practice. How could these findings be made accessible to the performing musician? How could new technology (e.g., three-dimensional motion capture equipment) and methodology (e.g., computer modelling of individual style) be used to enhance a musician's experience in practicing and performing? The present thematic session aims to bring together researchers from different backgrounds, music teachers and students to try to bridge – or at least to reduce – the potential gap between science and performance.


Widmer, Gerhard: 
(Keynote) / O
The presentation gives a broad overview of some of our research on empirical, computer-based piano performance analysis, with a special focus on techniques and methods that computer science can contribute to the quantitative study of expressive performance. These comprise, among other things, (semi-)automatic measuring methods that extract aspects such as timing from audio recordings; algorithms for aligning different performances for purposes of comparison or annotation; animated visualisations of expressive parameters; algorithms that discover patterns in performance-related data; and computer programs that attempt to model and even predict certain (low-level) aspects of performance. I will show some examples of these methods in action, and will also briefly discuss the potential use of such technology in music-didactic contexts.
In our talk we will present a new method that permits a computer to listen to, and follow, live music in real-time, by analysing the incoming audio stream and aligning it to a symbolic representation (e.g., score) of the piece(s) being played. In particular, we will present a multi-level music matching and tracking algorithm that, by continually updating and evaluating multiple high-level hypotheses, effectively deals with almost arbitrary deviations of the live performer from the score -- omissions, forward and backward jumps, unexpected repetitions, or (re-)starts in the middle of the piece. Also, we will show that additional (automatically computed) knowledge about the structure of the piece can be used to further improve the robustness of the tracking process. We will discuss the resulting system in the context of an automatic page-turning device for musicians, but it will be of use in a much wider class of scenarios that require reactive and adaptive musical companions.
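The multi-level tracking idea, maintaining several competing alignment hypotheses and allowing jumps to handle omissions, repetitions and restarts, can be illustrated with a toy sketch. This is not the authors' algorithm; all names, the pitch-distance cost and the jump penalty are assumptions made purely for illustration:

```python
def update_hypotheses(hypotheses, obs, score, beam=3, jump_penalty=2.0):
    """Toy multi-hypothesis score follower. Each hypothesis is a
    (score_position, accumulated_cost) pair; `score` is a list of
    expected pitches and `obs` is the pitch just heard. On every
    observation a hypothesis may advance to the next score event,
    or jump to any position (modelling omissions, repetitions, or
    restarts) at a penalty; only the `beam` cheapest hypotheses
    survive. Illustrative assumptions throughout."""
    n = len(score)
    candidates = []
    for pos, cost in hypotheses:
        # regular step: advance to the next score event
        nxt = min(pos + 1, n - 1)
        candidates.append((nxt, cost + abs(score[nxt] - obs)))
        # arbitrary jump anywhere in the score, penalised
        for j in range(n):
            candidates.append((j, cost + jump_penalty + abs(score[j] - obs)))
    # keep the cheapest hypothesis per score position, prune to beam
    best = {}
    for pos, cost in candidates:
        if pos not in best or cost < best[pos]:
            best[pos] = cost
    return sorted(best.items(), key=lambda pc: pc[1])[:beam]
```

Starting from the single hypothesis (0, 0.0) on the score [60, 62, 64, 65], hearing pitch 62 ranks position 1 as the cheapest surviving hypothesis, while jump hypotheses remain available should the performer later skip ahead or restart.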
During the warm-up of trumpet players, face muscle contractions with increased blood flow result in a higher temperature of the overlying skin. This effect can be visualized and quantified by infrared thermography. The analysis demonstrates that the main facial muscle activity during warm-up is restricted to only a few muscle groups (M. orbicularis oris, M. depressor anguli oris). The “trumpeter’s muscle” (M. buccinator) proved to be of minor importance. Less trained players showed a more inhomogeneous thermographic pattern compared to well-trained musicians. Infrared thermography could become a useful tool for documenting musicians’ playing technique.
Centre for Systematic Musicology, University of Graz, Austria

We are exploring the complex relationship between accents and expression in piano performance. Accents are local events that attract a listener’s attention and are either evident from the score (immanent) or added by the performer (performed). Immanent accents are associated with (temporal, serial) grouping (phrasing), metre (downbeats), melody (peaks, leaps) and harmony (or dissonance). In piano music, performed accents involve changes in timing, dynamics, articulation, and pedalling; they vary in amplitude, form (amplitude as a function of time), and duration (the period of time during which the timing or dynamics are affected). We are analyzing a selection of Chopin Preludes using a novel method that combines aspects of the generative approach of Lerdahl and Jackendoff (1983), the modeling approach of Sundberg (1988) and the accent-based approach of Parncutt (2003). In the first stage, pianists and music theorists mark grouping, melodic and harmonic accents on the score, estimate the importance (salience) of each, and discuss their interrelationships. In the second stage, we mathematically model timing and dynamics in the vicinity of selected accents using an extended version of Director Musices – a software package for automatic rendering of expressive performance. This work was supported by the Lise Meitner Project M 1186-N23 (“Measuring and modelling expression in piano performance”), sponsored by the Austrian Fonds zur Förderung der wissenschaftlichen Forschung (FWF).
One of the crucial points in empirical performance analysis is the acquisition of data. Automatically extracting information related to timing, dynamics and articulation from audio recordings is still not possible at the level of precision required for large-scale music performance studies. The Bösendorfer computer-controlled grand piano makes it possible to record performance data in the symbolic domain instead of the audio domain, providing very precise data. This work is part of a series of music performance studies centered on the Magaloff Corpus, a unique resource of performances recorded on such an instrument. The collection comprises Chopin's complete works for solo piano performed by Nikita Magaloff in six public appearances in spring 1989 at the Vienna Konzerthaus.

The present study focuses on the phenomenon of performance errors. We examine the errors Magaloff makes from different angles. First, we give an overview of the number of errors in the data and relate them to the tempo of the performances. Second, we investigate perceptual aspects of the errors: how well do they fit into the surrounding harmonic context, how loudly are they played in comparison to other notes in the vicinity, and where are they located in terms of voice. Third, we examine more closely two error patterns that recur throughout the corpus: the omission of inner voices and inserted notes in sequences of parallel octaves.
According to the musical communication hypothesis proposed by Kendall and Carterette (1990), the performer plays a central role in transmitting the composer’s musical intentions to listeners. Previous research has emphasized the role of expressive strategies in performance in conveying the formal structure of a piece (Clarke, 1985; Palmer, 1989) and its emotional content (Juslin, 2000), clarifying the polyphonic texture (Palmer, 1996; Goebl, 2001; Gingras, 2006), or communicating the musical individuality of the performer (Sloboda, 2000; Gingras et al., 2008). However, the role of the constraints imposed by the physical properties of an instrument on the performer’s choice of expressive strategies has not been investigated in such detail. One exception was Walker’s dissertation (2004), which showed that expressive strategies were affected both by the choice of instrument and by the performer’s expressive intents. Here, I propose to extend Walker’s research by comparing the range and extent of expressive strategies used in the performance of three keyboard instruments with similar playing technique but vastly different timbral and acoustic properties. Through the analysis of quantitative data gleaned from research on piano (Palmer, 1996; Repp, 1996; Goebl, 2001) and, more recently, organ (Gingras, 2008) and harpsichord (Gingras, 2008, 2010) performance, I show that instrument properties affect the expressive strategies favoured by performers in a multifaceted process. I propose that performance practice tends to establish an implicit hierarchy of expressive strategies, which arises both from the physical potentialities of the instrument and from the perceptual constraints of the auditory system. 
Moreover, results from perceptual experiments suggest that dominant expressive strategies tend to be generalized across instruments and repertoires, while secondary strategies are more instrument- and repertoire-specific and thus prone to pronounced effects of listener expertise. In conclusion, I propose that while instrumental constraints affect both the compositional style associated with a given instrument and the expressive strategies used by its practitioners, perceptual constraints also impose limits on the potential efficiency of certain expressive strategies, thereby outlining a feedback loop that may constitute a driving force for organologic evolution.
This poster presents a kinematic examination of the movement properties of pianists’ fingers measured with three-dimensional motion capture technology. Pianists performed melodic passages at a wide range of tempi, from medium fast (143 ms inter-onset interval, IOI) to extremely fast (62 ms IOI). The main question was whether finger motion dynamics change with performance tempo, an important issue for piano practice and training at conservatories. Kinematic landmarks were determined from the finger trajectories (maximum height, finger-key contact, and key-bottom contact). The timing of those kinematic landmarks changed considerably as the tempo became faster; piano touch was under deliberate control only at slower tempi. Individual differences in the maximum performance speed led to specific claims about desirable finger dynamics for successful piano playing.
Audio-to-score alignment is a technique in which the symbolic representation of a piece of music is synchronized to the audio signal of a recorded performance, such that the onset time of each individual note can be extracted and tempo curves can be computed. A standard algorithm used for alignment is Dynamic Time Warping. However, this method suffers from the shortcoming that notes which are played simultaneously in the score will always be aligned to the same time frame within the audio signal.
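A minimal sketch of standard Dynamic Time Warping over two feature sequences, assuming simple per-frame feature vectors and ignoring the efficiency and feature-design concerns of real alignment systems, might look like this:

```python
import numpy as np

def dtw_path(score_feats, audio_feats):
    """Minimal dynamic time warping: aligns two feature sequences
    (rows = frames) and returns the optimal warping path as a list
    of (score_frame, audio_frame) pairs. Illustrative only; real
    audio-to-score alignment uses chroma or spectral features and
    far more efficient implementations."""
    n, m = len(score_feats), len(audio_feats)
    # pairwise Euclidean frame distances
    cost = np.linalg.norm(
        score_feats[:, None, :] - audio_feats[None, :, :], axis=2)
    # accumulated cost with the standard step pattern (down, right, diagonal)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
    # backtrack from the end to recover the path
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

On identical sequences the recovered path is the diagonal; the flaw described above shows up because a chord, being a single score "frame", maps all its notes onto one audio frame.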

We have introduced two post-processing methods which are not only able to overcome this flaw, but also generally increase the accuracy of extracted note onsets. The first one utilizes tone models trained in advance. The audio spectrum is then factorized using a dictionary of these models, such that the activation energy of each model over time is obtained. In cases where a note is played only once within a certain time span, the frame where the maximal increase in this activation energy occurs is a very accurate estimator of the onset time.

However, there are ambiguous cases where notes are repeated and no significant peak of activation energy is found. In these cases a second revision is done by investigating the energy increase in the frequency band corresponding to the note's fundamental frequency. Energy increases are weighted using a Gaussian window around the initial onset estimate, reflecting the confidence placed in that estimate.
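This Gaussian-weighted second pass can be sketched in a few lines; parameter names and the choice of window width are assumptions, not the authors' implementation:

```python
import numpy as np

def refine_onset(band_energy_inc, init_frame, sigma):
    """Second-pass revision: weight per-frame energy increases in the
    note's fundamental-frequency band by a Gaussian window centred on
    the initial onset estimate, and return the best frame. Frames far
    from the initial estimate are discounted. Illustrative sketch."""
    frames = np.arange(len(band_energy_inc))
    weights = np.exp(-0.5 * ((frames - init_frame) / sigma) ** 2)
    return int(np.argmax(band_energy_inc * weights))
```

With a large but distant energy increase and a smaller one near the initial estimate, the weighting makes the nearby frame win, which is exactly the intended bias toward the first-pass alignment.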

We have shown that this method results in more than 90% of all notes being aligned with a temporal deviation from the real onset time of less than 50 ms, and almost 50% of the notes with an error smaller than 10 ms. The test data used consists of several Mozart sonatas, comprising more than 30,000 notes.
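The reported figures correspond to a simple accuracy measure: the fraction of notes whose estimated onset falls within a given tolerance of the reference onset. A minimal sketch, with hypothetical names:

```python
def alignment_accuracy(estimated, reference, tol):
    """Fraction of notes whose estimated onset deviates from the
    reference onset by less than `tol` seconds (the shape of the
    reported 90%-within-50-ms and 50%-within-10-ms figures).
    Assumes both lists hold onset times in seconds, in note order."""
    hits = sum(abs(e - r) < tol for e, r in zip(estimated, reference))
    return hits / len(reference)
```

Evaluating the same estimates at several tolerances (e.g. 50 ms and 10 ms) yields the kind of tolerance curve quoted above.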
Many elements of musical performance can be studied through the analysis of intonation, segmentation, or emotional intensity. In this paper we develop a psychoacoustic study of the interpretation of 'Oculto' ('Hidden'), an atonal piece by the composer Luis de Pablo, in order to characterize how these parameters behave in an atonal work. The detachment from, or absence of, tonal hierarchy could produce peculiar behaviour in the way expressive intonation is addressed, or in the structural understanding of moments of great emotional intensity. The performance of an advanced conservatory student was recorded to obtain the data for analysis. After the performance, the student was asked to mark on the score the points at which he or she segments the musical discourse, and those at which he or she feels the greatest emotional intensity. We study and analyze the relationships between the segmentation points of the structure, pitch (F0) and intonation tendencies in comparable musical passages, and emotional intensity.
This work is part of the research project “Audition, cognition and emotion in the atonal music performance by high-level music students”, funded by the Ministry of Science and Innovation of Spain (National Research Project I+D 2008–2011, code EDU-2008-03401).