Workshop on Expressive Performance

CIRMMT Axis Three (“Musical information archiving and retrieval”) presents a half-day workshop in Montreal on the topic of expressive performance. The schedule will include two keynote talks by Roger Dannenberg (Carnegie Mellon University) and Christopher Raphael (Indiana University), several 20-minute talks, a poster session accompanied by a wine and cheese reception, and plenty of time for informal discussion among researchers interested in expressive performance.

Location

BRAMS (Université de Montréal)

Schedule

1:00-1:30 - Johanna Devaney, Schulich School of Music, McGill University

An Overview of Empirical Performance Analysis

I will present a historical overview of empirical performance analysis, with a particular focus on work that attempts to model or describe various aspects of performance. I will also discuss the challenges of automatically extracting performance data from audio signals, with a focus on the singing voice.

1:30-2:30 - Roger Dannenberg, School of Computer Science, Carnegie Mellon University

Timing in Live Performance of Beat-Based Music

Expressive timing is often studied in connection with Western art music, in which timing has been shown to be related to musical structure and emotion. Relatively little attention has been paid to timing in what I will call “beat-based music,” that is, music that is normally performed at a steady tempo. While there has been much interest in automatic tempo estimation and automated beat tracking or “foot tapping,” relatively little study has been made of actual timing data. I have been capturing beat data from live performances. While this is still work in progress, I would like to share some data on actual tempo variation in live performances of different types; some new techniques for estimating the accuracy of human timing, through foot pedals and hand tapping, relative to the “true” beat, which is not directly observable; and finally some results showing that we can estimate the next true beat time from previous tap data more accurately than a human can tap.
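
As a rough illustration of the underlying idea, and not of the techniques described in the talk, a minimal sketch of predicting the next beat time from recent taps with a local least-squares tempo fit (the window size and tap values below are invented):

    import numpy as np

    def predict_next_beat(tap_times, window=8):
        """Fit a local tempo to the last `window` taps and extrapolate one beat ahead."""
        taps = np.asarray(tap_times[-window:], dtype=float)
        idx = np.arange(len(taps))
        period, phase = np.polyfit(idx, taps, 1)   # slope = beat period, intercept = phase
        return period * len(taps) + phase

    # Invented tap times (seconds) with a slight tempo drift and tapping noise
    taps = [0.00, 0.51, 1.02, 1.55, 2.05, 2.58, 3.10, 3.64]
    print(predict_next_beat(taps))                 # roughly 4.14 s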

2:30-3:00 - Steven Livingstone, Department of Psychology, McGill University

Changing Musical Emotion through Score and Performance with a Computational Rule System

Can the emotion of a musical work be changed through the modification of simple compositional and performance cues? To answer this question, we developed CMERS, a Computational Music Emotion Rule System for the control of perceived musical emotion, which modifies a work at the levels of score and performance in real time. Two rounds of perceptual testing were conducted with CMERS. In experiment 1, twenty participants responded to three musical works, each with five variations: normal, happy, angry, sad, and tender. Participants gave continuous responses on a two-dimensional feedback tool (arousal and valence). System accuracy of 78% was achieved, with significant shifts in valence and arousal. In experiment 2, CMERS was compared with Director Musices (DM), an existing rule system that lacks important score rules. Eighteen participants performed the same task with two works produced by CMERS and two by DM, each with five variations. The accuracy of CMERS and DM was 71% and 49%, respectively. CMERS achieved significant shifts in both valence and arousal, while DM achieved shifts in arousal only. These results suggest that the perceived emotion of music can be shifted through the manipulation of simple cues, and that aspects of the score are critical for controlling the valence of a musical work.
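
As a rough illustration of what a rule system of this kind does, and not of CMERS’s actual rules, a minimal sketch that maps a target emotion to invented performance-cue changes (tempo, loudness, articulation) plus a register shift; real score-level rules such as mode changes are omitted:

    # Invented cue values for illustration; not CMERS's actual rule set.
    EMOTION_RULES = {
        "happy":  {"register": +12, "tempo": 1.15, "loudness": 1.10, "articulation": "staccato"},
        "sad":    {"register": -12, "tempo": 0.80, "loudness": 0.80, "articulation": "legato"},
        "angry":  {"register": 0,   "tempo": 1.20, "loudness": 1.30, "articulation": "staccato"},
        "tender": {"register": 0,   "tempo": 0.85, "loudness": 0.70, "articulation": "legato"},
    }

    def apply_rules(notes, emotion):
        """notes: list of (midi_pitch, duration_sec, velocity); returns modified copies."""
        rule = EMOTION_RULES[emotion]
        out = []
        for pitch, duration, velocity in notes:
            dur = duration / rule["tempo"]                  # faster tempo -> shorter notes
            if rule["articulation"] == "staccato":
                dur *= 0.6                                  # shorten the sounding length
            out.append((pitch + rule["register"],           # score-level register shift
                        dur,
                        min(127, int(velocity * rule["loudness"]))))
        return out

    print(apply_rules([(60, 0.5, 80), (64, 0.5, 80)], "sad"))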

3:00-3:30 - Ichiro Fujinaga, Schulich School of Music, McGill University

Do Key-bottom Sounds Distinguish Piano Tones?*

  • Paper co-authored by Werner Goebl

The timbre of a single piano tone, as well as its loudness, is primarily determined by the speed at which the hammer hits the strings (final hammer velocity). However, the overall sound may also be influenced by impact sounds such as the hammer-string or finger-key impact sounds. The latter in particular can be varied with playing technique (touch) and is easily perceptible. Little is known about the nature of the sounds that emerge from the interaction of the key and the keybed, the large piece of wood underneath the keys. In this study, we investigate whether the absence or presence of a key hitting the keybed makes two otherwise identical piano tones distinguishable by expert listeners. A skilled pianist produced a number of isolated tones on a computer-monitored Bösendorfer grand piano (“CEUS”) that measures the loudness and onset timing of the tones as well as the continuous position of the keys. We selected tone pairs that were identical in pitch, loudness, and tone length, but with or without a key-keybed contact. The key-keybed contact was identified from the recorded key trajectories. Overall, the participants performed the task very well, significantly better than chance (82% correct); in particular, the difference within a tone pair was identified even more reliably (82.7% correct). F7 was slightly easier to rate than E7; there was no effect of loudness. Even though the investigated key-bottom sounds are subtle compared to other sound components, our results confirm that they can indeed audibly influence the timbre of a piano tone. The investigated effect may well have ecological relevance, as many important listening situations occur in the vicinity of the piano keyboard (e.g., piano practicing and piano lessons).

3:30-3:45 - Coffee Break

3:45-4:45 - Christopher Raphael, School of Informatics, Indiana University

Expressive Synthesis of Melody and Musical Prosody

I will describe ongoing work in expressive melody synthesis. This effort is focused on musical prosody: the use and avoidance of stress in music and the associated grouping it creates. I will represent a prosodic interpretation as a note-level markup using a small alphabet of symbols. I will show that this markup can be used to generate an expressive performance through a deterministic mapping from markup and score to audio. I will also look at the machine-learning problem of estimating this prosodic markup. The examples use a small collection of 50 or so folk-like melodies that have been hand-annotated with prosody.
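
As a rough illustration, and not the markup alphabet or mapping used in this work, a minimal sketch of a note-level prosodic markup rendered deterministically into timing and dynamics (symbols and values are invented):

    # Hypothetical symbols: "S" = stressed, "u" = unstressed, "|" = group boundary
    MARKUP_MAP = {
        "S": {"stretch": 1.15, "gain": 1.2},   # lengthen and emphasize
        "u": {"stretch": 0.95, "gain": 0.9},   # lighten
        "|": {"stretch": 1.30, "gain": 1.0},   # breathe at a group boundary
    }

    def render(score, markup):
        """score: list of (midi_pitch, nominal_duration_sec); markup: one symbol per note."""
        events, t = [], 0.0
        for (pitch, dur), sym in zip(score, markup):
            m = MARKUP_MAP[sym]
            stretched = dur * m["stretch"]
            events.append({"pitch": pitch, "onset": t,
                           "duration": stretched, "amplitude": m["gain"]})
            t += stretched
        return events

    melody = [(67, 0.5), (69, 0.5), (71, 1.0)]
    print(render(melody, ["u", "u", "S"]))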

4:45-5:15 - Alexandre Bouënard, Schulich School of Music, McGill University

Going Beyond Motion Capture Data: An Application for Synthesizing Expressive Percussion Performances*

  • Paper co-authored by Marcelo M. Wanderley and Sylvie Gibet

The increasing availability of software for creating real-time simulations of musical instruments allows for the design of new visual and sonic media. Nevertheless, from a conceptual and practical point of view, the question of how these new instruments can be controlled has rarely been addressed in the literature. We present a framework for the control of virtual percussion instruments by modeling and simulating virtual percussionists, based on a motion capture database and on a physically-based movement simulation environment. We discuss the benefits and limits of such an approach as a means of synthesizing new expressive percussion performances.
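
As a rough illustration of one common way to couple captured motion to a physics simulation, and not necessarily the controller used in this framework, a minimal sketch of a proportional-derivative (PD) controller driving a simulated joint toward motion-capture target angles (gains and targets are invented):

    def pd_step(theta, omega, target, kp=120.0, kd=12.0, inertia=1.0, dt=0.001):
        """Advance one explicit-Euler step of a single joint under PD torque control."""
        torque = kp * (target - theta) - kd * omega
        omega += (torque / inertia) * dt
        theta += omega * dt
        return theta, omega

    # Invented wrist-angle targets (radians) sampled from a drum-stroke trajectory
    targets = [0.0, 0.2, 0.5, 0.9, 0.5, 0.1, 0.0]
    theta, omega = 0.0, 0.0
    for target in targets:
        for _ in range(100):                       # 100 physics substeps per capture frame
            theta, omega = pd_step(theta, omega, target)
        print(round(theta, 3))                     # the joint lags, then tracks the targets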

5:15-5:45 - Douglas Eck, Department of Computer Science, University of Montreal

Learning from performances with and without scores

I will discuss work done with Stanislas Lauly on learning a performance model based on score-aligned artistic performances. I will focus mainly on how we represented the music in order to achieve (reasonable) success. I will then go on to talk about how to do something similar without having access to a score. I will suggest using a model able to find metrical structure via beat tracking as a noisy replacement for the score. Though I don’t have performance results yet for this model, I am able to show some promising statistical analyses of our Bösendorfer datasets.
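
As a rough illustration, and not the representation used in this work, a minimal sketch of turning a score-aligned performance into training pairs of score context and expressive deviations (the field names and the 120 bpm nominal tempo are assumptions):

    def make_training_pairs(score, performance, sec_per_beat=0.5):
        """
        score:       list of (midi_pitch, nominal_onset_beats, nominal_duration_beats)
        performance: list of (onset_sec, duration_sec, velocity), aligned note-for-note
        """
        pairs = []
        for (pitch, s_on, s_dur), (p_on, p_dur, vel) in zip(score, performance):
            features = {"pitch": pitch,
                        "beat_position": s_on % 4,              # position within a 4/4 bar
                        "nominal_duration": s_dur}
            targets = {"timing_offset": p_on - s_on * sec_per_beat,
                       "duration_ratio": p_dur / (s_dur * sec_per_beat),
                       "velocity": vel}
            pairs.append((features, targets))
        return pairs

    score = [(60, 0.0, 1.0), (62, 1.0, 1.0), (64, 2.0, 2.0)]
    perf  = [(0.02, 0.55, 70), (0.53, 0.48, 64), (1.01, 1.10, 80)]
    print(make_training_pairs(score, perf))        # assumes a 120 bpm nominal tempo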

5:45-6:30 - Poster session / Bösendorfer demo with wine and cheese

Stanislas Lauly - Department of Computer Science, University of Montreal

Demo of automatically generated expressive performances of Schubert waltzes on the Bösendorfer

Michel Bernays, Faculty of Music, University of Montreal

Piano Timbre, from Words to Gesture to Sound: Perception, verbal description and gestural control of aggregate piano timbres by highly skilled pianists

Timbre is a key to musical expressivity in virtuosic pianistic performance. When discussed among professionals, timbre is described with abstract terms such as dark, bright, round, velvety, or shimmering, whose imagery aims at fitting the sonic nuances but bypasses the quantitative and functional characteristics of its production. Still, pianists seem able to avoid inter-individual misunderstandings in timbre description. This study thus aims to determine the degree of consensus on this vocabulary within the pianistic community and to identify its gestural correlates at the keyboard. A professional pianist played three short pieces, with eight adjectives as successive instructions to color the performances, on the computer-controlled Bösendorfer CEUS acoustic piano, which gathered data on key movement and hammer velocity from which specific gesture parameters were extracted with custom Matlab functions. The audio recordings were used as stimuli, from which the pianist himself proved easily capable of retrieving the timbres. In the main task, 17 other pianists provided a verbal description of each timbre they could recognize; these descriptions matched the expected descriptor roughly one third of the time. The results became much more conclusive, and well above chance, once the semantic proximity between adjectives was taken into account. This indicates that the expressive intentions of a virtuosic pianist can be perceived by his peers and described verbally in a consensual way. Gesture analysis is now under way, with the aim of identifying meaningful correlations between statistics of synchronism, dynamics, touch, overlap, articulation, etc., and the timbres employed.

Thomas Stoll, Department of Music, University at Buffalo

Integrating Musical Expression into a Corpus-based Organizational System

Corpus-based systems of musical material consist of audio that is segmented, categorized, and deployed as units, based on parameters derived from the audio itself, from user-generated tags, or from extra-musical data. The possibilities of encoding musical meaning and structure into the collection of sonic units via segmentation and organization are of central concern to the user. The features encoded within the units may exhibit particular expressive qualities, individually or collectively, and fuel algorithms that reorder units in interesting ways. The material presented here comes out of ongoing research and creative work aimed at exploring the many aspects of sound organized within corpora. It is quite open-ended, owing to the extensive possibilities afforded by the corpus-based framework.
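
As a rough illustration of the kind of organization described, and not this system’s actual design, a minimal sketch of a corpus unit carrying segment boundaries, audio-derived descriptors, and user tags (all names and values are invented):

    from dataclasses import dataclass, field

    @dataclass
    class CorpusUnit:
        source_file: str                       # audio file the unit was segmented from
        start_sec: float                       # segment boundaries in the source audio
        end_sec: float
        features: dict = field(default_factory=dict)   # descriptors derived from the audio
        tags: list = field(default_factory=list)       # user-generated labels

    corpus = [
        CorpusUnit("take1.wav", 0.00, 0.85, {"centroid_hz": 1200.0, "rms": 0.31}, ["bright"]),
        CorpusUnit("take1.wav", 0.85, 1.60, {"centroid_hz": 640.0, "rms": 0.22}, ["dark"]),
    ]

    # Reorder units by a single audio descriptor (spectral centroid here)
    for unit in sorted(corpus, key=lambda u: u.features["centroid_hz"]):
        print(unit.source_file, unit.start_sec, unit.tags)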