Research : What can eye-tracking technology reveal about improvised performance?
An empirical assessment.

What does a performing artist think whilst improvising and how do those thoughts relate to the musical output?

Fig.1 Live public performance and subject of analysis

Fig.2 The Deposition by Sir Anthony van Dyck divided into primary subject areas

Presented at Perspectives on Musical Improvisation Conference, Faculty of Music, University of Oxford, September 2012.

On 19th November 2012, I addressed six artifacts in the Ashmolean Museum of Art and Archaeology, University of Oxford through live solo improvised musical performance. One of these performances addressed Sir Anthony van Dyck’s (1599-1641) oil painting The Deposition. The 6-minute performance was filmed and my gaze recorded using eye-tracking technology.

The head-mounted eye-tracking technology objectively maps eye-tracking data to real-world objects at 30 frames per second, outputting to video in which fixations are recorded as a red circle and larger saccades as lines.

The performance followed three prior visits to the museum, to select works to address and to spend time regarding them. I made notes on my thoughts regarding the works themselves and possible ways in which I might approach them in performance. Immediately after the performance I answered a short questionnaire regarding the performance in order to capture some of my thought processes and reflections from the performance itself. Some weeks later, I watched a video of the performance and recorded my reactions to it and what it caused me to recall from the performance.

Performing at the Ashmolean : eye-tracking output

The primary aim of this pilot project is to establish whether data gathered by eye-tracking technology may be used in a grounded analysis together with audio, video, musical transcription and written records to illuminate some of the mental and creative processes involved in improvised musical performance. Presented here are the key results from the project.


Performer Gaze

Eye-tracking data revealed that the performer spent at least 70% of the performance actually looking at the painting. The data extracted indicate the general subject of the performer’s gaze. The faces of characters received 40% of the performer’s gaze. The figure of Christ received 37% of the performer’s gaze (23% on Christ’s face, 11% on Christ’s wound and 10% on Christ’s feet). Movement of gaze between areas was frequently so quick that the areas register as simultaneous, that is, the movement was completed within a single video frame (faster than 0.033 seconds).
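As an illustration of how such percentages might be derived from manually labelled frames, here is a minimal sketch. It assumes each video frame has been given one label naming the area under the gaze marker, or None where no gaze was recorded. The function name, the label names and the sample data are hypothetical and are not taken from the project.

```python
from collections import Counter


def gaze_share(frame_labels):
    """Return the percentage of frames carrying each label.

    frame_labels: one label per video frame; None marks frames
    with no recorded gaze (an assumption of this sketch).
    """
    total = len(frame_labels)
    if total == 0:
        raise ValueError("no frames supplied")
    counts = Counter(frame_labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical labels for ten frames, for illustration only.
labels = ["christ_face"] * 3 + ["mary_magdalene"] * 2 + [None] * 5
shares = gaze_share(labels)
```

Because every frame carries exactly one label in this sketch, the shares sum to 100%. If areas were allowed to overlap (a face inside a figure, for example), each area would need its own count.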


Structural Relationships

The overall structures of the eye-tracking data and the musical output show a high degree of correlation and are episodic in nature. Within each episode, the performer’s gaze focussed almost exclusively on one area or group of related areas, primarily defined by the characters depicted: Christ, Mary Magdalene, the Virgin Mary and St. John. Nearly all changes in attention between faces were either accompanied or followed shortly by clear changes in pitch. The data revealed large and similarly sized sections in which the performer plays but there is no eye-tracking data. These are either confirmed (through performance-video analysis) or suspected to be times at which his eyes are closed, e.g. most of 01:25 - 01:47 (22”), 03:42 - 04:02 (20”), 04:35 - 04:56 (21”).
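Sections without eye-tracking data could be located automatically once each frame is marked as having gaze data or not. The following is a minimal sketch under that assumption, using the 30 frames-per-second rate of the equipment. The function name, the one-second threshold and the sample data are hypothetical.

```python
FPS = 30  # frame rate of the eye-tracking video


def find_gaps(has_gaze, fps=FPS, min_seconds=1.0):
    """Return (start_s, end_s) pairs for runs of frames without gaze data.

    has_gaze: one boolean per frame, True where gaze was recorded.
    Runs shorter than min_seconds are ignored.
    """
    gaps = []
    start = None
    # A trailing True acts as a sentinel so a run at the end is closed.
    for i, present in enumerate(list(has_gaze) + [True]):
        if not present and start is None:
            start = i
        elif present and start is not None:
            if (i - start) / fps >= min_seconds:
                gaps.append((start / fps, i / fps))
            start = None
    return gaps


# Hypothetical data: a 2-second gap and a short gap of 10 frames.
frames = [True] * 30 + [False] * 60 + [True] * 30 + [False] * 10 + [True] * 5
gaps = find_gaps(frames)
```

A gap found this way only shows that no gaze was recorded. Whether the eyes were closed still has to be checked against the performance video.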


Uncovering the Music

The musical material of improvised performance is not well disposed to accurate transcription, highlighting by relief the core function of western classical notation - providing a set of instructions to the performer. However, as the example below demonstrates, it is possible to clearly discern patterns of thematic establishment and development both within episodes and across the whole performance.

Fig.4 Transcription of 2’10.00” - 2’18.75” at playing pitch

A pitch analysis shows the modal approach taken by the performer, with certain intervallic motifs occurring throughout the performance. Intervals of 1 and 3 semitones dominate and often appear in the same patterns. One example is the three-interval structure of +3, -3, -1 semitones, as in the third bar above, which is first established in the third phrase of the performance and forms the basis of the highly developed penultimate episode.
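To make the interval pattern concrete, the sketch below searches a pitch sequence for the +3, -3, -1 semitone structure. It assumes pitches are available as MIDI note numbers after editing. The function names and the sample pitches are hypothetical and do not reproduce the transcription.

```python
def intervals(pitches):
    """Return the semitone steps between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def find_motif(pitches, motif=(3, -3, -1)):
    """Return the note indices at which the interval motif begins."""
    steps = intervals(pitches)
    n = len(motif)
    return [
        i
        for i in range(len(steps) - n + 1)
        if tuple(steps[i:i + n]) == tuple(motif)
    ]


# Hypothetical MIDI pitches, for illustration only.
pitches = [60, 63, 60, 59, 62, 65, 62, 61]
steps = intervals(pitches)
starts = find_motif(pitches)
```

Working with intervals rather than absolute pitches means a transposed statement of the motif is found in the same way as the original.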

Fig.3 The screen-shot above presents a 32-second section of the synchronous analysis. It includes the audio recording (waveform in top track), subject of gaze (divided according to the primary subject areas as shown in Fig.2), full length of musical phrases (bottom track) and musical transcription (bottom right), together with two stills from the video of both eye-tracking output and performance (inset) from 00’10.46” and 00’31.00”.

Data collection challenges

This pilot project revealed a number of key issues regarding the collection of eye-tracking data during a live performance. The performer irregularly changed position and posture in order to enhance the expressive output of the performance and to employ a range of sound production techniques. The performer’s position relative to the painting, and by extension the position of the painting within the frame of the video output, is therefore highly dynamic. As a result, the data must be extracted manually and, at 10541 frames over the 5 minutes 51 seconds of the performance, preparing the data takes considerable time.

Contemporary free improvisation practice commonly involves a high degree of non-standard sound production. The performance analysed involved a wide range of techniques such as vocalisation, half-keyed notes, pitch bends and multiphonics. The relatively complex raw MIDI data translated from the audio recording requires a good deal of careful editing before it reveals anything about the musical structure. The wide ranges of timbre, intonation and dynamic are complex both in themselves and in relation to the expressive intentions of the performer.

For around 22% of the performance time, the results were uncertain. This is due partly to the field of capture being restricted by the frame of the eye-tracking equipment. An additional filming of the performer’s face during the whole performance is advisable.

What does this tell us?

Initial results indicate four modes in the relationship between gaze and musical performance.

Synchronous gaze and interpretation : the performance reveals a high level of synchronicity between well-defined sections of gaze on particular parts of the painting and performance output.

Pre-loading : The performer looks before performatively interpreting. This occurs at the beginning of the performance, as evidenced by all the data collected. At other times the performer’s gaze indicates that he is thinking ahead to the following phrase or section whilst still performing the preceding one.

Non-gaze, performatively active interpretation : The performer plays whilst not looking at the painting, often with eyes shut. Results may indicate that in these periods, the performer’s focus is almost exclusively on his performance.

Non-gaze, performatively inactive interpretation : Periods in which no gaze is recorded and the performer is not playing his instrument. These periods were generally very short in this performance. During them the performer may be mentally or physically recuperating, sometimes both.


This pilot project demonstrates that by providing an improvising performer with a visual subject to interpret and by applying a cross-referencing methodology it is possible to illuminate relationships between apprehension, interpretation and practice taking place during a free improvisation performance. The combination of eye-tracking, audio, video, performance transcription, performer notes and reflective account data shows great promise for revealing how an improvising musician develops a performance in real time.
