The Jazz Turnaround: A Back-to-Back Paradigm for Studying Improvisation

Musical improvisation involves extremely complex cognitive processes, with performers engaging in rapid, in-the-moment decision-making coupled with focussed motor attention, all the while maintaining awareness of the other musicians and the music. It’s no wonder that it captures the interest of Dr Freya Bailes, a music researcher at the University of Leeds, who presented a talk on the cognitive mechanisms involved in improvisation at Goldsmiths College on the 1st of February 2018.

Who’s leading who?

One aspect of improvisation that may not automatically spring to mind is leadership. Traditionally, within certain types of improvisation such as jazz music, leadership may arise from a conductor or, more likely, the lead player within the ensemble. These individuals show where the music is going (rhythmically, harmonically, and dynamically) through subtle changes in their playing and (often fantastically cryptic!) gestures and visual cues. But how does leadership occur for two people improvising freely?

This was one of the core focuses for Bailes and Dean in a recent study investigating cognitive processes in improvisation. The researchers paired professional pianists, instructing them to perform six, three-minute improvisations. However, there was a catch! The performers had to play back-to-back at separate MIDI pianos, rendering them unable to use visual cues while improvising—Bailes wanted exchange of auditory information only.

Back-to-back pianists: the set-up used in Bailes’ experiment. (Source)

Limited directions for the format of the six improvisations were given. Two of the improvisations were labelled as completely free, one had a dynamic structure (quiet, loud, quiet), one was to be centred around a pulse, one was instructed to be led by Performer 1, and the last was to be led by Performer 2. The researchers were interested in how responses to each other’s playing would influence the development of leadership roles between the two performers. In addition, they were curious to discover the performers’ perceived leadership roles, i.e. who each performer believed was leading the improvisation at different points in the piece. Thirty minutes after performing, the performers listened back to their improvisations and rated who they felt influenced the music most in each section. They found that aural cues alone were sufficient for performers to identify who was taking the lead.

Sweat for science

Bailes and Dean aimed to probe both conscious and unconscious measures of the performers’ experiences. Therefore, in addition to the conscious measure (leadership rating), they employed an unconscious measure by recording the physiological arousal of performers while improvising. Arousal is considered a component of the emotional response triggered by listening to music (Khalfa et al., 2002; Rickard, 2004). They measured arousal by recording changes in skin conductance (SC), a type of electrodermal activity caused by variations in sweat gland activity, which is controlled unconsciously by the sympathetic nervous system (Khalfa et al., 2002).

Skin conductance (SC) is captured via skin electrodes placed on the fingertips or, when confronted with jazz pianists, on the left ankle. (Source)

SC is often measured on the fingertips, which is a bit problematic for pianists! Instead, Bailes and Dean measured SC on each pianist’s left ankle, as the performers were able to keep that part of their body still (the right ankle was left free for pedalling). Interestingly, it was previously hypothesised that SC might increase more during transitions in the improvisations, as those points in the music require increased attention and effort to develop a new pattern in the music (Dean & Bailes, 2016). An analysis of a case study for one duo found that SC typically did increase during transitions, e.g. when a new dynamic section began. One player’s SC matched the musical structure of the improvisations, while the other player had an overall greater variability in SC but did not always follow the shape of the music. In general, improvisation could intensify a performer’s arousal state by focussing attention on moment-by-moment decision making, and on awareness of and reaction to the other performers’ actions. That’s a lot to think about at once!

How about you, the audience?

In addition to obtaining leadership perception from the performers, Bailes also investigated listeners’ (specifically non-musicians’) perceptions of the leadership roles during the improvisation. It seems that Bailes enjoys tackling difficult topics, as she herself described ‘perception of leadership in improvisation’ as an impossible task for non-musicians! It was the researchers’ turn to improvise, as they devised an alternative approach: to ask the non-musician listeners an open question, “Indicate where any significant changes in sound occur within the improvised piece of music”. The piece they listened to was taken directly from the recordings of the professional musicians used for their study on leadership roles (Dean & Bailes, 2016). The question was left deliberately open to interpretation, as the researchers didn’t want to bias any perceptions. Participants were asked to listen to the piece at a computer and move the mouse to indicate change. Large changes in the music were to be indicated by faster mouse movements, and smaller changes by slower movements.

In addition to this, non-musicians were asked to report the level of perceived arousal expressed during the piece of music. By moving the mouse along a scale (moving up the scale = higher arousal), participants mapped out the level of arousal over the time-course of the music. Here, the team were interested in whether outside-listeners were sensitive to the physiological arousal of the performers. By comparing the outside-listeners’ perceived arousal with the SC of the performers over the time course of the music, Bailes and Dean were able to analyse the data for any correlations that may support their hypotheses.

Bailes and Dean developed a couple of interesting hypotheses concerning outside-listeners. Firstly, they predicted that the perceptions of the outside-listeners would align with the performers’ perceptions of changed leadership. However, this was not the case. Instead, the case study analysis of one duo revealed that the listeners’ perception of changes in sound aligned with the computational segmentation of each pianist’s performed key velocity. Their second hypothesis was that the outside-listeners’ perception of arousal would align with the performers’ level of physiological arousal, as measured by their SC level, over the time-course of the music. Here the result was mixed. In the same case study, Performer 2’s skin conductance correlated with the listeners’ perceptions of arousal, and yet in the same piece of music, Performer 1’s skin conductance did not align with the listeners’ perceptions. Bailes suggests that perhaps individual differences in SC (Performer 1 was more prone to sweating!) may have weakened the link between perceptions and physiological measures of arousal.

Diagram illustrating the levels of arousal measured in the experiment (Bailes, 2018)

The research presented by Bailes and Dean covers a range of intriguing questions regarding improvisation, from leadership roles to arousal, and the interactions between performers’ perceptions, their physiological changes, and non-musicians’ perceptions. It suggests that when two performers are playing together, aural cues alone are enough to allow performers to agree on who was leading the music at any given point. However, their case study also found no evidence to support some of the hypotheses proposed, potentially highlighting the intricacy of investigating such concepts. It seems that when you’re fascinated by researching impossible tasks, you can’t always expect straightforward results, but that’s all part of the fun.

Nicholas Feasey, Taylor Liptak, and Alex Lascelles


Bailes, F. (2018). Cognitive processes in improvisation [Powerpoint slides]. Retrieved from

Dean, R. T., & Bailes, F. (2016). Relationships between generated musical structure, performers’ physiological arousal and listener perceptions in solo piano improvisation. Journal of New Music Research, 45(4), 361-374.

Khalfa, S., Isabelle, P., Jean-Pierre, B., & Manon, R. (2002). Event-related skin conductance responses to musical emotions in humans. Neuroscience letters, 328(2), 145-149.

Rickard, N. S. (2004). Intense emotional responses to music: a test of the physiological arousal hypothesis. Psychology of music, 32(4), 371-388.

Rowe, M. (2011, May 13). Jazz Code. [Web log post]. Retrieved January 20, 2018, from



“The seductiveness of music lies in its ability to titillate the senses”: Elaine Chew on musical structure

Think about the last time a piece of music took you by surprise… What triggered it? How did you feel? Did others react the same way? You might become aware of musical structure through an established rhythmic pattern or a subverted harmonic expectation. It exists at various levels within a piece of music, from short motifs, through to longer patterns.

Structure is an integral component of music; indeed, music is often described as “organised sound” (Varèse, cited in Goldman, 1961). Composers conceive and organise structure, performers express it, and listeners decipher it. Differences in our perception of these structures will dictate our expectation of the forthcoming music and alter our individual experiences of music.

So how can we make sense of our musical experiences by analysing and quantifying structure? Elaine Chew is a self-described “mathemusical scientist” (Chew, 2016, p. 37). Her research on musical structure spans conceptual art through to mathematical modelling and gives new insights into music perception. She spoke to Goldsmiths’ Music, Mind, and Brain MSc students about the perception and apperception of musical structure.

“When practice becomes performance”
(Chew & Child, 2014)
Sight reading as a means of structural insight

The process of sight-reading requires an array of neurological and motor functions, including “perception (de-coding note patterns), kinesthetics (executing motor programs), memory (recognising patterns) and problem-solving skills (improvising and guessing)” (Parncutt & McPherson, 2002, p. 78). This reliance on pattern decoding means that sight-reading could provide insight into a performer’s initial comprehension of musical structure.

Source: Chew, 2013

Prior to the nineteenth century, public performances of music usually consisted of scores being performed at first sight, without practice (Parncutt & McPherson, 2002). Nowadays, music is usually painstakingly rehearsed beforehand. In 2013, Elaine Chew worked with composer Peter Child and conceptual artist Lina Viste Grønli to challenge our expectations of performance. After a visit to the Berlin Philharmonic, Viste Grønli found her thoughts fixated on the musicians’ warming-up “performance” and began questioning how these chaotic, unplanned sounds could be captured. What followed was Practising Haydn. Chew was recorded practising Haydn’s Piano Sonata in E flat, and the session was then meticulously transcribed to create a new score, which was publicly performed.

Source: Chew, 2013

The transcribed practice session, compared with Haydn’s original score, leaves a fascinating trace of the cognitive processing of musical structure. Child’s score is full of metrical changes, repetitions, pauses, and interruptions – quite unlike anything you’d expect in a piece of Haydn’s music. These alterations mark structural points at which musical patterns and expectations are subverted. The process is an example of one type of conceptual analysis of structure through the composer/performer relationship.

“How music works, why music works and […] how to make music work”
(Chew, 2016, p. 38)
Modelling musical structure and expectancy

In order to better understand the processes that lead the composer and listener to create and perceive musical boundaries, it is important to develop mathematical models of music cognition that describe variations of musical expectancy and tension (Huron, 2006). Tension can be induced by both tonal and temporal patterns. Also, the musical properties that comprise such patterns are multidimensional and dynamic (Herremans & Chew, 2016), which means they can be difficult to model accurately.

Chew argues that important parallels can be drawn between our understanding of the physical world and our experience of musical structure (Chew, 2016). As she explains, people can imagine and describe forms of physical movement with ease: What does it feel like to march in a muddy swamp? How vividly can you remember your first time accelerating down a ski slope or on a rollercoaster? Can you picture the swooping sensation of a falcon changing its course in flight? For most people, these thought experiments are intuitive. Composers can, therefore, draw from our common knowledge of the physical world to design equally vivid musical gestures.

More importantly, concepts from physics can constitute a reliable framework to describe musical structure. In the same way that physicists use mathematics to model physical phenomena, mathemusical scientists can describe the musical world in mathematical terms. They can use mathematical modelling techniques from physics to develop more accurate mathematical models of music perception: Chew uses the concept of gravity and the properties of Newtonian mechanics to model the dynamics of tonal tension and the effect of musical pulse on expectancy.

The spiral array model of tonality is a geometric representation of tonal space, which represents pitch classes, chords and keys, where each pitch class corresponds to spatial coordinates along a helix.

Source: Chew, 2016, p. 44
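
To make the geometry concrete, here is a minimal Python sketch of how pitches could be placed on such a helix. It assumes the commonly described construction in which each step of a perfect fifth is a quarter turn around the helix plus a fixed rise. The names `FIFTHS_FROM_C`, `pitch_position` and `helix_distance`, the radius of 1 and the rise of sqrt(2/15) are illustrative assumptions, not values taken from the talk.

```python
import math

# Steps along the line of fifths, counted from C (an illustrative subset).
FIFTHS_FROM_C = {"F": -1, "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5, "F#": 6}


def pitch_position(name, radius=1.0, rise=math.sqrt(2 / 15)):
    """Return the (x, y, z) point for a pitch name on the helix."""
    k = FIFTHS_FROM_C[name]
    angle = k * math.pi / 2  # a quarter turn per perfect fifth
    return (radius * math.sin(angle), radius * math.cos(angle), k * rise)


def helix_distance(a, b):
    """Straight-line distance between two pitch names in the model."""
    return math.dist(pitch_position(a), pitch_position(b))


fifth = helix_distance("C", "G")
major_third = helix_distance("C", "E")
tritone = helix_distance("C", "F#")
print(f"C-G {fifth:.3f}, C-E {major_third:.3f}, C-F# {tritone:.3f}")
```

With these assumed parameters, a perfect fifth (C to G) and a major third (C to E) come out equally close, while the tritone (C to F sharp) sits much further away, which is the kind of musical closeness the geometry is meant to capture.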

Newton’s law of gravitation allows us to localise the centre of gravity of a non-uniform object by integrating the weight of all the points in that object. Accordingly, the gravitational pull is concentrated at the center of gravity of a given object and the gravitational force between two objects is inversely proportional to the square of the distance between these objects. In mathematical terms, we have:

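\[
\mathbf{c} = \frac{1}{M} \int \mathbf{r} \, \mathrm{d}m ,
\qquad
F = G \, \frac{m_1 m_2}{d^2}
\]

Here \(\mathbf{c}\) is the centre of gravity of an object of total mass \(M\), \(F\) is the gravitational force between two objects of masses \(m_1\) and \(m_2\), \(d\) is the distance between them, and \(G\) is the gravitational constant.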

Likewise, the tonic is the centre of gravity of the given tonal context – also defined as the “center of effect”. As the tonic changes with the modulations of the harmonic structure, the centre of effect changes accordingly. Tones that are harmonically distant from the centre of effect induce tension, and tones that are closer to it allow for resolution. However, the tension of tones that are distant from the centre of effect behaves less like gravity and more like an elastic force, of the kind described by Hooke’s law. Hence, tones moving away from and towards the tonal centre in time create a musical narrative (see Herremans & Chew, 2016, on tension ribbons for more information).
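
The centre-of-gravity idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the model as implemented by Chew or by Herremans and Chew: pitches sit on a helix with a quarter turn per fifth, the centre of effect is taken as the duration-weighted average of the sounding pitches, and a single distance stands in for tension. All names and parameter values here are hypothetical.

```python
import math

# Steps along the line of fifths, counted from C (an illustrative subset).
FIFTHS_FROM_C = {"F": -1, "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5, "F#": 6}


def pitch_position(name, radius=1.0, rise=math.sqrt(2 / 15)):
    """Point on the helix for a pitch name (assumed geometry)."""
    k = FIFTHS_FROM_C[name]
    angle = k * math.pi / 2
    return (radius * math.sin(angle), radius * math.cos(angle), k * rise)


def centre_of_effect(notes):
    """Duration-weighted mean position of (pitch_name, duration) pairs."""
    total = sum(duration for _, duration in notes)
    sums = [0.0, 0.0, 0.0]
    for name, duration in notes:
        point = pitch_position(name)
        for axis in range(3):
            sums[axis] += duration * point[axis]
    return tuple(s / total for s in sums)


def distance_from_centre(name, centre):
    """Distance of one pitch from the centre of effect (a crude tension proxy)."""
    return math.dist(pitch_position(name), centre)


# A C major triad with a doubled tonic sets the tonal context.
context = [("C", 2.0), ("E", 1.0), ("G", 1.0)]
centre = centre_of_effect(context)
near = distance_from_centre("G", centre)
far = distance_from_centre("F#", centre)
print(f"G is {near:.2f} from the centre, F# is {far:.2f}")
```

In this toy context the chord tone G lies close to the centre of effect, whereas F sharp lies far from it, matching the intuition that distant tones induce tension.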

Tonality is not the only parameter that shapes our experience of musical tension. Through her Expression Synthesis Project (Chew et al., 2005), Chew also illustrates how timing and beat can affect expectancy, and how this can be modelled according to Newtonian mechanics. The pull is dictated by the musical pulse, and Newton’s three laws of motion are used to operationalise timing, where the time elapsed between two beats is analogous to the distance between two points in space (Chew, 2016).

Chew’s work on the mathematical modelling of musical structure allowed her and her colleague Alexandre François to develop the MuSA.RT software, which can analyse the tonal structure of a given musical piece and provide a corresponding graphical representation using the spiral array model. In this video, Chew demonstrates how the software responds to musical information:

Another exciting application of Chew’s work is its potential for artificial music composition. MorpheuS is an automatic music generation system developed by Chew and her colleague Dorien Herremans which uses machine-learning techniques and pattern detection algorithms in conjunction with Chew’s tonal tension model to produce novel music in a specific style or in the combination of several styles. For instance, below is a recording of three pieces morphed from A Little Notebook for Anna Magdalena by J. S. Bach and three pieces morphed from 30 and 24 Pieces for Children by Kabalevsky.

“Listening as a creative act”
(Smith, Schankler & Chew, 2014)
Individual differences in structure perception

Whilst music perception and cognition can be tracked to a certain extent, Chew emphasises our individual differences. Musical structure arises from parameters on different layers, which give the listener enough space for interpretation. Attention seems to play a crucial role in shaping perception and untangling ambiguities, which is in turn influenced by the personal listening history and expectations (Smith et al., 2014a). As music listening and making require the integration of diverse musical parameters (Herremans & Chew, 2016), researchers’ predictions of personal experience are limited by diverging musical features we deem relevant.

Perception of musical boundaries, for example, is predictable from novelty peaks, which capture the extent to which different musical features change over time (Smith et al., 2014b). Timbre, harmony, key, rhythm or tempo might be decisive. Yet boundary and novelty annotations by listeners reveal individual differences across those musical parameters. Not every novelty peak makes us perceive a structural boundary, as personal attention can obscure physical events. Some theories of structure perception, such as Lerdahl & Jackendoff’s (1983) Generative Theory of Tonal Music, ascribe gestalt rules to the process, but research suggests that our perceptions vary from person to person (Smith et al., 2014a). When repeatedly exposed to the same musical piece, people even disagree with themselves about structure (Margulis, 2012)!
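
For readers curious what a novelty peak looks like computationally, the sketch below uses one widely used approach: a checkerboard kernel slid along the diagonal of a self-similarity matrix. This technique is swapped in for illustration and is not necessarily the procedure used by Smith et al. (2014b). The feature frames are synthetic and every name is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def novelty_curve(frames, half_width=2):
    """Checkerboard-kernel novelty score for each frame index."""
    n = len(frames)
    sim = [[cosine_similarity(frames[i], frames[j]) for j in range(n)] for i in range(n)]
    curve = [0.0] * n
    for t in range(half_width, n - half_width):
        score = 0.0
        for i in range(-half_width, half_width):
            for j in range(-half_width, half_width):
                # Reward similarity within the past or within the future,
                # penalise similarity between past and future.
                sign = 1.0 if (i < 0) == (j < 0) else -1.0
                score += sign * sim[t + i][t + j]
        curve[t] = score
    return curve


def find_peaks(curve, threshold):
    """Indices of local maxima that exceed the threshold."""
    return [
        t for t in range(1, len(curve) - 1)
        if curve[t] > threshold and curve[t] >= curve[t - 1] and curve[t] > curve[t + 1]
    ]


# Six frames of one "texture" followed by six frames of another.
frames = [[1.0, 0.0]] * 6 + [[0.0, 1.0]] * 6
curve = novelty_curve(frames)
boundaries = find_peaks(curve, threshold=4.0)
print(boundaries)
```

The curve stays at zero inside each homogeneous section and peaks at the frame where the second texture begins. In real music the features are many and noisy, which is where listeners’ attention starts to matter.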

Structure is often ambiguous – particularly in improvised music – so it’s important to remember that our perception is flexible. In Practising Haydn the transcription process was open to interpretation – how and why did the composer decide which changes warranted formal transcription? This ambiguity of structural boundaries is likely due to the multidimensional complexity of musical patterns and the aggregate nature of the perceptual process.

These projects emphasise the creative nature of listening, the breadth of Chew’s work, and the important role that structure plays in our understanding of music perception and cognition in general. Next time you’re listening to that exciting piece of music, take a minute to remember how complex and unique your experience may be.

Lena Esther Ptasczynski, Fran Board, and Paul Bejjani

Chew, E., François, A., Liu, J., Yang, A. (2005). ESP: A Driving Interface for Expression Synthesis. Proceedings of the 2005 International Conference on New Interfaces for Musical Expression (NIME05), Vancouver, BC, Canada, 224-227.
Chew, E. (2013). About practising Haydn. Retrieved January 31, 2018, from
Chew, E., & Child, P. (2014). Multiple Sense Making: When Practice Becomes Performance. Cambridge, UK: Cambridge University.
Chew, E. (2016). Motion and gravitation in the musical spheres (in Mathemusical Conversations: Mathematics and Computation in Music Performance and Composition). (J. Smith, E. Chew, & G. Assayag, Eds.). Singapore: World Scientific Publishing Company.
Goldman, R. F. (1961). Varèse: Ionisation; Density 21.5; Intégrales; Octandre; Hyperprism; Poème Electronique. Instrumentalists, cond. Musical Quarterly, 47(133–134), Robert Craft. Columbia MS 6146 (stereo).
Herremans, D., & Chew, E. (2016). Tension ribbons: Quantifying and visualising tonal tension. Second International Conference on Technologies for Music Notation and Representation, 8–18.
Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. MIT Press.
Lerdahl, F., & Jackendoff, R. (1983). A Generative Theory of Tonal Music. Cambridge, MA: MIT Press.
Margulis, E. H. (2012). Musical Repetition Detection Across Multiple Exposures. Music Perception: An Interdisciplinary Journal, 29(4), 377–385.
Parncutt, R., & McPherson, G. E. (2002). The Science and Psychology of Music Performance Creative Strategies for Teaching and Learning. Research Studies in Music Education, 19(1), 78–78.
Smith, J., Schankler, I., & Chew, E. (2014a). Listening as a Creative Act: Meaningful Differences in Structural Annotations of Improvised Performances. Society for Music Theory, 20(3).
Smith, J., Chuan, C.-H., & Chew, E. (2014b). Audio properties of perceived boundaries in music. IEEE Transactions on Multimedia. Special Issue on Music Data Mining, 16(5), 1219-1228.

Step in time: Musical ensemble coordination in cross-cultural settings


“We hold these truths to be self evident, that all men are created equal”; an eloquent start to Thomas Jefferson’s Declaration of Independence, but also an apt summary of a model too often assumed in modern psychology research. Generalised claims regarding human behaviour are based almost exclusively on studies sampling WEIRD societies – Western, Educated, Industrialized, Rich, and Democratic societies.


Figure 1. Participant demographics from meta-analysis. Source: Jakubowski, 2016

During her talk entitled “Musical ensemble coordination in ecological and cross-cultural settings” at Goldsmiths, University of London, Dr Kelly Jakubowski introduced the demographic breakdown of participants from a selection of 97 studies on the effects of background music. She reported that 37% of the studies used a sample of “undergraduate students” and 29% used a sample of “university students”, meaning that more than 60% of the studies drew their samples from higher education establishments, as shown in Figure 1. From her selection, studies relating to music and synchrony (the field in which Dr Jakubowski is currently conducting her research) make bold claims such as ‘In a world rife with isolation, the aligned representations in interpersonal synchrony may provide a means for togetherness and connection’ (Hove & Risen, 2009). But with participants sampled from such a narrow demographic, how can such claims be substantiated across societies and cultures?

Dr Jakubowski, along with colleagues at Durham University, is researching Interpersonal Entrainment in Music Performance (IEMP), which explores how people coordinate movements in time to perform music together. From a Western classical symphony orchestra to a South Indian Carnatic music ensemble, all musicians utilise interpersonal entrainment (the timing coordination between individuals) to create a cohesive musical performance. However, different patterns of coordination and levels of synchrony are used across cultures and musical styles. How levels of asynchrony in musical performance affect aesthetic judgement is a topic of debate. Ethnomusicologist Charles Keil (1987) argues that for music to be meaningful and involving for listeners, it must be “out of time” and “out of tune” (a phenomenon he describes as “participatory discrepancy”), perhaps because this suggests a relatable element of human error. However, this view is contested by perceptual studies (e.g. Senn et al., 2016) which show that participants actually preferred as much synchrony as possible in musical performance.

The team at Durham University are currently studying both audio and visual coordination in musical performance across cultures, including Indian classical music, Malian djembe, jazz, Tunisian Stambeli, and Cuban dance music. The research into audio coordination involves studying the synchronicity of instruments using sound onset detection. In addition, the phase relationship between instruments is measured, indicating which instrument leads or follows another. An interesting finding is that the variability of asynchrony between a drummed instrument and a plucked instrument (such as a guitar) decreases as the note density increases, i.e. the faster the music, the more consistently the instruments stay in sync. This is in contrast to asynchrony between two drummed instruments, which remains constant regardless of the speed of the music.
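
Once onset times have been detected, the two quantities described above are simple to compute. The following is a minimal sketch under stated assumptions, not the IEMP team’s pipeline: the onset lists are invented, onsets are paired by nearest neighbour within a 50 ms window (an arbitrary choice), and the function names are hypothetical.

```python
import statistics


def pair_onsets(onsets_a, onsets_b, max_gap=0.05):
    """Match each onset in A to the nearest onset in B within max_gap seconds."""
    pairs = []
    for a in onsets_a:
        nearest = min(onsets_b, key=lambda b: abs(b - a))
        if abs(nearest - a) <= max_gap:
            pairs.append((a, nearest))
    return pairs


def asynchrony_stats(onsets_a, onsets_b, max_gap=0.05):
    """Mean signed asynchrony (B minus A) and its standard deviation, in seconds."""
    asynchronies = [b - a for a, b in pair_onsets(onsets_a, onsets_b, max_gap)]
    return statistics.mean(asynchronies), statistics.stdev(asynchronies)


# Invented onset times in seconds for a drum and a guitar.
drum = [0.000, 0.500, 1.000, 1.500, 2.000]
guitar = [0.012, 0.508, 1.015, 1.509, 2.011]

mean_async, sd_async = asynchrony_stats(drum, guitar)
print(f"mean {mean_async * 1000:.1f} ms, SD {sd_async * 1000:.1f} ms")
```

A positive mean indicates that the second instrument tends to sound after the first, so the first is leading. The standard deviation is the variability of asynchrony, the measure that was compared across note densities.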



Figure 2. Validation study comparing Video and Motion Capture Data. Source: Jakubowski et al., 2017

Ancillary movements are those made by performers which are not directly related to sound production. These movements are critical to musical ensemble coordination, and are the focus of the visual research by the team at Durham. Sound-producing movements occur over a timescale of milliseconds, and so can only be captured by specialist Motion Capture systems which have a temporal resolution of 120 – 160 frames per second. Ancillary movements, on the other hand, occur over a much longer timescale of seconds, such that standard video recording (with a frame rate of 25 – 30 fps) can be used to record these. Current motion capture systems deliver high definition data, but are most often constrained to a laboratory environment, due to the nature of the fixed camera sensors required for data collection. These systems are not particularly useful for field work where conditions are rarely under the researcher’s full control, and data must usually be collected promptly in an occasionally less than ideal situation. The miniaturisation of camera sensors over the last 10 years has allowed researchers to work in the field and collect high quality video footage, in this case of ensembles performing together, and bring this back to the lab for analysis. One might imagine that for movement tracking, data from a Motion Capture system would far outperform that from a standard video extraction. However, a validation study conducted by Dr Jakubowski’s team (as shown in Figure 2) reported high correlations of .75 to .94 between the output of the two systems, suggesting that data extracted from video field recordings can be used to accurately track these ancillary movements.
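
A comparison of this kind can be sketched as follows: bring the motion capture signal down to the video frame rate, then correlate the two traces. Everything here is an illustrative assumption rather than the validation study’s actual method. The movement data are synthetic, with a slow sway plus a small wobble standing in for tracking error, and the function names are hypothetical.

```python
import math


def resample(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; all lists ascending."""
    out = []
    j = 0
    for t in new_times:
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        weight = (t - t0) / (t1 - t0)
        out.append(values[j] + weight * (values[j + 1] - values[j]))
    return out


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


mocap_fps, video_fps, seconds = 120, 25, 4
mocap_t = [i / mocap_fps for i in range(mocap_fps * seconds)]
video_t = [i / video_fps for i in range(video_fps * seconds)]

# A slow 0.5 Hz sway, as an ancillary movement might look.
mocap_x = [math.sin(2 * math.pi * 0.5 * t) for t in mocap_t]
# The video estimate: the same sway plus a small wobble standing in for tracking error.
video_x = [math.sin(2 * math.pi * 0.5 * t) + 0.1 * math.sin(2 * math.pi * 7 * t)
           for t in video_t]

r = pearson_r(resample(mocap_t, mocap_x, video_t), video_x)
print(f"r = {r:.3f}")
```

Because the sway unfolds over seconds, 25 frames per second is ample to follow it, which is why the two traces agree so closely even in this toy case.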

During this validation study, the team analysed movement data from a collection of 30 videos of duos improvising in a controlled environment, and extracted an aggregate measure (the Cross-wavelet Transform) which is related to periods of peak movement between the two performers. They then compared this with a panel of expert musicians’ indications of visual interaction between the performers in the videos, aiming to validate this measure as a quantitative predictor of the experts’ qualitative indications. 72% of the periods of interaction could be predicted using just this CWT measure, a result which increased to over 90% when more specific frequency bands were added. With this new quantitative predictor, Jakubowski and colleagues were able to collect field video recordings from other cultures and perform movement tracking and analysis, to compare how patterns of movement coordination emerge as a function of other performance attributes (such as musical style, structure, metre, and performer hierarchy).
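
The idea behind a cross-wavelet measure can be shown with a small, self-contained sketch. It assumes a complex Morlet wavelet evaluated at a single frequency and multiplies one performer’s coefficients by the complex conjugate of the other’s. This is a generic textbook construction with invented movement data and hypothetical names, not the team’s actual analysis.

```python
import cmath
import math

OMEGA0 = 6.0  # Morlet centre frequency, a conventional choice


def morlet_transform(signal, fps, freq_hz):
    """Complex Morlet wavelet coefficients of `signal` at one frequency."""
    scale = OMEGA0 / (2 * math.pi * freq_hz)  # wavelet scale in seconds
    dt = 1.0 / fps
    half = int(4 * scale * fps)  # truncate the Gaussian envelope at 4 scales
    norm = math.pi ** -0.25 * math.sqrt(dt / scale)
    # Complex conjugate of the wavelet, sampled at the signal's frame rate.
    kernel = [
        norm * cmath.exp(-1j * OMEGA0 * (k * dt / scale))
        * math.exp(-0.5 * (k * dt / scale) ** 2)
        for k in range(-half, half + 1)
    ]
    coeffs = []
    for n in range(len(signal)):
        total = 0j
        for k in range(-half, half + 1):
            if 0 <= n + k < len(signal):
                total += signal[n + k] * kernel[k + half]
        coeffs.append(total)
    return coeffs


def cross_wavelet_power(x, y, fps, freq_hz):
    """Magnitude of the cross-wavelet transform of x and y at one frequency."""
    wx = morlet_transform(x, fps, freq_hz)
    wy = morlet_transform(y, fps, freq_hz)
    return [abs(a * b.conjugate()) for a, b in zip(wx, wy)]


fps = 25
times = [i / fps for i in range(10 * fps)]
# Performer X sways at 1 Hz throughout; performer Y only joins in after 5 seconds.
x = [math.sin(2 * math.pi * t) for t in times]
y = [math.sin(2 * math.pi * t + 0.5) if t >= 5 else 0.0 for t in times]

power = cross_wavelet_power(x, y, fps, freq_hz=1.0)
early = sum(power[25:75]) / 50    # seconds 1 to 3
late = sum(power[175:225]) / 50   # seconds 7 to 9
print(f"early {early:.2f}, late {late:.2f}")
```

Cross-wavelet power is high only where both performers move at the same frequency at the same time, so it rises sharply once the second performer joins in. That is the property that makes it a candidate marker for periods of shared movement.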

The second area of research for the IEMP team is synchrony and entrainment perception by listeners. Humans are able to distinguish the onset of two sounds as distinct with a separation as short as just 2 milliseconds. For the listener to correctly identify which sound preceded the other, a minimum separation of 15 – 20 milliseconds is required (Hirsh, 1959). This latter judgement can, however, be affected by a common perceptual bias related to cultural instrumental hierarchy and roles, such as the assumption that a melody instrument will “lead” before an accompanying one. The IEMP team are looking into factors which affect this asynchrony perception, as it is an important part of a listener’s evaluation of performance quality and engagement. In one study, participants were exposed to audiovisual recordings of improvising duos and asked to rate the synchrony of the performers (how “together” they felt the performance was). The results show that clips which the participants rated as high in synchrony had high spectral flux (a measure of the number of events in time, or ‘complexity’) in low-frequency sub-bands, a quality generally related to ratings of rhythmic strength and musical groove (Burger et al., 2013).
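
Spectral flux is straightforward to compute: take the magnitude spectrum of successive short frames and measure how much it changes from one frame to the next. The sketch below is a bare-bones illustration with a naive Fourier transform and synthetic signals. The frame size, hop and the `max_bin` cut-off used to keep only low-frequency bins are arbitrary assumptions, and the toolboxes used in published studies differ in detail.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative frequency bins (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(signal, frame_size=64, hop=32, max_bin=None):
    """Euclidean distance between successive magnitude spectra.

    max_bin limits the comparison to the lowest bins (a low-frequency sub-band).
    """
    spectra = [
        magnitude_spectrum(signal[start:start + frame_size])
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]
    return [
        math.sqrt(sum((c - p) ** 2 for p, c in zip(prev[:max_bin], cur[:max_bin])))
        for prev, cur in zip(spectra, spectra[1:])
    ]


# A steady tone versus the same tone switched on and off every 64 samples.
tone = [math.sin(2 * math.pi * 4 * n / 64) for n in range(256)]
pulsed = [tone[n] if (n // 64) % 2 == 1 else 0.0 for n in range(256)]

steady_flux = spectral_flux(tone, max_bin=8)
pulsed_flux = spectral_flux(pulsed, max_bin=8)
print(f"steady max {max(steady_flux):.3f}, pulsed max {max(pulsed_flux):.3f}")
```

The steady tone produces essentially no flux, while the pulsed version produces a burst of flux at every onset and offset. More events in a frequency band mean more flux, which is why the measure tracks rhythmic activity.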

A cross-cultural field study was undertaken by the team to investigate aesthetic judgement and discrimination of temporally adjusted recordings of Western jazz, Malian djembe and Uruguayan candombe music. This found a perhaps unsurprising preference across cultures for asynchrony minimisation. Interestingly, however, participants in the UK listening to Malian djembe music (which is naturally non-isochronous) preferred an isochronous variant, whereas Malians preferred the non-isochronous original, and were better able to discriminate between micro-adjustments of metric subdivision in their own music than in music from other societies. This is perhaps because this non-isochronous rhythm is more culturally ingrained for Malian listeners than for other participants.

Though Dr Jakubowski’s work is ongoing, preliminary results clearly indicate that across cultures and societies, people’s perceptions of and preferences for musical features are not uniform. Differences in asynchrony and entrainment have no doubt contributed to the plethora of distinct musical styles which have developed around the world. Establishing awareness of these differences in perception is an important step towards addressing them in wider research. This in turn may help us better understand the variations that are being observed, in terms of where and how they may arise.


Frederick Taylor



Burger, B., Ahokas, R., Keipi, A., & Toiviainen, P. (2013). Relationships between spectral flux, perceived rhythmic strength, and the propensity to move. In R. Bresin (Ed.), Proceedings of the Sound and Music Computing Conference 2013, SMC 2013, Stockholm, Sweden (pp. 179-184). Berlin: Logos Verlag Berlin. Retrieved from

Henrich, J., Heine, S., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33(2-3), 61-83. doi:10.1017/S0140525X0999152X

Hirsh, I. J. (1959). Auditory perception of temporal order. The Journal of the Acoustical Society of America, 31(6), 759-767. doi:10.1121/1.1907782

Jakubowski, K. (2016, September 30). How Weird is music psychology? [Blog Post]. Retrieved from

Jakubowski, K., Eerola, T., Alborno, P., Volpe, G., Camurri, A., & Clayton, M. (2017). Extracting Coarse Body Movements from Video in Music Performance: A Comparison of Automated Computer Vision Techniques with Motion Capture Data. Frontiers in Digital Humanities, 4(9). doi:10.3389/fdigh.2017.00009

Keil, C. (1987). Participatory Discrepancies and the Power of Music. Cultural Anthropology, 2, 275–283. doi:10.1525/can.1987.2.3.02a00010

Hove, M. J., & Risen, J. L. (2009). It’s All in the Timing: Interpersonal Synchrony Increases Affiliation. Social Cognition, 27(6), 949-960. doi:10.1521/soco.2009.27.6.949

Senn, O., Kilchenmann, L., von Georgi, R., & Bullerjahn, C. (2016). The Effect of Expert Performance Microtiming on Listeners’ Experience of Groove in Swing or Funk Music. Frontiers in Psychology, 7, 1487. doi:10.3389/fpsyg.2016.01487


‘Dynamite in my ears’ – The Exceptional Musicality of some Individuals with Autism

Can people who struggle with speech be exceptional musicians? What does this have to do with autism? Professor Adam Ockelford, Director of the Applied Music Research Centre at the University of Roehampton in London, addressed these questions in his talk at Goldsmiths, University of London, in November 2017.

Firstly, let’s define autism. As Prof. Ockelford proposed, imagine listening to the radio in an unfamiliar language that you cannot turn off. Imagine seeing through a stained-glass window, broken and glued together again in a random way. Autism is a developmental spectrum condition, influencing how people perceive the world and communicate with others. It affects approximately 1 in 100 children in the UK (The National Autistic Society, 2017). Children with so-called ‘classic’ autism often have little or even no speech. In Asperger syndrome, language is not affected. People with autism have fragmentary perception (as in the stained-glass image), and seem to pay more attention to details than to the whole. The drawings of Stephen Wiltshire re-create highly detailed landscapes. Remarkably, rather than first sketching an outline, Stephen makes his way from one end of the canvas to the other in painstaking detail.


(Wiltshire, 2016)

Autistic people have a love for pattern and predictability. In that sense, it becomes easy to see why these individuals may have social and communication deficits: after all, humans are nothing if not unpredictable. Furthermore, Prof. Ockelford explained how these individuals have difficulty understanding the emotions of others (cf. ‘Theory of Mind’), which makes socialising a huge challenge.


Prof. Ockelford (2013) suggests that the developmental trajectories of music and language, which generally evolve together, can diverge in some children on the autism spectrum, causing a delay in language understanding and use. As a result, autistic children may perceive the world in more perceptual and musical ways; for example, they may show an early fascination with everyday sounds and objects (e.g. a microwave). It is interesting to note that such sounds are musical: they have pitch and are made up of different colours and harmonics (Ockelford, 2013). This is, as suggested by Prof. Ockelford (2013), one of the outcomes of an Exceptional Early Cognitive Environment. In other words, certain sounds do not acquire wider meaning or functional significance, but are instead processed purely in terms of their sonic qualities – in musical terms.

This strong and early detail-oriented connection with music may be the reason why such a large number of children on the autism spectrum possess absolute pitch: 1 in 20 within the autistic population compared with 1 in 10,000 in Western societies. As defined by Takeuchi and Hulse (1993), absolute pitch (AP) is the ability to identify or produce a pure tone at a particular pitch without the use of an external reference pitch (e.g. from a piano).

Prof. Ockelford observes that, in autistic people with AP, the ability to recognise and sing a tone generally comes before any theoretical knowledge of music (naming notes, scales, etc.), and can be independent of language. A video was shown in which Freddie, an autistic student aged 10 at the time, was asked to reproduce a small melody on the piano. Surprisingly, instead of playing the notes on the piano, he sang them, barely brushing the piano keys. The music seemed to have sounded in his head, negating the need for him to actually play the instrument. As Prof. Ockelford reminds us, “The vivid nature of perception – crucial for our functioning and survival – beguiles us into thinking that music exists beyond ourselves in a material way” (Ockelford, 2017, p. 61).



Prof. Ockelford’s latest book Comparing Notes (2017) is an excellent resource for understanding the history of his work. It’s extremely well researched, and discusses many aspects of how we (both on and off the spectrum) derive meaning from the fuzziness that is music. Through his work with Derek Paravicini, diagnosed as having ‘classic’ autism, he discovered that Derek’s process of imitation led to agency, to musical structure, to music making sense. Derek is acclaimed as one of the greatest musical ‘savants’ ever to have lived. You can watch the amazing TED talk featuring both Prof. Ockelford and Derek here:

Prof. Ockelford commented that music cannot be cynical. It is innocent, a pure method of communication, which is thought to far precede the language we use today. The children he works with are always fun and excited, and still hit the ceiling with pleasure when much-loved chords and phrases are presented to them time and time again. Classical musicians also visit and spend time playing with the Professor’s students, as do MSc Music, Mind and Brain students, whom he encourages to get in touch and participate. Such visits can blow the cobwebs off the visitors’ perhaps grey palettes, which, after years of playing in familiar circles, could greatly benefit from the addition of fresh and brighter colour.

A collaboration of this nature was alluded to in the talk with a young girl named Romy. Romy is a lover of Bach, and music is her language of communication: her humorous character was conveyed through purposefully playing the ‘wrong notes’ to avoid interaction with early piano teachers. In this way, music becomes a proxy language for children and adults on the autism spectrum. Children like Romy are extremely musically advanced, having the ability to transpose mid-piece and, in Romy’s case, to communicate her disapproval by playing notes in the tonality most opposed to the original key. The shift in pattern allows Romy to portray her colourful personality in a complex way that astounds even advanced musicians.



So what were Prof. Ockelford’s concluding thoughts? The fundamental idea is that through the repetition of words and sounds from our surrounding environment, both language and everyday sounds can be processed as music. The early cognitive environment of a child on the autistic spectrum is a complex one; however, it is our responsibility to understand the message these remarkable individuals convey, not vice versa!

Catherine Smyth, Luca Kiss, Patrick Reis, & Simon Andrew Whitton


The National Autistic Society (2017). Autism. Retrieved from

Wiltshire, S. (2016). Cologne, Germany. Retrieved 18th November 2017 from:

Ockelford, A. (2013). Music, language and autism. 1st ed. Jessica Kingsley Publishers, 211-215.

Ockelford, A. (2017). Comparing Notes: How we make sense of music. Profile Books LTD: London.

Takeuchi, A. H., & Hulse, S. H. (1993). Absolute pitch. Psychological Bulletin, 113(2), 345-361.


Music Psychology Research in the Field



“Music is something we all do, and we can all speak about it”

Dr Alexandra Lamont’s powerful statement above was the spine of her presentation. Since the emergence of the discipline of music psychology in the 1950s, until the 1980s, the vast majority of research studies were heavily influenced by the cognitive approach, and carried out on populations with formal musical training. Unsurprisingly, theories of Western music led the experiments, frequently carried out under controlled laboratory conditions, influenced by the early methods of experimental psychology (Wilhelm Wundt, 1832-1920). Thus, following the path of William James (another key figure in modern psychology), it became increasingly necessary to understand what people do with music and what it means to them in everyday life. This is the focus of Dr Lamont’s research.

Lamont explained how, since the 80s, researchers have aimed to explore findings in the cognitive domain using infants and children, as well as less musically trained populations and non-Western music cultures. They gained interesting insights into human responses and the powerful imprint of musical enculturation (Trehub, Schellenberg & Kamenetsky, 1999). As a result, the neuroscientific approach developed widely from the 90s, using implicit measures to work out differences between people with and without musical training. A more naturalistic view, the interactive approach, was developed in parallel and was applied by Lamont, aiming to understand how children respond to music while playing exploratory and interactive musical tasks. Yet, in the last ten years, research has been mostly lab-based (Tirovolas & Levitin, 2011), and heavily influenced by cognitive experimental psychology.

In the following, we walk through some different directions of qualitative research in music psychology, focusing on people’s narrative experiences of music, their everyday life experiences and the effect of these on their music perception.


  • Turn to Narrative 

Lamont introduced us to the first of three possible approaches to music psychology qualitative research – the narrative. This essentially refers to the ability to verbalise your predisposition to a subject, in this case: music. But do untrained musicians possess the appropriate vocabulary to do this? Wolpert (2000) found that only 40% of non-musicians were able to differentiate a transposed accompaniment of a song from the original while 60% struggled to detect anything at all. Nevertheless, Lamont assured us that a lack of musical vocabulary does not stop musical conversations from happening.

Music can actually be a great ice-breaker, allowing strangers to become acquainted (Rentfrow & Gosling, 2006). And semi-structured interviews have offered Greasley, Lamont & Sloboda (2013) a deeper insight into the effects of engagement and participants’ ability to talk about their own music collections at length. A rare few who were less engaged had greater difficulty articulating what music meant to them more personally. Lamont’s own interests in similar research developed towards qualitatively interviewing those who weren’t particularly musically active. Musical participation will never be limited to trained musicians. She mentioned Gabrielsson’s (2011) Strong Experiences of Music project (SEM), which asked participants:

Write about your strongest and most intense experience of music. 

No mention of emotion, no instructions more detailed than that – complete narrative freedom. As a result, there was expected feedback (‘it was not until they played one particular song … that the intensity of their music hit me.’) … some less expected … (‘The intensity also left quickly, but in general my mood changed in a positive way.’) … and others completely unexpected, to say the least… (‘[I made] a conscious decision to end my relationship … the song and mood had such a profound effect on me.’)


  • Turn to Everyday 

In today’s age, music can be heard frequently in daily life, e.g. at home, in shops, restaurants, etc, and can induce certain moods but also exist in the background of your main activities. Yet it differs from the emotional and sentimental associations of a piece of music or a music concert, for example. Studies show that music is widely used to simply pass time (Greasley & Lamont, 2011). With the fast growing awareness of how music surrounds the general public, more in-depth research is needed to reveal the mechanisms through which everyday music affects emotional wellbeing.

Some researchers (Sloboda, 2010) argue that the purpose of studying music listening in everyday life is to explore the ways in which listening experiences induce emotions and condition individuals. According to Sloboda (2010), everyday music is widely encountered but often forgettable. However, the majority of people experience daily life with music in various genres and in different places, and technological developments (e.g. iPods, iPhones, etc.) have opened up new methodological possibilities. Using automated text messages, Greasley & Lamont (2011) revealed that there are two types of listeners, less engaged and highly engaged, who differ in how they choose their music and in how many hours they listen. The former tend to listen for an average of 12 hours per week, are less likely to self-select their music, and are more likely to use music out of habit or to feel less lonely. The latter are decidedly the opposite, listening for an average of around 21 hours per week to self-chosen music that creates a certain atmosphere for themselves. Music is clearly pervasive, and exists to fulfill numerous purposes in day-to-day living.


  • Turn to Context



Finally, Lamont turned our focus onto real-life context and its effects on music perception. The way people perceive, form and give meaning to their everyday experiences arises from a complex web of interactions between stimuli and the environment in which they exist. Although context effects have been a known factor in cognitive psychology for many years (McClelland, 1991), investigating music perception in naturalistic conditions rather than entirely artificial ones, such as laboratories, has offered great opportunities for deeper insights into the way people perceive their lived musical experiences.

Such studies in cognitive psychology are quite limited; however, Lamont presented some notable ones, evidently influenced by theories of phenomenology and ethnographic methods, in order to highlight possible methodological approaches to studying the impact of natural context on cognitive processes. For example, North & Hargreaves (1996) asked people to discuss the “where”, “when”, “with whom” and “why” of their musical experiences, whilst Groarke & Hogan (2016) generated a model associating functions of music listening with wellbeing by asking a small number of people to reflect on the ways in which they engaged with music in different settings, and compared them by age. More recently, Lamont et al. (2017) pursued a case study of an older people’s choir using qualitative research methods, i.e. interviews, observation, focus groups and participatory discussions. The study found that social relationships, meaning and accomplishment were the main reasons why older people chose to be part of this community choir (Lamont et al., 2017). Moreover, for a recent project aiming to delve into the festival experience, Lamont herself employed participant-observation methods by attending the event, and in doing so, literally put “psychology in the field” into practice.

Whilst there will always be drawbacks, such as the inability to gain full control over research conditions, there are new doors that technological developments are opening for the future. Music constantly surrounds us, and by being able to get as close to the phenomenon as we can, we gain invaluable insight. But of course, balance is the key. As Dr Lamont phrased it: we need William James as much as Wilhelm Wundt!


Written by: Ahmad Bin Abdul Latiff, Aspasia Papadimitriou, María Sánchez Moreno, and Sarah Hashim



Gabrielsson, A. (2011). Strong experiences with music: Music is much more than just music. Oxford: Oxford University Press (translation of Gabrielsson, A., 2008, Starka musikupplevelser – Musik är mycket mer än bara music, Hedemora: Gidlunds).

Greasley, A. E., & Lamont, A. (2011). Exploring engagement with music in everyday life using experience sampling methodology. Musicae Scientiae, 15(1), 45-71.

Greasley, A. E., Lamont, A., & Sloboda, J. A. (2013). Exploring musical preferences: An in-depth study of adults’ liking for music in their personal collections. Qualitative Research in Psychology, 10(4), 402-427.

Groarke, J. M., & Hogan, M.J. (2016). Enhancing wellbeing: An emerging model of the adaptive functions of music listening. Psychology of Music, 44(4), 769–791. DOI:

Lamont, A., Murray, M., Hale, R. & Wright-Bevans, K. (2017). Singing in later life: the anatomy of a community choir. Psychology of Music. DOI:

McClelland, J. L. (1991). Stochastic interactive processes and the effect of context on perception. Cognitive Psychology, 23(1), 1-44. DOI:

North, A.C. & Hargreaves, D.J. (1996). The effects of music on responses to a dining area, Journal of Environmental Psychology, 16, 55-64. DOI:

Rentfrow, P. J. & Gosling, S. D. (2006). Message in a ballad: The role of musical preferences in interpersonal perception. Psychological Science, 17(3), 236-242.

Sloboda, J. (2010). Music in everyday life: the role of emotions. Handbook of Music and Emotion: Theory, Research, Applications, 493-514. URL:

Tirovolas, A., & Levitin, Daniel. (2011). Music Perception and Cognition Research from 1983 to 2010: A Categorical and Bibliometric Analysis of Empirical Articles in “Music Perception”. Music Perception, 29(1), 23-36.

Trehub, S., Schellenberg, E., Kamenetsky, S., & Carr, Thomas H. (1999). Infants’ and Adults’ Perception of Scale Structure. Journal of Experimental Psychology: Human Perception and Performance, 25(4), 965-975.

Wolpert, R.S. (2000). Attention to Key in a Nondirected Music Listening Task: Musicians vs. Nonmusicians. Music Perception, 18(2), 225-230.


Hearing Musical Tempo: You need more than just your ears.

Invited Speaker Series: A talk by Justin London at Goldsmiths, University of London. 

There is no doubt that there are some songs that you just can’t help but dance to… but have you ever stopped to think that there is more going on than grooving to the beat?
There are several complex processes taking place as you figure out where the beat is, what it is and how fast it’s going. All this takes place before you begin to move your feet, nod your head or clap your hands.

The Carlton Dance

It Can Get Complicated!

In his presentation to the students on the MSc in Music, Mind and Brain at Goldsmiths, Justin London from Carleton College (USA) explained how our judgment of tempo depends on multiple factors, both auditory and non-auditory. The auditory factors affecting our judgments include beats per minute (BPM), rate of surface activity, dynamics and spectral flux.

Spectral flux is a measure of how the acoustical energy in various parts of the auditory spectrum varies over time. Music with low spectral flux will contain fewer events, while music with high spectral flux will necessarily have more events and can be more complex.
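For readers who like to see the idea in numbers, here is a minimal sketch of one common way to compute spectral flux: take the magnitude spectrum of consecutive short frames and measure how far each spectrum is from the previous one. This is a generic textbook-style definition, not necessarily the measure used in the studies London described. The function names, the frame size and the two test signals are all assumptions made up for illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(signal, frame_size=64):
    """Euclidean distance between magnitude spectra of consecutive frames."""
    # Split the signal into non-overlapping frames (an illustrative simplification).
    frames = [
        signal[i:i + frame_size]
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]
    spectra = [magnitude_spectrum(f) for f in frames]
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(prev, cur)))
        for prev, cur in zip(spectra, spectra[1:])
    ]


# Hypothetical test signals: a steady tone versus a tone that switches pitch every frame.
rate = 64
steady = [math.sin(2 * math.pi * 4 * t / rate) for t in range(rate * 4)]
changing = [
    math.sin(2 * math.pi * (4 if (t // rate) % 2 == 0 else 12) * t / rate)
    for t in range(rate * 4)
]

steady_flux = spectral_flux(steady)      # spectrum never changes, so flux stays near zero
changing_flux = spectral_flux(changing)  # spectrum jumps every frame, so flux is large
```

The steady tone stands in for "low flux" music, where the spectrum barely changes from one moment to the next. The pitch-switching tone stands in for "high flux" music, where new events keep redistributing the energy across the spectrum.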

London spoke about a study which tested participants’ tempo judgments of music with high flux and low flux at different tempos (London et al., 2015). Each test of tempo was played to the participant both quietly and loudly, to also assess the effect of volume on tempo perception. The study revealed that music with high flux was perceived as being faster than the simpler low-flux music.

Figure 1. Justin London’s results bar graph for Flux test.


Step to The Beat!

London and his colleagues (2016) conducted a study to examine if the perception of musical tempo can be affected by the visual information provided. They started by recording videos of participants dancing to songs that are known to have high ‘grooviness’, where ‘groove’ was described as the degree to which a listener will want to move along with the beat of a song (Think Motown, Stevie Wonder).

The dancers were required to dance along to songs with their tempo increased or decreased by 5% (time-stretched versions), as well as in the song’s original tempo (baseline tempi versions). For the time-stretched versions, participants were asked to dance freely, whereas for the baseline tempi versions, participants were requested to dance either in a relaxed (slow) or vigorous (fast) manner.

Motion capture animations based on the video recordings of the dance movements were then presented to a different group of participants to rate the speed of the songs. The second group of participants were exposed to the music in 3 different ways: audio-only, audio with video, and video-only.


Figure 2. An example of how participants were exposed in the audio with video condition (from London et al., 2016)

The study found that the relaxed versions of the song were perceived as slower, and the vigorous versions as faster. There seems to be a visual-auditory tempo illusion: the differing movement (relaxed / vigorous) of the dancers can influence one’s perceived tempo, even when their body movements and the beat of the music are in synchrony (London et al., 2016).

Results also showed that the songs with relaxed versions were perceived as even slower when compared to the songs with the highest tempo (130 BPM) when presented without audio (video only). The explanation offered was that it is acceptable to have slow movements in a fast song, but not fast movements in a slow song. (Imagine rocking your heart out to the song Starry Starry Night (Vincent)!) Thus, London and his colleagues concluded that when one encounters a conflict between the audio and visual input, this information is integrated in a meaningful manner, in the way that makes the “best sense” to the perceiver.

Watch What You’re Listening To!

The McGurk Effect (McGurk & MacDonald, 1976) is a fascinating example of cross-modal perception. It demonstrates the powerful link between sight and hearing. A great example of the McGurk Effect has been demonstrated using a video loop of a person pronouncing a syllable like “ga” twice. The video and audio tracks are purposefully played out of sync to begin with and gradually synchronize. The confusion experienced was initially created and then resolved by vocal motor neurons. Recent research has discovered the existence of mirror neurons. These are the same neurons which make you laugh or cringe when you see someone fall over Charlie Chaplin style (Manfredi, Adorni & Proverbio, 2014).

Mirror neurons are multimodal association neurons that increase their activity during the execution of certain actions and whilst hearing or seeing corresponding actions being performed by others (Schreuder, 2014).

When it comes to perceiving musical tempo during a live performance, these neurons play an important role in “feeling” the beat. Imagine seeing a marimba player or Taiko drummer perform. As you watch their hand rise and strike, there is a point where what you see will affect what you hear. Schutz and Lipscomb (2007) found that the gesture of a percussionist can affect the observer’s perception of the sound, changing how long a note is heard to last. Observing a drummer perform an epic solo whilst attending to the rhythm of the numerous beats he or she plays is generally a jaw-dropping experience. The flux and volume, as mentioned before, will also affect how you perceive a performance and react. Not to mention the adrenaline rush most people feel when they see their favorite band performing.

Click on the link below to experience the confounding McGurk effect, the “magic of the mind”, for yourself.

Truly, we can see that there are many sensory modalities involved in such a simple task as keeping up with the rhythm of a song. So the next time you’re at a concert watching your favorite performer singing and moving along with the song, remember that seeing them has (almost?) as big an impact as hearing them.

This blog was written following Justin London’s presentation to Goldsmiths’ Music, Mind and Brain MSc students on 1/12/2016 as part of the ‘Invited Speaker’ series.

Authors: Kelly Kai Ling Yap, Joseph Trott and Sinanezelo Mancama.

For more details on the Music, Mind and Brain MSc, please visit: brain/.


Acharya, S. and Shukla, S. (2012). Mirror neurons: Enigma of the metaphysical modular brain. Journal of Natural Science, Biology and Medicine, 3(2), p.118.

Fowler, C. A., Galantucci, B., Saltzman E. (2003). Motor theories of perception. The handbook of brain theory and neural networks, MIT Press, 705-707.

Galantucci, B., Fowler, C. A., & Turvey M. T. (2006). The motor theory of speech perception reviewed. Psychonomic Bulletin & Review, 13(3), 361–377.

London, J. (2016). Hearing Musical Tempo: You Need More than Your Ears. Presentation, Goldsmiths, University of London.

London, J., Burger, B., Thompson, M., & Toiviainen P. (2016). Speed on the dance floor: Auditory and visual cues for musical tempo. Acta Psychologica, 164, 70–80.

Manfredi, M., Adorni, R. and Proverbio, A. (2014). Why do we laugh at misfortunes? An electrophysiological exploration of comic situation processing. Neuropsychologia, 61, pp.324-334.

McGurk, H., & MacDonald, J. (1976). “Hearing lips and seeing voices”. Nature, 264(5588), 746–748.

Schutz, M. and Lipscomb, S. (2007). Hearing gestures, seeing music: Vision influences perceived tone duration. Perception, 36(6), 888-897.

Schreuder, D. (2014). Vision and Visual Perception (1st ed.). Archway Publishing.

Posted in Uncategorized | Leave a comment

Helping to find your voice: A project by Karen Wise on the science of singing

Singing is part of everyday life: from supporting our favourite sports teams with the National Anthem, to soothing children with a lullaby. Yet, while I always burst into ‘Happy Birthday’ when the cake comes out at a party, my grandmother stays mute. It’s sad that she feels unable to engage in such an essential human activity; however, she’s not alone.


Staying mute: a political stance or is Jeremy Corbyn too shy to sing?

According to a 2005 study by Cuddy and colleagues, around 17% of people define themselves as ‘tone deaf’, whilst researchers Hutchins & Peretz (2012) estimate a fifth of adults are ‘poor pitch singers’. These individuals, and the wider population of self-identified ‘non-singers’, often hold negative beliefs about singing and their own abilities. One cause of such beliefs can be criticism from someone such as a teacher or parent during childhood, but this is not the only source.

My grandmother believes that she “can’t” sing and that singing is an all-or-nothing, innate ability. However, music education and development expert Graham Welch (1985) conceptualizes singing as a continuum of skill: one requiring the coordination of perception, cognition and movement, that we normally acquire during early childhood, but that can also manifest later in life.

Below is a simplified representation of how singing behavior changes over time, based on data from thousands of children (Welch, 2005).


As we get older we develop greater accuracy over our voices, becoming aware that vocal pitch can be consciously controlled (pitch direction control) and developing the ability to move in the right direction, reproducing the ups and downs in a melody (contour accuracy). Interval accuracy is the ability to make these ups or downs the right distance apart. At this stage, pitching may be fairly good within a phrase – a line sung in one breath – but not as accurate between phrases, once a breath is taken. By the age of eleven the vast majority of individuals are able to sing a simple song in tune (tonal stability). The green arrow represents the development of vocal use, with the range of notes we can sing accurately increasing as we learn vocal control (Rutkowski, 1990). This is an overall trajectory, and not a series of linear steps, as the same child can produce performances at different places on the continuum depending on the difficulty of the song, the context, etc.

“Non-singers” may have difficulties with one or more stages of these processes; however, nothing concrete has yet been identified, as little is known about what occurs developmentally after the age of eleven or about the role of training or maturation. One interesting finding comes from a study by researchers Demorest and Pfordresher (2015) that compared the singing accuracy of primary school children with secondary school children (years 7-9) and university students. They found that whilst children’s skills increased dramatically from 5 years to 13 years of age, university students performed at the level of primary school children, indicating that adults who have stopped singing may regress in their singing skills.

But there is hope: an exploratory intervention study by Numminen and colleagues (2015) found that, given the appropriate supportive singing opportunities such as vocal training, negative beliefs of non-singers can be changed and singing skills can be improved.

This transformation intrigues Dr Karen Wise, a Research Fellow at the Guildhall School of Music and Drama. As a psychologist, lecturer, and mezzo-soprano Dr Wise has a particular interest in the psychology of singing, focusing on singing difficulties in untrained and ‘non-singing’ adults. She is currently heading a multi-method, interdisciplinary, intervention project, funded by the Arts and Humanities Research Council. The project, called ‘Finding a voice: The art and science of unlocking the potential of adult non-singers’, is observing the journey of learning to sing in adulthood.


Singing together: over 320 choirs are part of the Rock Choir organisation

Despite the growth in music-making opportunities that are ostensibly for non-singers, such as the un-auditioned choral group Rock Choir, Dr Wise is unsure whether they are as inclusive or as evidence-based as they could be. Although researchers are investigating the role of music for health, she argues that studies have, so far, only focused on interventions for specific groups, overlooking the large population of non-singers, and focusing strongly on outcomes rather than detailed description of interventions. The lack of documentation, systematic research and evidence for effective strategies to support adult singers’ needs is a glaring gap in research, and is one she hopes to fill.


As a singer and teacher herself, Dr Wise’s perspective is a unique one. Not only does she wish to investigate singing in adulthood from a cognitive psychological perspective but also from that of a vocal pedagogue. Rightly, she has highlighted the gap between the scientific community and that of singing practitioners like teachers and choir leaders. Additionally, there is barely any concrete literature on how singing teachers deal with poor-pitch singers as their strategies have not been systematically investigated or solidified. With her work, Wise hopes to encourage communication between these fields to integrate scientific research with vocal pedagogy, piecing together the relationship between singing skills and other aspects of musicality, whilst developing a greater understanding of what it means to sing.

The 33-month-long project is currently ongoing and structured in two strands. The first is a naturalistic study tracking the progress of 20 non-singing adult participants as they undergo a yearlong practical singing course at Guildhall School of Music and Drama (September 2016 – July 2017). They will receive a combination of individual lessons, group singing sessions and workshops.

The researchers used a broad definition of ‘non-singer’, accepting individuals who avoid singing, self-define as tone-deaf, only sing in the shower, or believe they can’t sing. Following a recruitment drive, an overwhelming 355 initial respondents were whittled down to the final 20 participants (11 women, 9 men aged 23-71). They have a range of musical and singing skill levels as well as different attitudes and self-beliefs.

Dr Wise and her team used several psychometric tools to assess the participants’ baseline skills, including the Gold-MSI – a battery of tests that flexibly assess individuals’ ability to engage with music, involving a self-report questionnaire and tests of melody memory and beat perception – and the Seattle Singing Accuracy Protocol, which is an online 15-20 minute test of singing accuracy and related skills. They also asked about beliefs regarding singing, singing identity and self-perceptions, along with participants’ educational level, their engagement with singing activities over the past year, and aspects of health that may affect singing or listening.

The researchers will monitor the participants’ progress in a variety of ways including video recording individual and group lessons, asking both participants and teachers to keep diaries and other reflective writing, and conducting interviews. The assessments used to evaluate the participants’ singing abilities prior to the intervention will be repeated at two more time points, during and after the singing course. Wise hopes to find an improvement in singing abilities over the yearlong training course, whether minor or huge. And of great interest are the kinds of changes that take place and how skills develop, as well as how people experience their journey of learning to sing.

Still in development, the second strand of the project is based on evidence of a correlation between auditory imagery, the ability to imagine sounds vividly in the mind’s ear, and singing accuracy (see Pfordresher & Halpern, 2013). It will look at the relationship between singing, auditory imagery, and other cognitive skills through the use of a specially designed app.

And hopefully this research will help adults like my grandmother find their voice.


Authors: Robyn Donnelly & Jen Mair

For more information on Dr. Wise or the project, please visit:



Cuddy, L., Balkwill, L., Peretz, I., and Holden, R.R. (2005). Musical difficulties are rare: A study of ‘tone deafness’ among university students. Annals of the New York Academy of Sciences, The Neurosciences and Music II: From Perception to Performance 1060: 311–324.

Demorest, S. and Pfordresher, P. (2015). Singing accuracy development from K-adult: A comparative study. Music Perception: An Interdisciplinary Journal 32(3): 293–302.

Hutchins, S. and Peretz, I. (2012). A frog in your throat or in your ear: Searching for the causes of poor singing. Journal of Experimental Psychology: General 141(1): 76–97. doi:10.1037/a0025064.

MacDonald, R. A. R. (2013). Music, health, and well-being: A review. International Journal of Qualitative Studies on Health and Well-Being, 8, 10.3402/qhw.v8i0.20635.

Numminen, A., Lonka, K., Raino, A. P., & Ruismäki, H. (2015). “Singing is no longer forbidden to me – it’s like part of my human dignity has been restored”. Adult non-singers learning to sing: An explorative intervention study. The European Journal of Social and Behavioural Sciences 12: 1660–1674.

Pfordresher, P.Q., Halpern, A. R. & Greenspon, E.B. (2015). A mechanism for sensorimotor translation in singing: The Multi-Modal Imagery Association (MMIA) model.

Rutkowski, J. (1990). The measurement and evaluation of children’s singing voice development. The Quarterly Journal of Teaching and Learning 1: 81–95.

Welch, G.F. (1985). A schema theory of how children learn to sing in tune. Psychology of Music 13: 3–17.

Welch, G.F. (2005). Singing as Communication. In: D. Miell, R. MacDonald, and D.J. Hargreaves (eds), Musical Communication, pp. 239–259. Oxford: Oxford University Press.

Wise, K.J. (2009). Understanding “tone deafness”: A multi-componential analysis of perception, cognition, singing and self-perceptions in adults reporting musical difficulties. PhD Thesis, Keele University.


