Many studies have revealed shared music–language processing resources by finding an influence of music harmony manipulations on concurrent language processing. However, the nature of the shared resources has remained ambiguous. They have been argued to be syntax specific and thus due to shared syntactic integration resources. An alternative view regards them as related to general attention and, thus, not specific to syntax. The present experiments evaluated these accounts by investigating the influence of language on music. Participants were asked to provide closure judgements on harmonic sequences in order to assess the appropriateness of sequence endings. At the same time participants read syntactic garden-path sentences. Closure judgements revealed a change in harmonic processing as the result of reading a syntactically challenging word. We found no influence of an arithmetic control manipulation (experiment 1) or semantic garden-path sentences (experiment 2). Our results provide behavioural evidence for a specific influence of linguistic syntax processing on musical harmony judgements. A closer look reveals that the shared resources appear to be needed to hold a harmonic key online in some form of syntactic working memory or unification workspace related to the integration of chords and words. Overall, our results support the syntax specificity of shared music–language processing resources.
1.1 Syntax versus attention explanations for music–language interactions
Music and language are universal human faculties . Moreover, both of these cognitive domains are based on structured auditory sequences, i.e. they contain discrete elements (e.g. words in language, tones/chords in music) which relate to each other in a rule-governed way in order to form higher order structures (e.g. sentences in language and harmonic sequences in music). This similarity has led researchers to hypothesize that the same processing resources underlie both domains. Supporting evidence comes from a host of experiments which found music harmony manipulations to change language syntax processing [2–8]. However, the nature of the shared resources which lead to these music–language interactions has remained controversial. On the one hand, they are argued to be specific to music and language syntax, i.e. they are the result of shared syntactic processing resources . On the other hand, these effects are thought to be an instance of many other, similar, interactions including those between music harmony and language semantics [10,11]. In this view, music–language interactions arise due to the effect of music manipulations on general attention. In this paper, a previously little explored direction of influence will be investigated: the linguistic influence on music harmony perception. By doing so, we directly test predictions from a syntax account versus those from general attention accounts.
1.2 Music–language interactions as explained by a syntax account
Patel’s  shared syntactic integration resource hypothesis (SSIRH) argues that music and language share structural processing resources. It was designed to reconcile neuropsychological evidence for domain independence  with neuroimaging studies which found evidence for shared resources in terms of similar activation sites and time-courses for linguistic syntactic processing and tonal harmonic processing1 [13,14]. Both kinds of findings together were thought to support a distinction between representations in long-term memory (domain specific, explaining neuropsychological double-dissociations) and resources for online syntactic integration (domain general). Similar activation effects reflect domain-general syntactic integration resources which in turn draw on domain-specific representations.
A key prediction of Patel’s  SSIRH is an interaction between music and language syntax processing when both occur at the same time. This prediction has been supported by two behavioural studies finding impaired syntactic integration abilities in language if a concurrent musical tone or chord is difficult to harmonically integrate [2,5]; see also [3,7,8]. For example, Slevc et al.  presented participants with syntactic garden-path sentences of the form below.
Participants reading the sentence in (a) are likely to, at first, misinterpret ‘the defendant’ as being advised. Upon encountering the verb ‘was’, readers realize that someone is being advised about ‘the defendant’, a syntactic reanalysis which taxes integration resources. In (b) the correct reading is enforced early on when reading ‘that’. As a consequence, reading times on disambiguating words such as ‘was’ are typically longer in sentences like (a) compared with (b). Crucially, Slevc et al.  found that this syntactic garden-path effect is intensified by a harmonically unexpected (out-of-key) chord presented concurrently with the disambiguating word (‘was’). A timbral unexpectancy (an unusual instrument) was without effect. The authors interpreted this as evidence against acoustic deviancy alone underlying the musical influence on language. The SSIRH explains this pattern by assuming that an out-of-key chord is difficult to integrate into the prevailing key, i.e. it taxes resources involved in syntactic integration. This leaves fewer resources available for concurrently reanalysing the syntactic structure of the sentence, hence the increased linguistic garden-path effect. An unexpected instrument, on the other hand, does not tax integration resources, and hence has no effect.
1.3 Music–language interactions as explained by attention accounts
An alternative explanation for the influence of music on reading times is based on general attention mechanisms. These can take two forms. One proposal is an attentional load account. For example, Perruchet & Poulin-Charonnat  found a semantic garden-path effect when a harmonically unexpected (out-of-key) chord was presented concurrently with the disambiguating word. No semantic garden-path effect was found when the chord was expected (in-key) instead. They found no influence of the music harmony manipulation on semantic error processing. The results were argued to support an attentional load account, i.e. one domain (e.g. music) influences another (e.g. language) only if enough attentional resources are left to process both. Semantic garden-path processing is thought to allow for the distribution of attention across domains, while semantic error processing does not. Crucially, the key prediction of this account for the present investigation is that the nature of the processing difficulty, e.g. language syntax or something else, is irrelevant, placing it in sharp contrast to the syntax account by Patel .
A different attentional account was put forward by Poulin-Charronnat et al.  to explain an effect of music harmony on the semantic priming effect seen in lexical decisions. Participants were presented with sung sentences ending either on a word or a non-word and participants were asked to decide on the lexical status of the final item in the sentence. Words could be either semantically expected or not. The semantic expectancy effect was larger when the final word was sung on pitches forming an expected (tonic) rather than a less expected (subdominant) final chord. This result was taken to reflect attentional entrainment, as theorized in Jones’ dynamic attending theory [15,16]. According to this account, attentional fluctuations entrain to harmonic accents such as expected chords. This leads to an attentional peak when a tonic is heard. Greater attention at this point in time facilitates linguistic processing, hence the increased semantic priming effect. The same account has also been suggested to explain the music effect on visual processing  and phoneme monitoring . Crucially, the nature of the stimulus is irrelevant in this account as long as it leads to attentional entrainment. This—again—is in sharp contrast to Patel’s  account which suggests that only syntactic manipulations in language will affect music processing.
1.4 The present paper: can language influence music perception? If so, how is music perception affected?
All the studies mentioned so far investigated musical influences on language or other tasks. We decided to reverse the direction of influence, i.e. we focus on linguistic influences on music processing. People listened to music while reading sentences or control arithmetic problems. The predictions of the syntax and the attention accounts outlined above are quite clear. Patel’s  SSIRH predicts that only syntactic manipulations will affect music harmony processing. Specifically, a challenging language syntax condition (ambiguous S-coordinations [19–21] in experiment 1 and object-relative clauses  in experiment 2) should tax musico-linguistic integration resources, leading to an impaired ability to harmonically integrate chords into an unfolding sequence. Non-syntactic processing challenges (difficult arithmetic operations in experiment 1 and semantic garden-path sentences  in experiment 2) should be without effect. The aforementioned attentional accounts, on the other hand, predict a non-specific effect, i.e. an effect for both the syntactic manipulations and the control manipulations.
Either way, the influence of a concurrent task on music perception could take two different forms since there are two aspects to harmony processing. First, the listener has to integrate chords in order to establish a harmonic key. Second, he/she must integrate chords with an already established key which is held online in some form of unification space (see §5.3 for details). We investigated these processes by presenting the challenging moment in concurrent tasks while the music piece modulated from an established harmonic key to a new key, i.e. when an old key has to be kept online without being reinforced by chords and a new key has to be established from the incoming chords. Chord sequences ended either on an authentic cadence typical for the first established key (probing whether the old key could be held online after it was no longer reinforced) or the second established key (probing whether a new key could be established after a critical moment in the concurrent task). The measure of interest which provides us a window into these processes is found in closure ratings. Participants are simply asked to rate their feeling of completeness (closure) regarding the music . Closure is high if the last chords can be integrated into an available key. Reduced closure ratings indicate difficulty with harmonic integration—as expected in the case of syntactically challenging sentences compared to easier sentences.
2. Experiment 1 (exploratory): language syntax versus arithmetic difficulty
Fifty-eight participants were invited to take part in the experiment. One participant did not advance to the experiment as she did not understand the musical task after repeated chance-level practice trial performance. Three participants were rejected due to an error in counterbalancing.2 This leads to a final sample of 54 participants (19 men, 44 right handed). They were all Dutch native speakers, aged 23 on average (s.d.=6.3), with 3.9 years of musical training on average (s.d.=3.7). Thirty were self-described non-musicians, 19 amateur musicians, 5 semi-professional musicians. They were paid €16 or undergraduate course credits for their participation and were naive as to the purpose of the experiment.
2.1.2 Design and material
We employed a 2 (Key: first key ending or second key ending) × 3 (Difficulty: ambiguous S-coordination, unambiguous S-coordination or NP-coordination) design for the musico-linguistic part of the experiment and a 2 (Key: first key ending or second key ending) × 2 (Difficulty: hard, easy) design for the musico-arithmetical part. All factors were manipulated within-subjects. All critical stimuli can be found in the appendices. Sound files are available as the electronic supplementary material. The task paired musical stimuli (harmonic sequences requiring a closure rating) with linguistic stimuli (visually presented sentences requiring a comprehension answer) or arithmetic stimuli (visually presented arithmetic formulae requiring a solution judgement). Auditory music stimuli were presented concurrently with visual language or arithmetic. The critical point in time was always the ninth position (highlighted or underlined in the examples below: figure 1, and examples 1 and 2), corresponding to the start of a new key in the musical material, the garden-path disambiguation in the ambiguous S-coordination sentences of the language stimuli and a difficult operation in the hard arithmetic trials.
188.8.131.52 Music stimuli
The music stimuli consisted of chord sequences specifically composed for this experiment by the first author. As shown in figure 1, 10 items were composed beginning in C-major (seven chords), followed by a pivot chord which is part of both the C-major and B-flat-major keys (F-major chord or d-minor chord), followed by four chords in the B-flat-major key, ending with two chords forming an authentic cadence in C-major (first key ending) or B-flat-major (second key ending). These two versions of the 10 items were transposed twice (first key = D-major and second key = C-major; first key = B-flat major and second key = A-flat major). This resulted in 60 critical chord sequences (10 items × 2 endings × 3 transpositions). Next to the critical items, filler items in only one key (e.g. C-major) were constructed. For example, the second-key-region was transposed into C-major and a first-key-typical ending (C-major cadence) was chosen. Half of the filler sequences ended in an authentic cadence (requiring high closure ratings), half did not (dominant followed by supertonic or subdominant, requiring low closure ratings). This resulted in 60 filler chord sequences which allowed us to check whether participants paid attention to the musical task. Overall, participants were exposed to as many modulating (critical) sequences as non-modulating (filler) sequences. All chord sequences were played by a piano at a tempo of 96 bpm, consisting only of crotchets (quarter notes) except for the last chord which was made up of dotted minims (three-quarter notes).
184.108.40.206 Language syntax stimuli
The critical language stimuli were a modified version of a stimulus set developed by Kerkhofs et al. . As can be seen in example 1, 60 critical items were constructed in three versions each: an ambiguous S-coordination (garden-path version), an unambiguous S-coordination (intermediate difficulty version with a disambiguating comma before ‘en’, and) and an NP-coordination (non-garden-path version). They were 14 words long. Pre-test results (see §2.1.3) revealed that the unambiguous S-coordination sentences were not perceived as significantly easier than the ambiguous S-coordination sentences. Therefore, we will concentrate on the contrast between the challenging syntax condition (ambiguous S-coordination) and the easier syntax condition (NP-coordination) here; see the electronic supplementary material for an analysis of critical trials involving unambiguous S-coordination sentences.
(1) Language example
(1a) ambiguous sentence-coordination (S-coordination, garden-path condition)
Translation: The surgeon consoled the man and the woman put her hand on his forehead.
(1b) noun-phrase-coordination (NP-coordination, non-garden-path condition)
Translation: The surgeon consoled the man and the woman because the operation had not been successful.
The beginning of each sentence (‘De chirurg troostte de man…’, The surgeon consoled the man…) was followed by ‘en’ (and) and a two-word noun phrase (‘de vrouw’, the woman), followed by a six-word ending of the sentence. This ending began either with a verb (ambiguous S-coordinations, ‘…legde haar hand op zijn voorhoofd.’, …put her hand on his forehead.) or without (NP-coordination, ‘…omdat de operatie niet gelukt was.’, …because the operation had not been successful.). Thus, these two stimulus versions only differed by the sentence ending.
The 60 filler items were also 14 words long and were made up of diverse syntactic constructions. 20 of the 60 filler stimuli included a NP-coordination, ensuring that overall participants read as many S-coordination sentences as NP-coordination sentences. Comprehension prompts of critical trials targeted the S- or NP-coordination ambiguity (e.g. 1: ‘De chirurg troostte alleen de man’. The surgeon only consoled the man.—true for S-coordination, false for NP-coordination). Filler comprehension prompts targeted various aspects of the filler sentence. Half the prompts required a ‘matches the sentence’ response, half required a ‘no match’ response.
220.127.116.11 Arithmetic stimuli
The arithmetic stimuli were made up of seven numbers and seven operators each. Only additions and subtractions involving numbers equal to or below 10 were used after the first number. Interim solutions never exceeded 21 and were always positive integers. Arithmetic operations were classified as easy, hard or intermediate. Easy operations were additions or subtractions of the numbers 1, 2 or 10. Hard operations did not respect a 10-boundary, e.g. the addition of two numbers smaller than 10 whose sum is greater than 10. All other operations were seen as of intermediate difficulty. Note that the time pressure of rapid serial visual presentation meant that the arithmetic task was not too easy as shown by an accuracy of 84% on average.
There were 40 critical items, thus ensuring an equal number of arithmetic and linguistic critical trials in each Difficulty × Key-ending design cell (10 trials). As shown in example 2, their hard and easy versions were identical with regards to the first eight positions as well as the final two positions and the solution. The two stimulus versions differed according to the operations required at positions 9 (underlined in example 2) to 12. At the ninth position, in the hard version, the task required a difficult arithmetic operation, while in the easy version the operation was easy. In both hard and easy stimuli the operations after position 12 were easy. The 40 filler items were of the same length and only included easy and intermediate operations. Akin to comprehension prompts in the language task, prompts in the arithmetic task either matched the true solution (50% of trials) or differed from it by 1 (50% of trials).
(2) Arithmetic example
2.1.3 Pre-test: the strength of the language syntax and arithmetic manipulations
A pre-test (N=24) showed that the difficulty manipulations in the language syntax task and the arithmetic task were both noticeable as revealed by trial difficulty ratings (Difficulty main effect; language: p<0.001; arithmetic: p<0.001; see the electronic supplementary material). As mentioned before, ambiguous and unambiguous S-coordination sentences were not perceived as significantly different [t23<1], suggesting the contrast ambiguous S-coordination versus NP-coordination will be more powerful in revealing linguistic influences on music processing. Therefore, we will focus on this contrast. Overall, the arithmetic manipulation was more salient (Manipulation × Difficulty interaction, p=0.028). This suggests that effects based on general task difficulty should be more easily visible in the arithmetic task. Thus, if there is an effect of language on music and it is due to general task difficulty, then the arithmetic control task should also show an effect.
Participants were instructed to perform two tasks simultaneously: an auditory music task presented concurrently with a language task or an arithmetic task (figure 2). The type of concurrent task (language or arithmetic) was blocked and the order counterbalanced. An experimental session was organized as follows. Participants first completed an experimental block. Thereafter, they completed a musical background questionnaire and a working memory test (digit span ), followed by another experimental block. A testing session took approximately 2 h.
In the music task participants had to judge the closure, i.e. feeling of completeness, of a chord sequence on a 7-point Likert scale (1 = low closure, 7 = high closure). Closure judgements were required after each chord sequence. Participants were asked to answer a language/arithmetic comprehension prompt which followed one-third of the closure judgements (randomly chosen). Given the limited interest in the language/arithmetic task behaviour itself and in order to save time, we only included comprehension prompts on catch trials in order to ensure task adherence. Their unpredictable occurrence meant that the linguistic/arithmetic task had to be carried out throughout the experiment. Participants received feedback on their language/arithmetic accuracy every 20 trials.
In the musico-linguistic part of the experiment participants completed 120 trials, 40 of which were critical trials containing a garden-path (ambiguous S-coordination) or non-garden-path sentence (NP-coordination) in the language task and a harmonic sequence with modulation in the music task. Critical linguistic items were randomly paired with critical musical sequences.
Each language item was only shown in one version to each participant. The choice of item version was counterbalanced. Specifically, language item 1 might be presented in language condition A to participant 1, condition B to participant 2 and condition C to participant 3. For these three participants, the position of the item in the experiment was the same. Furthermore, the language item was paired with the same music stimulus for these three participants. For the next three participants, this item had a different position (determined by chance) and was matched with a different music stimulus (determined by chance). This way, a particular music–language item match could have no systematic influence on the results and the position in the experiment is counterbalanced for different language conditions. Each music item was presented once in every version to each participant. Trials were pseudo-randomized with the following constraints: (i) at least three trials between different versions of a music item, (ii) at most three filler or critical trials after each other, and (iii) at most three same answer conditions (prompt matching or not matching) after each other. The music stimuli were presented auditorily (608 ms per chord). The language stimuli were presented word by word (500 ms per word followed by 108 ms ISI) at the centre of the screen. The onset of a word presentation coincided with the onset of a chord.
In the musico-arithmetic part of the experiment participants completed 80 trials, 40 of which were critical trials containing a hard or an easy arithmetic problem in the arithmetic task and a harmonic sequence with modulation in the music task. The other half were filler trials. The combination of musical and arithmetic items, counterbalancing and trial randomization was done in the same way as described for the musico-linguistic task, except that only 80 musical sequences were used (original items and transposition down by two semitones, otherwise all versions of an item) because of the reduced trial count. The arithmetic stimuli were presented in a similar way as the language material (500 ms per number or operator followed by 108 ms ISI).
The closure ratings of critical trials were negatively skewed with a median rating of 6 (of 7), i.e. our data displayed a ceiling effect likely because all critical sequences ended on an authentic cadence. Therefore, an analysis based on mean ratings appeared inappropriate as the mean would not be a good representation of the typical closure rating behaviour. Instead, we based our analysis on the median. Specifically, we applied a subject-specific median-split analysis on ratings. For each participant, we coded ratings as either indicating high closure (subject-specific median of critical trial ratings or higher) or not. The analyses were performed on the proportion of trials in each condition which indicated high closure. This analysis strategy amplifies small differences in ratings between different concurrent task conditions while acknowledging that most critical trials are perceived as highly complete (high closure).
2.2.1 General task performance
Before presenting the crucial results of the critical trials, we first present evidence for good task adherence. Concerning the music task, using the filler trials which ended either in an authentic cadence (requiring high closure ratings) or not (requiring low closure ratings), we found that all participants had an equal or greater proportion of high-closure ratings for the authentic cadence endings than the no-cadence endings (Mauthentic cadence=0.81>Mno cadence=0.12, difference s.d.=0.21). Therefore, all participants appear to have rated filler trial music as expected by music theory.
Language task accuracy (including filler and critical trials) was high in general (M=85%, s.d.=8%). As expected, the critical trial accuracy showed a clear difference between challenging ambiguous S-coordination (M=83%, s.d.=16%) and less challenging NP-coordination trials (M=90%, s.d.=15%; t(53)=2.28, p=0.027, pη2=0.089). Arithmetic task accuracy was similar in general (M=84%, s.d.=10%). As expected, the critical trial accuracy showed a clear difference between the hard (M=75%, s.d.=23%) and the easy arithmetic trials (M=92%, s.d.=10%; t(53)=5.25, p<0.001, pη2=0.342).
2.2.2 Critical task performance
18.104.22.168 Critical language results
The critical closure data of the musico-linguistic part of the experiment were analysed in a 2 (Difficulty: ambiguous S-coordination, NP-coordination) × 2 (Key: first key ending, second key ending) within-subjects ANOVA; see figure 3a. There was a Difficulty main effect (F1,53=11.99, p=0.001, pη2=0.184). Otherwise, first key endings were less likely (M=0.60, s.d.=0.21) to receive a high closure rating than second-key endings (M=0.77, s.d.=0.14; Key, F1,53=51.74, p<0.001, pη2=0.494). These two factors did not interact (Key × Difficulty, F1,53=1.80, p=0.185, pη2=0.033). Still, as shown in figure 3a, the simple main effects revealed a Difficulty effect only for first key endings (t53=3.28, p=0.002, pη2=0.169), not for second key endings (t53=1.66, p=0.104, pη2=0.049).
22.214.171.124 Critical arithmetic results
The critical closure data of the musico-arithmetic part of the experiment were analysed in a 2 (Difficulty: hard, easy) × 2 (Key: first key ending, second key ending) within-subjects ANOVA (figure 3b). There was no effect of arithmetical difficulty (F1,53<1). Otherwise, first key endings were less likely (M=0.59, s.d.=0.20) to receive a high closure rating than second-key endings (M=0.68, s.d.=0.18; Key, F1,53=13.69, p=0.001, pη2=0.205). These two factors did not interact (Key × Difficulty, F1,53<1).
126.96.36.199 Language syntax versus arithmetic
In order to see whether the difficulty effect in the musico-linguistic task was significantly different from the one in the musico-arithmetic task we performed a three-way within-subjects ANOVA on the harmonic closure ratings with the factors Task (language, arithmetic), Difficulty (ambiguous S-coordination/hard, NP-coordination/easy) and Key (first key ending, second key ending). Indeed, the difficulty effect was greater in the language task than in the arithmetic task (Task × Difficulty, F1,53=5.20, p=0.027, pη2=0.089). Otherwise, all three factors showed a main effect (Task, F1,53=5.52, p=0.023, pη2=0.094) (Difficulty, F1,53=8.16, p=0.006, pη2=0.133) (Key, F1,53=41.33, p<0.001, pη2=0.438). The Key and Task factors interacted (F1,53=8.83, p=0.004, pη2=0.143) while Key and Difficulty did not (F1,53<1). The three-way interaction was marginally significant (Task × Difficulty × Key, F1,53=3.47, p=0.068, pη2=0.062).
2.3 Discussion: language syntax affects harmony judgements, arithmetic does not
The results of experiment 1 show that a syntactic garden-path effect can indeed influence harmony perception by reducing the ability to integrate chords, as shown by lower closure ratings at the end of a harmonic sequence. An arithmetic control condition, without syntactic demands but with a more salient difficulty manipulation, did not affect harmony perception. This shows that general trial difficulty is not involved in the syntax effect. Instead, as predicted by Patel’s  SSIRH, a syntactic challenge in language probably taxed shared musico-linguistic syntactic integration resources which in turn can then only sub-optimally process chords. A non-specific effect of linguistic or arithmetic difficulty on music ratings, as predicted by attentional accounts [11,15], was not observed. Note that the syntax effect was not specific to either type of music ending, suggesting that harmonic processing in general was affected, rather than holding a key online or establishing a new key.
The observed main effect of key is not of interest in this study. The fact that second-key endings more often receive high closure ratings than first-key endings likely reflects the relatively fast decline of key availability if a harmonic key is not reinforced by heard chords. Whether this is also the case without a concurrent task was investigated in a post-test which is part of the electronic supplementary material. Briefly, this post-test showed that no-cadence, first-key, and authentic cadence endings were rated very similarly without a concurrent language or arithmetic task. Second-key endings, however, less often received a high closure rating without a concurrent task compared to with, leading to a smaller difference between first-key and second-key ending ratings in this post-test. Still, the numerical trends were in the same direction, i.e. there is no indication in the post-test data that second-key endings are heard as less well closed compared to first-key endings. In sum, participants who only did the music task performed this task similarly to participants who did it while engaging in a second task at the same time.
However, a series of open questions remain concerning the effect of language on music ratings. Firstly, the difficulty in the language domain consisted in the syntactic re-analysis of a conjunction (‘en’, and) held in memory. This sixth word either linked two sentences (S-coordination) or two noun-phrases (NP-coordination) depending on the ninth word in the sentence. The arithmetic control task required no such retrieval of an item from memory. Therefore, it could be hypothesized that the different effects of linguistic and arithmetic difficulty are based on a working memory mechanism: one either has to retrieve a temporally distant element or not. Re-analysing a word held in memory because of an unexpected new word could be thought to impair the ability to simultaneously hold chords in working memory. This in turn might impair the ability to use them for harmonic integration. In order to evaluate this alternative scenario we included digit span as a covariate in the analysis of the musico-linguistic task data. If working memory, rather than common syntactic integration resources, was responsible for the language effect, then the greater availability of working memory resources should lead to a reduction in the influence of language on music. An ANCOVA analysis with z-scored digit span as a measure of working memory did not support this scenario. Neither forward digit span, nor backward digit span, nor overall digit span significantly modulated any of the main effects or interactions (ps>0.1; see the electronic supplementary material). This supports a functional distinction between syntactic processing and general working memory [25–28].
A second open question relates to the specificity of the language manipulation. The linguistic contrast chosen in experiment 1 did not control for semantic or lexical differences after the eighth word of the sentence. Experiment 2 was designed to directly evaluate the influence of language syntax and language semantics while tightly controlling lexical items. We included a new syntactic contrast which matches lexical items between conditions (object- and subject-extracted relative clauses). The semantic manipulation consisted of semantic garden-path sentences (also called lexical garden-path sentences) in which a disambiguating word leads to the semantic re-interpretation of an ambiguous word held in memory. Note that semantic garden-path effects have previously been shown to be modulated by a music harmony manipulation . This renders this particular semantic manipulation an ideal testing ground for the specificity of the syntactic effect encountered in experiment 1.
Finally, an additional reason for including a second experiment here is the post hoc nature of the analysis strategy we chose in experiment 1, i.e. the decision to focus on the proportion of high-closure ratings rather than raw ratings due to an unexpected ceiling effect. As opposed to the exploratory nature of experiment 1, experiment 2 was purely confirmatory and used the same analysis strategy.
3. Experiment 2 (confirmatory): language syntax versus language semantics
Sixty-two participants were invited to take part in the experiment (11 men, 54 right handed). None had participated in experiment 1. They were all Dutch native speakers, aged 21 on average (s.d.=3.0), with 3.3 years of musical training on average (s.d.=3.9). Forty-six were self-described non-musicians, 16 amateur musicians. They were paid € 12 or undergraduate course credits for their participation and were naive as to the purpose of the experiment.
3.1.2 Design and material
We employed a 2 (Key: first key ending or second key ending) × 2 (Difficulty: object-relative clause, subject-relative clause) design for the language syntax part of the experiment and a 2 (Key: first key ending or second key ending) × 2 (Difficulty: semantic garden-path, non-garden-path) design for the semantic part of the experiment. All factors were manipulated within-subjects. As in experiment 1, auditory musical stimuli and visual linguistic stimuli were presented concurrently. The critical point in time was always the ninth position (underlined in examples 3 and 4 below), corresponding to the start of a new key in the musical material and the disambiguating word in the sentences.
188.8.131.52 Music stimuli
We used the same music stimuli as in experiment 1.
184.108.40.206 Language syntax stimuli
As can be seen in example 3, 40 critical items were constructed in two versions each: an object-extracted relative clause and a subject-extracted relative clause. Sentences were 14 words long and were identical apart from the relative clause verb (‘hielp’/‘hielpen’, ) which either agreed in number with the second noun-phrase (‘de zoon’, the son) in the object-extracted relative clause condition or the first noun phrase (‘de vrienden’, the friends) in the subject-extracted relative clause condition.
(3) Language example (syntactic manipulation)
(3a) object-extracted relative clause (OR)
Translation: The friends who the son helped to get back on their feet let him see the building.
(3b) subject-extracted relative clause (SR)
Translation: The friends who helped the son get back on his feet let him see the building.
The 40 filler items included various syntactic constructions. Comprehension prompts of critical trials targeted the object- or subject-relative clause ambiguity (e.g. 3: ‘De vrienden hielpen de zoon’. The friends helped the son.—false for object-relative clause, true for subject-relative clause). Half the prompts required a ‘matches sentence’ response, half the opposite response.
220.127.116.11 Language semantics stimuli
As can be seen in example 4, 40 critical items were constructed in two versions each: a semantic garden-path version and a non-garden-path version. Sentences were 14 words long and were identical apart from one manipulated word which occurred in positions two to seven (‘muis’/‘veldmuis’, mouse/field vole) which was either ambiguous or not. The disambiguating word (‘rondlopen’, run around) was always the ninth word of the sentence.
(4) Language example (semantic manipulation)
(4a) semantic garden-path (GP)
Translation: The programmer let his mouse run around on the table after he had fed it.
(4b) non-garden-path (non-GP)
Translation: The programmer let his field vole run around on the table after he had fed it.
Additionally, 40 filler items were included. They used various syntactic constructions and avoided semantic ambiguities. Comprehension prompts of critical trials targeted various parts of the sentence (e.g. 4: ‘De muis was gevoerd’. The mouse had been fed.—true). Half the prompts matched the sentence.
3.1.3 Pre-test 1: sentence completions show the intended misinterpretation of semantic material
Before starting the main experiment we conducted two pre-tests with two purposes. Firstly, we wanted to choose the 40 best semantic items out of an original set of 60. Items were included if they exhibited the typical semantic garden-path pattern of a semantically ambiguous word being initially misinterpreted until a disambiguating word changes the interpretation. Below we only report the analysis involving the 40 items which were used in experiment 2. Secondly, as in experiment 1, we wanted to establish the strength of the manipulations in the syntactic and the semantic parts of the experiment.
In a first pre-test (N=54) participants were asked to complete sentence beginnings which did not include a disambiguating word (words 1 to 8, e.g. ‘De programmeur liet zijn muis op de tafel…’, The programmer let his mouse…). Sentence completions revealed that the intended word interpretation was overwhelmingly adopted in the non-garden-path condition (93.6% of trials) while this hardly happened in the semantic garden-path condition (13.3%). This is exactly the pattern expected of semantic garden-path sentences (see the electronic supplementary material).
3.1.4 Pre-test 2: the strength of the syntactic and semantic manipulations
The aim of the second pre-test was to establish the strength of the difficulty manipulations of the syntactic and the semantic items. Participants (N=24) read sentences word by word while the timing of word presentation was under their control (self-paced reading). Afterwards they rated trials for difficulty. The semantic manipulation led to higher reading times from the disambiguating word onwards in the semantic garden-path condition (compared to non-garden-path condition). The difference was 44 ms (p=0.005) for the critical disambiguating word and 85 ms (p<0.001) for the post-critical word. Overall, this resulted in a greater perceived difficulty with these sentences (p=0.003). Moreover, the syntactic manipulation was equally salient as seen in reading times (reading time difference on critical ninth word: 64 ms (p=0.009), on post-critical word: 58 ms (p=0.008)) and difficulty ratings (p=0.001; see the electronic supplementary material). Thus, if we observe differences between the syntactic and the semantic manipulations these cannot be attributed to quantitatively different processing requirements of the two manipulations as they were matched in this respect.
The task was the same as in experiment 1. The type of manipulation (syntax or semantics) was blocked and the order counterbalanced. Experimental sessions did not include a working memory test. A testing session took approximately 90 min.
The analysis was the same as in experiment 1.
3.2.1 General task performance
Before presenting the crucial critical trial data, we first show evidence of good task-adherence. Concerning the music task, using the filler trials we found, as expected, that all participants had a greater proportion of high-closure ratings for the authentic cadence endings than the no-cadence endings (Mauthentic cadence=0.76>Mno cadence=0.09, difference s.d.=0.18).
Accuracy during the syntax part was high in general (M=70%, s.d.=9%). As expected, prompts after object-relative clauses were answered less accurately (M=37%, s.d.=24%) than those after subject-relative clauses (M=76%, s.d.=19%; t61=9.80, p<0.001, pη2=0.661). Accuracy during the semantics part was higher in general (M=84%, s.d.=7%). There was no significant accuracy difference between garden-path (M=86%, s.d.=14%) and non-garden-path trials (M=85%, s.d.=13%; t61<1). However, this null effect should not be taken to mean that the semantic garden-path effect was somehow absent. Instead, it reflects the fact that the comprehension prompts in the semantics part often did not target the semantic ambiguity of the sentences. As pre-test 2 showed, using the same comprehension prompts, critical semantics trials still exhibited a garden-path effect both in reading time and difficulty ratings, showing that it is not necessary to focus comprehension prompts on the garden-path manipulation in order for it to have an effect.
3.2.2 Critical task performance
18.104.22.168 Critical results of the syntax part
The critical closure data of the syntax part were analysed in a 2 (Difficulty: object-relative clause, subject-relative clause) × 2 (Key: first key ending, second key ending) within-subjects ANOVA (figure 4a). Object-relative clauses led to significantly fewer (M=0.64, s.d.=0.15) high closure ratings than subject-relative clauses (M=0.68, s.d.=0.14; Difficulty, F1,61=5.45, p=0.023, pη2=0.082). Otherwise, first key endings were less likely (M=0.59, s.d.=0.17) to receive a high closure rating than second-key endings (M=0.73, s.d.=0.15; Key, F1,61=38.53, p<0.001, pη2=0.387). These two factors did not interact (Key × Difficulty, F1,61=2.48, p=0.121, pη2=0.039). Still, as shown in figure 4a, the simple main effects revealed a Difficulty effect only for first key endings (t61=2.51, p=0.015, pη2=0.094), not for second key endings (t61<1).
22.214.171.124 Critical results of the semantics part
The critical closure data of the semantics part were analysed in a 2 (Difficulty: semantic garden-path, non-garden-path) × 2 (Key: first key ending, second key ending) within-subjects ANOVA (figure 4b). Closure ratings did not differ between semantic garden-path (M=0.63, s.d.=0.16) and non-garden-path trials (M=0.62, s.d.=0.15; F1,61<1). Otherwise, first-key endings were less likely (M=0.56, s.d.=0.16) to receive a high closure rating than second-key endings (M=0.69, s.d.=0.15; Key, F1,61=27.00, p<0.001, pη2=0.307). These two factors did not interact (Key × Difficulty, F1,61<1).
126.96.36.199 Language syntax versus language semantics
In order to see whether the influence of the syntax manipulation on music ratings was significantly different from the influence of the semantics manipulation we performed a three-way within-subjects ANOVA with the factors Manipulation (syntax, semantics), Difficulty (object-relative clause/semantic garden-path, subject-relative clause/non-garden-path) and Key (first key ending, second key ending). Indeed, the difficulty effect was different in the syntax part and the semantics part (Manipulation × Difficulty, F1,61=4.10, p=0.047, pη2=0.063). Otherwise, only one main effect was significant (Key, F1,61=44.56, p<0.001, pη2=0.422) (Manipulation, F1,61=3.85, p=0.054, pη2=0.059) (Difficulty, F1,61<1) (Manipulation × Key, F1,61<1) (Difficulty × Key, F1,61=1.39, p=0.242, pη2=0.022) (Manipulation × Difficulty × Key, F1,61<1).
3.3 Discussion: language syntax affects harmony judgements, semantics does not
In line with the exploratory first experiment, experiment 2 has shown that a syntactic challenge in language can influence harmony perception. Moreover, it was shown that a non-syntactic language manipulation, namely a semantic garden-path manipulation, has no such effect. This pattern of results is indicative of a shared syntactic resource account , not a general attention mechanism [11,15]. Interestingly, our findings suggest that semantic processing itself does not influence music harmony perception in contrast to the influence of music itself on semantic processing found by . Potential reasons for this will be discussed.
The difference between the syntactic effect and the semantic effect is remarkable. These two manipulations were rated as equally salient and led to very similar reading time changes. Nonetheless, the nature of the manipulation had different effects on the concurrent music perception. As discussed in §2.3, one could also propose a memory-based explanation for the difference between syntactic and semantic effects on music. Differences in the distance between the ambiguous word and the disambiguating ninth word could be responsible. However, when looking at these distances it becomes clear that the semantic manipulation (distance ambiguous–disambiguating word: M=4.08, s.d.=1.65, 95% CI based on bootstrapping with 50 000 sample=[3.58;4.60]) was intermediate between the two syntactic manipulations (experiment 1: always three words between ‘en’, and, and the ninth word; experiment 2: distance between ‘die’, who and disambiguating ninth word, M=4.98 words, s.d.=1.03, 95% CI=[4.65;5.28]). This makes a memory based explanation for the difference between the semantic and the syntactic effects unlikely.
Having investigated what kind of manipulations do exert an influence on music harmony perception, we will now turn towards a more detailed analysis of the syntax effect itself. So far, we were unable to distinguish an effect on either of two harmonic sub-processes: holding a key online (reflected in first-key ending ratings) and establishing a new key (reflected in second-key ending ratings). The greater statistical power afforded by combining the data of the syntactic manipulations of both experiments might illuminate this issue.
4. Experiments 1 and 2: combined analysis of the syntax effect
We combined the data of the syntax manipulations of experiment 1 (ambiguous S-coordination, NP-coordination) and experiment 2 (object-relative clause, subject-relative clause) to form a single Difficulty factor (figure 5). A 2 (Experiment: 1, 2) × 2 (Difficulty) × 2 (Key: first key ending, second key ending) mixed between- and within-subjects ANOVA exhibited the main effects of Difficulty and Key which we observed before (Difficulty, F1,114=17.63, p<0.001, pη2=0.134) (Key, F1,114=89.47, p<0.001, pη2=0.440). The factor Experiment was without effect (Experiment, F1,114<1) (Difficulty × Experiment, F1,114=1.50, p=0.224, pη2=0.013) (Key × Experiment, F1,114<1) (Difficulty × Key × Experiment, F1,114<1). However, the greater power revealed a previously non-significant interaction between Difficulty and Key (F1,114=4.21, p=0.042, pη2=0.036). Follow-up t-tests showed that the difficulty effect was specific to first-key-typical endings (t115=4.09, p<0.001, pη2=0.127) while it was non-significant for second-key-typical endings (t115=1.43, p=0.155, pη2=0.018). Note that including musical training as a covariate does not change the pattern of these results. In the overall ANCOVA z-scored musical training years did not modulate any of the main effects or interactions (ps>0.2); see the electronic supplementary material. In the electronic supplementary material, we also provide analyses by items.
4.2 Discussion: a first-key-specific syntax effect
Using the greater power afforded by the combination of the data of experiments 1 and 2, we have shown that the syntax effect is specific to harmonic sequences ending in the first key. This suggests that during music listening shared music–language resources are involved in holding a key online in order to interpret incoming chords in its context. Building up a new key seems to be processed by different resources. We will discuss the implications of these findings in the general discussion. One should bear in mind that experiment 1 was exploratory in nature, and thus the result of the combined analysis should be replicated.
The comparison also showed that the syntax manipulation in experiment 1 appears to be numerically stronger (pη2=0.169 for the difficulty effect related to first key endings) than in experiment 2 (pη2=0.094). Looking at the comprehension accuracy data reveals that this small difference might be related to participants sometimes giving up a full syntactic analysis of the very challenging object-relative clauses (accuracy=37%) but do so less in the challenging ambiguous S-coordination trials (accuracy=83%). This might indicate that syntactic integration resources can also be taxed too much, leading to incomplete parsing of the sentence . The details of such nonlinear effects should be worked out in future studies. For the present purposes we can conclude that challenging syntax processing, even when it leads to incomplete syntactic interpretations, draws resources away from the harmonic integration of chords.
5. General discussion
We present two experiments which investigated whether music harmony processing can be influenced by a concurrent language or arithmetic task. Patel’s  SSIRH hypothesizes that previously observed musical influences on the processing of language syntax are due to shared musico-linguistic resources involved in syntactic integration. Therefore, under this account music harmony should be influenced only if the concurrent task involves a syntactic manipulation. Our results support this prediction. While two different language syntax contrasts (ambiguous S-coordination versus NP-coordination and object-relative clause versus subject-relative clause) modulated harmonic closure ratings, arithmetical difficulty as well as a semantic garden-path manipulation had no such effect. This contradicts accounts relating previous music–language interactions solely to general attention mechanisms which would have predicted an influence of the arithmetic and semantic control manipulations.
The functional role of shared syntax resources in terms of harmonic processing was further investigated by looking at the syntax effect on ratings of different kinds of musical endings. This analysis revealed that a syntactic challenge in language reduces the music listener’s ability to maintain an already established key online for the purpose of harmonic integration of chords. Building up a new key was unaffected. In what follows, we will discuss these results.
5.2 Novel support for shared syntactic integration resources
The present findings offer novel support for Patel’s  SSIRH. The novelty stems from the use of a musical measure of interest, as opposed to previously used linguistic measures such as language comprehension accuracy , reading times  or word judgement times . Our findings suggest that the direction of influence is not crucial for interference effects. Music can influence language and vice versa. This is in line with studies which measured brain activity during music and language processing in similar paradigms. For example, in an electroencephalogram (EEG) event-related potential (ERP) study, Steinbeis & Koelsch  found language processing in the form of the left anterior negativity to be altered by a concurrent harmonic violation (see also ). They also found the early right anterior negativity elicited by an unexpected chord to be influenced by language syntax errors (but see ). This was taken as evidence for a change in musical processing as a result of a linguistic syntactic violation. Clearly, both offline measures as used here or by Fedorenko et al.  and online measures of processing [3–6,8] reveal a mutual influence between music and language.
However, there is also a negative deflection in the EEG signal which peaks around 650 ms after stimulus onset and which is elicited by an unexpected chord. This deflection is often called the N5 and it can be modulated by a language semantics manipulation , suggesting an effect of language semantics on music harmony in contradiction to our results in experiment 2. However, the N5 is not elicited in some experiments  or participant groups . Moreover, other EEG studies working with similar interference paradigms failed to find an interaction between music harmonics and language semantics [4,31] (see also ). Therefore, it appears difficult to interpret the apparent contradiction between Steinbeis & Koelsch’s  ERP result and our semantics null effect in experiment 2. More research is needed in order to better characterize the N5. Until then we are inclined to rely on the straightforward interpretation of our behavioural measure to indicate an absence of semantic influences on harmonic processing. Only language syntax appears to reliably influence music harmony behaviour.
5.3 The role of shared resources in the music network
In the current experiment only the first harmonic key which was held online was affected by a concurrent language syntax manipulation. This could be taken to reflect shared resources’ role as a specifically syntactic working memory in the brain’s music network . In terms of language, this aspect of working memory makes ‘syntactic information actively available over sustained periods of the sentence while new information is being processed continuously’ , p. 88. An additional role of syntactic working memory could be to hold a harmonic key online while new chords are integrated. This process might be impaired if concurrently processing complex sentences restricts the syntactic working memory resources left available for music.
Alternatively, music and language might share syntactic unification resources , p. 416/417. According to this account, lexical items are retrieved from memory into a ‘unification workspace’ in which ‘constituent structures spanning the whole utterance are formed’ by combining items according to syntactic principles, i.e. by unifying them. The more demanding unification operation associated with disambiguating words of garden-path sentences might impair the ability of shared resources to keep a harmonic key in the unification workspace. Therefore, the key is no longer a potential site with which incoming chords can be unified. This study cannot distinguish between these two accounts.
5.4 Non-syntactic overlap between music and language?
We found no evidence for a general attentional mechanism driving the linguistic influence on music, as both control conditions were without effect. Thus, the nature of the manipulation (syntactic rather than something else) appears important. However, according to Perruchet & Poulin-Charronat’s  attentional load account influences from one domain on the other only emerge if enough attentional resources are left to process both. In the present experiments, it could be argued that critical language syntax trials might be overall less demanding than control task trials, leaving more attention to music perception. However, there is no evidence for such an account here. As the difficulty ratings of the pre-tests revealed, critical language syntax trials were sometimes seen as overall easier (experiment 1) or harder (experiment 2) than their control task counterparts (ps≤0.001; see electronic supplementary material). Thus, in this study, there is no linear relationship between the effect of a task on music perception and its overall difficulty. General attention as well as shared syntactic resources might be involved in musical influences on language [5,11], while language influencing music relies solely on syntactic resources. This proposal should be tested in the future.
Next to attention based explanations, an error based explanation of previous studies’ results [3,4,6] was put forward by Rogalsky et al. . However, our study only used well-formed sentences and musical sequences without errors, as done in previous behavioural studies investigating musical influences on language [2,5]. Similarly, our results could also be said to be due to a working memory based mechanism (not to be confused with the notion of syntactic working memory [26,33]). In order to hold a key online listeners might need working memory resources which the syntactic/semantic reinterpretation of a previously encountered word taxes. However, phonological working memory was not associated with the syntax effect found in experiment 1, and working memory effects would lead to similar semantic and syntactic garden-path influences on harmonic processing. Instead, we found that only the latter affects music ratings. This leads us to favour the SSIRH put forward by Patel  as the likely mechanism behind our results, rather than attentional accounts [11,15,16], an error processing explanation , or a working memory account.
This study found evidence that music and language share resources for syntactic integration . Our results suggest that challenging language syntax impacts on the ability to integrate chords into an already established key. It appears that music and language are processed in a similar way by the human brain due to a key commonality: they are both structured sequences, i.e. their constituent elements relate in a rule-governed way to each other. Shared syntax resources are responsive to structured sequences in general—whether they are linguistic or musical in nature.
All participants provided informed, written consent and the study was approved by the local ethics committee ‘CMO Arnhem Nijmegen’ (CMO no 2001/095 and amendment ‘Imaging Human Cognition’ 2006, 2008), in accordance with the research involving human subjects Act, following the principles of the Declaration of Helsinki.
The datasets supporting this article have been uploaded as part of the electronic supplementary material. Matlab code for reproducing figures 3, 4, and 5 is also provided as part of the electronic supplementary material.
R.K. conceived and designed the experiments, acquired the data, analysed them and wrote the paper. R.M.W. conceived the experiments and revised the paper. P.H. conceived the experiments and revised the paper. All authors gave final approval for publication.
We report no competing interests.
This work was supported by an NWO Spinoza Prize awarded to P.H. and a PhD grant from the Max Planck Society to R.K.
We thank L. Robert Slevc, Aniruddh D. Patel, Sean J. Fallon and Susanne Vogel for helpful discussions on the background and analysis of the study. We are grateful to Gwilym Lockwood for providing useful feedback on an earlier version of this draft.
A.1. Experiments 1 and 2: music
Only the C-major/B-flat major version is shown. Two transpositions were made of these items. See §188.8.131.52. for details. no. item 1 2
↵1 In the Western tonal tradition, music harmony refers to the organization of pitches in terms of scales and chords (simultaneous soundings of several pitches). There are twelve pitches per octave, adjacent pitches being separated by one semitone. Pitches occupying the same ordinal position in different octaves are said to belong to the same pitch class (e.g. all the C-notes on a piano). Each scale (e.g. C-major scale) consists of seven pitch classes (in-key notes, e.g. C, D, E, F, G, A, B). When playing in a musical key (e.g. C-major key) these seven pitch classes are emphasized and the five remaining pitch classes (out-of-key notes, e.g. C#, D#, F#, G#, A#) are relatively unexpected; the same applies to the chords based on these tones.
↵2 Including these three participants does not alter the findings of the study. Notably, the crucial Difficulty main effect in the musico-linguistic part of the experiment was still significant (F1,56=12.59, p=0.001, pη2=0.184), as was the Task × Difficulty interaction when including Task as a factor (F1,56=5.50, p=0.023, pη2=0.089).
- Received December 11, 2015.
- Accepted January 6, 2016.
© 2016 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.