[Smt-talk] Classical Form and Recursion

Fred Lerdahl awl1 at columbia.edu
Wed Apr 15 21:48:22 PDT 2009

Dmitri writes:

> My own view is that the uncritical exporting of the 
> performance/competence distinction from linguistics to music is highly 
> problematic -- in fact, I'd say it's the very root of our 
> disagreement.  My view is that in the linguistic case, most people 
> approach idealized "competence" at least for relatively uncomplicated 
> sentences.  The idealization, in other words, is small.
> In TPS, I think the idealizations involved are very large -- actual 
> listeners are not close to having the abilities attributed to them by 
> TPS, for instance because they can't hear the return to the tonic 
> after about a minute and a half.  (Among many other issues: they also 
> can't reliably hear the V chord.)  Furthermore, and more 
> interestingly, the very definition of "competence" involves delicate 
> aesthetic questions that do not arise in the linguistic case.  Is the 
> listener "competent" who loves Beethoven but does poorly on ear 
> training exams?  What about the one who hates music but passes every 
> test?  Does the ideal listener have perfect pitch?  Etc.
> 	The interesting observation here is that semantics (non-obviously) 
> helps to underwrite the definition of "linguistic competence"; 
> operationally, our only access to syntactic competence runs through 
> semantics.  Absent semantics, the notion of "competence" becomes 
> thorny.
> In fact, I've been giving a talk entitled "Why linguistics is a bad 
> model for music theory" in which the basic points I make are:
> 	1. The problem of information loss in music perception is large, and 
> should not be idealized away.
> 	2. The notion of a "performance/competence distinction" in music 
> theory is highly problematic, and should not be assumed automatically.
> 	3. Without semantics, it's very hard to set a non-controversial lower 
> threshold on "competence."

The competence-performance issue is not important to me, or generally 
to practitioners of cognitive science in language or music, except as a 
marker of the study of a mental capacity according to certain simplifying 
idealizations. The value of the results reveals the usefulness of the 
idealizations. Dmitri questions the idealizations but offers nothing in 
return. Meanwhile progress is being made using them.

I am currently co-teaching a course on music and language with the 
auditory perceptual psychologist Robert Remez. Our students (from 
music, linguistics, psychology, computer science, neuroscience, etc.) 
report on all kinds of research about music and language. No one is 
worried about the competence-performance issue. One talks instead about 
this or that theory or experiment concerning some aspect of music 
and/or language. There is a lot of research on music and language (a 
good reference is Patel's book, "Music, Language, and the Brain," 
Oxford, 2008). Rather than make a broad assertion that "linguistics is 
a bad model for music theory," I suggest direct engagement with some of 
these theories and experiments.

> 1. TPS assumes that listeners can reliably tell whether something is a 
> V or I chord, because this information is an input to its various 
> algorithms.
> 	2. We all agree that ordinary listeners cannot reliably name whether 
> something is a V or I chord.
> Now there's a friction between #1 and #2, one that the TPS-defender 
> has to explain away.  You don't make this vanish by changing the topic 
> to "tension" -- at least, not if the tension-calculation requires 
> knowledge of whether something is a I or a V.  (In other words, 
> talking about tension hides, but does not resolve, the contradiction.) 
>  As far as I can see, your only option is to say that the difficulties 
> labeling I and V are entirely the result of trying to retrieve the 
> relevant information from the unconscious -- it's there, but we don't 
> realize it.  Thus, on your view, when we learn to apply Roman 
> numerals, we're not learning to perceive more accurately -- instead, 
> we're learning to *label* perceptions that we already have.  (This is 
> reminiscent of Meno's Socrates.)
> I, for one, do not find this to be compelling -- it seems more likely 
> that when we learn Roman numerals, we're learning to perceive more 
> accurately.  But this then weakens my credence in those parts of TPS 
> which depend on your implicit-knowledge hypothesis.

Dmitri seems to have trouble with two points that are foundational in 
cognitive science: the distinction between explicit and implicit 
knowledge, and the amazing complexity of implicit knowledge. Both 
points are beyond dispute within the field. It would be better to move 
on to issues that are indeed under investigation--for instance, the 
structures and principles involved in a capacity's mental 
representation (my interest), the role of statistical learning (see 
Huron, "Sweet Anticipation," Oxford, 2006), the different roles of 
representation and processing and their neural instantiation (Patel), 
or neuropsychological evidence of what music and language do and do not 
share (Peretz & Coltheart, 2003).

As to a I or V chord, my unexceptional view is that while naming things 
can improve performance, the main mental action is unconscious. To take 
a well-studied area, the computations behind the construction of a 
visual field are incredibly complex and largely opaque to 
consciousness. The same is true of auditory scene analysis (see 
Bregman's book of that name) and, still more so, of music perception.

Unconscious music processing does not include the label "V chord." 
Rather, the experience is of a certain sensory quality combined with a 
state of tension and expectation. The machinery of my theory attempts 
to account quantitatively for the shifting states of tension and 
expectation. The quantitative aspect is an advance over GTTM because 
its predictions are more precise, hence more testable and falsifiable. 
The empirical paper with Krumhansl shows reasonable success in the 
machinery's predictions; but the results are provisional, and no doubt 
the theory can be improved. In any case, there is no "friction" between 
Dmitri's two points above.

Dmitri quotes Art Samplaski and replies:

>> I agree with Fred's underlying principle, that "average
>> listeners" (here taken to mean, are untrained but have
>> listened to a fair deal of Western music) can distinguish
>> "dominant" because of tonal tension.
> On reflection, I'm not sure that Fred believes this -- strange as it 
> seems, he may believe something closer to the opposite.
> What I'm thinking is that according to TPS perceived tension is the 
> *result* of certain calculations that are performed subconsciously -- 
> calculations that in effect use the knowledge that a particular 
> configuration of notes is a V chord.  So, tension is produced by the 
> (unconscious) knowledge that something is a V chord, rather than the 
> other way around.  It's not that we hear something tense and think -- 
> "oh, an unstable sonority, must be a V7."  Or we may do this 
> consciously, but the unconscious already knows that we've heard a V7, 
> because it's got access (essentially) to the score.

I would rewrite the last phrase to say access not to the score but to 
the sound signal after it is processed into pitches and rhythms. 
Otherwise Dmitri interprets me correctly. As in other areas of 
unconscious behavior, listeners perform computations on the musical 
input. From these computations arise intuitive understanding, feelings 
of tension and expectation, and affective response. This means, 
according to TPS, that, among other things, the listener mentally 
registers ongoing configurations of TPS's "basic space," which itself 
reflects the underlying psychoacoustics. At the end of chapter 2, I 
note that any such configuration can be stated in a more or less 
neurally plausible hierarchical binary notation. Another possibility is 
that the tonotopic mapping of pitch on the basilar membrane and in the 
auditory cortex (for a review, see Weinberger's chapter in the 2nd 
edition of Deutsch's "Psychology of Music") extends to more complex 
spatial mappings in the brain. In TPS the numerical format is primary, 
but no one knows yet how mental structures such as these are neurally 

It is of course possible that such structures are figments of the 
theoretical imagination. But then one must come up with another way to 
explain the perceptual distinctions and experiences that musicians 
sense and that ordinary listeners show in experiments.

I could go on with responses to the indefatigable Dmitri, but life is 
not infinite. He writes:

> Let me say that I think all of these questions are very, very deep, 
> and very interesting....I don't by any means think any of these 
> questions are conclusively settled.  I'd love to see more thinking and 
> discussion about all of this -- in Spectrum, at SMT, on this list, or 
> whatever.

Jean-Jacques Nattiez once said to me that Americans are good at 
building technologies but poor at reflecting on foundational issues. 
Dmitri's challenges, as well as those of others to whom I have not 
responded, are a healthy rejoinder to Nattiez. I agree that the issues 
raised in these exchanges are deep, and I thank Dmitri and others for 
raising them.

Fred Lerdahl
Columbia University
awl1 at columbia.edu
