[Smt-talk] Classical Form and Recursion

Dmitri Tymoczko dmitri at Princeton.EDU
Sun Apr 5 14:39:54 PDT 2009

On Apr 5, 2009, at 1:16 PM, Panayotis Mavromatis wrote:

> First of all, it will be fair to say that I recognize a lot of  
> common methodological ground between you and me.  Therefore, I  
> offer my comments in the spirit of clarification, not radical  
> disagreement.  However, as they say, the devil is in the details,  
> on which I will gently but firmly insist.

Agreed!  Though I suspect that underneath our similar methodological  
interests, there lie dissimilar musical intuitions.  But it's fun  
debating with you, because we do have so much common ground.  For  
example, we both sit around in our spare time making Markov models.

>> Typical spoken English contains syntactic units that are about 13  
>> words long, as compared to written English, in which the syntactic  
>> units are 22 words long.  This contrasts with the length of  
>> classical movements, which can be 20 minutes long, and can contain  
>> hundreds of measures and tens of thousands of notes.  Furthermore,  
>> the accuracy of linguistic perception is significantly higher than  
>> that of musical perception -- any way you slice it, there is an  
>> enormous amount of information loss in musical perception, whereas  
>> linguistic perception is remarkably accurate.  The differences  
>> here are dramatic and not at all subtle -- we're talking orders of  
>> magnitude, rather than factors of 2.
> To your above estimates, I respond by simply quoting my earlier  
> comment: "It is generally agreed that this capacity depends on the  
> specific type of mental coding involved, and cannot be simply  
> defined in terms of the symbolic content of the stimulus at the  
> surface level."  In other words, it may not be good enough to  
> simply count measures and notes.  And the way we count could affect  
> the answer even by an order of magnitude.  Memory is a  
> reconstructive process, in which a chunk that has been learned  
> schematically, and counts as one unit in terms of information  
> processing load, can be unpacked to represent a large number of  
> surface events, such as notes.  The question is: What are the  
> chunks that are involved in music processing specifically?

You're right -- what you say is conceivable.  I'm making a
burden-of-proof argument and a scientific plausibility argument.  You,
ex-string theorist that you are, are interested in the Planck-scale
issues of principle.  I'm more concerned with what we might find at
the local particle accelerator.

My claim is: given the apparently vast differences between the time  
scales involved in Schenkerian recursion and linguistic recursion,  
there would seem to be a problem.  If a chord is at all analogous to  
a word, which it seems it might be, then there's a mismatch of  
several orders of magnitude.  So the burden of proof would seem to  
fall upon those who claim that movements are highly embedded  
recursive structures, analogous to sentences.

Furthermore, let's remember that there's an analytico-musical  
practice here: so if you want, let's pull out your favorite Schenker  
graph of some complex piece, and see how many levels of hierarchical  
embedding it has.  How much information does the listener have to  
store in memory at one time, *according to the graph*?   After all,  
there does exist an actual body of claims about recursive depth,  
temporal spans of recursive structures, etc., and these claims can be  
compared to what we know from linguistics.

Anything is, of course, possible.  It may be that the apparent  
challenges can be met.  I'm just pointing to the fact that they're  
there, and that we really haven't tried to address them as a community.

> Also, accuracy is not a direct measure of processing capacity,  
> since the former also involves the strength of the relevant
> long-term memory structures evoked in parsing the stimulus.  It is
> indeed likely that we have better and quicker access to structures  
> that enable us to parse a sentence than we do for those structures  
> that help us parse musical structure.  So accuracy does not simply  
> boil down to how big a hierarchical tree we can fit in working  
> memory in each case.

I meant to suggest that accuracy is an *additional* problem over and  
above the problem of processing capacity.  When we listen to words,  
we're pretty accurate.  We get most of them.  And this helps us  
construe the syntactical structure of the sentences we hear.

As anyone who's ever taught ear training knows, we're nowhere near
as accurate when we listen to music.  We miss modulations, we think  
that a V chord is something else, etc., etc.  Yet these details are  
presumably important for getting prolongational structure right.  So  
the defender of recursion has to explain how our brain constructs  
these enormously ramified recursive analyses from a stimulus that is  
being perceived very inaccurately.  It's as if we were listening to  
someone mumbling a complex 20-minute sentence in the midst of an  
intense thunderstorm, so that we miss every seventh word or something.

As you say, though, anything is possible!


Dmitri Tymoczko
Associate Professor of Music
310 Woolworth Center
Princeton, NJ 08544-1007
(609) 258-4255 (ph), (609) 258-6793 (fax)
