Pitch is a perceptual property of sounds that allows their ordering on a frequency-related scale, or more commonly, pitch is the quality that makes it possible to judge sounds as "higher" and "lower" in the sense associated with musical melodies. Pitch is a major auditory attribute of musical tones, along with duration, loudness, and timbre.
Pitch may be quantified as a frequency, but pitch is not a purely objective physical property; it is a subjective psychoacoustical attribute of sound. Historically, the study of pitch and pitch perception has been a central problem in psychoacoustics, and has been instrumental in forming and testing theories of sound representation, processing, and perception in the auditory system.
Pitch is an auditory sensation in which a listener assigns musical tones to relative positions on a musical scale based primarily on their perception of the frequency of vibration. Pitch is closely related to frequency, but the two are not equivalent. Frequency is an objective, scientific attribute that can be measured. Pitch is each person's subjective perception of a sound wave, which cannot be directly measured. Even so, most people will agree on which notes are higher and lower.
The oscillations of sound waves can often be characterized in terms of frequency. Pitches are usually associated with, and thus quantified as, frequencies (in cycles per second, or hertz), by comparing the sounds being assessed against sounds with pure tones (ones with periodic, sinusoidal waveforms). Complex and aperiodic sound waves can often be assigned a pitch by this method.
According to the American National Standards Institute, pitch is the auditory attribute of sound according to which sounds can be ordered on a scale from low to high. Since pitch is such a close proxy for frequency, it is almost entirely determined by how quickly the sound wave is making the air vibrate and has almost nothing to do with the intensity, or amplitude, of the wave. That is, "high" pitch means very rapid oscillation, and "low" pitch corresponds to slower oscillation. Despite that, the idiom relating vertical height to sound pitch is shared by most languages. At least in English, it is just one of many deep conceptual metaphors that involve up/down. The exact etymological history of the musical sense of high and low pitch is still unclear. There is evidence that humans do actually perceive that the source of a sound is slightly higher or lower in vertical space when the sound frequency is increased or reduced.
In most cases, the pitch of complex sounds such as speech and musical notes corresponds very nearly to the repetition rate of periodic or nearly-periodic sounds, or to the reciprocal of the time interval between repeating similar events in the sound waveform.
The pitch of complex tones can be ambiguous, meaning that two or more different pitches can be perceived, depending upon the observer. When the actual fundamental frequency can be precisely determined through physical measurement, it may differ from the perceived pitch because of overtones, also known as upper partials, harmonic or otherwise. A complex tone composed of two sine waves of 1000 and 1200 Hz may sometimes be heard as up to three pitches: two spectral pitches at 1000 and 1200 Hz, derived from the physical frequencies of the pure tones, and the combination tone at 200 Hz, corresponding to the repetition rate of the waveform. In a situation like this, the percept at 200 Hz is commonly referred to as the missing fundamental, which is often the greatest common divisor of the frequencies present.
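The arithmetic behind the 200 Hz example can be checked with a short sketch. It assumes the partials are given as whole numbers of hertz, and the function name `implied_fundamental` is hypothetical. The text says the missing fundamental is only "often" the greatest common divisor, so the result is a candidate pitch, not a prediction of what a listener hears.

```python
from functools import reduce
from math import gcd


def implied_fundamental(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz."""
    return reduce(gcd, partials_hz)


# The two-tone complex from the text: sine waves at 1000 Hz and 1200 Hz.
two_tone = implied_fundamental([1000, 1200])

# A hypothetical stack of three harmonics that share the same fundamental.
harmonic_stack = implied_fundamental([600, 800, 1000])

print(two_tone, harmonic_stack)
```

Both calls give 200, matching the repetition rate described above.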
Pitch depends to a lesser degree on the sound pressure level (loudness, volume) of the tone, especially at frequencies below 1,000 Hz and above 2,000 Hz. The pitch of lower tones gets lower as sound pressure increases. For instance, a tone of 200 Hz that is very loud seems one semitone lower in pitch than if it is just barely audible. Above 2,000 Hz, the pitch gets higher as the sound gets louder. These results were obtained in the pioneering works by S. Stevens and W. Snow. Later investigations, for example by A. Cohen, showed that in most cases the apparent pitch shifts were not significantly different from pitch-matching errors. When averaged, the remaining shifts followed the directions of Stevens' curves but were small (2% or less by frequency, that is, not more than a semitone).
Theories of pitch perception try to explain how the physical sound and specific physiology of the auditory system work together to yield the experience of pitch. In general, pitch perception theories can be divided into place coding and temporal coding. Place theory holds that the perception of pitch is determined by the place of maximum excitation on the basilar membrane.
A place code, taking advantage of the tonotopy in the auditory system, must be in effect for the perception of high frequencies, since neurons have an upper limit on how fast they can phase-lock their action potentials. However, a purely place-based theory cannot account for the accuracy of pitch perception in the low and middle frequency ranges. Moreover, there is some evidence that some non-human primates lack auditory cortex responses to pitch despite having clear tonotopic maps in auditory cortex, showing that tonotopic place codes are not sufficient for pitch responses.
Temporal theories offer an alternative that appeals to the temporal structure of action potentials, mostly the phase-locking and mode-locking of action potentials to frequencies in a stimulus. The precise way this temporal structure helps code for pitch at higher levels is still debated, but the processing seems to be based on an autocorrelation of action potentials in the auditory nerve. However, it has long been noted that a neural mechanism that may accomplish a delay (a necessary operation of a true autocorrelation) has not been found. At least one model shows that a temporal delay is unnecessary to produce an autocorrelation model of pitch perception, appealing to phase shifts between cochlear filters; however, earlier work has shown that certain sounds with a prominent peak in their autocorrelation function do not elicit a corresponding pitch percept, and that certain sounds without a peak in their autocorrelation function nevertheless elicit a pitch. To be a more complete model, autocorrelation must therefore apply to signals that represent the output of the cochlea, as via auditory-nerve interspike-interval histograms. Some theories of pitch perception hold that pitch has inherent octave ambiguities, and therefore is best decomposed into a pitch chroma, a periodic value around the octave like the note names in Western music, and a pitch height, which may be ambiguous and indicates the octave the pitch is in.
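As a purely signal-level illustration of the autocorrelation idea, the sketch below finds the lag at which a waveform best matches a delayed copy of itself. It is not a model of auditory-nerve firing, which the text notes a fuller account would need. The sample rate, the search range, and the function name `autocorr_pitch` are assumptions chosen for the example.

```python
import math


def autocorr_pitch(samples, sample_rate, fmin, fmax):
    """Return the frequency (Hz) of the lag with the largest autocorrelation.

    Only lags whose frequencies lie between fmin and fmax are searched.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    window = len(samples) - lag_max  # same number of products for every lag
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(samples[i] * samples[i + lag] for i in range(window))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag


# Assumed test signal: two sine waves at 1000 Hz and 1200 Hz, 0.1 s long.
sample_rate = 24000
signal = [
    math.sin(2 * math.pi * 1000 * n / sample_rate)
    + math.sin(2 * math.pi * 1200 * n / sample_rate)
    for n in range(2400)
]

estimate = autocorr_pitch(signal, sample_rate, fmin=150, fmax=500)
print(estimate)
```

With these settings the strongest peak falls at a lag of 120 samples, the waveform's repetition period, even though the signal has no energy at 200 Hz. The search range is kept narrow on purpose: every multiple of the period scores equally well, which is one simple way to see the octave ambiguity mentioned above.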
The just-noticeable difference (jnd), the threshold at which a change is perceived, depends on the tone's frequency content. Below 500 Hz, the jnd is about 3 Hz for sine waves, and 1 Hz for complex tones; above 1000 Hz, the jnd for sine waves is about 0.6% (about 10 cents). The jnd is typically tested by playing two tones in quick succession and asking the listener whether there was a difference in their pitches. The jnd becomes smaller if the two tones are played simultaneously, as the listener is then able to discern beat frequencies. The total number of perceptible pitch steps in the range of human hearing is about 1,400; the total number of notes in the equal-tempered scale, from 16 to 16,000 Hz, is 120.
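Two of the figures above follow from the logarithmic definition of the cent (1,200 cents per octave) and can be checked directly. The helper name `ratio_to_cents` is hypothetical.

```python
import math


def ratio_to_cents(ratio):
    """Convert a frequency ratio to cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


# A 0.6% frequency change, the jnd quoted for sine waves above 1000 Hz.
jnd_cents = ratio_to_cents(1.006)

# Number of equal-tempered semitone steps between 16 Hz and 16,000 Hz.
semitones_16_to_16000 = 12 * math.log2(16000 / 16)

print(round(jnd_cents, 1), round(semitones_16_to_16000))
```

The first value comes out slightly above 10 cents, and the second rounds to 120, because a 1000:1 frequency range spans just under ten octaves.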
The relative perception of pitch can be fooled, resulting in aural illusions. There are several of these, such as the tritone paradox, but most notably the Shepard scale, where a continuous or discrete sequence of specially formed tones can be made to sound as if the sequence continues ascending or descending forever.
Not all musical instruments make notes with a clear pitch. Unpitched percussion instruments (a class of percussion instruments) do not produce particular pitches. A sound or note of definite pitch is one where a listener can possibly (or relatively easily) discern the pitch. Sounds with definite pitch have harmonic frequency spectra or close to harmonic spectra.
A sound generated on any instrument produces many modes of vibration that occur simultaneously. A listener hears numerous frequencies at once. The vibration with the lowest frequency is called the fundamental frequency; the other frequencies are overtones. Harmonics are an important class of overtones with frequencies that are integer multiples of the fundamental. Whether or not the higher frequencies are integer multiples, they are collectively called the partials, referring to the different parts that make up the total spectrum.
A sound or note of indefinite pitch is one that a listener finds impossible or relatively difficult to identify as to pitch. Sounds with indefinite pitch do not have harmonic spectra or have altered harmonic spectra, a characteristic known as inharmonicity.
It is still possible for two sounds of indefinite pitch to clearly be higher or lower than one another. For instance, a snare drum sounds higher pitched than a bass drum though both have indefinite pitch, because its sound contains higher frequencies. In other words, it is possible and often easy to roughly discern the relative pitches of two sounds of indefinite pitch, but sounds of indefinite pitch do not neatly correspond to any specific pitch.
A pitch standard (also concert pitch) is the conventional pitch reference a group of musical instruments are tuned to for a performance. Concert pitch may vary from ensemble to ensemble, and has varied widely over musical history.
Standard pitch is a more widely accepted convention. The A above middle C is usually set at 440 Hz (often written as "A = 440 Hz" or sometimes "A440"), although other frequencies, such as 442 Hz, are also often used as variants. Another standard pitch, the so-called Baroque pitch, has been set in the 20th century as A = 415 Hz, approximately an equal-tempered semitone lower than A440, to facilitate transposition. The Classical pitch can be set to either 427 Hz (about halfway between A415 and A440) or 430 Hz (also between A415 and A440 but slightly sharper than the quarter tone). Ensembles specializing in authentic performance set the A above middle C to 432 Hz or 435 Hz when performing repertoire from the Romantic era.
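The relationships among these reference pitches are easier to compare in cents than in hertz. The sketch below assumes A440 as the reference and uses a hypothetical helper, `cents_from_a440`; the frequencies are the ones listed above.

```python
import math


def cents_from_a440(freq_hz):
    """Offset of a reference pitch from A = 440 Hz, in cents."""
    return 1200 * math.log2(freq_hz / 440.0)


# Pitch standards named in the text, mapped to their offset from A440.
offsets = {f: round(cents_from_a440(f), 1) for f in (415, 427, 430, 432, 435, 442)}

# Exactly one equal-tempered semitone below A440, for comparison with A415.
semitone_below_a440 = 440.0 / 2 ** (1 / 12)

for freq, cents in offsets.items():
    print(f"A = {freq} Hz: {cents:+.1f} cents")
print(f"Semitone below A440: {semitone_below_a440:.2f} Hz")
```

A415 lands about 101 cents below A440, within a fraction of a hertz of an exact equal-tempered semitone, and A427 sits close to the 50-cent quarter-tone point.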
Transposing instruments have their origin in the variety of pitch standards. In modern times, they conventionally have their parts transposed into different keys from voices and other instruments (and even from each other). As a result, musicians need a way to refer to a particular pitch in an unambiguous manner when talking to each other.
For example, the most common type of clarinet or trumpet, when playing a note written in their part as C, sounds a pitch that is called B♭ on a non-transposing instrument like a violin (which indicates that at one time these wind instruments played at a standard pitch a tone lower than violin pitch). To refer to that pitch unambiguously, a musician calls it concert B♭, meaning, "...the pitch that someone playing a non-transposing instrument like a violin calls B♭."
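Pitches are labeled using letters, as in Helmholtz pitch notation; a combination of letters and numbers, as in scientific pitch notation; or numbers that represent the frequency in hertz (Hz).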
For example, one might refer to the A above middle C as a′, A4, or 440 Hz. In standard Western equal temperament, the notion of pitch is insensitive to "spelling": the description "G4 double sharp" refers to the same pitch as A4; in other temperaments, these may be distinct pitches. Human perception of musical intervals is approximately logarithmic with respect to fundamental frequency: the perceived interval between the pitches A220 and A440 is the same as the perceived interval between the pitches A440 and A880. Motivated by this logarithmic perception, music theorists sometimes represent pitches using a numerical scale based on the logarithm of fundamental frequency. For example, one can adopt the widely used MIDI standard to map fundamental frequency, f, to a real number, p, as follows:
$$p = 69 + 12 \log_2\left(\frac{f}{440\ \mathrm{Hz}}\right)$$
This creates a linear pitch space in which octaves have size 12, semitones (the distance between adjacent keys on the piano keyboard) have size 1, and A440 is assigned the number 69. (See Frequencies of notes.) Distance in this space corresponds to musical intervals as understood by musicians. An equal-tempered semitone is subdivided into 100 cents. The system is flexible enough to include "microtones" not found on standard piano keyboards. For example, the pitch halfway between C (60) and C♯ (61) can be labeled 60.5.
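A minimal sketch of this mapping and its inverse, assuming the usual form p = 69 + 12 log2(f / 440 Hz), which is consistent with the properties just listed (octaves of size 12, A440 at 69). The function names are hypothetical.

```python
import math


def freq_to_pitch(freq_hz):
    """Map a fundamental frequency in Hz to a real-valued MIDI-style pitch number."""
    return 69 + 12 * math.log2(freq_hz / 440.0)


def pitch_to_freq(pitch):
    """Inverse mapping: pitch number back to frequency in Hz."""
    return 440.0 * 2 ** ((pitch - 69) / 12)


a4 = freq_to_pitch(440.0)           # A440
a5 = freq_to_pitch(880.0)           # one octave higher
middle_c_hz = pitch_to_freq(60)     # C (60)
quarter_tone_hz = pitch_to_freq(60.5)  # halfway between C (60) and C-sharp (61)

print(a4, a5, round(middle_c_hz, 2), round(quarter_tone_hz, 2))
```

Because the scale is linear in pitch, the distance between two pitch numbers, multiplied by 100, gives the interval in cents.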
The following table shows frequencies in Hertz for notes in various octaves, named according to the "German method" of octave nomenclature:
The relative pitches of individual notes in a scale may be determined by one of a number of tuning systems. In the West, the twelve-note chromatic scale is the most common method of organization, with equal temperament now the most widely used method of tuning that scale. In it, the pitch ratio between any two successive notes of the scale is exactly the twelfth root of two (or about 1.05946). In well-tempered systems (as used in the time of Johann Sebastian Bach, for example), different methods of musical tuning were used.
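The equal-temperament ratio can be verified numerically. The sketch below assumes A440 as the starting note, which is an arbitrary choice for illustration, and builds one chromatic octave by repeated multiplication.

```python
# Ratio between successive equal-tempered notes: the twelfth root of two.
semitone_ratio = 2 ** (1 / 12)

# One chromatic octave starting on A440 (13 notes, the last an octave up).
chromatic_octave = [440.0 * semitone_ratio ** step for step in range(13)]

print(round(semitone_ratio, 5))
print([round(f, 2) for f in chromatic_octave])
```

Twelve successive steps multiply the frequency by two, so the last entry is the octave, 880 Hz, up to floating-point rounding.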
In almost all of these systems, the interval of the octave doubles the frequency of a note; for example, an octave above A440 is 880 Hz. If, however, the first overtone is sharp due to inharmonicity, as in the extremes of the piano, tuners resort to octave stretching.
In atonal, twelve-tone, or musical set theory, a "pitch" is a specific frequency, while a pitch class is all the octaves of a frequency. In many analytic discussions of atonal and post-tonal music, pitches are named with integers because of octave and enharmonic equivalency (for example, in a serial system, C♯ and D♭ are considered the same pitch, while C4 and C5 are functionally the same, one octave apart).
Discrete pitches, rather than continuously variable pitches, are virtually universal, with exceptions including "tumbling strains" and "indeterminate-pitch chants". Gliding pitches are used in most cultures, but are related to the discrete pitches they reference or embellish.