While the spelling of German is officially standardised by an international organisation (the Council for German Orthography), the pronunciation has no official standard and relies on a de facto standard documented in reference works such as Deutsches Aussprachewörterbuch (German Pronunciation Dictionary) by Eva-Maria Krech et al., Duden 6 Das Aussprachewörterbuch (Duden volume 6, The Pronunciation Dictionary) by Max Mangold, and the training materials of radio and television stations such as Westdeutscher Rundfunk, Deutschlandfunk, or Schweizer Radio und Fernsehen. This standardised pronunciation was invented, rather than coming from any particular German-speaking city, but the city that Germans most consider to be closest to the standard is Hanover. Standard German is sometimes referred to as Bühnendeutsch (stage German), but the latter has its own definition and is slightly different.
Some scholars treat /ə/ as an unstressed allophone of /ɛ/. Likewise, some scholars treat /ɐ/ as an allophone of the unstressed sequence /ər/. The phonemic status of /ɛ:/ is also debated - see below.
In non-standard accents of the Low German speaking area, as well as in some Austrian accents, /o:/ may be pronounced as a narrow closing diphthong [oʊ].
/ə/ has been variously described as mid central unrounded [ə] and close-mid central unrounded [ɘ]. It occurs only in unstressed syllables, for instance in besetzen [bə'zɛt͡sən] ('occupy'). It is often considered to be in complementary distribution with [ɛ], which cannot occur in unstressed syllables. If a sonorant follows in the syllable coda, the schwa often disappears so that the sonorant becomes syllabic, for instance Kissen ['kɪsn̩] ('pillow'), Esel ['ʔe:zl̩] ('donkey').
/ɛ/ has been variously described as mid near-front unrounded [ɛ̽] and open-mid front unrounded [ɛ].
/ɛ:/ has been variously described as mid front unrounded [ɛ̝:] and open-mid front unrounded [ɛ:].
/œ/ has been variously described as open-mid near-front rounded [œ̠] and somewhat lowered open-mid near-front rounded [œ̠˕].
/ɔ/ has been variously described as somewhat fronted open-mid back rounded [ɔ̟] and open-mid back rounded [ɔ].
/ɐ/ is near-open central unrounded [ɐ]. It is a common allophone of the sequence /ər/, found in all German-speaking areas but Switzerland.
/a/ has been variously described as open front unrounded [a] and open central unrounded [ä]. Some scholars differentiate two short /a/, namely front /a/ and back /ɑ/. The latter occurs only in unstressed open syllables, exactly as /i, y, u, e, ø, o/.
Standard Austrian pronunciation of this vowel is back [ɑ].
Front [a] or even [æ] is a common realization of /a/ in northern German varieties influenced by Low German.
/a:/ has been variously described as open central unrounded [ä:] and open back unrounded [ɑ:]. Because of this, it is sometimes transcribed /ɑ:/.
Back [ɑ:] is the Standard Austrian pronunciation. It is also a common realization of /a:/ in northern German varieties influenced by Low German (in which it may even be rounded [ɒ:]).
Wiese (1996) notes that "there is a tendency to neutralize the distinction between [a(:)], [aɐ̯], and [ɐ]. That is, Oda, Radar, and Oder have final syllables which are perceptually very similar, and are nearly or completely identical in some dialects." He also says that "outside of a word context, [ɐ] cannot be distinguished from [a]." (As early as 1847, Verdi's librettist found it natural, when adapting a play by Schiller into the Italian language, to render the distinctly German name Roller as Rolla.)
Although there is also a length contrast, vowels are often analyzed according to a tenseness contrast, with long /i:, y:, u:, e:, ø:, o:/ being the tense vowels and short /ɪ, ʏ, ʊ, ɛ, œ, ɔ/ their lax counterparts. Like the English checked vowels, the German lax vowels require a following consonant, with the notable exception of [ɛ:] (which is absent in many varieties, as discussed below). /a/ is sometimes considered the lax counterpart of tense /a:/ in order to maintain this tense/lax division. Short /i, y, u, e, ø, o/ occur in unstressed syllables of loanwords, for instance in Psychometrie /psyçome'tʁi:/ ('psychometry'). They are usually considered allophones of tense vowels, which cannot occur in unstressed syllables (unless in compounds).
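The tense/lax pairing described above can be written down as a simple lookup table. The following is only an illustrative sketch: the names `TENSE_TO_LAX`, `LAX_TO_TENSE` and `counterpart` are hypothetical, the colon stands in for the IPA length mark, and /a/ versus /a:/ is left out because the text treats that pairing as optional.

```python
# Illustrative sketch: tense (long) vowels paired with their lax (short)
# counterparts, in the order given in the text. Names are hypothetical.
TENSE_TO_LAX = {
    "i:": "ɪ",
    "y:": "ʏ",
    "u:": "ʊ",
    "e:": "ɛ",
    "ø:": "œ",
    "o:": "ɔ",
}

# Reverse mapping, built from the table above.
LAX_TO_TENSE = {lax: tense for tense, lax in TENSE_TO_LAX.items()}


def counterpart(vowel):
    """Return the lax partner of a tense vowel or vice versa, else None."""
    if vowel in TENSE_TO_LAX:
        return TENSE_TO_LAX[vowel]
    if vowel in LAX_TO_TENSE:
        return LAX_TO_TENSE[vowel]
    return None  # e.g. schwa, which takes no part in the contrast
```

Calling `counterpart("e:")` gives the lax vowel of the pair, and `counterpart("ʊ")` gives the tense one.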
Northern German varieties influenced by Low German could be analyzed as lacking contrasting vowel quantity entirely:
/a:/ has a different quality than /a/ (see above).
These varieties also consistently lack /ɛ:/, and use only /e:/ in its place.
The following usually are not counted among the German diphthongs as German speakers often feel they are distinct marks of "foreign words" (Fremdwörter). These appear only in loanwords:
[o̯a], as in Croissant [kʁo̯a'sɑ̃], colloquially: [kʁo̯a'saŋ].
Many German speakers use [ɛɪ̯] and [ɔʊ̯] as adaptations of the English diphthongs /eɪ/ and /oʊ/ in English loanwords, according to Wiese (1996), or they replace them with the native German long vowels /e:/ and /o:/. Thus, the word okay may be pronounced [ɔʊ̯'kɛɪ̯] or /o:'ke:/. However, Mangold (2005) and Krech et al. (2009) do not recognize these diphthongs as phonemes, and prescribe pronunciations with the long vowels /e:/ and /o:/ instead.
In the varieties where speakers vocalize /r/ to [ɐ] in the syllable coda, a diphthong ending in [ɐ̯] may be formed with every vowel except /ə/ and /ɐ/:
German diphthongs ending in [ɐ̯] (part 1), from Kohler (1999:88)
German diphthongs ending in [ɐ̯] (part 2), from Kohler (1999:88)
^1 Wiese (1996) notes that the length contrast is not very stable before non-prevocalic /r/ and that "Meinhold & Stock (1980:180), following the pronouncing dictionaries (Mangold (1990), Krech & Stötzer (1982)) judge the vowel in Art, Schwert, Fahrt to be long, while the vowel in Ort, Furcht, hart is supposed to be short. The factual basis of this presumed distinction seems very questionable." He goes on to state that in his own dialect there is no length difference in these words, and that judgements on vowel length before a non-prevocalic /r/ which is itself vocalized are problematic, in particular if /a/ precedes.
According to the "lengthless" analysis, the aforementioned "long" diphthongs are analyzed as [iɐ̯], [yɐ̯], [uɐ̯], [ɛɐ̯], [eɐ̯], [øɐ̯], [oɐ̯] and [aɐ̯]. This makes non-prevocalic /ar/ and /a:r/ homophonous as [aɐ̯] or [a:]. Non-prevocalic /ɛr/ and /ɛ:r/ may also merge, but the vowel chart in Kohler (1999) shows that they have somewhat different starting points - mid-centralized open-mid front for the former, open-mid front for the latter.
Wiese (1996) also states that "laxing of the vowel is predicted to take place in shortened vowels; it does indeed seem to go hand in hand with the vowel shortening in many cases." This leads to [iɐ̯], [yɐ̯], [uɐ̯], [eɐ̯], [øɐ̯], [oɐ̯] being pronounced the same as [ɪɐ̯], [ʏɐ̯], [ʊɐ̯], [ɛɐ̯], [œɐ̯], [ɔɐ̯]. This merger is usual in the Standard Austrian accent, in which e.g. Moor 'bog' is often pronounced [mɔɐ̯]; this, in contrast with the Standard Northern variety, also happens intervocalically, along with the diphthongization of the laxed vowel to [Vɐ̯], so that e.g. Lehrer 'teacher' is pronounced ['lɛɐ̯ʁɐ] (the corresponding Standard Northern pronunciation is ['le:ʁɐ]). Another feature of the Standard Austrian accent is complete absorption of [ɐ̯] by the preceding /ɑ, ɑ:/, so that e.g. rar 'scarce' is pronounced [ʁɑ:].
With approximately 25 phonemes, the German consonant system has an average number of consonants in comparison with other languages. One of the more noteworthy ones is the unusual affricate /p͡f/.
In the Standard Austrian variety, /k/ may be affricated to [k͡x] before front vowels.
/t͡s, s, z/ can be laminal alveolar [t͡s̻, s̻, z̻], laminal post-dental [t͡s̪, s̪, z̪] (i.e. fronted alveolar, articulated with the blade of the tongue just behind upper front teeth), or even apical alveolar [t͡s̺, s̺, z̺]. Austrian German often uses the post-dental articulation. /s, z/ are always strongly fricated.
Laminal, articulated with the foremost part of the blade of the tongue approaching the foremost part of the hard palate, with the tip of the tongue resting behind either upper or lower front teeth.
Apico-laminal, articulated with the tip of the tongue approaching the gums and the foremost part of the blade approaching the foremost part of the hard palate. According to Morciniec & Prędota (2005), this variant is used more frequently.
/θ, ð/ are used only in loanwords, mostly from English, such as Thriller /'θʁɪlɐ/, though some speakers substitute /θ/ with any of /t, s, f/ and /ð/ with any of /d, z, v/. There are two variants of these sounds:
Apical post-dental, articulated with the tip of the tongue approaching the upper incisors.
Apical interdental, articulated with the tip of the tongue between the upper and lower incisors.
/r/ has a number of possible realizations:
Voiced apical coronal trill/tap [r̺, ɾ̺], either alveolar (articulated with the tip of the tongue against the alveolar ridge), or dental (articulated with the tip of the tongue against the back of the upper front teeth).
Distribution: Common in the south (Bavaria and many parts of Switzerland and Austria), but it is also found in some speakers in central and northern Germany, especially the elderly. It is also one of the possible realizations of /r/ in the Standard Austrian accent, but a more common alveolar realization is an approximant [ɹ]. Even more common are uvular realizations, fricatives [ʁ ~ χ] and a trill [ʀ].
Voiced uvular trill [ʀ], which can be realized as voiceless [ʀ̥] after voiceless consonants (as in treten). According to Lodge (2009) it is often a tap [ʀ̆] intervocalically (as in Ehre).
Distribution: Occurs in some conservative varieties - most speakers with a uvular /r/ realize it as a fricative or an approximant. It is also one of the possible realizations of /r/ in the Standard Austrian accent, but it is less common than a fricative [ʁ ~ χ].
Dorsal continuant, about the quality of which there is not a complete agreement:
Mangold (2005) states that "with educated professional radio and TV announcers, as with professional actors on the stage and in film, the [voiced uvular] fricative [realization of] /r/ clearly predominates."
In the Standard Austrian accent, the uvular fricative is also the most common realization, although its voicing is variable (that is, it can be either voiced [ʁ] or voiceless [χ]).
Kohler (1999) writes that "the place of articulation of the consonant varies from uvular in e.g. rot ('red') to velar in e.g. treten ('kick'), depending on back or front vowel contexts." He also notes that [ʁ] is devoiced after voiceless plosives and fricatives, especially those within the same word, giving the word treten as an example. According to this author, [ʁ] can be reduced to an approximant in an intervocalic position.
Distribution: Almost all areas apart from Bavaria and parts of Switzerland.
Near-open central unrounded vowel [ɐ] is a post-vocalic allophone of (mostly dorsal) varieties of /r/. The non-syllabic variant of it is not always near-open or central; it is similar to either [ɑ] or [ə], depending on the environment.
Distribution: Widespread, but less common in Switzerland.
The voiceless stops /p/, /t/, /k/ are aspirated except when preceded by a sibilant. Many southern dialects do not aspirate /p, t, k/, and some northern ones do so only in a stressed position. The voiceless affricates /p͡f/, /t͡s/, and /t͡ʃ/ are never aspirated, and neither are any other consonants besides the aforementioned /p, t, k/.
The obstruents /b, d, ɡ, z, ʒ, d͡ʒ/ are voiceless lenis [b̥, d̥, ɡ̊, z̥, ʒ̊, d͡ʒ̊] in southern varieties, and they contrast with voiceless fortis [p, t, k, s, ʃ, t͡ʃ].
Before and after front vowels (/ɪ, i:, ʏ, y:, ɛ, ɛ:, e:, œ, ø:/ and, in varieties that realize them as front, /a/ and/or /a:/), the velar consonants /ŋ, k, ɡ/ are realized as post-palatal [ŋ˖, k̟, ɡ˖]. According to Wiese (1996), in a parallel process, /k, ɡ/ before and after back vowels (/ʊ, u:, ɔ, o:/ and, in varieties that realize them as back, /a/ and/or /a:/) are retracted to post-velar [k̠, ɡ˗] or even uvular [q, ɢ].
There is no complete agreement about the nature of /j/; it has been variously described as a fricative [ʝ], a fricative which can be fricated less strongly than /ç/, a sound variable between a weak fricative and an approximant, and an approximant [j], which is the usual realization in the Standard Austrian variety.
In many varieties of standard German, [ʔ] occurs in careful speech before word stems that begin with a vowel. Some varieties of standard German do not have [ʔ], e.g. Swiss Standard German. It is not usually considered a phoneme. In colloquial and dialectal speech, [ʔ] is often omitted, especially when the word beginning with a vowel is unstressed.
The phonemic status of affricates is controversial. The majority view accepts /p͡f/ and /t͡s/, but not /t͡ʃ/ or the non-native /d͡ʒ/; some accept none, some accept all but /d͡ʒ/, and some accept all.
Although [t͡ʃ] occurs in native words, it only appears in historic clusters of /t/ + /ʃ/ (e.g. deutsch < OHG diutisc) or in words with expressive quality (e.g. glitschen, hutschen). [t͡ʃ] is, however, well-established in loanwords, including German toponyms of non-Germanic origin (e.g. Zschopau).
[ʒ] and [d͡ʒ] occur only in words of foreign origin. In certain varieties, they are replaced by [ʃ] and [t͡ʃ] altogether.
[ʋ] is occasionally considered to be an allophone of /v/, especially in southern varieties of German.
[ç] and [x] are traditionally regarded as allophones after front vowels and back vowels, respectively. For a more detailed analysis see below at ich-Laut and ach-Laut. According to some analyses, [χ] is an allophone of /x/ after /a, a:/ and according to some also after /ʊ, ɔ, aʊ̯/. However, according to Moosmüller, Schmid & Brandstätter (2015), the uvular allophone is used after /ɔ/ only in the Standard Austrian variety.
Some phonologists deny the phoneme /ŋ/ and use /nɡ/ instead, along with /nk/ instead of /ŋk/. The phoneme sequence /nɡ/ is realized as [ŋɡ] when /ɡ/ can start a valid onset of the next syllable whose nucleus is a vowel other than unstressed /ə/, /ɪ/, or /ʊ/. It becomes [ŋ] otherwise. For example:
Ganges /'ɡanɡəs/ ['ɡaŋəs] ~ /'ɡanɡɛs/ ['ɡaŋɡɛs]
Ich-Laut and ach-Laut
Ich-Laut is the voiceless palatal fricative [ç] (which is found in the word ich [ɪç] 'I'), and ach-Laut is the voiceless velar fricative [x] (which is found in the word ach [ax], the interjection 'oh', 'alas'). Note that Laut [laʊ̯t] is the German word for 'sound, phone'. In German, these two sounds are allophones occurring in complementary distribution. The allophone [x] occurs after back vowels and /a a:/ (for instance in Buch [bu:x] 'book'), the allophone [ç] after front vowels (for instance in mich [mɪç] 'me/myself') and consonants (for instance in Furcht [fʊʁçt] 'fear', manchmal ['mançma:l] 'sometimes'). (This happens most regularly: if the ⟨r⟩ in Furcht is pronounced as a consonant, ⟨ch⟩ represents [ç]; however if, as often happens, it is vocalized as [ɐ], resembling the vowel [a], then ⟨ch⟩ may represent [x], yielding [fʊɐ̯xt].)
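The complementary distribution just described amounts to a one-line decision on the preceding segment. The sketch below is a simplified illustration, not a full phonological model: the function name `ch_allophone` and the set `BACK_CONTEXT` are hypothetical, the colon stands in for the IPA length mark, and diphthongs, vocalized /r/ and the suffix -chen are deliberately left out.

```python
# Illustrative sketch of the basic ich-Laut / ach-Laut distribution.
# BACK_CONTEXT lists the back vowels plus /a a:/, after which [x] occurs.
BACK_CONTEXT = {"u:", "ʊ", "o:", "ɔ", "a", "a:"}


def ch_allophone(preceding):
    """Pick the allophone of <ch> from the segment that precedes it.

    `preceding` is a vowel or consonant symbol, or None when <ch>
    stands in a word-initial onset.
    """
    if preceding in BACK_CONTEXT:
        return "x"  # ach-Laut, as in Buch
    # Front vowels, consonants and onsets all take the ich-Laut.
    return "ç"
```

For example, `ch_allophone("u:")` selects the ach-Laut of Buch, while `ch_allophone("n")` selects the ich-Laut of manchmal.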
In loanwords, the pronunciation of potential fricatives in onsets of stressed syllables varies: in the Northern varieties of standard German, it is [ç], while in Southern varieties, it is [k], and in Western varieties, it is [ʃ] (for instance in China: ['çi:na] vs. ['ki:na] vs. ['ʃi:na]).
The diminutive suffix -chen is always pronounced with an ich-Laut [-çən]. Usually, this ending triggers umlaut (compare for instance Hund [hʊnt] 'dog' to Hündchen ['hʏntçn̩] 'little dog'), so theoretically, it could only occur after front vowels. However, in some comparatively recent coinings, there is no longer an umlaut, for instance in the word Frauchen ['fʁaʊ̯çən] (a diminutive of Frau 'woman'), so that a back vowel is followed by a [ç], even though normally it would be followed by a [x], as in rauchen ['ʁaʊ̯xən] ('to smoke'). This exception to the allophonic distribution may be an effect of the morphemic boundary or an example of phonemicization, where erstwhile allophones undergo a split into separate phonemes.
The allophonic distribution of [ç] after front vowels and [x] after other vowels is also found in other languages, such as Scots, in the pronunciation of light. However, it is by no means inevitable: Dutch, Yiddish, and many Southern German dialects retain [x] (which can be realized as [χ] instead) in all positions. It is thus reasonable to assume that Old High German ih, the ancestor of modern ich, was pronounced with [x] rather than [ç]. While it is impossible to know for certain whether Old English words such as niht (modern night) were pronounced with [x] or [ç], [ç] is likely (see Old English phonology).
Despite the phonetic history, the complementary distribution of [ç] and [x] in modern Standard German is better described as backing of /ç/ after a back vowel, rather than fronting of /x/ after a front vowel, because [ç] is used in onsets (Chemie [çe'mi:] 'chemistry') and after consonants (Molch [mɔlç] 'newt'), and is thus the underlying form of the phoneme.
According to Kohler, the German ach-Laut is further differentiated into two allophones, [x] and [χ]: [x] occurs after /u:, o:/ (for instance in Buch [bu:x] 'book') and [χ] after /a, a:/ (for instance in Bach [baχ] 'brook'), while either [x] or [χ] may occur after /ʊ, ɔ, aʊ̯/, with [x] predominating.
Various German consonants occur in pairs at the same place of articulation and in the same manner of articulation, namely the pairs /p-b/, /t-d/, /k-ɡ/, /s-z/, /ʃ-ʒ/. These pairs are often called fortis-lenis pairs, since describing them as voiced-voiceless pairs is inadequate. With certain qualifications, /t͡ʃ-d͡ʒ/, /f-v/ and /θ-ð/ are also considered fortis-lenis pairs.
Fortis-lenis distinction for /ʔ, m, n, ŋ, l, r, h/ is unimportant.
The fortis stops /p, t, k/ are aspirated in many varieties. The aspiration is strongest in the onset of a stressed syllable (such as Taler ['tʰa:lɐ] 'thaler'), weaker in the onset of an unstressed syllable (such as Vater ['fa:tʰɐ] 'father'), and weakest in the syllable coda (such as in Saat [za:tʰ] 'seed'). All fortis consonants, i.e. /p, t, k, f, θ, s, ʃ, ç, x, p͡f, t͡s, t͡ʃ/, are fully voiceless.
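The three-way grading of aspiration by syllable position can be captured in a small helper. This is only a sketch under simplifying assumptions: the function name `aspiration_degree` and its labels are hypothetical, and the function describes the relative ranking given in the text, not measured values.

```python
# Illustrative sketch: relative aspiration strength of fortis /p, t, k/
# by syllable position, following the ranking stated in the text.
def aspiration_degree(position, stressed=False):
    """Return a label for the relative strength of aspiration.

    position: "onset" or "coda"
    stressed: whether the syllable carries stress (only matters in onsets)
    """
    if position == "onset":
        return "strongest" if stressed else "weaker"
    if position == "coda":
        return "weakest"
    raise ValueError(f"unknown position: {position!r}")
```

The /t/ of Taler would be queried as `aspiration_degree("onset", stressed=True)`, that of Vater as `aspiration_degree("onset")`, and that of Saat as `aspiration_degree("coda")`.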
The lenis consonants /b, d, ɡ, v, ð, z, ʒ, j, r, d͡ʒ/ range from being weakly voiced to almost voiceless [b̥, d̥, ɡ̊, v̥, ð̥, z̥, ʒ̊, j̥, r̥, d͡ʒ̊] after voiceless consonants: Kasbah ['kasb̥a] ('kasbah'), abdanken ['ʔapd̥aŋkn̩] ('to resign'), rotgelb ['ʁo:tɡ̊ɛlp] ('red-yellow'), Abwurf ['ʔapv̥ʊʁf] ('dropping'), Absicht ['ʔapz̥ɪçt] ('intention'), Holzjalousie ['hɔlt͡sʒ̊aluzi:] ('wooden jalousie'), wegjagen ['vɛkj̥a:ɡn̩] ('to chase away'), tropfen ['tʁ̥ɔp͡fn̩] ('to drop'), Obstjuice ['ʔo:pstd͡ʒ̊u:s] ('fruit juice'). Mangold (2005) states that they are "to a large extent voiced" [b, d, ɡ, v, ð, z, ʒ, j, r, d͡ʒ] in all other environments, but some studies have found the stops /b, d, ɡ/ to be voiceless word/utterance-initially in most dialects (while still contrasting with /p, t, k/ due to the aspiration of the latter).
/b, d, ɡ, z, ʒ/ are voiceless in most southern varieties of German. For clarity, they are often transcribed as [b̥, d̥, ɡ̊, z̥, ʒ̊].
The nature of the phonetic difference between the voiceless lenis consonants and the similarly voiceless fortis consonants is controversial. It is generally described as a difference in articulatory force, and occasionally as a difference in articulatory length; for the most part, it is assumed that one of these characteristics implies the other.
In various central and southern varieties, the opposition between fortis and lenis is neutralized in the syllable onset; sometimes just in the onset of stressed syllables, sometimes in all cases.
The pair /f-v/ is not considered a fortis-lenis pair, but a simple voiceless-voiced pair, as /v/ remains voiced in all varieties, including the Southern varieties that devoice the lenes (with, however, some exceptions). Generally, the southern /v/ is realized as the voiced approximant [ʋ]. However, there are southern varieties which differentiate between a fortis /f/ (such as in sträflich ['ʃtrɛ:flɪç] 'culpable' from Middle High German stræflich) and a lenis /f/ ([v̥], such as in höflich ['hø:v̥lɪç] 'polite' from Middle High German hovelîch); this is analogous to the opposition of fortis [s] and lenis [z̥].
In varieties from Northern Germany, lenis stops in the syllable coda are realized as fortis stops. This does not happen in varieties from Southern Germany, Austria or Switzerland.
Since the lenis stops /b, d, ɡ/ are unvoiced or at most variably voiced (as stated above), this cannot be called devoicing in the strict sense of the word because it does not involve the loss of phonetic voice. More accurately, it can be called coda fortition or a neutralization of fortis and lenis sounds in the coda. Fricatives are truly and contrastively voiced in Northern Germany. Therefore, the fricatives undergo coda devoicing in the strict sense of the word. It is disputed whether coda devoicing is due to a constraint which specifically operates on syllable codas or whether it arises from constraints which "protect voicing in privileged positions."
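Coda fortition in the northern varieties can be illustrated as a substitution that applies only to the coda of a syllable. The sketch below assumes a hypothetical pre-syllabified input (onset, nucleus, coda) and a hypothetical table `LENIS_TO_FORTIS`; ASCII "g" is used for the velar stop and the colon for vowel length. It is a toy model of the neutralization, not an account of how syllabification itself is decided.

```python
# Illustrative sketch: lenis obstruents become fortis in the syllable coda.
# The syllable is assumed to be already split into onset, nucleus and coda.
LENIS_TO_FORTIS = {"b": "p", "d": "t", "g": "k", "z": "s", "v": "f"}


def coda_fortition(onset, nucleus, coda):
    """Return the syllable with every lenis obstruent in the coda replaced
    by its fortis counterpart; onset and nucleus are left untouched."""
    new_coda = [LENIS_TO_FORTIS.get(segment, segment) for segment in coda]
    return list(onset), nucleus, new_coda
```

With the underlying form of Rad 'wheel' given as `coda_fortition(["r"], "a:", ["d"])`, the final stop surfaces as fortis, whereas a lenis stop in the onset, as in `coda_fortition(["d"], "a", ["x"])`, is unaffected.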
Contrary to the standard pronunciation rules, in western varieties including those of the Rhineland, coda fortis-lenis neutralization results in voicing rather than devoicing if the following word begins with a vowel. For example, mit uns becomes [mɪd‿ʊns] and darf ich becomes [daʁv‿ɪç]. The same sandhi phenomenon also exists as a general rule in the Luxembourgish language.
Stress in German usually falls on the first syllable, with the following exceptions:
Many loanwords, especially proper names, keep their original stress, e.g. Obama /o'ba:.ma/
Nouns formed with Latinate suffixes, such as -ant, -anz, -enz, -ion, -ismus, -ist, -ment, -tät: Idealismus /ide.a'lɪsmʊs/ ('idealism'), Konsonant /kɔnzo'nant/ ('consonant'), Tourist /tu'ʁɪst/ ('tourist')
Verbs formed with the French-derived suffix -ieren, e.g. studieren /ʃtu'di:ʁən/ ('to study'). The suffix is often pronounced /i:ɐ̯n/ in casual speech.
Compound adverbs with her, hin, da, or wo, as they are stressed on the first syllable of the second element, e.g. dagegen /da'ɡe:ɡən/ ('on the other hand'), woher /vo'he:ɐ̯/ ('from where')
In addition, German uses different stress for separable prefixes and inseparable prefixes in verbs and words derived from such verbs:
Words beginning with be-, ge-, er-, ver-, zer-, ent-, emp- and a few other inseparable prefixes are stressed on the root.
Words beginning with the separable prefixes ab-, auf-, ein-, vor-, and most prepositional adverbs are stressed on the prefix.
Some prefixes, notably über-, unter-, um-, and durch-, can function as separable or inseparable prefixes and are stressed or not accordingly.
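The prefix rules listed above can be sketched as a lookup that returns whether stress falls on the prefix or on the root. This is a minimal illustration under stated assumptions: the tuples contain only the prefixes named in the text, the function name `stress_position` and the `separable` flag are hypothetical, and for the dual-use prefixes the caller has to say which reading is meant, since spelling alone does not decide it.

```python
# Illustrative sketch of stress placement in prefixed verbs.
# Only the prefixes named in the text are covered.
INSEPARABLE = ("be", "ge", "er", "ver", "zer", "ent", "emp")
SEPARABLE = ("ab", "auf", "ein", "vor")
DUAL = ("über", "unter", "um", "durch")


def stress_position(prefix, separable=None):
    """Return "prefix" or "root" for a verb with the given prefix.

    For dual-use prefixes, `separable` must be True or False.
    """
    if prefix in INSEPARABLE:
        return "root"
    if prefix in SEPARABLE:
        return "prefix"
    if prefix in DUAL:
        if separable is None:
            raise ValueError(f"{prefix!r} is ambiguous; pass separable=True/False")
        return "prefix" if separable else "root"
    raise ValueError(f"prefix not covered by this sketch: {prefix!r}")
```

For instance, `stress_position("um", separable=True)` corresponds to the 'rewrite' reading of umschreiben, and `stress_position("um", separable=False)` to the 'paraphrase' reading.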
A few homographs with such prefixes exist. They are not perfect homophones. Consider the word umschreiben. As 'umschreiben (separable prefix), it means 'to rewrite' and is pronounced ['ʊmʃʁaɪ̯bən], with stress on the first syllable. Its associated noun, die 'Umschreibung, is also stressed on the first syllable - ['ʊmʃʁaɪ̯bʊŋ]. On the other hand, um'schreiben (inseparable prefix) is pronounced [ʊm'ʃʁaɪ̯bən], with stress on the second syllable. This word means 'to paraphrase', and its associated noun, die Um'schreibung, is also stressed on the second syllable - [ʊm'ʃʁaɪ̯bʊŋ]. Another example is the word umfahren; with stress on the root ([ʊm'fa:ʁən]) it means 'to drive around (an obstacle in the street)', and with stress on the prefix (['ʊmfa:ʁən]) it means 'to run down/over' or 'to knock down'.
Like all infants, German infants go through a babbling stage in the early phases of phonological acquisition, during which they produce the sounds they will later use in their first words. Phoneme inventories begin with stops, nasals, and vowels; (contrasting) short vowels and liquids appear next, followed by fricatives and affricates, and finally all other consonants and consonant clusters. Children begin to produce protowords near the end of their first year. These words do not approximate adult forms, yet have a specific and consistent meaning. Early word productions are phonetically simple and usually follow the syllable structure CV or CVC, although this generalization has been challenged. The first vowels produced are /ə/, /a/, and /a:/, followed by /e/, /i/, and /ɛ/, with rounded vowels emerging last. German children often use phonological processes to simplify their early word production. For example, they may delete an unstressed syllable (Schokolade 'chocolate' pronounced ['la:də]), or replace a fricative with a corresponding stop (Dach [dax] 'roof' pronounced [dak]). One case study found that a 17-month-old child acquiring German replaced the voiceless velar fricative [x] with the nearest available continuant [h], or deleted it altogether (Buch [bu:x] 'book' pronounced [buh] or [bu:]).
Vowel space development
In 2009, Lintfert examined the development of vowel space of German speakers in their first three years of life. During the babbling stage, vowel distribution has no clear pattern. However, stressed and unstressed vowels already show different distributions in the vowel space. Once word production begins, stressed vowels expand in the vowel space, while the F1 - F2 vowel space of unstressed vowels becomes more centralized. The majority of infants are then capable of stable production of F1. The variability of formant frequencies among individuals decreases with age. After 24 months, infants expand their vowel space individually at different rates. However, if the parents' utterances possess a well-defined vowel space, their children produce clearly distinguished vowel classes earlier. By about three years old, children command the production of all vowels, and they attempt to produce the four cardinal vowels, /y/, /i/, /u/ and /a/, at the extreme limits of the F1-F2 vowel space (i.e., the height and backness of the vowels are made extreme by the infants).
Generally, closed-class grammatical words (e.g. articles and prepositions) are absent from children's speech when they first begin to combine words. However, children as young as 18 months old show knowledge of these closed-class words when they prefer stories with them, compared to passages with them omitted. Therefore, the absence of these grammatical words cannot be due to perceptual problems. Researchers tested children's comprehension of four grammatical words: bis [bɪs] ('up to'), von [fɔn] ('from'), das [das] ('the' neuter singular), and sein [zaɪ̯n] ('his'). After first being familiarized with the words, eight-month-old children looked longer in the direction of a speaker playing a text passage that contained these previously heard words. However, this ability is absent in six-month-olds.
The acquisition of nasals in German differs from that of Dutch, a phonologically closely related language. German children produce proportionately more nasals in onset position (sounds before a vowel in a syllable) than Dutch children do. German children, once they reached 16 months, also produced significantly more nasals in syllables containing schwas, when compared with Dutch-speaking children. This may reflect differences in the languages the children are being exposed to, although the researchers claim that the development of nasals likely cannot be seen apart from the more general phonological system the child is developing.
Phonotactic constraints and reading
A 2006 study examined the acquisition of German in phonologically delayed children (specifically, issues with fronting of velars and stopping of fricatives) and whether they applied phonotactic constraints to word-initial consonant clusters containing these modified consonants. In many cases, the subjects (mean age = 5;1) avoided making phonotactic violations, opting instead for other consonants or clusters in their speech. This suggests that phonotactic constraints do apply to the speech of German children with phonological delay, at least in the case of word-initial consonant clusters. Additional research has also shown that spelling consistencies seen in German raise children's phonemic awareness as they acquire reading skills.
Sound changes and mergers
A merger found mostly in Northern accents of German is that of /ɛ:/ (spelled ⟨ä, äh⟩) with /e:/ (spelled ⟨e⟩, ⟨ee⟩, or ⟨eh⟩). Some speakers merge the two everywhere, some distinguish them everywhere, and others keep /ɛ:/ distinct only in conditional forms of strong verbs (for example, ich gäbe ['ɡɛ:bə] 'I would give' and ich gebe ['ɡe:bə] 'I give' are distinguished, but Bären ['be:ʁən] 'bears' and Beeren ['be:ʁən] 'berries' are not). The standard pronunciation of Bären is ['bɛ:ʁən].
Another common merger is that of /ɡ/ at the end of a syllable with [ç] or [x], for instance Krieg [kʁi:ç] ('war'), but Kriege ['kʁi:ɡə] ('wars'); er lag [la:x] ('he lay'), but wir lagen ['la:ɡən] ('we lay'). This pronunciation is frequent all over central and northern Germany. It is characteristic of regional languages and dialects, particularly Low German in the North, where ⟨g⟩ represents a fricative, becoming voiceless in the syllable coda, as is common in German (final-obstruent devoicing). However common it is, this pronunciation is considered sub-standard. Only in one case, in the grammatical ending -ig (which corresponds to English -y), is the fricative pronunciation of final ⟨g⟩ prescribed by the Siebs standard, for instance wichtig ['vɪçtɪç] ('important'). The merger occurs neither in Austro-Bavarian and Alemannic German nor in the corresponding varieties of Standard German, and therefore in these regions -ig is pronounced [ɪɡ̊].
Many speakers do not distinguish the affricate /p͡f/ from the simple fricative /f/ at the beginning of a word, in which case the verb (er) fährt ('[he] travels') and the noun Pferd ('horse') are both pronounced [fɛɐ̯t]. This most commonly occurs in northern and western Germany, where the local dialects did not originally have the sound /p͡f/. Some speakers also have a peculiar pronunciation for /p͡f/ in the middle or at the end of a word, replacing the [f] in /p͡f/ with a voiceless bilabial fricative, i.e. a consonant produced by pressing air flow through the tensed lips. Thereby Tropfen ('drop') becomes ['tʁɔpɸn̩], rather than ['tʁɔp͡fn̩].
Many speakers who have a vocalization of /r/ after /a/ merge this combination with long /a:/ (i.e. /ar/ > *[aɐ] or *[ɑɐ] > [a:] or [ɑ:]). Hereby, Schaf ('sheep') and scharf ('sharp') can both be pronounced [ʃa:f] or [ʃɑ:f]. This merger does not occur where /a/ is a front vowel while /a:/ is realised as a back vowel. Here the words are kept distinct as [ʃɑ:f] ('sheep') and [ʃa:f] ('sharp').
In umlaut forms, the difference usually reoccurs: Schäfer ['ʃɛ:fɐ] or ['ʃe:fɐ] vs. schärfer ['ʃɛɐ̯fɐ]. Speakers with this merger also often use [a:ç] (instead of formally normal /a:x/) where it stems from original /arç/. The word Archen ('arks') is thus pronounced ['ʔa:çn̩], which makes a minimal pair with Aachen ['ʔa:xn̩], making the difference between [ç] and [x] phonemic, rather than just allophonic, for these speakers.
In the standard pronunciation, the vowel qualities /i/, /ɪ/, /e/, /ɛ/, as well as /u/, /ʊ/, /o/, /ɔ/, are all still distinguished even in unstressed syllables. In this latter case, however, many simplify the system in various degrees. For some speakers, this may go so far as to merge all four into one, hence misspellings by schoolchildren such as Bräutegam (instead of Bräutigam) or Portogal (instead of Portugal).
In everyday speech, more mergers occur, some of which are universal and some of which are typical for certain regions or dialect backgrounds. Overall, there is a strong tendency towards reduction and contraction. For example, long vowels may be shortened, consonant clusters may be simplified, word-final [ə] may be dropped in some cases, and the suffix -en may be contracted with preceding consonants, e.g. [ham] for haben ['ha:bən] ('to have').
If the clusters [mp], [lt], [nt], or [ŋk] are followed by another consonant, the stops /p/, /t/ and /k/ usually lose their phonemic status. Thus while the standard pronunciation distinguishes ganz [ɡant͡s] ('whole') from Gans [ɡans] ('goose'), as well as er sinkt [zɪŋkt] from er singt [zɪŋt], the two pairs are homophones for most speakers. The commonest practice is to drop the stop (thus [ɡans], [zɪŋt] for both words), but some speakers insert the stop where it is not etymological ([ɡants], [zɪŋkt] for both words), or they alternate between the two ways. Only a few speakers retain a phonemic distinction.
Middle High German
The Middle High German vowels [ei̯] and [iː] developed into the modern Standard German diphthong [aɪ̯], whereas [ou̯] and [uː] developed into [aʊ̯]. For example, Middle High German heiz /hei̯s/ and wîz /wiːs/ ('hot' and 'white') became Standard German heiß /haɪ̯s/ and weiß /vaɪ̯s/. In some dialects, the Middle High German vowels have not changed, e.g. Swiss German heiss /hei̯s/ and wiiss /viːs/, while in other dialects or languages, the vowels have changed but the distinction is kept, e.g. Bavarian hoaß /hɔɐ̯s/ and weiß /vaɪ̯s/, Ripuarian heeß /heːs/ and wieß /viːs/, Yiddish heys /hɛɪ̯s/ and vays /vaɪ̯s/.
The Middle High German diphthongs [iə̯], [uə̯] and [yə̯] became the modern Standard German long vowels [iː], [uː] and [yː] after the Middle High German long vowels changed to diphthongs. Most Upper German dialects retain the diphthongs. A remnant of their former diphthong character is shown when [iː] continues to be written ie in German (as in Liebe 'love').
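The relative ordering of the two changes matters, and a short sketch can make that visible. The code below is an illustration only: the names `DIPHTHONGIZATION`, `MONOPHTHONGIZATION`, `apply_stage` and `mhg_to_standard` are hypothetical, words are given as pre-segmented lists, non-syllabic diacritics are omitted for simplicity, and only the correspondences stated in the text are covered.

```python
# Stage 1: Middle High German long vowels and old diphthongs -> new diphthongs.
DIPHTHONGIZATION = {"iː": "aɪ", "ei": "aɪ", "uː": "aʊ", "ou": "aʊ"}

# Stage 2: Middle High German falling diphthongs -> long vowels.
MONOPHTHONGIZATION = {"iə": "iː", "uə": "uː", "yə": "yː"}


def apply_stage(segments, mapping):
    """Replace every segment that has an entry in `mapping`."""
    return [mapping.get(seg, seg) for seg in segments]


def mhg_to_standard(segments):
    """Apply stage 1 before stage 2, following the order given in the text."""
    return apply_stage(apply_stage(segments, DIPHTHONGIZATION), MONOPHTHONGIZATION)


print(mhg_to_standard(["h", "ei", "s"]))       # vowel of heiz
print(mhg_to_standard(["w", "iː", "s"]))       # vowel of wîz (initial consonant left as is)
print(mhg_to_standard(["l", "iə", "b", "ə"]))  # vowel of liebe
```

If the stages were applied in the opposite order, the long [iː] produced from [iə] would itself be diphthongized, which is not what happened; running stage 1 first keeps the new long vowels intact.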
German incorporates a significant number of loanwords from other languages. Loanwords are often adapted to German phonology but to varying degrees, depending on the speaker and the commonness of the word. /ʒ/ and /dʒ/ do not occur in native German words but are common in a number of French and English loan words. Many speakers replace them with /ʃ/ and /tʃ/ respectively (especially in Southern Germany, Austria and Switzerland), so that Dschungel (from English jungle) can be pronounced [ˈdʒʊŋl̩] or [ˈtʃʊŋl̩]. Some speakers in Northern and Western Germany merge /ʒ/ with /dʒ/, so that Journalist (phonemically /dʒʊʁnaˈlɪst ~ ʒʊʁnaˈlɪst/) can be pronounced [ʒʊɐ̯naˈlɪst], [dʒʊɐ̯naˈlɪst] or [ʃʊɐ̯naˈlɪst]. The realization of /ʒ/ as [tʃ], however, is uncommon.
Loanwords from English
Many English words are used in German, especially in technology and pop culture. Some speakers pronounce them similarly to their native pronunciation, but many speakers change non-native phonemes to similar German phonemes (even if they pronounce them in a rather English manner in an English-language setting):
English /θ, ð/ are usually pronounced as in RP or General American; some speakers replace them with /s/ and /z/ respectively (th-alveolarization) e.g. Thriller [ˈθʁɪlɐ ~ ˈsʁɪlɐ].
English /ɹ/ can be pronounced the same as in English, i.e. [ɹ], or as the corresponding native German /r/ e.g. Rock [ʀɔk] or [rɔk]. German and Austrian speakers tend to be variably rhotic.
English /w/ is often replaced with German /v/ e.g. Whisk(e)y [ˈvɪskiː].
Word-initial /s/ is often retained (especially in the South, where word-initial /s/ is common), but many speakers replace it with /z/ e.g. Sound [zaʊ̯nt].
Word-initial /st/ and /sp/ are usually retained, but some speakers (especially in South Western Germany and Western Austria) replace them with /ʃt/ and /ʃp/ e.g. Steak [ʃteɪk] or [ʃteːk], Spray [ʃpʁeɪ] or [ʃpʁeː].
English /tʃ/ is usually retained, but in Northern and Western Germany, as well as Luxembourg it is often replaced with /ʃ/ e.g. Chips [ʃɪps].
In Northern Standard German, final-obstruent devoicing is applied to English loan words just as to other words e.g. Airbag [ˈɛːɐ̯bɛk], Lord [lɔʁt] or [lɔɐ̯t], Backstage [ˈbɛksteːtʃ]. However, in Southern Standard German, in Swiss Standard German and Austrian Standard German, final-obstruent devoicing does not occur and so speakers are more likely to retain the original pronunciation of word-final lenes (although realizing them as fortes may occur because of confusing English spelling with pronunciation).
English /eɪ/ and /oʊ/ are often replaced with /eː/ and /oː/ respectively e.g. Homepage [ˈhoːmpeːtʃ].
English /æ/ and /ɛ/ are pronounced the same, as German /ɛ/ (met-mat merger) e.g. Backup [ˈbɛkap].
English /ɒ/ and /ɔː/ are pronounced the same, as German /ɔ/ (cot-caught merger) e.g. Box [bɔks].
English /ʌ/ is usually pronounced as German /a/ e.g. Cutter [ˈkatɐ].
English /ɜːr/ is usually pronounced as German /œʁ/ e.g. Shirt [ʃœʁt] or [ʃœɐ̯t].
English /i/ is pronounced as /iː/ (happy-tensing) e.g. Whisk(e)y [ˈvɪskiː].
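Several of the substitutions listed above can be combined into a small lookup-based sketch. This is an illustration under stated assumptions, not a model of any speaker: the names `SUBSTITUTIONS`, `DEVOICING` and `adapt_loanword` are hypothetical, the English input transcriptions are supplied by hand without stress marks, only a subset of the listed replacements is included, and devoicing is applied to the word-final segment only.

```python
# Substitutions taken from the list above (English phoneme -> German phoneme).
SUBSTITUTIONS = {
    "w": "v",
    "eɪ": "eː",
    "oʊ": "oː",
    "æ": "ɛ",
    "ɒ": "ɔ",
    "ɔː": "ɔ",
    "ʌ": "a",
}

# Lenis obstruents and their fortis counterparts, used for final devoicing.
DEVOICING = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s", "ʒ": "ʃ", "dʒ": "tʃ"}


def adapt_loanword(segments, final_devoicing=True):
    """Map English segments to German ones; optionally devoice the last segment.

    `final_devoicing=True` mimics the Northern pattern described in the text;
    `False` mimics varieties that keep word-final lenes.
    """
    adapted = [SUBSTITUTIONS.get(seg, seg) for seg in segments]
    if final_devoicing and adapted:
        adapted[-1] = DEVOICING.get(adapted[-1], adapted[-1])
    return adapted


homepage = ["h", "oʊ", "m", "p", "eɪ", "dʒ"]
print("".join(adapt_loanword(homepage)))
print("".join(adapt_loanword(homepage, final_devoicing=False)))
print("".join(adapt_loanword(["b", "æ", "k", "ʌ", "p"])))
```

Keeping the substitution table separate from the devoicing step mirrors the text, where the vowel replacements are widespread but final devoicing depends on the regional variety.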
The sample text is a reading of the first sentence of "The North Wind and the Sun". The phonemic transcription treats every instance of [ɐ] and [ɐ̯] as /ər/ and /r/, respectively. The phonetic transcription is a fairly narrow transcription of the educated northern accent. The speaker transcribed in the narrow transcription is 62 years old, and he is reading in a colloquial style. Aspiration, glottal stops and devoicing of the lenes after fortes are not transcribed.
Note that the audio file contains the whole fable, and that it was recorded by a much younger speaker.
^Pages 1-2 of the book (Deutsches Aussprachewörterbuch) discuss die Standardaussprache, die Gegenstand dieses Wörterbuches ist (the standard pronunciation which is the topic of this dictionary). The book also states Da sich das Deutsche zu einer plurizentrischen Sprache entwickelt hat, bildeten sich jeweils eigene Standardvarietäten (und damit Standardaussprachen) (since German has developed into a pluricentric language, separate standard varieties (and hence standard pronunciations) have emerged), but refers to these standards as regionale und soziolektale Varianten (regional and sociolectal variants).
^"Reflections on Diglossia". In northern Germany, it appears that in Hanover - perhaps because of the presence of the electoral (later royal) court - a parastandard High German was spoken by the 18th century as well, at least among the educated, with the curious result that Hanover speech - though non-native - became the model of German pronunciation on the stage (Bühnendeutsch), since everywhere else in Germany dialects were still spoken by everyone. Other capitals (Berlin, Dresden, Munich, Vienna) eventually developed their own Umgangssprachen, but the Hanover model remained the ideal.
^"Reading Heinrich Heine" (PDF). He spoke the dialect of Hanover, where - as also in the vicinity to the south of this city - German is pronounced best.
^"Nicht das beste Hochdeutsch in Hannover". The German spoken in Hanover is undoubtedly very close to the national pronunciation norm. But that is also true of other northern German cities such as Kiel, Münster or Rostock. Hanover has no special status in this respect.
^Differences include the pronunciation of the endings -er, -en, and -em.
^Source: Wiese (1996:11, 14). On page 14, the author states that /aɪ̯/, /aʊ̯/ and /ɔʏ̯/ are of the same quality as the vowels of which they consist. On page 8, he states that /a/ is low central.
^See vowel chart in Kohler (1999:87). Note that despite their true ending points, Kohler still transcribes them as /aɪ aʊ ɔɪ/, i.e. with higher offsets than they actually have.
^Source: Krech et al. (2009:72). The authors do not provide a vowel chart. Instead, they state rather vaguely that "the diphthong [aɛ̯] is a monosyllabic compound consisting of the unrounded open vowel [a] and the unrounded mid front vowel [ɛ]."
^Source: Krech et al. (2009:72-73). The authors do not provide a vowel chart. Instead, they state rather vaguely that "the diphthong [aɔ̯] is a monosyllabic compound consisting of the unrounded open vowel [a] and the rounded mid back vowel [ɔ]."
^Source: Krech et al. (2009:73). The authors do not provide a vowel chart. Instead, they state rather vaguely that "the diphthong [ɔœ̯] is a monosyllabic compound consisting of the rounded mid back vowel [ɔ] and the rounded mid front vowel [œ]."
^Moosmüller, Schmid & Brandstätter (2015:341-342): "SAG features a wide variety of realizations of the trill. In approximately the past 40 years, the pronunciation norm has changed from an alveolar to a uvular trill. The latter is mostly pronounced as a fricative, either voiced or voiceless. Alveolar trills are still in use, mostly pronounced as an approximant."
^[v] written v can devoice in nearly every place once the word has become common; w is devoiced in Möwe, Löwe. On the other hand, the keeping to the variety is so standard that doof /doːf/ induced the writing "(der) doofe" even though the standard pronunciation of the latter word is /ˈdoːvə/.
Altvater-Mackensen, N.; Fikkert, P. (2007), "On the acquisition of nasals in Dutch and German", Linguistics in the Netherlands, 24: 14-24, doi:10.1075/avt.24.04alt
Ammon, Ulrich; Bickel, Hans; Ebner, Jakob; Esterhammer, Ruth; Gasser, Markus; Hofer, Lorenz; Kellermeier-Rehbein, Birte; Löffler, Heinrich; Mangott, Doris; Moser, Hans; Schläpfer, Robert; Schloßmacher, Michael; Schmidlin, Regula; Vallaster, Günter (2004), Variantenwörterbuch des Deutschen. Die Standardsprache in Österreich, der Schweiz und Deutschland sowie in Liechtenstein, Luxemburg, Ostbelgien und Südtirol, Berlin, New York: Walter de Gruyter, ISBN 3-11-016575-9
Goswami, U.; Ziegler, J.; Richardson, U. (2005), "The effects of spelling consistency on phonological awareness: A comparison of English and German", Journal of Experimental Child Psychology, 92: 345-365, doi:10.1016/j.jecp.2005.06.002
Trudgill, Peter (1974), "Linguistic change and diffusion: description and explanation in sociolinguistic dialect geography", Language in Society, Cambridge University Press, 3 (2): 215-246, doi:10.1017/S0047404500004358
Ulbrich, Horst (1972), Instrumentalphonetisch-auditive R-Untersuchungen im Deutschen, Berlin: Akademie-Verlag
Wängler, Hans-Heinrich (1961), Atlas deutscher Sprachlaute, Berlin: Akademie-Verlag