Hungarian orthography (Hungarian: helyesírás, lit. 'correct writing') consists of rules defining the standard written form of the Hungarian language. It includes the spelling of lexical words, proper nouns and foreign words (loanwords) in themselves, with suffixes, and in compounds, as well as the hyphenation of words, punctuation, abbreviations, collation (alphabetical ordering), and other information (such as how to write dates).
Hungarian is written with the Hungarian alphabet, an extended version of the Latin alphabet. Its letters usually indicate sounds, except when morphemes are to be marked (see below). The extensions include consonants written with digraphs or a trigraph and vowel letters marked with diacritics. Long consonants are marked by a double letter (e.g. l > ll and sz > ssz) while long vowels get an acute accent (e.g. o > ó) or their umlaut is replaced with a double acute accent (ö, ü > ő, ű). Only the first letter of digraphs and of the trigraph dzs is written in upper case when capitalizing in normal text, but all letters are capitalized in acronyms and all-uppercase inscriptions.
The letters q, x, y, w are only part of the extended Hungarian alphabet and they are rarely used in Hungarian words; they are normally replaced with their usual phonetic equivalents kv, ksz, i, v (only the x is relatively common, e.g. taxi). Ch is not a part of the alphabet but it still exists in some words (like technika, 'technology' or 'technique'). In traditional surnames, other digraphs may occur as well, both for vowels and consonants.
The first principle is that the Hungarian writing system is phonemic by default, i.e. letters correspond to phonemes (roughly, sounds) and vice versa. In some cases, however, vowel length or consonant length does not match between writing and pronunciation (e.g. szúnyog [suɲoɡ] 'mosquito', küzd [ky:zd] 'fight', állat [a:lɒt] 'animal', egy [eɟ:] 'one').
Suffixed or compound words usually obey the second main principle, word analysis. It means that the original constituents (morphemes) of a word should be written the same way, regardless of pronunciation assimilations. This, however, is only true when the resulting pronunciation conforms to some regular pattern; irregular assimilations are reflected in writing too. For example, hagy + j ('you should leave [some]') is pronounced like "haggy", but written as hagyj according to the principle of word analysis. This is because the composition of gy and j gives a long gy in Hungarian phonology anyway, so spelling out the original morphemes is considered clearer. By contrast, hisz + j ('you should believe') is pronounced "higgy" and also written as higgy, since this pronunciation cannot be regularly deduced from the morphemes and basic phonological rules. Compound words are generally written so that all constituents retain their spelling, but some compounds have become vague enough not to be considered true compounds any more, especially if one of the elements is obsolete. An example is kesztyű 'glove', which originally comes from kéz 'hand' and an obsolete tyű; in this case the spelling no longer reflects the derivation.
The third principle, tradition, affects for example surnames, whose spelling often predates the modern spelling rules of Hungarian. For example, kovács 'smith' may be spelt Kovács, Kováts or Kovách as a surname. Another example for tradition is that the digraph ly is still used despite the fact that it stands for the same sound as j in today's standard Hungarian.
The fourth principle (simplification) only affects a handful of cases. If a common noun ending in a double consonant has a suffix beginning with the same consonant, the third instance is dropped, e.g. toll + lal > tollal. This rule extends to Hungarian given names, e.g. Bernadett + től > Bernadettől 'from Bernadett.' On the other hand, compounds and suffixed proper names (excluding Hungarian given names) containing three consecutive identical consonants preserve all three, but a hyphen is also inserted (e.g. sakk-kör 'chess group', Wittmann-né 'Mrs. Wittmann', Bonn-nal 'with Bonn'). The simplification principle is also applied to double digraphs at the border of suffixes, thus sz + sz becomes ssz and cs + cs becomes ccs (e.g. Kovács + csal > Kováccsal 'with Kovács'). However, there is no simplification in compounds: e.g. kulcscsomó 'bunch of keys'. In case of suffix-like derivational elements such as -szerű and -féle '-like', simplification can only be applied to words ending in a single digraph, e.g. viasz + szerű > viasszerű 'wax-like', but not to their doubled forms: dzsessz + szerű > dzsessz-szerű 'jazz-like'.
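The suffixation cases of the simplification principle can be sketched in code. This is a minimal sketch, not a complete implementation: the function name `join_suffix` and its `hyphenate` flag are hypothetical, and the caller is assumed to know whether the stem belongs to a class that keeps all three consonants (a proper name other than a Hungarian given name, or a doubled digraph before a suffix-like element such as -szerű). Compounds such as kulcscsomó are outside its scope.

```python
# Hungarian multi-letter consonants, longest first so "dzs" wins over "dz".
MULTIGRAPHS = ("dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs")


def first_letter(text):
    """Return the first Hungarian letter of text (a multigraph or one character)."""
    low = text.lower()
    for m in MULTIGRAPHS:
        if low.startswith(m):
            return m
    return low[:1]


def join_suffix(stem, suffix, hyphenate=False):
    """Attach suffix to stem, applying the simplification principle.

    hyphenate=True is a hypothetical switch for the classes that keep
    all three consonants and take a hyphen instead (e.g. Bonn-nal).
    """
    c = first_letter(suffix)
    rest = suffix[len(c):]
    doubled = c[0] + c  # "l" -> "ll", "sz" -> "ssz"
    low = stem.lower()
    if low.endswith(doubled):
        # The stem already ends in the long consonant.
        if hyphenate:
            return stem + "-" + suffix
        return stem + rest  # drop the third instance
    if len(c) > 1 and low.endswith(c):
        # Single digraph meets the same digraph: cs + cs -> ccs.
        return stem[:-len(c)] + doubled + rest
    return stem + suffix


print(join_suffix("toll", "lal"))
print(join_suffix("Bonn", "nal", hyphenate=True))
print(join_suffix("Kovács", "csal"))
```

The flag is left to the caller because the choice between dropping a consonant and inserting a hyphen depends on the word class, which cannot be read off the spelling alone.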
Compound words are typically spelt as one word (without spaces) and phrases are normally spelt as more than one word (with one or more spaces), but this is not always the case. Hyphenated spelling is considered an alternative to writing as one word and is used, e.g., if a compound contains a proper name.
As far as repeated words are concerned, they are normally written separately (with a comma), but a hyphen is used if their connection is more than occasional (e.g. ki 'who' but ki-ki 'everyone'). If a word is repeated with a different suffix or postposition, the words are written separately (napról napra 'day by day', lit. 'from day to day'), except if an element only exists in this phrase, in which case the words are written with a hyphen (régi 'old' réges-régi 'ancient old').
Coordinated words are normally written separately (with a comma). If the meaning of the result is different from that of the two words together, but both elements take suffixes, they are written with a hyphen (e.g. süt-főz 'cook', consisting of words referring to cooking in the oven and cooking in water, sütnek-főznek 'they cook'). A hyphen is needed in cases when a phrase is only used with certain suffixes. Connections of words which are completely fused and thus take suffixes only at the end of the second element are written as one word (e.g. búbánat 'sorrow and grief', búbánatos 'stricken with sorrow and grief'). However, there are phrases that only take suffixes at the end but their elements are still connected with a hyphen, as when words are contrasted (e.g. édes-bús 'bittersweet'). Certain phrases can be suffixed either at the end of both elements or only at the end of the second element (e.g. hírnév 'fame': hírneve or híre-neve 'his/her/its fame').
As shown by printed material and street inscriptions, this field is probably the most problematic for the majority of native speakers even at a reasonably educated level. The main principle is that these compounds have to be written without spaces if any of these three criteria are met:
This applies to phrases and compounds of many types, like those where the first element is the subject of the second (which is a participle), or it is the adjective of the second (e.g. gyors vonat means 'fast train', while gyorsvonat means 'express train' as a type of train: the change in meaning makes it necessary to write the latter as one word).
As far as suffix omission is concerned, often there is a grammatical relationship between two nouns of a compound which could also be expressed in a marked, more explicit way: for example ablaküveg 'window pane' could be expressed as az ablak üvege 'the pane of the window,' and based on this derivation, it needs to be written as one word. The word bolondokháza 'confusion, turmoil' also needs to be written as one word, despite the marked possessive, so as to avoid the literal meaning 'house of fools' (1st case). Other compounds, where the first element gives the object, the adverb, or the possessor, are also written in one word where the suffix is omitted, or if the actual meaning is different from the sum of its elements. Thus, szélvédett 'wind-protected' can be deduced from széltől védett 'protected from [the] wind', and it is written together as the suffix -től is omitted. Verbal phrases where the suffix is marked are usually written in two words, even if the meaning has become figurative (e.g. részt vesz 'take part'), while other phrases with a marked suffix are written in one word (e.g. észrevesz 'notice, spot', literally "take on the mind").
Verbal prefixes (cf. Vorsilben in German) are only written together with the verb they belong to if they immediately precede that verb. If the same verbal prefix is repeated to express repeated action, the first is divided by a hyphen, the second is written in one word (meg-megáll 'keep stopping once in a while'). If two verbal prefixes with an opposite meaning follow each other, both are written separately (le-föl sétál 'walk up and down'). Verbal prefixes may be written separately if the meaning of the prefix is stressed and the prefix is meant in a literal sense, but they must be written as one word if the meaning is changed (e.g. fenn marad 'stay upstairs' but fennmarad 'survive, remain'). Some verbal prefixes coincide with adverbs that can have personal endings. In this case, they can only be written as one word if they are in the third person plural and the prefix/adverb is not stressed on its own (especially if the meaning is changed). Otherwise (if another person is used and/or the prefix/adverb is stressed) they should be written in two words.
A separate group of compounds with subordinated elements is the one named literally "meaning-condensing" or "meaning-compressing" compounds, which have a more complex internal structure, containing implicit elements outside the constituting words, or sometimes where the present meaning cannot be derived at all from the elements. They are always written in one word, e.g. csigalépcső 'spiral staircase,' lit. "snail-staircase", i.e. a staircase similar to the shell of snails.
Phrases whose first element is a participle are written separately if the participle expresses an occasional action: dolgozó nő 'a working woman, a woman at work.' However, if the participle expresses function, purpose, ability, task, or duty, the phrase is considered a compound and is written as one word, e.g. mosónő 'washerwoman', someone whose duty is to wash. Sétálóutca 'walking [pedestrian] street' means a street for walking: writing as one word expresses that it is not the street that walks. However, this rule doesn't apply to compounds where an element is already a compound itself, even if the whole compound expresses function or purpose. For example, rakétaindító állvány 'rocket launching platform' is written as two words because of its compound first element, despite the fact that it is not the platform which launches the rocket, but it is only used for it, so a function is expressed.
If a phrase (e.g. an adjective and a noun or a noun and a postposition) written in two words receives a derivational suffix, it will also be written in two words - except if the meaning is changed. However, if they receive a second derivational suffix, the phrase will be written in one word. (For example: egymás után 'one after the other', egymás utáni 'successive', but egymásutániság 'successiveness,' i.e. 'succession.' In addition: föld alatt 'under the ground', föld alatti 'being under the ground' but földalatti 'underground <movement>' or 'subway, tube.')
Appositional compounds are normally written in two words, e.g. 'a footballer wife' (a wife who plays football) is expressed as futballista feleség. However, if there is a possessive relationship between the words, i.e. if the wife of a footballer is meant, it is considered a (regular) compound, thus it should be written as one word: futballistafeleség. There are several appositional compounds though, which are written as one word, especially where the first element specifies the type of the second (e.g. diáklány 'student girl').
Words containing a suffixed numeral are written as one word (e.g. húszméteres út 'a twenty metres long way,' cf. húsz méter 'twenty metres'), except if an element is already a compound (e.g. huszonegy méteres út 'a twenty-one metres long way' or húsz kilométeres út 'a twenty kilometres long way'). This rule doesn't apply to compounds with numbers written in digits, e.g. 20 méteres út, as they are written with spaces. A similar principle is applied to compounds whose first element expresses the material of the second (e.g. faasztal 'wooden table' but fenyőfa asztal 'pine-wood table' and fa konyhaasztal 'wooden kitchen table').
To avoid overly long words, a "syllable-counting rule" is applied. Compounds with more than 6 syllables (excluding all inflectional suffixes) and more than 2 elements take a hyphen at the border of the two main elements. For example, labdarúgócsapataitokkal 'with your [PL] football teams' has 10 syllables, but its stem, labdarúgócsapat is only 6 syllables long, so all its forms are written as one word. On the other hand, labdarúgó-bajnokság 'football championship' has 7 syllables even in its base form, so all its forms should take a hyphen. Compounds of whatever length are permitted, supposing they consist of only two elements, e.g. nitrogénasszimiláció 'nitrogen assimilation' is written as one word despite its 9 syllables. Sometimes adding a single letter (a short suffix, in fact) may induce a hyphen, e.g. vendéglátóipar 'catering industry' is written as one word, but vendéglátó-ipari 'catering industry related' will take a hyphen in accordance with the above rules.
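The syllable-counting rule lends itself to a short illustration. The sketch below assumes the compound has already been split into its elements and stripped of inflectional suffixes, and that the caller supplies the position of the main boundary; the names `count_syllables`, `join_compound` and `main_split` are hypothetical. Syllables are counted as vowel letters, which is enough for native spellings such as the examples in this section.

```python
VOWELS = set("aáeéiíoóöőuúüű")


def count_syllables(word):
    """Count syllables as the number of vowel letters."""
    return sum(ch in VOWELS for ch in word.lower())


def join_compound(elements, main_split):
    """Join stem elements into a compound.

    elements   -- constituents without inflectional suffixes
    main_split -- index of the element that starts the second main part
    A hyphen is used only if there are more than 2 elements AND the
    joined stem has more than 6 syllables.
    """
    stem = "".join(elements)
    if len(elements) > 2 and count_syllables(stem) > 6:
        left = "".join(elements[:main_split])
        right = "".join(elements[main_split:])
        return left + "-" + right
    return stem


print(join_compound(["labda", "rúgó", "csapat"], 2))     # 6 syllables
print(join_compound(["labda", "rúgó", "bajnokság"], 2))  # 7 syllables
print(join_compound(["nitrogén", "asszimiláció"], 1))    # only 2 elements
```

Because the derivational suffix in ipari counts toward the total, `join_compound(["vendég", "látó", "ipari"], 2)` crosses the threshold while the same call with ipar does not.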
Sometimes word boundaries are flexibly rearranged to reflect the meaning of the whole compound: the three rules dealing with it are referred to as "mobility rules".
The following types of proper names are distinguished: personal names, animals' names, geographical names, astronomical names, names of institutions, brand names, names of awards and prizes, and titles (of works).
Proper names may become common names, and in this case they are written in lowercase (e.g. röntgen 'x-ray') and even their derived compounds may become lowercase, losing the hyphen (e.g. ádámcsutka rather than *Ádám-csutka 'Adam's apple').
Surnames and given names are capitalized. Surnames may have an old-fashioned spelling, which is usually retained - except if their form already has variations, and some of them may interfere with reading. They may consist of two or more elements, and they may be given as one word or in several words, but today hyphenation is the most common method. Given names are written phonetically (even modern names like Dzsenifer, cf. English Jennifer), except that x and ch are retained (even though they are pronounced ksz and h), e.g. Richárd, Alexandra.
Names of gods and religious figures are capitalized, except when they are referred to as common names (like Greek gods) or if they are mentioned as part of common phrases (e.g. hála istennek 'thank God').
Occasional epithets are not capitalized: only their fixed equivalents are. Common nouns expressing rank or relation are written separately (István király 'King Stephen', Németh mérnök 'Mr Németh, engineer'). Groups of people named after people (or even a fancy name) are written separately, except for groups founded or led by that person (in which case it is a compound, written with a hyphen).
Suffixes are added to personal names without hyphens. If a suffix is attached, it follows the pronunciation of the word, including obsolete consonant clusters (e.g. Móricz, pronounced ['mo:rits], suffixed: Móriczcal). However, if a surname or a foreign name ends in a double consonant, suffixes are added with a hyphen, so that the original form can be restored (e.g. Papp is suffixed as Papp-pal, because Pappal would refer to another name, Pap). However, given names are suffixed in a simplified way, because they are from a limited set, so their original forms can be retraced (e.g. Bernadett + tel > Bernadettel).
If an adjective is formed from a proper name, it is not capitalized. (In case of a hyphenated compound, no element is capitalized, e.g. Rippl-Rónai but rippl-rónais 'typical of Rippl-Rónai'.) Suffixes are added directly, except if the name consists of several elements written separately: Széchenyi István and Széchenyi István-i. Compounds formed with personal names are always hyphenated, e.g. Ady-vers 'a poem by Ady'.
An exception to the hyphenation of compounds with a proper name is when the proper name contains an uncapitalized common noun. For example, if there is a monastery (kolostor) named after Jeremiás próféta 'the Prophet Jeremiah', the compound Jeremiás próféta kolostor cannot have the usual hyphen, as it would falsely suggest a closer relationship between próféta and kolostor. (If all the elements were common nouns, the case would be simpler, as the above mobility rules could be applied.)
Animals' names are capitalized, and if the species is added, it is written in lowercase, without a hyphen.
The two most important questions about geographical names are whether a name should be written in one word, with a hyphen, or in separate words, and which elements should be written uppercase and lowercase. Different written forms may refer to different entities, e.g. Sáros-patak lit. 'muddy river' refers to a river, but Sárospatak refers to a city (because rivers' names are written with a hyphen, but city names are written as one word). This field is considered one of the most complex parts of Hungarian orthography, so a separate volume has been published about it, and a separate board (Földrajzinév-bizottság) working in the Ministry of Agriculture is entitled to give statements. It consists of experts in linguistics, education, transportation, hydrology, natural protection, public administration, ethnic minorities, foreign relations, and other fields.
Apart from single-element names, country names with -ország, -föld, -alföld or -part ('country', 'land', 'plain', 'coast') and most regions are written in one word, as well as Hungarian settlements and their districts ("towns") and quarters, and even Hungarian names outside Hungary. The adjective-forming suffix -i (sometimes -beli) is attached directly to the name. If it already ends in -i, this ending is not repeated.
If a geographical name contains a common geographical expression (river, lake, mountain, island etc.) or another common noun or an adjective, the compound is written with a hyphen (e.g. Huron-tó 'Lake Huron' or Új-Zéland 'New Zealand'). When these forms are converted into an adjective, only those elements are left capitalized which are actual proper names themselves (Kaszpi-tenger and Kaszpi-tengeri 'Caspian Sea', but Új-Zéland and új-zélandi: zélandi is not considered a proper name because it carries the adjectival suffix). The same rule is applied to compounds with three or more elements, although compounds with more than four elements are simplified (lower-ranked hyphens are removed).
An en dash is used to express a relation between two places, and its adjectival form becomes completely lower-case (e.g. Moszkva–Párizs 'Moscow–Paris [route]' and moszkva–párizsi 'of the Moscow–Paris route'). However, if a higher-ranked connected element becomes an adjective, the geographical proper names will retain the upper case (e.g. Volga–Don-csatorna 'Volga–Don canal' vs. Volga–Don-csatornai), except when the elements of the name contain adjectives or common nouns, which will become lower-case (e.g. Cseh–Morva-dombság 'Bohemian–Moravian Highlands' vs. cseh–morva-dombsági).
All elements are written separately (excluding the above-mentioned names that are written as one word or with a hyphen) in current and historical country names and geographical-historical region names. Their adjectival forms are all written with lower case. (For example, Egyesült Királyság 'United Kingdom' vs. egyesült királysági 'from/of the U.K.', Dél-afrikai Köztársaság 'South African Republic' vs. dél-afrikai köztársasági but San Marino Köztársaság 'Republic of San Marino' vs. San Marino köztársasági).
Only the first element is capitalized in subnational entities like counties, areas, districts, neighbourhoods. When forming an adjective, this uppercase letter is only kept if this element is a proper name, e.g. New York állam 'State of New York' vs. New York állami. However, if the first element of such an entity is a common noun or an adjectival form, all elements are written lower case (e.g. in names of local administrative units like Váci kistérség vs. váci kistérségi).
Names of public spaces (roads, streets, squares, bridges etc.) are written separately (except for elements that are already compounds or hyphenated). Their first element is capitalized, and this capitalization is kept even in the adjectival forms, e.g. Váci utca 'Váci Street' and Váci utcai.
If a common name is added to a geographical name to clarify its nature, it is written separately.
If a geographical name consists of several elements whose relationship is marked by suffixes or postpositions, these elements are also written separately. The uppercase letter of the beginning element is kept even in an adjectival form.
The above case of Jeremiás próféta kolostor emerges again with the type of Mária asszony sziget 'Lady Mary Island', where sziget 'island' would normally be connected with a hyphen, were it not for the common noun asszony 'lady' in the original name, which makes it impossible, so all elements have to be written separately.
Stars, constellations, planets, moons are written with an initial capital, e.g. Föld 'Earth', Tejút 'Milky Way', especially as astronomical terms. In everyday usage, however, names of the Earth, the Moon, and the Sun are normally written in lowercase (föld körüli utazás 'a journey around the Earth').
Names of offices, social organizations, educational institutions, academic institutes, cooperatives, companies etc. are written capitalizing all elements except conjunctions and articles. In adjectival forms, only actual proper names and fancy names are left uppercase. For example, Országos Széchényi Könyvtár 'National Széchényi Library' vs. országos Széchényi könyvtári.
If a part of the institution name stands for the whole name, its upper case form is preserved if it is a specific keyword of the name. However, if a common noun part is used for the whole name, it is written in lower case (except for Akadémia for the Hungarian Academy of Sciences and Opera for the Hungarian State Opera House).
Subordinated units of institutions are written in uppercase if they are major divisions (e.g. Földrajzi Társaság 'Geographical Society', under the Hungarian Academy of Sciences), not including the personnel department or the warden's office.
Railway stations, airports, cinemas, restaurants, cafés, shops, baths and spas, cemeteries etc. are considered less typical institutions, so only their actual proper name elements (including possible fancy names) are written in upper case, apart from the first word. Their adjectival forms retain the original case. For example, Keleti pályaudvar 'Eastern Railway Station' vs. Keleti pályaudvari; Vén Diák eszpresszó 'Old Student Café' vs. Vén Diák eszpresszóbeli.
Names of products, articles, makes, and brands are written capitalized, e.g. Alfa Romeo. This does not include names which include the material or origin of the product, e.g. narancsital 'orange juice'. If the word showing the type is added to the name for clarification, it is done with a space, and in lowercase, e.g. Panangin tabletta 'Panangin pill'.
Words denoting a prize, an award, a medal etc. are attached with a hyphen to proper names, e.g. Kossuth-díj 'Kossuth Prize.' If the name consists of several elements, whose relation is marked, all the elements are capitalized, e.g. Akadémiai Aranyérem 'Golden Medal of the Academy.' Degrees and types of awards are written in lowercase.
Titles are classified as constant and individual titles: the first being the title of newspapers, periodicals, magazines, and the second used with literary, artistic, musical, and other works, articles etc.
All elements of constant titles are written in uppercase (e.g. Élet és Tudomány 'Life and Science' [weekly]), while only the first word is capitalized in individual titles (e.g. Magyar értelmező kéziszótár 'Defining Desk Dictionary of the Hungarian Language' or Kis éji zene 'A Little Night Music').
Suffixes are attached to titles without a hyphen, except if a title already ends in a suffix or a punctuation mark, or if the suffix creates an adjective: in these cases, a hyphen must be used. (For example: a Magyar Hírlapban 'in Magyar Hírlap' but Magyar Hírlap-szerű 'Magyar Hírlap-like.')
Names of national and religious holidays, celebrations, notable days, periods, historical events are not capitalized (nor day or month names), neither are names of nationalities and ethnicities, languages and language groups as well as religions. Events, programmes, and arrangements are not capitalized either, except if they have an institutional background.
Apart from personal names, common nouns expressing rank or relation may also be capitalized in addresses for reasons of politeness. Suffixes and titles like Doctor, Junior, Senior, and their abbreviations are only capitalized if they are in a prominent position (e.g. in postal addresses or lists).
Foreign words either retain their foreign spelling or they are phonetically respelled according to the Hungarian writing system.
If a word comes from a language using the Latin script, it is only respelled if it has become an integral, widely known part of the Hungarian language (e.g. laser > lézer; manager > menedzser). If it is less widely used, it retains its original spelling, e.g. bestseller, myocarditis, rinascimento. But there is no hard and consistent rule, and many widely used terms are written in the original spelling, e.g. musical or show. Certain phrases from foreign languages are always written in their original form, even if the individual words would be respelled in isolation, e.g. tuberkulózis cf. tuberculosis bronchialis.
Some features of the original spelling are sometimes retained, e.g. football > futball (pronounced "fudbal"), million > millió (pronounced "milió"). The digraph ch is preserved if it is pronounced [h]. The letter x, if pronounced "ksz", is usually written x in Hungarian too. However, if it is pronounced "gz", it is normally written gz, again with a few exceptions. The letters qu are always respelled as kv.
If the source language uses a non-Latin script (Greek, Russian, Chinese etc.), words are respelled phonetically. This does not always mean exact transliteration: sometimes the foreign pronunciation is bent to conform to Hungarian phonology better (e.g. szamovár, tájfun 'samovar', 'typhoon'). In practice, however, English transliterations are often used instead (e.g. gyros instead of gírosz).
Proper names from languages with a Latin alphabet are normally written in the original way, e.g. Shakespeare, Horatius, Chopin, including all the diacritics (e.g. Molière, Gdańsk).
Certain foreign proper names have a Hungarian version, e.g. Kolumbusz Kristóf for Christopher Columbus (in the Eastern name order, typical of Hungarian). Other names adapted the given name and the word order to Hungarian customs, but left the surname intact, e.g. Verne Gyula for Jules Verne. Recently borrowed names are no longer modified in Hungarian. The only exceptions are some given names which can only be written in Hungarian spelling, e.g. Krisztián for Christian and Kármen for Carmen.
As with common nouns, ch and x are retained in both personal names and geographical names of foreign origin (e.g. Beatrix, Mexikó). Similarly to common names again, widely-known and fixed forms of proper names from languages with a non-Latin script are preserved (e.g. Ezópus (Aesop), Athén, Peking), rather than introducing a more up-to-date or more accurate transliteration (e.g. Aiszóposz, Athénai/Athína, Pejcsing). Some well-established foreign names have a popular form used in phrases and another referring to the person (e.g. Pitagorasz tétele 'Pythagorean theorem' but Püthagorasz for the philosopher himself).
Suffixes are added directly in most cases. The -i suffix is omitted in writing if the word already ends in the letter i (e.g. Stockholm > stockholmi; Helsinki > helsinki). In the case of suffixes of variable forms depending on Hungarian vowel harmony rules, the version in accordance with the actual pronunciation should be used. If a certain suffix requires lengthening of the word-final vowels a, e, o, ö, they are lengthened as usual, e.g. Oslo but Oslóban, oslói. In addition, suffixes will follow the pronunciation of the word in terms of the ending consonant and the front or back vowels (e.g. Bachhal 'with Bach', Greenwichcsel 'with Greenwich').
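As a small illustration of the -i rule, the sketch below covers only the cases named in this paragraph: the suffix is not repeated after a final i, a final o or ö is lengthened, and the result is lowercase. The function name `i_adjective` is hypothetical. It is an assumption of this sketch that other final vowels stay unchanged before -i, and it deliberately rejects multi-element names and ignores silent final letters, which follow other rules.

```python
# Final vowels lengthened before -i in this sketch (cf. Oslo > oslói).
LENGTHENED = {"o": "ó", "ö": "ő"}


def i_adjective(name):
    """Form the lowercase -i adjective of a single-element place name."""
    if not name or " " in name or "-" in name:
        raise ValueError("only single-element names are handled here")
    base = name.lower()
    if base.endswith("i"):
        return base  # the -i is not written twice
    if base[-1] in LENGTHENED:
        base = base[:-1] + LENGTHENED[base[-1]]
    return base + "i"


for place in ("Stockholm", "Helsinki", "Oslo"):
    print(place, ">", i_adjective(place))
```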
If the last letter of a foreign word is silent (not pronounced) or part of a complex cluster of letters, a hyphen is used when attaching suffixes (e.g. guillotine-nal 'with a guillotine', Montesquieu-vel 'with M.'). If an adjective is formed from a proper name with only one element, it will be lowercase (e.g. voltaire-es 'Voltaire-esque').
A hyphen is also used if an adjective is formed from a multiword name (e.g. Victor Hugó-i 'typical of V. H.', San Franciscó-i 'S. F.-based'). The last vowel is lengthened even in writing if it is pronounced and it is required by phonological rules. If the suffix begins with the same letter as a word-final double letter, a hyphen is used again (e.g. Grimm-mel 'with Grimm').
Hyphenation at the end of a line depends on whether there is an easily recognizable word boundary there. If the word is not a compound (or it is, but the boundary is not nearby) the word is hyphenated by syllables, otherwise by word elements (e.g. vas-út 'railway', lit. 'iron-road,' instead of *va-sút).
The number of syllables is defined by the number of vowels (i.e., every syllable must contain one and only one vowel) and the main rule can be summarized as follows: a syllable can begin with at most one consonant (except for the first syllable of a word, which may contain up to three initial consonants). It means that a syllable can only begin without a consonant if there is no consonant after the preceding vowel (e.g. di-ó-nyi 'nut-sized'), and if there are multiple consonants between vowels, only one can go to the next syllable (e.g. lajst-rom 'list').
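The main rule can be sketched as a short routine. This is a simplified illustration under stated assumptions, not a full hyphenator: the names `letters` and `syllabify` are hypothetical, multigraphs are recognised greedily from left to right (which can misread some letter sequences), and compound boundaries, doubled digraphs and pronunciation-based exceptions in names and foreign words are not handled.

```python
VOWELS = set("aáeéiíoóöőuúüű")
# Multi-letter consonants, longest first so "dzs" wins over "dz".
MULTIGRAPHS = ("dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs")


def letters(word):
    """Split a word into Hungarian letters, keeping multigraphs together."""
    out, i, low = [], 0, word.lower()
    while i < len(word):
        for m in MULTIGRAPHS:
            if low.startswith(m, i):
                out.append(word[i:i + len(m)])
                i += len(m)
                break
        else:  # no multigraph starts here
            out.append(word[i])
            i += 1
    return out


def syllabify(word):
    """Split by syllables: each later syllable starts with at most one consonant."""
    toks = letters(word)
    vowel_pos = [k for k, t in enumerate(toks) if t.lower() in VOWELS]
    if len(vowel_pos) < 2:
        return [word]
    cuts = []
    for prev, cur in zip(vowel_pos, vowel_pos[1:]):
        # Adjacent vowels: cut between them. Otherwise the single
        # consonant before the vowel moves to the new syllable.
        cuts.append(cur if cur == prev + 1 else cur - 1)
    parts, start = [], 0
    for c in cuts:
        parts.append("".join(toks[start:c]))
        start = c
    parts.append("".join(toks[start:]))
    return parts


print("-".join(syllabify("diónyi")))
print("-".join(syllabify("lajstrom")))
```

Treating ny as one token is what keeps it together at the start of the last syllable of diónyi; a character-by-character split would break the digraph.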
Hyphenation normally follows pronunciation, rather than the written form. If a word contains several vowel letters but they are pronounced as a single sound, it cannot be hyphenated (e.g. Soós 'a surname', blues 'blues'). Pronunciation is respected in the case of ch, which is pronounced as a single sound so both its letters are kept together (e.g. pszi-chológia, züri-chi 'from Zürich'). Hungarian surnames are also hyphenated by pronunciation, e.g. Beöthy > Beö-thy [pr. bő-ti], Baloghék 'the Balogh family' > Ba-lo-ghék [pr. balog], móri-czos 'typical of Móricz' ['mo:rits]. The same principle applies to foreign common names and proper names, e.g. Ljub-lja-na, Gior-gio, Fi-scher for consonants (because lj, gi, and sch denote single sounds) and Baude-laire, Coo-per for vowels. Even acronyms can be hyphenated if they contain at least two vowels (e.g. NA-TO) or at the boundary of the acronym and the suffix, where a hyphen already exists (e.g. NATO-ért 'for the NATO').
On the other hand, x denotes two sounds, but it is not separated at the boundary of two syllables (e.g. ta-xi rather than *tak-szi, based on phonetics). Long double consonants are separated and their original forms are restored if they are at the boundary of two syllables (e.g. meggyes 'cherry-flavoured' > megy-gyes). Although not incorrect, it is not recommended to leave a single vowel at the end or the beginning of a line (e.g. Á-ron, Le-a). Double vowels can be separated (e.g. váku-um 'vacuum'), and long consonants can also be separated (e.g. ton-na 'ton'). Inflectional suffixes are not considered elements on their own (e.g. although the stem of pénzért 'for money' is pénz, its hyphenation is pén-zért rather than *pénz-ért).
Apart from the hyphenation based on pronunciation, foreign compounds may be hyphenated at their boundary, if the prefix or suffix is widely recognized, e.g. fotog-ráfia (by syllables) or foto-gráfia (by elements). The elements are also taken into consideration in compound names (e.g. Pálffy [pr. pálfi], hyphenated as Pál-ffy, rather than *Pálf-fy). Sometimes different ways of hyphenation reflect different words (e.g. me-gint 'again,' a single word hyphenated by syllables, cf. meg-int 'admonish,' a compound with a verbal prefix, hyphenated by elements). Hyphens are not to be repeated at the beginning of the next line, except in specialized textbooks, as a way of warning for the correct form.
Punctuation marks are added to the end of the sentence depending on its intended meaning. The exclamation mark is not only used for exclamations, but also for wishes and commands. If the sentence formally reflects one mood, but it actually refers to a different idea, the punctuation mark is selected based on the actual meaning. Punctuation marks may be repeated or combined to express an intense or mixed emotion. (For example, Hogy képzeled ezt?! 'How do you dare?!')
In case of coordinating clauses, the punctuation mark is adapted to the ending clause. Subordinated clauses take a punctuation mark reflecting the main clause - except if the main clause is only symbolic, emphasizing the subordinate clause.
A comma (or a colon, semicolon etc.) should be placed at the border of clauses whether or not there is a conjunction. It also applies to cases when the clause begins with one of the conjunctions és, s, meg 'and' and vagy 'or'. However, it is sometimes difficult to assess whether the part joined with these conjunctions is a separate clause (because if it is not, no comma is needed). For example: Bevágta az ajtót, és dühösen elrohant. 'He banged the door and rushed away in fury.' but Hirtelen felugrott és elrohant. 'He suddenly jumped up and rushed away.'
Similes introduced with the word mint 'as, like' are to be preceded by a comma. The exception is a kind of a 'more than' construction that has a mere intensifying function (as opposed to 'practically' or 'almost'). In case of a double conjunction expressing 'instead of (doing)', 'without (doing)' etc., only the first element should be preceded by a comma - except if the first element closely belongs to the first clause, in which case the comma is placed between the two conjunctions.
Semicolons are generally used to separate sets of closely connected clauses, if these larger sets of clauses are loosely connected to each other. A semicolon may also be used to mark that two single clauses have but a loose relation to each other.
Colons attract the attention to a forthcoming idea, or they may be used to mark that an important explanation or conclusion follows. If a clause introduces several separate sentences, all of them (including the first) are written with an uppercase initial.
To express that a fairly distinct set of ideas follows, a dash may be used after the full stop, the question mark, or the exclamation mark.
Coordinated clause elements are separated by commas if no conjunction is used. (A semicolon may be used to separate series of words whose elements are separated by commas.) If a conjunction is used between coordinated clause elements, a comma is used before it, except if the conjunction is one of the words és, s, meg 'and' or vagy 'or,' where the comma is omitted. Since the abbreviation stb. 'etc.' includes the conjunction s 'and,' it doesn't need a comma either. For example: tetszetős, de helytelen elmélet 'an appealing but incorrect theory,' a rózsának, a szegfűnek vagy a levendulának az illata 'the scent of a rose, a carnation, or a lavender.'
If a coordinated sentence element is mentioned at the end of the whole clause, separated from the related elements, in a postponed manner, it is separated from the rest of the clause with a comma. For example: Ernyőt hozzál magaddal a kirándulásra, vagy kabátot! 'Bring an umbrella to the excursion, or a raincoat.' Coordinated structures formed with coupled conjunctions (e.g. "either - or") are written with a comma placed before the second conjunction.
Appositions are separated from the referred element with a comma (or a colon), if they are in the same grammatical position as the referred element. If the apposition gets further back in the sentence, the comma will precede it directly. If the apposition is followed by a pause in speech, a comma may be placed after it, too. If a descriptive phrase is added to a personal name but only the last part takes the suffixes (in which case it is not called an apposition), no comma is used after the personal name. For example: Nagy Elemérnek, városunk díszpolgárának 'to Elemér Nagy, honorary citizen of our town' - because of the possessive structure, both elements take the suffixes, and the second part can only be an apposition, so a comma is needed. On the other hand: Nagy Elemér díszpolgárnak 'to Elemér Nagy honorary citizen' - the whole structure takes one suffix at the very end, thus it cannot be appositive, and no comma is used. If the apposition or the referred element is a derivative of the word maga ("himself" etc.), the comma is not used. However, adverbs used like appositions take the comma.
Subordinated clause elements take no comma (e.g. fekete szemüveges férfi 'a man with black glasses' - the word fekete 'black' doesn't belong to férfi 'man' but to szemüveg 'glasses'). If the word mint 'as' precedes a phrase expressing status or quality, no comma is used before it (e.g. Bátyámat mint tanút hallgatták ki. 'My brother was heard as a witness.') Structures formed with an adverbial participle are not usually separated from the clause with a comma, especially if the participle is directly connected to it. However, if this part is loosely attached to the clause (especially if the participle has its own complement), it is recommended to use a comma.
Words or phrases (especially external elements) interposed into a sentence are marked with commas, dashes (with spaces), or parentheses. For example: Bátyámat, a baleset tanújaként, többször is kihallgatták. or Bátyámat - a baleset tanújaként - többször is kihallgatták. or Bátyámat (a baleset tanújaként) többször is kihallgatták. 'My brother, as the witness of the accident, was heard several times.' The comma may be omitted around interposed elements depending on the articulation, reflecting the intention of the author, e.g. A vonat, persze, megint késett. 'The train was, of course, late again.' can be written without commas as well. If the conjunction mint 'as' precedes an interpolation separated by pauses in speech, commas may be used before and after the interjected part. Subordinated clauses are also separated by commas, dashes, or parentheses if they are interposed into another clause. Évi, bár még át tudott volna szaladni az úttesten, hagyta elmenni a teherautót. 'Eve, though she could still have run across the road, let the truck pass.'
If a word, phrase, or clause is interposed into a sentence right next to a punctuation mark, this mark needs to be inserted after the pair of dashes or parentheses. For example: Műszaki egyetemen szerzett diplomát - vegyészmérnökit -, de író lett. 'He graduated from a technical university - as a chemical engineer - but he became a writer.' However, if an independent sentence is interposed, its punctuation mark is inserted inside the parentheses.
Forms of address are usually followed by an exclamation mark, e.g. Kedves Barátaim! 'My dear friends,' or a comma can be used in private letters. If this form stands within a sentence, it is separated from the rest with commas.
Quotation marks are placed below at the beginning and above at the ending of a quotation, both signs turning left, being curly and double. If another quotation is included in a quotation, angle quotation marks (guillemets) are used, directed towards each other with their tips: („quote1 »quote2« quote1”).
If a quoting sentence introduces the quotation, it is preceded by a colon; the ending punctuation mark should be inserted as in the original. Lowercase initials should only be used if they are lowercase in the original. If a quoting sentence follows the quotation, they are separated by a dash (and spaces). Punctuation marks of the original text are preserved, except for the full stop, which is omitted. If the quoting sentence is interposed in the quotation itself, it is written in lowercase and separated with dashes (and spaces). The second quotation mark stands at the end of the quotation. For example: Így felelt: „Igen, tudom.” or „Igen, tudom” - felelte. or „Igen - felelte -, tudom.” '"Yes, I know," he replied.'
If the quotation is organically interwoven into one's own text, the quoted part is marked with quotation marks, and common words beginning the quotation are written in lowercase (even despite the original). For example: A tanterv szerint az iskola egyik célja, hogy „testileg, szellemileg egészséges nemzedéket neveljen”. 'According to the syllabus, one of the goals of a school is "to bring up a generation healthy in body and mind."' When quoting others' words in terms of their content, the quotation marks are not used: Alkotmányunk kimondja, hogy társadalmi rendszerünknek a munka az alapja. 'Our constitution states that our social system is based on work.' Indirect (reported) speech is treated in the same way.
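As a small illustration of the two quotation levels described above, the following Python sketch wraps text in the low-opening and high-closing double quotes at the first level and in inward-pointing guillemets at the second. The helper name `quote_hu` is hypothetical, and deeper nesting levels are not covered by the passage above, so the sketch rejects them.

```python
# Illustrative helper: Hungarian quotation marks for two nesting levels.
# Level 1 uses the low-9 opening quote and the high closing quote;
# level 2 uses guillemets whose tips point towards each other.
QUOTE_PAIRS = {
    1: ("„", "”"),
    2: ("»", "«"),
}


def quote_hu(text, level=1):
    """Wrap text in the quotation marks that belong to the given nesting level."""
    if level not in QUOTE_PAIRS:
        raise ValueError("only levels 1 and 2 are covered by this sketch")
    opening, closing = QUOTE_PAIRS[level]
    return opening + text + closing


inner = quote_hu("quote2", level=2)
outer = quote_hu("quote1 " + inner + " quote1")
```

Here `outer` holds the nested pattern given above: first-level marks around the whole quotation and guillemets around the embedded one.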
In fiction and prose, quotations are marked by dashes instead of quotation marks, placed at the beginning of a line. If the quotation is written in a separate line, the only dash is the one that precedes it. If quotation is followed by the quoting sentence, they are separated by another dash (the full stop omitted from the end, other punctuation marks retained, as described above). If a quotation is continued after the author's words, another dash follows. For example:
Interjections are preceded and followed by commas. If an interjection is followed by the emphatic words be or de 'how much,' the commas can be omitted depending on the stress and pause conditions of the sentence.
If two conjunctions follow each other (e.g. because of an interposed clause), only the first is preceded by a comma, e.g. Hívták, de mert hideg volt, nem indult útnak. 'They invited him, but as it was cold, he didn't set out.'
A hyphen is used between words and their elements in the following cases (a taxative list, partly reiterating points mentioned elsewhere):
The dash is referred to in Hungarian orthography under two names: gondolatjel (lit. "thought mark") and nagykötőjel (lit. "big hyphen"). The first form applies to cases where it separates an interposed remark, usually a clause or a phrase (see above): this one is always used with spaces on either side (or a comma and a space after it). The second one is used to connect single words with each other to create a phrase: this one is normally used without spaces. This latter dash is used between words in the following cases (a taxative list):
The ellipsis sign (...) is used to mark that an idea is unfinished (and more thoughts can be inferred from what is written), or if a part of a text has been omitted from a quotation.
Suffixes are normally attached to words directly. However, a hyphen is used in a couple of cases (a taxative list, referring to other passages of the regulation):
No full stop is needed after titles of periodicals, books, poems, articles, studies, and treatises as well as after institution names and direction signs if they are given highlighted or on their own. However, lower section titles can be inserted in a text and they can be followed by other sentences: in this case, a full stop is used after them. Question and exclamation marks can be used even in highlighted titles.
A full stop is used in the following cases:
A colon is used to highlight a phrase or sentence mentioned as an example. This sign is also used between an author's name and the title of the work, if they are given without a syntactic reference to each other. A possessive construction, however, eliminates the colon. (For example: Arany János: Toldi but Arany János Toldija 'Toldi by János Arany.')
A hyphen is used at the end of a line, when a part of a word is taken to the next line. If a word already contains a hyphen for whatever reason, it can be used at the end of the line, just like if it contains a dash.
If a part given in parentheses has a fairly close connection to the sentence, the closing punctuation mark is used after it. If the part in parentheses ends in a full stop, the punctuation mark still needs to be used after the parenthetical part.
Quotation marks may be used (though should not be overused) to express ironic or other emotional overtones. Quotation marks can be used around the titles of books, works, articles etc. - in this case, suffixes can be connected with a hyphen.
The beginning of decimal fractions is marked with a comma. Numerals of more than four digits are divided by spaces, in groups of three, counted from the back. (See more below.)
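The two conventions above (decimal comma, and three-digit groups separated by spaces once the integer part has more than four digits) can be illustrated with a short Python sketch. The function name and the `decimals` parameter are hypothetical, and an ordinary space is used as the group separator as an assumption; typeset text may well prefer a non-breaking space.

```python
# Illustrative sketch: decimal comma and space-separated three-digit groups.
def format_hungarian_number(value, decimals=0):
    """Format a number with a decimal comma and spaces between digit groups."""
    sign = "-" if value < 0 else ""
    text = f"{abs(value):.{decimals}f}"
    integer, _, fraction = text.partition(".")
    # Up to four digits are written together; longer integer parts are
    # split into groups of three, counted from the back.
    if len(integer) > 4:
        groups = []
        while integer:
            groups.insert(0, integer[-3:])
            integer = integer[:-3]
        integer = " ".join(groups)
    return sign + integer + ("," + fraction if fraction else "")
```

For instance, `format_hungarian_number(9999)` stays `9999`, `format_hungarian_number(10000)` becomes `10 000`, and `format_hungarian_number(3.14, decimals=2)` becomes `3,14`.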
The following signs and symbols are also used relatively frequently (with minor differences from English-language usage): plus (+) for addition, minus (-) for subtraction, the interpunct ( · ) for multiplication, the colon ( : ) for division, the equals sign (=) to mean equality, the percent sign (%) to express percent, the slash (/) to express alternatives or fractions, the section sign (§) to refer to sections, a combination of an upper dot, a slash, and a lower dot (⁒) to mean "please turn over," the asterisk or superscript numbers (* or ¹) to mark notes, a right double curly quote (”) to express repetition (as opposed to the ditto mark), a right single curly quote (’) to express omission, the degree symbol (°) to mark the (Celsius) degree, and the tilde (~) to express repetition or equivalence. Suffixes are connected to the percent sign, the section sign, and the degree symbol with a hyphen, and the suffix will reflect the pronounced form, with respect to assimilations and linking vowels, e.g. 3%-kal [pr. "három százalékkal"] 'by 3%.'
Abbreviations and acronyms are distinguished by whether the shortened form is only used in writing (abbreviations) or in speech as well (acronyms). Acronyms may be pronounced with the name of their letters (e.g. OTP 'National Savings Bank' [pr. ótépé]), or if possible, in full (MÁV 'Hungarian State Railways' [pr. máv]). The article preceding these forms is always adapted to the spoken form.
Abbreviations are written in one word whether they are created from single nouns, nouns with derivational suffixes, or compounds, and they are written with a full stop. If an abbreviation retains the ending of the original word, the full stop is still preserved (e.g. pság. < parancsnokság 'headquarters'). Abbreviation of phrases normally contains as many elements as the original phrase contains (e.g. s. k. < saját kezével 'by his/her own hand') but there are exceptions (e.g. vö. < vesd össze 'compare'). Case is usually kept in abbreviations (e.g. Mo. < Magyarország 'Hungary') but some abbreviations created from lowercase words use the uppercase (e.g. Ny < nyugat 'west'). Units of measurement are used in accordance with the international standard, depending on whether the sign comes from a common name (m < méter) or a proper name (N < newton after Isaac Newton). Standard forms of abbreviations are not to be altered even in full-capitalized inscriptions (ÁRA: 100 Ft 'PRICE: 100 HUF').
Some abbreviations are written without a full stop, such as names of currencies, cardinal and ordinal directions, country codes of cars, codes of country names, chemical, physical, and mathematical symbols, symbols of units, etc. The full stop can be omitted from abbreviations in encyclopedias, but they are to be explained in a legend. A full stop is not used after abbreviations whose last element is a full word (e.g. uaz < ugyanaz 'the same').
Suffixes are attached to abbreviations based on their pronunciation (even if the pronunciation is considerably different from the symbol, e.g. Fe [vas 'iron'] > Fe-sal [vassal 'with iron'], and the article, too, should reflect the pronounced form). If an abbreviation forms a compound with a full word, they are connected with a hyphen (e.g. fszla.-kivonat < folyószámla-kivonat 'statement of current account').
Acronyms are classified into two groups: those consisting only of initials (betűszók lit. 'letter-words'), and those consisting of parts of the original words (szóösszevonások 'word contractions').
The first group is divided again by whether they denote proper names (written in uppercase, e.g. ENSZ < Egyesült Nemzetek Szövetsége 'United Nations Organization', note that both letters of the digraph SZ are capitalized) or common names (written in lowercase, e.g. vb < végrehajtó bizottság 'executive committee', note that it is written as one word despite the two elements). Some acronyms created from common names are still written in uppercase, though, especially in sciences (URH < ultrarövidhullám 'ultra-high-frequency') but other capitalized acronyms may be accepted too (TDK < tudományos diákkör 'students' scholarly circle'). In some cases, full-fledged words are created from the pronounced form of acronyms standing for common names (e.g. tévé < tv < televízió).
Acronyms of the second group are created from longer parts of the original words (in fact, at least one word of the original should keep at least two letters, not including digraphs). Their letters are not all capitalized, only the initial of acronyms that derive from proper names (e.g. Kermi < Kereskedelmi Minőség-ellenőrző Intézet, 'Commercial Quality Control Institute' cf. gyes < gyermekgondozási segély 'maternity benefit').
Neither type of acronym needs a full stop between its elements or at its end.
Acronyms take suffixes in accordance with their pronounced forms, whether their letters are pronounced one by one or as a full word (e.g. tbc-s [tébécés] 'one with tuberculosis'). Those from the first group, consisting only of word initials, are suffixed with a hyphen. Their capitalized types will retain their uppercase even in their adjectival forms (ENSZ-beli 'one from the UN'), and their ending vowel letter will not be lengthened even if it would be phonologically justified (e.g. ELTE-n [eltén] 'at ELTE'). Those from the second group, however, consisting of shorter pieces of the constituting words, take suffixes without a hyphen (e.g. gyesen van 'she is on maternity leave'). The same happens to those words that were created from pronounced letters (e.g. tévézik 'watch TV'). Proper name types of these acronyms are written in lowercase if an adjective is formed out of them (e.g. kermis 'Kermi-related'). In addition, their ending vowel letter may be lengthened in accordance with general phonological rules (e.g. Hungexpo > Hungexpónál 'at Hungexpo').
Compounds are created with acronyms by the following rules: those from the first group take other elements with a hyphen (e.g. URH-adás 'UHF broadcast'), and proper name types of the second group behave the same way (e.g. Kermi-ellenőrzés 'control by Kermi'). The common name types of the second group, however, can be written as one word with other elements, except if they require a hyphen because of length (e.g. tévéközvetítés 'TV transmission').
Numerals that can be pronounced with a short word are usually written in letters, just like those having a suffix, a postposition, or another compound element. On the other hand, digits should be used in case of longer or bigger numerals, as well as to note down exact quantities, dates, amounts of money, measurement, statistical data etc.
If cardinal numbers are written in letters, they should be written as one word up to 2000 (e.g. ezerkilencszázkilencvenkilenc '1,999') and they should be divided by hyphens by the usual three-digit division over 2000 (e.g. kétezer-egy '2,001'). Numbers written in digits can be written without a space up to four digits; above that, they are divided by spaces from the end by the usual three-digit division (e.g. 9999 but 10 000). If numbers are written under each other in a column, all can be divided by spaces.
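The rule for cardinal numbers written in letters can be sketched as follows. This is a hedged illustration rather than a complete number speller: the function names are hypothetical, only the range 1 to 999 999 is handled, and the numeral forms used (for example két instead of kettő before száz and ezer) come from general knowledge of Hungarian rather than from the passage above.

```python
# Illustrative sketch: Hungarian cardinal numbers in letters, 1..999_999.
# One word up to 2000; above 2000 a hyphen separates the three-digit groups.
UNITS = ["", "egy", "kettő", "három", "négy", "öt", "hat", "hét", "nyolc", "kilenc"]
ROUND_TENS = {1: "tíz", 2: "húsz"}  # forms used when no unit follows
TENS = ["", "tizen", "huszon", "harminc", "negyven", "ötven",
        "hatvan", "hetven", "nyolcvan", "kilencven"]


def _unit(digit, attributive):
    """Return the unit word; 2 becomes 'két' when another numeral follows it."""
    return "két" if digit == 2 and attributive else UNITS[digit]


def _below_thousand(n, attributive=False):
    """Spell 0..999 as one word (0 gives an empty string)."""
    hundreds, rest = divmod(n, 100)
    tens, units = divmod(rest, 10)
    word = ""
    if hundreds:
        word += ("" if hundreds == 1 else _unit(hundreds, True)) + "száz"
    if tens and not units:
        word += ROUND_TENS.get(tens, TENS[tens])
    else:
        word += TENS[tens] + _unit(units, attributive)
    return word


def number_to_words(n):
    """Spell a cardinal number, hyphenating the groups only above 2000."""
    if not 1 <= n <= 999_999:
        raise ValueError("this sketch only covers 1..999 999")
    thousands, rest = divmod(n, 1000)
    head = ""
    if thousands == 1:
        head = "ezer"
    elif thousands > 1:
        head = _below_thousand(thousands, attributive=True) + "ezer"
    tail = _below_thousand(rest)
    separator = "-" if head and tail and n > 2000 else ""
    return head + separator + tail
```

The two examples from the text come out as expected: `number_to_words(1999)` gives `ezerkilencszázkilencvenkilenc` and `number_to_words(2001)` gives `kétezer-egy`.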
Ordinal numbers written in digits take a full stop (e.g. 3. sor '3rd line'). The full stop is retained even before the hyphen that connects suffixes (e.g. a 10.-kel 'with the 10th'). Dates are an exception to this rule, see below.
If a fraction functions as a noun, the quantifier is written separately (e.g. egy negyed 'one quarter'). However, if a fraction takes an adjectival role in a phrase, the two parts are written in one word (e.g. egynegyed rész 'a one-quarter part'). Giving the hour is also done by this rule. The integer part of a decimal is divided from the rest by a comma (e.g. 3,14 '3.14').
Numbers are usually written in Arabic numerals. Roman numerals are only used in some special traditional cases, only to express ordinal numbers (e.g. to express the numbering of monarchs, popes, districts of a city, congresses, etc.). Their use is advisable if they have a distinctive role as opposed to Arabic numbers, e.g. to denote the month between the year and the day, or to mark the floor number in front of the door number.
The year is always given in Arabic numerals and it is followed by a full stop. The name of the month can be written in full or abbreviated, or it can be marked with a Roman or Arabic numeral. The day is always written in Arabic numerals. Dates are sometimes written without full stops and spaces, divided only by hyphens.
Normally, a full stop is needed after the year. However, it is omitted in three cases: (1) if it is in a possessive relationship with the forthcoming word, (2) if it is followed by a postposition or an adjective coined from it, or (3) if it is the subject of a sentence or it stands solely in parentheses. For example, 1994. tavasz 'the 1994 spring' but 1994 tavasza 'the spring of 1994' and 1994 után 'after 1994.'
When digits expressing year and day take suffixes, the full stop is dropped before the hyphen (e.g. 1838-ban 'in 1838' and március 15-én 'on March 15th'). The word elsején 'on the 1st of' and its suffixed forms are abbreviated as 1-jén etc. If a day is followed by a postposition, the full stop is retained (e.g. 20. és 30. között 'between the 20th and the 30th').
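The date forms described above can be illustrated with a small Python sketch. The function name and the `style` values are hypothetical. The full stops after the numeral month and the day follow from the ordinal rule, while the zero-padding in the all-digit forms is an assumption based on common practice rather than something stated in the passage.

```python
# Illustrative sketch: the date notations described above.
MONTH_NAMES = ["január", "február", "március", "április", "május", "június",
               "július", "augusztus", "szeptember", "október", "november",
               "december"]
ROMAN_MONTHS = ["I", "II", "III", "IV", "V", "VI",
                "VII", "VIII", "IX", "X", "XI", "XII"]


def format_date(year, month, day, style="name"):
    """Format a date with the month as a name, a Roman or an Arabic numeral."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    if style == "name":
        return f"{year}. {MONTH_NAMES[month - 1]} {day}."
    if style == "roman":
        return f"{year}. {ROMAN_MONTHS[month - 1]}. {day}."
    if style == "arabic":
        # Assumption: two-digit month and day, each followed by a full stop.
        return f"{year}. {month:02d}. {day:02d}."
    if style == "hyphen":
        # The form without full stops and spaces, divided only by hyphens.
        return f"{year}-{month:02d}-{day:02d}"
    raise ValueError(f"unknown style: {style}")
```

For 15 March 1994 the four styles yield `1994. március 15.`, `1994. III. 15.`, `1994. 03. 15.`, and `1994-03-15`. Suffixed forms such as 1838-ban are not generated here, since the choice of suffix depends on the pronounced form.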
Letters and other postal consignments are to be addressed by the official addressing patterns of the Hungarian Postal Service. (It currently means that the name comes first, then the settlement, the street or the P.O.B., and finally the postcode, written under each other. Street directions contain the street number first, and optionally the floor number and the door number.)
The words for "hour" and "minute" (óra and perc) are not usually abbreviated in fluent texts. If the time is given in digits, a full stop is placed between the hour and the minute without a space (e.g. 10.35). This latter form takes a hyphen before suffixes (e.g. 10.35-kor 'at 10:35').
Digraphs are distinguished in collation (i.e. to determine the order of entries in a dictionary or directory) from the letters they consist of. For example, cukor is followed by csata, even though s precedes u, as cs is considered a single entity, and follows all the words starting with c. In general dictionaries, contracted forms of digraphs are collated as if they were written in full, e.g. Menyhért precedes mennybolt, even though n precedes y, because nny consists of ny + ny, and h precedes ny. Short and long versions of vowels are considered equal for the purposes of collation (e.g. ír precedes Irak) unless the words are otherwise identically spelt, in which case the short vowel precedes the long one (e.g. egér precedes éger). Phrases and hyphenated compounds are collated ignoring the space or the hyphen between their elements; lower and upper case don't count either.
Obsolete digraphs in traditional Hungarian names and foreign words are treated as a series of individual letters. Diacritics are only taken into consideration if there is no other difference between words. However, in encyclopedias, map indices, and other specialized works, where Hungarian and foreign names are mixed, the universal Latin alphabet is followed.
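The collation rules for general dictionaries can be approximated with a two-level sort key in Python. This is a sketch under stated assumptions, not a reference implementation: the name `collation_key` is hypothetical, digraphs are matched greedily, and every contracted sequence such as nny or ggy is expanded mechanically, which would be wrong where the letters merely meet at a morpheme boundary. Obsolete digraphs in names are simply treated letter by letter.

```python
# Illustrative sketch: a sort key for Hungarian dictionary order.
ALPHABET = ["a", "á", "b", "c", "cs", "d", "dz", "dzs", "e", "é", "f", "g",
            "gy", "h", "i", "í", "j", "k", "l", "ly", "m", "n", "ny", "o",
            "ó", "ö", "ő", "p", "q", "r", "s", "sz", "t", "ty", "u", "ú",
            "ü", "ű", "v", "w", "x", "y", "z", "zs"]
RANK = {letter: index for index, letter in enumerate(ALPHABET)}
LONG_TO_SHORT = {"á": "a", "é": "e", "í": "i", "ó": "o",
                 "ő": "ö", "ú": "u", "ű": "ü"}
# Contracted long digraphs, expanded to their full form before comparison.
CONTRACTED = [("ddzs", "dzsdzs"), ("ccs", "cscs"), ("ddz", "dzdz"),
              ("ggy", "gygy"), ("lly", "lyly"), ("nny", "nyny"),
              ("ssz", "szsz"), ("tty", "tyty"), ("zzs", "zszs")]


def collation_key(entry):
    """Return (primary, secondary) lists usable as a sort key."""
    # Case, spaces, and hyphens are ignored.
    text = entry.lower().replace(" ", "").replace("-", "")
    for short_form, full_form in CONTRACTED:
        text = text.replace(short_form, full_form)
    primary, secondary = [], []
    i = 0
    while i < len(text):
        # Prefer the trigraph, then a digraph, then a single character.
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in RANK or size == 1:
                break
        base = LONG_TO_SHORT.get(chunk, chunk)
        # Primary level: short and long vowels share one rank.
        primary.append(RANK.get(base, len(ALPHABET) + ord(chunk[0])))
        # Secondary level: a short vowel precedes its long counterpart.
        secondary.append(1 if chunk in LONG_TO_SHORT else 0)
        i += len(chunk)
    return (primary, secondary)
```

Sorting with `sorted(words, key=collation_key)` reproduces the orderings cited above: cukor before csata, Menyhért before mennybolt, ír before Irak, and egér before éger.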
The rules of Hungarian orthography were first published by the Hungarian Academy of Sciences (HAS) in 1832, edited by Mihály Vörösmarty. Major revisions followed in 1877, 1922, 1954, and 1984. The currently effective version is the 11th edition from 1984. A new revised edition is currently under preparation.
Rules of Hungarian orthography are laid down by the Hungarian Language Committee of the Research Institute for Linguistics of the Hungarian Academy of Sciences and published in a book titled Rules of Hungarian Orthography (A magyar helyesírás szabályai).
This volume is supplemented by two orthographic dictionaries, one published by HAS, and one published by the publisher Osiris Kiadó. The former is considered more official, and comprises 140,000 words and phrases; the latter is more comprehensive, including more than 210,000 words and phrases as well as a more detailed elaboration of the regulations.
Although orthography gives only instructions how to note down an existing text, usage-related suggestions are also given in most Hungarian linguistic publications (such as if a construction should be rephrased or a word should be avoided). These periodicals include Magyar Nyelv, Magyar Nyelvőr, Édes Anyanyelvünk, Magyartanítás, and Nyelvünk és Kultúránk, and several other periodicals have linguistic columns (such as Élet és Tudomány). Ádám Nádasdy sometimes touched on orthographic issues in his column popularizing linguistics in Magyar Narancs, and in his books based on this column and its forerunners. New entries of Korrektorblog (Proofreader's Blog - "The mild Grammar Nazi") are published on the main page of the popular news portal Index.hu.
Linguistic educational programmes were broadcast on television, the most famous being Álljunk meg egy szóra! "Let's stop for a word", screened on more than 500 occasions between 1987 and 1997, and some of its issues were published in a book.
Apart from the Geographical Names Committee and the manual on geographical names mentioned above, other fields have their specialized orthographical dictionaries, such as economy, medicine, technology, chemistry, and military affairs, as well as collections of examples in periodicals, such as for zoological and botanical names.
Orthographical competitions are organized at primary, secondary, and tertiary level every year (Zsigmond Simonyi competition for upper primary schools - for students aged 10 to 14 -, József Implom competition for secondary schools, and Béla J. Nagy competition for universities).
Word processors, some Internet browsers and mailing applications are supplied with a Hungarian spellchecker: Hunspell for OpenOffice.org, Firefox and Thunderbird. A Hungarian company, MorphoLogic, has developed its own proofing tools, which are used in Microsoft Office.
People can seek advice for free in orthography-related and other linguistic topics from the Department of Normative Linguistics at the Research Institute for Linguistics of the Hungarian Academy of Sciences or from the Hungarian Linguistic Service Office.