Chapter Five
Pronouncing Esperanto

Copyright © 2002 Sylvan Zaft.  This is the 5th chapter of Esperanto: A Language for the Global Village.  You may make electronic copies and paper copies for your personal use, and you may freely distribute verbatim copies which include this notice provided that you do not charge for these copies.  You may not post this material to any site.  You are invited to insert links to this site.  For any other use, including publication, you must first get my permission.  I welcome any suggestions about how to improve this work.  My address is sylvanz@me.com

The Esperanto Alphabet

Esperanto’s twenty-eight sounds are each represented by one letter.  Just as the Italian alphabet does without five letters which the English alphabet includes (j, k, w, x and y), so the Esperanto alphabet does without four letters that the English alphabet has.  These letters, which are found in English but not in Esperanto, are q, w, x and y.

Esperanto includes six special letters which English does not have: ĉ, ĝ, ĥ, ĵ, ŝ and ŭ.  These accented letters are regarded as being different letters than c, g, h, j, s and u just as the Spanish ñ is regarded as being a different letter than the Spanish n.
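The letter count can be checked with a few lines of Python.  This is only an illustrative sketch: the variable names are invented for the example, and it takes the four omitted letters to be q, w, x and y (z is kept, as the consonant list later in this chapter shows).

```python
import string
import unicodedata

# The 26 letters of the English alphabet.
english = set(string.ascii_lowercase)

# Letters found in English but not in Esperanto.
dropped = set("qwxy")

# The six accented Esperanto letters, normalized so that each one
# is a single character rather than a base letter plus an accent.
added = set(unicodedata.normalize("NFC", "ĉĝĥĵŝŭ"))

esperanto = (english - dropped) | added
print(len(esperanto))  # 26 - 4 + 6 = 28
```

Twenty-six letters, less four, plus six, gives the twenty-eight letters of the Esperanto alphabet.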

Five Easy Vowels

There are just five vowel sounds in Esperanto.  They are represented by the five vowels, a, e, i, o and u.  These are pure vowels, without any y sound or w sound at the end of them.  They can be remembered easily by memorizing the following sentence as it would be pronounced by speakers of Standard American English: Ah, men fear cold tombs.

Pronounce the vowels as follows:

a  as in ah

e  as in men

i  as in fear

o  as in cold

u  as in tombs

Esperanto forms diphthongs by adding a j  to a vowel.  The j  is pronounced like the English letter y.  Here are some examples:

oj  is pronounced like the English oy  in boy

aj  is pronounced like the English y  in my or like the English igh  in sigh.

Esperanto also forms diphthongs by adding ŭ  to a vowel.  The ŭ  is pronounced like the English u in guava.  In other words, it is pronounced like the English w.  Here is an example:

aŭ  is pronounced like the English ow  in cow or like the English ough in bough.

That covers the vowel sounds in Esperanto, except for a few more diphthongs which can be easily pronounced by putting their two component sounds together.  One of these, eŭ, does not exist in English and must be practiced by speakers of English, just as English sounds such as the two th sounds must be practiced by native speakers of languages in which they do not occur.  (A way to learn how to pronounce eŭ  is to say “Ed” but replace the d-sound with a w-sound.)

The Not-so-easy Vowels of English

The Esperanto vowels are spaced nicely apart.  Be aware of your mouth, your tongue, your lips when you pronounce the Esperanto “a‑e‑i‑o‑u.” Notice that there is a lot of movement between each vowel sound.  Now do the same with the English vowels in “bat‑bet‑bit‑but.” Notice how much more subtle these movements are.  This, of course, makes no difference when it comes to children learning English as their native tongue.  Children pick up any sound and imitate it exactly.  This is why Chinese children have no trouble learning how to pronounce Chinese, and American children have no trouble learning how to pronounce English.  However, it is quite a different matter when it comes to American teen-agers or adults learning how to pronounce Chinese or Chinese teen-agers or adults learning how to pronounce English.  Here subtle differences are hard to detect, hard to reproduce and hard to recognize.  This is one of the reasons why Americans might have some difficulty understanding the English of many Orientals.  This is one of the reasons why many foreign students of English have trouble pronouncing English.

The Esperanto Consonants

Most Esperanto consonants are pronounced as they are in English.  This includes these letters:

       b  as in baby

       d  as in dog

       f  as in fill

       k  as in kid

       m  as in might

       n  as in now

       p  as in porcupine

       t  as in take

       v  as in Victor

       z  as in zebra

The consonants that have to be learned by speakers of English are as follows:

       c  is always pronounced like ts in cats, even when it comes at the beginning of a word.

       ĉ  is always pronounced like ch in chair

       g  is always pronounced like g in great (never like g in George)

       ĝ  is always pronounced like g in George (never like g in great)

       h  is always pronounced like h in have

       ĥ  is always pronounced like the Scots ch in loch (this is a very rare sound)

       j   is always pronounced like y in yesterday

       ĵ   is always pronounced like s in pleasure (or z in azure)

       s  is always pronounced like s in still

       ŝ  is always pronounced like s in sure (or sh in should)

The sound of r  is pronounced in different ways.  The way chosen usually depends on the linguistic background of the speaker.  Many authorities claim that the only correct way of pronouncing the r-sound is to pronounce it with the tip of the tongue as speakers of Spanish or Serbo-Croatian do.  Other authorities claim that it can also be pronounced like the Parisian or German r  by vibrating the uvula (that little fleshy thing that hangs down at the back of your mouth) and some authorities claim that it can be pronounced as it is in Standard American.

The r-sound is a problematic and difficult sound for many millions of people.  The Chinese and Japanese have a great deal of difficulty pronouncing r.  Once when I was speaking with a visiting Chinese Esperantist I could not understand his pronunciation of poŝtmarkoj which means “postage stamps” because he entirely omitted the r-sound.  Because of the difficulty that a large percentage of the human race has with the r-sound, Johann Martin Schleyer, who developed the earlier planned language of Volapük, left this sound out of his language.

For Americans the best practical advice is probably this: Pronounce the r  as the Spanish do with the tip of the tongue if you can.  Second best is to pronounce it with the uvula as the Parisians and the Germans do.  Third best is to pronounce it as it is in Standard American English.  However, make certain that you always pronounce it clearly or else you will run a danger of not being understood.  If you pronounce it in the American style you will be speaking with a distinct American accent.

I have difficulty pronouncing the r  with the tip of the tongue so I pronounce it in the Parisian or German style.  In 1989 I attended the five-day annual convention of the Esperanto League for North America in Chicago.  We spoke only Esperanto there.  I was asked what country I originally came from.  I answered that I was a native-born American.  Some people looked surprised at that.  I suspect that my guttural r-sound gave them the impression that I came from some other country.

It should be noted that when Orientals learn English they have the same difficulty with the r-sound as they have when they learn Esperanto.  This causes few practical problems in Esperanto because there are very few pairs of words where the only difference is that one of the words has an r  and the other an l.

Because the ĥ-sound created a lot of trouble for speakers who did not have that sound in their native language, it has been replaced, wherever possible, by the k-sound.  There are very few common words with this sound that remain in the language.

A difficulty for English speakers is the combination sc, which occurs in words such as scienco (science).  The sound for c is found in English, but not at the beginning of a word except for the ts in “tsetse fly.” Many English-speaking students of Esperanto incorrectly pronounce scienco as though it were written sienco or cienco.  I think that they are usually understood anyway.

The level of difficulty of pronouncing sounds in Esperanto or in any other newly learned language will vary according to the presence or absence of those sounds in the student’s native language.

Finns who study languages such as English or Esperanto have a lot of trouble with consonants because several consonants in these languages simply do not exist as separate consonants in Finnish.  Consonants which are quite different in English and Esperanto seem like different forms of a single consonant to a Finn and are therefore very tough to tell apart.  According to Mark Rauhamaa, “Finnish-speakers have a hard time hearing a difference between bet/pet, core/gore, sip/ship/zip etc.”

According to the Encyclopedia Britannica, English speakers have different ways of pronouncing the letter p, but most of them are not aware of those differences.  Sometimes p is pronounced with a little puff of air and sometimes it is not.  These variants in pronunciation do not affect the meanings of words in English and so they are commonly ignored.  In Thai, on the other hand, the different ways in which p is pronounced lead to different meanings.

A Russian friend of mine who immigrated to the United States once remarked that her mouth felt as though it were filled with cotton during her first year of pronouncing the strange sounds of English.

If one of the criteria for selecting a language for international use is that its sounds should be easy to pronounce for everyone, then it will be very difficult to find such a language.  Esperanto, of course, does not fully meet that requirement, but then only a language with very few sounds would.

The Ease of Placing the Tonic Accent in Esperanto

In Esperanto, as in English, certain syllables receive a special stress.  This stress is called the tonic accent.  It is made by slightly raising the pitch of the syllable.  In the English word “elephant” the stress is placed on the first syllable.  In the English word “emotion” the stress is placed on the second syllable. 

In Esperanto the stress always comes on the next-to-the-last syllable.  The student of this language learns that a syllable is a group of letters which contains a vowel and that the stress always comes on the syllable which contains the next to the last vowel.  After a few minutes of practice, the student of Esperanto has mastered this feature of the language.  Here are some examples:



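       [Esperanto word not preserved]  to love  (2 syllables)

       [Esperanto word not preserved]  to hate  (3 syllables)

       [Esperanto word not preserved]  (4 syllables)

       [Esperanto word not preserved]  a television set  (5 syllables)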

Esperanto is not unique in placing the stress on the next-to-the-last syllable.  This feature is found in other languages, such as Welsh and Swahili.
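The stress rule is simple enough to write down as a short program.  The following Python sketch is only an illustration, not part of any standard tool: the function names are invented here, and the sample words (ami, malami, esperanto, televidilo) are my own choices of words with two to five syllables.  It assumes that only a, e, i, o and u count as vowels, so the j and ŭ of a diphthong do not start a new syllable, and that a word with a single vowel is stressed on that vowel.

```python
VOWELS = set("aeiou")  # j and ŭ are not counted as vowels


def count_syllables(word: str) -> int:
    """Count syllables by counting vowels, one vowel per syllable."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def mark_stress(word: str) -> str:
    """Return the word with its stressed vowel written as a capital."""
    lower = word.lower()
    positions = [i for i, ch in enumerate(lower) if ch in VOWELS]
    if not positions:
        return lower  # no vowel, nothing to stress
    # Next-to-last vowel, or the only vowel in a one-syllable word.
    target = positions[-2] if len(positions) >= 2 else positions[-1]
    return lower[:target] + lower[target].upper() + lower[target + 1:]


for word in ["ami", "malami", "esperanto", "televidilo"]:
    print(word, count_syllables(word), mark_stress(word))
```

Run on these four words, the sketch marks the stress as Ami, malAmi, esperAnto and televidIlo.  The same single rule handles poŝtmarkoj (poŝtmArkoj), with no word-by-word memorization.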

The Difficulty in Figuring Out Where to Place the Tonic Accent in English

In English the tonic accent can come practically anywhere in a word.  This is no problem for native speakers.  As children they easily pick up the correct placement of the tonic accent.  However, for foreign students of English, learning the placement of the tonic accent is another one of those tasks that comes up each time new words are learned.  That little task has to be taken care of hundreds or thousands of times, depending on how many polysyllabic words are learned.

Here are a few examples of the placement of the main tonic accent in English.  (Many English words have a lesser stress as well.  I have only indicated the main stress in these words.)

1.  In two-syllable words the accent falls on the first syllable in some words and on the second syllable in others:





2.  In three-syllable words the accent falls on the first syllable in some words, on the second syllable in some words and on the third syllable in some words.




3.  In four-syllable words the primary accent can fall on the first, second or third syllable.




4.  In longer words the main stress can be found in a variety of positions.


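       [English word not preserved]  (on the fourth of five syllables)

       [English word not preserved]  (on the fifth of seven syllables)

       [English word not preserved]  (on the third of six syllables)

       [English word not preserved]  (on the first and sixth of seven syllables)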
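The difference between the two systems can be sketched in Python.  This is a hedged illustration only.  The table and function names are invented for the example, and the table holds just five English words: “elephant” and “emotion” come from earlier in this chapter, while “table,” “begin” and “understand” are my own additions.

```python
# English: the position of the main stress (counting syllables from 1)
# has to be stored separately for every word.
ENGLISH_STRESS = {
    "elephant": 1,    # EL-e-phant
    "emotion": 2,     # e-MO-tion
    "table": 1,       # TA-ble
    "begin": 2,       # be-GIN
    "understand": 3,  # un-der-STAND
}


def english_stress(word):
    """Look the word up; return None if it has not been memorized."""
    return ENGLISH_STRESS.get(word)


def esperanto_stress(syllable_count):
    """One rule for every word: the next-to-last syllable."""
    return max(1, syllable_count - 1)


print(english_stress("emotion"))
print(english_stress("kangaroo"))
print(esperanto_stress(5))
```

The English table has to grow with every new word the student learns, and a word that is not yet in it gives no answer at all.  The Esperanto function needs nothing but the number of syllables.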

Errors in placing the stress are quite common among non-native speakers of English.  Sometimes this kind of variance from standard pronunciation is just one of a large number of variances that together make the speaker very hard to understand.

I was once speaking with a Ugandan who was working on an advanced degree in economics at the University of Michigan.  He said that a problem in his country was – and here I could not figure out what the next words were.  He repeated them.  When I still did not understand, he explained that people were coming over the border and stealing cows, just like in our western movies.  He had misplaced the stress in the words “cattle rustling,” and because of that his words were unintelligible to me.  He was a highly intelligent person and very successful in his advanced studies in economics.  He spoke English well.  However, all of these little details of English which cannot be reduced to invariable, regular rules will, from time to time, impede his ability to communicate in his international language.

This does not mean that there is anything wrong with the English system of placing stress in an irregular way.  As a matter of fact, it is one of the features of the language that makes possible the wealth of rhythms in English poetry and prose.  However, it does create problems for the foreign student who wants to learn a language for occasional use as an auxiliary language, problems which Esperanto completely avoids.
