
How Pronunciation Works

Pronunciation is a mysterious process to many of us. We hear people who have different accents, but it is sometimes hard to pinpoint what it is about their speech that makes them sound so different. Many people manage to speak and write a language at a high or even fluent level, yet still have an accent when they speak. This can be very frustrating both for them and for the people they are communicating with. What makes pronunciation so hard, even for people who have excellent writing and grammatical skills? If we can understand the components of pronunciation, we can apply that knowledge when trying to change an accent.

When people are first learning a language formally, pronunciation is often not given much attention. Even when it is taught, some of the key components that strongly affect how well a speaker is understood aren’t introduced.

Most people don’t realize that the patterns they carry over from their native languages, such as the muscle movements used to make sounds and the rhythm of their speech, are what create the accent in a newly learned language. People don't know what to change about their speech, and as a result they continue to have an accent that might be hard to understand, even if they try to slow down or speak more loudly. So, what are the things that people should be aware of in order to modify their accent? What is pronunciation, and how does it work?

Pronunciation is a motor skill

Language study is often divided up into the skills of reading, writing, listening, and speaking. Speaking a language well involves not only a grasp of grammar, but also knowledge of how the language's sound system works and how to produce those sounds. When we perceive that someone has an accent, it is because something about the physical habits they use in speech is different from what we are accustomed to hearing. Changing an accent requires you to be aware of what different parts of your body are doing: the amount of air used, the position of your tongue, the tension in your jaw, and the pacing of your speech. In the same way that writing and riding a bike require learning and refining motor skills, mastering a different pronunciation involves learning new motor skills in order to create the proper sounds and intonation.

The articulatory setting

Every language, and every dialect, has a different way of positioning the parts that are involved in speech production -- the jaw, tongue, and lips -- while they are at rest and while they are being used. This is the articulatory setting, which is also known as the basis of articulation, point of resonance, or oral posture (the last two terms are generally used by dialect coaches for actors). A speaker’s articulatory setting makes it easier for them to produce the sounds that are used most frequently in their language. If you speak another language using the articulatory setting of your native language, it will be harder for you to pronounce the sounds the way a native speaker does. This is often what results in you speaking with an accent. Learning about and adopting the articulatory setting of your target language or dialect will make speaking it much easier.

The articulatory setting of standard American English is one where a speaker has a fairly relaxed jaw, the tongue held in the middle of the mouth, and the tongue touching behind the teeth in the resting position. This articulatory setting makes it easy for a speaker to create the many consonant sounds that are made near or through contact with the alveolar ridge behind the teeth, and it creates the particular resonance that characterizes American English. Speakers of different dialects of the same language may have very different articulatory settings. For instance, speakers of the Received Pronunciation dialect spoken in Great Britain have an articulatory setting with their tongue held very close to the roof of their mouth while speaking, whereas native speakers of American English dialects hold their tongue in the middle of their mouth. Speakers of Southern varieties of English speak with more of their words coming from the front of their mouth.

The difference in the degree of muscle movement used by speakers of different languages is perceptible. Many people have noticed, for instance, that native English speakers don't move their lips and jaws as much as French speakers do, and some have commented that they can tell a French speaker or Japanese speaker by the movement of their mouth. In recent years, studies have used MRI and other imaging technology to compare the articulatory settings of speakers of different languages (English and French) with those of bilinguals in both languages. The bilinguals who sounded most native-like in their second language were found to have adopted an articulatory setting that was similar to the native speakers’.

Prosody

Prosody refers to the stress, timing, and intonation in language. Understanding a language’s prosodic features and how they differ from the ones you use in your native speech is necessary because prosodic features signal important additional information. Some languages use pitch to distinguish between word meanings, while others use it to signal whether a sentence is a question or a statement. Using prosody in a way that is different from what listeners expect can make it much more difficult for you to be understood, and can cause serious misunderstandings. Understanding prosody will help you to understand others and will help others understand you. Studies have shown that when native speakers rate non-native speakers on their pronunciation, good prosody has a greater impact on being easily understood than the quality of pronunciation of individual sounds.

Timing

Languages are generally categorized as stress-timed (English, Arabic, Russian), syllable-timed (Spanish, French, Swahili), or mora-timed (Japanese). English is categorized as a stress-timed language because it has both stressed and unstressed syllables. In a syllable-timed or mora-timed language, each syllable or mora receives roughly the same amount of emphasis. In a stress-timed language, the emphasis given to syllables and words varies. When speakers of a syllable-timed language begin speaking a stress-timed language, or vice versa, it is often difficult to understand them because they are still using the timing system of their native language.

Stress

English speakers rely heavily on stress to distinguish meaning. If they are listening to someone who uses stress in a way that is different from native speakers, it is much harder for them to figure out what that person is saying. In English, some syllables within a word are stressed, and others are unstressed. At the sentence level, some words in the sentence are stressed, and others are unstressed. The way syllables are stressed within a word can distinguish the meaning of a word, and the way words are stressed within a sentence contributes to the meaning of a sentence. Generally, content words are stressed, and function words are unstressed. Content words consist of nouns, verbs, adjectives, and adverbs, and are used to convey specific meaning. Function words like articles and prepositions are there primarily to serve a grammatical function and as a result do not receive as much stress.

English stressed syllables are marked by:

  • longer duration
  • higher pitch
  • higher volume (they're louder)
  • vowels that are clear and distinct

In contrast, unstressed syllables have:

  • shorter duration
  • lower pitch
  • lower volume
  • vowels that often reduce to the neutral vowel /ə/ (the ‘schwa’ sound)
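
As a rough illustration of the content-word rule described above, here is a minimal Python sketch that capitalizes the words likely to be stressed in a sentence. The function name and the short list of function words are assumptions made for this example; a real list of English function words is much longer, and actual sentence stress also depends on context and emphasis.

```python
# Illustrative sketch only. FUNCTION_WORDS is a small hand-picked sample
# (an assumption), not a complete inventory of English function words.
FUNCTION_WORDS = {
    "a", "an", "the",                              # articles
    "to", "of", "in", "on", "at", "for",           # prepositions
    "and", "but", "or",                            # conjunctions
    "i", "you", "he", "she", "it", "we", "they",   # pronouns
    "am", "is", "are", "was", "were",              # auxiliary verbs
}


def mark_sentence_stress(sentence):
    """Return the sentence with likely-stressed (content) words in capitals."""
    marked = []
    for word in sentence.split():
        # Ignore surrounding punctuation and case when looking the word up.
        core = word.strip(".,!?").lower()
        if core in FUNCTION_WORDS:
            marked.append(word)          # function word: left unstressed
        else:
            marked.append(word.upper())  # content word: marked as stressed
    return " ".join(marked)


print(mark_sentence_stress("She is going to the store for milk."))
# prints: She is GOING to the STORE for MILK.
```

Reading the output aloud while lengthening and raising the pitch of the capitalized words gives a feel for the stress pattern the lists above describe.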

Intonation

Intonation refers to the melody of our speech. As we speak, our voices rise and fall in patterned ways. Speakers of English use several different intonation patterns, and in each one the pitch of the voice rises and falls in a particular way. For instance, in a declarative sentence like “I am going to the supermarket”, the pitch of a native American English speaker's voice will rise and then fall at the end of the sentence. When the same speaker asks a yes-or-no question, the intonation is different: the pitch of their voice will rise at the end. Intonation can indicate whether an utterance is a statement or a question, and the type of intonation we use can provide emphasis, contrast, and information about our mood. Non-native speakers of American English often use very different intonation patterns than native speakers, which can cause a lot of confusion, since American English speakers rely on intonation to convey a wide variety of meanings.

Linking

Every language has developed patterns for changing sounds so that it can be spoken more quickly and fluidly, and learners of a new language often don't use them. Joining and changing sounds in this way is called linking. Many languages have a version of this in which certain sounds are changed or dropped in order to make speech flow more easily. In French, speakers don’t pronounce the final vowel of certain unstressed words (like the definite article le, which is then written as l’) when they are followed by a word beginning with a vowel. This process is called elision. Similarly, linking occurs in lots of different contexts in English. As an example, English speakers blend the boundaries between words and meld sounds in order to produce the informal contractions "gonna", "wanna", “coulda”, and "gotcha" instead of "going to", "want to", “could have”, and "got you".
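
The four informal contractions mentioned above can be written down as a simple lookup, sketched here in Python. The function name is made up for this example, and the sketch replaces the phrases wherever they appear; in real speech these reductions depend on context (for instance, "going to" usually becomes "gonna" only before a verb), so treat this as a listening aid rather than a rule of English.

```python
import re

# The four reductions named in the text, as phrase-to-form pairs.
INFORMAL_CONTRACTIONS = {
    "going to": "gonna",
    "want to": "wanna",
    "could have": "coulda",
    "got you": "gotcha",
}


def link_phrases(sentence):
    """Replace each listed two-word phrase with its linked, informal form."""
    result = sentence
    for phrase, linked in INFORMAL_CONTRACTIONS.items():
        # \b restricts matches to whole words; matching ignores case.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        result = re.sub(pattern, linked, result, flags=re.IGNORECASE)
    return result


print(link_phrases("I am going to call her because I want to talk."))
# prints: I am gonna call her because I wanna talk.
```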

Native speakers often do not pronounce a word in a stream of speech exactly as they would say it by itself. When a non-native speaker attempts to enunciate every word in hopes of being clear, they end up sounding more foreign and harder to understand, because articulating each word separately rather than linking the sounds together is just not the way most native speakers talk. Without understanding linking, it is also difficult to follow what many native speakers are saying as they blend and link their sounds together while speaking. Additionally, mastering a language’s rules for linking words together will help you speak more quickly and sound more native-like.

Consonants and Vowels

When people learn to speak a new language, they often have difficulty pronouncing sounds that do not exist in their own language. Some of the most difficult sounds for English learners to pronounce correctly are the two 'th' sounds (IPA symbols θ and ð) and the American R sound, none of which are commonly found in the world's languages. What generally happens is that if there is a close alternative, a speaker will substitute the closest sound, so speakers will often substitute an /r/-like or /l/-like sound from their own language for the American R. A variety of sounds are substituted for the 'th' sounds, even among dialects of English, and depending on a speaker's background the sounds /s/, /f/, /t/, or /d/ often appear in their place.

In addition to problems pronouncing different sounds in general, sometimes the rules of one's native language get in the way of either hearing a sound correctly or pronouncing it in different contexts. Linguists refer to the study of sound systems as phonology, and the phonology of a speaker's native language influences how they speak other languages. Speaking a language well involves understanding the target language's phonological rules, and making sure that the rules of the speaker's native language don't carry over if they aren't present in the target language. As an example, speakers of languages such as German and Dutch often devoice the final consonant (stop vibrating their vocal cords) when pronouncing words that end in sounds like /b/, /d/, and /g/. This makes those sounds come out like /p/, /t/, and /k/. Being aware that this process occurs helps speakers know what to focus on in order to sound clearer.
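
The final devoicing rule described above is regular enough to write down as a small Python sketch. The function name and the simplified phoneme spellings are assumptions made for this example; real transcriptions would use full IPA, and the rule in German or Dutch covers more sounds than the three listed here.

```python
# Voiced stops and their voiceless counterparts, as named in the text.
DEVOICED = {"b": "p", "d": "t", "g": "k"}


def devoice_final(phonemes):
    """Apply word-final devoicing to a word given as a list of phoneme symbols."""
    if phonemes and phonemes[-1] in DEVOICED:
        # Only the last sound changes; earlier voiced stops are untouched.
        return phonemes[:-1] + [DEVOICED[phonemes[-1]]]
    return list(phonemes)


# With this rule carried over into English, "bad" comes out sounding like "bat".
print(devoice_final(["b", "æ", "d"]))
# prints: ['b', 'æ', 't']
```

Comparing a word before and after the rule shows which English word pairs (bad/bat, dog/dock) a speaker with this habit may need to practice.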

Even sounds that are very similar but are produced in a slightly different way can give speakers trouble, because the small difference in production is audible. Linguists use the International Phonetic Alphabet (IPA) chart to describe the characteristics of different sounds. However, IPA notation doesn't always indicate all differences. As an example, Spanish speakers make the /t/ and /d/ sounds in the dental position (touching near or on the teeth) by lightly tapping the tongue. In contrast, American English speakers touch the blade of their tongue to their alveolar ridge, which is further away from their teeth. These small physical differences in creating a /d/ sound make the /d/ sound in each of these languages sound different.

Mastering these sounds involves learning new muscle movements in order to create the right sound at the right time. Understanding the differences in how a similar sound is pronounced can help lessen an accent. For instance, the /d/ sound in American English is made by touching the tongue to the alveolar ridge. In Spanish, the /d/ is a dental sound made by touching the tongue to the back of the teeth. In Hindi, the retroflex /d/ is made by curling the tongue backwards before touching it to the roof of the mouth. Other physical changes, such as creating or stopping airflow, rounding the lips, and lowering or tensing the jaw, may need to take place in order to create different sounds accurately.

Practice, practice, practice

Pronunciation is a motor skill that can be mastered, and mastering it is easier when you understand the different layers involved in producing a particular accent. Start by gaining an awareness of what your native articulatory setting is, and how your target accent's articulatory setting is different. Then, work on studying the various aspects of prosody. Understanding and using prosody well will make it easier for others to understand the meaning of your individual words, as well as entire sentences, more quickly. Once you are able to maintain your target accent's articulatory setting and prosody, work on learning how to properly pronounce individual sounds and connected speech. Daily practice that focuses on these skills and compares your speech to that of native speakers of your target accent will go a long way in helping you change your accent.

© 2020