What Languages are Spoken in China?
China is a vast country with a long, complicated history, and this complexity is reflected in the stunning level of linguistic diversity still present in the country today. In this article, you will learn about the many hundreds of dialects and languages in China.
Table of Contents
- How many languages are there in China?
- Why are there so many languages in China?
- The 10 main Chinese varieties
- Chinese is a Sino-Tibetan language
- Other language families spoken in China
- 1. The Tai-Kadai (Kra-Dai) language family
- 2. The Hmong-Mien language family
- 3. The Austroasiatic (Mon-Khmer) language family
- 4. The Turkic language family
- 5. The Mongolic language family
- 6. The Tungusic language family
- 7. The Koreanic language family
- 8. The Indo-European language family
- 9. The Austronesian language family
- 10. The Sino-Tibetan language family
- A language lover’s paradise
- Chinese Language Vocabulary
How many languages are there in China?
Officially, there are 302 living languages in China. Depending on your definition of “language” and “dialect,” this number can vary somewhat.
The number of speakers of many of China’s minority languages and dialects has decreased in recent years, and some of them are now considered endangered. That said, China remains a land of impressive linguistic diversity.
The languages of China can be divided into two different categories.
Mandarin and other Chinese variants
The first category includes Mandarin Chinese, which belongs to the Sino-Tibetan language family and is the most commonly spoken language in China.
It also includes languages that are considered variants of Chinese, such as Cantonese. These are often referred to as dialects, even though many of them are mutually unintelligible.
Officially, there are ten different varieties of Chinese, although some sources only list eight because the last two are only spoken by less than 1% of the population. These variants are written using Chinese characters and do not have their own written form.
Minority languages
The second category consists of minority languages which belong to completely different language families, or, in the case of the Bai language and many of the languages of Tibet, belong to a different branch of the Sino-Tibetan language family.
Minority languages are generally spoken by members of China’s ethnic minority groups. Many of these languages have their own written form that is not based upon Chinese characters.
A note on spoken versus written Chinese
When discussing the topic of languages in China, it’s common for people to bring up the difference between traditional and simplified Chinese. While this is an important distinction for prospective language learners to understand, it only involves differences in how Chinese characters are written, not in how the Chinese language is actually spoken.
Because the focus of this article is on the spoken languages of China, we will not spend time discussing the differences between traditional and simplified Chinese characters here.
Why are there so many languages in China?
The existence of such a large number of languages in China is related to China’s expansive geography, long history and some unique features of written Chinese.
Geography
Geographic isolation is one of the main drivers of linguistic diversity. Just as animal species evolve in unique ways after spending thousands of years on isolated islands, languages also tend to evolve in different directions when speakers are cut off from other groups for long periods of time.
In the past, travel was time-consuming, difficult and dangerous. As a result, communication between different groups of people was infrequent and speakers in one area had few opportunities to hear the languages spoken by people in other places.
Today, the effects of such geographic isolation are much less pronounced due to the advent of the internet and fast, accessible travel networks.
History
China has a very long history. Over the course of thousands of years, the Chinese empire expanded and contracted many times. Each new territorial expansion brought new groups of people into China’s linguistic and cultural fold.
The waxing and waning of various Chinese dynasties was often accompanied by violent clashes. Political conflicts led to wars and the ensuing societal disruption and famine often forced waves of refugees to flee from one region to another, bringing their languages and dialects with them (Hakka is a quintessential example of this). Many thousands of years of upheaval helped to scatter various languages throughout the empire.
Frequently, refugee groups chose to settle in geographically isolated areas for safety. In isolation, their spoken languages evolved in many different directions while also retaining certain original linguistic features. Slowly, they morphed into the myriad languages that exist in China today.
The Latin Comparison
When thinking about how spoken languages evolved in China, it can be helpful to draw comparisons with the situation in ancient Europe. In Europe, Latin (拉丁文 Lādīngwén) was once the common language.
After the Roman Empire disintegrated, the religious and intellectual elite in the various parts of Europe continued to communicate with each other in Latin, which remained the international language of the educated class until the 18th century.
While the international elite used Latin to communicate, the common people continued speaking to each other using everyday language. Slowly, the everyday language spoken in different parts of Europe diverged from its original written form, becoming a collection of regional dialects.
Because Latin is written with a phonetic alphabet, people were able to use that alphabet to write down their new regional dialects. These dialects evolved in different directions and eventually became the Romance languages that exist today.
In some ways, the situation in China was similar. Classical Chinese was the lingua franca of the Chinese elite. Although they didn’t speak it in everyday life, officials from different corners of the empire communicated using written classical Chinese.
Unlike Latin, classical Chinese isn’t alphabetic. Therefore, it remained quite distinct from the spoken languages used in most parts of China.
Speakers could still use their local spoken dialects to pronounce written classical Chinese in their own way, so unlike the European Romance languages, the different varieties of Chinese never developed their own written forms.
The 10 main Chinese varieties
Officially, there are 10 different varieties of Chinese. In Chinese, they are referred to as 汉语方言 (Hànyǔ fāngyán), or “Chinese dialects.”
Note that linguists consider many of these “dialects” (方言 fāngyán) to be separate languages (语言 yǔyán). Within mainland China, it’s politically correct to refer to the various varieties of Chinese as “dialects” even though the difference between some Chinese variants and Standard Mandarin is greater than the difference between most European languages.
All 10 varieties of Chinese (including Standard Mandarin) are classified as Sinitic languages, which are a branch of the Sino-Tibetan language family.
The 10 recognized varieties of Chinese are listed below in descending order by the total number of native speakers.
1. Standard Mandarin
Standard Mandarin, or Standard Chinese, commonly referred to as Putonghua 普通话 (Pǔtōnghuà) in mainland China and Guoyu 国语 (Guóyǔ) in Taiwan, is China’s predominant language. This language is also referred to as 汉语 (Hànyǔ) because it is the language spoken by the majority Han ethnic group (汉族 Hànzú).
65.7% of the Chinese population are native Mandarin speakers, making it the variety of Chinese with the largest number of native speakers. When people around the world study Chinese, Standard Mandarin is what most of them learn.
Standard Mandarin was originally based on a dialect of Chinese spoken in Beijing, but it’s now spoken by people all over the country. Mandarin is China’s main official language and it’s the main language used to communicate in China today.
2. Min Chinese
Min Chinese 闽语 (Mǐnyǔ) refers to a group of languages or dialects that are most commonly spoken in the coastal province of Fujian. Around 6.2% of China’s population are native Min speakers. Perhaps because of geographic isolation resulting from Fujian’s mountainous terrain, there are many different varieties of Min.
One well-known variety of Min is Hokkien. This variety is a Southern Min language from southeastern Fujian that is also spoken in Taiwan. Hokkien is spoken by members of overseas Chinese communities both in Southeast Asia and in Chinatowns in many countries, including the US.
3. Wu Chinese
Wu Chinese 吴语 (Wúyǔ) refers to a group of related Chinese varieties spoken in and around Shanghai. Although it’s sometimes referred to as Shanghainese, Wu is actually thought to have originated in Suzhou. It is spoken by 6.1% of China’s population. The language is most commonly spoken in Shanghai, as well as in parts of the neighboring Jiangsu and Zhejiang Provinces.
4. Cantonese (Yue)
In addition to Mandarin, Cantonese 粤语 (Yuèyǔ) is perhaps the dialect of Chinese with which people outside of China are most familiar. Although it’s only spoken by 5.6% of the population, its association with Hong Kong martial arts films and the Chinese diaspora has given it heightened visibility abroad, especially compared to many of the other less well-known Chinese variants.
Different versions of Cantonese are spoken in Guangdong and Guangxi, as well as Hong Kong and Macau. It sounds quite different from Mandarin Chinese and the two are mutually unintelligible.
Cantonese retains several features of ancient spoken Chinese. If you read Tang dynasty poetry in Standard Mandarin, you won’t hear much of the poet’s intended rhyme scheme. If you read it in Cantonese, you’ll suddenly be able to hear much more of the original rhyme.
5. Jin Chinese
Jin Chinese 晋语 (Jìnyǔ) dialects are spoken by 5.2% of the population. Most Jin speakers live in north central areas of China including Shanxi Province as well as parts of Inner Mongolia, Hebei, Henan and Shaanxi. Because most residents of Shanxi Province speak Jin, this dialect is also sometimes called 山西话 (Shānxīhuà).
6. Gan Chinese
Gan Chinese 赣语 (Gànyǔ) refers to a group of Chinese dialects that are mostly spoken in Jiangxi Province. Gan speakers can also be found in Hunan, Hubei, Fujian and Anhui. Because most speakers are in Jiangxi, it’s sometimes called 江西话 (Jiāngxīhuà). Gan is spoken by 3.9% of the population and is considered to be closely related to Hakka.
7. Hakka (Kejia) Chinese
After Mandarin and Cantonese, Hakka Chinese or Kejia (客家话 Kèjiāhuà) is perhaps the most well-known Chinese variety abroad, probably due to the fact that Hakka is commonly spoken in many overseas Chinese communities.
In China, Hakka is spoken by 3.5% of the population. Today, Hakka speakers are scattered throughout Guangdong, Fujian, Guangxi, Hainan, Sichuan, Jiangxi, Guizhou, Hong Kong and Taiwan. Speakers can also be found in Singapore, Indonesia and Malaysia as well as in Chinatowns around the world.
The fascinating history of the Chinese Hakka varieties and of the Hakka (Kejia) people more generally is emblematic of how war and migration have helped shape the linguistic landscape of modern China.
The ancestors of today’s Kejia people were originally from northern China. However, they fled south as refugees to escape war and unrest there beginning in the Jin dynasty (266-420 CE). To this day, their name in Chinese, 客家 (Kèjiā), which means “guest families,” bears witness to their status as migrants.
Like Cantonese, Hakka Chinese retains several features of ancient spoken Chinese that no longer exist in Standard Mandarin. Perhaps due to their status as refugees, groups of Hakka people in China often live in geographically isolated areas, which has contributed to great linguistic diversity among the various dialects classified as Hakka.
8. Xiang Chinese
Xiang Chinese 湘语 (Xiāngyǔ) is most commonly spoken in Hunan Province, although speakers can also be found in Guangxi, Guizhou and Hubei. Because of the large concentration of speakers in Hunan, it’s also often called Hunanese. Around 3% of the population speak some variety of Xiang.
9. Huizhou Chinese
Huizhou Chinese 徽州话 (Huīzhōuhuà), also called Hui 徽语 (Huīyǔ), is a small group of dialects spoken by scattered communities of people in Anhui, Zhejiang and Jiangxi. This dialect group is only spoken by 0.3% of the population. Because it’s so small, it’s often left off lists of Chinese varieties.
10. Pinghua Chinese and other unclassified dialects
Pinghua Chinese (平话 Pínghuà) is mainly spoken by people in parts of Guangxi Zhuang Autonomous Region. Since the number of speakers is quite small, it, like Huizhou Chinese, is often not included on lists of Chinese varieties.
In addition to Pinghua, China also has a few varieties of Chinese, such as Danzhou Dialect 儋州话 (Dānzhōuhuà), which remain unclassified. Speakers of Pinghua and the various unclassified Chinese dialects make up 0.6% of the population.
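The percentages quoted for the ten varieties above can be tallied as a quick sanity check. The following is only an illustrative sketch: the names `SHARES_TENTHS` and `as_percent` are hypothetical, the figures are copied from this article, and Xiang is entered as exactly 3.0% even though the text says "around 3%". Shares are stored in tenths of a percent so that the sums are exact integers.

```python
# Native-speaker shares quoted in this article, in tenths of a percent
# (657 means 65.7%). Xiang is "around 3%" in the text; 3.0% is an assumption.
SHARES_TENTHS = {
    "Standard Mandarin": 657,
    "Min": 62,
    "Wu": 61,
    "Cantonese (Yue)": 56,
    "Jin": 52,
    "Gan": 39,
    "Hakka (Kejia)": 35,
    "Xiang": 30,
    "Huizhou": 3,
    "Pinghua and unclassified": 6,
}


def as_percent(tenths: int) -> str:
    """Format a non-negative count of tenths of a percent, e.g. 657 -> '65.7%'."""
    return f"{tenths // 10}.{tenths % 10}%"


# Sum of all ten quoted shares.
total = sum(SHARES_TENTHS.values())

# Combined share of the two groups that some sources leave off their lists.
smallest_two = SHARES_TENTHS["Huizhou"] + SHARES_TENTHS["Pinghua and unclassified"]

print("All varieties:", as_percent(total))
print("Huizhou + Pinghua/unclassified:", as_percent(smallest_two))
```

The total comes to 100.1% rather than exactly 100%, which is what you would expect from figures that have each been rounded to one decimal place. The last two groups together account for 0.9%, which is consistent with some sources listing only eight varieties.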
Chinese is a Sino-Tibetan language
Chinese, as a member of the Sino-Tibetan language family, shares its origins with over 400 languages spoken by approximately 1.4 billion people across East, South, and Southeast Asia.
This section delves into the historical context, linguistic features, and classification challenges of Sino-Tibetan languages, highlighting Chinese's unique place within this diverse and ancient family.
Historical context
The Sino-Tibetan language family is one of the most ancient and expansive linguistic families in the world, with its origins tracing back several millennia.
The Chinese language, being a significant member of this family, has a rich history that dates back to early Chinese civilizations. Historical records, such as oracle bone inscriptions, provide evidence of Chinese writing dating back to the Shang Dynasty (1600-1046 BCE). Over the centuries, Chinese has evolved through various dynasties, each contributing to its development and complexity.
The spread of Sino-Tibetan languages, including Chinese, is closely tied to the migration and interactions of ancient peoples across East, South, and Southeast Asia. As Chinese civilization expanded, it influenced and was influenced by neighboring cultures and languages, leading to the rich tapestry of dialects and linguistic variations seen today.
The historical context of Chinese as a Sino-Tibetan language underscores its enduring legacy and cultural significance.
Linguistic features
Sino-Tibetan languages, including Chinese, exhibit distinctive phonological and grammatical characteristics. One of the most notable features of Chinese is its use of tones to distinguish meaning. Mandarin Chinese, for example, has four primary tones, with a fifth neutral tone, allowing the same syllable to have different meanings based on tonal variation.
This tonal aspect is a common feature among Sino-Tibetan languages, though the number and nature of tones can vary significantly.
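To make the role of tone concrete, the sketch below lists the classic textbook example of the syllable "ma" in each of Mandarin's four tones plus the neutral tone. It is a minimal illustration, not a pronunciation tool: the names `MA_BY_TONE` and `describe` are hypothetical, and numbering the neutral tone as 5 is just a common labeling convention.

```python
# The syllable "ma" in each Mandarin tone: (pinyin, character, meaning).
MA_BY_TONE = {
    1: ("mā", "妈", "mother"),
    2: ("má", "麻", "hemp"),
    3: ("mǎ", "马", "horse"),
    4: ("mà", "骂", "to scold"),
    5: ("ma", "吗", "question particle"),  # neutral tone, labeled 5 by convention
}


def describe(tone: int) -> str:
    """Return a one-line description of "ma" in the given tone."""
    pinyin, hanzi, meaning = MA_BY_TONE[tone]
    return f"tone {tone}: {pinyin} {hanzi} = {meaning}"


# Print the five readings in tone order.
for tone in sorted(MA_BY_TONE):
    print(describe(tone))
```

Apart from the tone mark, the romanized syllable is identical in all five entries, so the tone is the only thing that separates "mother" from "horse" in speech.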
In addition to tonal variation, Chinese is characterized by its lack of inflection, meaning that words do not change form based on tense, number, or case. Instead, Chinese relies heavily on word order and contextual cues to convey grammatical relationships.
This syntactic structure is quite different from many Indo-European languages, where inflection plays a significant role. The linguistic features of Chinese, rooted in its Sino-Tibetan heritage, contribute to its unique and complex nature.
Classification challenges
Classifying Sino-Tibetan languages, including Chinese, poses significant challenges due to their vast diversity and historical spread.
While Chinese, as the most prominent member, is well-documented and studied, many other Sino-Tibetan languages remain less understood. The sheer number of languages, coupled with varying degrees of documentation and linguistic analysis, makes comprehensive classification a daunting task for linguists.
Additionally, the relationships between the different branches of the Sino-Tibetan family are not always clear-cut. While Chinese (Sinitic) languages are distinct, the connections between other branches, such as Tibeto-Burman languages, are often debated.
Efforts to reconstruct a proto-Sino-Tibetan language face obstacles due to the lack of consistent historical records and the diverse evolution of these languages over time. Despite these challenges, ongoing linguistic research continues to shed light on the intricate web of Sino-Tibetan languages, with Chinese at its core.
Other language families spoken in China
In addition to the variants of Chinese listed in the previous section, there are also a variety of other languages spoken in China, most of which aren’t related to the Sino-Tibetan language family. Their presence in China can be traced back to the country’s long history of contact, conflict and coexistence with other peoples and cultures.
Below are some of the other major language families present in China today. Note that languages in these families are generally spoken by members of China’s 55 ethnic minority groups who collectively make up around 8% of China’s population.
1. The Tai-Kadai (Kra-Dai) language family
Several ethnic minority groups living in southern China speak languages that belong to the Tai-Kadai family. These ethnic groups include the Li, the Dai, the Dong and the Bouyei. Languages in this family are also spoken by the Zhuang, many of whom live in Guilin and nearby Longsheng. Other speakers of languages in this family can be found in Southeast Asia and Northeastern India.
2. The Hmong-Mien language family
Speakers of languages in the Hmong-Mien family include people living in mountainous areas of southern China and Southeast Asia. Chinese ethnic minority groups that speak languages in this family include the Miao, the Yao and the She.
3. The Austroasiatic (Mon-Khmer) language family
Speakers of the languages belonging to the Austroasiatic language family can be found in southern China as well as Southeast Asia, India, Nepal and Bangladesh. Vietnamese and Khmer both belong to this family. In China, Austroasiatic languages are spoken by the De’ang, Jing, Blang and Wa ethnic groups.
4. The Turkic language family
The Turkic language family includes a large number of speakers spread over a wide geographic area. These languages are spoken in northwestern China as well as throughout Western, Eastern, Northern and Central Asia, Southern and Eastern Europe and the Caucasus. Turkish is a representative example of a language in this group.
In China, Turkic languages are spoken by several different ethnic groups including the Uzbeks, Kazakhs, Uighurs, Kyrgyz, Tatars (Dada), Salar and the Yugurs (“Yellow Uighurs”). Turkic languages are also spoken by two unrecognized Chinese ethnic groups living in Xinjiang: the Aynu and the Ili Turk. The Ili Turk speak an endangered language known as Ili Turki. Sadly, there are only 30 families left in China who speak this language.
5. The Mongolic language family
The Mongolic language family includes languages spoken by people in Mongolia, Inner Mongolia, Russia and Afghanistan. The most common language in this family is Mongolian. Mongolic languages are spoken by people living in northern China including the Mongols, Dongxiang, Monguor, Daur, Bonan, Yugurs (“Yellow Uighurs”) and the Kuangjia Hui.
6. The Tungusic language family
Languages in the Tungusic family are spoken by people in North and East Asia, specifically in Siberia and Manchuria. There aren’t many speakers of Tungusic languages left and many languages in this family are considered endangered.
Some members of northern Chinese ethnic groups including the Manchu, Sibe, Hezhen, Evenki and Oroqen people still speak languages in this family.
7. The Koreanic language family
Unlike the other language families on this list, the Koreanic language family does not include a wide variety of different languages. Instead, it only includes various regional varieties of the Korean language, which is considered a language isolate.
Korean is spoken in East and Central Asia in countries including North and South Korea, Russia and China. In China, people belonging to the Korean ethnic group speak “Chinese Korean,” which is based on the standard form of Korean spoken in North Korea.
8. The Indo-European language family
The Indo-European language family is a large group that includes languages like English, French, Urdu and Persian. In China, speakers of languages in this family include members of various minority groups including the Tajiks and the Russians. Two well-known members of the Indo-European family, Portuguese and English, are also spoken in Macau and Hong Kong, respectively.
9. The Austronesian language family
The Austronesian language family is a large group of languages with many speakers in Maritime and Mainland Southeast Asia, Oceania and Madagascar. Indonesian is a representative example of a language in this family. In China, Austronesian languages are spoken by the Utsul people (officially considered to be members of the Hui minority group) of Hainan and the Gaoshan people of Taiwan.
10. The Sino-Tibetan language family
As already mentioned, Mandarin Chinese and the other Chinese variants discussed earlier all belong to the Sinitic branch of the Sino-Tibetan language family.
The Bai language, spoken by the Bai ethnic group, is considered to be a separate member of the Sino-Tibetan family.
The Sino-Tibetan family also has a Tibeto-Burman branch. Languages in this branch are spoken in Southeast Asia, as well as parts of South and East Asia. Its most representative languages are Tibetan and Burmese.
Within China, there are almost 20 ethnic groups that speak languages belonging to this branch. These groups include the Naxi, Yi, Lisu, Derung, Nu, Achang and Hani, among others.
A language lover’s paradise
China’s great linguistic diversity makes it a paradise for anyone passionate about language and linguistics. Studying China’s various languages provides a fascinating window into the country’s rich history, vast geography, and mind-boggling cultural and ethnic diversity.
Chinese Language Vocabulary
Hànzì | Pīnyīn | Definition |
---|---|---|
拉丁文 | Lādīngwén | Latin |
方言 | fāngyán | dialect |
语言 | yǔyán | language |
汉语方言 | Hànyǔ fāngyán | Chinese dialects |
普通话 | Pǔtōnghuà | Standard Mandarin (literally "common speech"); term used in mainland China |
国语 | Guóyǔ | Standard Mandarin; term used in Taiwan |
汉语 | Hànyǔ | Standard Mandarin (literally "Han language") |
汉族 | Hànzú | the Han ethnic group (China's largest ethnic group) |
闽语 | Mǐnyǔ | Min Chinese |
吴语 | Wúyǔ | Wu Chinese |
粤语 | Yuèyǔ | Cantonese (also called Yue Chinese) |
晋语 | Jìnyǔ | Jin Chinese |
赣语 | Gànyǔ | Gan Chinese |
客家话 | Kèjiāhuà | Hakka (Kejia) Chinese |
客家人 | Kèjiārén | the Hakka people (literally "guest families"); considered members of the Han majority |
湘语 | Xiāngyǔ | Xiang Chinese |
徽州话 | Huīzhōuhuà | Huizhou Chinese |
平话 | Pínghuà | Pinghua Chinese |
Anne Meredith holds an MA in International Politics and Chinese Studies from the Johns Hopkins School of Advanced International Studies (SAIS). As part of the graduation requirements for the program, Anne wrote and defended a 70-page Master's thesis entirely in 汉字 (hànzì; Chinese characters). Anne lives in Shanghai, China and is fluent in Chinese.