Terminal gate ceiling of Kolkata Airport, decorated with Bengali words.

The Bengali script or Bangla alphabet is the standard writing system used to write the Bengali language, and has historically been used to write Sanskrit within Bengal. An estimated 300 million people use this syllabic alphabet, which makes it the 5th most commonly used writing system in the world. It is the sole national script of Bangladesh and one of the official scripts of India, specifically used in the Indian states of West Bengal, Tripura and the Barak Valley of Assam. The script is also used for the Meitei language in Manipur, defined by the Manipur Official Language Act.

From a classificatory point of view, the Bengali writing system is derived from the Brahmi script. It is written from left to right. It is an abugida, i.e., its vowel graphemes are mainly realised not as independent letters, but as diacritics modifying the inherent vowel in the base letter to which they are added. There are no distinct upper and lower case letter forms, which makes it a unicameral script. The script is characterised by many conjuncts, upstrokes, downstrokes, and other features that hang from a horizontal line called matra (মাত্রা [ˈmat̪ɾaˑ] 'measure'), which runs along the tops of the graphemes and links them together. The punctuation is all borrowed from 19th-century English, with the exception of one mark, the daṛi (।).

Characters

The Bengali alphabet was first widely observed in the Anulia copper plate of Lakshmana Sena.

The Bengali script can be divided into vowels and vowel diacritics, consonants and conjunct consonants, diacritical and other symbols, digits, and punctuation marks. Vowels and consonants are used as letters and also as diacritical marks.

Vowels

The Bengali script has a total of 11 vowel graphemes, each of which is called a স্বরবর্ণ sbôrôbôrṇô ('vowel letter'). They represent six of the seven main vowel sounds of Bengali, along with two vowel diphthongs. All of them are used in both the Bengali and Assamese languages.

  • The letter অ ô /ɔ/ (স্বর অ sbôr ô 'vocalic ô') represents the default inherent vowel for the entire Bengali script. It is the first letter of the Bangla alphabet. Other Eastern Indo-Aryan languages – namely Assamese and Odia – have this value for the inherent vowel. This corresponds to /ə~ɐ~ʌ/ in other Indic languages using a Brahmi-derived script – all of which ultimately derive from the Sanskrit inherent vowel अ a /ɐ/.
  • Even though the near-open front unrounded vowel /æ/ is one of the seven main vowel sounds in standard Bengali, no distinct vowel symbol has been allotted for it in the script since there is no /æ/ sound in Sanskrit—the primary written language when the script was conceived. The use of অ্যা ê is widespread and is found to be used in many contexts of /æ/, such as in loanwords like অ্যাসিড 'acid' and অ্যানিমে 'anime'. The sound is also orthographically realised by multiple means in modern Bengali orthography, usually using some combination of এ e /e/ (স্বর এ sbôr e 'vocalic e'), অ ô, আ a /a/ (স্বর আ sbôr a 'vocalic a'), and the yôphôla (যফলা [ˈdʒɔfɔlaˑ]) (the diacritic form of the consonant grapheme য /dʒɔ/). Thus ক /k/ written with the vowel অ্যা /æ/ is written as ক্যা /kæ/.
  • There are two graphemes for the vowel sound /i/, and two graphemes for the vowel sound /u/. The redundancy stems from the time when this script was used to write Sanskrit, which had contrastive short and long vowels. Bengali lost phonemic long vowels and merged short and long /i/ and /u/: ই (হ্রস্ব ই hrôsbô i 'short i') and ঈ (দীর্ঘ ঈ dirghô i 'long i') are both pronounced /i/; and উ (হ্রস্ব উ 'short u') and ঊ (দীর্ঘ ঊ 'long u') are both pronounced /u/ in modern Bengali. The letters are preserved in the Bengali script with their traditional names, as they serve an etymological function in preserving the original Sanskrit spelling (e.g., in tatsama words).
  • The grapheme ঋ (হ্রস্ব ঋ 'short ri') does not actually represent a vowel phoneme in Bengali, but instead represents the consonant-vowel combination রি /ri/. Nevertheless, it is included in the vowel section of the inventory of the Bengali script. This inconsistency is also a remnant from Sanskrit, where the grapheme represents the vocalic equivalent of a retroflex approximant (possibly an r-colored vowel). A similar (and now obsolete) grapheme, ঌ (হ্রস্ব ঌ 'short li'), which represented the vocalic equivalent of a dental approximant in Sanskrit but actually represented the consonant-vowel combination লি /li/ in Bengali, was once included in the Bengali alphabet, but was eventually discarded from the inventory due to its extremely limited usage (even in Sanskrit).
  • When a vowel sound occurs syllable-initially or when it follows another vowel, it is written using a distinct letter. When a vowel sound follows a consonant (or a consonant cluster), it is written with a diacritic which, depending on the vowel, can appear above, below, before or after the consonant. These vowel marks cannot appear without a consonant and are called kar (কার 'vowel sign').
  • An exception to the above system is the vowel /ɔ/, which has no vowel mark but is considered inherent in every consonant letter. To denote the absence of the inherent vowel /ɔ/ following a consonant, a diacritic known as the virama (হসন্ত hôsôntô) may be written underneath the consonant.
  • Although there are only two diphthongs in the inventory of the script – ঐ oi /oi/ (স্বর ঐ sbôr oi) and ঔ ou /ou/ (স্বর ঔ sbôr ou) – the Bengali phonemic inventory has many more diphthongs. Most diphthongs are represented by juxtaposing the graphemes of their constituent vowels, as in কেউ keu [keu̯~keo̯].
  • There also used to be two long vowels – ৠ ṛṛi and ৡ ḷḷi – which were removed from the inventory during the Vidyasagarian reform of the script because they were peculiar to Sanskrit.
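The interplay of consonant letters, dependent vowel signs (kar) and the hôsôntô described above can be made concrete by looking at how such syllables are stored as Unicode text. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper name `describe` and the constant names are illustrative and not part of any standard.

```python
import unicodedata

KA = "\u0995"        # ক, consonant letter carrying the inherent vowel /ɔ/
SIGN_AA = "\u09BE"   # া, dependent form (kar) of আ
SIGN_I = "\u09BF"    # ি, dependent form (kar) of ই, drawn to the left of the consonant
HASANTA = "\u09CD"   # ্, suppresses the inherent vowel (Unicode calls it VIRAMA)
YA = "\u09AF"        # য, rendered as yôphôla when it follows a hôsôntô


def describe(text):
    """Return the Unicode character names that make up a string."""
    return [unicodedata.name(ch) for ch in text]


ki = KA + SIGN_I                     # কি: stored as consonant first, then vowel sign
kyae = KA + HASANTA + YA + SIGN_AA   # ক্যা /kæ/: consonant + yôphôla + a-kar

for syllable in (KA, ki, KA + HASANTA, kyae):
    print(syllable, describe(syllable))
```

Although the sign for i is drawn before its consonant, it is stored after it; the visual reordering is left to the font and the text-rendering software.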

The table below shows the vowels present in the modern (i.e., since the late 19th century) inventory of the Bengali abugida:

Bengali vowels(স্বরবর্ণ sbôrôbôrṇô)
হ্রস্ব (short)দীর্ঘ (long)
স্বর (vowel phoneme)কার (vowel mark)স্বর (vowel phoneme)কার (vowel mark)
কন্ঠ্য (Guttural)ô /ɔ~o/-a /ɐ/
তালব্য (Palatal)i /i/িī /i/
ওষ্ঠ্য (Labial)u /u/ū /u/
মূর্ধন্য (retroflex) /ri/
দন্ত্য (Dental) /li/
যুক্তস্বর (complex vowels)
কন্ঠ্যতালব্য (Palato-guttural)e /e/oi /oi/
কন্ঠৌষ্ঠ্য (Labioguttural)o /o~w/ou /ou/

The consonant ক along with the diacritic forms of the vowels আ, ই, ঈ, উ, ঊ, ঋ, এ, ঐ, ও and ঔ

Consonants

Consonant letters are called ব্যঞ্জনবর্ণ bênjônbôrṇô in Bengali. The names of the letters are typically just the consonant sound plus the inherent vowel অ ô. Since the inherent vowel is assumed and not written, most letters' names look identical to the letter itself (e.g., the name of the letter ঘ is itself ghô, not gh).

  • Some letters that have lost their distinctive pronunciation in modern Bengali are called by more elaborate names. For example, since the consonant phoneme /n/ is written as both ন and ণ, the letters are not called simply nô; instead, they are called দন্ত্য ন dôntyô nô ('dental nô') and মূর্ধন্য ণ murdhônyô nô ('retroflex nô'). What was once pronounced and written as a retroflex nasal ণ /ɳ/ is now pronounced as an alveolar /n/ (unless conjoined with another retroflex consonant such as ট, ঠ, ড and ঢ).
  • Although still named murdhônyô when taught, retroflex consonants do not exist in Bengali on the phonemic scale, and are instead fronted to their post-alveolar and alveolar equivalents.
  • The voiceless palato-alveolar sibilant phoneme /ʃ/ can be written as শ (তালব্য শ talôbyô śô 'palatal śô'), ষ (মূর্ধন্য ষ murdhônyô ṣô 'retroflex ṣô') or স (দন্ত্য স dôntyô sô 'dental sô'). These graphemes once represented the alveolo-palatal /ɕ/, the retroflex /ʂ/, and the alveolar /s/ in Sanskrit, respectively, but have all merged into the palato-alveolar /ʃ/ (or /ɕ/, depending on dialect) in modern standard Bengali.
  • The voiced palato-alveolar affricate /dʒ/ can be written in two ways: as য (অন্তঃস্থ য ôntôḥsthô yô 'semivocalic yô') or as জ (বর্গীয় জ bôrgiẏô jô 'plosive jô'). In many varieties of Bengali, /z,dz/ are not distinct from this phoneme, but speakers who distinguish them may use the letters য and জ contrastively.[citation needed]
  • Post-reform, the letter য় was introduced to distinguish it from য. The semivowel য় ẏô /e̯ɔ/ cannot occur at the beginning of a word. The name of য় is অন্তঃস্থ অ ôntôḥsthô ô 'semivocalic ẏ'. The pronunciation of য় ẏô /e̯ɔ/ varies between the glides /w/ and /j/. The name of য is অন্তঃস্থ য ôntôḥsthô yô ('semivocalic y'); its onset was originally pronounced as a glide /j/ but became an affricate /dʒ/ in modern Bengali. It is found almost entirely at the beginning of words.[citation needed] When present in the middle of words, in conjuncts, য is represented by a distinct form: ্য (যফলা yôphôla), which is mostly silent or semi-silent. The yôphôla may alter the pronunciation of the surrounding vowel, geminate the preceding consonant, or be completely silent.
  • Since the nasals ঞ ñô /e̯ɔ/ and ঙ ṅô /ŋɔ/ cannot occur at the beginning of a word in Bengali, their names are actually pronounced [ˈ(n)ĩjɔˑ] (i.e., (n)ĩẏô) and [ˈuŋɔˑ~ˈumɔˑ] (i.e., uṅô~umô), respectively.
  • There is a difference in the pronunciation of ড় ṛô and ঢ় ṛhô, similar to other Indic languages. This is especially true in the parlance of the western and southern parts of Bengal, but less so in the dialects east of the Padma River.[citation needed] ড় and ঢ় were introduced to the inventory during the Vidyasagarian reform to indicate the retroflex flap in the pronunciation of ড ḍô and ঢ ḍhô in the middle or at the end of a word. This flap is an allophonic development in some Indic languages that is not present in Sanskrit. In ordinary speech these letters are pronounced the same as র in modern Bengali.
Bengali consonants(ব্যঞ্জনবর্ণ bênjônbôrṇô)
স্পর্শ (Stop)অনুনাসিক (Nasal)অন্তঃস্থ (Approximant)ঊষ্ম (Fricative)
বর্গীয় বর্ণ (Generic sounds)
Voicingঅঘোষ (Voiceless)ঘোষ (Voiced)অঘোষ (Voiceless)ঘোষ (Voiced)
Aspirationঅল্পপ্রাণ (Unaspirated)মহাপ্রাণ (Aspirated)অল্পপ্রাণ (Unaspirated)মহাপ্রাণ (Aspirated)অল্পপ্রাণ (Unaspirated)মহাপ্রাণ (Aspirated)
কন্ঠ্য (Guttural) /kɔ/khô /ɔ/ /ɡɔ/ghô /ɡʱɔ/ṅô /ŋɔ/ /ɦɔ~hɔ/
তালব্য (Palatal) /ɔ~ɔ/chô /tʃʰɔ~tɕʰɔ/ /ɔ~ɔ/jhô /dʒʱɔ~dʑʱɔ/ñô /nɔ/ /ɔ~dzɔ~zɔ/śô /ʃɔ~ɕɔ~sɔ/
মূর্ধন্য (Retroflex)ṭô /ʈɔ/ṭhô /ʈʰɔ/ḍô /ɖɔ/ḍhô /ɖʱɔ/ṇô /nɔ/ /ɾɔ/ṣô /ɕɔ~ʃɔ~ʂɔ/
দন্ত্য (Dental) /ɔ/thô /t̪ʰɔ/ /ɔ/dhô /d̪ʱɔ/ /nɔ/ /lɔ/ /sɔ~ɕɔ~ʃɔ/
ওষ্ঠ্য (Labial) /pɔ/phô /pʰɔ/ /bɔ/bhô /bʱɔ/ /mɔ/ /wɔ/
Post-reform lettersড়ṛô /ɽɔ/ঢ়ṛhô /ɽʱɔ~ɽ/য়ẏô /jɔ~ɔ/

Notes

Consonant conjuncts

The consonant ligature ndrô (ন্দ্র): ন in green, দ in blue, and র in maroon.

Clusters of up to four consonants can be orthographically represented as a typographic ligature, called a consonant conjunct (Bengali: যুক্তাক্ষর/যুক্তবর্ণ yuktakṣôr/yuktôbôrṇô, or more precisely, যুক্তব্যঞ্জন yuktôbêñjôn). Typically, the first consonant in the conjunct is shown above or to the left of the following consonants. Many consonants appear in an abbreviated or compressed form when serving as part of a conjunct. Others simply take exceptional forms in conjuncts, bearing little or no resemblance to the base character.

Often, consonant conjuncts are not actually pronounced as would be implied by the pronunciation of the individual components. For example, adding ল underneath শ śô in Bengali creates the conjunct শ্ল, which is pronounced /slɔ/ (and not /ʃlɔ/) in Bengali. Many conjuncts represent Sanskrit sounds that were lost centuries before modern Bengali was ever spoken; for instance, জ্ঞ jñô, which is a combination of জ and ঞ ñô, is pronounced ggô /gːɔ/ in modern Bengali (which does not permit the sequence /*dʒɲ/). Thus, as conjuncts often represent combinations of sounds that cannot be easily understood from the components, the following descriptions are concerned only with the construction of the conjunct, and not the resulting pronunciation.
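In Unicode text, a conjunct is not stored as a single character: each consonant is kept as its own letter, joined to the next by the hôsôntô (U+09CD, named BENGALI SIGN VIRAMA in Unicode), and the font decides which ligature shape to draw. The sketch below illustrates this encoding model only, not the visual rules that follow; the function names `make_conjunct` and `split_conjunct` are hypothetical helpers.

```python
HASANTA = "\u09CD"  # ্, joins consonants into a conjunct in encoded text


def make_conjunct(*consonants):
    """Join consonant letters with hôsôntô so that a font can render one ligature."""
    return HASANTA.join(consonants)


def split_conjunct(conjunct):
    """Recover the component consonant letters of an encoded conjunct."""
    return conjunct.split(HASANTA)


# ন + দ + র gives ন্দ্র (ndrô): three letters and two joining signs
ndro = make_conjunct("\u09A8", "\u09A6", "\u09B0")
print(ndro, [f"U+{ord(ch):04X}" for ch in ndro])
print(split_conjunct(ndro))
```

Because the components stay recoverable, searching and sorting can work on the underlying letters even when the rendered conjunct bears little resemblance to them.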

Fused forms

Some consonants fuse in such a way that one stroke of the first consonant also serves as a stroke of the next.

  • The consonants can be placed on top of one another, sharing the same vertical line, e.g., ক্ক kkô, গ্ন gnô, গ্ল glô, ন্ন nnô, প্ন pnô, প্প ppô, ল্ল llô, etc.
  • As the last member of a conjunct, ব can hang on the vertical line under the preceding consonants, taking the shape of ব (including বফলা bôphôla), e.g. গ্ব gbô, ণ্ব ṇbô, দ্ব dbô, ল্ব lbô, শ্ব śbô.
  • The consonants can also be placed side-by-side, sharing their vertical line, e.g., দ্দ ddô, ন্দ ndô, ব্দ bdô, ব্জ bjô, প্ট pṭô, স্ট sṭô, শ্চ ścô, শ্ছ śchô, etc.

Approximated forms

Some consonants are written closer to one another simply to indicate that they are in a conjunct together.

  • The consonants can be placed side-by-side, appearing unaltered, e.g., দ্গ dgô, দ্ঘ dghô, ড্ড ḍḍô.
  • As the last member of a conjunct, ব can appear immediately to the right of the preceding consonant, taking the shape of ব (including বফলা bôphôla), e.g., ধ্ব dhbô, ব্ব bbô, হ্ব hbô.

Compressed forms

Some consonants are compressed (and often simplified) when appearing as the first member of a conjunct.

  • As the first member of a conjunct, the consonants ঙ ṅô, চ cô, ড ḍô, and ব bô are often compressed and placed at the top-left of the following consonant with little or no change to the basic shape, e.g., ঙ্ক্ষ ṅkṣô, ঙ্খ ṅkhô, ঙ্ঘ ṅghô, ঙ্ম ṅmô, চ্চ ccô, চ্ছ cchô, চ্ঞ cñô, ড্ঢ ḍḍhô, ব্‍ব bbô.
  • As the first member of a conjunct, ত is compressed and placed above the following consonant, with little or no change to the basic shape, e.g., ত্ন tnô, ত্ম tmô, ত্ব tbô.
  • As the first member of a conjunct, ম is compressed and simplified to a curved shape. It is placed above or to the top-left of the following consonant, e.g., ম্ন mnô, ম্প mpô, ম্ফ mphô, ম্ব mbô, ম্ভ mbhô, ম্ম mmô, ম্ল mlô.
  • As the first member of a conjunct, ষ ṣô is compressed and simplified to an oval shape with a diagonal stroke through it. It is placed to the top-left of the following consonants, e.g., ষ্ক ṣkô, ষ্ট ṣṭô, ষ্ঠ ṣṭhô, ষ্প ṣpô, ষ্ফ ṣphô, ষ্ম ṣmô.
  • As the first member of a conjunct, স is compressed and simplified to a ribbon shape. It is placed above or to the top-left of the following consonant, e.g., স্ক skô, স্খ skhô, স্ত stô, স্থ sthô, স্ন snô, স্প spô, স্ফ sphô, স্ব sbô, স্ম smô, স্ল slô.

Abbreviated forms

Some consonants are abbreviated when appearing in conjuncts and lose part of their basic shape.

  • As the first member of a conjunct, জ can lose its final down-stroke, e.g., জ্জ jjô, জ্ঞ jñô, জ্ব jbô.
  • As the first member of a conjunct, ঞ ñô can lose its bottom half, e.g., ঞ্চ ñcô, ঞ্ছ ñchô, ঞ্জ ñjô, ঞ্ঝ ñjhô.
  • As the last member of a conjunct, ঞ ñô can lose its left half (the এ part), e.g., জ্ঞ jñô.
  • As first members of a conjunct, ণ ṇô and প can lose their respective down-strokes, e.g., ণ্ঠ ṇṭhô, ণ্ড ṇḍô, প্ত ptô, প্স psô.
  • As first members of a conjunct, ত and ভ bhô can lose their final upward tails, e.g., ত্ত ttô, ত্থ tthô, ত্র trô, ভ্র bhrô.
  • As the last member of a conjunct, থ thô can lose its final upstroke, taking the form of হ instead, e.g., ন্থ nthô, স্থ sthô, ম্থ mthô.
  • As the last member of a conjunct, ম can lose its initial down-stroke, e.g., ক্ম kmô, গ্ম gmô, ঙ্ম ṅmô, ট্ম ṭmô, ণ্ম ṇmô, ত্ম tmô, দ্ম dmô, ন্ম nmô, ম্ম mmô, শ্ম śmô, ষ্ম ṣmô, স্ম smô.
  • As the last member of a conjunct, স can lose its top half, e.g., ক্স ksô.
  • As last members of a conjunct, ট ṭô, ড ḍô, and ঢ ḍhô can lose their respective matra, e.g., প্ট pṭô, ণ্ড ṇḍô, ণ্ট ṇṭô, ণ্ঢ ṇḍhô.
  • As the last member of a conjunct, ড ḍô can change its shape, e.g., ণ্ড ṇḍô.

Variant forms

Some consonants have forms that are used regularly but only within conjuncts.

  • As the first member of a conjunct, ঙ ṅô can appear as a loop and curl, e.g., ঙ্ক ṅkô, ঙ্গ ṅgô.
  • As the last member of a conjunct, the curled top of ধ dhô is replaced by a straight downstroke to the right, taking the form of ঝ jhô instead, e.g., গ্ধ gdhô, দ্ধ ddhô, ন্ধ ndhô, ব্ধ bdhô.
  • As the first member of a conjunct, র appears as a diagonal stroke (called রেফ reph) above the following member, e.g., র্ক rkô, র্খ rkhô, র্গ rgô, র্ঘ rghô, etc.
  • As the last member of a conjunct, র appears as a wavy horizontal line (called রফলা rôphôla) under the previous member, e.g., খ্র khrô, গ্র grô, ঘ্র ghrô, ব্র brô, etc. In some fonts, certain conjuncts with রফলা rôphôla appear using the compressed (and often simplified) form of the previous consonant, e.g., জ্র jrô, ট্র ṭrô, ঠ্র ṭhrô, ড্র ḍrô, ম্র mrô, স্র srô. In some fonts, certain conjuncts with রফলা rôphôla appear using the abbreviated form of the previous consonant, e.g., ক্র krô, ত্র trô, ভ্র bhrô.
  • As the last member of a conjunct, য appears as a wavy vertical line (called যফলা yôphôla) to the right of the previous member, e.g., ক্য kyô, খ্য khyô, গ্য gyô, ঘ্য ghyô, etc. In some fonts, certain conjuncts with যফলা yôphôla appear using special fused forms, e.g., দ্য dyô, ন্য nyô, শ্য śyô, ষ্য ṣyô, স্য syô, হ্য hyô.

Exceptions

  • When followed by র rô or ত tô, ক kô takes on the same form as ত would with the addition of a curl to the right, e.g., ক্র krô, ক্ত ktô.
  • When preceded by the abbreviated form of ঞ ñô, চ cô takes the shape of ব bô, e.g., ঞ্চ ñcô.
  • When preceded by another ট ṭô, ট is reduced to a leftward curl, e.g., ট্ট ṭṭô.
  • When preceded by ষ ṣô, ণ ṇô appears as two loops to the right, e.g., ষ্ণ ṣṇô.
  • As the first member of a conjunct, or when at the end of a word and followed by no vowel, ত tô can appear as ৎ, e.g., ৎস tsô, ৎপ tpô, ৎক tkô, etc.
  • When preceded by হ hô, ন nô appears as a curl to the right, e.g., হ্ন hnô.
  • Certain combinations must be memorised: ক্ষ (ক+ষ) kṣô, হ্ম (হ+ম) hmô.

Certain compounds

When serving as a vowel mark, উ u, ঊ u, and ঋ ri take on many exceptional forms.

  • উ u: When following গ gô or শ śô, it takes on a variant form resembling the final tail of ও o, e.g., গু gu, শু śu. When following a ত tô that is already part of a conjunct with প pô, ন nô or স sô, it is fused with the ত to resemble ও o, e.g., ন্তু ntu, স্তু stu, প্তু ptu. When following র rô, and in many fonts also following the variant রফলা rôphôla, it appears as an upward curl to the right of the preceding consonant as opposed to a downward loop below, e.g., রু ru, গ্রু gru, ত্রু tru, থ্রু thru, দ্রু dru, ধ্রু dhru, ব্রু bru, ভ্রু bhru, শ্রু śru. When following হ hô, it appears as an extra curl, e.g., হু hu.
  • ঊ u: When following র rô, and in many fonts also following the variant রফলা rôphôla, it appears as a downstroke to the right of the preceding consonant as opposed to a downward hook below, e.g., রূ ru, গ্রূ gru, থ্রূ thru, দ্রূ dru, ধ্রূ dhru, ভ্রূ bhru, শ্রূ śru.
  • ঋ ri: When following হ hô, it takes the variant shape of ঊ u, e.g., হৃ hri.
  • Conjuncts of three consonants also exist, and follow the same rules as above, e.g., স + ত + র = স্ত্র strô; ম + প + র = ম্প্র mprô; জ + জ + ব = জ্জ্ব jjbô; ক্ষ kṣô + ম = ক্ষ্ম kṣmô.
  • Theoretically, four-consonant conjuncts can also be created (as in র + স + ট ṭô + র = র্স্ট্র rsṭrô), but they are not found in native words.
  • Theoretically, five-letter conjuncts can be created, e.g., র + স + ট + র + ঁ = র্স্ট্রঁ rsṭrôñ. Here ঁ is a diacritic which nasalises the previous vowel.
  • Additionally, a theoretical six-letter conjunct would be র্স্ট্রাঁ rsṭrañ (আ a + র্স্ট্রঁ rsṭrôñ), and a theoretical seven-letter conjunct would be র্স্ট্র‍্যাঁ rsṭryañ (য + র্স্ট্রাঁ rsṭrañ).

Diacritics and other symbols

These are mainly the Brahmi-Sanskrit diacritics, phones and punctuation marks present in languages with Sanskrit influence or Brahmi-derived scripts.

সংশোধক বর্ণ sôngshodhôk bôrnô
Symbol/ GraphemesNameFunctionRomanisationIPA transcription
খণ্ড ত khôndô tôSpecial character. Final unaspirated dental [t̪]t/t̪/
অনুস্বার ônussharDiacritic. Final velar nasal [ŋ]/ŋ/
বিসর্গ bisôrgôDiacritic. 1. Doubles the next consonant sound without the vowel (spelling feature) in দুঃখ duḥkhô [ˈd̪uɦkʰoˑ]>[ˈd̪uʔkʰoˑ]>[ˈd̪uk̚kʰoˑ] 'sorrow' 2. Final -ḥ examples: এঃ eḥ, উঃ uḥ 3. Silent in spellings like আন্তঃনগর antôḥnôgôr [ˈant̪ɔɦˌnoɡɔɾ]>[ˈant̪ɔˌnoɡɔɾ] 'intercity' 4. Also used as an abbreviation, e.g., কিঃমিঃ (similar to 'km' in English, for the word কিলোমিটার 'kilometre'), ডাঃ (similar to 'Dr.' in English, for ডাক্তার 'doctor'). However, in modern Bengali, using বিসর্গ bisôrgô for making abbreviations is considered grammatically wrong and the full stop is used instead, e.g., কি.মি. 'km', ডা. 'Dr.'. /h/
‍ঁচন্দ্রবিন্দু côndrôbinduDiacritic. Vowel nasalisation◌̃ / ṃ/◌̃/
‍্হসন্ত hôsôntôDiacritic. Suppresses the inherent vowel [ɔ]
‍ঽঅবগ্রহ ôbôgrôhôSpecial character or sign. Used for prolonging vowel sounds E.g., শোনঽঽঽ… śônôôô… 'listennn…' (This is where the default inherited vowel sound ô in ন is prolonged.) E.g., কিঽঽঽ? kiii? 'whaaat?' (This is where the vowel sound i which is attached with the consonant ক is prolonged.)-
‍্যযফলা yôphôlaDiacritic. Used with two types of pronunciation in modern Bengali depending on the location of the consonant it is used with within a syllable. E.g., when the consonant it is used with is syllable-initial, it acts as the vowel /æ/, and thus, ত্যাগ is pronounced /t̪æɡ/. E.g., when the consonant with which it is used is syllable-final, it doubles the consonant, and thus, মুখ্য is pronounced /ˈmukʰːɔ/. Notably used in transliterating English words with /æ/, e.g. ব্ল্যাক 'black', and sometimes as a diacritic to indicate non-Bengali vowels of various kinds in transliterated foreign words, e.g. the schwa indicated by a yôphôla; the French u /y/ and the German umlaut ü /y~ʏ/ as উ্য uyô; the French eu /ø~œ/ and the German umlaut ö /ø~œ/ as ও্য oyô or এ্য eyô.ê / yô/æ/ or /ː/
‍‍্ররফলা rôphôlaDiacritic. [r] pronounced following a consonant phoneme.r/r/
‍‍র্◌রেফ rephDiacritic. [ɾ] pronounced preceding a consonant phoneme.r/r/
‍্ববফলা bôphôlaDiacritic. Used only in spellings adopted from Sanskrit; it has two different pronunciations depending on the location of the consonant with which it is used. E.g., when the consonant with which it is used is syllable-initial, it remains silent, and thus, স্বাধীন is pronounced /ˈʃad̪ʱin/ (and not /*ˈʃbad̪ʱin/ or /*ˈʃʋad̪ʱin/). E.g., when the consonant with which it is used is syllable-final, it doubles the consonant, and thus, বিদ্বান is pronounced /ˈbid̪ːan/ and বিশ্ব is pronounced /ˈbiʃːɔ/. However, certain Sanskrit sandhis (i.e., phonetic fusions) such as ঋগ্বেদ, দিগ্বিজয়, উদ্বেগ, and উদ্বৃত্ত are pronounced /ˈriɡbed̪/, /ˈd̪iɡbidʒɔe̯/, /ˈud̪beɡ/, and /ˈud̪brittɔ/, respectively, while usage with the consonant হ defies phonological rules, e.g., আহ্বান /ˈaɦban/>[ˈau̯bʱan], জিহ্বা /ˈdʒiɦba/>[ˈdʒiu̯bʱa]. Also used in transliterating Islam-related Arabic words. Note: Not all instances of ব used as the last member of a conjunct are bôphôla, e.g., in the words অম্বর ômbôr, লম্বা lômba, তিব্বত tibbôt, বাল্ব balb, etc.-/ː/
‍৺ঈশ্বর iśbôrSign. Represents the name of the deity and also written before the name of a deceased person.
আঞ্জী/সিদ্ধিরস্তু añji/siddhirôstuSign. Used at the beginning of texts as an invocation.

Notes

Digits and numerals

The Bengali script has ten numerical digits (graphemes or symbols indicating the numbers from 0 to 9). Bengali numerals have no horizontal headstroke or মাত্রা matra.

Bengali numerals
Hindu-Arabic numerals: 0 1 2 3 4 5 6 7 8 9
Bengali numerals: ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯

Numbers larger than 9 are written in Bengali using a positional base 10 numeral system (the decimal system). A period or dot is used to denote the decimal separator, which separates the integral and the fractional parts of a decimal number. When writing large numbers with many digits, commas are used as delimiters to group digits, indicating the thousand (হাজার hajar), the hundred thousand or lakh (লাখ lakh or লক্ষ lôkṣô), and the ten million or hundred lakh or crore (কোটি koṭi) units. I.e., leftwards from the decimal separator, the first grouping consists of three digits, and the subsequent groupings always consist of two digits.

For example, the English number 17,557,345 will be written in traditional Bengali as ১,৭৫,৫৭,৩৪৫.
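The grouping rule can be sketched in a few lines of Python: keep the last three digits as one group, then split the remaining digits into pairs from the right. The function names `group_indian` and `to_bengali_number` are illustrative, and the sketch assumes non-negative integers only (no sign or fractional part).

```python
# Map ASCII digits 0-9 to the Bengali digits U+09E6..U+09EF
BENGALI_DIGITS = str.maketrans(
    "0123456789", "".join(chr(0x09E6 + i) for i in range(10))
)


def group_indian(n):
    """Group digits as 3 on the right, then 2-2-2... moving leftwards."""
    digits = str(n)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while len(head) > 2:
        groups.insert(0, head[-2:])  # peel off pairs from the right
        head = head[:-2]
    if head:
        groups.insert(0, head)       # leftover one or two leading digits
    return ",".join(groups + [tail])


def to_bengali_number(n):
    """Format a non-negative integer with Bengali digits and traditional grouping."""
    return group_indian(n).translate(BENGALI_DIGITS)


print(group_indian(17557345))       # 1,75,57,345
print(to_bengali_number(17557345))  # ১,৭৫,৫৭,৩৪৫
```

Reading the groups from the right gives the units named above: 345, then 57 thousand, 75 lakh, and 1 crore.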

Punctuation marks

Bengali punctuation marks, apart from the downstroke দাড়ি daṛi (।), the Bengali equivalent of a full stop, have been adopted from western scripts and their usage is similar: Commas, semicolons, colons, quotation marks, etc. are the same as in English. Capital letters are absent in the Bengali script and proper names are unmarked.
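Because the daṛi is the sentence terminator, a rough sentence splitter for Bengali text can simply break on it. The sketch below is deliberately naive: it ignores question marks, exclamation marks and abbreviations, and the name `split_sentences` is a hypothetical helper. In Unicode the daṛi is encoded as U+0964, a code point shared with other Indic scripts.

```python
DARI = "\u0964"  # ।, the daṛi


def split_sentences(text):
    """Split text on the daṛi, keeping the mark at the end of each sentence."""
    return [part.strip() + DARI for part in text.split(DARI) if part.strip()]


# Article 1 of the Universal Declaration of Human Rights
sample = (
    f"সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে{DARI} "
    f"তাঁদের বিবেক এবং বুদ্ধি আছে; সুতরাং সকলেরই একে অপরের প্রতি "
    f"ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিত{DARI}"
)

for sentence in split_sentences(sample):
    print(sentence)
```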

An apostrophe, known in Bengali as ঊর্ধ্বকমা urdhbôkôma 'upper comma', is sometimes used to distinguish between homographs, e.g., পাটা paṭa 'plank', পাʼটা pa'ṭa 'the leg'. Alternatively a hyphen is used for the same purpose, e.g., পা-টা pa-ṭa.

Characteristics of the Bengali text

An example of handwritten Bengali script. Part of a poem written by Nobel Laureate Rabindranath Tagore in 1926 in Hungary.

Bengali text is written and read horizontally, from left to right. The consonant graphemes and the full form of vowel graphemes fit into an imaginary rectangle of uniform size (uniform width and height). The size of a consonant conjunct, regardless of its complexity, is deliberately maintained the same as that of a single consonant grapheme, so that diacritic vowel forms can be attached to it without any distortion. In a typical Bengali text, orthographic words, words as they are written, can be seen as being separated from each other by an even spacing. Graphemes within a word are also evenly spaced, but that spacing is much narrower than the spacing between words.

Unlike in purely alphabetic scripts – like Latin, Greek, and Cyrillic – for which the letter-forms stand on an invisible baseline, the Bengali letter-forms instead hang from a visible horizontal left-to-right headstroke called মাত্রা matra. The presence and absence of this matra can be important. For example, the letter ত and the numeral ৩ (3) are distinguishable only by the presence or absence of the matra, as is the case between the consonant cluster ত্র trô and the independent vowel এ e. The letter-forms also employ the concepts of letter-width and letter-height (the vertical space between the visible matra and an invisible baseline).
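The letter ত and the numeral ৩ look alike apart from the matra, but in encoded text they are unrelated characters. A short Python check with the standard `unicodedata` module makes the distinction explicit; the helper `is_bengali_digit` is an illustrative name, not a library function.

```python
import unicodedata

TA = "\u09A4"     # ত, consonant letter (drawn with a matra)
THREE = "\u09E9"  # ৩, digit three (drawn without a matra)

for ch in (TA, THREE):
    # Print the glyph, its code point, its Unicode name and general category
    print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


def is_bengali_digit(ch):
    """True for the ten Bengali digits U+09E6..U+09EF."""
    return "\u09E6" <= ch <= "\u09EF"


print(is_bengali_digit(TA), is_bengali_digit(THREE))
print(unicodedata.decimal(THREE))  # numeric value of the digit
```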

GraphemePercentage
11.32
8.96
7.01
6.63
4.44
4.15
4.14
3.83
2.78

According to the Bengali linguist Munier Chowdhury, about nine graphemes are the most frequent in Bengali texts; each is shown with its percentage of appearance in the adjacent table.

Comparison of Bengali script with ancestral and related scripts

Bangarh inscription of Mahipala I, among the earliest inscriptions in Proto-Bengali or Gaudi script

Vowels

aāiīuūeaioau
Bengali
Odia
Devanagari
Siddham

Consonants

kkhgghcchjjhñṭhḍhtthddhnpphbbhmẏ,yrl,ḷwśshkṣ
Bengaliয,য়ল,ল়ওয়,ৱক্ষজ্ঞ
Odiaଯ,ୟଲ,ଳୱ,ଵକ୍ଷଜ୍ଞ
Devanagariल,ळक्षज्ञ
Siddham

Vowel diacritics

Romanisation: kā ki kī ku kū kṛ kṝ kḷ kḹ ke kai ko kau
Bengali: কা কি কী কু কূ কৃ কৄ কৢ কৣ কে কৈ কো কৌ
Odia: କା କି କୀ କୁ କୂ କୃ କୄ କୢ କୣ କେ କୈ କୋ କୌ
Devanagari: का कि की कु कू कृ कॄ कॢ कॣ के कै को कौ

Standardisation

In the Bengali abugida, clusters of consonants are represented by different and sometimes quite irregular forms; thus, learning to read is complicated by the sheer size of the full set of letters and letter combinations, numbering about 350. Ishwar Chandra Vidyasagar introduced punctuation marks into the Bengali language and wrote a book named Barnaparichay to standardise the Bengali alphabet. While efforts at standardising the alphabet for the Bengali language continue in such notable centres as the Bangla Academy at Dhaka (Bangladesh) and the Pôshchimbônggô Bangla Akademi at Kolkata (West Bengal, India), the script is still not quite uniform, as many people continue to use various archaic forms of letters, resulting in concurrent forms for the same sounds.

Romanisation

Romanisation of Bengali is the representation of the Bengali language in the Latin script. Various romanisation systems for Bengali have been created in recent years, but they have failed to represent the true Bengali phonetic sound. While different standards for romanisation have been proposed for Bengali, they have not been adopted with the degree of uniformity seen in languages such as Japanese or Sanskrit. The Bengali alphabet has often been included with the group of Brahmic scripts for romanisation, in which the true phonetic value of Bengali is never represented. Some of these systems are the International Alphabet of Sanskrit Transliteration or "IAST system", "Indian languages Transliteration" or ITRANS (which uses upper-case letters suited for ASCII keyboards), and the extension of IAST intended for non-Sanskrit languages of the Indian region called the National Library at Kolkata romanisation.

Sample texts

Article 1 of the Universal Declaration of Human Rights

সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে। তাঁদের বিবেক এবং বুদ্ধি আছে; সুতরাং সকলেরই একে অপরের প্রতি ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিত।

Sômôstô manuṣ sbadhinbhabe sôman môrjada ebôṅ ôdhikar niẏe jônmôgrôhôṇ kôre. Tãder bibek ebôṅ buddhi ache; sutôraṅ sôkôleri êke ôpôrer prôti bhratṛtbôsulôbh mônobhab niẏe acôrôṇ kôra ucit.

[ˈʃɔmost̪oˑ ˈmanuʃ ˈʃad̪ʱinˌbʱabeˑ ˈʃoman ˈmɔɾdʒad̪aˑ ˈeboŋ ˈod̪ʱikaɾ ˈnie̯eˑ ˈdʒɔnmoˌɡɾoɦon ˈkɔɾeˑ‖ ˈt̪ãd̪eɾ ˈbibek ˈeboŋ ˈbud̪ːʱiˑ ˈatʃʰeˑ‖ ˈʃut̪oɾaŋ ˈʃɔkoˌleɾiˑ ˈækeˑ ˈɔpoɾeɾ ˈpɾot̪iˑ ˈbʱɾat̪ɾiˌt̪ːoʃulɔbʱ ˈmonobʱab ˈnie̯eˑ ˈatʃoɾɔn ˈkɔɾaˑ ˈutʃit̪‖]

All human free-manner-in equal dignity and right taken birth-take do. Their reason and intelligence exist; therefore everyone-indeed one another's towards brotherhood-ly attitude taken conduct do should.

All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience. Therefore, they should act towards one another in a spirit of brotherhood.

Unicode

Bengali script was added to the Unicode Standard in October 1991 with the release of version 1.0.

The Unicode block for Bengali is U+0980–U+09FF:
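The contents of the block can be inspected programmatically. The sketch below walks the range U+0980–U+09FF with Python's standard `unicodedata` module and keeps the code points that have a name; the function name `bengali_block` is illustrative. The number of assigned code points it reports depends on the Unicode version bundled with the Python build in use, so no fixed count is assumed here.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x0980, 0x09FF


def bengali_block():
    """Map each assigned code point in the Bengali block to its Unicode name."""
    assigned = {}
    for cp in range(BLOCK_START, BLOCK_END + 1):
        name = unicodedata.name(chr(cp), None)  # None for unassigned code points
        if name is not None:
            assigned[cp] = name
    return assigned


assigned = bengali_block()
print(len(assigned), "assigned code points known to this Python build")
for cp in (0x0985, 0x0995, 0x09CD, 0x09E6):
    print(f"U+{cp:04X}", chr(cp), assigned[cp])
```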

Bengali code chart (PDF): rows U+098x to U+09Fx, columns 0 to F. As of Unicode version 17.0; grey areas in the chart indicate non-assigned code points.

Bibliography

  • Ashraf, Syed Ali; Ashraf, Asia (1966), "Bengali Diphthongs", in Dil A. S. (ed.), Shahidullah Presentation Volume, Lahore: Linguistic Research Group of Pakistan, pp. 47–52
  • Chatterji, Suniti Kumar (1939), Vasha-prakash Bangala Vyakaran (A Grammar of the Bengali Language), Kolkata: University of Rabindra Bharaty (RBUDDE)
  • Chowdhury, Munier (1963), "Shahitto, shônkhatôtto o bhashatôtto (Literature, statistics and linguistics)", Bangla Academy Potrika, 6 (4), Dhaka: 65–76
  • Kostic, Djordje; Das, Rhea S. (1972), A Short Outline of Bengali Phonetics, Calcutta: Statistical Publishing Company
  • Hai, Muhammad Abdul (1964), Dhvani Vijnan O Bangla Dhvani-tattwa (Phonetics and Bengali Phonology), Dhaka: Bangla Academy
  • D.R, Master (2010), Asiatick Researches, Calcutta: Asiatick Society, archived on 11 October 2021
  • Salomon, Richard (1998). Indian Epigraphy: A Guide to the Study of Inscriptions in Sanskrit, Prakrit, and the Other Indo-Aryan Languages. New York: Oxford University Press.
  • Sarkar, Pabitra (1987), "Bangla Dishôrodhoni (Bengali Diphthongs)", Bhasha, 4–5, Calcutta: 10–12