Schema for the Transliteration of Sanskrit and Pāḷi

by
Ānandajoti Bhikkhu

 


 

A number of transliteration schemes have been developed to represent Sanskrit and Pāḷi in the Roman alphabet. It is neither possible nor desirable to show every scheme that has been in use, so, for the benefit of students, I will show here some of those which are standard and frequently encountered.

I will show the letters in the traditional Indian order, concentrating on Pāḷi, with the extra letters needed for Sanskrit shown in red.

None of the schemes is without fault, however. Although Pāḷi has only 41 letters, there are 43 sounds in use: e and o are normally long vowels, but before a double consonant they become short. Their weight is then known only by position, because the schemes have no sign to indicate it; and as these sounds are occasionally also short when out of position in the verse texts, we have no way to indicate this properly. In my own works I have indicated it with the breve sign: ĕ, ŏ. Really it would have been better to employ different signs in the first place, and to have had: short e, long ē, short o, long ō.
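The positional rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the consonant inventory is simplified (lowercase letters only, with no special treatment of ḷh or of the pure nasal), and the breve is used here to mark the cases that are short by position, simply to make the result visible. It cannot detect vowels that are short out of position.

```python
import unicodedata

# Consonants that can take a following h to form a single aspirate (kh, gh, ...).
ASPIRABLE = set("kgcjtdpb") | {"\u1e6d", "\u1e0d"}  # plus ṭ, ḍ
# Simplified consonant inventory (an assumption, not a full phonology).
CONSONANTS = ASPIRABLE | set("nmyrlvsh") | {"\u1e45", "\u00f1", "\u1e47", "\u1e37"}  # ṅ ñ ṇ ḷ
BREVE = {"e": "\u0115", "o": "\u014f"}  # ĕ, ŏ


def consonant_units_after(text, i):
    """Count the consonants directly after index i, treating an aspirate as one."""
    count = 0
    j = i + 1
    while j < len(text) and text[j] in CONSONANTS:
        if text[j] in ASPIRABLE and j + 1 < len(text) and text[j + 1] == "h":
            j += 1  # skip the h of an aspirate digraph
        count += 1
        j += 1
    return count


def mark_short_by_position(text):
    """Replace e/o with ĕ/ŏ when two or more consonants follow in the same word."""
    text = unicodedata.normalize("NFC", text)
    out = []
    for i, ch in enumerate(text):
        if ch in BREVE and consonant_units_after(text, i) >= 2:
            out.append(BREVE[ch])
        else:
            out.append(ch)
    return "".join(out)


print(mark_short_by_position("mett\u0101"))         # e before tt
print(mark_short_by_position("yogakkhema\u1e41"))   # e and o before single consonants
```

Counting kh, th and the like as single consonants matters: in a word such as etha the e is followed by one aspirate, not by a double consonant, so it is left unmarked.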

I have given tables in large letters for ease of identification, and followed them with a sample text drawn from the Dhammapada. Although the sample doesn’t cover all the letters in use, it gives an indication of how each transliteration scheme looks. It is presented here in Unicode.

Pāḷi (Dhp 23):

te jhāyino sātatikā, niccaṁ daḷhaparakkamā,
phusanti dhīrā nibbānaṁ, yogakkhemaṁ anuttaraṁ.

Uṭṭhānavato satīmato, sucikammassa nisammakārino,
saññatassa ca Dhammajīvino, appamattassa yasobhivaḍḍhati.

Sanskrit (Udānavarga 4.3):

apramattāḥ sātatikā nityaṁ dr̥ḍhaparākramāḥ,
spr̥śanti dhīrā nirvāṇaṁ yogakṣemam anuttaram.

Those who meditate all the time, constant and firm in their effort,
those wise ones reach nibbāna, the supreme release from (all) bonds.

For he who is active, mindful, pure in deeds, considerate,
self-controlled, living by Dhamma, heedful, fame greatly increases.


Unicode and ISO 15919

In my judgement this is the best of the schemes that have been developed, and the one which I use in my publications.

One problem with it is that Unicode, being an encoding scheme, has shown a way to encode r̥ and l̥ (by using a combining ring under r and l) to indicate the extra vowel sounds found in Sanskrit, but has failed to provide any pre-composed code slot for them, at least up until the latest version, Unicode 9 (June 2016).

This means the pre-composed characters, which are necessary if the representation is to appear neat, can only be placed in the Private Use Area (PUA), which will then differ in the various fonts.
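The difference between a pre-composed character and a combining sequence can be checked with Python's standard unicodedata module. The following is a minimal sketch (the variable names are only illustrative): under NFC normalisation, r with a dot below collapses into a single code point, while r with a ring below stays as two, because no pre-composed slot exists for it.

```python
import unicodedata

# r + U+0323 COMBINING DOT BELOW: a pre-composed code point exists.
dot_below = unicodedata.normalize("NFC", "r\u0323")
# r + U+0325 COMBINING RING BELOW: no pre-composed code point exists.
ring_below = unicodedata.normalize("NFC", "r\u0325")

for label, text in (("dot below", dot_below), ("ring below", ring_below)):
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), names)
```

The same check can be made for l, and for the long vowels by adding U+0304 COMBINING MACRON after the ring.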

vowels

a, ā, i, ī, u, ū, e, ai, o, au, r̥, r̥̄, l̥, l̥̄

gutturals

k, kh, g, gh, ṅ

palatals

c, ch, j, jh, ñ

cerebrals

ṭ, ṭh, ḍ, ḍh, ṇ

dentals

t, th, d, dh, n

labials

p, ph, b, bh, m

semi-vowel

y, r, l, ḷ, v

sibilants

s, ś, ṣ

pure nasal

ṁ

visarga

ḥ

Pāḷi:

te jhāyino sātatikā, niccaṁ daḷhaparakkamā,
phusanti dhīrā nibbānaṁ, yogakkhemaṁ anuttaraṁ.

Uṭṭhānavato satīmato, sucikammassa nisammakārino,
saññatassa ca Dhammajīvino, appamattassa yasobhivaḍḍhati.

Sanskrit:

apramattāḥ sātatikā nityaṁ dr̥ḍhaparākramāḥ,
spr̥śanti dhīrā nirvāṇaṁ yogakṣemam anuttaram.


IAST

The International Alphabet of Sanskrit Transliteration is probably the most widely used scheme, both historically and at present. The Pali Text Society (PTS) do not use a standardised transliteration scheme, but this is the one that has been used most widely in their publications. (Note that in the Pali-English Dictionary guttural ṅ is replaced by simple n, because it only ever occurs before k and g, and so its sound can be known by position.)

The differences from Unicode and ISO are that the final four Sanskrit vowel sounds are shown with a dot under them instead of a ring under, and that the pure nasal has a dot under rather than a dot over. All the other signs are the same as those listed above. The advantage over Unicode and ISO is that all these characters have been given code slots and can therefore use pre-composed characters without any further trouble.

However, we do have further problems, and one of them is quite serious: the same character ḷ (dot under l) represents two distinct letters, the vowel ḷ used in Sanskrit and the semi-vowel ḷ used in Pāḷi. Although context would normally enable us to distinguish them, it is a real failing to use the same character for two very different sounds. Another problem is that the dot-under character normally represents a cerebral sound, as in ṭa, ḍa, etc., but neither ḷ nor ṃ is a cerebral.
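The ambiguity of ḷ shows up as soon as one tries to convert IAST into the ISO scheme mechanically. Below is a minimal Python sketch, not a complete converter: the names iast_to_iso and vocalic_l are hypothetical, only lowercase letters are handled, and the caller has to say whether ḷ stands for the Sanskrit vowel, since the text itself cannot tell us.

```python
import unicodedata

# IAST sign -> ISO 15919 sign, for the lowercase letters that differ.
IAST_TO_ISO = {
    "\u1e5b": "r\u0325",        # ṛ -> r + combining ring below
    "\u1e5d": "r\u0325\u0304",  # ṝ -> r + ring below + macron
    "\u1e39": "l\u0325\u0304",  # ḹ -> l + ring below + macron
    "\u1e43": "\u1e41",         # ṃ -> ṁ
}
L_DOT_BELOW = "\u1e37"  # ḷ: vowel in Sanskrit, semi-vowel in Pāḷi


def iast_to_iso(text, vocalic_l=False):
    """Convert IAST to ISO 15919; vocalic_l=True reads ḷ as the Sanskrit vowel."""
    # NFC makes sure each accented letter is a single pre-composed code point.
    text = unicodedata.normalize("NFC", text)
    table = dict(IAST_TO_ISO)
    if vocalic_l:
        table[L_DOT_BELOW] = "l\u0325"  # l + combining ring below
    return "".join(table.get(ch, ch) for ch in text)


print(iast_to_iso("nitya\u1e43 d\u1e5b\u1e0dhapar\u0101kram\u0101\u1e25"))
print(iast_to_iso("da\u1e37haparakkam\u0101"))  # Pāḷi: ḷ is left untouched
```

With the default setting ḷ is left as it is, which suits Pāḷi; for Sanskrit text the flag has to be set by hand, which is exactly the weakness described above.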

vowels

a, ā, i, ī, u, ū, e, ai, o, au, ṛ, ṝ, ḷ, ḹ

gutturals

k, kh, g, gh, ṅ

palatals

c, ch, j, jh, ñ

cerebrals

ṭ, ṭh, ḍ, ḍh, ṇ

dentals

t, th, d, dh, n

labials

p, ph, b, bh, m

semi-vowel

y, r, l, ḷ, v

sibilants

s, ś, ṣ

pure nasal

ṃ

visarga

ḥ

Pāḷi:

te jhāyino sātatikā, niccaṃ daḷhaparakkamā,
phusanti dhīrā nibbānaṃ, yogakkhemaṃ anuttaraṃ.

Uṭṭhānavato satīmato, sucikammassa nisammakārino,
saññatassa ca Dhammajīvino, appamattassa yasobhivaḍḍhati.

Sanskrit:

apramattāḥ sātatikā nityaṃ dṛḍhaparākramāḥ,
spṛśanti dhīrā nirvāṇaṃ yogakṣemam anuttaram.


SBE

The scheme used in the Sacred Books of the East, a very influential set of books published in the late 19th and early 20th centuries, used a very different way of distinguishing the characters for which we now use diacritics.

In the vowels, the circumflex and accent characters used in French (and other European languages) were used to indicate length; and in the consonants, instead of using diacritics of any sort, various letters are shown italicised.

This has led to enormous confusion when the italicisation has been lost in transcribing for the internet, so we now regularly see Gâtaka written when Jātaka (in Unicode) is intended. Similarly: Kulla (Culla), Tîkâ (Ṭīkā), Gîva (Jīva), etc.

vowels

a, â, i, î, u, û, e, ai, o, au, ri, rí, li, lí

gutturals

k, kh, g, gh, ng

palatals

k, kh, g, gh, ñ

cerebrals

t, th, d, dh, n

dentals

t, th, d, dh, n

labials

p, ph, b, bh, m

semi-vowel

y, r, l, l, v

sibilants

s, s, sh

pure nasal

m

visarga

h

Pāḷi:

te ghâyino sâtatikâ, niccam dalhaparakkamâ,
phusanti dhîrâ nibbânam, yogakkhemam anuttaram.

utthânavato satîmato, sucikammassa nisammakârino,
saññatassa ca dhammajîvino, appamattassa yasobhivaddhati.

Sanskrit:

apramattâh sâtatikâ nityam dridhaparâkramâh,
sprisanti dhîrâ nirvânam yogakshemam anuttaram.


ITRANS

This scheme is an extension of the Harvard-Kyoto scheme, a way of transliterating that uses only the characters found in the ASCII encoding, which are available in nearly all Latin fonts.

One of the main problems with this scheme is that it looks so confusing, with a mixture of devices used to distinguish letters: doubling, capitalisation, and dots and tildes placed before the character.

vowels

a, aa, i, ii, u, uu, e, ai, o, au, Ri, RRi, Li, LLi

gutturals

k, kh, g, gh, ~N

palatals

c, ch, j, jh, ~n

cerebrals

.t, .th, .d, .dh, .n

dentals

t, th, d, dh, n

labials

p, ph, b, bh, m

semi-vowel

y, r, l, .l, v

sibilants

s, sh, Sh

pure nasal

.m

visarga

H

Paa.li:

te jhaayino saatatikaa, nicca.m da.lhaparakkamaa,
phusanti dhiiraa nibbaana.m, yogakkhema.m anuttara.m.

u.t.thaanavato satiimato, sucikammassa nisammakaarino,
sa~n~natassa ca dhammajiivino, appamattassa yasobhiva.d.dhati.

Sanskrit:

apramattaaH saatatikaa nitya.m dRi.dhaparaakramaaH,
spRishanti dhiiraa nirvaa.na.m yogakShemam anuttaram.


Velthuis

This standard was again developed so that all the characters could be written using ASCII alone, without the help of diacritical marks. Although it is cleaner to look at than the ITRANS scheme above, it is still quite confusing; nevertheless many webpages, especially older ones, employ this scheme.

vowels

a, aa, i, ii, u, uu, e, ai, o, au, .r, .rr, .l, .ll

gutturals

k, kh, g, gh, "n

palatals

c, ch, j, jh, ~n

cerebrals

.t, .th, .d, .dh, .n

dentals

t, th, d, dh, n

labials

p, ph, b, bh, m

semi-vowel

y, r, l, .l, v

sibilants

s, "s, .s

pure nasal

.m

visarga

.h

Paa.li:

te jhaayino saatatikaa, nicca.m da.lhaparakkamaa,
phusanti dhiiraa nibbaana.m, yogakkhema.m anuttara.m.

u.t.thaanavato satiimato, sucikammassa nisammakaarino,
sa~n~natassa ca dhammajiivino, appamattassa yasobhiva.d.dhati.

Sanskrit:

apramattaa.h saatatikaa nitya.m d.r.dhaparaakramaa.h,
sp.r"santi dhiiraa nirvaa.na.m yogak.semam anuttaram.
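
Because the Velthuis signs are all plain ASCII pairs, text written in this scheme can be turned back into Unicode with a simple lookup. The following Python sketch is offered only as an illustration: the name velthuis_to_unicode is hypothetical, the table is taken from the lists above and covers only the letters needed for Pāḷi plus the Sanskrit sibilants and visarga, and .l is read as the Pāḷi semi-vowel. The Sanskrit vowels .r, .rr and .ll are left out.

```python
import re

# Velthuis sign -> Unicode letter (pure nasal with dot over, as in ISO 15919).
VELTHUIS_TO_UNICODE = {
    "aa": "\u0101",   # ā
    "ii": "\u012b",   # ī
    "uu": "\u016b",   # ū
    '"n': "\u1e45",   # ṅ
    "~n": "\u00f1",   # ñ
    ".t": "\u1e6d",   # ṭ
    ".d": "\u1e0d",   # ḍ
    ".n": "\u1e47",   # ṇ
    ".l": "\u1e37",   # ḷ (semi-vowel; an assumption, see above)
    '"s': "\u015b",   # ś
    ".s": "\u1e63",   # ṣ
    ".m": "\u1e41",   # ṁ
    ".h": "\u1e25",   # ḥ
}

# Longest signs first, so that a longer sign would win over a shorter prefix.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(VELTHUIS_TO_UNICODE, key=len, reverse=True))
)


def velthuis_to_unicode(text):
    """Replace every Velthuis sign in text with its Unicode letter."""
    return _PATTERN.sub(lambda m: VELTHUIS_TO_UNICODE[m.group(0)], text)


print(velthuis_to_unicode("te jhaayino saatatikaa, nicca.m da.lhaparakkamaa,"))
print(velthuis_to_unicode("sa~n~natassa ca dhammajiivino, appamattassa yasobhiva.d.dhati."))
```

A full stop at the end of a line is left alone, because it is not followed by one of the letters that would make it part of a sign. Two separate a sounds meeting at a word junction would, however, be read as a long ā, so the result still needs to be checked by eye.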