Somali Orthography and Basic Morphophonology

The standard orthography for Standard Somali was established in 1972. In practice, however, spelling is not entirely standardized.

The standard orthography uses the following symbols:


22 consonant symbols:

           labial  labio-  dental alveolar  retroflex  palatal  velar  uvular  pharyngeal  glottal
stops         b              d                 dh                g      q                   '
stops                        t                                   k
nasals        m                      n
fricatives           f               s                   sh             kh          x        h
fricatives                                                                          c
affricates                                                j
trills                                r
laterals                              l
glides       w*                                           y

* /w/ is, strictly speaking, labiovelar.

Perhaps /c/ and /j/ should really be considered as members of the voiced stop series?

In the south, /dh/ is pronounced [r] except when word-initial or geminate, and it is then often written 'r' as well.

The members of the "voiced stop" series are apparently often voiceless, especially /b/, /j/ and /q/, which have no voiceless counterparts to contrast with.

/kh/ may be velar or uvular, and is found only in Arabic loan words.

/b/, /d/, /dh/, /g/, /q/, /l/, /m/, /n/ and /r/ may be geminate.


The orthography distinguishes five vowels, each of which may occur long or short:

         i ii                   u uu
              e ee       o oo
                   a aa

Each of these ten vowels also has a front and a back variant. This distinction is not marked in the orthography. It is harmonic over a domain of variable size, apparently larger than the phonological word.
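A rough sketch of how the symbol inventory above could be used to segment a written word. The names `SYMBOL`, `CONSONANTS`, `tokenize` and `is_long_vowel` are made up for this illustration, and the sketch assumes that "dh", "sh" and "kh" are always single symbols and that a doubled vowel letter is always one long vowel.

```python
import re

# Longest symbols first, so "dh", "sh", "kh" and long vowels win over their
# parts. Assumption: these letter pairs are always single symbols.
SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")

# The 22 consonant symbols of the chart.
CONSONANTS = {"b", "d", "dh", "g", "q", "'", "t", "k", "m", "n", "f",
              "s", "sh", "kh", "x", "h", "c", "j", "r", "l", "w", "y"}

def tokenize(word):
    """Split an orthographic word into consonant and vowel symbols."""
    word = word.lower()
    tokens = SYMBOL.findall(word)
    if "".join(tokens) != word:
        raise ValueError(f"unexpected character in {word!r}")
    return tokens

def is_long_vowel(token):
    """True for aa, ee, ii, oo, uu."""
    return len(token) == 2 and token[0] in "aeiou"

print(tokenize("gabadh"))    # ['g', 'a', 'b', 'a', 'dh']
print(tokenize("xoolihii"))  # ['x', 'oo', 'l', 'i', 'h', 'ii']
```

Since the front/back feature and the tonal accent are not written, nothing in this segmentation can recover them.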

The following diphthongs occur:

     ay aay   aw aaw   ey eey   oy ooy   ow oow

The length of diphthongs does not seem to be marked very carefully in writing, if at all. Rhymes of the form /VG/ sometimes count as short, sometimes as long in metered verse; whether this reflects an underlying distinction not made in writing is unclear to me.


Somali has a tonal accent, which may be treated as morphophonemically predictable (though alternatively one might treat some of the morphological classes as distinguished in the lexicon by underlying tone). The accent is not marked in the orthography.

Syllable structure

The syllable onset is maximally a single consonant. Onsetless syllables occur, but only word-initially.

Rhyme types include V, VG, VV, VVG, VC, VGC, VVC. Whether VVGC is a possible rhyme type is unclear.
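The onset and rhyme statements above are enough for a simple syllabifier: any consonant directly before a vowel is an onset, and everything else after the vowel belongs to the rhyme. The sketch below is only an illustration; `syllabify` and `ATTESTED_RHYMES` are names invented here, and 'y' and 'w' are counted as G whenever they are not onsets.

```python
import re

SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")
VOWEL = re.compile(r"[aeiou]{1,2}")
ATTESTED_RHYMES = {"V", "VG", "VV", "VVG", "VC", "VGC", "VVC"}

def syllabify(word):
    """Return a list of (syllable, rhyme_type) pairs for an orthographic word."""
    tokens = SYMBOL.findall(word.lower())
    is_v = [bool(VOWEL.fullmatch(t)) for t in tokens]
    syllables = []
    for i, tok in enumerate(tokens):
        if is_v[i]:
            if i > 0 and is_v[i - 1]:
                raise ValueError("onsetless syllable inside the word")
            if i == 0:
                syllables.append([])      # onsetless, word-initial only
        elif i + 1 < len(tokens) and is_v[i + 1]:
            syllables.append([])          # a consonant before a vowel is an onset
        elif not syllables:
            raise ValueError("onset of more than one consonant")
        syllables[-1].append(tok)
    result = []
    for syl in syllables:
        rhyme = syl if VOWEL.fullmatch(syl[0]) else syl[1:]
        shape = "".join(
            "V" * len(t) if VOWEL.fullmatch(t) else ("G" if t in ("y", "w") else "C")
            for t in rhyme
        )
        result.append(("".join(syl), shape))
    return result

print(syllabify("maalmo"))   # [('maal', 'VVC'), ('mo', 'V')]
print(syllabify("odayga"))   # [('o', 'V'), ('day', 'VG'), ('ga', 'V')]
```

Comparing each rhyme type against `ATTESTED_RHYMES` gives a quick check for forms that fall outside the list, such as a VVGC rhyme.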

/t/ and /k/ do not occur in syllable-final position (/k/ -> [g] /_$, where $ is a syllable boundary)

/m/ and /j/ do not occur in word-final position (/m/ -> [n]/_#)
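The two repairs given in parentheses can be sketched as a single pass over the symbols of a word. This is only an illustration: `neutralize` is a made-up name, and a /k/ counts as syllable-final here whenever it is not directly followed by a vowel, which follows from the single-consonant onset. The text gives no repair for /t/ or /j/, so none is attempted.

```python
import re

SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")
VOWELS = set("aeiou")

def neutralize(word):
    """Apply /k/ -> [g] syllable-finally and /m/ -> [n] word-finally."""
    tokens = SYMBOL.findall(word.lower())
    out = []
    for i, tok in enumerate(tokens):
        last = i == len(tokens) - 1
        before_vowel = not last and tokens[i + 1][0] in VOWELS
        if tok == "k" and not before_vowel:
            tok = "g"        # /k/ -> [g] in syllable-final position
        elif tok == "m" and last:
            tok = "n"        # /m/ -> [n] in word-final position
        out.append(tok)
    return "".join(out)

print(neutralize("nim"))     # nin
print(neutralize("niman"))   # niman
```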

(Morpho-)phonological processes

The list below, gleaned from various sources, is incomplete at best.

[verb_conj3]d(o) + t -> t                qaado+taa -> qaadtaa -> qaataa 'I take'  (x)

a short vowel in a word-final syllable (optionally?) assimilates to the initial vowel of a suffix
                                      or of a following (cliticized?) word:
                             xoola+kii -> xoolihii 'the cattle'
                             dhac+een -> dheceen 'they fell'
                             wax+kii -> wixii 'the (known) thing'
                             wax+ka+uu -> wuxuu 'what he ...' (i.e. 'the thing [that] he')
                             laba koob oo caano ah -> laba kob oo caana ah 'two cups of milk'
                          unclear if this can occur across two consonants...
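A minimal sketch of the assimilation, assuming the consonant changes at the boundary (k -> h, loss of k after x, and so on) have already applied to the suffix. The function name `assimilate` is invented here. The process is treated as obligatory and is not limited by the number of intervening consonants, both of which the notes above leave open.

```python
import re

def assimilate(stem, suffix):
    """Copy the quality of the suffix's first vowel onto a short final stem vowel."""
    target = re.search(r"[aeiou]", suffix)
    # Last vowel stretch of the stem, followed only by consonants.
    final = re.search(r"([aeiou]+)([^aeiou]*)$", stem)
    if target is None or final is None or len(final.group(1)) != 1:
        return stem + suffix      # no suffix vowel, or the stem vowel is long
    return stem[:final.start(1)] + target.group(0) + final.group(2) + suffix

print(assimilate("dhac", "een"))    # dheceen
print(assimilate("xoola", "hii"))   # xoolihii
print(assimilate("wax", "ii"))      # wixii
print(assimilate("caano", " ah"))   # caana ah
```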

trisyllabic shortening: (C1) V1 C2 V2 C3 V3 -> (C1) V1 C2 C3 V3
                        condition: C2 and C3 must be single (non-geminate) and V2 must be short
                        maalim+o -> maalmo 'days'
                but     maalmo+ka -> maalmaha 'the days' (because maalmka or maalmha is illegal)
                        gabadh+o -> gabdho 'girls'
                        qalad+ay -> qalday 'he made a mistake'
                but     qalad+tay -> qaladday 'she made a mistake'  (compare (x) above)
                        hayso+teen -> haysteen 'they possessed'
                        ladan+ahay -> ladnahay 'I'm good'
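The shortening rule can be sketched as follows. Two assumptions are built in that go beyond the rule as stated: V3 is taken to be the first vowel of the suffix, and the position before C2 may hold the glide of a V+G rhyme (needed for hayso+teen). The name `shorten` is invented for this illustration.

```python
import re

SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")
SHORT_V = set("aeiou")

def is_vowel(tok):
    return tok[0] in SHORT_V

def shorten(stem, suffix):
    """Delete V2 in ... V1 C2 V2 C3 V3 when C2, C3 are single and V2 is short."""
    stem_toks = SYMBOL.findall(stem)
    toks = stem_toks + SYMBOL.findall(suffix)
    # Assumption: V3 is the first vowel contributed by the suffix.
    k = next((i for i in range(len(stem_toks), len(toks)) if is_vowel(toks[i])), None)
    if k is None or k < 4:
        return "".join(toks)
    before, c2, v2, c3 = toks[k - 4:k]
    applies = (
        (is_vowel(before) or before in ("y", "w"))  # V1, or the glide of a V1+G rhyme
        and not is_vowel(c2)                        # single C2
        and v2 in SHORT_V                           # short V2
        and not is_vowel(c3)                        # single C3
    )
    if applies:
        del toks[k - 2]                             # drop V2
    return "".join(toks)

print(shorten("maalim", "o"))    # maalmo
print(shorten("gabadh", "o"))    # gabdho
print(shorten("qalad", "tay"))   # qaladtay (no shortening; t -> d applies separately)
print(shorten("maalmo", "ka"))   # maalmoka (no shortening; later becomes maalmaha)
```

Geminates need no special check in this sketch, because a doubled consonant shows up as two consonant symbols in a row and so never fits the V C V C V window.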

Metathesis: occasionally ('rather rare'), C2 and C3 also switch places when trisyllabic shortening applies.
                       culus+aa -> cuslaa 'he was heavy'
                       duman+ay -> dunmay 'got organized'
                       neceb+ayd -> nebcayd 'she disliked it'

a ~ e        sometimes /a/ and /e/ are variants, e.g. beddel ~ baddal 'change'

b ~ m        sometimes /b/ and /m/ are variants, e.g. kibis ~ kimis 'bread'

kh ~ q       sometimes /kh/ and /q/ are variants, e.g. duq ~ dukh 'old man'

[verb]+t -> +d /{c,d,h,q,x,',w}_     dhac+tay -> dhacday 'she fell'  illow+tay -> illowday 'you forgot'
               /V_                 joogso+tay -> joogsaday 'I stopped'

[noun]+t -> +d /d_          jamhuuriyad+ta -> jamhuuriyadda 'the republic'
               /V_          kaalma+ta -> kaalmada 'the assistance'
               /V'_         lo'+ta -> lo'da  'the cattle'

+t -> +dh /dh_        gabadh+ta -> gabadhdha 'the girl'
                      xidh+tay -> xidhdhay 'she tied it'
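The t -> d and t -> dh rules above can be collected in one small function. This is a sketch, not a full account: `attach_t` and `T_TO_D` are names made up here, and the caller has to say whether the stem is a noun or a verb, since the trigger sets differ.

```python
import re

SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")

# Stem-final consonants after which suffix-initial t becomes d.
T_TO_D = {"verb": {"c", "d", "h", "q", "x", "'", "w"}, "noun": {"d", "'"}}

def attach_t(stem, suffix, category):
    """Attach a t-initial suffix to a noun or verb stem."""
    if not suffix.startswith("t"):
        return stem + suffix
    last = SYMBOL.findall(stem)[-1]          # last symbol, so "dh" is one unit
    if last == "dh":
        return stem + "dh" + suffix[1:]      # t -> dh after dh
    if last[0] in "aeiou" or last in T_TO_D[category]:
        return stem + "d" + suffix[1:]       # t -> d after a vowel or a trigger
    return stem + suffix

print(attach_t("dhac", "tay", "verb"))          # dhacday
print(attach_t("jamhuuriyad", "ta", "noun"))    # jamhuuriyadda
print(attach_t("lo'", "ta", "noun"))            # lo'da
print(attach_t("gabadh", "ta", "noun"))         # gabadhdha
```

For joogso+tay this gives joogsoday; the further change to joogsaday involves the stem vowel and is not modeled here.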

[noun]+k -> +g / {g,y,w,i}_         rag+kii -> raggii 'the men'  oday+ka -> odayga 'the old man'
                                    guri+kee -> gurigee 'which house?'

[noun]+k -> +h / V_ (V other than i)   ololo+ka -> ololaha 'the campaign' bare+ka -> baraha 'the teacher'

[noun]+k -> nil /{c,h,kh,q,x}_     rah+ka -> raha 'the frog'  sanduuq+ka -> sanduuqa 'the box'
                             (not sure what happens after /'/, which is the only other guttural;
                              maybe it is not noun-final, or maybe it is placeless?)
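The three outcomes for suffix-initial k after a noun can be sketched in one function. The name `attach_k` is invented here. Two things in the sketch are assumptions read off the examples rather than stated rules: in the k -> h case a short noun-final vowel takes the quality of the suffix vowel (ololo+ka -> ololaha), and a long ii is treated like short i.

```python
import re

SYMBOL = re.compile(r"dh|sh|kh|aa|ee|ii|oo|uu|[bdgqtkmnfsxhcjrlwyaeiou']")

def attach_k(noun, suffix):
    """Attach a k-initial determiner (ka, kii, kee, ...) to a noun."""
    tokens = SYMBOL.findall(noun)
    last, rest = tokens[-1], suffix[1:]
    if last in ("g", "y", "w") or last[0] == "i":
        return noun + "g" + rest             # k -> g
    if last in ("c", "h", "kh", "q", "x"):
        return noun + rest                   # k is dropped after a guttural
    if last[0] in "aeou":
        if len(last) == 1 and rest:
            tokens[-1] = rest[0]             # assumption: short vowel copies suffix vowel
        return "".join(tokens) + "h" + rest  # k -> h
    return noun + suffix                     # elsewhere k is kept

print(attach_k("rag", "kii"))       # raggii
print(attach_k("guri", "kee"))      # gurigee
print(attach_k("ololo", "ka"))      # ololaha
print(attach_k("sanduuq", "ka"))    # sanduuqa
```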

k -> g /_$

l+n -> ll          dil+nay -> dillay 'we killed'  (optional)
r+n -> rr          bar+nay -> barray 'we taught'  (optional)

m -> n /_#         /nim/ -> nin 'man'  (cf. niman 'men')

a+u -> oo          ina+u -> inoo 'for us'     la+u -> loo 'for someone'     (u = particle 'for')
a+u -> u           naag+ta+u -> naagtu 'the woman [subject]'               (u = subject marker)
                   nim+ka+u -> ninku 'the man [subject]'

l+t -> sh          meel+ta -> meesha 'the place'   bil+tan -> bishan 'this month'
                   qosol+tay -> qososhay 'she laughed'   calool+tayda -> calooshayda 'my stomach'
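The rules for a stem-final liquid (l+n, r+n, l+t) are simple enough to sketch together. The function name `join_liquid` and the `apply_optional` flag are invented for this illustration; the flag covers the two assimilations marked (optional) above.

```python
def join_liquid(stem, suffix, apply_optional=True):
    """Join stem and suffix, applying l+t -> sh and (optionally) l+n -> ll, r+n -> rr."""
    if stem.endswith("l") and suffix.startswith("t"):
        return stem[:-1] + "sh" + suffix[1:]       # l+t -> sh
    if apply_optional and stem[-1:] in ("l", "r") and suffix.startswith("n"):
        return stem + stem[-1] + suffix[1:]        # n assimilates to the liquid
    return stem + suffix

print(join_liquid("meel", "ta"))                        # meesha
print(join_liquid("qosol", "tay"))                      # qososhay
print(join_liquid("dil", "nay"))                        # dillay
print(join_liquid("dil", "nay", apply_optional=False))  # dilnay
```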

[verb]i+y -> sh   ("sometimes")     tiri+yo -> tirsho 'I do not count'
                                    is+mari+yaa -> ismarsha 'for external use'

w -> b (or vice versa?)   illow+ey -> illobey 'I forgot' koow iyo toban -> koobyo toban 'eleven'

i -> y / _ V       (generality unclear) guri+o -> guryo 'houses'
                                        bari+een -> baryeen 'they spent the night in peace'


Focus word /bàa/ following a noun phrase ending in a short vowel or diphthong:
nínka+bàa -> nínkàa
maxáy+bàa -> maxàa
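A sketch of the contraction with /bàa/ after a noun phrase, working on accent-marked forms. The name `focus` is made up here. The accents are handled by decomposing them into combining marks, and the fallback of writing bàa as a separate word when there is no contraction is an assumption, since the notes above only describe the contracting case.

```python
import re
import unicodedata

# A final short vowel or short diphthong (optionally accented) that is not
# the second half of a long vowel. Works on NFD text, where the accents are
# the combining marks U+0300 (grave) and U+0301 (acute).
FINAL_SHORT = re.compile(r"(?<![aeiou\u0300\u0301])[aeiou][\u0300\u0301]?[yw]?$")

def focus(noun_phrase):
    """Combine a noun phrase with the focus word bàa."""
    np = unicodedata.normalize("NFD", noun_phrase)
    if FINAL_SHORT.search(np):
        # Drop the final short rhyme and the b of bàa.
        fused = FINAL_SHORT.sub("", np) + "a\u0300a"
    else:
        fused = np + " ba\u0300a"     # assumption: otherwise no contraction
    return unicodedata.normalize("NFC", fused)

print(focus("nínka"))   # nínkàa
print(focus("maxáy"))   # maxàa
```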

Focus word /bàa/ followed by verbal subject pronoun:

maxáy+bàa+aad -> maxàad   ('what + focus + you')

Verbal particle(s) and verbal object pronoun:

ku+ú+ká -> kaagá ('you for from')