The Complex World of Phonemes
The sound inventories of the world's languages show a considerable degree of symmetry.
It has been postulated that this symmetry reflects human
physiological, cognitive and societal factors. Although the organization of
vowel systems has been satisfactorily explained for smaller inventories,
the structure of consonant inventories has remained an open problem since 1939.
We reformulate the problem in the light of statistical physics,
more precisely complex networks, and observe that the distributions of the occurrence
and co-occurrence of phonemes (consonants and vowels) over languages are scale-free.
The co-occurrence network exhibits strong community structures, and the driving forces
behind community formation are human articulatory and perceptual factors.
In order to validate the above principle, we introduce information-theoretic
definitions of these factors, feature entropy and feature distance,
and show that natural language inventories differ significantly
in these terms from randomly generated ones. Furthermore, a growth model
based on preferential attachment can lead to the emergence of topologies similar to those of
the real networks.
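To make the last point concrete, here is a minimal sketch of how a preferential-attachment growth process might generate a bipartite phoneme-language network and its phoneme-phoneme co-occurrence counts. It illustrates the general idea only and is not the model used in this work: the function names `grow_inventories` and `cooccurrence`, the `smoothing` parameter, and the rule that a language picks phonemes with probability proportional to (degree + smoothing) are all assumptions made for the example.

```python
import random
from collections import Counter


def grow_inventories(num_phonemes, inventory_sizes, smoothing=1.0, seed=0):
    """Grow a bipartite phoneme-language network by preferential attachment.

    Hypothetical rule (an assumption, not taken from the source): languages
    arrive one at a time, and each picks distinct phonemes with probability
    proportional to (number of earlier languages using the phoneme + smoothing).
    """
    if smoothing <= 0:
        raise ValueError("smoothing must be positive")
    rng = random.Random(seed)
    degree = [0] * num_phonemes  # how many languages contain each phoneme
    inventories = []
    for size in inventory_sizes:
        if size > num_phonemes:
            raise ValueError("inventory larger than the phoneme pool")
        chosen = set()
        while len(chosen) < size:
            # Only phonemes not yet in this inventory are candidates.
            candidates = [p for p in range(num_phonemes) if p not in chosen]
            weights = [degree[p] + smoothing for p in candidates]
            chosen.add(rng.choices(candidates, weights=weights, k=1)[0])
        # Degrees are updated once the whole inventory has been chosen.
        for p in chosen:
            degree[p] += 1
        inventories.append(sorted(chosen))
    return inventories, degree


def cooccurrence(inventories):
    """Count, for each phoneme pair, the languages in which both occur."""
    counts = Counter()
    for inv in inventories:
        for i, a in enumerate(inv):
            for b in inv[i + 1:]:
                counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    invs, deg = grow_inventories(num_phonemes=50, inventory_sizes=[10] * 200)
    pairs = cooccurrence(invs)
    print("most frequent phonemes:", sorted(deg, reverse=True)[:5])
    print("heaviest co-occurrence edges:", pairs.most_common(3))
```

Under this rule, phonemes that are picked early tend to accumulate further occurrences, which is the kind of skewed degree distribution the growth model is meant to reproduce. Whether the tail actually resembles that of real inventories depends on the parameters and would have to be checked against data.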
Data Sources
PlaNet -- The Phoneme-Language Network
PhoNet -- The Phoneme-Phoneme Network
Community Structures in PhoNet
Feature Set
Feature-based Representation of Phonemes