Chris Anderson has an interesting post about Zipf’s law, which posits that word frequencies in the English language follow a power law: a word’s frequency is roughly inversely proportional to its rank. He shows that if you set up a process that generates random strings of characters, the resulting "words" end up with the same distribution.
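To make the random-character idea concrete, here is a minimal sketch of such a process. It is not Anderson's own code: the four-letter alphabet, the 20% chance of hitting the space bar, and the function names `random_typing_words` and `rank_frequency` are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_typing_words(n_chars, alphabet="abcd", space_prob=0.2, seed=0):
    """Type n_chars random keystrokes and split the result into 'words'.

    Assumption: every keystroke is a space with probability space_prob,
    otherwise a letter drawn uniformly from alphabet.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    keys = []
    for _ in range(n_chars):
        if rng.random() < space_prob:
            keys.append(" ")
        else:
            keys.append(rng.choice(alphabet))
    # split() with no arguments drops the empty strings that
    # consecutive spaces would otherwise produce
    return "".join(keys).split()


def rank_frequency(words):
    """Return (word, count) pairs, most frequent first."""
    return Counter(words).most_common()


if __name__ == "__main__":
    ranked = rank_frequency(random_typing_words(200_000))
    # Print rank, word, and count for the ten most frequent "words"
    for rank, (word, count) in enumerate(ranked[:10], start=1):
        print(rank, word, count)
```

Under these assumptions each extra letter makes a given "word" about five times less likely, so the short strings crowd the top of the ranking without any meaning being involved. Plotting rank against count on log-log axes should give a roughly straight, stepped line.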
I am wondering if we aren’t putting the cart before the horse here – might it not be the case that the words we use more often have become shorter, precisely because we use them more often? If language evolves over time with the aim of increasing understanding and reducing bandwidth consumption, this is what we would expect.
The words "mama" and "papa" are common across many languages because when a baby starts babbling, those are the first sounds he or she makes. So we made words out of the babble and gave them the meanings proud parents would want them to have. Similarly, we reserve the shortest words (single vowels, diphthongs, or combinations of one vowel and one consonant) for the concepts we need most frequently.
Saves bandwidth. Just ask any kid with an SMS thumb.