Core word sets

I’ve been playing with the ‘Up-Goer Five text editor’ recently. It challenges you to write using only the 1000 most common English words, and it tells you if you’ve used a word that isn’t one of them. It’s challenging but fun, and it made me think creatively about how to reword things. It reminded me of something we skimmed over in one of my modules: semantic primes, the idea that we can define other words from a small set of core ideas. It wasn’t the focus of the module, so I’m not well informed on the topic, but we looked at some of the ideas from Anna Wierzbicka’s 1995 and 1996 work, including her proposal that there are 55 semantic primitives which could be used to define other words. One exercise had us write definitions using only these core words/ideas.
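For the curious, the kind of check the editor does can be sketched in a few lines of Python. This is only my guess at the general idea, not how the real editor works: the function name `uncommon_words` and the tiny `allowed` set are made up for illustration, and a real version would load the full list of 1000 words and probably handle plurals and other word forms.

```python
import re

# Hypothetical stand-in for the real list of the 1000 most common words.
allowed = {"the", "big", "sky", "boat", "goes", "up"}

def uncommon_words(text, allowed_words):
    """Return the words in text that are not in allowed_words, in order."""
    # Lowercase the text and pull out runs of letters (and straight apostrophes).
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in allowed_words]

print(uncommon_words("The rocket goes up", allowed))  # prints ['rocket']
```

Anything the function returns is a word you would need to reword.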

There was a study last year in which researchers looked at how words define, and are defined by, other words in four English dictionaries. They found that about 10% of the words in the dictionaries are used to define other words; the other 90% are never used in definitions. Many of these 10% core words (which the researchers call the kernel) can in turn be fully defined by other words in the kernel. The words that are still needed to define everything else could vary, but made up about half of the core set (around 5% of the dictionaries). The researchers called these essential defining words the ‘minimal grounding set’.
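To make the first step concrete, here is a small Python sketch of splitting a dictionary into words that are used in definitions and words that are not. The toy dictionary and the function name `defining_words` are invented for illustration, and this is a simplified one-pass version of the idea as I understand it, not the researchers' actual method.

```python
# Invented toy dictionary: each word maps to the set of words in its definition.
toy_dictionary = {
    "big": {"not", "small"},
    "small": {"not", "big"},
    "not": {"no"},
    "no": {"not"},
    "huge": {"big"},
    "tiny": {"small"},
    "giant": {"huge", "big"},
}

def defining_words(dictionary):
    """Return the dictionary words that appear in at least one definition."""
    used = set()
    for definition in dictionary.values():
        used |= definition
    # Keep only words that are themselves entries in the dictionary.
    return used & set(dictionary)

core = defining_words(toy_dictionary)
print(sorted(core))                        # prints ['big', 'huge', 'no', 'not', 'small']
print(sorted(set(toy_dictionary) - core))  # prints ['giant', 'tiny']
```

Here ‘giant’ and ‘tiny’ are defined by other words but never define anything themselves, like the 90% in the study. Finding a minimal grounding set is a harder problem that this sketch doesn't attempt.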

The researchers found that children generally acquire the minimal grounding set before the other core words, and the core words before the rest of the vocabulary. Compared with other words, those in the minimal grounding set are the most frequently used and have more concrete, rather than abstract, meanings.

This isn’t a topic I’m well acquainted with, but I really like the idea of there being a core set of words. I wonder whether the 1000 most common English words are similar to the words found in a minimal grounding set, and whether there is a relationship between Wierzbicka’s semantic primitives and the minimal grounding set.