Dictionaries: Concise, Compact, and dacoit


Dacoit: noun; one of a class of criminals in India and Burma who rob and murder in roving gangs. A member of a band of armed robbers in India or Burma. A bandit. Origin: Hindi and Urdu.

I love dictionaries. I like opening them up to a random page and just reading, discovering words and uses that I didn’t know. I love finding origins of words and phrases; linguistic connections between past and present. I will happily spend hours reading through Samuel Johnson’s dictionary, or a glossary of Shakespeare’s or Chaucer’s words.

I’ll open any dictionary at random and read a page or two. I’m almost always assured I will find something new. Some, like Samuel Johnson’s dictionary, are delights to read; others are dry and dull.

“Do you read the dictionary?” French author Théophile Gautier once asked a young poet. “It is the most fruitful and interesting of books.”

Last week I bought a used copy of the Oxford Compact English Dictionary, 2005 edition, at the local used bookstore, Cover to Cover (used, but in superb condition, I should add). And when I opened it at random to page 247, I read the definition of dacoit – a word I can’t ever recall encountering before last week. Sandwiched between dachshund and dactyl. Now I know a lot more about it, thanks to a bit of research in print and online sources.

It’s still in use today, albeit not in any media I regularly read. Every reference I’ve found comes from India or Pakistan. In 2004, The Telegraph of Calcutta wrote about the violent evolution of dacoits:

Sten guns, cellphones and agents on the job – the image of the Chambal dacoit has changed over the years. What hasn’t is the centuries-old cycle of violence in the region.

The International News of Pakistan had a headline as recently as Dec. 19, 2013, saying:

Most-wanted dacoit carrying Rs1m bounty arrested

Dacoit, according to the two-volume Oxford Compact Dictionary, has many 19th-century citations in English, dating as far back as 1820. The practice itself is called dacoity, or dacoitery in some sources.

Wikipedia tells us the East India Company established “the Thuggee and Dacoity Department” in 1830. The ruling British enacted legislation called the “Thuggee and Dacoity Suppression Acts” in India between 1836 and 1848. Thuggee has survived in English, reduced to the shorter “thug.”

Not that I’d have much reason to use dacoit in any form. It’s one of those imperialist-period words that wouldn’t find a place in a contemporary vocabulary. George Orwell would have known it; maybe my father uttered it sometime before he left England. I have to wonder what force is keeping it intact in a dictionary that is constantly pressured by new entries: neologisms and borrowed words from other languages that keep popping into our increasingly international, technological language.

Dictionary-ophiles already know that the OED’s modern “Compact” versions are not the same as the legendary (and now collectible) two-volume Compact edition, which photographically reduced the famous 20-volume edition (of 1911) to two volumes and was sold with a magnifying glass in a small drawer so you could read the damn thing. (It was reprinted in such small type that even readers with perfect vision could barely read it without strain.)

I have that set, too, although I rarely refer to it – it’s clumsy and awkward. But I dutifully pulled out the A-O volume to look up dacoit this past weekend. Squinting at the small print.

The new one-volume Compact is actually a subset of the Concise edition, which also traces back to 1911. The more comprehensive edition is the two-volume Shorter version, now in its sixth edition. That recent edition was famous for dropping the hyphens from 16,000 words that were previously hyphenated.

My two Concise editions are a trifle old – the seventh is from 1982 and the eighth from 1990, to be exact. Both have historical relevance to the OED. Wikipedia tells us:

  • Sixth (1976) and Seventh (1982) Editions were still called The Concise Oxford Dictionary of Current English, but the subtitle now read based on the Oxford English dictionary and its supplements first edited by H.W. Fowler and F.G. Fowler. It was (thoroughly) edited by J.B. Sykes, catching up with the developments in the parent dictionary. In the Seventh Edition, symbols were introduced to mark uses considered controversial or offensive.
  • Eighth Edition (1990): The Concise Oxford Dictionary of Current English, first edited by H. W. Fowler and F. G. Fowler was edited by Robert E. Allen. Being computer-based, this edition changed the original structure to a large extent.

I have the latest 12th edition on order from Indigo. I also have on my bookshelves editions of the Oxford Canadian English Dictionary and the Oxford Etymological Dictionary, and a CD version of the Concise.

The OED is my personal choice of dictionaries, but I also have two recent editions of the Merriam-Webster’s Dictionary, and the Houghton-Mifflin American Heritage Dictionary, Chambers, Collins, and some older dictionaries, one of which is a 1933 two-volume set, “The King’s English Encyclopedia.” Plus I have dictionaries for Spanish, Latin, Italian, Scots, Russian and Greek. And glossaries for various writers, cultures, sciences and historical periods.

In particular, the OED has etymological information that identifies the origin of a particular word or phrase, at least in the better editions. I suspect that the majority of readers care less than a fig leaf for the etymological ancestry of words. Yet to me, it’s a key that opens some magical doors to understanding how our language evolved.

It may seem a bit obsessive to most folks to care passionately about words and their origins, unless you’re a writer, an editor, a translator, or a logophile. And I’m all of those. Words move me in mysterious and magical ways.*

Even words like dacoit move me, although one has to accept that it has pretty much lost its place in common parlance and is on the inexorable slide towards the categories of archaic or obsolete: words more likely to be shed from the dictionary when neologisms shove their way in (although arguably the number of English speakers in India who still use it might keep it alive in the OED for some time longer).

And jostle in, they do, at a remarkably rapid rate. According to the Global Language Monitor, there were 1,013,913 words in English, as of January, 2012. That estimate went up to 1,019,729.6 in January, 2013. Really: .6 of a word. Don’t ask me how.

GLM says a new word is coined every 98 minutes, or about 14.7 every day. We can safely assume no one creates 0.7 of a word, unless that’s where the .6 comes into play – an unfinished thought?
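
For the curious, that arithmetic is easy to check. Here is a minimal Python sketch; it assumes nothing beyond the 98-minute interval GLM quotes, and the variable names are my own invention, not anything GLM publishes:

```python
# Check GLM's claim: one new word coined every 98 minutes.
MINUTES_PER_DAY = 24 * 60      # 1,440 minutes in a day
MINUTES_PER_NEW_WORD = 98      # interval quoted from GLM (taken at face value)

# How many 98-minute intervals fit into a day, and into a 365-day year
words_per_day = MINUTES_PER_DAY / MINUTES_PER_NEW_WORD
words_per_year = words_per_day * 365

print(f"{words_per_day:.1f} words per day")      # prints "14.7 words per day"
print(f"{words_per_year:,.0f} words per year")   # prints "5,363 words per year"
```

So the daily figure holds up, fractional word and all, and at that pace the language would gain roughly 5,400 words a year.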

The GLM estimate isn’t without its controversy. Back in 2009, when GLM announced the millionth word – “Web 2.0” – the New York Times wrote about the angry response from linguists who challenged the number. Then it weighed into the debate over the numbers game.

English is definitely big. The Oxford English Dictionary lists about 600,000 words (mostly drawn from written sources), with more than 1,000 added annually. Merriam-Webster’s estimates that there are about a million words in English, give or take a quarter-million — far more than the 500,000-plus claimed by the runner-up, Mandarin Chinese, and the 100,000-odd words of French.

But the idea of an “English word” is inherently fuzzy. How do you count compound words like “hot dog” or infinitely expandable ones like “great-great-great-great-aunt?” What about foreign loan words? Terms for chemical compounds (roughly 84 million) or insect species (roughly one million)? The slang terms that wink in and out of existence without ever making it into print?

A Google/Harvard study put the number even higher, at 1,022,000, based on a count of words in books in Google’s online collection. But over at the Oxford Dictionary site, they have a far more prosaic approach to the question, “How many words are there in the English language?”:

There is no single sensible answer to this question. It’s impossible to count the number of words in a language, because it’s so hard to decide what actually counts as a word. Is dog one word, or two (a noun meaning ‘a kind of animal’, and a verb meaning ‘to follow persistently’)? If we count it as two, then do we count inflections separately too (e.g. dogs = plural noun, dogs = present tense of the verb). Is dog-tired a word, or just two other words joined together? Is hot dog really two words, since it might also be written as hot-dog or even hotdog?

It’s also difficult to decide what counts as ‘English’. What about medical and scientific terms? Latin words used in law, French words used in cooking, German words used in academic writing, Japanese words used in martial arts? Do you count Scots dialect? Teenage slang? Abbreviations?

The Second Edition of the 20-volume Oxford English Dictionary contains full entries for 171,476 words in current use, and 47,156 obsolete words. To this may be added around 9,500 derivative words included as subentries. Over half of these words are nouns, about a quarter adjectives, and about a seventh verbs; the rest is made up of exclamations, conjunctions, prepositions, suffixes, etc. And these figures don’t take account of entries with senses for different word classes (such as noun and adjective).

This suggests that there are, at the very least, a quarter of a million distinct English words, excluding inflections, and words from technical and regional vocabulary not covered by the OED, or words not yet added to the published dictionary, of which perhaps 20 per cent are no longer in current use. If distinct senses were counted, the total would probably approach three quarters of a million.

Similarly, the Merriam-Webster folks weigh in with their caveats:

There is no exact count of the number of words in English, and one reason is certainly because languages are ever expanding; in addition, their boundaries are always flexible. Consider such words as “cannoli” and “teriyaki,” which come from other tongues but are established through use, context, and frequency as English. There are many other thorny considerations that complicate the task of counting individual words and tallying up the language in that way. For example, are all of the inflected forms of a word–for instance, “drive,” “drives,” “drove,” etc.–one word or several separate words?
Similarly, there are twelve different words with the spelling “post” entered in Webster’s Third New International Dictionary, Unabridged; they all have different parts of speech or derivations. Should these twelve be considered one word for the purposes of our reckoning? Some scholars would insist the distinct forms of “post” only be counted once, but others consider each one a separate word that should be counted individually.
Another puzzle: should “port of call,” another Webster’s Third entry, count as a word, even though each of its components is entered separately?
It has been estimated that the vocabulary of English includes roughly 1 million words (although most linguists would take that estimate with a chunk of salt, and some have said they wouldn’t be surprised if it is off the mark by a quarter-million); that tally includes the myriad names of chemicals and other scientific entities. Many of these are so peripheral to common English use that they do not or are not likely to appear even in an unabridged dictionary.
Webster’s Third New International Dictionary, Unabridged, together with its 1993 Addenda Section, includes some 470,000 entries. The Oxford English Dictionary, Second Edition, reports that it includes a similar number.

Estimates of the number of words in use vary wildly. This site lists several sources that range from 450,000 to 1 million.

So how many of those words do you know? Shakespeare’s vocabulary has been estimated at between 20,000 and 35,000 words (David and Ben Crystal’s superb glossary, “Shakespeare’s Words,” has more than 14,000 entries, out of more than 15,000 words used in his writing; other estimates suggest more than 27,000 unique words, but I don’t know if that includes proper names). Milton’s works use about 8,000 words. The KJV has about 14,500 unique words in its roughly 790,000-word text, two thirds of them in the OT. These include the top ten (the, and, of, to, that, in, he, shall, for and unto), which account for 265,000 of the 790,000 total: just over a third.

According to the Hypertextbook site,

An average educated person knows about 20,000 words and uses about 2,000 words in a week.

Linguist Steven Pinker challenges such estimates as too low in his book “The Language Instinct,” claiming the average American high school student has a vocabulary of about 45,000 words, many of which he/she doesn’t use outside the classroom. Based solely on anecdotal experience, I would have been surprised if any teenager had a vocabulary of more than a few hundred words. But I stand to be corrected.

The Testyourvocab blog has done tests that suggest the average 15-year-old has a vocabulary of about 20,000 words, which keeps growing until their 50s, when it reaches around 32,000. People with higher SAT scores also have larger vocabularies, reaching almost 40,000 words.

What’s interesting to me is that the site’s authors relate vocabulary to reading habits between ages 4 and 15. Kids who read “lots” learn more words than those who don’t. Based on their studies, people who read little don’t top 20,000 words in their vocabulary until age 60 or so, while those who read “lots” top 30,000 by age 30.

Worldwatch says the average vocabulary of a 14-year-old American in 1950 was 25,000 words; by 1999 it had shrunk to just 10,000.

Note to moms and dads: turn off the TV, take away the Playstation and give your kids a book. Note to gentle readers: read more. Turn off the bedroom TV and read a book instead.

David Crystal – linguist and author of more than 100 books – told the BBC that the average person likely has a vocabulary of 35,000 words, and university-educated people may have one as high as 75,000.

Is dacoit among the words? I doubt it. Nor is contrafribularities, I suspect…

Dr. Samuel Johnson: [places two manuscripts on the table, but picks up the top one] Here it is, sir. The very cornerstone of English scholarship. This book, sir, contains every word in our beloved language.
Blackadder: Every single one, sir?
Dr. Samuel Johnson: Every single word, sir!
Blackadder: Oh, well, in that case, sir, I hope you will not object if I also offer the Doctor my most enthusiastic contrafribularities.
Dr. Samuel Johnson: What?
Blackadder: “Contrafribularities”, sir? It is a common word down our way.
Dr. Samuel Johnson: Damn! [writes in the book]
Blackadder: Oh, I’m sorry, sir. I’m anispeptic, frasmotic, even compunctuous to have caused you such pericombobulation.


A bigger vocabulary doesn’t mean a better one. Some people have large vocabularies of technical or professional words that have no real use in everyday conversation. Plus it’s really how well you use your vocabulary in communicating that matters. Knowing words like dacoit is irrelevant to others if you can’t use them in a conversation.

But let’s get back to dictionaries. We all have them, we all use them, but are they the right ones?

If you use one of those generic “Webster’s” dictionaries for your references, the sort of book you buy at Walmart, you’re doing yourself a great disservice. Those are likely printed from plates of out-of-print, well-aged editions, sometimes incomplete and even incorrect. They’re old school – new words, uses and phrases are absent from their pages. You have been gulled into believing they are worthy companions by their inexpensive sticker price.

The only real edition of a “Webster’s” dictionary worthy of the name says “Merriam-Webster” in the title. The rest are dreadful pretenders. Truly dreadful.

It’s no different from choosing between a Ford or GM replacement part for your car and some generic made-in-China part that will fail in short order. Treat your dictionaries with the same respect you treat your auto parts: buy the brand names. Generic “Webster’s” isn’t one.

My personal preference is the Oxford dictionary, for many reasons: it’s a historical masterpiece and the creation of it is truly a wonderful tale. It’s English, so it has proper spellings of words like favour, labour and jewellery. It has great addenda about language and usage. And I’ve had an OED ever since I can remember. And I don’t like Noah Webster for arbitrarily changing the spelling of perfectly good words simply because he didn’t like them spelled that way.

Why get the latest edition as opposed to just keeping an old dictionary from a generation or two earlier? Because words and uses change; English is a cauldron of logo-istic alchemy. New words (neologisms) are added (particularly nowadays as technology evolves rapidly); old words get forgotten, altered or replaced; word uses change. Wicked, for example, went from evil to cool (as in exciting). Gay changed from happy to homosexual. Fat became phat, meaning cool or hot (both terms being synonymous). We no longer say twenty-three skidoo or refer to theft as a “rip-off.”

Christopher Wren described his great cathedral as artificial and awful – words that have today a very different meaning than they had then. Yet we still encounter obsolete or changed words in other contexts – plays, treatises, novels and poems of a past era. We need modern dictionaries to help guide us through the linguistic shoals and explain the changing definitions.

Which is also why I keep older editions with the new: to compare uses and definitions, to chart the changes and to see words come and go. The oldest OED I own is the 1911 reprint, which pairs well with the 1933 set of King’s English. The Samuel Johnson and some 19th-century dictionaries I own have even older content, although they are actually 20th century reprints.

Dacoit is one of those words that, despite its antiquity (it shows up as “dacoits” in the 1933 King’s English Encyclopedia), didn’t survive as a cultural commonplace in Western conversation. Other words from the Imperial halcyon days survive unto today: curry, naan, chapati, ashram, guru, nabob, pundit, sitar, sutra, Brahman and sari are still in common everyday use. Dacoit, however, isn’t.

Funny how some words survive and others don’t.

~~~~~

* Perhaps not quite as obsessive as Ammon Shea, who read the entire 20-volume OED, from cover to cover, all 21,730 pages and 59 million words, over a period of a year. Grammar.About.com adds this:

A good portion of Shea’s book attends to the “strange and lovely words” that he ran across during his marathon read. These curios include anonymuncule (“an anonymous, small-time writer”), curtain-lecture (“a reproof given by a wife to her husband in bed”), goat drunk (“made lascivious by alcohol”), grinagog (“a person who is constantly grinning”), lectory (“a place for reading”), nod-crafty (“given to nodding the head with an air of great wisdom”), onomatomania (“vexation at having difficulty in finding the right word”), palaeolatry (“excessive reverence for that which is old”), and scrouge (“to inconvenience or discomfort a person by pressing against him or her by standing too close”).

The author happily admits that he’s a confirmed vocabularian (“one who pays too much attention to words”).