
I was watching a video linked in this answer and it made the following claim:

[...], like most words in English, is derived from German.

That got me thinking. While I know that Germanic languages have greatly influenced English, so have the Latin and Celtic ones (and various others to a greater or lesser degree). Is it true that more than 50% of the English vocabulary is derived from Germanic roots?

More generally, can someone point me to data on this? I imagine attempts have been made to quantify the contribution of different languages to English; what were the results? What percentage of the lexis comes from each source?

Ideally I would like to see this expressed in terms of % of words but I am aware that, at least to some linguists, attempting to quantify vocabulary is anathema (to give a simple reason, all languages that allow number construction have an infinite vocabulary by definition), so alternative approaches to quantifying this are also welcome.

terdon
  • 1
    Possible duplicate of English words of Latin origin: Did they replace existing words? Note that it matters greatly whether you take an unweighted percentage of the words listed in a particular dictionary (you will get many more Latinate words) or whether you weight the words by frequency (many more Germanic words). The latter is equivalent to counting the word "the" in a corpus every time it occurs; the former is equivalent to counting "the" only once in the entire (same) corpus. – Cerberus - Reinstate Monica Mar 06 '14 at 00:35
  • 1
    The latter depends somewhat on genre; the former depends greatly on the size and composition of your corpus. A large dictionary will have a disproportionately large percentage of Latinate words, so the unweighted figure is a bit meaningless. – Cerberus - Reinstate Monica Mar 06 '14 at 00:39
  • @Cerberus your answer on the potential dupe is very good but seems to focus more on German specifically. I was after something more like the pie chart in MrShiny's answer below. I was also hoping that someone might dive in and expand a little on the difficulties of measuring vocabulary (something you touched on in your answer to the dupe). I can see why you consider it a dupe but I'm going for something more general here. – terdon Mar 06 '14 at 00:47
  • 1
    There are two obvious means of measuring. One is akin to selecting words from a dictionary and collecting them by origin. Another would be selecting words heard or read and collecting them, which would more heavily weight commonly used words. – Oldcat Mar 06 '14 at 01:03
  • @terdon: The other question has a link to that exact same pie chart... – Cerberus - Reinstate Monica Mar 06 '14 at 01:21
  • @Cerberus ah, so it does. Sorry, the description of the link was "more than half from Latin" and this one shows 29, but you're quite right, they're the same, and this is another example of why pie charts are bad for this kind of thing. OK, voting to close, you've convinced me. – terdon Mar 06 '14 at 01:25
  • @terdon: Wow, voting to close your own question? You're a great sport, I'm sure there is a badge for that...at least I'll vote you up. – Cerberus - Reinstate Monica Mar 06 '14 at 02:00
  • @Cerberus gee thanks! A dupe's a dupe, mine or not :) – terdon Mar 06 '14 at 02:01
  • 2
    It is a bit lop-sided to say Germanic languages have influenced English. English is originally a Germanic language. But it has adopted French vocabulary (30 per cent) and Latin vocabulary (also 30 per cent). – rogermue Jul 27 '15 at 05:19
  • Words derived from Celtic languages in English? Outside place names you’d be pushed to find enough to count on two hands. I can’t think of one. – David Jan 06 '22 at 19:26

1 Answer


Wikipedia has the following pie chart showing the word origins:

It shows the breakdown as

  • Latin (including words used only in scientific / medical / legal contexts) ≈ 29%
  • French ≈ 29%
  • Germanic ≈ 26%
  • Greek ≈ 6%
  • Others ≈ 10%

It cites some references which back up these numbers but I don't have access to those.

To answer your question, it does not appear to be true that more than 50% of English words are Germanic. However, that probably depends on what your context is. If you exclude scientific, medical, and legal terms, you will probably find a much lower incidence of Latin words. Given that English is itself a Germanic language, it's more surprising that Germanic doesn't account for MORE of the vocabulary.
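How much the counting method matters can be illustrated with a toy example. The sketch below counts the same sentence two ways: once per distinct word (roughly what a dictionary survey does) and once per occurrence (frequency weighted). The sentence, the `ORIGIN` table, and the `origin_shares` helper are all made up for illustration; the origin labels are assigned by hand and are not taken from the Wikipedia survey.

```python
from collections import Counter

# Hand-assigned, illustrative labels only (an assumption for this sketch,
# not data from the Wikipedia survey).
ORIGIN = {
    "the": "Germanic", "king": "Germanic", "is": "Germanic",
    "in": "Germanic", "and": "Germanic", "of": "Germanic",
    "court": "French", "justice": "French", "government": "French",
    "status": "Latin", "formula": "Latin",
}


def origin_shares(words, weight_by_frequency):
    """Return {origin: share of the total}.

    weight_by_frequency=True counts every occurrence of a word;
    weight_by_frequency=False counts each distinct word once.
    """
    items = words if weight_by_frequency else set(words)
    counts = Counter(ORIGIN[w] for w in items)
    total = sum(counts.values())
    return {origin: n / total for origin, n in counts.items()}


# Made-up sample sentence; every word appears in ORIGIN above.
TEXT = ("the king is in the court and the status of the government "
        "is the formula of justice")
words = TEXT.split()

by_type = origin_shares(words, weight_by_frequency=False)
by_token = origin_shares(words, weight_by_frequency=True)

for label, shares in (("distinct words", by_type), ("occurrences", by_token)):
    summary = ", ".join(f"{o} {s:.0%}" for o, s in sorted(shares.items()))
    print(f"{label}: {summary}")
```

In this toy sentence the Germanic share rises when occurrences are counted, because short function words such as "the" and "of" repeat. Real figures would depend on the corpus and the etymological dictionary used.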

  • Nice find, your google-fu beats mine apparently. Thanks! – terdon Mar 06 '14 at 00:24
  • Hang on, that seems to be claiming that words derived from proper nouns are about as many as those derived from Greek. I guess they're not counting the Greek that has come in through the back door of Latin? – terdon Mar 06 '14 at 00:28
  • 4
    You might do better with a frequency weighting, as the "Saxon" words are shorter and more common. – Oldcat Mar 06 '14 at 00:35
  • @Oldcat It isn't clear from the Wikipedia article what method they used for surveying words. Arguments could be made either way for simply counting all the known words and for weighting the words by frequency. – Mr. Shiny and New 安宇 Mar 06 '14 at 14:01
  • I'm not sure how accurate this is. Is it possible that it's counting Latinised Greek words as purely Latin, or only counting words common in everyday conversation? – Pharap Aug 12 '19 at 22:01
  • The figures appear to be an analysis of the English lexicon, but the claim "[...] like most words in English is derived from German." seems to refer to words used daily. – Greybeard Jan 06 '22 at 18:47