63

What are the rules in English language to split words at the end of a line?

Where exactly must the hyphen split the word?

tchrist
  • 134,759
  • 2
    The rules are difficult---even for TeX. Here is a list of words that are unusually hyphenated. – Hugh Feb 22 '15 at 04:19
  • 1
    @Hugh: those words aren't unusually hyphenated (okay, some of them are); those are the words for which the TeX algorithm breaks. But, for example, acu-punc-ture is hyphenated the way anybody who knew it was derived from the word puncture by adding a prefix (which TeX doesn't) would hyphenate it. – Peter Shor Oct 14 '18 at 13:03
  • @Hugh … or words whose hyphenation doesn't depend only on their spelling (e.g., process). –  Nov 26 '23 at 20:30
  • This is too general and needs more focus. British-English hyphenation and American-English hyphenation differ. You do need to split this question in two or at least textually ask two separate questions in the text. –  Nov 26 '23 at 20:52

4 Answers

51

The easiest thing to do, and the only way of being sure you agree with the authorities, is to look words up in the dictionary. Some of the hyphenations currently in American dictionaries make no sense at all. For example, the reason that prai-rie and fair-y are hyphenated the way they are seems to be that 150 years ago, the editors of Webster's dictionary thought they didn't rhyme[1]; prairie was pronounced pray-ree with a long 'a', while fairy was pronounced fair-ee with an r-colored 'a'.

That said, there are a few hyphenation rules that will let you hyphenate 90% of English words properly (and your hyphenations of the remaining 10% will be perfectly reasonable, even if they disagree with the authorities'). Here they are, in roughly decreasing order of priority:

  • Break words at morpheme boundaries (inter-face, pearl-y, but ear-ly).
  • Break words between doubled consonants — 'sc' counts here but not 'ck'. (bat-tle, as-cent, jack-et).
  • Never separate an English digraph (e.g., th, ch, sh, ph, gh, ng, qu) when pronounced as a single unit (au-thor but out-house).
  • Never break a word before a string of consonants that cannot begin a word in English (anx-ious and not an-xious).
  • Never break a word after a short vowel in an accented syllable (rap-id but stu-pid).

Finally, if the above rules leave more than one acceptable break between syllables, use the Maximal Onset Principle:

  • If there is a string of consonants between syllables, break this string as far to the left as you can (mon-strous).
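
The Maximal Onset Principle can be sketched in a few lines of Python. This is only an illustration, not a real hyphenator: the onset list below is a hypothetical, deliberately incomplete set of consonant strings that can begin an English word ('x' is left out on purpose, to follow the anx-ious example above), and the helper names are made up for this sketch. The caller is assumed to have already located the consonant string between two vowels.

```python
# Hypothetical, incomplete list of consonant strings that can begin an
# English word. 'x' is deliberately excluded to match the anx-ious example.
LEGAL_ONSETS = {
    "b", "c", "d", "f", "g", "h", "j", "k", "l", "m", "n", "p", "q", "r",
    "s", "t", "v", "w", "y", "z",
    "bl", "br", "ch", "cl", "cr", "dr", "fl", "fr", "gl", "gr", "pl", "pr",
    "sc", "sh", "sk", "sl", "sm", "sn", "sp", "st", "sw", "th", "tr", "tw",
    "wh", "scr", "shr", "spl", "spr", "str", "thr",
}


def split_cluster(cluster, onsets=LEGAL_ONSETS):
    """Split a consonant string into (left part, right part).

    The break is placed as far to the left as possible, subject to the
    right part being a legal word-initial string.
    """
    for i in range(len(cluster)):
        if cluster[i:] in onsets:
            return cluster[:i], cluster[i:]
    # No suffix of the cluster can start a word: keep it all on the left.
    return cluster, ""


def break_word(before, cluster, after):
    """Rejoin the pieces with a hyphen at the chosen break."""
    left, right = split_cluster(cluster)
    return before + left + "-" + right + after


print(break_word("mo", "nstr", "ous"))  # mon-strous
print(break_word("a", "nx", "ious"))    # anx-ious
print(break_word("pa", "str", "y"))     # pa-stry
```

The last call reproduces the pa-stry result that the principle suggests, which is exactly where the dictionaries disagree and print pas-try instead. The sketch ignores the higher-priority rules (morpheme boundaries, digraphs, short accented vowels), which would have to be checked first.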

There are lots of exceptions to these rules:

Sometimes the rules conflict with each other. For example, ra-tio-nal gets hyphenated after a short vowel in an accented syllable because ti acts as a digraph indicating that the 't' should be pronounced 'sh'.

Sometimes it's not clear what constitutes a morpheme boundary: why ger-mi-nate and not germ-i-nate?

Sometimes the pronunciation of a word varies—/væpɪd/ or /veɪpɪd/? Merriam-Webster and American Heritage dictionaries agree that both pronunciations are valid, but they disagree about the hyphenation.

And some hyphenations I can't figure out the reason for: the Maximal Onset Principle would suggest pa-stry, but the authorities all agree on pas-try.

[1] I believe some American dialects still make this distinction in pronunciation; the editors of Webster's dictionary weren't imagining things.

Peter Shor
  • 88,407
  • 1
    The Knuth hyphenation algorithm (or more properly, the Knuth–Liang one) for English does remarkably well with almost no exceptions. Nonetheless there do exist some US–UK differences in a few words, with process as one famous example, that require you to know your target audience. – tchrist Jul 15 '12 at 15:46
  • 3
    And words like progress and desert, which get pronounced and hyphenated differently depending on whether they're a verb or a noun. A truly accurate hyphenation algorithm would also need to include a parser to determine the parts of speech. – Peter Shor Jul 17 '12 at 13:33
  • You give a "few hyphenation rules that will let you hyphenate 90% of English words properly". How did you determine what is "proper"? Are these your personal rules? What is your source? – Sverre May 24 '19 at 15:16
  • @Sverre: By "proper" in my answer I mean agreeing with what at least one dictionary says to do. So I'm claiming these rules give the official hyphenation at least 90% of the time. – Peter Shor May 24 '19 at 15:23
  • 1
    @Sverre: I came up with these rules by (a) reading guidelines for hyphenation in various places and (b) looking at dictionary hyphenations and deducing what rules would explain them. I have been thinking about these rules on and off for the last 30 years, so I have not kept track of my sources. If you think this answer is wrong, you are welcome to write your own answer. If you can find a source that explains in detail the rules for hyphenation in English, please, please, please write an answer pointing to that source. I will be happy to upvote it. – Peter Shor May 24 '19 at 15:24
  • @PeterShor Actually, I came here to ask "Where can I find hyphenation rules for English?", because I haven't really found any in some authoritative sources I've looked in (e.g. Bringhurst and CMS). The only "rule" I seem to find is "Look up the word in Merriam-Webster", which is not very helpful. I find it odd that typography books and style guides don't give hyphenation rules for English, when they always do for my native language (Norwegian). – Sverre May 24 '19 at 15:26
  • @Sverre: For British hyphenations, the Oxford Dictionaries re-hyphenated a number of words in the 20th century (although lately, they seem to have stopped including hyphenations in their dictionary entries). So they must have a set of rules. I don't know who else would. There are so many subjective judgments (is this a morpheme boundary? which pronunciation should we use? which rule should take precedence in this case?) that it's an art and not a science. – Peter Shor May 24 '19 at 15:31
  • Yes, it's also the case for Norwegian that rules may conflict and that there are many subjective judgments at play. Yet said books will nevertheless provide a set of rules and guidelines for hyphenations, and it frustrates me that similar books for English don't. So far I'm just trusting LaTeX and its packages babel and polyglossia, but seeing how they get hyphenations for Norwegian wrong at times, I'd like to internalize a set of rules for English when I need to manually overrule what TeX does for me. – Sverre May 24 '19 at 15:44
  • @tchrist As of 2021, the TeX hyphenation exception list has 1751 entries. But even I have found quite a few more words, not yet on that list, that TeX doesn't break correctly. Here are some in the format *TeX's guess/M-W: promis-ing/prom-is-ing, medicine/med-i-cine, lawyer/law-yer, wildlife/wild-life, volatile/vol-a-tile, Schrödinger/Schrö-ding-er, salary/sal-a-ry, timetable/time-ta-ble…* So, 'remarkably well', perhaps. But I think 'almost no exceptions' is a stretch. A lot of entries on the lists (the official one and my own) are very common words. – linguisticturn Apr 27 '23 at 07:03
  • Please don’t anyone link questions about syllabification with this post about punctuation! – Araucaria - Him Oct 04 '23 at 07:07
  • The rule Never break a word before a string of consonants that cannot begin a word in English is not actionable, unfortunately. For “not an-xious”, you'd have to look at all words starting with x and check, for each such word, whether it's English. For example, are xenon and xenophobia already English enough or still too much Greek? –  Nov 26 '23 at 20:49
22

Vincent McNabb gives good advice generally on when to hyphenate—never if you can get away with it, and if you must, in a sensible place.

However, the question of where to hyphenate is something that dictionaries have answered for generations. Every entry has a word split into syllables, and technically speaking, according to traditional rules of typesetting, you can hyphenate a word at any syllable boundary. For example in the Merriam-Webster's online dictionary, the entry for "dictionary" reads "dic·tio·nary"—so you could hyphenate anywhere there appears a centered dot. Of course there are various rules of thumb and heuristics to choose the best place to hyphenate, and in many cases hyphenating a word dramatically reduces readability, but in a strict answer to OP's original question, it is acceptable to hyphenate a word at any syllable boundary, and you can find all the syllable boundaries in a dictionary.
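
As a rough illustration of reading break points off a dictionary headword, here is a small Python sketch. It assumes the entry is already available as a string with centered dots (U+00B7) between syllables, as in the "dic·tio·nary" example; the function name is invented for this sketch, and it does not fetch anything from a dictionary.

```python
def break_options(entry, dot="\u00b7"):
    """List every way to split a dotted headword across two lines.

    Each result is a pair: (text ending the first line, including the
    hyphen, text starting the next line).
    """
    syllables = entry.split(dot)
    return [
        ("".join(syllables[:i]) + "-", "".join(syllables[i:]))
        for i in range(1, len(syllables))
    ]


for head, tail in break_options("dic·tio·nary"):
    print(head, tail)
# dic- tionary
# dictio- nary
```

Every pair is acceptable in the strict sense described above; choosing the most readable one is left to the usual rules of thumb.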

nohat
  • 68,560
3

Technically speaking, hyphens are acceptable between any two syllables. But it is best to use them between prefixes, roots, and suffixes if at all. In most casual documents, hyphens decrease readability and oftentimes make documents look more cluttered, despite the fact that they form a nice, neat block. However, in news articles or novels, in places where moving the entire word would compromise the shape of the document, it is very common to see end hyphenation. Pick up a copy of 'Frankenstein' or 'The Magician's Nephew' and I assure you that you'll find quite a few. My copy of 'Seabiscuit' splits tomorrow between pages.

Tim Lymington
  • 35,168
Olivia
  • 31
1

Firstly, it is preferable not to split a word at the end of a line.

From the APA Style Guide, Section 1.A.9

Do not hyphenate (split) words at the end of a line.

If possible, add another word to the line, or take one away, so you don't need to split in the first place.

In fact, NEVER EVER split words. However, I will give what I consider to be ok guidelines:

There are really no proper rules as to how it should be done, when it is, so basically, use common sense. If it must be done, try to keep the components of meaning together - this is easy with obviously compound words, such as keyboard. E.g.

Key-
board. Super-
market.

It is also easy with words with prefixes such as "quasi" or "pseudo", e.g.

Pseudo-
science.

But mostly, splitting the words just makes them hard to read - and can lead to nightmares when the content of text is changed, because words that were once at the end of a line will no longer be at the end of a line, and everything will have to be re-done.

Unfortunately, most word processors are not very good at automatically splitting words, so it is best to keep that feature off. It is also possible, however, to put markers in words where the word processor will be allowed to split the word. In Microsoft Word, this is done by using Ctrl+-. This hyphen is invisible, unless the word gets split at the end of a line.
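
Outside Word, the Unicode soft hyphen (U+00AD) plays an analogous role in plain text and HTML: it marks a permitted break and is normally shown only when a line actually breaks there. The Python sketch below is a minimal illustration with made-up helper names; it assumes you already know where the breaks are allowed, and uses the keyboard example from above.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN


def add_soft_hyphens(parts):
    """Join the parts of a word, marking each join as a permitted break."""
    return SOFT_HYPHEN.join(parts)


def strip_soft_hyphens(text):
    """Remove all break markers, recovering the plain text."""
    return text.replace(SOFT_HYPHEN, "")


marked = add_soft_hyphens(["key", "board"])
print(len(marked))                 # 9: eight letters plus one marker
print(strip_soft_hyphens(marked))  # keyboard
```

Because the marker is a distinct character, it can be stripped again without touching real hyphens, so the text stays reusable when the line breaks change.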

But as a rule of thumb, see if the word is still easy to understand if you say it out loud with a pause where you are going to break the word. Usually, try and split it in the middle of the word.

Civili-
sation.

But, as you can see, it just makes it harder to read. Just don't do it.

  • 16
    I would strongly argue against the “never do it” position!  The wording is the content of your writing; the typesetting is just the form it’s currently presented in.  Giving form priority over content is rarely a good thing — among other reasons, because the form may well change in future, if someone quotes your words, re-typesets them, reads them aloud,… .  When two wordings are almost equally good, then I’d agree that the one which avoids hyphenation is preferable; but compromising your wording for the sake of a hyphen is throwing the baby out with the bathwater! – PLL Dec 24 '10 at 02:23
  • 3
    Of course, as always, this is context-dependent.  In some situations — e.g. newspaper headlines, posters, and similarly compressed formats — immediate visual clarity is at a premium, so hyphen-avoidance should be given more importance than usual.  But I think in most prose contexts, it should be a pretty low priority! – PLL Dec 24 '10 at 02:26
  • @PLL Line-breaking hyphens are only needed for justified text or narrow columns. Even then, it's important to always use the soft-hyphen character instead of the standard hyphen-minus.

    Otherwise all hyphens become part of the content, leading to the need for manual adjustment in order to reuse the text. A simple "replace-all-line-breaks" action results in errors like dog-house and super-cede, necessitating more complex solutions.

    Often not an issue, but for those in publishing/archival/analysis/indexing, or with hundred-page documents, not a good thing.

    – Beejor Dec 05 '16 at 00:01
  • If you are writing something that will be typeset and published by somebody else later, the advice "If possible, add another word to the line, or take one away, so you don't need to split in the first place" is very bad advice. Write it the way you want it to read, and let the professional typesetter or publisher figure out how to break words at the end of a line. – Peter Shor Oct 04 '23 at 15:42