[Edit]

To be clear, this is not a programming question. What is sought is an example of a set of rules that a government or a business (in an English-speaking locale) uses to determine the acceptable character set of people's names.

I am not trying to limit names, but to understand what limits have been applied in English speaking locales.

[Edit 2018] I did find one about length:
Hawaiian Woman Gets IDs That Fit Her 36-Character Last Name: the state's cards will have room for 40 characters in "first and last names and 35 characters for middle names,"

Any other examples appreciated.

@Mark Beadles suggested addressing what problem an answer would help one solve. My primary interest stems from decoding old and new government/business records (death certificates, city directories, phone books), both typed and handwritten. How would names using atypical letters get mangled or restricted, and how did that change over time to accommodate them? Yet much of that could be interpretive and opinionated. Instead, I am looking for rules, preferably published, that specified what names could be published in some English-speaking locales.


Obviously the letters A-Z (upper and lower case) are used in a person's name.

Last names like "Smith-Brown", "Van Buren", "O'Brian" also use -, space and '.

Historical ÆLFRÆD and novel names like "Number 16 Bus Shelter and Violence" push the range of acceptable characters.

I am looking for a modern example of rules published by some English-speaking country or business (like a phone directory publisher) that define the acceptable/non-acceptable characters of a name, covering more than just the usual A to Z.


Other posts:
What non-alphabetic characters are valid in English spelling? discusses English words, but I suspect that names used in English-speaking lands draw on a wider range of characters than the rest of the language.
Any other letters lost besides thorn, edh, and yogh? discusses early words.


  • As a further datapoint, some people spell Zoe and Chloe with a diaeresis on the e. And there are surely plenty of people with names borrowed from other cultures using all sorts of accents. Given that, I'm not sure what it would mean to have a list of "acceptable/non-acceptable" characters in a name. – David Richerby Jan 02 '15 at 23:55
  • I suppose folks in the US can use any characters they want in a name. (Prince famously changed his name to an image of a fox hunting horn.) From a practical standpoint (ie, what the state records office will accept) it's the Latin alphabet, maybe the Arabic numerals, plus the three characters above, though - is probably best limited for use to hyphenate married names. – Hot Licks Jan 02 '15 at 23:55
  • This question appears to be off-topic because it directly asks for names 'having a broader range than English words'. According to Monty Python, you can use any sound or symbol you can get your hands on. – Edwin Ashworth Jan 03 '15 at 00:00
  • I'm writing an answer, but I think this should be moved to SO or perhaps the User-Interface site. – Jon Hanna Jan 03 '15 at 00:09
  • @JonHanna since a person's name is their user interface? :-) –  Jan 03 '15 at 00:36
  • UX is a place where this question comes up. – Jon Hanna Jan 03 '15 at 00:48
  • @HotLicks- Hmm, I never considered that to be a fox hunting horn. http://upload.wikimedia.org/wikipedia/en/thumb/a/af/Prince_logo.svg/870px-Prince_logo.svg.png But I suppose part of it resembles one. – Jim Jan 03 '15 at 01:01
  • @Hot Licks Prince may have changed his name/glyph for public use, but even after this, he married as "Prince Rogers Nelson", so his legal name was not unusual. – chux - Reinstate Monica Jan 03 '15 at 01:08
  • This question appears to be off-topic because it is a programming question, not an English question. – tchrist Jan 03 '15 at 01:10
  • @Edwin Ashworth Many languages have proper names that extend the character set as compared to the rest of the language. Example: Swedish rarely uses Q and W, except in names. Japanese has a similar situation. This post asks for the acceptable characters in English and is looking for an example of some standard used. I'll re-word the post to make it clearer. – chux - Reinstate Monica Jan 03 '15 at 01:24
  • @tchrist This is, if anything, closer to a legal question than a programming one, as it can apply to law. But even outside law (and even before computers) city directories (old phone books) would change a "foreign" spelling to an acceptable one because the publisher had conforming rules as to what was acceptable for an English publication. – chux - Reinstate Monica Jan 03 '15 at 01:29
  • You can’t make rules about people’s names: they may call themselves as they please. As Mr Hanna’s answer properly observes, various nonprinting code points can sometimes be safely removed; however, sometimes they cannot. Furthermore, because English uses the Latin alphabet, names that use non-Latin letters are customarily transliterated to Latin. The OED for example uses only Latin (or very rarely, Greek) letters in head words—but this is much more than you imagine: the Latin script has thousands of code points, plus nearly ∞ diacritics. – tchrist Jan 03 '15 at 02:37
  • @tchrist I am not trying to make rules about people’s names. I am looking for published rules that a government or business has made concerning acceptable characters for names. The proposed duplicate does not cover proper names, nor is it looking for a reference to back up its claim. – chux - Reinstate Monica Jan 03 '15 at 02:55
  • @chux You should not be trying to do what you are trying to do: José Núñez will not be pleased. Leave people’s names be. If you care what a particular agency’s for-print rules might be, contact them, not us. – tchrist Jan 03 '15 at 02:57
  • @tchrist There is nothing about this post that is changing or limiting people's names. It is only an inquiry for an example of rules some company or government has used. – chux - Reinstate Monica Jan 03 '15 at 02:59
  • While there certainly HAVE been restrictions on past databases (7-bit ASCII or EBCDIC alphameric with a limited set of standard forms of names was one common restriction), the introduction of Unicode makes justifying such restrictions difficult. Jon's answer is the right one if you're looking for best practice rather than history. – keshlam Jan 03 '15 at 08:35
  • @keshlam Thanks for the idea, yet this is not a programming question. Jon's answer well discusses various computer issues. This is about finding an example set of rules used by a government or company (English language) to qualify people's names. – chux - Reinstate Monica Jan 03 '15 at 08:46
  • Example from when, where, and what purpose? Whitespace-delimited ASCII with users forced to pre-divide name into Title, First, Middle, Last, Suffix is a legitimate example. Not a good modern example, but an example. If you need something better, please clarify exactly what you're asking for. – keshlam Jan 03 '15 at 19:27
  • This question appears to be off-topic because it is a resource request and not a question about the nuts and bolts of the English language. – Andrew Leach Jan 03 '15 at 23:06
  • @keshlam Just looking for what character set some sample company/government had used. Like a phone company may use A-Z,',- but not allow あ, え. IOW what rules has an institution used to qualify the characters a name may use when the target group is an English speaking society? I could have asked for "What characters make up a name in English" but that is highly opinionated. Instead I'm looking for a sample some group has used. – chux - Reinstate Monica Jan 03 '15 at 23:14
  • @Andrew Leach As commented above, I could have asked for "what characters make up a name in English"? Certainly such a post is close-able as that is opinion based. That is why I request a sample used in practice by some outfit. I would appreciate your thoughts on how to ask the question better or a better site for this non-programming question. – chux - Reinstate Monica Jan 03 '15 at 23:23
  • Regarding the 2018 edit: that story doesn't seem to be about any limitations on the person's name itself. It's about the State of Hawaiʻi making changes to their driver's license system to accommodate displaying her long name correctly, which was a design issue, not a name issue. That seems like a programming, or maybe design, question, though you maintain it's not. It might help if you could clarify what you intend to do with an answer - what problem would an answer help you solve. – Mark Beadles Dec 31 '18 at 22:09
  • @MarkBeadles Post updated. True, that article was focused on length, yet I see that as one of various possible limitations that some entity (the state) imposed on a name. Being the state of Hawaii, I suspect they allow ' in names much like the state name Hawaiʻi in Hawaiian. It is that type of info that I am also seeking. – chux - Reinstate Monica Dec 31 '18 at 23:04

1 Answer

English rarely changes names much these days (though your examples include one where it did change names in the past: the O' form used in Anglicising names beginning with Ó, like O'Brian for the Irish Ó Briain).

For this reason you can expect all manner of characters to be used in names in English-language contexts.

Characters you can safely prohibit:

Non-characters (obviously, you can prohibit them anywhere a round-tripping or security requirement doesn't stop you).

Control characters.

Line-separators.

Paragraph-separators.
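As a rough illustration of the four prohibitions above, here is a minimal Python sketch. The function names `is_noncharacter` and `find_prohibited` are made up for this example, and it assumes "control characters" means Unicode general category Cc, with line and paragraph separators being categories Zl and Zp. Format characters (Cf) are deliberately left alone, since some names need them.

```python
import unicodedata

# General categories rejected outright: Cc = control characters,
# Zl = line separator (U+2028), Zp = paragraph separator (U+2029).
PROHIBITED_CATEGORIES = {"Cc", "Zl", "Zp"}


def is_noncharacter(ch: str) -> bool:
    """True for the 66 Unicode noncharacters: U+FDD0..U+FDEF plus the
    last two code points of every plane (U+xxFFFE and U+xxFFFF)."""
    cp = ord(ch)
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def find_prohibited(name: str) -> list:
    """Return the characters in `name` that fall into a prohibited group."""
    return [
        ch
        for ch in name
        if is_noncharacter(ch) or unicodedata.category(ch) in PROHIBITED_CATEGORIES
    ]
```

A name passes when `find_prohibited(name)` returns an empty list; apostrophes, hyphens, spaces and accented letters are all untouched by this check.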

Characters you can safely normalise:

All space [Zs] characters except Ogham space and ideographic space can be replaced with the "normal" space (U+0020) character.
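A minimal Python sketch of that space normalisation, assuming the two exceptions are U+1680 (Ogham space mark) and U+3000 (ideographic space); the name `normalise_spaces` is hypothetical. Which characters count as Zs depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Space separators that are left alone (assumed code points for the
# "Ogham space" and "ideographic space" mentioned above).
KEEP_AS_IS = {"\u1680", "\u3000"}


def normalise_spaces(name: str) -> str:
    """Replace every Zs character, other than the two exceptions,
    with the ordinary space U+0020. Nothing is collapsed or trimmed."""
    return "".join(
        " " if unicodedata.category(ch) == "Zs" and ch not in KEEP_AS_IS else ch
        for ch in name
    )
```

For instance, a no-break space between "Van" and "Buren" becomes a plain space, while a name containing U+3000 is returned unchanged.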

Aside from that, you'd really be better off allowing even currency symbols; they'd be very unusual, but they could appear in a name, especially if someone uses a name they perform under.

You could opt to prohibit alphabetic characters that aren't Latin, and force people to transcribe into Latin, but you're probably better off if you don't. (You aren't making any customers happier and are making some less happy, since it won't affect those of us whose names are in Latin letters, but will annoy those whose names are not).

Besides which, what if you're successful enough to have to widen your support to other languages in the future? You'll just have to loosen such a rule again when that happens.

Jon Hanna
  • The OP made no mention that the context was a user interface. –  Jan 03 '15 at 00:39
  • Surprised you didn’t mention the canonical answer. – tchrist Jan 03 '15 at 01:09
  • @tchrist I didn't know that one, and it only rules out some bad ideas rather than come up with something that is positive (though I don't think glancing through that I've ruled out anything it says not to rule out), but I think the above is a reasonable stab at restricting characters that aren't used in names in English that doesn't fail. (Matters of numbers of names people have, ordering, and such I don't touch but they weren't in the question). – Jon Hanna Jan 03 '15 at 01:28
  • Names somewhat aside, here’s an example of English *words* from the OED using letters outside the A–Z set. They all use only Latin letters but for a few chemical compounds that are sometimes written with a single Greek letter-glyph instead of the Greek letter-name. But there are scads of diacritics to be found in English words, so you (next-to-)never want to remove those. – tchrist Jan 03 '15 at 02:45
  • Agree strongly. Record the names in the UTF-8 encoding, and/or provide some other escape mechanism to handle "questionable" characters. You're never going to be able to handle invented glyphs like "the artist formerly known as the artist formerly known as Prince", but if Unicode has the character you can bet that some troublemaker, some day, will want to make it part of their legal name. It has happened before, it will happen again; easier to be prepared for it than to have to deal with it in a panic later. – keshlam Jan 03 '15 at 08:45