Miss William

I was diving into the SSA sub-1000 name charts after getting on a Wilhelmina kick a couple of days ago. While looking through all the Wil* girls' names from the last 15 years, I noticed something odd that I was hoping someone might shed some light on...


Babies with the girl box checked and the given name listed as William have accounted for a couple dozen births a year in the SSA data for the last 15 years... except in 2004, when 111 baby girls were named William. Anyone have any idea why? What was going on back then that would have tripled it for just one year?

The Williams sisters weren't doing anything big, and William and Kate weren't really a thing yet...


By EVie
September 18, 2017 7:06 PM

A number of those regular occurrences are almost certainly coding errors (boys coded in as female by mistake), which tend to show up in the statistics when it's a super-popular name with a lot of births. For instance, in 2004, Jacob (the #1 boys' name) also shows up in the girls' stats with 171 births. But in 2005, there are only 40 "female" Jacobs. For Michael, it's 148 girls in 2004, but 61 in 2005. So it's not just William. I guess the bigger question this raises for me is actually, what was going on in 2004 that there were so many more data errors than in the surrounding years?

(A more detailed exploration of the data may lead to a different conclusion/set of questions, but I unfortunately don't have time to probe further right now.)
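For anyone who wants to probe further, here is a rough sketch of the year-to-year comparison described above. It assumes the national files are plain text with one `Name,Sex,Count` record per line; the function names are made up for illustration, and the sample rows are only the figures quoted in this thread, not a full file.

```python
import csv
from io import StringIO

def counts_by_sex(yob_text, sex):
    """Return {name: count} for one sex from text in 'Name,Sex,Count' form (assumed layout)."""
    counts = {}
    for row in csv.reader(StringIO(yob_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        name, row_sex, count = row
        if row_sex == sex:
            counts[name] = int(count)
    return counts

def year_over_year_ratio(text_a, text_b, names, sex="F"):
    """Ratio of year A's count to year B's count for each name present in both years."""
    a = counts_by_sex(text_a, sex)
    b = counts_by_sex(text_b, sex)
    return {n: a[n] / b[n] for n in names if n in a and b.get(n)}

# Only the figures quoted in this thread; the 2005 count for female William was not given.
sample_2004 = "Jacob,F,171\nMichael,F,148\nWilliam,F,111\n"
sample_2005 = "Jacob,F,40\nMichael,F,61\n"

ratios = year_over_year_ratio(sample_2004, sample_2005, ["Jacob", "Michael", "William"])
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x as many 'female' births in 2004 as in 2005")
```

Names missing from either year are skipped rather than guessed at, so William drops out of this sample until the 2005 figure is filled in from the real file.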

September 18, 2017 8:21 PM

I think this is probably right; my guess for the spikes is a whole batch that got entered wrong, e.g. one county's Williams all entered as female instead of male one year.

If you look at any really popular name you'll find some cross-gender names in the statistics for the most popular years, generally in proportion to the name's overall popularity. A handful of these might be genuine, but since it works for feminine names as well as masculine I'm guessing most are data entry artefacts.
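One way to check the "in proportion to popularity" idea is to compute, for each name, what share of its births fall under the less common sex. This is only a sketch: it again assumes `Name,Sex,Count` lines, and `cross_sex_share` is a hypothetical helper, not anything the SSA provides.

```python
import csv
from io import StringIO

def cross_sex_share(yob_text, names):
    """For each name, return the smaller sex's count divided by the name's total count."""
    totals = {}
    for row in csv.reader(StringIO(yob_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        name, sex, count = row
        if name in names:
            totals.setdefault(name, {"F": 0, "M": 0})[sex] += int(count)
    return {
        name: min(c["F"], c["M"]) / (c["F"] + c["M"])
        for name, c in totals.items()
        if c["F"] + c["M"] > 0
    }
```

If the shares come out roughly the same across the most popular names in a normal year, that would fit a steady background rate of miscoding; a name or year whose share jumps well above the rest would point to something like a bad batch.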

By mk
September 22, 2017 12:35 PM

Data entry errors, most likely.

Looking at the state info, almost all of the ones in 2004 are from Kentucky. So either there was some local person named William who was big that year or, more likely, it was an error in how the data was entered.
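If anyone wants to repeat the state lookup, something like the sketch below should do it. It assumes the state-level files use `State,Sex,Year,Name,Count` lines; `state_breakdown` is a made-up helper name, and you would feed it the lines from whichever state files you have downloaded.

```python
import csv
from collections import Counter

def state_breakdown(rows, name, sex, year):
    """Sum counts per state for one name/sex/year from 'State,Sex,Year,Name,Count' lines."""
    per_state = Counter()
    for row in csv.reader(rows):
        if len(row) != 5:
            continue  # skip blank or malformed lines
        state, row_sex, row_year, row_name, count = row
        if (row_name, row_sex, row_year) == (name, sex, str(year)):
            per_state[state] += int(count)
    # Largest states first, so a single-state spike shows up at the top
    return per_state.most_common()
```

Calling it with `name="William"`, `sex="F"`, `year=2004` on the combined state lines gives a list of `(state, count)` pairs, and one state dominating that list is the pattern described above.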

September 22, 2017 6:23 PM

I think it was some kind of error, as numerous other names showed up in KY ranked extraordinarily high for the "wrong" gender.

September 23, 2017 3:37 PM

Thanks for figuring out where they all were!