Now, have any of you all ever looked up this word? You know, in a dictionary? (Laughter) Yeah, that's what I thought. How about this word? You know, I'll show it to you: Lexicography: the practice of compiling dictionaries. Notice -- we're very specific. That word "compile." The dictionary is not carved out of a piece of granite, out of a lump of rock. It's made up of lots of little bits. It's little discrete -- that's spelled D-I-S-C-R-E-T-E -- bits. And those bits are words.
Now one of the perks of being a lexicographer -- besides getting to come to TED -- is that you get to say really fun words, like lexicographical. Lexicographical has this great pattern -- it's called a double dactyl. And just by saying double dactyl, I've sent the geek needle all the way into the red. But "lexicographical" is the same pattern as "higgledy-piggledy." Right? It's a fun word to say, and I get to say it a lot. Now, one of the non-perks of being a lexicographer is that people don't usually have a kind of warm, fuzzy, snuggly image of the dictionary. Right? Nobody hugs their dictionaries. But what people really often think about the dictionary is, they think more like this. Just to let you know, I do not have a lexicographical whistle. But people think that my job is to let the good words make that difficult left-hand turn into the dictionary, and keep the bad words out.
But the thing is, I don't want to be a traffic cop. For one thing, I just do not do uniforms. And for another -- deciding what words are good and what words are bad is actually not very easy. And it's not very fun, and when parts of your job are not easy or fun, you kind of look for an excuse not to do them. So if I had to think of some kind of occupation as a metaphor for my work, I would much rather be a fisherman. I wanna throw my big net into the deep blue ocean of English and see what marvelous creatures I can drag up from the bottom. But why do people want me to direct traffic, when I would much rather go fishing? Well, I blame the Queen. Why do I blame the Queen? Well, first of all, I blame the Queen 'cause it's funny. But secondly, I blame the Queen because dictionaries have really not changed.
Our idea of what a dictionary is has not changed since her reign. The only thing that Queen Victoria would not be amused by in modern dictionaries is our inclusion of the F-word, which has happened in American dictionaries since 1965. So, there's this guy, right? Victorian era, James Murray, first editor of the Oxford English Dictionary. I do not have that hat. I wish I had that hat. So he's really responsible for a lot of what we consider modern in dictionaries today. When a guy who looks like that -- in that hat -- is the face of modernity, you have a problem. And so, James Murray could get a job on any dictionary today. There'd be virtually no learning curve.
And of course, a few of us are saying: Computers! Computers! What about computers? The thing about computers is -- I love computers. I mean, I'm a huge geek, I love computers. I would go on a hunger strike before I let them take away Google Book Search from me. But computers don't do much else other than speed up the process of compiling dictionaries. They don't change the end result. Because what a dictionary is, it's Victorian design merged with a little bit of modern propulsion. It's steampunk. What we have is an electric velocipede. You know, we have Victorian design with an engine on it. That's all! The design has not changed.
And OK, what about online dictionaries, right? Online dictionaries must be different. This is the Oxford English Dictionary Online, one of the best online dictionaries. This is my favorite word, by the way: Erinaceous: pertaining to the hedgehog family; of the nature of a hedgehog. Very useful word. So look at that. Online dictionaries right now are paper thrown up on a screen. This is flat. Look how many links there are in the actual entry: two! Right? Those little buttons -- I had them all expanded except for the date chart. So there's not very much going on here. There's not a lot of clickiness. And in fact, online dictionaries replicate almost all the problems of print, except for searchability. And when you improve searchability, you actually take away the one advantage of print, which is serendipity. Serendipity is when you find things you weren't looking for because finding what you are looking for is so damned difficult.
So -- (Laughter) -- now, when you think about this, what we have here is a ham butt problem. Does everyone know the ham butt problem? Woman's making a ham for a big family dinner. She goes to cut the butt off the ham and throw it away, and she looks at this piece of ham and she's like, "This is a perfectly good piece of ham. Why am I throwing this away?" She thought, "Well my mom always did this." So she calls up Mom, and she says, "Mom, why'd you cut the butt off the ham when you're making a ham?" She says, "I don't know, my mom always did it!" So they call Grandma, and Grandma says, "My pan was too small!" (Laughter)
So it's not that we have good words and bad words -- we have a pan that's too small! You know, that ham butt is delicious! There's no reason to throw it away. The bad words -- see, when people think about a place and they don't find a place on the map, they think, "This map sucks!" When they find a nightspot or a bar and it's not in the guidebook, they're like, "Ooh, this place must be cool! It's not in the guidebook." When they find a word that's not in the dictionary, they think, "This must be a bad word." Why? It's more likely to be a bad dictionary. Why are you blaming the ham for being too big for the pan? So you can't get a smaller ham. The English language is as big as it is.
So if you have a ham butt problem, and you're thinking about the ham butt problem, the conclusion that leads you to is inexorable and counter-intuitive: paper is the enemy of words. How can this be? I mean, I love books. I really love books. Some of my best friends are books. But the book is not the best shape for the dictionary. Now they're gonna think "Oh, boy. People are gonna take away my beautiful, paper dictionaries?" No. There will still be paper dictionaries. When we had cars -- when cars became the dominant mode of transportation, we didn't round up all the horses and shoot them. You know, there're still gonna be paper dictionaries, but it's not gonna be the dominant dictionary. The book-shaped dictionary is not gonna be the only shape dictionaries come in. And it's not gonna be the prototype for the shapes dictionaries come in.
So think about it this way: if you've got an artificial constraint, artificial constraints lead to arbitrary distinctions and a skewed worldview. What if biologists could only study animals that made people go, "Aww." Right? What if we made aesthetic judgments about animals, and only the ones we thought were cute were the ones that we could study? We'd know a whole lot about charismatic megafauna, and not very much about much else. And I think this is a problem. I think we should study all the words, because when you think about words, you can make beautiful expressions from very humble parts. Lexicography is really more about material science. We are studying the tolerances of the materials that you use to build the structure of your expression: your speeches and your writing. And then often people say to me, "Well, OK -- how do I know that this word is real?" They think, "OK, if we think words are the tools that we use to build the expressions of our thoughts, how can you say that screwdrivers are better than hammers? How can you say that a sledgehammer is better than a ball-peen hammer? They're just the right tool for the job."
And so people say to me, "How do I know if a word is real?" You know, anyone that's read a children's book knows that love makes things real. If you love a word, use it. That makes it real. Being in the dictionary is an artificial distinction. It doesn't make a word any more real than any other way. If you love a word, it becomes real. So if we're not worrying about directing traffic, if we've transcended paper, if we are worrying less about control and more about description, then we can think of the English language as being this beautiful mobile. And any time one of those little parts of the mobile changes, is touched -- any time you touch a word, you use it in a new context, you give it a new connotation, you verb it -- you make the mobile move. You didn't break it; it's just in a new position, and that new position can be just as beautiful.
Now, if you're no longer a traffic cop -- the problem with being a traffic cop is there can only be so many traffic cops in any one intersection, or the cars get confused. Right? But if your goal is no longer to direct the traffic, but maybe to count the cars that go by, then more eyeballs are better. You can ask for help! If you ask for help, you get more done. And we really need help. Library of Congress: 17 million books. Of which half are in English. If only one out of every 10 of those books had a word that's not in the dictionary in it, that would be equivalent to more than two unabridged dictionaries.
And I find an un-dictionaried word -- a word like "un-dictionaried," for example -- in almost every book I read. What about newspapers? Newspaper archive goes back to 1759. 58.1 million newspaper pages. If only one in 100 of those pages had an un-dictionaried word on it, it would be an entire other OED. That's 500,000 more words. So that's -- that's a lot. And I'm not even talking about magazines, I'm not talking about blogs -- and I find more new words on BoingBoing in a given week than I do in Newsweek or Time. There's a lot going on there.
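The two estimates above are simple arithmetic, and a short sketch makes the numbers easy to check. This is only a back-of-envelope illustration, not anything shown in the talk: the function name `undictionaried_estimate` and its parameters are hypothetical, and it assumes, as the talk implicitly does, that each qualifying book or page contributes exactly one distinct new word with no overlap between sources.

```python
# Back-of-envelope check of the figures quoted in the talk.
# Assumption (not stated in the talk): every qualifying book or page
# contributes exactly one distinct un-dictionaried word, with no repeats.

def undictionaried_estimate(total_items: int, share_pct: int, one_in: int) -> int:
    """Estimate new words: take share_pct percent of the items,
    then count one new word for every `one_in` of them."""
    return total_items * share_pct // 100 // one_in

# Library of Congress: 17 million books, half in English, one book in 10.
words_from_books = undictionaried_estimate(17_000_000, 50, 10)

# Newspaper archive: 58.1 million pages, all counted, one page in 100.
words_from_pages = undictionaried_estimate(58_100_000, 100, 100)

# The talk's round figure for the size of the OED.
OED_WORDS = 500_000

print(words_from_books)              # 850000
print(words_from_pages)              # 581000
print(words_from_pages - OED_WORDS)  # 81000 words beyond one OED
```

On these assumptions the books alone give 850,000 words, so "more than two unabridged dictionaries" holds if an unabridged dictionary runs under 425,000 words, and the newspaper pages give 581,000, a little more than the 500,000 quoted for the OED.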
And I'm not even talking about polysemy, which is the greedy habit some words have of taking more than one meaning for themselves. So if you think of the word "set" -- a set can be a badger's burrow, a set can be one of the pleats in an Elizabethan ruff -- and that's one numbered definition in the OED. The OED has 33 different numbered definitions for set. Tiny little word, 33 numbered definitions. One of them is just labeled "miscellaneous technical senses." Do you know what that says to me? That says to me it was Friday afternoon and somebody wanted to go down the pub. That's a lexicographical cop-out, to say, "Miscellaneous technical senses."
So we have all these words, and we really need help! And the thing is, we could ask for help -- asking for help's not that hard. I mean, lexicography is not rocket science. See, I just gave you a lot of words and a lot of numbers, and this is more of a visual explanation. If we think of the dictionary as being the map of the English language, these bright spots are what we know about and the dark spots are where we are in the dark. If that was the map of all the words in American English, we don't know very much. And we don't even know the shape of the language. If this was the dictionary -- if this was the map of American English -- look, we have a kind of lumpy idea of Florida, but there's no California! We're missing California from American English. We just don't know enough, and we don't even know that we're missing California. We don't even see that there's a gap on the map.
So again, lexicography is not rocket science. But even if it were, rocket science is being done by dedicated amateurs these days. You know? It can't be that hard to find some words! So, enough scientists in other disciplines are really asking people to help, and they're doing a good job of it. For instance: there's eBird, where amateur birdwatchers can upload information about their bird sightings. And then ornithologists can go and help track populations, migrations, et cetera.
And there's this guy Mike Oates. Mike Oates lives in the U.K. He's a director of an electroplating company. He's found more than 140 comets. He's found so many comets, they named a comet after him. It's kind of out past Mars -- it's a hike. I don't think he's getting his picture taken there anytime soon. But he found 140 comets without a telescope. He downloaded data from the NASA SOHO satellite, and that's how he found them. If we can find comets without a telescope, shouldn't we be able to find words?
Now, you all know where I'm going with this, because I'm going to the Internet, which is where everybody goes. And the Internet is great for collecting words, because the Internet's full of collectors. And this is a little-known technological fact about the Internet, but the Internet is actually made up of words and enthusiasm. And words and enthusiasm actually happen to be the recipe for lexicography. Isn't that great? So there are a lot of really good word-collecting sites out there right now, but the problem with some of them is that they're not scientific enough. They show the word, but they don't show any context: Where did it come from? Who said it? What newspaper was it in? What book?
Because a word is like an archaeological artifact. If you don't know the provenance or the source of the artifact, it's not science -- it's a pretty thing to look at. So a word without its source is like a cut flower. You know -- it's pretty to look at for a while, but then it dies. It dies too fast. So this whole time I've been saying, "The dictionary, the dictionary, the dictionary, the dictionary." Not "a dictionary" or "dictionaries." And that's because -- well, people use the dictionary to stand for the whole language. They use it synecdochically -- and one of the problems of knowing a word like "synecdochically" is that you really want an excuse to say synecdochically. And so this whole talk has just been an excuse to get me to the point where I could say synecdochically to all of you. So I'm really sorry. But when you use a part of something -- like the dictionary is a part of the language, or a flag stands for the United States, a symbol of the country -- then you're using it synecdochically. But the thing is, we could make the dictionary the whole language. If we get a bigger pan, then we can put all the words in. We can put in all the meanings. Doesn't everyone want more meaning in their lives? And we can make the dictionary not just be a symbol of the language -- we can make it be the whole language.
You see, what I'm really hoping for is that my son -- who turns seven this month -- I want him to barely remember that this is the form factor that dictionaries used to come in. This is what dictionaries used to look like. I want him to think of this kind of dictionary as an eight-track tape. It's a format that died because it wasn't useful enough. It wasn't really what people needed. And the thing is, if we can put in all the words, no longer have that artificial distinction between good and bad, we can really describe the language like scientists. We can leave the aesthetic judgments to the writers and the speakers. If we can do that, then I can spend all my time fishing and I don't have to be a traffic cop anymore. Thank you very much for your kind attention.