Amy Alexander on 20 Nov 2000 03:05:28 -0000
Re: <nettime> The cultural bias of translating programs
On the general topic of Babelfish, cultural bias, etc... Had a project on plagiarist.org from about '98 to '00 called "Debabelizer." If you know the image-translation software called "Debabelizer," you know that its intent is to straighten out the Tower of Babel of image formats so that any image could appear exactly the same in any language. So, the plagiarist.org Debabelizer purported to do the same for language on the web - i.e. translate web pages into homogenized, Universal Web Language, free of Cultural Confusion... since, when you get right down to it, this seems to be somewhat the underlying assumption of translation software. (Though in the fine print they'll admit that's problematic, the big print is what the marketing folks write, obviously... more on this later...)

The plagiarist Debabelizer took a web page, as chosen by the visitor, fed it into Babelfish to translate it from its native language into another, and then again to Babelfish to translate back into its native language.

A canned sample can be seen at:
http://plagiarist.org/debabdemo/

The original site, where you can see the setup, is:
http://plagiarist.org/debab

(However, that one, at least for now, *will not debabelize*... Babelfish changed their engine, and I haven't gotten it to work again with webpages so far. Of course, the same thing can be done manually...)

There is also a similar debabelization technique in use for the Plagiarist Guestbook:
http://plagiarist.org/debab/guestbook.html

"(As a service to our international community of visitors, comments will be automatically "debabelized" via Babelfish Translation Services.)"

So, what have I learned about cultural bias of Babelfish from all this? I imagine that Babelfish suffers from the same cultural bias as whatever dictionaries it's using - at least when going to English. I suspect it's using fairly generic translation dictionaries...
The reason I say this is, as far as I can tell, Babelfish seems to select Standard Dictionary Definition #1 as the meaning of a word whenever in doubt. Among other things, this sometimes makes for some flamboyant translations, because Definition #1 of a verb tends to be more active/dramatic than its other meanings. This is all very unscientific on my part, of course.

Also, Babelfish gets itself into trouble with adjectives fairly often, and quickly we discover the thin line between "close enough" and Politically Incorrect. My favorite Debabelizer gaffe: a New York Times front page featured a blurb about the "Yankees' colorful players." Debabelizer translated that into "the Yankees' colored players." I suspect the problem here is that the grammar algorithms are biased toward past participles. Whether or not that counts as a cultural bias I'm not sure, but it certainly does get Babelfish into some cultural hot water in such cases...

Also, different language pairs generate different translations, obviously. German and Portuguese tend to generate the most amusing translations when translated from English -> Other Language -> English, with German being the most consistently funny. I'm sure there's an interesting linguistic reason for this... I'm not a linguist, but perhaps it has to do with the similarity in many word definitions between English and the other language, combined with a dissimilarity in the grammars. Just a guess.

Originally, I thought of the premises upon which I'd based Debabelizer as being humorous exaggerations - i.e., nobody could *really* take these terrible translators at all seriously for actual communication - at least not for more than an extremely rough translation of web pages. Even the marketing execs wouldn't have the chutzpah to purport that something as culturally subtle and complex as conversational language could/should be entrusted to computer programs, I thought.
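For anyone who wants to try the round trip by hand, here is a minimal Python sketch of the two ideas described above: translating out of a language and back again, and always picking "Definition #1" for each word. Everything in it is an assumption made for illustration: the toy lexicons, the made-up pivot "language," and the function names `first_sense_translate` and `debabelize` are hypothetical. It is not Babelfish's algorithm or data, and not the code behind the plagiarist.org Debabelizer.

```python
from typing import Callable, Dict, List

# Hypothetical toy lexicons, invented for this sketch only.
# Each word maps to its candidate senses in "dictionary order".
Lexicon = Dict[str, List[str]]

EN_TO_PIVOT: Lexicon = {
    "keeps": ["CONTINUES"],
    "bombing": ["ATTACK-WITH-BOMBS", "CRASH"],  # sense #1 is the dramatic one
}

PIVOT_TO_EN: Lexicon = {
    "CONTINUES": ["continues"],
    "ATTACK-WITH-BOMBS": ["blowing up"],
    "CRASH": ["crashing"],
}


def first_sense_translate(text: str, lexicon: Lexicon) -> str:
    """Translate word by word, always taking sense #1.

    Words missing from the lexicon pass through unchanged.
    """
    return " ".join(lexicon.get(word, [word])[0] for word in text.split())


def debabelize(
    text: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
) -> str:
    """Round trip: native language -> other language -> native language."""
    return backward(forward(text))


def to_pivot(text: str) -> str:
    return first_sense_translate(text, EN_TO_PIVOT)


def to_english(text: str) -> str:
    return first_sense_translate(text, PIVOT_TO_EN)


if __name__ == "__main__":
    print(debabelize("my apple keeps bombing", to_pivot, to_english))
```

Because `debabelize` only takes two callables, the toy translators could be swapped for calls to any real translation service; the sense that was meant ("CRASH") is lost on the way out and cannot be recovered on the way back.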
However, some recent articles in various mainstream press - Wired, etc. - have made me think twice about this assumption. I have the following clipping from a newspaper interview taped to my refrigerator - with the interviewee's name stupidly clipped out for some reason. But anyway, he was some sort of Executive of New Technologies, for some big company which I think might have been IBM. Anyway, here's the excerpt:

Q: Can you give an example of the Internet becoming a more "natural" experience?

A: I am excited about language translations. [Today, instant online translation] is not good enough for contracts but it is good enough for conversation. It is good enough for customer service and support. So, for example, a Spanish-speaking person can ask a question of customer service and a Chinese person can answer it in Chinese and the Spanish person hears it in Spanish. [As such services become broadly available, they will] make the Internet a lot more natural for large numbers of people.

Yeah, I can hardly wait to call Macintosh tech support to say my Apple keeps bombing, and have the FBI come arrest me for threatening to blow up New York....

-amy

#  distributed via <nettime>: no commercial use without permission
#  <nettime> is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: majordomo@bbs.thing.net and "info nettime-l" in the msg body
#  archive: http://www.nettime.org contact: nettime@bbs.thing.net