Publications by Karl G Linden.
Papers Published
- Linden, K., Finding cross-lingual spelling variants, String Processing and Information Retrieval. 11th International Conference, SPIRE 2004. Proceedings (Lecture Notes in Comput. Sci. Vol. 3246) (2004), pp. 136-137.
(last updated on 2007/04/09)
Abstract: Finding term translations as cross-lingual spelling variants on the fly is an important problem for cross-lingual information retrieval (CLIR). CLIR is typically approached by automatically translating a query into the target language. In automatic query translation, specialized terminology is often missing from the translation dictionary. The analysis of query properties in (Pirkola and Jarvelin, 2001) has shown that proper names and technical terms are often prime keys in queries, and if they are not properly translated or transliterated, query performance may deteriorate significantly. As proper names often need no translation, a trivial solution is to include the untranslated keys as such in the target language query. However, technical terms in European languages often have common Greek or Latin roots, which allows for a more advanced solution: using approximate string matching to find the word or words most similar to the source keys in the index of the target language text database (Pirkola et al., 2001).
Keywords: full-text databases; information retrieval; language translation; natural languages; string matching
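
The approximate string matching idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: it assumes plain Levenshtein edit distance, normalized by the longer word's length, as the similarity measure, and the function names (`edit_distance`, `closest_terms`) and the small Finnish index are hypothetical, chosen only for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def closest_terms(source_key, target_index, k=1):
    """Return the k index terms with the smallest normalized edit distance."""
    def score(term):
        longest = max(len(source_key), len(term), 1)
        return edit_distance(source_key.lower(), term.lower()) / longest
    return sorted(target_index, key=score)[:k]


# Hypothetical target-language (Finnish) index terms, for illustration only.
target_index = ["hemoglobiini", "psykologia", "kromosomi", "bakteeri", "tietokone"]

print(closest_terms("hemoglobin", target_index))
print(closest_terms("chromosome", target_index))
```

With this toy index, each English source key is matched to the index term that shares its Greek or Latin root. A real system would need a much larger index and probably a similarity measure that accounts for regular spelling correspondences between the two languages.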