Abstract
A general problem in all systems that process language (parsing, translating, etc.) is ambiguity: words have many, fuzzily defined meanings, and meanings shift with the context. This may be tackled by quantifying the connotative or associative meaning, which can be represented as a matrix of mutual association strengths. With many thousands of words, however, there are billions of possible associations, and there is no obvious method to measure all of them. This "knowledge acquisition bottleneck" can be tackled by mining implicit associations from the billions of documents and millions of users on the World-Wide Web. The present paper discusses two methods to achieve this: lexical co-occurrence, which measures the frequency with which words appear in each other's neighborhood, and web learning algorithms, which apply the Hebbian rule to create associations between successively "activated" words or pages. The mechanism of spreading activation can be applied to the resulting associative networks for clustering, context-driven disambiguation, and personalized recommendation. A generalization of such methods could transform the web into a "global brain", that is, an intelligent, learning network that assimilates the implicit knowledge and preferences of its users.
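As a rough illustration of the three mechanisms named in the abstract, the following Python sketch builds an association matrix from co-occurrence counts, applies a Hebbian-style update to a sequence of activated items, and spreads activation over the result. It is not taken from the paper: the function names, the window size, the learning rate, the decay factor, and the toy documents are all assumptions chosen for illustration.

```python
from collections import defaultdict


def cooccurrence(docs, window=2):
    """Count how often two words occur within `window` tokens of each other.

    The window size and the whitespace tokenization are illustrative assumptions.
    """
    assoc = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    assoc[word][other] += 1.0  # symmetric association
                    assoc[other][word] += 1.0
    return assoc


def hebbian_update(assoc, sequence, rate=0.1):
    """Strengthen the directed link from each activated item to the next one.

    `rate` is a hypothetical learning rate, not a value from the paper.
    """
    for first, second in zip(sequence, sequence[1:]):
        if first != second:
            assoc[first][second] += rate


def spread_activation(assoc, seeds, steps=1, decay=0.5):
    """Propagate activation from seed items along normalized association links."""
    activation = dict(seeds)
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, level in activation.items():
            neighbors = assoc.get(node, {})
            total = sum(neighbors.values())
            if total == 0:
                continue
            for other, weight in neighbors.items():
                # Each neighbor receives a share proportional to link strength.
                incoming[other] += decay * level * weight / total
        for node, extra in incoming.items():
            activation[node] = activation.get(node, 0.0) + extra
    return activation


# Toy usage: the context word "river" should favor "water" over "money".
docs = ["river bank water", "bank money loan"]
assoc = cooccurrence(docs)
result = spread_activation(assoc, {"bank": 1.0, "river": 1.0})
print(sorted(result.items(), key=lambda item: -item[1]))
```

In this toy setup, activating "bank" together with the context word "river" raises "water" above "money", which is the kind of context-driven disambiguation the abstract refers to. A realistic system would need proper tokenization, normalization of counts, and a far larger corpus.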
Original language | English |
---|---|
Title | Proceedings of the International Colloquium: Trends in Special Language & Language Technology |
Editors | R. Temmerman, M. Lutjeharms |
Publisher | Standaard Editions, Antwerpen |
Pages | 15-44 |
Number of pages | 30 |
Publication status | Published - 2001 |
Event | Unknown - Duration: 1 Jan 2001 → … |
Conference
Conference | Unknown |
---|---|
Period | 1 Jan 2001 → … |