Ubiquitous Knowledge Processing Lab
The Ubiquitous Knowledge Processing Lab (also UKP Lab) is a research lab in the Department of Computer Science at Technische Universität Darmstadt. It was founded in 2006 by Iryna Gurevych.
Research Activities
UKP Lab develops natural language processing techniques for automatically understanding written text and applies them to information management tasks such as information retrieval, question answering, and structuring information in wikis.[1]
The Ubiquitous Knowledge Processing Lab is among the leading research institutes in the field of utilizing Web 2.0 content as a source of lexical semantic information for natural language processing (NLP). Wikipedia and Wiktionary are employed as collaboratively constructed lexical semantic resources and used to improve expert-built resources like WordNet. These resources are used to develop semantically enhanced algorithms for information retrieval and question answering. An example is semantic search: if a user enters the query "pie-fruit" into a search engine, a standard search engine will retrieve pages containing the word "pie" but not the word "fruit", and will therefore still return plenty of pages on "apple pie". An intelligent search engine will "understand" that the user is interested in pie recipes that do not use any type of fruit and retrieve appropriate documents.[2]
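The difference between the two search behaviours in the example above can be illustrated with a minimal sketch. It is not UKP Lab's algorithm: the `HYPONYMS` table, the `DOCUMENTS` collection, and both function names are hypothetical, and the hand-written table merely stands in for a lexical semantic resource such as WordNet or Wiktionary.

```python
# Hypothetical stand-in for a lexical semantic resource:
# maps a general term to more specific terms (hyponyms).
HYPONYMS = {
    "fruit": {"apple", "cherry", "lemon"},
}

# Hypothetical toy document collection.
DOCUMENTS = {
    "doc1": "classic apple pie recipe",
    "doc2": "pie with fruit topping",
    "doc3": "chicken pot pie recipe",
}


def tokenize(text):
    """Split text into a set of lowercase whitespace-separated tokens."""
    return set(text.lower().split())


def keyword_search(include, exclude, documents):
    """Literal matching: keep documents that contain `include`
    and do not contain the exact word `exclude`."""
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if include in tokenize(text) and exclude not in tokenize(text)
    )


def semantic_search(include, exclude, documents, hyponyms):
    """Expand the excluded term with its hyponyms before filtering."""
    excluded = {exclude} | hyponyms.get(exclude, set())
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if include in tokenize(text) and not (excluded & tokenize(text))
    )


print(keyword_search("pie", "fruit", DOCUMENTS))
print(semantic_search("pie", "fruit", DOCUMENTS, HYPONYMS))
```

In this toy setting the literal search still returns the apple pie document, because the word "fruit" never appears in it, while the expanded search filters it out once "apple" is known to be a kind of fruit.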
Further research activities at UKP Lab include automatic quality assessment of text, sentiment analysis, and opinion mining. Research activities are organized into the following research areas:
- Educational natural language processing
- Multilingual semantic information management
- Natural language processing for Wikis
UKP Lab places a strong focus on applying novel natural language processing algorithms in real-life applications. It collaborates with partners from academia and industry on application scenarios such as customer relationship management, digital humanities, educational applications, and public security.
Software
Part of the research efforts at UKP Lab is the development of natural language processing (NLP) software. The following software packages are freely available for research purposes:
DKPro
The Darmstadt Knowledge Processing Software Repository (DKPro) is an open-source community of software projects aimed at natural language processing. It offers robust, ready-to-use NLP components that are built on top of IBM’s Unstructured Information Management Architecture (UIMA) as a common and open framework.
DKPro contains basic natural language processing components such as part-of-speech tagging and lemmatization. Additionally, the package offers components that support the processing of user-generated discourse. User-generated content contains spelling errors, abbreviations, and emoticons, which prevent the direct application of standard NLP components. DKPro provides the required preprocessing tools.
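The kind of preprocessing described above can be sketched in a few lines. This is an illustration only and does not use or reproduce the DKPro API: the `ABBREVIATIONS` table, the `EMOTICON` pattern, and the `normalize` function are all hypothetical, and spelling correction is left out.

```python
import re

# Hypothetical lookup table for common chat abbreviations.
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks"}

# Hypothetical pattern covering a few simple Western-style emoticons,
# e.g. ":)", ";-)", "=D".
EMOTICON = re.compile(r"[:;=]-?[)(DPp]")


def normalize(text):
    """Strip emoticons, tokenize, and expand known abbreviations so that
    the output is closer to what standard NLP components expect."""
    text = EMOTICON.sub(" ", text)
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [ABBREVIATIONS.get(token, token) for token in tokens]


print(normalize("thx, u r gr8 :-)"))
```

Tokens that are not in the table, such as "r" in the example call, pass through unchanged; a real pipeline would need a much larger lexicon and a spelling-correction step.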
Wikipedia API
The Java Wikipedia Library (JWPL)[3] was also developed at UKP Lab. It is a Java-based application programming interface for Wikipedia and allows programmatic access to all information contained in Wikipedia.
Wiktionary API
Analogous to JWPL, the Java Wiktionary Library (JWKTL)[3] offers programmatic access to information contained in the English and German versions of Wiktionary.
References
- Hessen-IT News 03/2008.
- Example from: Impulse für die Wissenschaft 2010 (Volkswagenstiftung).
- Reference publication: Zesch, Müller, Gurevych: Extracting Lexical Semantic Knowledge from Wikipedia and Wiktionary, Proceedings of LREC 2008.