Description
|
The project addresses the current heterogeneity of language data intended for linguistic research. Its outcome will be a unified system for storing and using language resources, together with robust tools for effective text processing. All available language resources will be converted to the new system. The project also deals with the detection and classification of "named entities" in Czech texts, a task not yet solved for the Czech language. Including named-entity annotation in the unified data system will improve the results of automatic language processing, especially information retrieval from large text databases.
|