Genomics IULA Corpus in Spanish Corpus Text

Resource Name Genomics IULA Corpus in Spanish
Description The corpus consists of specialized texts from the genome domain. This LSP (language for specific purposes) corpus was compiled from articles in specialized publications, PhD theses, etc. It contains about 1.65 million words in 276 documents.
Language Name Spanish
Url http://www.iula.upf.edu
Documentation
Annotation Mode Automatic
Annotation Standoff true
Annotation Tool TreeTagger
Annotation Type Morphosyntactic Annotation, POS Tagging
Character Encoding UTF-8
Contact Person Jorge Vivaldi
Creation Mode Automatic
Domain medicine
Funding Project METANET4U – Enhancing the European Linguistic Infrastructure
Identifier IULA_cGenES
Language Code http://www.fao.org/aims/aos/languagecode.owl#spa
Language Identifier es
Licence CC BY-NC-SA
Linguality Monolingual
Media Type Media Type
Meta Share Identifier NOT_DEFINED_FOR_V2
Mime Type http://purl.org/NET/mediatypes/text/xml
Original Source IULACT GENOME http://bwananet.iula.upf.edu/
Resource Creator Universitat Pompeu Fabra. Institut Universitari De Lingüística Aplicada (IULA)
Resource Short Name Genomics Spanish
Segmentation Level Word
Size Information http://lodserver.iula.upf.edu/Metashare/resource/size_N15A15
Tagset MULTEXT/PAROLE