IULA Spanish-English Technical Corpus Corpus Text

Resource Name IULA Spanish-English Technical Corpus
Description The corpus consists of specialized texts (Law, Economics, Medicine, Environment and Computer Science domains) available in both Spanish and English. This LSP corpus has been compiled from articles in specialized publications, PhD theses, etc. It contains a total of about 2.1 M words in 127 documents in each language.
Language Name
  • English
  • Spanish
Url http://hdl.handle.net/10230/20052
Documentation
Annotation Mode Manual
Annotation Standoff true
Annotation Tool TreeTagger
Annotation Type Morphosyntactic Annotation (POS Tagging)
Character Encoding UTF-8
Contact Person Jorge Vivaldi
Creation Mode Manual
Domain
  • medicine
  • economy
  • environment
  • computer science
  • law
Funding Project METANET4U – Enhancing the European Linguistic Infrastructure
Identifier http://hdl.handle.net/10230/20052
Language Code
Language Identifier
  • en
  • es
Licence CC BY
Media Type Media Type
Meta Share Identifier NOT_DEFINED_FOR_V2
Mime Type http://purl.org/NET/mediatypes/text/xml
Multilinguality Type Parallel
Original Source IULACT http://bwananet.iula.upf.edu
Resource Creator Universitat Pompeu Fabra. Institut Universitari De Lingüística Aplicada (IULA)
Resource Short Name bilingual corpus
Segmentation Level Word
Size Information http://lodserver.iula.upf.edu/Metashare/resource/size_N18939
Tagset MULTEX/PAROLE