Services

Services by Task

Lexicon Merging

  • LMF Merger Web Service

    Given a list of URLs pointing to LMF files, this webservice merges them into a single LMF file. It works for LMF files encoding the information in the same way, i.e. same labels, values and structure. This will work, for example, for merging different lexica learnt under PANACEA platfor... more

NLP Applications

  • Basyque

    BASYQUE (Base de Données Syntaxique Basque) is the web application we have developed to store, organize, manage and search for all the information concerning dialectal variation in Basque speaking areas, and specifically, in the North-Eastern Basque dialects. In order to collect and ana... more

  • BertsolarIXA

    Application that finds words ending in the character sequence given by the user. BertsolarIXA can find not only lemmas but also inflected forms. Results can be filtered by domain, and phonetic rules can also be applied. It is a tool aimed at helping verse-makers.

Word Sense Disambiguation

  • Wsd Ixa

    Word-Sense Disambiguation. The WSD system is based on the well-known Support Vector Machine (SVM) algorithm. This system has been trained on the EuSemCor corpus (the only semantically tagged Basque corpus). Due to the corpus's small size, the WSD system has been trained for 402 polysemous ... more

Alignment

Chunking: Segmentation

  • FreeLing Chunker Parser Web Service V.2.1

    FreeLing-based chunker parser. The languages supported are English, Catalan, Spanish, Asturian and Galician. WARNING: This WS has a new version.

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • FreeLing Sentence Splitter Web Service V.2.1

    This WS performs a FreeLing-based sentence splitter. The WS splits a file in plain text format and UTF-8 encoded into units (tokens). Output sentences are separated by empty lines. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, Russian and Po... more

Corpus Processing

  • Stream Editor Web Service (Sed)

    This WS performs basic text transformations on an input text. The service is based on the 'sed' program, a Unix utility that parses and transforms text using a simple, compact programming language.

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • IULA GrAF Tagger Web Service

    This WS converts the output of the IULA tagger (a PoS tagger) to GrAF format.

  • FreeLing Tokenizer Web Service V.2.1

    This WS deploys a FreeLing-based text tokenizer. The WS splits a file in plain text format and UTF-8 encoded into units (tokens). The languages supported are Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, Welsh, and Asturian. WARNING: This WS has a new version.

  • Corpus To Vectors Web Service

    This WS converts a corpus to a Weka vector ARFF file. The languages supported are Asturian, Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, and Welsh.

  • IULA Preprocess Web Service

    This WS provides a text segmentation into minor structural units (titles, paragraphs, sentences, etc.); detection of entities (not found in a dictionary: numbers, abbreviations, URLs, emails, etc.); and the keeping of sequences of two or more words in a single block (dates, phrases, etc... more

  • FreeLing Sentence Splitter Web Service V.3

    This WS performs a FreeLing-based sentence splitter (v 3.0). The WS splits a file in plain text format and UTF-8 encoded into units (tokens) separated by new lines. Output sentences are separated by empty lines. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Ga... more

  • FreeLing Morphosyntactic Tagger Web Service V.3

    This WS performs a FreeLing-based part-of-speech tagger (v 3.0). WS job duration depends on the server load, approximately 1 million words takes one minute. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, and Portuguese. The output is a tabula... more

  • Search Signatures Web Service

    Given a list of lemmas, the WS looks for their occurrences in the IULA corpus, applies the given regular expressions and returns all the signatures.

  • FreeLing Tokenizer Web Service V.3

    This WS deploys a FreeLing-based text tokenizer (v 3.0). The WS splits a file in plain text format and UTF-8 encoded into units (tokens) where tokens are separated by new lines. The languages supported are Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, Welsh, and As... more

  • IULA Tokenizer Web Service

    The IULA tokenizer WS splits a file in plain text format and UTF-8 encoded into units (tokens). The languages supported are Catalan and Spanish.

  • Provenance Collector Web Service

    This WS collects all the headers of input XML files used in a Taverna workflow. The metadata that can be stored in the resulting XML file are: 1) workflow name, 2) workflow myExperiment link, 3) processors list, and 4) list of XML headers.

Corpus Workbench

  • CQP Query Web Service

    This WS allows querying an already indexed corpus (see CQP indexer WS for indexing details). The WS is based on the IMS Open Corpus Workbench (CWB). Language independent WS.

  • CQP Indexer Web Service

    CQP indexer WS based on the IMS Open Corpus Workbench (CWB). The input is an annotated corpus in tabular format. The output is the Corpus ID to be used by the CQPquery Web Service. Language independent WS.

Format Conversion

Lexicon Terminology Extraction

  • Bayesian Parameter Estimation Web Service

    Given a training set encoded as vectors of cue (or feature) occurrences, this web service estimates the parameters P(cue_i|class): the probability of seeing each cue as a member or non-member of the class. This estimation is performed using Bayesian inference, which combines prior knowle... more

  • Process Nouns Classifier Web Service

    This WS identifies process nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pred... more

  • Weka Noun Signatures Creator Web Service

    This web service creates a weka file containing context information of a list of nouns in a given corpus. The context information for each noun is extracted using a set of Regular Expressions and it is encoded in one vector (one line per noun in the weka file). Each slot in the vector r... more

  • LMF File Merger Web Service

    Given two LMF files, this webservice merges them into a single LMF file. It works for LMF files encoding the information in the same way, i.e. same labels, values and structure. This will work, for example, for merging different lexica learnt under PANACEA platform. If the LMF files con... more

  • Artifact Nouns Classifier Web Service

    This WS identifies artifact nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pr... more

  • Eventive Nouns Classifier Web Service

    This WS identifies eventive nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pre... more

  • Matter Nouns Classifier Web Service

    This WS identifies matter nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this predic... more

  • Abstract Nouns Classifier Web Service

    This WS identifies abstract nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pre... more

  • Select Nouns From LMF Lexicon Web Service

    Given a LMF file with nouns classified with a score (see Nouns classifier Web Services), this WS filters the nouns with confidence over a desired threshold. Language independent WS.

  • Semiotic Nouns Classifier Web Service

    This WS identifies semiotic nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pre... more

  • Location Nouns Classifier Web Service

    This WS identifies location nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pre... more

  • Human Nouns Classifier Web Service

    This WS identifies human nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this predict... more

  • Social Nouns Classifier Web Service

    This WS identifies social nouns in a part of speech tagged text (with FreeLing Morphosyntactic tagger V 3.0 WS). The classification is performed with a pre-trained Decision Tree. The output is a LMF file with the classifier prediction for each noun. You can choose to have this pred... more

  • Naive Bayes Classifier Web Service

    This webservice performs traditional Naive Bayes classification of instances given in a weka file. It outputs the predicted classification for each instance and some statistics about the performance of the classification. The parameters needed as input can be learnt using estimate_bayes... more

  • Lexical Classifier Web Service

    Given a set of signatures in a weka file (test_file.arff), classify them using the parameters estimated for each cue (theta_file.csv).

Management

Morphological Tagging

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • Wsd Ixa

    Word-Sense Disambiguation. The WSD system is based on the well-known Support Vector Machine (SVM) algorithm. This system has been trained on the EuSemCor corpus (the only semantically tagged Basque corpus). Due to the corpus's small size, the WSD system has been trained for 402 polysemous ... more

  • Morfeus

    Morphological analyzer.

  • Eustagger

    Lemmatizer. Eustagger is a robust and wide-coverage morphological analyser and a Part-of-Speech tagger for Basque. The analyser is based on the two-level formalism and has been designed in an incremental way with three main modules: the standard analyser, the analyser of linguistic vari... more

Morphosyntactic Tagging

  • FreeLing Morphosyntactic Analyzer Web Service V.2.1

    This Web Service deploys a FreeLing-based morphological analyzer. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, Russian and Portuguese. WARNING: This WS has a new version.

  • Twitter NLP Web Service

    This WS is based on the Twitter NLP tool developed by Noah's ARK group (Noah Smith's research group at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University). A fast and robust Java-based tokenizer and part-of-speech tagger for Twitter, its trainin... more

  • IULA GrAF Tagger Web Service

    This WS converts the output of the IULA tagger (a PoS tagger) to GrAF format.

  • IULA Preprocess Web Service

    This WS provides a text segmentation into minor structural units (titles, paragraphs, sentences, etc.); detection of entities (not found in a dictionary: numbers, abbreviations, URLs, emails, etc.); and the keeping of sequences of two or more words in a single block (dates, phrases, etc... more

  • FreeLing Morphosyntactic Tagger Web Service V.3

    This WS performs a FreeLing-based part-of-speech tagger (v 3.0). WS job duration depends on the server load, approximately 1 million words takes one minute. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, and Portuguese. The output is a tabula... more

  • FreeLing Morphosyntactic Analyzer Web Service V.3

    This Web Service deploys a FreeLing-based morphological analyzer (v 3.0). The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, Russian and Portuguese.

  • FreeLing Morphosyntactic Tagger Web Service V.2.1

    This WS performs a FreeLing-based part-of-speech tagger. WS job duration depends on the server load, approximately 1 million words takes one minute. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, and Portuguese. WARNING: This WS has a new ver... more

  • IULA Tree Tagger Web Service

    This WS is a morphosyntactic tagger. The disambiguation process is done by a TreeTagger instance trained by the IULA. The input is plain text in Catalan or Spanish. The output allows optional formats and optional encoding. (http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/)

Named Entity Recognition

  • Eihera

    Eihera is a system for Named Entity recognition and classification in written Basque. The system is designed in four steps: first, the development of a recognizer based on linguistic information represented on finite-state-transducers; second, the generation of semi-automatically annota... more

  • Textual Emigration Analysis

    Historians, literary scientists, and others are interested in the semantic interpretation of text. With automatic pre-processing of texts, e.g. named entity recognition, coreference resolution, and dependency parsing, relevant semantic relations can be extracted. The Stuttgart tools ext... more

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • Anonymizer Web Service

    This WS substitutes proper nouns with tags. This process anonymizes an input text by eliminating any person, place, corporation, etc. name. The service automatically calls the FreeLing WS and makes use of its Named Entity Recognition tool to detect proper nouns. The languages supported ... more

  • ContaWords

    ContaWords is a web application that reads the words of a text file and decides what part of speech to assign to each word (credit-Noun-credit but credit-Verb-to_credit). It then begins to count how many times a word appears in the text in every possible way (credits, credit, credited… ... more

  • FreeLing Morphosyntactic Tagger Web Service V.3

    This WS performs a FreeLing-based part-of-speech tagger (v 3.0). WS job duration depends on the server load, approximately 1 million words takes one minute. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, and Portuguese. The output is a tabula... more

  • FreeLing Morphosyntactic Tagger Web Service V.2.1

    This WS performs a FreeLing-based part-of-speech tagger. WS job duration depends on the server load, approximately 1 million words takes one minute. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, and Portuguese. WARNING: This WS has a new ver... more

  • FreeLing Named Entity Recognition Web Service

    This Web Service deploys a FreeLing-based named entity recognizer (v 3.0). The languages supported are English, Catalan, Spanish, Asturian, Welsh, Galician, Italian, Russian and Portuguese.

Querying

  • PML-TQ Search Engine and Interface

    PML-TQ is a powerful open-source search tool for all kinds of linguistically annotated treebanks with several client interfaces and two search backends (one based on a SQL database and one based on Perl and the TrEd toolkit). The tool works natively with treebanks encoded in the PML dat... more

  • Keeleveeb Query

    Keeleveeb is a portal, where one can run queries on several dictionaries and corpora. There are 12 Estonian monolingual dictionaries, 12 bilingual dictionaries (one of them Estonian), 19 Specialty dictionaries, 15 Learner dictionaries (bilingual, Estonian-Russian-Estonian), 23 corpora, ... more

  • IULA Concordancer Web Service

    Given a lemma and a category, this WS returns the sentences of the IULA corpus where this lemma occurs. The user can perform a domain search. The languages supported are Spanish and English.

  • The Glossa Corpus Search System

    New version of the corpus search and post-processing tool Glossa. While the old version was tightly coupled to the IMS Corpus Workbench (CWB) and could only search in CWB-encoded corpora, the new version is flexible with respect to search engines and can even search in corpora located o... more

  • Gretel 2.0

    GrETEL stands for Greedy Extraction of Trees for Empirical Linguistics. It is a user-friendly search engine for the exploitation of treebanks. It comes in two formats: a) Example-based search: in this search mode you can use a natural language example as a starting point for searching ... more

  • Search Signatures Web Service

    Given a list of lemmas, the WS looks for their occurrences in the IULA corpus, applies the given regular expressions and returns all the signatures.

Statistics Analysis

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • P Clue/ Lexical Class Calculator Web Service

    This WS calculates the probability of seeing a linguistic cue given a lexical class (P(cue|class) value). This probability is computed given the occurrences of cues in a corpus (codified in the signatures file) and the information of belonging or not belonging of these words to differen... more

  • Ted Pedersen's Ngrams Counter Web Service

    This WS performs the Count function from Ted Pedersen's Ngram Statistics Package (used to identify word Ngrams that appear in large corpora using standard tests of association such as Fisher's exact test, the log likelihood ratio, Pearson's chi-squared test, the Dice Coefficient, etc.).... more

  • ContaWords

    ContaWords is a web application that reads the words of a text file and decides what part of speech to assign to each word (credit-Noun-credit but credit-Verb-to_credit). It then begins to count how many times a word appears in the text in every possible way (credits, credit, credited… ... more

  • TF-IDF Calculator Web Service

    This WS calculates the Term Frequency (TF) and the Inverse Document Frequency (IDF) of a word in a given corpus. Combined as TF-IDF, the two values give a statistical measure of how important a word is to a document in a collection or corpus.

  • Ted Pedersen's Ngram Statistics Package

    Ted Pedersen's Ngram Statistics Package (used to identify word Ngrams that appear in large corpora using standard tests of association such as Fisher's exact test, the log likelihood ratio, Pearson's chi-squared test, the Dice Coefficient, etc.).

  • P Clue/ Lexical Class From Weka Computer Web Service

    Given a training set encoded as vectors of cue (or feature) occurrences in weka format, this web service computes P(cue_i|class): the probability of seeing each cue as a member or non-member of the class using MLE approach (counts frequencies of appearance of each cue in each class). ... more

  • Vocabulary Analyzer Web Service

    This WS calculates different lexicometric measures and displays them graphically (tokens, types, hapaxes and type/token ratio). The input is a plain text corpus with one token per line. Language independent WS.

  • CQP Analyzer Web Service

    This WS allows analyzing an already indexed corpus (see CQP indexer WS for indexing details). The WS returns an Excel file with some statistical metrics such as number of nouns, verbs, ngrams, etc. The languages supported are Spanish and English.

  • P Clue/ Lexical Class Computer Web Service

    This WS calculates the probability of seeing a linguistic cue given a lexical class (P(cue|class) value). This probability is computed given the occurrences of cues in a corpus (codified in the signatures file) and the information of belonging or not belonging of these words to differen... more

  • Corpus To Vectors Web Service

    This WS converts a corpus to a Weka vector ARFF file. The languages supported are Asturian, Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, and Welsh.
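
    The lexicometric measures named in the Vocabulary Analyzer Web Service entry above (tokens, types, hapaxes and type/token ratio) can be illustrated with a minimal sketch. It assumes the input format the entry describes, a plain text corpus with one token per line. The function name and the returned dictionary keys are hypothetical, and the service's own implementation and its graphical display are not reproduced here.

```python
from collections import Counter


def lexicometrics(corpus_text):
    """Compute basic lexicometric measures for a one-token-per-line corpus.

    Hypothetical helper for illustration only; not the service's code.
    """
    # One token per line; blank lines are ignored.
    tokens = [line.strip() for line in corpus_text.splitlines() if line.strip()]
    counts = Counter(tokens)

    n_tokens = len(tokens)   # running words
    n_types = len(counts)    # distinct word forms
    # Hapaxes are types that occur exactly once in the corpus.
    n_hapaxes = sum(1 for freq in counts.values() if freq == 1)
    ratio = n_types / n_tokens if n_tokens else 0.0

    return {
        "tokens": n_tokens,
        "types": n_types,
        "hapaxes": n_hapaxes,
        "type_token_ratio": ratio,
    }


sample = "the\ncat\nsat\non\nthe\nmat\n"
print(lexicometrics(sample))
```

    The sketch treats tokens as case-sensitive strings; whether the service normalizes case before counting is not stated in the entry.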

Stemming Lemmatization

  • IULA Tree Tagger Web Service

    This WS is a morphosyntactic tagger. The disambiguation process is done by a TreeTagger instance trained by the IULA. The input is plain text in Catalan or Spanish. The output allows optional formats and optional encoding. (http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/)

  • IULA GrAF Tagger Web Service

    This WS converts the output of the IULA tagger (a PoS tagger) to GrAF format.

  • IULA Preprocess Web Service

    This WS provides a text segmentation into minor structural units (titles, paragraphs, sentences, etc.); detection of entities (not found in a dictionary: numbers, abbreviations, URLs, emails, etc.); and the keeping of sequences of two or more words in a single block (dates, phrases, etc... more

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

Syntactic Tagging

  • Trl Malt Parser Module For Spanish

    The file espmalt-1.0.mco contains a single malt configuration for parsing Spanish text with MaltParser. The parser presupposes that the input is in CoNLL-X format and tagged with the part-of-speech tags of FreeLing tagger.

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

  • FreeLing Chunker Parser Web Service V.2.1

    FreeLing-based chunker parser. The languages supported are English, Catalan, Spanish, Asturian and Galician. WARNING: This WS has a new version.

  • FreeLing Chunker Parser Web Service V.3

    This WS performs a FreeLing-based chunker parser (v 3.0). The WS requires a plain text input. The possible output formats are FreeLing, XML, and XML CQP ready. The languages supported are English, Catalan, Spanish, Asturian and Galician.

  • FreeLing Dependency Parser Web Service V.3

    This WS deploys a FreeLing-based dependency parser (v 3.0). The WS requires a plain text input. The possible output formats are FreeLing, XML, and XML CQP ready. The languages supported are English, Catalan, Spanish, Asturian and Galician.

  • Ixati

    Chunking for Basque. There is a web service. Zatiak performs shallow syntactic analysis of a sentence. This program reads an input text and, after morphological processing, identifies pieces of text (chunks). Each chunk is marked with its type: nominal phrase (NP or PP) or verb chain, t... more

  • FreeLing Dependency Parser Web Service V.2.1

    FreeLing-based dependency parser. The languages supported are English, Catalan, Spanish, Asturian and Galician. WARNING: This WS has a new version.

Tokenization

  • FreeLing Tokenizer Web Service V.2.1

    This WS deploys a FreeLing-based text tokenizer. The WS splits a file in plain text format and UTF-8 encoded into units (tokens). The languages supported are Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, Welsh, and Asturian. WARNING: This WS has a new version.

  • IULA GrAF Tagger Web Service

    This WS converts the output of the IULA tagger (a PoS tagger) to GrAF format.

  • IULA Tokenizer Web Service

    The IULA tokenizer WS splits a file in plain text format and UTF-8 encoded into units (tokens). The languages supported are Catalan and Spanish.

  • FreeLing Tokenizer Web Service V.3

    This WS deploys a FreeLing-based text tokenizer (v 3.0). The WS splits a file in plain text format and UTF-8 encoded into units (tokens) where tokens are separated by new lines. The languages supported are Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, Welsh, and As... more

  • Ixa Pipes

    A modular set of Natural Language Processing tools for English and Spanish. IXA pipes is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology for English and Spanish. It offers robust and efficient linguistic annotation to both resear... more

Text Similarity

  • Ted Pedersen's Text Similarity Web Service

    This WS is based on Ted Pedersen's Text Similarity module. It measures the similarity of two documents based on the number of shared words scaled by the lengths of the files. Text Similarity WS computes the F-Measure, the Dice Coefficient, the Cosine, and the Lesk measure. Language inde... more
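
    Two of the measures named in the entry above, the Dice Coefficient and the Cosine, can be sketched as follows. This is an illustration under stated assumptions and not the service's or the underlying module's implementation: Dice is computed here over sets of distinct words and cosine over word-count vectors, tokenization is plain whitespace splitting, and both function names are hypothetical. The service may define overlap and length differently.

```python
import math
from collections import Counter


def dice(tokens_a, tokens_b):
    """Dice coefficient over the sets of distinct words (assumed definition)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 0.0
    # Shared words, scaled by the combined vocabulary sizes.
    return 2 * len(set_a & set_b) / (len(set_a) + len(set_b))


def cosine(tokens_a, tokens_b):
    """Cosine of the angle between the two word-count vectors."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc1 = "the cat sat on the mat".split()
doc2 = "the cat lay on the rug".split()
print(dice(doc1, doc2), cosine(doc1, doc2))
```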

Data Anonymization

  • Tmx Shuffling Web Service

    This WS randomizes the order of the translation units in TMX files. The goal is to make it difficult to reproduce the original text. The input size limit is 100 MB. Language independent WS.

  • Anonymizer Web Service

    This WS substitutes proper nouns with tags. This process anonymizes an input text by eliminating any person, place, corporation, etc. name. The service automatically calls the FreeLing WS and makes use of its Named Entity Recognition tool to detect proper nouns. The languages supported ... more

  • Linescrambler Parallel Web Service

    This WS will scramble the lines in a parallel text corpus keeping the alignment. The goal is to make it difficult to reproduce the original text. The input size limit is 100 MB. Language independent WS.

  • Linescrambler Web Service

    This WS scrambles the lines in a file. The goal is to make it difficult to reproduce the original text. The input size limit is 100 MB. Language independent WS.
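
    The idea behind the Linescrambler Parallel Web Service entry above, scrambling a parallel corpus while keeping the alignment, can be sketched by shuffling line pairs rather than each side independently. The function name and the optional `seed` parameter are hypothetical and exist only to make the illustration reproducible; the service's actual interface and randomization method are not documented in this listing.

```python
import random


def scramble_parallel(source_lines, target_lines, seed=None):
    """Shuffle two aligned lists of lines with the same permutation.

    Hypothetical helper for illustration only; not the service's code.
    """
    if len(source_lines) != len(target_lines):
        raise ValueError("parallel texts must have the same number of lines")

    # Pair each source line with its translation so they move together.
    pairs = list(zip(source_lines, target_lines))
    random.Random(seed).shuffle(pairs)

    scrambled_source = [src for src, _ in pairs]
    scrambled_target = [tgt for _, tgt in pairs]
    return scrambled_source, scrambled_target


source = ["uno", "dos", "tres", "cuatro"]
target = ["one", "two", "three", "four"]
new_source, new_target = scramble_parallel(source, target, seed=7)
for src, tgt in zip(new_source, new_target):
    print(src, "\t", tgt)
```

    Shuffling the pairs changes the line order on both sides while every source line stays next to its original counterpart.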

Lexicon Look Up

  • IULA Paradigma Web Service

    Given a verb (infinitive or a verbal form), this WS outputs its verbal paradigm grouped according to tense and mood. The languages supported are Catalan and Spanish.

  • IULA Lexicon Look Up Web Service

    Given a word form, this WS returns the lexical information by looking it up in the IULA's lexicon. The languages supported are Catalan, Spanish and English.

  • WordTies: A Nordic/Baltic Multilingual Wordnet Initiative

    WordTies describes a multilingual wordnet initiative launched within the META-NORD/META-NET projects and originally concerned with the validation and pilot linking between Nordic and Baltic wordnets. Wordnets in Nordic/Baltic countries. The builders of these wordnets have applied very d... more

  • Keeleveeb Query

    Keeleveeb is a portal, where one can run queries on several dictionaries and corpora. There are 12 Estonian monolingual dictionaries, 12 bilingual dictionaries (one of them Estonian), 19 Specialty dictionaries, 15 Learner dictionaries (bilingual, Estonian-Russian-Estonian), 23 corpora, ... more

  • Mimore

    It is a web application that enables simultaneous search in three micro-comparative databases on Dutch dialects via a common interface. This makes it possible to investigate potential correlations between variables at the three different linguistic levels. Cartographic functionality ena... more

  • Diccionario Básico Escolar

    Students' basic dictionary (Cuba). The GUI of the Diccionario Básico Escolar allows, besides common dictionary lookup, detecting the most common misspellings, consulting verb conjugation, syllabification of the headwords and, in some cases, viewing illustrations attached to the entries.... more

Machine Translation

  • Matxin

    Machine translation from Spanish to Basque. Matxin is a Transfer-based MT system from Spanish into Basque. It is an open, reusable and interoperable framework which can be improved in the near future by combining it with the statistical model. The MT architecture reuses several open tools ... more

  • Eusmt

    Statistical Machine Translation from Spanish to Basque. Use of segmentation and reordering in Statistical Machine Translation from Spanish to Basque. It allows our system to achieve a relative improvement of 10% in the HTER metric.

Question Answering

  • Ihardetsi

    A Question-Answering system for the area of Science and Technology. Ihardetsi is a question answering system for Basque. It is a general platform whose architecture pays special attention to: 1) the integration of the development and evaluation environments, and 2) the systematic use of... more

Spelling Checker

  • Xuxen

    Spelling corrector on-line. Xuxen is a spelling corrector for Basque integrated in MS-Office, OpenOffice, Firefox, OCR programs and others. It can be downloaded from the Basque Government's website (> 25,000 downloads). Eleka is the company which manages it now. The fact that Basque is a ... more

Dependency Parsing

  • Bohnet Parser Web Service

    This WS performs dependency parsing using Bohnet's graph-based Parser. The input is text in plain text or CoNLL format. The languages supported are English and Spanish.

  • Malt Parser Web Service

    This WS calls an instance of MaltParser for Spanish trained with the IULA treebank developed in the Metanet4you project. The input of this WS is plain text. The service performs PoS tagging with FreeLing and then performs the dependency parsing using Malt parser. The output follows CoNL... more

  • FreeLing Dependency Parser Web Service V.3

    This WS deploys a FreeLing-based dependency parser (v 3.0). The WS requires a plain text input. The possible output formats are FreeLing, XML, and XML CQP ready. The languages supported are English, Catalan, Spanish, Asturian and Galician.

  • FreeLing Dependency Parser Web Service V.2.1

    FreeLing-based dependency parser. The languages supported are English, Catalan, Spanish, Asturian and Galician. WARNING: This WS has a new version.

  • Maltixa

    Statistics-based dependency parser. Given a set of sentences in Basque, one sentence per line, it produces a dependency analysis of the sentences in a format equivalent to the CoNLL format (although not identical, as the columns appear in a different order).

  • Textual Emigration Analysis

    Historians, literary scientists, and others are interested in the semantic interpretation of text. With automatic pre-processing of texts, e.g. named entity recognition, coreference resolution, and dependency parsing, relevant semantic relations can be extracted. The Stuttgart tools ext... more

Text Handling

  • Linescrambler Web Service

    This WS scrambles the lines in a file. The goal is to make it difficult to reproduce the original text. The input size limit is 100 MB. Language independent WS.

  • FreeLing Sentence Splitter Web Service V.3

    This WS performs a FreeLing-based sentence splitter (v 3.0). The WS splits a file in plain text format and UTF-8 encoded into units (tokens) separated by new lines. Output sentences are separated by empty lines. The languages supported are English, Catalan, Spanish, Asturian, Welsh, Ga... more

  • Columns Selector Web Service

    This WS extracts a column from a tabular input file. It is useful for working with CoNLL or FreeLing annotated corpora. Language independent WS.

  • IULA Character Encoding Converter Web Service

    Convert character encoding of given files from one encoding to another. Based on the Linux 'iconv' command that converts text from one encoding to another encoding.

  • FreeLing Tokenizer Web Service V.3

    This WS deploys a FreeLing-based text tokenizer (v 3.0). The WS splits a file in plain text format and UTF-8 encoded into units (tokens) where tokens are separated by new lines. The languages supported are Catalan, English, Galician, Italian, Portuguese, Russian, Spanish, Welsh, and As... more

  • Stream Editor Web Service (Sed)

    This WS performs basic text transformations on an input text. The service is based on the 'sed' program, a Unix utility that parses and transforms text using a simple, compact programming language.

  • Xslt Applicator Web Service

    A command line tool for applying XSLT stylesheets to XML documents.

  • Anonymizer Web Service

    This WS substitutes proper nouns with tags. This process anonymizes an input text by eliminating any person, place, corporation, etc. name. The service automatically calls the FreeLing WS and makes use of its Named Entity Recognition tool to detect proper nouns. The languages supported ... more

  • File Splitter Web Service

    This WS splits an input file into smaller files containing the number of lines indicated as an input parameter. The split files are stored in the public results directory, and the output is a file with the list of URLs pointing to each split file. Language independent WS.
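
    The splitting step described in the File Splitter Web Service entry above can be sketched as grouping lines into consecutive chunks of a fixed size. This is an in-memory illustration only: the function name and parameter name are hypothetical, and the storage of the resulting files and the list of URLs the service returns are not modelled.

```python
def split_lines(lines, lines_per_file):
    """Group lines into consecutive chunks of at most lines_per_file lines.

    Hypothetical helper for illustration only; not the service's code.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")
    # The last chunk may be shorter when the total is not a multiple.
    return [
        lines[start:start + lines_per_file]
        for start in range(0, len(lines), lines_per_file)
    ]


text = "line 1\nline 2\nline 3\nline 4\nline 5\n"
chunks = split_lines(text.splitlines(), 2)
for index, chunk in enumerate(chunks, start=1):
    print(f"part {index}: {chunk}")
```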

Geovisualization

  • Textual Emigration Analysis

    Historians, literary scientists, and others are interested in the semantic interpretation of text. With automatic pre-processing of texts, e.g. named entity recognition, coreference resolution, and dependency parsing, relevant semantic relations can be extracted. The Stuttgart tools ext... more

  • Mimore

    It is a web application that enables simultaneous search in three micro-comparative databases on Dutch dialects via a common interface. This makes it possible to investigate potential correlations between variables at the three different linguistic levels. Cartographic functionality ena... more

  • Migmap

    Migmap is a web application where the user first chooses generation (forward or backward in time) and gender, while the migration map of The Netherlands related to an interactively pointed municipality (or other aggregation unit) is shown. The existing map-making software module "Kaart"... more
