Tools

The following list includes the tools developed specifically for Italian NLP by researchers working in this area. Click on the name of a tool to see basic information about it and a link to its web site, if one exists.
The Links section lists tools that have been applied to Italian, even if they were not specifically developed for it.

Information about the systems and resources that participated in the Evalita 2011 contest will be made available shortly after the workshop (Rome, January 24-25, 2012).

Suggestions and proposals about tools, whether listed or not, are welcome and can be sent using the Suggestions form.

Tokenization
  • Regexp_tokenizer
    Name Regexp_tokenizer
    Author(s) Marco Baroni
    Description A tokenizer that splits a text into tokens on the basis of a set of regular expressions specified by the user in a parameter file. In this way, the tokenizer can be customized for different languages and/or tokenization purposes.
    Licence and download free
    Link http://sslmit.unibo.it/~baroni/regexp_tokenizer.html
    Contact marco.baroni[at]unitn.it
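    Example A minimal sketch of regular-expression tokenization in Python, written for illustration only. It is not the tool's code: the rule list below is hypothetical, and in the real tool the expressions come from a user-supplied parameter file whose format is not described here.

```python
import re

# Hypothetical rule set, tried in order at each position of the text.
TOKEN_RULES = [
    r"[A-Za-zÀ-ÿ]+'",      # elided forms such as "l'" or "dell'"
    r"\d+(?:[.,]\d+)*",    # numbers, including 1.000,50
    r"[A-Za-zÀ-ÿ]+",       # plain words
    r"[^\sA-Za-zÀ-ÿ\d]",   # any other single non-space symbol
]

def tokenize(text, rules=TOKEN_RULES):
    """Split text into tokens using an ordered list of regular expressions."""
    pattern = re.compile("|".join(f"(?:{rule})" for rule in rules))
    return pattern.findall(text)

tokens = tokenize("L'analisi costa 1.000,50 euro.")
print(tokens)
```

    Changing the rule list is enough to adapt the sketch to another language or tokenization purpose, which is the idea behind the parameter file.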
Morphologic analysis/Pos-Tagging
  • CORISTagger
    Name CORISTagger
    Author(s) Fabio Tamburini
    Description CORISTagger is a high-performance PoS tagger for Italian. The system is composed of a Hidden Markov Model tagger followed by a Transformation-Based tagger.
    Licence and download -
    Link -
    Contact fabio.tamburini[at]unibo.it
  • C4
    Name C4
    Author(s) Simone Romagnoli
    Description C4 is a portable statistical part of speech tagger based on a second order Markov model technique, implemented in C++ using standard template libraries.
    Licence and download -
    Link -
    Contact simone.romagnoli3[at]unibo.it
  • Felice-POS-Tagger
    Name Felice-POS-Tagger
    Author(s) Felice Dell'Orletta
    Description The Felice-POS-Tagger is a combination of six component taggers built from three different algorithms, each of which is used to construct a left-to-right tagger and a right-to-left tagger. The algorithms are TnT and others based on the ILC-UniPi MaxEnt PoS tagger, used with different learning approaches to build the ensemble system.
    Licence and download -
    Link -
    Contact felice.dellorletta[at]ilc.cnr.it
  • ILC-UniPi MaxEnt PoS Tagger
    Name ILC-UniPi MaxEnt PoS Tagger
    Author(s) Felice Dell'Orletta, Maria Federico, Simonetta Montemagni, Vito Pirrelli
    Description The ILC-UniPi MaxEnt PoS Tagger is a combination of two Maximum Entropy PoS taggers, operating on the output of MAGIC, an Italian rule-based morphological parser equipped with a general-purpose lexicon of about 100,000 entries.
    Licence and download -
    Link -
    Contact felice.dellorletta[at]ilc.cnr.it, maria.federico[at]ilc.cnr.it, simonetta.montemagni[at]ilc.cnr.it, vito.pirrelli[at]ilc.cnr.it, alessandro.lenci[at]ilc.cnr.it
  • TagPro
    Name TagPro
    Author(s) Emanuele Pianta, Roberto Zanoli
    Description TagPro is a system for PoS tagging based on Support Vector Machines. It exploits a rich set of features, including morphological analysis. It scored as the best system in the Italian PoS tagging task at EVALITA 2007.
    Licence and download -
    Link -
    Contact pianta[at]fbk.eu, zanoli[at]fbk.eu
  • UniPiSynthema POS tagger
    Name UniPiSynthema POS tagger
    Author(s) Carlo Aliprandi, Carmignani Nicola, Deha Nedjma, Mancarella Paolo Maria, Rubino Michele
    Description The basic assumption of the UniPiSynthema POS tagger is that contextual information affects the environment in which a word has to be tagged. To tag a word with its most likely PoS, a high-order representation of the context is needed. This assumption is implemented with stochastic methods based on a second-order Markov model.
    Licence and download -
    Link -
    Contact aliprandi[at]synthema.it, nicola[at]di.unipi.it, deha[at]di.unipi.it, paolo[at]di.unipi.it, rubino[at]di.unipi.it
  • UniToPOStagger
    Name UniTo POS tagger
    Author(s) Leonardo Lesmo
    Description This rule-based PoS tagger is developed by the NLP Group of the Dipartimento di Informatica of the University of Torino, and it is part of the TULE framework. It takes as input the result of the morphological analysis of a sentence, which may include multiple entries for each word when an ambiguity is present. The output of the tagger is a sequence of single entries, each of which is associated with an input word.
    Licence and download free download
    Link http://www.tule.di.unito.it/
    Contact lesmo[at]di.unito.it
  • VEST
    Name VEnice Symbolic Tagger (VEST)
    Author(s) Rodolfo Delmonte
    Description VEST is a symbolic rule-based tagger that uses little quantitative or statistical information. Most of its computational work relies on tagged lexical information from datasets made available by previous work in the field. The system also uses a morphological analyzer, which is activated only for derivational nouns, cliticized verbs and some adjectives. It also acts as a guesser for unknown and out-of-vocabulary words; if it fails, they receive a default classification: uppercase words are labeled proper nouns, lowercase words common nouns.
    Licence and download -
    Link -
    Contact delmont[at]unive.it
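    Example A minimal sketch of the default classification described for VEST, for illustration only. The tag names are placeholders, and treating "uppercase" as "starts with a capital letter" is an assumption; the actual tagset and test used by the system are not given here.

```python
def default_tag(word):
    """Fallback tag for a word that the guesser could not analyse.

    Assumption: "uppercase" means the first character is a capital letter.
    PROPER_NOUN and COMMON_NOUN are placeholder tag names.
    """
    return "PROPER_NOUN" if word[:1].isupper() else "COMMON_NOUN"

print(default_tag("Venezia"), default_tag("gondola"))
```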
Parsing (Syntactic analysis)
  • DeSR
    Name Dependency Shift-Reduce (DeSR)
    Author(s) Giuseppe Attardi
    Description DeSR is a statistical dependency parser for natural language sentences. DeSR is part of the TANL framework, which provides the tools required to analyze sentences completely, starting from raw text. It has been used for Italian and for other languages. DeSR (ex aequo with TULE) scored as the best system in the Italian dependency parsing task at EVALITA 2009.
    Licence and download free software that can be redistributed and/or modified under GNU General Public License v. 3
    Link http://sites.google.com/site/desrparser/
    Contact attardi[at]di.unipi.it
  • TUP
    Name Turin University Parser (TUP)
    Author(s) Leonardo Lesmo
    Description TUP is a rule-based dependency parser which is part of the TULE framework. It currently supports Italian and English. Extensions to English, Spanish, Catalan, French and Hindi are under development.
    Licence and download free download
    Link http://www.tule.di.unito.it/
    Contact lesmo[at]di.unito.it
Parsing environment (including tokenizer, PoS tagger and parser)
  • CHAOS
    Name CHAOS
    Author(s) Roberto Basili, Maria Teresa Pazienza, Fabio Massimo Zanzotto
    Description A robust syntactic parser for Italian and for English. The system implements a modular and lexicalised approach to the syntactic parsing problem. It is based on the notion of eXtended Dependency Graph (XDG) that has been seen as a useful representation mechanism in a shallow parsing approach. The system offers a collection of modules for designing parsing architectures.
    Licence and download free download for research purpose, but protected (send e-mail to the contact to obtain the account for the protected area)
    Link http://ai-nlp.info.uniroma2.it/external/chaosproject/
    Contact chaos[at]info.uniroma2.it
  • GraFo
    Name GraFo
    Author(s) Emanuele Pianta
    Description GraFo is a left corner parser for Italian, based on explicit rules manually coded in a unification formalism. As the linguistic coverage of GraFo is still quite limited, the parser produces complete parse trees for a small percentage of sentences.
    Licence and download -
    Link -
    Contact pianta[at]fbk.eu
  • TANL
    Name Tanl Italian Parser (TANL)
    Author(s) Giuseppe Attardi
    Description The Tanl Italian Parser is a Web service for parsing Italian texts and producing parse trees according to the Tanl Dependency Notation. The service uses the DeSR dependency parser and other linguistic tools from the Tanl Suite. The input is plain text, the output is in CoNLL X format.
    Licence and download free download
    Link http://paleo.di.unipi.it/parse
    Contact attardi[at]di.unipi.it
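    Example A minimal sketch of reading parser output in CoNLL-X format, for illustration only. The ten tab-separated columns are those of the CoNLL-X shared task; the sample sentence and its tags are made up and do not necessarily follow the Tanl tagset or dependency labels.

```python
# Column names of the CoNLL-X format, in order.
CONLLX_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]

def read_conllx(text):
    """Return a list of sentences; each sentence is a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():            # a blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        token = dict(zip(CONLLX_FIELDS, line.split("\t")))
        token["id"] = int(token["id"])
        token["head"] = int(token["head"])   # 0 marks the root
        current.append(token)
    if current:
        sentences.append(current)
    return sentences

# Made-up sample with illustrative tags and labels.
SAMPLE = (
    "1\tIl\til\tR\tRD\t_\t2\tdet\t_\t_\n"
    "2\tgatto\tgatto\tS\tS\t_\t3\tsubj\t_\t_\n"
    "3\tdorme\tdormire\tV\tV\t_\t0\tROOT\t_\t_\n"
)

sentence = read_conllx(SAMPLE)[0]
root = next(tok for tok in sentence if tok["head"] == 0)
print(root["form"])
```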
  • TULE
    Name Turin University Linguistic Environment (TULE)
    Author(s) Leonardo Lesmo
    Description TULE is the environment that integrates both the PoS tagger of the University of Torino and the dependency parser TUP. The input of TULE is plain text and the output is in TUT format, since TULE has been developed in parallel with the Turin University Treebank (TUT) and shares its format. TULE scored as the best system in the Italian dependency parsing task at EVALITA 2007 and 2009 (ex aequo with DeSR).
    Licence and download free download
    Link http://www.tule.di.unito.it/
    Contact lesmo[at]di.unito.it
Word Sense Disambiguation
  • JIGSAW
    Name JIGSAW
    Author(s) Pierpaolo Basile, Giovanni Semeraro
    Description JIGSAW is a knowledge-based WSD system that attempts to disambiguate all words in a text by exploiting an external lexical knowledge base. The main assumption is that a specific strategy for each part of speech (POS) works better than a single strategy.
    Licence and download -
    Link -
    Contact basilepp[at]di.uniba.it, semeraro[at]di.uniba.it
Information Retrieval (search engines, voice search, document classification, text categorization)
Information Extraction and text mining
Named Entity Recognition
  • Bidirectional Sequence Classification for NER
    Name Bidirectional Sequence Classification for NER
    Author(s) Andrea Gesmundo
    Description The Bidirectional Sequence Classification is a system for Named Entity Recognition, based on the Perceptron algorithm. In the proposed framework, the order of the inference is not forced into a monotonic behavior (left-to-right), but is learned together with the parameters of the local classifier. It applies a semi-supervised training approach, which extends the Guided Learning framework.
    Licence and download -
    Link -
    Contact andrea.gesmundo[at]unige.ch
  • EntityPro
    Name EntityPro
    Author(s) Emanuele Pianta, Roberto Zanoli
    Description EntityPro is a system for NER based on Support Vector Machines, which is part of TextPro, a suite of modular NLP tools developed at FBK. It was trained with a large number of both static and dynamic features.
    Licence and download free for research purposes from the following link
    Link http://textpro.fbk.eu/demo.php
    Contact pianta[at]fbk.eu, zanoli[at]fbk.eu, manspera[at]fbk.eu
  • Typhoon
    Name Typhoon
    Author(s) Silvana Marianela Bernaola Biggio, Roberto Zanoli, Claudio Giuliano
    Description Typhoon is a classifier combination system for NER, in which two different classifiers are combined to exploit Data Redundancy and Patterns extracted from a large text corpus. The system consists of two classifiers in cascade, but it is possible to use a single classifier making the system faster; whereas the second classifier in the cascade can be used when more accuracy is needed.
    Licence and download -
    Link http://textpro.fbk.eu/typhoon.html
    Contact manspera[at]fbk.eu, zanoli[at]fbk.eu, pianta[at]fbk.eu, giulianog[at]fbk.eu
  • Tanl Named Entity Recognizer
    Name Tanl Named Entity Recognizer
    Author(s) Giuseppe Attardi, Stefano Dei Rossi, Felice Dell'Orletta, Eva Maria Vecchi
    Description The Tanl tagger is a generic, customizable text chunker, which can be applied to tasks such as PoS tagging, Super Sense tagging and Named Entity recognition. The chunker uses a Maximum Entropy classifier for learning how to chunk texts. Maximum Entropy is a more efficient technique than SVM, and by complementing it with dynamic programming it can achieve similar levels of accuracy.
    Licence and download free download
    Link http://medialab.di.unipi.it/wiki/NE_tagger
    Contact attardi[at]di.unipi.it, deirossi[at]di.unipi.it, felice.dellorletta[at]ilc.cnr.it, evamaria.vecchi[at]ilc.cnr.it
Local Entity Detection and Recognition
  • FBK-UNiTRN LER system
    Name Fondazione Bruno Kessler and University of Trento Local Entity Recognition system
    Author(s) Silvana Marianela Bernaola Biggio, Claudio Giuliano, Massimo Poesio, Yannick Versley, Olga Uryupina, Roberto Zanoli
    Description This system detects and recognizes local entities in Italian text. It is divided into two modules: the Entity Mention Detection (EMD) module, which detects all mentions of persons, organizations, geo-political entities and locations; and the Coreference Resolution module, which recognizes which mentions refer to the same entity. Here an entity is an object or group of objects in the world, and a mention is a textual reference to an entity.
    Licence and download -
    Link -
    Contact bernaola[at]fbk.eu, giuliano[at]fbk.eu, massimo.poesio[at]unitn.it, yversley[at]gmail.com, uryupina[at]gmail.com, zanoli[at]fbk.eu
Temporal Expression Recognition and Normalization
  • ITA-Chronos
    Name ITA-Chronos
    Author(s) Matteo Negri
    Description ITA-Chronos is designed to recognize all the Timed Entities occurring in a text, identify their extension, and normalize them according to the TIMEX2 standard. It adopts a rule-based approach, with different sets of hand-crafted rules specialized to deal with different aspects of the problem. It is the Italian extension of Chronos, a multilingual system written in Lisp, originally developed for English.
    Licence and download -
    Link http://ontotext.fbk.eu/ita-chronos.html
    Contact negri[at]fbk.eu
  • UniPg-TERNsystem
    Name UniPg-TERNsystem
    Author(s) Loris Faina, Stefania Spina
    Description This system is an Italian parser that combines two separate levels of parsing: constituent parsing, which entails a category annotation of morphosyntactic constituents, and dependency parsing, which implies a functional annotation of relations such as subject, complement, etc. It has been used as a Temporal Expression Recognizer in the Evalita contest.
    Licence and download -
    Link -
    Contact faina[at]unipg.it, sspina[at]unistrapg.it
Semantic analysis
  • GETARUNS
    Name GETARUNS
    Author(s) Rodolfo Delmonte
    Description GETARUNS is a system for semantic analysis which includes a "deep parser" provided with subcategorization information coming from different sources. At the end of the pipeline it produces a "discourse model" including the discourse entities with their properties and features. The system is composed of the following modules:
    • tokenizer
    • sentence splitter
    • tagger and disambiguator
    • chunker and shallow parser
    • deep parser (strictly topdown) for annotated c-structure
    • semantic lexical mapping for f-structure
    • pronominal binding
    • anaphora resolution
    • topic hierarchy and centering
    • semantic information processing at propositional level
    • logical form
    • temporal reasoning
    • semantic indexing of individuals, sets, locations and events
    • discourse model creation and updating
    • discourse structure with semantic relations and discourse moves
    Licence and download free download - version Linux Ubuntu 9; for the updated versions send an email to the contact person
    Link http://project.cgm.unive.it/?page_id=194
    Contact delmont[at]unive.it
  • VENSES
    Name VENSES
    Author(s) Rodolfo Delmonte
    Description VENSES is the scaled version of GETARUNS for creating a semantic analysis system which can include a "partial parser" equipped with subcategorization information coming from different sources. The output of VENSES is a logical form without free variables, in which all pronouns are replaced by their antecedents. The system is composed of the following modules:
    • tokenizer
    • sentence splitter
    • tagger and disambiguator
    • chunker and shallow parser
    • semantic lexical mapping for f-structure
    • pronominal binding
    • anaphora resolution
    • topic hierarchy and centering
    • semantic information processing at propositional level
    • logical form
    • semantic indexing of individuals, sets, locations and events
    Licence and download demo version available on the web site; for a version of the system send an email to the contact person
    Link http://project.cgm.unive.it/?page_id=196
    Contact delmont[at]unive.it
Question Answering
  • QALLME
    Name QALL-ME (Question Answering Learning technologies in a multiLingual and Multimodal Environment)
    Author(s) Bernardo Magnini (FBK - Trento, Italy) is the coordinator of the academic and industrial partners
    Description QALLME is an EU funded project in the IST area. The general objective is to establish a shared infrastructure for multilingual and multimodal open domain Question Answering for mobile phones.
    Licence and download -
    Link http://qallme.fbk.eu/
    Contact perenthaler[at]fbk.eu
Summarization
Textual entailment
  • EDITS
    Name EDITS (Edit Distance Textual Entailment Suite)
    Author(s) Milen Kouylekov, Matteo Negri
    Description EDITS is an open source software package aimed at recognizing entailment relations between two portions of text. The system is based on edit distance algorithms, and computes the T-H distance as the cost of the edit operations (i.e. insertion, deletion and substitution) that are necessary to transform T into H.
    Licence and download GNU Lesser General Public License
    Link https://docs.google.com/Doc?docid=0AV0eoH72QlJeZGNjajdyNHdfMGM1NXo1YzVn&hl=en
    Contact kouylekov[at]gmail.com, negri[at]fbk.eu
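    Example A minimal sketch of the T-H distance as a token-level edit distance, for illustration only. It is not the EDITS code: the unit costs and the whitespace tokenization are assumptions, and the function name is hypothetical.

```python
def edit_distance(t_tokens, h_tokens, ins_cost=1, del_cost=1, sub_cost=1):
    """Cost of the cheapest insertions, deletions and substitutions
    that transform the token list T into the token list H."""
    # prev[j] holds the cost of turning the T prefix seen so far into H[:j].
    prev = [j * ins_cost for j in range(len(h_tokens) + 1)]
    for i, t in enumerate(t_tokens, 1):
        curr = [i * del_cost]
        for j, h in enumerate(h_tokens, 1):
            substitute = prev[j - 1] + (0 if t == h else sub_cost)
            delete = prev[j] + del_cost
            insert = curr[j - 1] + ins_cost
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]

text = "Il gatto nero dorme sul divano".split()
hypothesis = "Il gatto dorme".split()
distance = edit_distance(text, hypothesis)
print(distance)
```

    A low distance relative to the length of the two texts suggests that H can be obtained from T cheaply; how a distance is turned into an entailment decision is outside this sketch.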
  • UniAlicante Textual Entailment system
    Name UniAlicante Textual Entailment system
    Author(s) Oscar Ferrandez, Antonio Toral, Rafael Munoz
    Description The system uses a machine learning classifier fed by features derived from lexical distances, part-of-speech information and semantic knowledge from SIMPLE-CLIPS, an Italian Language Resource.
    Licence and download -
    Link -
    Contact ofe[at]dlsi.ua.es, rafaelg[at]dlsi.ua.es, antonio.toral[at]ilc.cnr.it
Topic Detection and Tracking
  • OntoTDT
    Name OntoTDT
    Author(s) FBK
    Description OntoTDT is an unsupervised Topic Detection system. A topic is defined as a seminal event or activity, along with all directly related events and activities. A topic is expressed as a chronologically ordered list of "stories". A story is "on topic" whenever it discusses events and activities that are directly connected to that topic's seminal event. The goal of a topic detection system is to group together stories that discuss the same event.
    Licence and download -
    Link http://ontotext.fbk.eu/topic.html
    Contact bentivo[at]fbk.eu
Speech recognition/understanding (including Speech-To-Text transcription)
  • PROSO
    Name PROSO
    Author(s) Rodolfo Delmonte
    Description PROSO is a rule-based system for translating an Italian text into the corresponding version labelled for a vocal synthesizer. It uses only phonological rules and a table of verbal roots.
    Licence and download free download
    Link http://project.cgm.unive.it/?page_id=204
    Contact delmont[at]unive.it
Speech synthesis (including Text-To-Speech synthesis)
(Spoken) Dialog
  • UNITN Italian Spoken Dialogue System
    Name UNITN Italian Spoken Dialogue System
    Author(s) Stefan Rigo, Evgeny A. Stepanov, Pierluigi Roberti, Silvia Quarteroni, Giuseppe Riccardi
    Description The main features supporting the UNITN SDS are the mixed initiative control, which allows the caller to get partly in control of the dialog strategy, and the descriptive specification of dialog strategies. The application is based on a complex, high-recall grammar and a user goal planning script. The latter is tightly bound to the grammar and provides functionalities of error checking and recovery from missing or misinterpreted concepts (Automatic Speech Recognition and Spoken Language Understanding errors).
    Licence and download -
    Link -
    Contact -
  • Loquendo Spoken Dialogue System
    Name Loquendo Spoken Dialogue System
    Author(s) Enrico Giraudo, Paolo Baggia
    Description The application was designed in VoiceXML and runs on the Loquendo VoxNauta platform. The dialogue strategy is "mixed-initiative", with flexible recognition grammars that were designed to be modular and easy to use in different dialogue application contexts. Changes of context and complex requests from the caller are allowed.
    Licence and download -
    Link http://www.loquendo.com/en/technology/voxnauta_platform.htm
    Contact enrico.giraudo[at]loquendo.com, paolo.baggia[at]loquendo.com
Speaker Recognition
  • QUT Speaker Identity Verification System
    Name Queensland University of Technology (QUT) Speaker Identity Verification System
    Author(s) Mitchell McLaren, Robbie Vogt, Brendan Baker, Sridha Sridharan
    Description The system includes the following three components developed by QUT for the evaluation in Evalita 2009: Joint Factor GMM-UBM, GMM Supervector SVM and GLDS SVM. The QUT system is the score-level fusion of these components. Fusion was performed on the output scores using linear weights calculated through use of a logistic regression algorithm. This was performed using the FoCal toolkit.
    Licence and download -
    Link -
    Contact m.mclaren[at]qut.edu.au, r.vogt[at]qut.edu.au, bj.baker[at]qut.edu.au, s.sridharan[at]qut.edu.au
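    Example A minimal sketch of score-level fusion with linear weights estimated by logistic regression, for illustration only. It does not use the FoCal toolkit and is not the QUT implementation: the toy scores, learning rate, number of epochs and function names are all assumptions.

```python
import math

def fuse(scores, weights, bias):
    """Fused score: a weighted sum of the component scores plus a bias."""
    return bias + sum(w * s for w, s in zip(weights, scores))

def train_fusion(trials, labels, lr=0.1, epochs=1000):
    """Fit the fusion weights by batch gradient descent on the logistic loss."""
    n = len(trials[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for scores, label in zip(trials, labels):
            p = 1.0 / (1.0 + math.exp(-fuse(scores, weights, bias)))
            err = p - label
            for k in range(n):
                grad_w[k] += err * scores[k]
            grad_b += err
        m = len(trials)
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

# Toy scores from three hypothetical components; label 1 = target speaker.
trials = [[2.0, 1.5, 1.0], [1.0, 2.0, 0.5],
          [-1.0, -0.5, -2.0], [-2.0, -1.0, -0.5]]
labels = [1, 1, 0, 0]

weights, bias = train_fusion(trials, labels)
fused = [fuse(scores, weights, bias) for scores in trials]
print(fused)
```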
  • UWB Speaker Identity Verification Systems
    Name University of West Bohemia (UWB) Speaker Identity Verification Systems
    Author(s) Lukas Machlica, Jan Vanek
    Description The two UWB systems were submitted to the EVALITA 2009 evaluation campaign. Both systems are based on the UBM-GMM approach.
    Licence and download -
    Link http://www.kky.zcu.cz/en
    Contact machlica[at]kky.zcu.cz, vanekyjg[at]kky.zcu.cz
  • AGNITIO's Speaker Recognition System
    Name AGNITIO's Speaker Recognition System
    Author(s) Niko Brummer, Albert Strasheim
    Description AGNITIO's system is a fusion of a state-of-the-art Joint Factor Analysis system and a new I-Vector system.
    Licence and download -
    Link -
    Contact -
  • RU Speaker Recognition Systems
    Name Radboud University Speaker Recognition Systems
    Author(s) Marijn Huijbregts, David van Leeuwen
    Description The first system is based on a Universal Background Model and Gaussian Mixture Model (UBM-GMM) and employs a linear scoring approach with channel. The second system is based on Joint Factor Analysis (JFA), also employing linear scoring.
    Licence and download -
    Link -
    Contact m.huijbregts[at]let.ru.nl, d.vanleeuwen[at]let.ru.nl
  • SMART III Speaker Recognition Systems
    Name SMART III Speaker Recognition Systems
    Author(s) Maria Tucci
    Description The SMART III System is a formant based method using an implemented decisional approach with a reference population of 305 male Italian speakers containing fundamental frequency and first three formant values for the vowels /a, e, i, o/.
    Licence and download -
    Link http://www.linguistica.unical.it/labfon/Home.htm
    Contact tucci.maria[at]libero.it
Machine Translation and Speech-To-Speech Translation
  • STILVEN-MOSES
    Name STILVEN-MOSES
    Author(s) Rodolfo Delmonte
    Description STILVEN-MOSES is a Venetian-English translator based on MOSES. It uses a recently updated parallel corpus of 300,000 tokens.
    Licence and download free access
    Link http://project.cgm.unive.it/cgi-bin/stilven/moses
    Contact delmont[at]unive.it
Natural Language Generation
  • FUF/SURGE
    Name FUF/SURGE
    Author(s) Charles Callaway, Alessandra Novello
    Description FUF/SURGE-Italian is a rule-based, wide coverage generator using a systemic grammar. Given a lexicalized semantic specification, it creates a grammatically correct sentence, adds morphology and orthographical information, and returns formatted text.
    Licence and download Free download, requires a LISP installation
    Link http://homepages.inf.ed.ac.uk/ccallawa/resources.html
    Contact callaway[at]fbk.eu
Emotion Recognition / Generation
Linguistic annotation
  • ANANAS
    Name AN.ANA.S. 4
    Author(s) Miriam Voghera, Francesco Cutugno, Annamaria Landolfi, Carmela Sammarco
    Description AN.ANA.S. is a syntactic annotation system based on a schema of grammatical rules (DTD) that defines the tree structure of the text. It can be used to label both spoken and written texts. It supports the syntactic labelling of all types of text and relies on the XGate software, which acts as an editor for creating a database of texts in XML format.
    Licence and download free download
    Link http://www.parlaritaliano.it/index.php/it/strumenti/717-ananas-4
    Contact people.na.infn.it/~cutugno
  • XGATE
    Name Xgate
    Author(s) Francesco Cutugno, Miriam Voghera
    Description Xgate is a tool for the annotation and query of linguistic data. It is developed by the NLP group of the Dipartimento di Scienze Fisiche of the Università Federico II of Napoli and the Dipartimento degli Studi Linguistici e Letterari of the Università of Salerno.
    Licence and download free download
    Link http://www.parlaritaliano.it/index.php/en/projects/666-xgate
    Contact people.na.infn.it/~cutugno
Language modeling
  • IRSTLM
    Name IRSTLM Toolkit
    Author(s) Marcello Federico, Nicola Bertoldi
    Description The IRST Language Modeling Toolkit features algorithms and data structures suitable for estimating, storing and accessing very large LMs. The software has been integrated into Moses, a popular open source Statistical Machine Translation decoder, and is compatible with language models created with other tools, such as the SRILM Toolkit.
    Licence and download Open Source LGPL
    Link http://hlt.fbk.eu/en/irstlm
    Contact Marcello Federico, Nicola Bertoldi
Lexical substitution
  • LexSub
    Name LexSub
    Author(s) Diego De Cao, Roberto Basili
    Description The LexSub experimental platform proceeds through three steps: 1) the extraction of the lexical substitution sets for the target words, 2) the acquisition of domain models for the candidates and 3) the ranking of candidate lexical substitutes over individual sentences according to the acquired domain models. A further back-off step 4) is included to deal with test cases for which step 1) produces an empty candidate set.
    Licence and download -
    Link -
    Contact decao[at]info.uniroma2.it, basili[at]info.uniroma2.it
Connected Digit Recognition
  • Cedat85 automatic speech recognition system
    Name Cedat85 automatic speech recognition system
    Author(s) Maria Palmerini
    Description The system was developed within a research project led in 2008 by Cedat 85 in cooperation with the European Media Laboratory in Heidelberg. It is based on the most recent IBM VoiceTailor technology; Cedat 85 provided the whole training process (acoustic data, text data, scripts) for spontaneous Italian. Features of the VoiceTailor system include speaker independence, handling of spontaneous speech, unlimited vocabularies, support for different acoustic and language models, noise handling, and parameters for trading accuracy against speed. The system works in a Linux environment and can run on multiple processors so that several jobs are processed in parallel.
    Licence and download -
    Link http://www.cedat85.com
    Contact m.palmerini[at]cedat85.com
  • Syllable-Based ASR System of Naples University
    Name Syllable-Based ASR System of Naples University
    Author(s) Francesco Cutugno, Bogdan Ludusan, Antonio Origlia, Serena Soldo
    Description The recognition system uses the syllable as base unit. In a first stage, the continuous speech sequence is divided in syllable-like units using an energy-based algorithm. Then, the obtained syllables are passed to a classifier in order to calculate the syllable/class probability distribution. In the final stage, a Viterbi-like decoding algorithm based on multistage graphs will find the most likely sequence corresponding to the audio input.
    Licence and download -
    Link -
    Contact cutugno[at]na.infn.it, ludusan[at]na.infn.it, soldo[at]na.infn.it
  • TSpeech
    Name TSpeech
    Author(s) Leandro D’Anna, Gianpaolo Coro, Francesco Cutugno
    Description TSpeech is a proprietary speech recognizer by Abla srl, based on standard decoding algorithms with syllabic acoustic models. The recognition phase is followed by a rescoring session, based on syllable energy and duration templates, which recovers some recognition errors.
    Licence and download -
    Link www.abla.it
    Contact ldanna[at]unisa.it, gianpaolo.coro[at]abla.it, cutugno[at]na.infn.it
Other
  • TEXTPRO
    Name TEXTPRO
    Author(s) Emanuele Pianta, Christian Girardi, Roberto Zanoli
    Description TextPro is a suite of modular Natural Language Processing (NLP) tools for analysis of Italian and English texts. The suite has been designed so as to integrate and reuse state of the art NLP components developed by researchers at FBK. The current version of the tool suite provides functions ranging from tokenization to chunking and Named Entity Recognition (NER).
    Licence and download free licence obtained by registration
    Link http://textpro.fbk.eu/
    Contact manspera[at]fbk.eu
  • Coreference Resolution Module
    Name TEXTPRO
    Author(s) Octavian Popescu, Bernardo Magnini
    Description The coreference system has been developed to decide whether two mentions refer to the same entity or not. The input of the system consists of a list of Named Entities of type Person, i.e. Person Names (PNs), that have been automatically recognized in a document collection. Its output consists of a number of clusters of PNs, where each cluster is interpreted as the set of PNs that refer to the same entity.
    Licence and download free licence obtained by registration
    Link http://textpro.fbk.eu/
    Contact bentivo[at]fbk.eu






Last updated January the 20th 2012, Contact: bosco[at]di.unito.it