Corpora list

B2W-Reviews01

Link https://github.com/b2wdigital/b2w-reviews01
Title B2W-Reviews01 - A Brazilian Portuguese reviews corpus
Presented by Real, L., Oshiro, M., Mafra, A.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2019

The CIEMPIESS Proper-Names Pronouncing Dictionary

Link https://mega.nz/#!NoZ3XY4Y!kROQmlt0tlhDZUngvauC7NnWi3HyDYD87jEUkyiRStE
Title The CIEMPIESS Proper-Names Pronouncing Dictionary
Presented by Hernández-Mena, C.
Language Mexican Spanish
Language code es-MX
Category resource
Status available
Type text
Year 2019

Corpus Reacción

Link https://github.com/lyr-uam/CorpusReaccion
Title Corpus Reacción: consumers engagement in Facebook posts
Presented by Rosas-Quezada, E., Ramírez-de-la-Rosa, G., Villatoro-Tello, E.
Language Mexican Spanish
Language code es-MX
Category resource
Status available
Type text
Year 2019

COPENOR

Link https://gitlab.com/manuel.wortens/copenor
Title Construcción del Corpus Periodístico del Noroeste de México (COPENOR)
Presented by Sánchez-Fernández, M., Medina-Urrea, A.
Language Mexican Spanish
Language code es-MX
Category resource
Status in development
Type text
Year 2019

AIRA

Link https://aira.iimas.unam.mx
Title AIRA: Acoustic Interactions for Robot Audition
Presented by Rascón, C., Velez, I.
Language Mexican Spanish
Language code es-MX
Category resource
Status available
Type speech
Year 2019

IESC-Child

Title IESC-Child: An Interactive Emotional Children’s Speech Corpus
Presented by Pérez-Espinosa, H., Martínez-Miranda, J., Espinosa-Curiel, I., Rodríguez-Jacobo, J., Villaseñor-Pineda, L., Avila-George, H.
Language Mexican Spanish
Language code es-MX
Category resource
Status available
Type speech
Year 2019

Obras Brasileiras

Link http://www.linguateca.pt/
Title OBras: a fully annotated and partially human-revised corpus of Brazilian literary works in the public domain
Presented by Santos, D., Freitas, C., Bick, E.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2018

Fake.Br

Link https://sites.google.com/icmc.usp.br/opinando/
Title The Fake.Br corpus – a corpus of fake news for Brazilian Portuguese
Presented by Santos, R., Monteiro, R., Pardo, T.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2018

The Bosque Corpus

Link https://github.com/UniversalDependencies/UD_Portuguese-Bosque/
Title Portuguese Universal Dependencies via Bosque
Presented by de Paiva, V., Freitas, C., Rademaker, A., Real, L., Chalub, F.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2018

Corpus of Southern Qichwa

Link https://siminchikkunarayku.pe/raw_audio.html
Title On the Building of the Large Scale Corpus of Southern Qichwa
Presented by Camacho, L., Zevallos, R., Melgarejo, N.
Language Southern Qichwa
Language code qu-PE
Category resource
Status available
Type speech
Year 2018

Corpora of Mexican Sign Language (MSL)

Link http://cienciadedatosupiita.com/content/bd_creation.sql
Title Multimedia Corpora of Mexican Sign Language (MSL) with Syntactic Functions
Presented by Pichardo-Lagunas, O., Martinez-Seis, B.
Language Mexican Sign Language
Language code sgn-MX
Category resource
Status available
Type multimedia
Year 2018

CorPop

Link http://www.ufrgs.br/textecc/porlexbras/corpop/index.php
Title CorPop: a corpus of popular Brazilian Portuguese
Presented by Pasqualini, B., Finatto, M.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2018

HWxPI

Link https://competitions.codalab.org/competitions/18362
Title HWxPI: A Multimodal Spanish Corpus for Personality Identification
Presented by Ramírez-de-la-Rosa, G., Villatoro-Tello, E., Jiménez-Salazar, H.
Language Mexican Spanish
Language code es-MX
Category resource
Status available
Type image
Year 2018

SICK-BR

Link https://github.com/livyreal/SICK-BR
Title A brief description of SICK-BR
Presented by Real, L., Rodrigues, A., Silva, A., Thalenberg, B., Guide, B., Silva, C., Câmara, I., Lima, G., Souza, R., Paiva, V.
Language Brazilian Portuguese
Language code pt-BR
Category resource
Status available
Type text
Year 2018

The Wixarika-Spanish Parallel Corpus

Link https://github.com/pywirrarika/wixarikacorpora
Title The Wixarika-Spanish Parallel Corpus
Presented by Mager, J., Carrillo, D., Meza, I.
Language Wixarika
Language code hch-MX (ISO 639-3)
Category resource
Status available
Type text
Year 2018