Program 2018

Canela, Brazil, as a part of PROPOR 2018

8:30 Welcome to OpenCor Forum
Livy Real and Ivan Vladimir Meza Ruiz
8:50 OBras: a fully annotated and partially human-revised corpus of Brazilian literary works in the public domain
Diana Santos, Claudia Freitas and Eckhard Bick
9:20 Portuguese Universal Dependencies via Bosque
Valeria de Paiva, Claudia Freitas, Alexandre Rademaker, Livy Real and Fabricio Chalub
9:50 The Fake.Br corpus - a corpus of fake news for Brazilian Portuguese
Roney L. De S. Santos, Rafael A. Monteiro and Thiago Pardo
10:10 A brief description of SICK-BR
Livy Real, Ana Rodrigues, Andressa Vieira E Silva, Bruna Thalenberg, Bruno Guide, Cindy Silva, Igor C. S. Câmara, Guilherme de Oliveira Lima, Rodrigo Souza and Valeria de Paiva
10:30-11:00 COFFEE BREAK
11:00 CorPop: a corpus of popular Brazilian Portuguese
Bianca Pasqualini and Maria José B. Finatto
11:30 Invited talk: Mexican Corpora and their applications
Alfonso Medina Urrea

Invited Talk: Mexican Corpora and their applications
Speaker: Alfonso Medina Urrea, El Colegio de México, Red Temática de Tecnologías del Lenguaje, Mexico

Abstract

Perhaps the oldest Latin American electronic corpus is the Corpus del español mexicano contemporáneo (CEMC), which was compiled in the 1970s for lexicographical purposes. Since then, numerous corpora have appeared in Mexico, both for a wide range of language technologies, such as speech recognition, speech synthesis, and text mining, and for the study of linguistic phenomena in their synchronic and diachronic dimensions. In this talk, several prominent text corpus projects will be described. We will briefly examine their objectives, target language, structure, tools and resources, the context in which they emerged, and their applications, among other features.