Program 2018 • Latin American and Iberian Languages Open Corpora Forum

8:30	Welcome to OpenCor Forum Livy Real and Ivan Vladimir Meza Ruiz
8:50	OBras: a fully annotated and partially human-revised corpus of Brazilian literary works in the public domain Diana Santos, Claudia Freitas and Eckhard Bick
9:20	Portuguese Universal Dependencies via Bosque Valeria de Paiva, Claudia Freitas, Alexandre Rademaker, Livy Real and Fabricio Chalub
9:50	The Fake.Br corpus - a corpus of fake news for Brazilian Portuguese Roney L. De S. Santos, Rafael A. Monteiro and Thiago Pardo
10:10	A brief description of SICK-BR Livy Real, Ana Rodrigues, Andressa Vieira E Silva, Bruna Thalenberg, Bruno Guide, Cindy Silva, Igor C. S. Câmara, Guilherme de Oliveira Lima, Rodrigo Souza and Valeria de Paiva
10:30-11:00	COFFEE BREAK
11:00	CorPop: a corpus of popular Brazilian Portuguese Bianca Pasqualini and Maria José B. Finatto
11:30	Invited talk: Mexican Corpora and their applications Alfonso Medina Urrea

Invited Talk: Mexican Corpora and their applications
Speaker: Alfonso Medina Urrea, El Colegio de México, Red Temática de Tecnologías del Lenguaje, Mexico

Abstract

Perhaps the oldest Latin American electronic corpus is the Corpus del español mexicano contemporáneo (CEMC), which was compiled in the 1970s for lexicographical purposes. Since then, numerous corpora have appeared in Mexico, both for very diverse language technologies, such as speech recognition, speech synthesis, and text mining, and for the study of linguistic phenomena in the synchronic and diachronic dimensions. In this talk, several prominent text corpus projects will be described. We will briefly examine their objectives, target language, structure, tools and resources, the context in which they emerged, and their applications, among other features.