Brands.Br – a Portuguese Reviews Corpus

Title: Brands.Br – a Portuguese Reviews Corpus
Presented by: Fonseca, E., Oliveira, A., Gadelha, C., Guandaline, V.
Language: Brazilian Portuguese
Language code: pt-BR

The Brands.Br corpus was built from a fraction of the B2W-Reviews01 corpus [1]. We use a set of 252 samples selected by B2W to be enriched. With Brands.Br we want to address two main challenges in product review corpora.
The first is that customer reviews very often refer to things other than the product, such as customer service, delivery, a simple suggestion, a question, or a complaint about something unrelated to the product, as in: “Meu produto não foi entregue e a loja está descontando na fatura do meu cartão” (My product was not delivered and the store is charging it to my credit card bill). In this sample the customer rated the product “one star” (terrible or very bad product), but on reading it, it is clear that the review refers to issues with the delivery and not to the product itself. In addition, some reviews cover two or more topics (cross-topic reviews). To deal with the cross-topic problem we add a new layer that classifies the subject of the review. This field can be multi-label and covers 9 classes: Elogio, Reclamação, Dúvida, Solicitação, Indicação, Sugestão, Atendimento, Produto, and Entrega (Compliment, Complaint, Doubt, Request, Indication, Suggestion, Service, Product, and Delivery).
The second challenge refers to unclassified brands. The corpus has a field called “product_brand”, and many of its instances are null; these correspond to 4/5 of the corpus samples.
To perform the annotations we used a semi-automatic method: the annotations were first produced with our proprietary software and, to produce the gold standard, the samples were then manually revised by linguists.
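
As an illustration of the multi-label topic layer described above, the following minimal Python sketch encodes a set of topic labels as a 9-element indicator vector. The class names come from the description; the function name `encode_topics`, the ordering of the classes, and the example annotation are assumptions made for illustration and do not reflect the corpus's actual file format.

```python
# The nine topic labels listed in the corpus description.
# The order used here is an arbitrary choice for this sketch.
TOPIC_CLASSES = [
    "Elogio", "Reclamação", "Dúvida", "Solicitação", "Indicação",
    "Sugestão", "Atendimento", "Produto", "Entrega",
]


def encode_topics(labels):
    """Turn a collection of topic labels into a 9-element 0/1 indicator vector."""
    unknown = set(labels) - set(TOPIC_CLASSES)
    if unknown:
        raise ValueError(f"unknown topic labels: {sorted(unknown)}")
    return [1 if topic in labels else 0 for topic in TOPIC_CLASSES]


# Hypothetical annotation for the delivery review quoted above:
# a complaint (Reclamação) about delivery (Entrega), not about the product.
example_labels = ["Reclamação", "Entrega"]
example_vector = encode_topics(example_labels)
```

Because a review can carry several labels at once, an indicator vector of this kind is a common input format for multi-label classifiers; a single-label encoding would lose the cross-topic information.
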
Regarding the Brands layer, unfortunately there are cases in which it was not possible to determine the product_brand because the sample refers to a very generic product, such as “Modelador de Cachos - Preto” (Curling Iron - Black) or “Mini Filmadora HD 1080P Resistente Esportes Prova D’ Água 30m USB” (Mini Waterproof 30m HD 1080P Camcorder Sports USB), among other cases. The Brands.Br corpus is freely available.
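
To check how many samples still lack a brand, one could compute the share of records whose “product_brand” value is null or blank. The sketch below assumes the samples have been loaded as a list of Python dictionaries; the function name `missing_brand_share`, the `product_name` key, and the toy records are hypothetical and are not taken from the corpus.

```python
def missing_brand_share(records, field="product_brand"):
    """Return the fraction of records whose brand field is None or blank."""
    if not records:
        return 0.0
    missing = sum(
        1
        for record in records
        if record.get(field) is None or not str(record.get(field)).strip()
    )
    return missing / len(records)


# Made-up toy records, only to show the call; they are not corpus data.
toy_records = [
    {"product_name": "Item A", "product_brand": None},
    {"product_name": "Item B", "product_brand": "Marca X"},
    {"product_name": "Item C", "product_brand": "  "},
    {"product_name": "Item D", "product_brand": "Marca Y"},
]

share = missing_brand_share(toy_records)
```

Treating blank strings as missing is a design choice of this sketch; whether the corpus uses empty strings, nulls, or another marker for an undetermined brand should be checked against the released files.
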


  1. L. Real, M. Oshiro, and A. Mafra. B2W-Reviews01: an open product reviews corpus. In Proceedings of the Symposium in Information and Human Language Technology, 2018.