
B2W-Reviews02, an annotated review sample

Name: B2W-Reviews02
Title: B2W-Reviews02, an annotated review sample
Submitted by: Real, L., Bento, A., Soares, K., Oshiro, M., Mafra, A.
Language: Brazilian Portuguese
Language code: pt-BR
Category: resource
Status: available
Type: corpora
Year: 2020

We present B2W-Reviews02, an annotated sample of reviews left by customers on the Americanas.com marketplace. The corpus consists of 250 reviews drawn from a larger corpus, B2W-Reviews01 [3]. B2W-Reviews01 is currently the largest corpus of customer reviews available for Portuguese. Besides the text of each review, it provides information about the reviewed product, such as its name and category, and anonymized information about the reviewer, such as gender, age and geographical location.

As a marketplace, our goal is to understand what the customer is talking about. Often a review has a good score even though the customer complains about delivery or customer service. The customer review is a highly valuable source: it is the only stage in a successful customer journey at which the customer writes freely and leaves a final opinion about the whole chain of services used during the purchase. Automating the process of understanding our customers is urgent. B2W Digital holds four major Brazilian marketplaces¹, and Americanas.com alone receives around 30k reviews per month. We therefore need a scalable solution to the customer-review challenge.

To understand in depth the opinions and feelings that customers express in these reviews, we modeled the problem as a complex opinion mining task consisting of (1) finding the topics a review talks about and (2) analysing the sentiment/polarity of each topic separately. Our approach is close to the one adopted by [1], which first models the topics of a review and then assigns a separate sentiment to each topic found. Unlike [2] and others, we avoid unsupervised methods, especially for topic modelling, because these reviews very often cover more than one topic, each with a different sentiment.

Here, we first considered the topics PRODUTO, ENTREGA, PRECO, AVALIACAO, ESTADO DO PRODUTO, SAC and OUTROS². Most of the tags are intuitive, but we highlight that ESTADO DO PRODUTO refers to the state of the product when delivered (broken, well packed, etc.) and AVALIACAO to the process of leaving a review.
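The annotation scheme described so far (several topics per review, one polarity per topic) can be sketched as a small data structure. This is only an illustration: the class names `Topic`, `Polarity` and `AnnotatedReview`, the binary polarity values, and the example review are assumptions made here, not the released format of B2W-Reviews02.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Topic(Enum):
    # The seven topic tags listed in the text.
    PRODUTO = "PRODUTO"
    ENTREGA = "ENTREGA"
    PRECO = "PRECO"
    AVALIACAO = "AVALIACAO"
    ESTADO_DO_PRODUTO = "ESTADO DO PRODUTO"
    SAC = "SAC"
    OUTROS = "OUTROS"


class Polarity(Enum):
    # Assumed binary polarity attached to each topic.
    POSITIVE = "positive"
    NEGATIVE = "negative"


@dataclass
class AnnotatedReview:
    # Hypothetical layout: one polarity for each topic found in the review.
    text: str
    topics: Dict[Topic, Polarity] = field(default_factory=dict)


# Invented example: the product is praised, the delivery is criticized.
review = AnnotatedReview(
    text="Produto excelente, mas a entrega atrasou.",
    topics={
        Topic.PRODUTO: Polarity.POSITIVE,
        Topic.ENTREGA: Polarity.NEGATIVE,
    },
)

# Topics the customer complained about, regardless of the overall score.
negative_topics = sorted(
    topic.value
    for topic, polarity in review.topics.items()
    if polarity is Polarity.NEGATIVE
)
print(negative_topics)
```

Keeping the polarity attached to the topic rather than to the review as a whole is what lets a well-scored review still surface a delivery complaint.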

Secondly, each topic was categorized as positive or negative. The annotation was done by two human annotators, with a kappa of 0.9. Finally, a third annotator reviewed the work, yielding the final labels of B2W-Reviews02.
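For readers unfamiliar with the agreement measure, the following is a minimal sketch of Cohen's kappa for two annotators. The text does not say which kappa variant was used or over which units it was computed, so both the choice of Cohen's kappa and the toy labels below are assumptions for illustration; they do not reproduce the reported 0.9.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy polarity labels, invented for this example.
annotator_a = ["positive", "negative", "positive", "positive", "negative", "negative"]
annotator_b = ["positive", "negative", "positive", "negative", "negative", "negative"]

kappa = cohen_kappa(annotator_a, annotator_b)
print(round(kappa, 3))  # 0.667
```

Here the annotators agree on 5 of 6 items (observed agreement 0.833) while chance agreement is 0.5, giving a kappa of 2/3.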

With this work, we hope to open to the community the challenge we face in understanding customer reviews. We believe this is an important step towards being closer to the needs of our customers. We also want to encourage discussion on Natural Language Processing between Brazilian industry and academia.

¹ Americanas.com, Submarino.com, Shoptime.com and Soubarato.com.

² Literally: PRODUCT, DELIVERY, PRICE, REVIEW, STATE OF PRODUCT, CSC (Customer Service Center), OTHERS.

References

  1. Kherwa, P., Sachdeva, A., Mahajan, D.K., Pande, N., Singh, P.: An approach towards comprehensive sentimental data analysis and opinion mining. 2014 IEEE International Advance Computing Conference (IACC), pp. 606–612 (2014)
  2. Lakshmanaprabu, S.K., Shankar, K., Gupta, D., Khanna, A., Rodrigues, J., Pinheiro, P.R., de Albuquerque, V.H.C.: Ranking analysis for online customer reviews of products using opinion mining with clustering. Complexity Problems Handled by Big Data Technology (2018)
  3. Real, L., Oshiro, M., Mafra, A.: B2W-Reviews01: An open product reviews corpus. Proceedings of STIL - Symposium in Information and Human Language Technology (2019)