

Use this identifier to cite or link to this item: http://hdl.handle.net/10651/19364

Title: Opinion Mining in Web 2.0
Author(s) and others: Pérez Gallego, Pablo José
Advisor(s): Díez Peláez, Jorge
Luaces Rodríguez, Óscar
Keywords: Sentiment analysis
Opinion mining
Machine Learning
Feature Selection
Publication date: Jul-2012
Abstract: In recent years the Web has undergone an intense transformation. It is no longer a mere static information repository but a dynamic system in which users have become the main content contributors. They actively share their opinions, thoughts and views about products, events and almost anything else on social networks, forums, blogs, etc. With the latest advances in mobile technologies, users can interact anytime from anywhere; real-time information has become a reality. This mixture of social networks, discussion groups, forums and blogs is collectively called user-generated content. It has many practical applications and potentially major value from both the user and the business points of view. On the one hand, knowing other users' opinions is useful when making decisions in daily life. On the other hand, it is an invaluable source of information about user preferences and tastes. Because opinion sources are so numerous and diverse, there is a need for systems that can automatically discover, analyze and summarize the sentiment expressed in this user-generated content. Sentiment analysis grows out of this need. It focuses on the computational study of people's opinions, appraisals, and emotions toward entities, events and their properties. In the first three chapters of this document we introduce the problem of sentiment analysis, describing its main characteristics and difficulties; we briefly present the main theoretical background of the work; and we provide an exhaustive review of the related literature. Afterwards, we address a sentiment classification problem: learning to classify a series of movie reviews as positive or negative according to the sentiment expressed by the author.
In chapter 4 we present the dataset and its main properties, together with all the preprocessing steps applied to the movie review texts in order to obtain useful representations. We also present the methodology used to run the experiments and to estimate the performance of the proposed approaches. In chapter 5 we describe our solutions to the problem, present the details of all the experiments performed, and evaluate and discuss the results obtained. As a baseline we have reproduced an extensive part of the experiments presented in [Pang et al., 2002]. Next, we propose a series of feature reduction approaches, with the objective of selecting a reduced and representative vocabulary for the movie review domain. Finally, we propose a novel method based on measuring word co-occurrence information in order to obtain a "meaning" representation of the text documents.
URI : http://hdl.handle.net/10651/19364
Appears in collections: Máster Universitario en Soft Computing y Análisis Inteligente de Datos

Files in this item:

File                       Description   Size      Format
TFM_Pablo_Perez_TFM.pdf                  2.18 MB   Adobe PDF


This item is subject to a Creative Commons license.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

