Transcript PPT

Visualization of Spread of Topic Words
on Twitter using Stream Graphs and
Relational Graphs
Keigo Amma, Shunsuke Wada, Kanto Nakayama, Yuki Akamatsu, Yuichi Yaguchi, and Keitaro
Naruse
School of Computer Science and Engineering
University of Aizu, Japan
SCIS & ISIS, Japan
IEEE 2014
Introduction
Related Work
Method
Experiment
Result
Conclusion
• Microblog services let users post short messages (up to 150 characters). These
services are easy to update, and users can communicate with other
connected users in real time.
• Users can publish and share their moods and opinions.
• This research proposes ways to visualize a set of tweets for analysis of
the topic space and temporal patterns and to study how topics and user
interests spread or moods change over time.
• Yang et al. created a visualization scheme for four phases in crisis
response (preparation, response, restoration, and risk reduction) by applying
machine learning techniques to a set of tweets.
• Lee et al. introduced stream graphs, which we also use, but our research
aims to indicate how each topic is distributed and how active it is at a
particular time; we are not restricted to typhoon disaster tweets.
• Sakaki et al. also studied a set of tweets as a sensor. They estimated the
spread of tweets as a way of measuring a disaster using supervised
machine learning techniques. This research was also an application of
information visualization.
A. Visualization
• We developed an interface using D3.js (Data-Driven Documents), a
JavaScript library, together with HTML and JavaScript.
• This figure shows a graph visualizing changes in the usage
frequency of some words. The data are expressed by the
increase and decrease of the width of each layer. The layers are
separated by color; each layer represents the frequency of a
single noun.
• This figure shows a graph visualizing the association of nouns. This graph
represents the relevance of words using nodes developed in a particular
time frame. Each node corresponds to one noun in tweets.
• Smaller distances between nodes
indicate stronger relevance of the
nouns. For convenience in
calculating the coordinates of
nodes, the co-occurrence* of the
most important word was not
placed at the origin of the
coordinates.
*co-occurrence: the two associated topic words occur in the same tweet
B. Occurrence frequency of nouns
• To draw the stream graph, we count how often each associated topic word (a noun)
appears in each user's set of tweets.
• If we have N words, the layers of Byron’s graph are defined as follows:
$f_1, \ldots, f_n$
• where the number of words is $n$ ($n \le N - 1$).
• First, the baseline $g_0$ of the stream graph is defined as:
$g_0 = 0$
• Then, if the graph is to be stacked symmetrically, the baseline $g_0$ can be defined as:
$g_0 = -\frac{1}{2} \sum_{i=1}^{n} f_i$
• If the graph is to be stacked asymmetrically, then $g_0$ is defined as:
$g_0 = -\frac{1}{n+1} \sum_{i=1}^{n} (n - i + 1) f_i$
• Here, the top of the i-th layer, $g_i$, is the baseline plus the sum of the first i layer values:
$g_i = g_0 + \sum_{j=1}^{i} f_j$
• With this information, the stacked graph is drawn as a time sequence.
• We define a topic word and extract associated topic words automatically using
semantics, and then we plot the frequency of associated topic words as a stream
graph.
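The baseline definitions above can be sketched in a few lines of Python. This is only an illustration for a single time step: the function names `baseline` and `layer_tops`, the mode labels, and the sample counts are assumptions made for this sketch, not part of the original system (which was built with D3.js).

```python
from typing import List


def baseline(f: List[float], mode: str = "zero") -> float:
    """Baseline g_0 at one time step for layer values f = [f_1, ..., f_n]."""
    n = len(f)
    if mode == "zero":        # ordinary stacked graph: g_0 = 0
        return 0.0
    if mode == "symmetric":   # g_0 = -1/2 * sum_i f_i
        return -0.5 * sum(f)
    if mode == "asymmetric":  # g_0 = -1/(n+1) * sum_i (n - i + 1) * f_i
        weighted = sum((n - i + 1) * fi for i, fi in enumerate(f, start=1))
        return -weighted / (n + 1)
    raise ValueError(f"unknown mode: {mode}")


def layer_tops(f: List[float], mode: str = "zero") -> List[float]:
    """Return [g_0, g_1, ..., g_n] where g_i = g_0 + f_1 + ... + f_i."""
    tops = [baseline(f, mode)]
    for fi in f:
        tops.append(tops[-1] + fi)
    return tops


# Made-up word frequencies for one time step.
counts = [3.0, 1.0, 2.0]
for mode in ("zero", "symmetric", "asymmetric"):
    print(mode, layer_tops(counts, mode))
```

Repeating this for every time segment and joining the corresponding g_i values gives the layer outlines of the stream graph.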
C. DBpedia
• DBpedia exposes Wikipedia content as a structured database.
• We use DBpedia to automatically find words that are semantically associated
with the topic words.
• For instance, a search for ‘sports’ returns 52 associated nouns.
• In this research, we visualize the frequency of these associated topic
words as a stream graph.
D. Hayashi’s quantification method type IV
• We also plot the association of topic words in a set of tweets. First, we
create a co-occurrence matrix S of associated topic words.
• We set the tweet collection period to one week, construct
the co-occurrence matrix for that week, and apply quantification method IV (Q-IV)
to calculate the two-dimensional position of each word.
• If two objects are strongly associated, they are plotted closer
together.
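A minimal sketch of these two steps follows, assuming the commonly described eigenvector formulation of Q-IV: build a symmetric matrix H from the similarities, set each diagonal entry to the negative sum of its row, and use the eigenvectors with the largest eigenvalues after discarding the trivial constant one. The slides do not state which variant the authors used, so the function names, the toy tweets, and this formulation are all assumptions.

```python
from itertools import combinations

import numpy as np


def cooccurrence(tweets, vocab):
    """S[i, j] = number of tweets that contain both vocab[i] and vocab[j]."""
    index = {w: k for k, w in enumerate(vocab)}
    S = np.zeros((len(vocab), len(vocab)))
    for words in tweets:
        present = sorted({index[w] for w in words if w in index})
        for i, j in combinations(present, 2):
            S[i, j] += 1
            S[j, i] += 1
    return S


def quantification_iv(S, dims=2):
    """Return an (n_words, dims) array of coordinates (assumed Q-IV variant)."""
    H = S + S.T                            # h_ij = e_ij + e_ji for i != j
    np.fill_diagonal(H, 0.0)
    np.fill_diagonal(H, -H.sum(axis=1))    # h_ii = -sum_{j != i} h_ij
    eigvals, eigvecs = np.linalg.eigh(H)   # eigenvalues in ascending order
    # The largest eigenvalue (0) belongs to the constant vector; skip it and
    # keep the next `dims` eigenvectors as coordinates.
    order = np.argsort(eigvals)[::-1][1:dims + 1]
    return eigvecs[:, order]


# Toy week of tokenized tweets (hypothetical data).
vocab = ["vegetable", "salad", "farmer", "fruit", "melon", "gardening"]
tweets = (
    [["vegetable", "salad", "farmer"]] * 5
    + [["fruit", "melon", "gardening"]] * 5
    + [["farmer", "fruit"]]
)
coords = quantification_iv(cooccurrence(tweets, vocab))
for word, xy in zip(vocab, coords):
    print(word, xy)
```

In this toy input the first axis separates the two groups of words. The sketch assumes every word co-occurs with at least one other word; isolated words add further zero eigenvalues and would need separate handling.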
• We collected 836,030 tweets, starting from the seed user ‘yagu1’ and extending to 300 followers,
over 104 weeks from May 5, 2010 to May 1, 2012. This dataset contains
topics discussed between people.
• First, we applied morphological analysis to the collected tweets. Next, we
selected topic words to visualize trends. We selected two topic words: ‘野菜’
(vegetable in Japanese) and ‘スポーツの’ (sports).
• Then, we found associated words using DBpedia for each topic word: the
word ‘vegetable’ was associated with 72 words, such as ‘農家’ (farmer);
‘sports’ was associated with 52 words, such as ‘水泳’ (swimming).
• From this, we segmented the set of tweets by time, counted the associated words to create
frequency indexes per time segment, and then plotted them as
a stream graph.
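The counting step can be sketched as below. It assumes the tweets have already been tokenized by a morphological analyzer and that one week is the time segment. The function name `weekly_counts` and the sample tweets are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import date


def weekly_counts(tweets, words, start):
    """Count associated words per week.

    tweets: iterable of (date, [tokens]) pairs
    words:  associated topic words to count
    start:  first day of week 0
    Returns {week_index: Counter({word: frequency})}.
    """
    wanted = set(words)
    buckets = defaultdict(Counter)
    for day, tokens in tweets:
        week = (day - start).days // 7
        buckets[week].update(t for t in tokens if t in wanted)
    return dict(buckets)


# Hypothetical tokenized tweets.
sample = [
    (date(2010, 5, 5), ["野菜", "農家"]),
    (date(2010, 5, 11), ["野菜"]),
    (date(2010, 5, 12), ["農家", "天気"]),
]
print(weekly_counts(sample, ["野菜", "農家"], start=date(2010, 5, 5)))
```

Each week's counter supplies the layer values for one time step of the stream graph.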
• The stream graph shows the variation of each
associated topic word, and the width of the
stream graph shows the variation in the number of
topic words. In Figures 3 and 4, each view spans
7-8 weeks of tweets, with one selected week marked by a blue vertical ribbon.
The relational graph corresponds to
the week of tweets indicated by the blue vertical
ribbon.
• The large red area indicates ‘vegetable’, and
its frequency is very high over three weeks.
Here, the sentence that was tweeted multiple times
included only the term ‘vegetable’.
• Although this result is somewhat noisy,
there are two clusters: {‘vegetable’, ‘salad’,
‘plant’, ‘custom’, ‘farmer’, ‘J-Pepper’} and
{‘fruit’, ‘melon’, ‘gardening’}. These clusters
separate well into the concepts of ‘vegetable’
and ‘fruit’.
• Compared with existing research, our research focuses more on a particular short
period of time. This is because the stream graph is paired with a relational graph, so we can see
relationships within a short period.
• These characteristics probably emerge easily because the associated
words were selected using DBpedia. For verification, we need to check more test cases.
• The system focuses on nouns and ignores synonyms and abbreviations. We should
develop this system further to cover other words such as verbs and adverbs,
and to index synonyms and abbreviations using WordNet.
• The system is not yet automatic, so each case must be analyzed individually. We plan to
use caches or eliminate data during the calculation process for large-scale
tweet databases to reduce processing time.
• In addition, we plan to automate the entire process from gathering Twitter data
to visualization and to publish this service.
Visualization in Media Big Data Analysis
Yingjian Qi, Xinyan Yu, Guoliang Shi, Ying Li
Faculty of Science and Technology
Communication University of China
ICIS, USA
IEEE 2015
Introduction
Content
Conclusion
• Visualization technology is now applied very widely, while
big data technology is developing rapidly.
• Based on how micro-blog users forward hot public-opinion topics,
we introduce geographic information data, which enables more
in-depth data analysis and reveals the underlying patterns of
things through visualization.
I. Text Visualization
A. Meaning of text visualization
• Although traditional text analysis techniques can dig important
information out of large data, the results often fail to meet
people’s requirements, and people cannot browse and filter them in a
reasonable way to analyze, understand, and apply them.
• Text visualization technology emerged in response to this challenge.
• Text visualization combines theories and methods from text analytics, data mining,
data visualization, computer graphics, human-computer interaction,
cognitive science, and other disciplines.
B. Content of Text Visualization
• A text visualization system consists of three
parts:
• Text analysis. Producing a visualization requires text
analysis, which extracts a
substantial collection of descriptors (keywords,
free words, headings, etc.).
• Visual presentation. Visual coding of text relates to
size, shape, orientation, texture, etc.
• Interaction between the user and the information map.
This mainly involves linked updates,
highlighting, animated transitions, zooming, and
so on.
C. Text Visualization Application Status
• Text visualization technology can produce rich charts and
images, which help people summarize a full text,
analyze the resulting data, and show it in a way that is easier to understand
and accept.
• It has a large effect on intelligence research, decision support, and
other related fields.
• Alongside the development of text visualization, the emergence of
micro-blogs has overturned the way people browse traditional text
information and changed the traditional mode of information
transmission.
II. Micro-Blog Visualization
A. Definition and Characteristics of Micro-Blog
• A micro-blog is a social networking platform for sharing real-time information
through users' social relationships; it is based on users sharing,
communicating, and obtaining information.
• Micro-blogs are characterized by convenience, originality, and a grassroots nature.
B. Analysis Tools of Micro-Blog Visualization
• Currently popular micro-blog visual analysis tools include ‘Sina micro-blog
index’, ‘zhiwei’, ‘a find microanalysis’, and ‘PKUVIS’.
• Hot-word indexes are used especially widely for tracking trends in public
opinion on micro-blogs. Simply opening the search
interface shows the public-opinion topics that people are currently
concerned about.
• Each micro-blog forwarding path is clearly visible and the forwarding
hierarchy can be distinguished, so, most importantly, we can quickly find the hot
micro-blog posts of opinion leaders.
C. Influence and Significance of Micro-Blog Visualization
• Micro-blogs have become a major venue for marketing,
government affairs, and the formation of public opinion. Their power to spread
information and to influence individuals, businesses, and government should
not be underestimated.
• Micro-blog visualization analyzes the dissemination of information, traces
the source of hot events, and helps the relevant
decision-making departments verify authenticity. This helps serve the public.
• Phenomena driven by micro-blogs, such as the exposure of corruption, crackdowns, and
micro-blog marketing, are changing our lives. Our lives will be more
colorful as long as we use micro-blogs reasonably to spread information.
III. Geographical Information Visualization
A. Content of Geographical Information Visualization
• Geospatial information visualization is
based on computer science, cartography,
cognitive science, information science,
and GIS. It uses computer technology and
multimedia technology to visually interpret
and convey geospatial
information and to reveal its patterns.
• Da Liu et al. discussed ground exploration
based on three-dimensional
volume rendering technology, which has
broad prospects.
B. Features of Geographic Information Visualization
• Geographic information visualization emerged gradually and has the
following characteristics:
• Intuitive and vivid. Geographic information visualization conveys information to the
reader through vivid graphics, images, video, sound, etc.
• Interactive and exploratory. An interactive mode supports visual thinking. Geography
practitioners can easily learn to compare, synthesize, and analyze
multiple sources of information interactively.
• Diverse information carriers. With the development of multimedia technology,
the expression of geographic information is no longer limited to tables, graphs, and
documents, but extends to images, animation, 3D simulation, virtual reality, etc.
C. Process of Geographic Information Visualization
• The visualization process mainly includes forming graphic images and querying spatial
information:
• Forming a graphic image. Geographic information includes 2D images and 3D map
data, as well as visual expressions of data analysis and evaluation such as plots and histograms.
GIS renders many types of graphics and images of geographic phenomena in the
Windows environment.
• Spatial information query. Quickly querying spatial information according to given requirements
is an important application of GIS visualization. Spatial queries in GIS
include spatial relationship queries, attribute queries, and queries
based on both spatial relations and attribute characteristics.
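As a rough illustration of the third query type, the sketch below filters records by both a spatial relation (within a radius of a center point) and attribute values. It is not tied to any particular GIS product. The record fields (`lat`, `lon`, `kind`), the function names, and the sample points are assumptions for this example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def query(features, center, radius_km, **attrs):
    """Return features within radius_km of center whose attributes match attrs."""
    lat0, lon0 = center
    return [
        f for f in features
        if haversine_km(lat0, lon0, f["lat"], f["lon"]) <= radius_km
        and all(f.get(k) == v for k, v in attrs.items())
    ]


# Hypothetical micro-blog accounts with approximate city coordinates.
accounts = [
    {"name": "u1", "lat": 39.90, "lon": 116.40, "kind": "media"},
    {"name": "u2", "lat": 39.13, "lon": 117.20, "kind": "user"},
    {"name": "u3", "lat": 31.23, "lon": 121.47, "kind": "media"},
]
print([f["name"] for f in query(accounts, (39.90, 116.40), 200)])
print([f["name"] for f in query(accounts, (39.90, 116.40), 200, kind="media")])
```

Calling `query` without keyword arguments gives a purely spatial query. A very large radius with attributes gives a purely attribute-based one.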
• It is not easy for people to deal with today's mass of information.
• Text visualization technology has emerged to help people
extract the information they are interested in from this huge amount of
information.
• Micro-blogs overturn the traditional mode of information transmission,
which makes it useful to study the geographical information of communicators.
Visualizing Time-Varying Topics via
Images and Texts for Inter-Media
Analysis
Masahiko Itoh, Masaru Kitsuregawa
Institute of Industrial Science
University of Tokyo
17th International Conference on Information Visualization
IEEE 2013
Introduction
Related Work
Method
Case Study
Conclusion
• Various types of content such as text, images, and videos have spread
throughout multiple media, such as TV and the Web, that have
complementary information and influence one another.
• It is important to compare how these media react to real world events to
understand recent societal behaviors and how each medium reacts to
other media.
• The proposed systems enable users to visually monitor changes in
thought, activities, and interests of people, and differences between
media through interactively exploring flows of texts and images extracted
from the media.
• Havre et al. provided methods of visualizing changes in the values of
multiple attributes on a timeline.
• Wei et al. combined ThemeRiver with tag clouds to
visualize changes in the keywords that make up topics.
• Dou et al. provided flow-like metaphors and arranged these flows in
parallel to represent topical themes over time.
• Imoto et al. placed a set of polylines in 3D space to provide two different
viewpoints that enabled users to explore both overviews of data from a
top view and details of specific parts of data from a frontal view.
• Gomi et al. visualized images categorized by time, location, and people in
life log data to visualize flows of images.
• Flake’s Pivot provided visualization of cover photographs of magazines
from a particular facet.
• Image Depot visualized flows of images from captured data packets at
every IP address to check for inappropriate use of the Internet.
• Crandall et al. proposed a method of predicting locations from the visual,
textual, and temporal features of photographs that people took of the
locations.
• Lie et al. introduced a tag ranking method using visual and textual features
to extract tags related to selected images.
• The proposed system is implemented as a composite component of the
IntelligentBox system (Okada et al.), which is a component-based visual
software development system for interactive 3D graphics applications.
• Our visualization system consists of two main parts: the Image
Flow View and the Event View.
• Image Flow View
• Our system visualizes changes in the images of topics in a 3D visualization space.
• It uses the x-axis for the timeline. We adapt a
histogram of images by stacking images on
the timeline to represent the flow of various
images at each point in time.
• It uses the y-axis to stack the images of a topic
within a specified time window. This enables us
to find the time a topic emerges, its bursting points,
changes in popular content, and the lifetime
of trends for each topic.
• Topics represented by the histograms of
images are arranged along the z-axis in the
3D space. Users can define the order of topics on
the z-axis manually, or automatically
by using the topics' rankings if they have
them.
• Event View
• When users select a topic and a point in time, the Event View retrieves events related
to the topic and automatically moves to that point on the timeline to
display events belonging to the time window.
• We modify our event visualization component for the proposed system using a text
analysis mechanism (Itoh, Yoshinaga, Toyoda, Kitsuregawa). It consists of two
components: TimeSlice (Itoh, Toyoda, Kitsuregawa) and TimeFlux (Itoh).
• Figure 3 shows an example of an event tree
for a selected keyword. We place a topic
keyword at the center of the tree in the
TimeSlice and arrange the verbs on which the
keyword depends around the keyword.
• A TimeFlux is a line of bubbles, in this case
3D polygon fonts, that visualizes changes
in the amount of information, such as the
number of events within a given period of
time (Figure 1).
• Case Study 1: Inter-Media Event Exploration on TV and Blogs
• Extracting images from the TV archive. News video images related to specified
keywords are extracted from a broadcast news video archive created by the National
Institute of Informatics in Japan.
• Extracting events from the blog archive. We first extract sentences from the blog
archive in which the names of key people appear. We then use dependency analysis to extract
phrase dependency structures in which each dependency relation includes one of the key
people, and construct the event database from them.
• Example of visualization. Figures 1 and 2 show the daily number of mentions on TV and
blogs for the four key people. Figure 3 visualizes detailed
events that represent the activities of Prime Minister Kan and people’s thoughts
and/or interests regarding his actions. We can also see from Figure 4 that they
appear in different image flows.
• Case Study 2: Visual Trends in Social Media
• Extracting visual trends from Blog archive. Our system retrieves relevant articles
from our blog archive for a given query, and then extracts images and surrounding
texts included in the articles. These images are clustered based on their visual,
textual, and chronological similarities, and then visualized as an image flow.
• Example of visualization. Figure 5 visualizes clustered images for a given query
related to ‘Prime Minister Hatoyama’, where the top 20 clusters are arranged from
front to back according to their rankings. Figure 6 visualizes changes in trends of
new products related to ‘kitkat’ confectionery in Japan.
• We proposed a visualization system to explore trends and events in
various types of media, such as TV and/or blogs, by observing the
flows of images and changes in event tree structures.
• Our approach could also be applied to images from other types of media
such as Twitter and Flickr.
• Our future work includes extending our current work so that users can
easily recognize the characteristics of events from image flows, using
statistical values such as the mean and variance of histograms to extract
important events, and cross-correlating histograms from different types of
media to extract lead-lag relationships between them.
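The lead-lag idea mentioned as future work could look roughly like the following sketch. It searches for the shift that maximizes the Pearson correlation between two equally long daily count series. The function name `best_lag`, the sign convention, and the toy counts are assumptions, since the slides do not describe the authors' intended procedure.

```python
import numpy as np


def best_lag(a, b, max_lag):
    """Find the lag k maximizing corr(a[t], b[t + k]).

    a and b must have the same length. A positive k means series b
    follows series a by k time steps. Returns (lag, correlation).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    best, best_r = 0, -np.inf
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            x, y = a[:len(a) - k], b[k:]
        else:
            x, y = a[-k:], b[:len(b) + k]
        if x.std() == 0 or y.std() == 0:
            continue  # correlation is undefined for a constant segment
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best, best_r = k, r
    return best, best_r


# Hypothetical daily mention counts: blogs echo TV two days later.
tv = [0, 0, 5, 1, 0, 0, 0, 3, 0, 0, 0, 0]
blogs = [0, 0, 0, 0, 5, 1, 0, 0, 0, 3, 0, 0]
lag, r = best_lag(tv, blogs, max_lag=3)
print(lag, round(r, 3))
```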
“Moon Phrases”: A Social Media
Facilitated Tool for Emotional Reflection
and Wellness
Munmun De Choudhury, Michael Gamon, Aaron Hoff, Asta Roseway
Microsoft Research
ICST
IEEE 2013
Introduction
Related Work
Design
Technical
Conclusion
• Social media platforms, including Twitter and Facebook, provide a window
onto the thoughts and feelings of individuals around small and big
happenings in their lives.
• We hypothesize that identifying the changes in language, emotion, and
social activity on social media would enable individuals to reflect on their
own behavior over time and in a fine-grained manner; such changes are
otherwise known to be difficult to keep track of.
• We believe Moon Phrases thus bears the potential to act as a self-narrative or ‘behavioral fingerprint’, and thereby serve as an unobtrusive
mechanism to facilitate emotional wellness in individuals.
• Kotikalapudi et al. analyzed patterns of web activity of college students
that could signal emotional concerns.
• Park et al. found initial evidence that people do post about their affective
concerns and even their treatment on Twitter.
• De Choudhury et al. examined linguistic and emotional correlates of the
postnatal course of new mothers as manifested on Twitter.
• Munson et al. presented an application for Facebook that promotes health
interventions.
• A core challenge of the design process was identifying which
social media cues, based on an end user’s activity, are best to visualize
in order to promote emotional wellness.
• Hence, a key aspect of Moon Phrases was the ability of users to observe
how they use language, and how their usage of various style categories
reflected their behavioral characteristics.
• Emotions are founded on interrelated patterns of cognitive processes,
physiological arousal, and behavioral reactions.
• Weaving these observations together, our design process indicated that Moon
Phrases should be a mechanism to show trends in people’s social activity (i.e.,
volume of posting on Twitter) and affect over time, as well as their usage of
linguistic styles that might relate to their social and psychological
environment. We chose one day as the granularity at which to show these trends.
• As shown in Figure 1, postings are
organized by their timestamps:
the most recent months
are represented as rows at the
top, and by scrolling down one can
browse the affect over previous
timeframes.
• On clicking on any moon on a certain day, a user
could view the particular date and day of the week,
as well as each posting made on that day, including
the degree of positivity expressed in each, through a
smaller-sized moon (Fig 2)
• Fuller moons (the light-filled area of the moon in Figure 1) indicate
greater positive affect.
• Moons whose illuminated portion is white or a lighter shade
represent positive affect corresponding to 1-3 posts, whereas
orange or darker-shaded ones correspond to more than three posts. Thus
an end user gets a sense of the general volume of posting that leads to
a certain measurement of affect.
• Each of the ‘daily’ moons is also interactive: on clicking on any moon on
a certain day, a user can view the particular date and day of the week,
as well as each posting made on that day, including the degree of
positivity expressed in each, through a smaller-sized moon (Figure 2).
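The encoding described above can be illustrated with a small sketch. It assumes each post already has a positivity score between 0 and 1, and that a day's fullness is the mean of its posts' scores. The slides do not state the aggregation rule, so that choice, the `Moon` type, and the shade names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Moon:
    fullness: float  # 0.0 = new moon, 1.0 = full moon (more positive affect)
    shade: str       # "white" for 1-3 posts, "orange" for more than three


def daily_moon(post_scores: List[float]) -> Optional[Moon]:
    """Map one day's per-post positivity scores to a moon glyph."""
    if not post_scores:
        return None  # no posts, so no moon is drawn for that day
    fullness = sum(post_scores) / len(post_scores)  # assumed aggregation
    shade = "white" if len(post_scores) <= 3 else "orange"
    return Moon(fullness, shade)


print(daily_moon([0.5, 1.0]))
print(daily_moon([0.2, 0.4, 0.6, 0.8, 1.0]))
```

The same function applied to a single post's score would give the smaller per-post moon shown after a click.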
• Studies by De Choudhury et al. indicated that these hashtags often act as a
supervisory summary signal indicating a person’s affective state in the
context of the post.
• These hashtags were then mapped to positive and negative affect and
used as a training signal to identify affect from Twitter posts with a
text classifier: a maximum entropy classifier trained on unigrams and
bigrams of post content. This classifier has been validated on Twitter
datasets, with a mean accuracy of more than 85% for the two classes of
affect.
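A toy version of this pipeline is sketched below. For two classes, a maximum entropy classifier is equivalent to logistic regression, so the sketch trains a small logistic regression on unigram and bigram features with plain stochastic gradient steps, without a bias term or regularization. The hashtag lists, function names, and example posts are hypothetical. The authors' actual features, hashtags, and training procedure are not given in the slides.

```python
import math
from collections import defaultdict

# Hypothetical hashtag-to-affect mapping used as the training signal.
POSITIVE_TAGS = {"#happy", "#excited"}
NEGATIVE_TAGS = {"#sad", "#angry"}


def label_from_hashtags(post):
    """Return 1 (positive), 0 (negative), or None if no clear signal."""
    tags = {t.lower() for t in post.split() if t.startswith("#")}
    if tags & POSITIVE_TAGS and not tags & NEGATIVE_TAGS:
        return 1
    if tags & NEGATIVE_TAGS and not tags & POSITIVE_TAGS:
        return 0
    return None


def features(post):
    """Unigrams and bigrams of the post with hashtags removed."""
    tokens = [t.lower() for t in post.split() if not t.startswith("#")]
    return tokens + [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def prob_positive(weights, feats):
    """P(positive | features) under the two-class maximum entropy model."""
    z = sum(weights.get(f, 0.0) for f in feats)
    return 1.0 / (1.0 + math.exp(-z))


def train(posts, epochs=50, lr=0.5):
    """Fit feature weights by stochastic gradient ascent on the log-likelihood."""
    weights = defaultdict(float)
    data = [(features(p), label_from_hashtags(p)) for p in posts]
    data = [(x, y) for x, y in data if y is not None]
    for _ in range(epochs):
        for x, y in data:
            error = y - prob_positive(weights, x)
            for feat in x:
                weights[feat] += lr * error
    return weights


posts = [
    "what a great day #happy",
    "love this great weather #excited",
    "feeling awful today #sad",
    "awful traffic again #angry",
]
model = train(posts)
print(prob_positive(model, features("great day")))
print(prob_positive(model, features("awful traffic")))
```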
• In this paper, we have not yet validated the Moon Phrases prototype in terms
of its ability to act as an intervention mechanism for emotional wellness.
• Though our design was closely based on feedback from fluent social
media users, in the future we intend to evaluate how Moon Phrases can
positively impact their behavior, beyond revealing mere ‘signals’ of their
affect and behavior.
• We are considering a longitudinal study design in which a set of users may
be asked to use the web tool over the course of three to four weeks, with
self-reported behavioral and emotional reflection surveys undertaken at
the end of every week.
CosMovis: Semantic Network
Visualization by Using Sentiment Words
of Movie Review Data
Hyoji Ha, Wonjoo Hwang, Sungyun Bae, Hanmin Choi, Hyunwoo Han, Gi-nam Kim, Kyungwon Lee
Department of Digital Media
Ajou University
19th International Conference on Information Visualization
IEEE 2015
Introduction
Related Work
Method
Conclusion
• Social Network Analysis plays a significant role in understanding and
finding solutions to problems in how society functions by examining the original
structure and relationships of a network.
• This paper proposes discovering the correlations between keywords
through Multi-Dimensional Scaling (MDS) and reflecting the analysis result in
a two-dimensional distribution map, so that nodes are placed in semantic
positions when designing a network visualization based on similarities.
• We also applied a constellation map formed from the nodes and edges of a
network clustering structure to label the characteristics of each cluster.
• Sentiment Words
• Kim et al. covered the sentiment words shown in online postings.
• JoungYeon et al. presented adjectives that describe haptic texture, and
showed the relations between the adjectives using MDS.
• Network Visualization and Layouts
• Dunne et al. introduced a technique called motif simplification, in which common
patterns of nodes and links are replaced with compact and meaningful glyphs,
helping users analyze network visualizations easily.
• Uboldi et al. presented a tool called ‘Knot’, which aims to analyze multi-dimensional
and heterogeneous data while focusing on interface design and information
visualization in a multidisciplinary research context.
• Henry et al. suggested methods to resolve clustering ambiguity and increase
readability in network visualization.
• Data Processing
• Sentiment Words Collection
We selected 100 sentiment words based on Hahn and Kang’s research. We investigated to what degree
the emotion represented by each sentiment word can be drawn from watching movies. The
questionnaire used a 7-point Likert scale from ‘strongly irrelevant’ to ‘strongly relevant’. After
eliminating 32 sentiment words that scored below the average, 68 sentiment words were selected.
• Sentiment Words Refinement
To select the final sentiment words from among the 68 sentiment words, we collected and compared the
sentiment word data in existing movie reviews, eliminating the words rarely used.
• Crawling
Movie review data were collected from NAVER, the web portal site with the largest number of users in Korea. We
designed a web crawler to automate the sentiment word collection from movie reviews.
• Establishing a sentiment word dictionary
We divided the text data collected through the crawling process into morphemes. Emotion
morphemes were extracted and classified by category in consultation with Korean linguists.
• Applying TF-IDF
We eliminated the less influential sentiment word
clusters after matching them with actual movie
review data, in order to produce more accurate
results. We eliminated the sentiment words
whose TF-IDF score was less than 10%, and
eventually selected 36 sentiment words.
• Movie Data Collection
The movie samples used in the network visualization
were also collected from the NAVER movie service,
in accordance with the movie review data. As a
result, 678 movie samples were selected and
used as network sample data.
• Visualization Proposal
• Heat Map Visualization
First, we measured the distances among the
selected 36 sentiment words and
analyzed their correlations. We
conducted a survey on semantic
distance among 20 college students.
The sentiment words were placed on both
axes (36x36), and the distance
between each pair of words was scored
from minus 3 to plus 3 points, considering
their emotional distance.
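The survey step above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each response is a (word, word, score) triple with a score between -3 and +3, and that the scores for a pair are simply averaged. The function name, the data layout, and the sample scores are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_distance_matrix(words, responses):
    """Average per-pair survey scores into a symmetric word-by-word matrix.

    responses: iterable of (word_a, word_b, score), score in [-3, 3].
    Cells for pairs that nobody rated stay None.
    """
    index = {w: i for i, w in enumerate(words)}
    collected = defaultdict(list)
    for a, b, score in responses:
        if not -3 <= score <= 3:
            raise ValueError(f"score out of range: {score}")
        # Treat (a, b) and (b, a) as the same pair.
        key = tuple(sorted((index[a], index[b])))
        collected[key].append(score)

    n = len(words)
    matrix = [[None] * n for _ in range(n)]
    for (i, j), scores in collected.items():
        matrix[i][j] = matrix[j][i] = mean(scores)
    return matrix


# Hypothetical sample: three words instead of 36, two ratings per pair.
words = ["happy", "sad", "fear"]
responses = [
    ("happy", "sad", -3), ("sad", "happy", -2),
    ("sad", "fear", 2), ("fear", "sad", 1),
]
matrix = build_distance_matrix(words, responses)
```

With the full data the same function would yield the 36x36 matrix that the heat map is drawn from.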
We measured the frequency of the sentiment words for each movie by
matching the sentiment words against the movie review data. We also
calculated the TF-IDF score to lower the weight of particular
sentiment words. The TF-IDF score of each sentiment word can therefore be
interpreted as the numerical value shown on the Heat Map Visualization Graph
for the target movie.
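The per-movie scoring described above can be sketched as follows. This is only an illustration under stated assumptions: the reviews are assumed to be already split into tokens, and the common textbook weighting (relative term frequency times log of inverse document frequency, with each movie treated as one document) is used, since the slides do not state the exact TF-IDF variant. The function name and sample data are hypothetical.

```python
import math
from collections import Counter


def tf_idf_by_movie(reviews_by_movie, sentiment_words):
    """Return {movie: {word: tf-idf}} for the given sentiment words.

    tf  = count of the word / total sentiment-word count for that movie
    idf = log(number of movies / number of movies containing the word)
    """
    vocab = set(sentiment_words)
    counts = {
        movie: Counter(t for t in tokens if t in vocab)
        for movie, tokens in reviews_by_movie.items()
    }
    n_movies = len(counts)

    # Document frequency: in how many movies does each word occur?
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    scores = {}
    for movie, c in counts.items():
        total = sum(c.values())
        scores[movie] = {
            w: (c[w] / total) * math.log(n_movies / doc_freq[w]) if c[w] else 0.0
            for w in sentiment_words
        }
    return scores


# Hypothetical sample: tokenized reviews for two movies.
reviews = {
    "movie_a": ["happy", "happy", "sad"],
    "movie_b": ["sad", "fear"],
}
scores = tf_idf_by_movie(reviews, ["happy", "sad", "fear"])
```

In this sample ‘sad’ occurs in the reviews of both movies, so its weight drops to zero, which is the down-weighting effect the slide refers to.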
As Figure 4 B shows, it can be interpreted that
viewers had various different emotions
about this movie, including disappointment.
In contrast, Figure 4 A indicates that only one
characteristic showed high frequency among several
sentiment words including ‘happy’, ‘surprise’,
‘boring’, ‘sad’, ‘anger’, ‘disgust’, and ‘fear’.
Furthermore, using the Heat Map
made it possible to easily compare movie nodes
that have contrasting or similar sentiment
words.
• Sentiment-Movie Network
We explain the basic structure of the suggested graphs with examples
and show that the location of a node can change depending on the main
sentiment word in its movie reviews. We connected the sentiment words on
the 2D scaling map with the movie network and called the result the Sentiment-Movie Network.
The first layer is called the Semantic
Layer, and it consists of Semantic Points
based on the 36 sentiment words. The
Semantic Point of each sentiment word
is located at an initially set position and
stays immovable.
The second layer is called the Network
Layer, which includes the nodes that
comprise the movie network.
Each movie node forms edges with
other movie nodes based on
similarity and also forms imaginary
edges with the sentiment words
that the node connotes.
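A minimal sketch of the two-layer idea follows. The slides do not state how node positions or movie similarity are computed, so both choices here are assumptions made for illustration: a movie node is placed at the weighted centroid of the fixed Semantic Points it is connected to, and two movies are linked when the cosine similarity of their sentiment weights reaches a threshold. All names, coordinates, and weights are hypothetical.

```python
import math


def place_movie(semantic_points, weights):
    """Weighted centroid of the fixed semantic points a movie connects to."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("movie has no sentiment weight")
    x = sum(semantic_points[w][0] * s for w, s in weights.items()) / total
    y = sum(semantic_points[w][1] * s for w, s in weights.items()) / total
    return (x, y)


def cosine(u, v):
    """Cosine similarity of two sparse {word: weight} vectors."""
    dot = sum(s * v.get(w, 0.0) for w, s in u.items())
    norm_u = math.sqrt(sum(s * s for s in u.values()))
    norm_v = math.sqrt(sum(s * s for s in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def movie_edges(movie_weights, threshold=0.5):
    """Link every pair of movies whose similarity reaches the threshold."""
    names = sorted(movie_weights)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cosine(movie_weights[a], movie_weights[b]) >= threshold
    ]


# Semantic Layer: fixed points (hypothetical coordinates).
semantic_points = {"happy": (0.0, 0.0), "sad": (4.0, 0.0), "fear": (4.0, 4.0)}
# Network Layer: per-movie sentiment weights (hypothetical values).
movie_weights = {
    "movie_a": {"happy": 3.0, "sad": 1.0},
    "movie_b": {"happy": 1.0, "sad": 3.0},
    "movie_c": {"fear": 2.0},
}
positions = {m: place_movie(semantic_points, w) for m, w in movie_weights.items()}
edges = movie_edges(movie_weights)
```

Here the semantic points never move, while each movie node is pulled toward the sentiment words that dominate its reviews. A force-directed layout would be another way to realize the same idea.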
• Constellation Visualization
This part describes the process of designing
the constellation image visualization, which is
based on specific nodes and edges with significant
sentiment word frequency and is meant to clarify the
semantic meaning of each cluster.
We created an asterism graphic for each
cluster network, considering the significant
sentiment words, information on the movies, and
the synopses in each cluster.
Table 2 shows the main emotions and movie
examples of each cluster, and the
motivation for choosing each asterism name.
• To analyze network visualization efficiently, this research
proposed Heat Map Visualization to understand the characteristics of
each node, a method to place the network nodes based upon a 2D
sentiment word map, and an asterism graphic for the semantic
interpretation of each cluster.
• However, we did not consider the relation between color tones and
emotions in the design, so the users’ possible need to
connect a node’s color with an emotion is not yet satisfied.
• This research is expected to be adopted in other network systems,
since our method is applicable regardless of the number of review
data, and even to other media content such as web-based cartoons,
music, and books, using assorted constellation images related to
the target field.
Thank you for listening!