BI-8-Web Mining file


Business Intelligence and Analytics:
Systems for Decision Support
Global Edition
(10th Edition)
Chapter 8:
Web Analytics, Web Mining, and
Social Analytics
Web Mining
Overview
Web is the largest repository of data
Data is in HTML, XML, text format
Challenges (of processing Web data)
  The Web is too big for effective data mining
  The Web is too complex
  The Web is too dynamic
  The Web is not specific to a domain
  The Web has everything
Opportunities and challenges are great!
© Pearson Education Limited 2014
Web Mining
Web mining (or Web data mining) is the process of discovering intrinsic relationships from Web data (textual, linkage, or usage)
Is it the same as data mining on data generated on the Internet?
Web data?
  Content, Link, Log, …
Web Mining versus Web Analytics
  Look at the simple taxonomy on the next slide
Web Mining
[Figure: taxonomy of Web mining. Data Mining and Text Mining feed into WEB MINING, which has three branches:]
Web Content Mining
  Source: unstructured textual content of the Web pages (usually in HTML format)
Web Structure Mining
  Source: the uniform resource locator (URL) links contained in the Web pages
Web Usage Mining
  Source: the detailed description of a Web site's visits (sequence of clicks by sessions)
Application areas shown in the figure: Search Engines, Page Rank, Sentiment Analysis, Information Retrieval, Search Engine Optimization, Semantic Webs, Graph Mining, Social Network Analysis, Marketing Attribution, Customer Analytics, Social Analytics, Web Analytics, Clickstream Analysis, Social Media Analytics, 360 Customer View, Log Analysis
Web Content/Structure Mining
Mining the textual content on the Web
Data collection via Web Crawlers/Spiders
Web pages include hyperlinks
  Authoritative pages
  Hubs
  Hyperlink-induced topic search (HITS) algorithm
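The hub and authority idea behind HITS can be sketched in a few lines. This is a minimal illustration, not the textbook's implementation: the function name `hits`, the dictionary-based link graph, and the fixed iteration count are all assumptions made for the example. A page's authority score is the sum of the hub scores of the pages linking to it, and a page's hub score is the sum of the authority scores of the pages it links to; both are normalized after every pass.

```python
from math import sqrt

def hits(links, iterations=50):
    """Sketch of HITS. links: dict mapping a page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: add up the hub scores of all pages pointing here.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                new_auth[t] += hub[src]
        norm = sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}
        # Hub update: add up the authority scores of all pages this one points to.
        new_hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        norm = sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}
    return hub, auth

# Hypothetical three-page Web: pages "a" and "b" both link to page "c".
LINKS = {"a": ["c"], "b": ["c"], "c": []}
HUB, AUTH = hits(LINKS)
```

In this toy graph, "c" ends up as the only authority, while "a" and "b" share the hub role equally.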
Application Case 8.1
Identifying Extremist Groups with Web Link and Content Analysis
Questions for Discussion
1. How can Web link/content analysis be used to identify extremist groups?
2. What do you think are the challenges and the potential solutions to such intelligence gathering activities?
Search Engines
Google, Bing, Yahoo, …
For what reason do you use search engines?
A search engine is a software program that searches for documents (Internet sites or files) based on the keywords (individual words, multi-word terms, or a complete sentence) that users provide to describe the subject of their inquiry
They are the workhorses of the Internet
Structure of a Typical Internet Search Engine
[Figure: two cycles built around a Cached/Indexed Documents DB, which holds Metadata and an Index]
Development Cycle: Scheduler ➔ (List of URLs to Crawl) ➔ Web Crawler ➔ (Crawling the Web) ➔ World Wide Web ➔ (Unprocessed Web Pages) ➔ Document Indexer ➔ (Processed Pages) ➔ Cached/Indexed Documents DB
Responding Cycle: User ➔ (Search Query) ➔ Query Analyzer ➔ (Processed Query) ➔ Document Matcher/Ranker, which draws the List of Matched Pages from the DB ➔ (Ordered/Ranked Pages) ➔ User
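The two cycles can be illustrated with a toy in-memory search engine. This is only a sketch under stated assumptions: the pages, URLs, and function names (`tokenize`, `build_index`, `search`) are hypothetical, crawling is replaced by a fixed dictionary of page texts, and ranking uses plain term frequency rather than any real engine's scoring.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Development cycle: map each keyword to {url: term frequency}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][url] = count
    return index

def search(index, query):
    """Responding cycle: analyze the query, match pages, rank by summed term frequency."""
    scores = Counter()
    for term in tokenize(query):
        for url, count in index.get(term, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical "crawled" pages standing in for the World Wide Web.
PAGES = {
    "example.com/a": "Web mining discovers patterns in Web data",
    "example.com/b": "Data mining and text mining",
    "example.com/c": "Search engines crawl and index the Web",
}
INDEX = build_index(PAGES)
RESULTS = search(INDEX, "web mining")
```

Here `build_index` plays the role of the Document Indexer, and `search` combines the Query Analyzer with the Document Matcher/Ranker.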
Search Engine Optimization (SEO)
It is the intentional activity of affecting the visibility of an e-commerce site or a Web site in a search engine's natural (unpaid or organic) search results
Part of an Internet marketing strategy
Based on knowing how a search engine works
  Content, HTML, keywords, external links, …
Indexing based on …
  Webmaster submission of URL
  Proactively and continuously crawling the Web
Top 15 Most Popular Search
Engines (by eBizMBA, March 2013)
Application Case 8.3
Understanding Why Customers Abandon Shopping Carts Results in $10 Million Sales Increase
Situation
Problem
Solution
Results
Web Usage Mining
➔ Web Analytics!
Extraction of information from data generated through Web page visits and transactions…
  data stored in server access logs, referrer logs, agent logs, and client-side cookies
  user characteristics and usage profiles
  metadata, such as page attributes, content attributes, and usage data
Clickstream data, clickstream analysis
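A first step in clickstream analysis is usually turning raw access-log lines into per-visitor sessions. The sketch below assumes logs in the Common Log Format, treats the IP address as the visitor identity, and closes a session after 30 minutes of inactivity; all three are illustrative assumptions, and the sample log lines are made up.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed Common Log Format: ip - - [timestamp] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?:GET|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)

def parse_line(line):
    """Return (ip, timestamp, path) for a log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), ts, m.group("path")

def sessionize(lines, timeout=timedelta(minutes=30)):
    """Group page requests into sessions: one click path per visitor per visit."""
    hits_by_visitor = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            ip, ts, path = parsed
            hits_by_visitor[ip].append((ts, path))
    sessions = []
    for ip, hits in hits_by_visitor.items():
        hits.sort()
        current = [hits[0][1]]
        for (prev_ts, _), (ts, path) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:  # a long gap starts a new session
                sessions.append((ip, current))
                current = []
            current.append(path)
        sessions.append((ip, current))
    return sessions

SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2013:13:55:36 +0000] "GET /home HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2013:13:56:10 +0000] "GET /products HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2013:14:00:00 +0000] "GET /home HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2013:15:10:00 +0000] "GET /cart HTTP/1.1" 200 128',
]
SESSIONS = sessionize(SAMPLE_LOG)
```

Each resulting session is a click path that can feed further usage mining, such as finding frequent paths or pages where visits end.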
Web Usage Mining
Web usage mining applications
  Determine the lifetime value of clients
  Design cross-marketing strategies across products
  Evaluate promotional campaigns
  Target electronic ads and coupons at user groups based on user access patterns
  Predict user behavior based on previously learned rules and users' profiles
  Present dynamic information to users based on their interests and profiles
  …
Application Case 8.4
Allegro Boosts Online Click-Thru Rates
by 500 Percent with Web Analysis
Questions for Discussion
1. How did Allegro significantly improve
clickthrough rates with Web analytics?
2. What were the challenges, the proposed
solution, and the obtained results?
Web Analytics Metrics
Provides near-real-time data to deliver invaluable information to …
  Improve site usability
  Manage marketing efforts
  Better document ROI, …
Web analytics metric categories:
  Web site usability: How were they using my Web site?
  Traffic sources: Where did they come from?
  Visitor profiles: What do my visitors look like?
  Conversion statistics: What does all this mean for the business?
Web Analytics Metrics
- Web Site Usability
Web Site Usability
1. Page views
2. Time on site
3. Downloads
4. Click map
5. Click paths
Traffic Source
1. Referral Web sites
2. Search engines
3. Direct
4. Offline campaigns
5. Online campaigns
Web Analytics Metrics
- Web Site Usability
Visitor Profiles
1. Keywords
2. Content groupings
3. Geography
4. Time of day
5. Landing page
Conversion Statistics
1. New visitors
2. Returning visitors
3. Leads
4. Sales/conversions
5. Abandonment rates
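Several of the conversion statistics above reduce to simple ratios. The sketch below uses commonly used definitions that are assumptions here, not taken from the slides: conversion rate as purchases divided by visits, and cart abandonment rate as the share of carts that did not end in a purchase. The record layout and the function name `conversion_stats` are hypothetical, and every purchase is assumed to come from a visit that also added to the cart.

```python
def conversion_stats(visits):
    """visits: list of dicts with boolean keys 'returning', 'added_to_cart', 'purchased'."""
    total = len(visits)
    returning = sum(v["returning"] for v in visits)
    carts = sum(v["added_to_cart"] for v in visits)
    purchases = sum(v["purchased"] for v in visits)
    return {
        "new_visitors": total - returning,
        "returning_visitors": returning,
        # Assumed definition: share of all visits that ended in a purchase.
        "conversion_rate": purchases / total if total else 0.0,
        # Assumed definition: share of carts that were never checked out.
        "cart_abandonment_rate": (carts - purchases) / carts if carts else 0.0,
    }

# Made-up visit records for illustration only.
VISITS = [
    {"returning": False, "added_to_cart": True, "purchased": True},
    {"returning": True, "added_to_cart": True, "purchased": False},
    {"returning": False, "added_to_cart": False, "purchased": False},
    {"returning": True, "added_to_cart": True, "purchased": False},
]
STATS = conversion_stats(VISITS)
```

With these four sample visits, one of three carts is purchased, so two thirds of carts count as abandoned.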
Web Analytics Maturity Model
Maturity ➔ degree of proficiency, formality, and optimization of business models
Business Intelligence Maturity Model (TDWI)
  Management Reporting ➔ Spreadmarts ➔ Data Marts ➔ Data Warehouse ➔ Enterprise Data Warehouse ➔ BI Services
Business Analytics Maturity Model (INFORMS)
  Descriptive Analytics ➔ Predictive Analytics ➔ Prescriptive Analytics
Web analytics maturity model ➔ next slide…
Web Analytics Tools
Plenty of them exist, and numbers are increasing (Web-based versus downloadable)
  Google Web Analytics (google.com/analytics)
  Yahoo! Web Analytics (web.analytics.yahoo.com)
  Open Web Analytics (openwebanalytics.com)
  Piwik (piwik.org)
  FireStats (firestats.cc)
  Site Meter (sitemeter.com)
  Woopra (woopra.com)
  AWStats (awstats.org)
  Snoop (reinvigorate.net) …
Social Media
Definitions and Concepts
Enabling technologies of social interactions among people
Relies on enabling technologies of Web 2.0
Takes on many different forms
  Internet forums, Web logs, social blogs, microblogging, wikis, social networks, podcasts, pictures, video, and product reviews
Different types of social media
  Based on media research and social process
Different Types of Social Media
1. Collaborative projects (e.g., Wikipedia)
2. Blogs and microblogs (e.g., Twitter)
3. Content communities (e.g., YouTube)
4. Social networking sites (e.g., Facebook)
5. Virtual game worlds (e.g., World of Warcraft), and
6. Virtual social worlds (e.g., Second Life)
--Kaplan and Haenlein (2010)
Social versus Industrial Media
Web-based social media are different from traditional/industrial media, such as newspapers, television, and film
Differentiating characteristics
  Quality
  Reach
  Frequency
  Accessibility
  Usability
  Immediacy
  Updatability
How Do People Use Social Media?
Different engagement levels
[Figure: Level of Social Media Engagement plotted against Time, with the levels Creators, Critics, Joiners, Collectors, Spectators, Inactives]
Social Media Analytics
It refers to the systematic and scientific ways to consume the vast amount of content created by Web-based social media outlets, tools, and techniques for the betterment of an organization's competitiveness
Fastest growing movement in analytics
[Figure: Social Media (Twitter, Facebook, LinkedIn, …) ➔ Insights, Solutions, Course of Actions, …]
Social Media Analytics
HBR Analytic Services survey (HBR, 2010)
  75% of the companies did not know where their customers are talking about them
  31% do not measure effectiveness of social media
  Only 23% are using social media analytics tools
  7% are able to integrate social media into marketing
Measuring the Social Media Impact
  Descriptive analytics: simple counts/statistics
  Social network analysis
  Advanced analytics: predictive analytics, text mining
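The first two measurement levels can be sketched on a handful of posts. Everything here is illustrative: the posts and account names are made up, `descriptive_counts` shows the "simple counts" level, and `mention_in_degree` is one assumed, very basic social network measure (how many distinct authors mention an account), not a method prescribed by the slides.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")

def descriptive_counts(posts):
    """Descriptive analytics: number of posts per author."""
    return Counter(author for author, _ in posts)

def mention_in_degree(posts):
    """Social network analysis sketch: distinct authors mentioning each account."""
    mentioned_by = {}
    for author, text in posts:
        for name in MENTION.findall(text):
            if name != author:  # ignore self-mentions
                mentioned_by.setdefault(name, set()).add(author)
    return Counter({name: len(authors) for name, authors in mentioned_by.items()})

# Hypothetical posts as (author, text) pairs.
POSTS = [
    ("ana", "Loving the new phone from @acme"),
    ("ben", "@acme support was slow today"),
    ("ana", "Thanks @ben for the tip"),
    ("cy", "Agree with @ben, and @acme should fix it"),
]
POST_COUNTS = descriptive_counts(POSTS)
IN_DEGREE = mention_in_degree(POSTS)
```

Accounts with a high in-degree are candidates for the "most powerful influencers" that a fuller social network analysis would examine more carefully.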
Best Practices in Social Media Analytics
Think of measurement as a guidance system, not a rating system
Track the elusive sentiment
Continuously improve the accuracy of text analysis
Look at the ripple effect
Look beyond the brand
Identify your most powerful influencers
Look closely at the accuracy of your analytic tool
Incorporate social media intelligence into planning
Social Media Analytics Tools and Vendors
Attensity360
Radian6/Salesforce Cloud
Sysomos
Collective Intellect
Webtrends
Crimson Hexagon
Converseon
SproutSocial
Twitter
Facebook
YouTube
LinkedIn
Flickr
…