Crowdsourcing
Ling 240
What is crowdsourcing?
Crowdsourcing—definition
“the practice of obtaining information or services by soliciti
Examples:
• Wikipedia
• Google Translate
• FamilySearch Indexing
COCA's registers based on publication
Crowdsourcing
• What are the benefits of collecting data through crowdsou
• What are the limitations/weaknesses?
• What can be done to ensure that crowdsourcing workers a
Crowdsourcing in linguistics
• Wilhelm Kaeding (1897)
• Thousands of non-experts helped compile and analyze an 11 millio
• Oxford English Dictionary (1858 – 1928)
• Hundreds of non-expert readers submitted 6 million quotation slip
• Perceptual dialectology
• Dialect perceptions elicited from non-experts
Mechanical Turk (Amazon)
• Strengths
• Inexpensive
• Fast
• Quality control
• Access to thousands of people
• Growing body of research strongly supports the quality of
• E.g., Buhrmester et al., 2011; Kittur et al., 2008; Suri & Watts, 2011; Urbano et a
Case study: Register classification
• Traditional ‘user’-based approach
• ‘Expert’ classifies texts into registers by simply sampling from the
• Limitations
• ‘Publication type’ is not a meaningful criterion for web documents
• Experts can’t agree on register category for internet texts
Corpus
• Extracted from the Corpus of Global Web-based English (GloWbE),
• (Near) random sampling methods used to build the corpus
• Google searches of highly frequent English 3-grams (e.g., is not the, and from the) used
• 800-1000 links for each n-gram (i.e., 80-100 Google results pages)
• Davies randomly extracted c. 49,300 URLs from GloWbE
• Only web pages from USA, UK, Canada, Aus., and NZ
• Documents < 75 words were excluded
• Non-textual material was removed from all web pages (HTML scrubbing and boilerplate
• 1,445 URLs were excluded from subsequent analysis because they c
• Final corpus for the study: 48,555 web documents.
People asked to determine mode of passage, then participan
Crowdsourcing end-user data: Classification
• Developed a computer-adaptive survey for register classif
• Tested the tool through 10 rounds of piloting, resulting in
• Recruited 908 raters through Mechanical Turk
• 6 responses x 4 raters x 49,300 texts = 1.2 million individu
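The workload arithmetic on this slide can be checked in a few lines. This is only a sketch of the multiplication shown above; the constant names are made up for illustration, and the per-rater average assumes (hypothetically) that ratings were spread evenly over the 908 raters.

```python
# Figures taken from the slide; the variable names are illustrative only.
RESPONSES_PER_RATING = 6
RATERS_PER_TEXT = 4
TEXTS = 49_300
RATERS = 908

# Total individual responses collected across all texts and raters.
total_responses = RESPONSES_PER_RATING * RATERS_PER_TEXT * TEXTS

# Assumption: an even split of ratings across raters (the slide does not say this).
ratings_total = RATERS_PER_TEXT * TEXTS
mean_ratings_per_rater = ratings_total / RATERS

print(f"{total_responses:,} responses (about {total_responses / 1e6:.1f} million)")
print(f"about {mean_ratings_per_rater:.0f} texts rated per rater on average")
```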
Agreement results for the general register classificat
(Fleiss’ Kappa = .47, moderate agreement)
Agreement pattern    Documents    Percent
4 agree                 17,511      36.4%
3 agree                 15,684      32.6%
2-2 split                5,682      11.8%
2-1-1 split              8,515      17.7%
No agreement               755       1.6%
• 69% of documents achieved majority agreement
• Additional 11.8% are potential 2-way hybrids
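A minimal sketch of how agreement figures like these could be computed from four raters' labels per document. The function names and the toy register labels are hypothetical and not taken from the study; `fleiss_kappa` follows the standard textbook formula and assumes every document has the same number of raters.

```python
from collections import Counter

def agreement_pattern(labels):
    """Name the agreement pattern among four raters' labels for one text."""
    if len(labels) != 4:
        raise ValueError("expected exactly four labels")
    counts = tuple(sorted(Counter(labels).values(), reverse=True))
    patterns = {
        (4,): "4 agree",
        (3, 1): "3 agree",
        (2, 2): "2-2 split",
        (2, 1, 1): "2-1-1 split",
        (1, 1, 1, 1): "no agreement",
    }
    return patterns[counts]

def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of category labels."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    observed_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of rater pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        observed_sum += agreeing / (n_raters * (n_raters - 1))
    p_observed = observed_sum / n_items
    # Agreement expected by chance from the overall category proportions.
    p_expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)

# Toy data with made-up register labels, one inner list per document.
toy = [
    ["narrative", "narrative", "narrative", "narrative"],
    ["opinion", "opinion", "opinion", "narrative"],
    ["opinion", "opinion", "discussion", "discussion"],
]
print(Counter(agreement_pattern(doc) for doc in toy))
print(round(fleiss_kappa(toy), 2))

# Counts reported on the slide: share of documents with majority agreement.
reported = {"4 agree": 17_511, "3 agree": 15_684, "2-2 split": 5_682,
            "2-1-1 split": 8_515, "no agreement": 755}
majority_share = (reported["4 agree"] + reported["3 agree"]) / sum(reported.values())
print(f"{majority_share:.0%} majority agreement")
```

Kappa corrects raw agreement for chance, which is why it can look modest (.47 here) even when most documents reach a majority label.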
Frequencies of general register categories
(i.e., documents where 3 or 4 raters were in agreement)
Systematic patterns of disagreement
• 28 different 2-2 combinations are possible in theory
• But, only 7 of those combinations occurred > 100 times in o
• Because these are widely attested user-based patterns, we
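A small sketch of the counting behind these bullets. The figure of 28 matches the number of unordered pairs from 8 categories, so 8 general register categories is an inference here, not something the slides state. The helper names and the threshold parameter are hypothetical.

```python
from collections import Counter
from math import comb

# Assumption: 8 general register categories, inferred from 28 = C(8, 2).
N_CATEGORIES = 8
possible_pairs = comb(N_CATEGORIES, 2)

def hybrid_pair(labels):
    """Return the sorted pair of categories for a 2-2 split, else None."""
    counts = Counter(labels)
    if sorted(counts.values()) == [2, 2]:
        return tuple(sorted(counts))
    return None

def frequent_hybrids(all_labels, threshold):
    """Count 2-2 pairs across documents and keep those occurring more than threshold times."""
    pair_counts = Counter(
        pair for pair in map(hybrid_pair, all_labels) if pair is not None
    )
    return {pair: n for pair, n in pair_counts.items() if n > threshold}

# Toy documents with made-up labels; the study used a threshold of 100.
docs = [
    ["narrative", "opinion", "opinion", "narrative"],
    ["opinion", "narrative", "narrative", "opinion"],
    ["narrative", "narrative", "narrative", "opinion"],
]
print(possible_pairs)
print(frequent_hybrids(docs, threshold=1))
```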
Frequencies of 2-way hybrids that occu
Multi-Dimensional analysis
• Factor analysis to identify dimensions based on co-occurre
• Interpret dimensions functionally
• Calculate scores for each text on each dimension
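The last step, scoring each text on a dimension, is commonly done by standardizing each feature's rate across the corpus and then subtracting the summed z-scores of the negative features from those of the positive features. The sketch below assumes that procedure; it does not perform the factor analysis itself, and the data layout, function name, and toy rates are invented for illustration.

```python
from statistics import mean, pstdev

def dimension_scores(rates, positive, negative):
    """Score each text: sum of z-scores for positive features minus negative ones.

    rates maps text id -> {feature name: normalized rate of occurrence}.
    """
    stats = {}
    for feature in list(positive) + list(negative):
        values = [text_rates[feature] for text_rates in rates.values()]
        stats[feature] = (mean(values), pstdev(values))

    def z(text_id, feature):
        mu, sd = stats[feature]
        # A feature with no variation carries no information about the text.
        return 0.0 if sd == 0 else (rates[text_id][feature] - mu) / sd

    return {
        text_id: sum(z(text_id, f) for f in positive)
        - sum(z(text_id, f) for f in negative)
        for text_id in rates
    }

# Toy rates per 1,000 words (made-up numbers).
toy_rates = {
    "chat": {"contractions": 30.0, "nouns": 150.0},
    "report": {"contractions": 2.0, "nouns": 300.0},
}
scores = dimension_scores(toy_rates, positive=["contractions"], negative=["nouns"])
print(scores)
```

Texts with many positive-loading features and few negative-loading ones end up at the high end of the dimension, which is what allows registers to be compared along it.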
Features adopted from Biber:
Positive features:
Verbs: present tense verbs, mental verbs, do as pro-verb, be as main ver
Pronouns: 1st person pronouns, 2nd person pronouns, it, demonstrative
Adverbs: general emphatics, hedges, amplifiers
Dependent clauses: that complement clauses (with that deletion), caus
Other: contractions, analytic negation, discourse particles, sentence rel
==================================
Negative features:
Nouns, long words, prepositional phrases, attributive adjectives, lexical d
The results
• Linguistic (use-based) variation across user-based register
Web registers along Dimension 1
What have we learned?
• Non-expert users can reliably classify web documents
• At least 1 in 10 internet texts belongs to a hybrid register c
• Publication type ≠ register (at least for the web)
• E.g., blogs showed up in several register categories
• Triangulating end-user classifications with linguistic analys
research: Next steps
• Comprehensive linguistic description of the patterns of registe
• A new multi-dimensional analysis of web registers
• Detailed linguistic descriptions of ‘unique’ web registers
• Automatic prediction of register (‘AGI’)
• Automatically coded large corpus of web documents
• Extend descriptions to include ‘private’ web registers
Areas for future user-based research
• Register classification of printed texts
• Reader/listener perceptions
• Corpus annotation
• Word sense disambiguation
5. The future of crowdsourcing in user
• User-based analyses have always happened; now we can d
• Triangulating use-based linguistic data offers a more comp
• Linguists are often unable to fully analyze and interpret pa
• Harnessing the power of user-based data via crowdsourcin
Mechanical Turk
• The name comes from an 18th-century machine that playe
• A person actually hid inside and played
Mechanical Turk
• Amazon's Mechanical Turk is a crowdsourcing tool.
• Researchers who need human evaluation can get data
• People who want to make some money help with the proj
– Image recognition
– Speech processing
– Subjective evaluation
– Giving opinions
– Tagging corpora
– Match picture with product
Mechanical Turk
• Example: word sense disambiguation in corpora
– What should head be tagged as? Noun or verb?
– What does head mean in a sentence?
• They charged the head of finances with the crime. (
• The beer was flat with no head. (froth)
• They were going head first. (manner of movement)
• Computers can't do it well but people can
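One simple way to turn several workers' judgments into a single sense tag is a majority vote. The sketch below assumes that approach; the function name, the `min_share` parameter, and the toy votes are hypothetical, and nothing here calls the Mechanical Turk service.

```python
from collections import Counter

def majority_sense(votes, min_share=0.5):
    """Return the sense chosen by more than min_share of workers, else None."""
    if not votes:
        return None
    sense, count = Counter(votes).most_common(1)[0]
    return sense if count / len(votes) > min_share else None

# Made-up worker votes for "The beer was flat with no head."
votes = ["froth", "froth", "froth", "leader"]
print(majority_sense(votes))

# A tie yields no tag, so the item can be sent out for more judgments.
print(majority_sense(["froth", "leader"]))
```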
How does it work?
Couldn't people cheat?
After reviewing results, the requester can reject a worker's submission
When rejected, the worker doesn't get paid
Workers have approval rates
Requesters can choose only workers with good approval rates
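The approval-rate mechanism described above can be modeled in a few lines. This is a toy model, not the Mechanical Turk API: the `Worker` class, `review`, `eligible`, and the 0.95 cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    worker_id: str
    approved: int = 0
    rejected: int = 0

    def approval_rate(self):
        """Share of reviewed submissions that were approved, or None if none reviewed."""
        reviewed = self.approved + self.rejected
        return self.approved / reviewed if reviewed else None

def review(worker, accepted, reward):
    """Record the requester's decision and return the payment owed."""
    if accepted:
        worker.approved += 1
        return reward
    worker.rejected += 1
    return 0.0  # rejected work is not paid

def eligible(workers, min_rate=0.95):
    """Keep workers whose approval rate meets the requester's cutoff."""
    return [
        w for w in workers
        if w.approval_rate() is not None and w.approval_rate() >= min_rate
    ]

workers = [Worker("w1", approved=99, rejected=1), Worker("w2", approved=8, rejected=2)]
print([w.worker_id for w in eligible(workers)])
```

Because a rejection both withholds payment and lowers the worker's rate, careless work costs the worker access to future tasks.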
Advantages
Thousands of potential workers available
You can get results fast
Demographic variety (not just undergrads)
Cheap (average $1.40 per hour)
Disadvantages
Cheating
Some studies show it occurs at the same rate as in the lab
Ways to test
“While exercising how often have you had a fatal heart attack?”
It requires money
Can't do many types of experiments (e.g., reaction time)
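The catch question under "Ways to test" can be used to screen out inattentive submissions before analysis. A minimal sketch, assuming each submission is a dictionary of question text to answer and that "never" is the only acceptable answer; both the data layout and the answer label are hypothetical.

```python
CATCH_QUESTION = "While exercising how often have you had a fatal heart attack?"
VALID_ANSWER = "never"  # assumption: label of the only sensible response option

def passes_attention_check(response):
    """True if the worker gave the only sensible answer to the catch question."""
    answer = response.get(CATCH_QUESTION, "")
    return answer.strip().lower() == VALID_ANSWER

def screen(responses):
    """Split submissions into those that pass and those that fail the check."""
    kept = [r for r in responses if passes_attention_check(r)]
    dropped = [r for r in responses if not passes_attention_check(r)]
    return kept, dropped

# Made-up submissions.
submissions = [
    {"worker": "w1", CATCH_QUESTION: "Never"},
    {"worker": "w2", CATCH_QUESTION: "Sometimes"},
    {"worker": "w3"},  # skipped the question
]
kept, dropped = screen(submissions)
print([r["worker"] for r in kept], [r["worker"] for r in dropped])
```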
Go look at it
Mechanical Turk website