How To Read Research Papers


Federated & Meta Search
• What are they?
- Environment
• Library (institutional), Everywhere (Web)
- Content
• Web, Databases, Catalogs (books), (numerical) data
- Users
• Researchers, Students, Academics, Anyone
• How are they used?
- Comparing results
- Widest possible information set for retrieval
Do you use Metasearch?
• For research?
- Research papers
- General information seeking
• When shopping?
- Trips
- Books
• Technical support (help)?
What else?
What are Digital Libraries?
• What’s not a digital library?
- The Web, Lexis-Nexis, UTNetCAT, ACM DigLib, YouTube, Amazon.com, your laptop’s hard drive?
• Users think they’re content
• Librarians think they’re institutions & services
- Are they digital content only?
- An easier, digital way to find physical content or help?
• “Content, collections & communities”
- How do all of these fit together for Info Retrieval?
- Organizing everything for effective retrieval seems to be the key challenge
- Making everything (possible) searchable is the key feature for users
• Metasearch is the key to Digital Libraries
Digital Library = Virtual Library?
• Freely available Web content is a pretty good digital library
• Your own content is a good library (for re-finding content)
• Databases & Indexes are traditional library content, now more digital
• Should it matter where the content is?
• Costs?
• Findability?
• Scalability?
Federated Search
• Everything is accessible
• Legal issues & pricing are coordinated
• Clustering & redundant information are processed accordingly (cheapest first?)
• Query syntax is universal & transformed for each dataset
• Databases, catalogs & text
• Relevancy is weighted & precise
• Multiple vendors & open access sources
- A balance? How “deep” in the deep Web?
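
The "universal query syntax, transformed for each dataset" idea can be sketched as follows. This is only an illustration under stated assumptions: the `Query` class, the two target syntaxes (`intitle:` for a Web engine, `kw=` / `ti=` for a catalog) and the `translate` helper are hypothetical and do not describe any particular vendor's interface.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Query:
    """A source-neutral query: keyword terms plus an optional title phrase."""
    terms: Tuple[str, ...]
    title: Optional[str] = None


def to_web_syntax(q: Query) -> str:
    # Hypothetical Web-engine syntax: space-separated terms, intitle:"..." filter.
    parts = list(q.terms)
    if q.title:
        parts.append(f'intitle:"{q.title}"')
    return " ".join(parts)


def to_catalog_syntax(q: Query) -> str:
    # Hypothetical catalog syntax: fielded Boolean search.
    clause = "kw=(" + " AND ".join(q.terms) + ")"
    if q.title:
        clause += f' AND ti="{q.title}"'
    return clause


# One translator per dataset; a federated layer would register one per source.
TRANSLATORS = {"web": to_web_syntax, "catalog": to_catalog_syntax}


def translate(q: Query) -> Dict[str, str]:
    """Turn one universal query into the native syntax of every source."""
    return {name: fn(q) for name, fn in TRANSLATORS.items()}
```

The user writes the query once; each source receives it in its own dialect, so adding a new dataset only means adding one translator function.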
Web Dynamics & Metasearch
• Different documents have many different characteristics
- Web documents vs. other types of content
- Links, Metadata, Genre, Dynamically changing
• How well is the Web indexed?
- In terms of completeness? 60%?
• Metasearch is an index of the indices
- Parallel queries are not always the same
• Are special-purpose search engines a better idea?
- Google Scholar vs. Google
• Is Personalized (meta) search the answer?
- Special purpose is your purpose
- Relevance, ranking & importance
- Pricing, availability, locality
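
A minimal sketch of "an index of the indices": the same query is sent to several engines in parallel and the ranked lists are merged. The two engines here are stand-in functions returning fixed lists, and the merge uses reciprocal rank fusion, a technique chosen for this illustration rather than one named in the slides; the constant `k=60` is a conventional default, assumed here.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-in engines (hypothetical): each returns a ranked list of URLs.
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/x", "http://a.example/2"]


def engine_b(query):
    return ["http://shared.example/x", "http://b.example/1"]


def metasearch(query, engines, k=60):
    """Query all engines in parallel, then merge by reciprocal rank fusion."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        ranked_lists = list(pool.map(lambda engine: engine(query), engines))

    scores = {}
    for results in ranked_lists:
        for rank, url in enumerate(results, start=1):
            # A URL gains score from every engine that returned it.
            scores[url] = scores.get(url, 0.0) + 1.0 / (k + rank)

    # Highest fused score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

A result returned by both engines rises to the top even if neither ranked it first, which is one way of "comparing results" across indices that do not agree.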
Categorizing Web search results
• The interface on metasearch may be more important to users than the content
- Understanding results over finding (all) content
• Show results in context - use categories
- Understanding searches
- Building a taxonomy for results
- Customized for each result set?
- Show when there aren’t any results
- When results don’t rank high enough
• Do we need more overviews for results?
- Visualization for clustering
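
Showing results in context, including categories that came back empty, might look like the sketch below. The taxonomy, its keyword lists and the substring-matching rule are all assumptions made for this example; a real system would build or choose the taxonomy per result set.

```python
# Hypothetical taxonomy: category name -> keywords that signal it.
TAXONOMY = {
    "Research papers": ("paper", "journal", "proceedings"),
    "Shopping": ("price", "buy", "store"),
    "Technical support": ("help", "faq", "troubleshoot"),
}


def categorize(results, taxonomy=TAXONOMY):
    """Assign each result title to the first category whose keyword it contains."""
    grouped = {name: [] for name in taxonomy}
    grouped["Other"] = []
    for title in results:
        lowered = title.lower()
        for name, keywords in taxonomy.items():
            if any(word in lowered for word in keywords):
                grouped[name].append(title)
                break
        else:
            # No category matched this title.
            grouped["Other"].append(title)
    return grouped


def overview(grouped):
    """One line per category, keeping empty ones so users see what was not found."""
    return [f"{name} ({len(items)})" for name, items in grouped.items()]
```

Keeping the zero-count categories in the overview tells the user that a category was searched and nothing matched, which is different from the category not existing.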
Category Building for Search
• How deep, shallow, lean or rich should categories be?
- Should the content be the main criterion for categories?
- Host, links, user perspective, genre?
• What features of content should be used to cluster results?
- For a metasearch?
Fast-feature categorization
• Online lean techniques
- DNS, time visited, format, language, size, index date
• Online rich techniques
- Fit to existing categories such as ODP, Yahoo!, Music, Gov, Inventory
• Offline techniques
- Directory hierarchy
- Query probing
• Results, pages, words, (category) nodes, depth & type of hierarchy
• Understanding the content is critical
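
The "online lean" features above are cheap because they can be read from the result URL alone, without fetching the page. The sketch below derives host, top-level domain and format from a URL and groups results by one of them. The function names are hypothetical, and treating a missing file extension as `html` is an assumption of this example.

```python
from collections import defaultdict
from posixpath import splitext
from urllib.parse import urlsplit


def lean_features(url):
    """Extract cheap features from the URL only (no page fetch)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    # Assumption: a path without an extension is treated as an HTML page.
    fmt = splitext(parts.path)[1].lstrip(".").lower() or "html"
    return {"host": host, "tld": tld, "format": fmt}


def group_by(urls, feature):
    """Cluster result URLs by one lean feature, e.g. 'tld' or 'format'."""
    groups = defaultdict(list)
    for url in urls:
        groups[lean_features(url)[feature]].append(url)
    return dict(groups)
```

Grouping by `tld` separates `.edu` from `.gov` results at almost no cost; the rich and offline techniques need the page content or a prebuilt hierarchy and are correspondingly slower.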
Yahoo! Cataloging the Web
• A non-automated technique
• How do information professionals build an “index” of the Web?
• Cataloging applies to the Web
• Indexing with synonyms
• Browsing indexes vs searching them
• Comprehensive index not the goal
- Quality
- Information Density
• Yahoo’s own ontology – points to site for full info
• Subject Trees with aliases (@) to other locations
• “More like this” comparisons as checksums
Yahoo uses tools for indexing
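
One possible reading of "subject trees with aliases (@)" is a directory in which an entry ending in `@` does not hold sites itself but points to the category's real location elsewhere in the tree. The sketch below models that reading; the category paths and the `resolve` helper are hypothetical, not Yahoo's actual data or tooling.

```python
# Hypothetical directory: real categories map to site lists,
# alias entries (key ends with "@") map to the real category path.
DIRECTORY = {
    "Reference/Libraries/Digital Libraries": ["ACM Digital Library"],
    "Computers/Internet/Digital Libraries@": "Reference/Libraries/Digital Libraries",
}


def resolve(path, directory, max_hops=10):
    """Follow '@' alias entries until a real category is reached."""
    for _ in range(max_hops):
        if path in directory:
            return path, directory[path]
        alias_key = path + "@"
        if alias_key not in directory:
            raise KeyError(path)
        # Jump to the location the alias points at and try again.
        path = directory[alias_key]
    raise RuntimeError("alias chain too long")
```

With aliases, a category can be browsed from several places in the tree while its sites are maintained in only one.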
More metasearch tools
• Scroogle
• Thumbshots.org Ranking
• Jux2
• Search Engine Relationship Chart