Bradford's Formula Itself
Enterprise & Intranet Search
• How Enterprise is different from Web search
• What to think about when evaluating
Enterprise Search
• How Intranet use is different from Web use
- And what that means for search
Intranet Differences
• Intranet = content inside the organization
• Learning from content, not for commerce
• Smaller content collections
- Smaller content subjects
- Smaller number of possible tasks or queries
• More document types than the Web (supports)
• Filtering could be more applicable
• Taxonomies may be present (& understandable)
- Work groups, locations, departments, projects
• Content managed in some way (culture or policy)
• Is the goal to discover tacit knowledge?
Differences in Intranet use
• Bandwidth
- Wireless too
• Security
- Financial work, Access policies
• New technology
- Mobile, high-resolution displays, …
• Legal
- Regulation (Sarbanes-Oxley)
- Privacy
• Cultural
- Adoption
- Revolution
One week of corporate search
• What are the patterns of search in a corporation?
- Big company 70K
- 740K documents
• 80% HTML, ~15% PDF
- Ultraseek engine
• 11-15 minute search sessions
• Small drop in Friday searching
• 71% of the 5644 users were active on only one work-week day
• User sessions
- 1.2 with 2.47 activities = infrequent
- 3.03 with 9.7 activities/day = frequent
• What interfaces & tools could increase use?
- Is increased searching a net good for knowledge workers?
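Figures like the share of users active on only one day can be derived from a raw search log. The following is a minimal sketch, not the study's actual method: it assumes a hypothetical log of (user, timestamp) pairs, and the function names and sample records are invented for illustration.

```python
from collections import defaultdict
from datetime import datetime


def active_days_per_user(log):
    """Map each user to the set of calendar dates on which they searched."""
    days = defaultdict(set)
    for user, timestamp in log:
        days[user].add(timestamp.date())
    return days


def share_single_day_users(log):
    """Fraction of users whose logged activity falls on exactly one day."""
    days = active_days_per_user(log)
    if not days:
        return 0.0
    single = sum(1 for dates in days.values() if len(dates) == 1)
    return single / len(days)


def activities_per_active_day(log):
    """Average number of logged activities per active day, for each user."""
    counts = defaultdict(int)
    for user, _ in log:
        counts[user] += 1
    days = active_days_per_user(log)
    return {user: counts[user] / len(days[user]) for user in counts}


# Hypothetical sample log, invented for illustration only.
sample_log = [
    ("u1", datetime(2024, 1, 8, 9, 0)),
    ("u1", datetime(2024, 1, 8, 9, 5)),
    ("u2", datetime(2024, 1, 8, 10, 0)),
    ("u2", datetime(2024, 1, 9, 11, 0)),
    ("u2", datetime(2024, 1, 10, 14, 0)),
    ("u3", datetime(2024, 1, 12, 16, 0)),
]

print(share_single_day_users(sample_log))
print(activities_per_active_day(sample_log))
```

A threshold on activities per day (hypothetical, to be chosen from the data) could then separate infrequent from frequent users.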
Is Enterprise IR different?
• Application Design
- Webify – Front Ends
- Web Services
- Application Service Providing
- (More) Database Integration
• (Even More) Integration Issues
- Content (CMS and Politics)
- Quality & Quantity
• Existing Design Guidelines
• More Specific Users
- One corporation
- The accounting department
• More Definable Goals
- Dictated by management
- Interaction with (all?) potential users
• Must Use
- Use Data
- Feedback for Verification
Enterprise Search
• Centralized & Measurable
- More Return on Investment
• Work tasks
• Easier to develop than Web-wide search
• Clarified
- Consistent
- Accurate
• Simplified Technology Platform
- More Open to Information Sharing
- IA Structures Help Define Organization (Goals)
• Extendable IA System
Intranet Search & Info Extraction
• Building a system specifically for knowledge
workers, vertical markets & types of users
• Do you think intranet search is different?
• IT workers spend 15-35% of work time
searching for information
• We need more than relevance as a measure of
specific tasks
- Question Answering: specific answers, not
keyword matching
- Categorizing user needs
• By user? Department? Job? Task?
• Satisficing results vs. the right answers
Information Desk
• Tasks
- Term definitions
- Homepages for (internal) groups or topics
- Experts
- Employee contact (personal) info
• Categorization of need
- Query text itself
- Resulting documents
- Selected documents
• Developed a hierarchy
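Categorizing a need from the query text itself could start with simple pattern rules. This is only a sketch under stated assumptions: the cue patterns and the function name are hypothetical, and the slides do not say how the categorization was actually done. The labels follow the four tasks listed above.

```python
import re

# Hypothetical cue patterns, checked in order. A real system would more
# likely derive cues from query logs and from the documents users select.
CATEGORY_PATTERNS = [
    ("term definition", re.compile(r"^(what is|define|definition of)\b")),
    ("expert", re.compile(r"\b(experts?|who knows|specialist)\b")),
    ("employee contact", re.compile(r"\b(phone|email|contact|extension)\b")),
    ("homepage", re.compile(r"\b(home ?page|team site|department|group)\b")),
]


def categorize_query(query):
    """Return the first category whose pattern matches, else 'other'."""
    text = query.strip().lower()
    for label, pattern in CATEGORY_PATTERNS:
        if pattern.search(text):
            return label
    return "other"


for q in ["What is a purchase order", "phone number for payroll",
          "accounting department homepage", "quarterly report"]:
    print(q, "->", categorize_query(q))
```

Rules on query text cover only the first signal on the slide. Resulting and selected documents would need click or result data that this sketch does not model.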
Categories of Search Needs
Analysis of Search Needs
• Query logs
- Information & navigational needs
- Home pages & Relevance (content)
• Survey
- How to’s & Downloads
- Technologies, products, services, groups, projects,
people
• More in-depth analysis possible (logs, more
questionnaire surveys)
• How different are these needs from Web
search?
Challenges in Enterprise Search
• Google (Web) is the worst enemy of
Enterprise search
• Content complexity: DBMS, non-linked docs,
email, CMS content, access levels,
servers/locations
• Ranking becomes more difficult with different
document types, metadata, systems
• Do we need Enterprise Metasearch?
- Enterprise, Federated, Web content ++ ?
- Corporate Web site, intranet, email, company
directory, forms, templates, reports…
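One way to picture enterprise metasearch is merging ranked lists from separate sources into a single list. The sketch below uses reciprocal rank fusion, a common rank-merging technique that the slides do not mention and that is swapped in here purely for illustration. The source names and document IDs are hypothetical, and k=60 is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of document IDs.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed and documents are sorted by total score.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-source rankings, invented for illustration.
intranet_hits = ["policy.pdf", "faq.html", "form-12.doc"]
email_hits = ["faq.html", "thread-88"]
directory_hits = ["jane.doe", "faq.html"]

merged = reciprocal_rank_fusion([intranet_hits, email_hits, directory_hits])
print(merged)
```

Rank-based fusion avoids comparing raw scores across engines, which matters when document types and ranking functions differ. It does not address access levels, which would have to be enforced per source before merging.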
Key IR research for Enterprises
• Defining an appropriate enterprise search test
collection
• Effective ranking over heterogeneous collections that
are characteristic of enterprise environments
• Portals for knowledge workers (intranet & internet?)
• Email search
• PageRank, relevance measures for internal
documents
• Understanding search context
• Future considerations for linked, internal media
• Multimedia*
• Web 2.0 features & document types*
• Crawling & updating strategies*
Solutions to Enterprise IR
• Designing linking mechanisms
- Based on use or (user generated) metadata
- Derive metadata & evaluate automation (e.g. email)
• Navigation in intranets (saves searching)
• APIs & open access
• Part of records management activities
• Intense focus on user evaluation &
development cycles
Final Projects & Papers
• Use class readings to prove your points
• Be daring with your ideas & state why you
think they’re right or interesting
• Cite any non-obvious facts
• Proofread your writing
• Be conscious of writing style & grammar
• Use APA or ACM style guidelines
• This should be a good contribution to your
portfolio of graduate work.