Aurora: A Conceptual Model
for Web-content Adaptation
to Support the Universal
Accessibility of Web-based
Services
Anita W. Huang, Neel Sundaresan
Presented by Allan Spale – EECS 578
Introduction
• The World Wide Web is a place for
information and commerce
• Electronic information distribution
removes previous accessibility barriers
– Flexible presentation of information
• The use of HTML on Web pages strips that
information of its meaning and functionality
Problems Using HTML
• Makes “comprehension, navigation, and
input difficult or completely impossible”
– Literal content
– Services
Improving Web Accessibility
• Low-level accessibility
– Provide alternatives to different media
types
• High-level accessibility
– Make Web services in a service domain
accessible to a large audience
Description of Aurora
• Aurora provides high-level accessibility
– Analyzes web objects according to their
functions within a particular domain of Web
pages
– Based on a transaction model
• Provides a framework for encapsulating
general goals within a service domain
• “[P]rovides a set of schemas that describes
how a user obtains the identified services”
Transaction Model and XML
• Converts web data in service domains
into XML
• XML data is input to interface adaptors
• Each interface adaptor creates the new
Web page
User Scenario
• Blind user visits an on-line auction site
• Semantic obstacles appear on the page
to “hide” information necessary to make
a bid
– Aurora can improve this situation
• Access the site using Aurora
• Aurora will render the page in a format
acceptable to the user
Electronic Information
Accessibility
• Web Accessibility
– “…add provisions to existing Web pages.”
– Focus on low-level issues
• Provide alternate presentation forms for
different electronic media
Electronic Information
Accessibility
• Adaptive Hypermedia
– Offers adaptive measures for “new Web-based information systems”
– Challenges
• Incorporating each user’s goals into a user
model
• Structuring of information to permit translation
across presentation formats
Electronic Information
Accessibility
• Wrapper generation
– Wrapper applications have two roles
• Extract information from data
• Reorganize the data into structured forms
Aurora’s Transaction Model
• Specifies the user’s abstract goals
• “Scrapes” information from the Web page
relevant to the user goals
• Relevant to a specific service domain
– Common services
– Sequences of tasks to receive services
– Declaration of specific steps to accomplish tasks
Transaction Model Specifics
• Services
– Analyze a service domain to determine “a
discrete, common set of abstract user
goals”
• Transactions
– Tasks to be done to receive a service
Transaction Model Specifics
• Task Hierarchical Work-Flow Model
– Create a node for each step
– Connect the nodes according to
sequences of steps
– Label transitions between nodes where
appropriate
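The work-flow model above can be sketched as a small labeled graph. This is only an illustration: the step names, the transition labels, and the helper names (StepNode, build_auction_workflow, follow) are assumptions made for this sketch and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StepNode:
    """One step of a transaction (a node in the work-flow graph)."""
    name: str
    # Maps a transition label to the name of the next step.
    transitions: Dict[str, str] = field(default_factory=dict)


def build_auction_workflow() -> Dict[str, StepNode]:
    """Hypothetical bidding transaction: search, view item, enter bid, confirm."""
    steps = [
        StepNode("search", {"select item": "view_item"}),
        StepNode("view_item", {"place bid": "enter_bid", "back": "search"}),
        StepNode("enter_bid", {"submit": "confirm"}),
        StepNode("confirm"),
    ]
    return {step.name: step for step in steps}


def follow(workflow: Dict[str, StepNode], start: str, labels: List[str]) -> str:
    """Walk the graph from `start`, taking one labeled transition per entry."""
    current = start
    for label in labels:
        current = workflow[current].transitions[label]
    return current


workflow = build_auction_workflow()
final_step = follow(workflow, "search", ["select item", "place bid", "submit"])
```

Keeping the transitions as labeled edges means a site that needs an extra step only adds a node and an edge; the rest of the sequence is unchanged.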
Web Content Classification
• Transaction model tracks each page’s
function in relation to the user goal
– Model applies to sets of web pages
• Transaction model used to transform
content without altering Web pages
• Can provide additional structure to data
at the source
Benefits for Universal
Usability
• Consistency
– Currently web sites differ in many ways
from one another
– Transaction model reduces this problem
• Consistent interface
• “[M]odel specifies a common set of goal-orientated transactions for each service
domain”
Benefits for Universal
Usability
• Simplicity
– Web pages usually contain some irrelevant
information in relation to user goals
– Solution
• “Scrape” information from the Web page
according to the transaction model that
encapsulates the user goal
Benefits for Universal
Accessibility
• Adaptability
– Semantic information is only implied by a
page’s visual appearance
– Solution
• Use a transaction model to extract functional
semantics and add semantic markup
• Interface adaptors take output from transaction
model to create presentations for a user group
XML Framework
• eXtensible Markup Language used for
creating structured data
• The transaction model that maps web
objects can be stored using a DTD
(Document Type Definition)
• XML data will maintain the functional
semantics which will allow interface
adaptation
Using DTDs for Transaction
Schemas
• Describe abstract tasks for every
service goal
• Contain the semantics and sequence
of task steps
• Combined with the “scraped” Web page
data, they let Aurora write the transaction
document in XML
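A minimal sketch of how a schema and scraped data might combine into a transaction document. A plain Python list stands in for the DTD here, and the element names (transaction, step), the schema contents, and the function name are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: the ordered task steps for a "bid" service goal.
# In Aurora this information would come from a DTD; a list stands in for it.
BID_SCHEMA = ["item", "current_price", "bid_form"]


def write_transaction_document(service, schema, scraped):
    """Build an XML transaction document from schema steps and scraped values."""
    root = ET.Element("transaction", {"service": service})
    for position, step in enumerate(schema, start=1):
        node = ET.SubElement(root, "step", {"name": step, "order": str(position)})
        # Steps with no scraped value are kept so the sequence stays complete.
        node.text = scraped.get(step, "")
    return ET.tostring(root, encoding="unicode")


scraped_data = {"item": "Vintage camera", "current_price": "$42.00"}
document = write_transaction_document("bid", BID_SCHEMA, scraped_data)
```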
Operation of Aurora
Using XML
• User requests a Web page
• Aurora re-creates the page in three steps
– Downloads the web page
– Extracts information and objects
– Creates XML document using a DTD
• Aurora uses the interface adaptor on the XML
document to create the new HTML page
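The request cycle above, sketched end to end. The download is replaced by a canned page so the example runs offline, regular expressions stand in for Aurora's real extraction rules, and every function and element name is an assumption made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

SAMPLE_PAGE = """<html><body>
<div class="ad">Buy more stuff!</div>
<h1>Vintage camera</h1>
<span id="price">$42.00</span>
</body></html>"""


def download(url):
    """Stand-in for the HTTP fetch; returns a canned page."""
    return SAMPLE_PAGE


def extract(html):
    """Pull out only the fields that matter for the user's goal."""
    title = re.search(r"<h1>(.*?)</h1>", html, re.S)
    price = re.search(r'<span id="price">(.*?)</span>', html, re.S)
    return {
        "title": title.group(1).strip() if title else "",
        "price": price.group(1).strip() if price else "",
    }


def to_xml(fields):
    """Create the XML document that carries the extracted semantics."""
    item = ET.Element("item")
    for name, value in fields.items():
        ET.SubElement(item, name).text = value
    return item


def adapt_to_html(item):
    """Interface adaptor: render the XML as a simple HTML page."""
    rows = "".join(
        f"<p>{child.tag}: {escape(child.text or '')}</p>" for child in item
    )
    return f"<html><body>{rows}</body></html>"


def handle_request(url):
    return adapt_to_html(to_xml(extract(download(url))))


new_page = handle_request("http://auctions.example.com/item/1")
```

The advertisement in the sample page never reaches the output because nothing in the extraction step asks for it.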
Example of User-Aurora
Interaction
• User views Aurora-generated Web page
of an auction site converted from XML
to be displayed in HTML
– This is a node in the current transaction
document
– Current node is the item for bidding
– Hyperlinks in the generated HTML page
lead to other nodes in the XML document
Example of User-Aurora
Interaction
• User selects a hyperlink
– Each XML hyperlink will link to a Web page
and a transformation rule
– Aurora will download the web page and
apply the transformation rule
– Following a hyperlink yields a downloaded HTML
document and the XML segment extracted from it
Example of User-Aurora
Interaction
• Present the downloaded page to the user
– The extracted XML segment serves as input to the
interface adaptor
– The result is a displayable web page typically in
HTML
• The process repeats for future interactions
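The pairing of each hyperlink with a Web page and a transformation rule could look like the sketch below. XmlLink, follow_link, and the stand-in fetch, rule, and adaptor functions are hypothetical names, not Aurora's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class XmlLink:
    """A hyperlink in the generated page: a target URL plus the rule for that page."""
    url: str
    rule: Callable[[str], str]  # transformation rule: raw HTML -> XML segment


def follow_link(link, fetch, adaptor):
    html = fetch(link.url)         # download the Web page
    xml_segment = link.rule(html)  # apply the transformation rule
    return adaptor(xml_segment)    # render the segment for the user


# Stand-ins so the sketch runs offline.
def fake_fetch(url):
    return "<html><body><h1>Bid history</h1></body></html>"


def heading_rule(html):
    start = html.index("<h1>") + len("<h1>")
    end = html.index("</h1>")
    return f"<heading>{html[start:end]}</heading>"


def text_adaptor(xml_segment):
    body = xml_segment.replace("<heading>", "<p>").replace("</heading>", "</p>")
    return f"<html><body>{body}</body></html>"


link = XmlLink("http://auctions.example.com/item/1/history", heading_rule)
page = follow_link(link, fake_fetch, text_adaptor)
```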
Aurora’s Method of
Content Extraction
• Uses PatML
* XML transformation tool that…match[es]
and transform[s] patterns in XML
documents
– Three parts to a PatML transformation rule
* XML pattern to match (source)
* Way to transform the matched pattern (target)
* Java code block to invoke on the pattern
(action)
*Items quoted directly from the paper
Aurora’s Method of
Content Extraction
• A transaction step has one PatML rule
– This rule will be used for all pages on a single
Web site
• Three parts of PatML rule, specifically
– Source: matches HTML tag patterns
– Target: turns matched part into an XML part
– Action: gets the XML part and returns its output
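The source/target/action split can be imitated in a few lines. This is not PatML syntax: a Python regular expression stands in for the source pattern, a substitution template for the target, and a Python callable for the Java action block. The Rule class and the sample markup are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    source: "re.Pattern"          # pattern to match in the HTML
    target: str                   # template that builds the XML part
    action: Callable[[str], str]  # code invoked on the XML part

    def apply(self, html: str) -> List[str]:
        parts = []
        for match in self.source.finditer(html):
            xml_part = match.expand(self.target)  # fill the template from the match
            parts.append(self.action(xml_part))   # return the action's output
        return parts


# Hypothetical rule for the "current bid" cell of an auction page.
price_rule = Rule(
    source=re.compile(r"<b>Current bid:</b>\s*([^<]+)"),
    target=r"<price>\1</price>",
    action=lambda xml_part: xml_part.strip(),
)

xml_parts = price_rule.apply("<td><b>Current bid:</b> $42.00</td>")
```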
Aurora Architecture Using
WeB Intermediaries (WBI)
• “…enables applications to manipulate HTTP
streams during a Web transaction.”
• Three components
– Request editor
• Interface adaptor translates user actions
– Document generator
• Downloads web pages, applies transformation rules to
web pages, returns XML parts
– Document editor
• Interface adaptor adapts requested Web pages
Extensible Architecture
• Interface Adaptors
– “[S]ends XML data and receives user
responses.”
– Transforms XML data into a low-level
presentation format
– DTDs used to help generate additional
semantic meaning
– Two types of adaptors
• HTML text-only, Icon-enhanced HTML
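A sketch of the two adaptor types working from the same XML data. The element names, the icon file names, and both function names are hypothetical; the point is only that one XML input can feed several presentations.

```python
import xml.etree.ElementTree as ET
from html import escape

XML_DATA = "<item><title>Vintage camera</title><price>$42.00</price></item>"

# Hypothetical icon files used by the icon-enhanced adaptor.
ICONS = {"title": "item.png", "price": "price.png"}


def text_only_adaptor(xml_text):
    """Render each XML element as a labeled line of plain HTML text."""
    item = ET.fromstring(xml_text)
    lines = [
        f"<p>{child.tag.capitalize()}: {escape(child.text or '')}</p>"
        for child in item
    ]
    return "\n".join(lines)


def icon_enhanced_adaptor(xml_text):
    """Render each XML element with an icon in place of the text label."""
    item = ET.fromstring(xml_text)
    lines = []
    for child in item:
        icon = ICONS.get(child.tag)
        img = f'<img src="{icon}" alt="{child.tag}"> ' if icon else ""
        lines.append(f"<p>{img}{escape(child.text or '')}</p>")
    return "\n".join(lines)


text_page = text_only_adaptor(XML_DATA)
icon_page = icon_enhanced_adaptor(XML_DATA)
```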
Extensible Architecture
• Service Domains and Web Sites
– XML configuration document stores all
service domain definitions
• Adding new domains or sites into a domain
only involves editing the XML document
• Transcoding Engine
– Aurora can use other “transcoding and/or
extraction technologies”
• Currently uses PatML within its transformer
interface
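To illustrate why adding a domain or site only means editing XML, here is a sketch that loads such a configuration. The layout of the document (aurora, domain, and site elements, the host and rules attributes) and the file names are invented for this example; the actual format of Aurora's configuration document is not given in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document listing service domains and their sites.
CONFIG = """
<aurora>
  <domain name="auction">
    <site host="auctions.example.com" rules="example-auction-rules.xml"/>
  </domain>
  <domain name="search">
    <site host="search.example.com" rules="example-search-rules.xml"/>
  </domain>
</aurora>
"""


def load_domains(config_text):
    """Return {domain name: {site host: rule file}} from the configuration XML."""
    root = ET.fromstring(config_text.strip())
    return {
        domain.get("name"): {
            site.get("host"): site.get("rules")
            for site in domain.findall("site")
        }
        for domain in root.findall("domain")
    }


domains = load_domains(CONFIG)
```

Supporting another auction site would then be one more site element under the auction domain, with no change to the loading code.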
Implementation Details
• Java plug-in for “WBI using PatML as the
transcoding tool”
• Schemas include auction and search engine
service domains
• PatML rules written for specific sites
– Auctions: eBay, Yahoo! Auctions
– Search Engines: AltaVista, Yahoo!, Google
• Uses previously mentioned interface adaptors
Summary
• Transaction Model
– Extracts semantics of web sites within
selected service domains
– Uses XML to maintain structured data from
Web pages
• Semantic Transcoding System
– “Scrapes” and adapts web pages to help
the user accomplish abstract goals within
an XML framework
Summary
• Extensible Structure
– Supports custom adaptors that convert
XML data into some presentation format
• Improvements
– Needs “to support semi-automated or
automated rule generation and
maintenance”
Resources
• XML 1.0
– http://www.w3c.org/TR/REC-xml
• Web Content Accessibility Guidelines 1.0
– http://www.w3c.org/wai-webcontent
• PatML
– http://www.alphaworks.ibm.com/aw.nsf/techmain/00ADAB375888BDD2882566F300703F7F?OpenDocument
• WeB Intermediaries (WBI) Development Kit
– http://www.almaden.ibm.com/cs/wbi/doc