Search engine

Mobile Search Engine
Based on an idea presented in the paper "Data
mining for personal navigation", Hariharan,
G., Fränti, P., Mehta, S. (2002)
Introduction
What if we could use location as a search option?
Task:
To implement a WWW search engine for mobile
devices.
To study whether the WWW can be used to find targets
and services near the user's location.
Location information in www
documents
In order to locate targets and services, we should be
able to find location information in web pages.
Some studies exist: Geospatial Mapping and
Navigation of the Web (McCurley, 2001),
Paikkatiedon käyttö web-dokumenteissa (Use of location
information in web documents) (Vänskä, 2004).
Location information in www
documents
• Geotags (GeoTags, 2005).
• Address tags (World Wide Web Consortium, 2005).
• Address.
• Postal code.
• Phone number.
• Well-known places.
Address tags and geotags are used very rarely. In practice it
is necessary to find addresses inside the text.
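As a rough illustration of finding addresses, postal codes and phone numbers inside plain text, the following sketch uses regular expressions. The patterns are assumptions made for this example (Finnish-style street suffixes, five-digit postal codes, a loose phone number shape), and the sample address in the usage note is made up; none of this comes from the test application itself.

```python
import re

# Assumption: a street address is a capitalised word ending in a common
# Finnish street suffix, followed by a house number.
STREET_RE = re.compile(r"\b([A-ZÅÄÖ][a-zåäö]+(?:katu|tie|kuja|polku))\s+(\d+)")

# Assumption: a postal code is five digits followed by a capitalised town name.
POSTAL_RE = re.compile(r"\b(\d{5})\s+([A-ZÅÄÖ][A-Za-zÅÄÖåäö-]+)")

# Assumption: a phone number starts with 0 or +358 and has two or three
# digit groups separated by optional spaces or hyphens.
PHONE_RE = re.compile(r"(?<!\d)(?:\+358|0)\d{1,3}[\s-]?\d{3,4}[\s-]?\d{2,4}")


def find_location_hints(text):
    """Return candidate streets, postal codes and phone numbers found in text."""
    return {
        "streets": [f"{name} {num}" for name, num in STREET_RE.findall(text)],
        "postal": [f"{code} {town}" for code, town in POSTAL_RE.findall(text)],
        "phones": PHONE_RE.findall(text),
    }
```

For example, calling `find_location_hints("Pizzeria Esimerkki, Kauppakatu 12, 80100 Joensuu, puh. 013 123 4567")` picks out the street, the postal code with its town, and the phone number. Real pages would need a much wider set of patterns.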
One possible solution
• Use an existing, efficient web search engine to get
candidate links related to the user's location and interests.
• Implement tools for extracting location information from
those links.
Test app.: defining search options
[Diagram: GPS, PDA and WWW server. Labels: Lat, Lon; Addr./coords.; Options: Pizzeria, Joensuu.]
Test app.: extracting data
[Diagram: GPS, PDA, WWW server and Google. Labels: Pizzeria, Joensuu; Relevant links; Extracting location info. (address, coordinates); Calculating distances; Addr./coords.]
Test app.: handling data
[Diagram: GPS, PDA, WWW server, result database and phone application. Labels: Creating result list, ordered by distance (Pizzaspecial, tel. …; Pizzeria Al Mooro, tel. …; Pizza Express Cafe, tel. …; ...); Save to database (optional); Show results to user.]
City Search Engine Demo
• http://www.cs.joensuu.fi/paikka/suomi/suomi.php
Software solutions
• Implementation of a module that executes
a Google search with the search options
"keyword" (user-defined) and "area"
(municipalities within a certain distance from
the user). The module returns a list of links.
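The "keyword" plus "area" idea can be sketched as follows. This is only an illustration under stated assumptions: the `MUNICIPALITIES` table, its approximate coordinates, the function names and the `OR` query syntax are all hypothetical, and the sketch only builds the query string; it does not call any search engine.

```python
import math

# Hypothetical gazetteer: municipality name -> approximate (lat, lon).
MUNICIPALITIES = {
    "Joensuu": (62.60, 29.76),
    "Kontiolahti": (62.77, 29.85),
    "Helsinki": (60.17, 24.94),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def build_query(keyword, user_lat, user_lon, radius_km):
    """Combine the keyword with every municipality within radius_km of the user."""
    areas = sorted(
        name
        for name, (lat, lon) in MUNICIPALITIES.items()
        if haversine_km(user_lat, user_lon, lat, lon) <= radius_km
    )
    if not areas:
        return keyword
    # Assumption: the search engine accepts OR between alternative terms.
    return f'{keyword} ({" OR ".join(areas)})'
```

With a user standing in Joensuu and a 30 km radius, `build_query("pizzeria", 62.60, 29.76, 30)` keeps the two nearby municipalities and leaves out Helsinki.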
Software solutions
For every link:
• Extract the plain text from the HTML document.
• Create a table of all numbers and words in the
document.
• Go through the table and try to detect street names. With
the help of an address/coordinate database, try to build addresses.
• Try to extract descriptive information related to the addresses.
• After all links have been processed, gather all results into a
result list.
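The first three steps above can be sketched with the Python standard library. This is a minimal sketch, not the test application's code: the `ADDRESS_DB` contents, the class and function names, and the simple "street word followed by house number" matching rule are assumptions made for illustration.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    """Build the table of all words and numbers in the document."""
    return re.findall(r"\w+", text)


# Hypothetical address/coordinate database: (street, number) -> (lat, lon).
ADDRESS_DB = {("kauppakatu", "12"): (62.601, 29.763)}


def find_addresses(tokens):
    """Look up each adjacent (word, number) pair in the address database."""
    found = []
    for i in range(len(tokens) - 1):
        key = (tokens[i].lower(), tokens[i + 1])
        if key in ADDRESS_DB:
            found.append((f"{tokens[i]} {tokens[i + 1]}", ADDRESS_DB[key]))
    return found
```

A page would be processed as `find_addresses(tokenize(html_to_text(html)))`. Extracting the descriptive information that belongs to each address is left out here.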
Software solutions
For the result list:
• Try to evaluate the relevance of each list item.
• Sort the list by distance (maybe
combined with relevance?).
• Remove duplicate entries.
• Show the results to the user.
• (Save the results.)
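Sorting by distance and removing duplicates could look like the sketch below. The item fields (`name`, `address`, `distance_km`) and the rule that two items are duplicates when name and address match case-insensitively are assumptions for this example; relevance scoring is not covered.

```python
def finalize_results(items):
    """Sort items by distance and drop repeated (name, address) pairs.

    Each item is assumed to be a dict with the hypothetical keys
    "name", "address" and "distance_km".
    """
    seen = set()
    unique = []
    for item in sorted(items, key=lambda it: it["distance_km"]):
        key = (item["name"].strip().lower(), item["address"].strip().lower())
        if key in seen:
            continue  # same place already listed from another page
        seen.add(key)
        unique.append(item)
    return unique
```

Because the nearest copy of each place is seen first, it is the one that is kept in the final list.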
Test app. problems
Search result relevance problem:
• A web page can contain one or several useful results but also
information about totally different targets and services.
• Information matching the keyword can be found on the page, but the
keyword can have a different meaning in that page's context.
• Information matching the keyword and addresses have both been found on
the page, but there is no relation between the two.
Test app. problems
Creating search results:
• If we find an address on a web page, how
do we find the descriptive information related to
that address?
• How do we measure a search result's relevance
(to the user)? We should get rid of irrelevant search results.