Full Text Search

Some Info
 An optional component
 Much faster, and more complex, than the previous version
 Lets you search for words and tokens in binary,
char, nchar, varchar, nvarchar, xml, image,
sysname, text, and varbinary columns
 It builds indexes on these columns
 Select * from Authors where contains(bio, 'Oregon');
rather than
 Select * from Authors where bio like '%Oregon%';
Comparing with Like
 Full-text search
• Can be orders of magnitude faster
• Can search data other than plain text
• Can use language features, such as searching for
'take' and also getting 'took', 'takes', and 'taken'
 However, LIKE can
• Search the middle of a word: like '%puter'
• Search for a character sequence: like 'ab[cd][1-9]%'
• Be faster than full-text search when the column has a
nonclustered index and you search on the entire column
or the beginning of the column
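The contrast above can be sketched in T-SQL. This is a minimal illustration and assumes the Authors table and bio column from the earlier slide, with a full-text index already defined on bio.

```sql
-- Full-text search: FORMSOF(INFLECTIONAL, ...) asks for inflected forms
-- of "take" (takes, took, taken, ...), which LIKE cannot express.
SELECT *
FROM Authors
WHERE CONTAINS(bio, 'FORMSOF(INFLECTIONAL, take)');

-- LIKE: match the tail of a word anywhere in the column.
-- Full-text search works on whole words and prefixes, not suffixes.
SELECT *
FROM Authors
WHERE bio LIKE '%puter%';

-- LIKE: character-class pattern anchored at the start of the column.
-- [cd] matches one character, c or d; [1-9] matches one digit from 1 to 9.
SELECT *
FROM Authors
WHERE bio LIKE 'ab[cd][1-9]%';
```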
A few new features
 48 languages
 Noise lists, or stop words (of, the, a, etc.)
 Failover support on mirrored databases
 Dynamic management views support troubleshooting
 ……
Architecture
 Three key elements
• Indexing
• Searching
• Filter
 The column is indexed by the individual words it contains,
not by the entire content of a record's column
 Searching uses the indexes
 A filter is used if the column is not plain text
• For example, an XML filter is used for XML data, an
MS Word filter is used for MS Word data, etc.
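One way the filter step shows up in practice is a binary column indexed together with a "type column" that tells the engine which filter to apply. This is a sketch only: the table dbo.AuthorDocs, its columns, and the catalog DocCatalog are hypothetical names chosen for illustration.

```sql
-- Hypothetical catalog and table, for illustration only.
CREATE FULLTEXT CATALOG DocCatalog;

CREATE TABLE dbo.AuthorDocs (
    doc_id   INT NOT NULL CONSTRAINT PK_AuthorDocs PRIMARY KEY,
    doc_ext  NVARCHAR(8) NOT NULL,   -- file extension, e.g. '.doc' or '.xml'
    doc_body VARBINARY(MAX) NULL     -- the document stored as binary data
);

-- TYPE COLUMN names the column holding the extension, so the engine
-- can choose the matching filter before the words are indexed.
CREATE FULLTEXT INDEX ON dbo.AuthorDocs (doc_body TYPE COLUMN doc_ext)
    KEY INDEX PK_AuthorDocs
    ON DocCatalog;

-- Document types for which a filter is installed on this instance.
SELECT document_type, path
FROM sys.fulltext_document_types;
```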
Word Breaker
 A tool that determines how to break sentences
into words
• For example, FBI will match F.B.I, but not
fbi.
• For example, the UK word breaker will understand
realise, realising, and realised, while the US
breaker will understand realize, realizing, and
realized
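To see what a word breaker does with a given input, SQL Server 2008 and later expose sys.dm_fts_parser. The sketch below assumes LCID 1033 for US English and 2057 for UK English, and assumes the Authors table from the earlier slide has a full-text index on bio. Running the function requires elevated permissions, and no output is shown here because it depends on the installed word breakers.

```sql
-- Arguments: query string, language LCID, stoplist id (0 = system
-- stoplist), accent sensitivity (0 = insensitive).

-- How the US English word breaker tokenizes "F.B.I".
SELECT display_term, special_term
FROM sys.dm_fts_parser(N'"F.B.I"', 1033, 0, 0);

-- Inflectional forms of "realise" according to the UK English breaker.
SELECT display_term, expansion_type
FROM sys.dm_fts_parser(N'FORMSOF(INFLECTIONAL, realise)', 2057, 0, 0);

-- The language can also be chosen per query.
SELECT *
FROM Authors
WHERE CONTAINS(bio, 'FORMSOF(INFLECTIONAL, realise)', LANGUAGE 2057);
```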
Search
 Contains – more exact matches
 FreeText – also matches other forms of a word
(mouse and mice)
 ContainsTable/FreeTextTable – return
results as a table, with a rank for each row
 Contains can be more powerful when combined
with FormsOf, Near, boolean operators,
weighting, or * (a wildcard operator)
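The search predicates above might look as follows in T-SQL. This is a sketch that assumes the Authors table with a full-text index on bio. The key column author_id is a hypothetical name, and the search terms are made up for illustration.

```sql
-- Boolean operators.
SELECT * FROM Authors
WHERE CONTAINS(bio, '"Oregon" AND NOT "Portland"');

-- Prefix wildcard: words starting with "comput".
SELECT * FROM Authors
WHERE CONTAINS(bio, '"comput*"');

-- Proximity: both words, close to each other.
SELECT * FROM Authors
WHERE CONTAINS(bio, 'Oregon NEAR coast');

-- Inflectional forms requested explicitly (mouse, mice).
SELECT * FROM Authors
WHERE CONTAINS(bio, 'FORMSOF(INFLECTIONAL, mouse)');

-- FREETEXT matches on meaning rather than exact wording, so
-- other forms of the word are included automatically.
SELECT * FROM Authors
WHERE FREETEXT(bio, 'mouse');

-- Ranked results with weighted terms. CONTAINSTABLE returns the
-- columns [KEY] and [RANK]; author_id is a hypothetical key column.
SELECT a.*, k.[RANK]
FROM Authors AS a
JOIN CONTAINSTABLE(Authors, bio,
        'ISABOUT(Oregon WEIGHT(0.8), coast WEIGHT(0.2))') AS k
    ON a.author_id = k.[KEY]
ORDER BY k.[RANK] DESC;
```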
Hands on
 First create the full-text catalog
 Then create the full-text indexes
 Use the wizards
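The same steps can also be scripted instead of using the wizards. A minimal sketch, assuming a dbo.Authors table whose primary key index is named PK_Authors. That index name and the catalog name AuthorCatalog are hypothetical.

```sql
-- Step 1: create the full-text catalog (AuthorCatalog is a made-up name).
CREATE FULLTEXT CATALOG AuthorCatalog AS DEFAULT;

-- Step 2: create the full-text index on the bio column.
-- KEY INDEX must name a unique, single-column, non-nullable index;
-- PK_Authors is assumed here. 1033 is the LCID for US English.
CREATE FULLTEXT INDEX ON dbo.Authors (bio LANGUAGE 1033)
    KEY INDEX PK_Authors
    ON AuthorCatalog
    WITH CHANGE_TRACKING AUTO;  -- keep the index updated as rows change
```

A table can have only one full-text index, so additional columns are added to the same statement rather than indexed separately.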