Thales provides a Java full-text resource indexer and search engine. Multiple search contexts may be created. Any LDBC-supported database may be used including MySQL, Oracle, MS-SQL, HSQLDB and DB/2.
thales: noun. [thA-lEs]
(Thales of Miletus, 624 BC to 545 BC) Named after the first known Greek philosopher, scientist and
mathematician, although his occupation was that of an engineer. He is believed to have been the teacher
of Anaximander (611 BC - 545 BC), and he was the first natural philosopher in the Milesian School.
Thales differs from other "bag-of-words" indexes in that there may be multiple bags (contexts), and a search may be limited to one or more of them. The Thales engine indexes text and associates the text with a reference. Each reference has a title. Thus an engine that indexes and searches an entire web site might read each page on the site, use the title element and the URL to create a reference and its title, and then index the textual contents of the page. In addition, if the author was specified, the author's name might be indexed in the "author" context.

Thales uses JDBC and the LDBC package. The JDBC driver must be one of those supported by LDBC.

Thales Concepts
Words and phrases are indexed within the scope of a Reference. They may further be restricted to a single Context within the reference.
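The relationship between references, contexts and indexed words can be illustrated with a small in-memory sketch. This is not the Thales API: the class name `ContextIndexSketch` and its methods `index`, `find` and `titleOf` are hypothetical, and plain Java maps stand in for the database that Thales reaches through JDBC and LDBC. It only shows how keeping one bag of words per context lets a lookup be limited to chosen contexts.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration of a multi-context word index; not the Thales API. */
public class ContextIndexSketch {
    // context name -> word -> keys of the references (e.g. URLs) containing that word
    private final Map<String, Map<String, Set<String>>> bags = new HashMap<>();
    // reference key -> reference title
    private final Map<String, String> titles = new HashMap<>();

    /** Indexes text under a reference, inside a single context ("bag"). */
    public void index(String reference, String title, String context, String text) {
        titles.put(reference, title);
        Map<String, Set<String>> bag = bags.computeIfAbsent(context, c -> new HashMap<>());
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                bag.computeIfAbsent(word, w -> new HashSet<>()).add(reference);
            }
        }
    }

    /** Exact-word lookup; when no contexts are given, every context is searched. */
    public Set<String> find(String word, String... contexts) {
        Collection<String> scope = (contexts == null || contexts.length == 0)
                ? bags.keySet()
                : Arrays.asList(contexts);
        String key = word.toLowerCase(Locale.ROOT);
        Set<String> hits = new TreeSet<>(); // sorted only to make the output stable
        for (String context : scope) {
            Map<String, Set<String>> bag =
                    bags.getOrDefault(context, Collections.emptyMap());
            hits.addAll(bag.getOrDefault(key, Collections.emptySet()));
        }
        return hits;
    }

    /** Returns the title stored for a reference, or null if it was never indexed. */
    public String titleOf(String reference) {
        return titles.get(reference);
    }

    public static void main(String[] args) {
        ContextIndexSketch idx = new ContextIndexSketch();
        idx.index("/bio.html", "About John Smith", "body", "A profile of John Smith");
        idx.index("/paper.html", "Indexing Notes", "body", "Notes on text indexing");
        idx.index("/paper.html", "Indexing Notes", "author", "John Smith");

        System.out.println(idx.find("smith", "author")); // [/paper.html]
        System.out.println(idx.find("smith"));           // [/bio.html, /paper.html]
    }
}
```

Limiting the lookup to the "author" context returns only the page that John Smith wrote, while searching every context also returns the page written about him.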
Querying
When querying the database, the context(s) to be searched may be specified or, if null, all contexts will be searched. For example, you may want to index documents with the body of each document in the "body" context and the authors in the "author" context. This would allow a query to locate all documents that John Smith had authored separately from all the documents about John Smith the author.

Words are located by direct hits, synonym hits and soundex hits. The relative values associated with each hit type are set in the ThalesConfig object. Setting a value to 0 (zero) removes that hit type from the calculation. The relative value of each word, the number of words in the query and the number of times the word appears in the database all factor into a final rank for a resource, with the highest rank being the closest match to what was requested. The results are sorted in rank order.

Finding
When finding words, only exact words are located. This is much quicker than the Query operation but is much more restricted.

Cross Referencing
Once a reference has been located, it is possible to find all other references that share the same indexed words. The XRef mechanism provides this functionality.

Thales Requirements
Optional Packages
All trademarks and copyrights are the property of their respective owners.
Copyright © 2002-2004 by Xenei.com, All Rights Reserved