Project Title: Perilog
Track Code: AR-0016-LST
Short Description

NASA's Ames Research Center offers for license its patented Perilog software, a contextual search method that provides a simple-to-use means of finding and ranking text documents according to their relevance to particular words or phrases.

 
Tags: computer software, data processing & analysis, document management, document retrieval, information retrieval, keyword search, perilog, quorum, search, text analysis, text retrieval

Posted Date: Feb 28, 2012 12:47 PM

Fuentek Contact Information

If you would like more information or want to pursue transfer of this technology, please contact us by phone or email: (919) 249-0327, arc14512@fuentek.com.

Promotional Title

Perilog

Project Subtitle

A contextual search method for improving search results

Technology Summary

NASA’s Ames Research Center offers for license its patented Perilog software, a contextual search method that provides a simple-to-use means of finding and ranking text documents according to their relevance to particular words or phrases. Rather than simply finding documents that contain particular words or phrases, Perilog overcomes the shortcomings of other search methods by discovering contextual associations between words and phrases. Perilog’s ability to automatically identify contextual associations in a document set enables conceptual and semantic search without the need to maintain categorization for the documents. Users are quickly able to identify related topics, even if those topics do not co-occur in the same document and the user has no prior knowledge of the documents.

Perilog can be used for a wide range of conceptual search and semantic search applications—in knowledge management systems, as enhancements or add-ons to commercial search engines, and for contextual advertising solutions.

For more information about this licensing and joint development opportunity, please contact us by phone or e-mail: (919) 249-0327, arc14512@fuentek.com.

Benefits

  • Reliable results: Delivers more relevant search results, with fewer queries by the user, compared to other search engine technologies
  • Contextual relevance: Enables users to discover words, ideas, and situational details that are contextually associated with a specific query
  • Intelligent search: Allows users to discover key themes in large document sets, with no prior knowledge of the documents
  • Efficient classification: Eliminates the time and expense associated with maintaining document categorization (i.e., ontology), delivering semantic and conceptual search results

Applications

Perilog can enhance many search-related applications, including:

  • Large knowledge management and document retrieval systems, for legal research, market research, intellectual property asset management, claims management, etc.
  • Life sciences and medical research
  • Intelligence analyses
  • Commercial search engines
  • Contextual online advertising solutions

Technology Details

Perilog's underlying algorithm is based on the theory of experiential iconicity, which states that patterns of relatedness among things in the world of experience systematically influence patterns of relatedness among words in written discourse. Perilog's ability to deliver semantic and conceptual search results through an automated algorithm, without the need to rely on natural language processing or manually (or semi-manually) maintained categorization, follows directly from the theory of iconicity.

How It Works

Perilog measures the degree of contextual association of large numbers of term pairs in text to produce network models that capture the structure of the text and, by virtue of Perilog's validated theory of iconicity, the structure of the domain(s), situation(s), and concern(s) expressed by the author(s) of the text. In fact, given alphanumeric representations of any other sequences in which context is meaningful—such as music or genetic sequences—Perilog can derive their contextual structure.

Operating on a document set (i.e., corpus) or a single document, Perilog creates a network model of contextually related words and phrases. When a user enters a keyword or key phrase search, Perilog creates a query network of “topical hubs,” based on the query words input by the user. Phrases may be of any number and length. Each phrase is represented by a network, and these networks are combined into a single query network.

By matching the phrase query network with document networks, Perilog's phrase search provides flexible and thorough phrase matching that is unavailable with other methods. Instead of the keyword search being limited to the query words alone, Perilog uses the relationships of keywords within their contextual associations to find documents in which those relationships are significant.
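The idea of representing each query phrase as a small network and combining the networks into one query network can be sketched as follows. This is an illustrative sketch only, not Perilog's patented implementation: the function names `phrase_network` and `query_network`, the use of ordered term pairs as network links, and the choice to combine networks by summing pair counts are all assumptions made for the example.

```python
from collections import Counter

def phrase_network(phrase):
    """Model one phrase as counts of ordered term pairs (earlier term, later term).

    Assumption: a "network" is approximated here by its set of weighted links.
    """
    terms = phrase.lower().split()
    return Counter((a, b) for i, a in enumerate(terms) for b in terms[i + 1:])

def query_network(phrases):
    """Combine the per-phrase networks into a single query network.

    Assumption: networks are merged by summing the counts of identical links.
    """
    combined = Counter()
    for phrase in phrases:
        combined.update(phrase_network(phrase))
    return combined

# Hypothetical query made of two phrases
q = query_network(["reduced rest period", "rest period"])
for pair, weight in sorted(q.items()):
    print(pair, weight)
```

In this toy query, the link between "rest" and "period" appears in both phrases, so it carries more weight in the combined network than links that appear in only one phrase.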

Key features and methods

Perilog’s key features and methods encompass text analysis, modeling, relevance-ranking, keyword and phrase search, phrase generation, and phrase discovery:

Text Analysis: The process converts bodies of text to sequences of terms and measures the contextual associations among them. This determines the structure of text as a way of measuring the structure of the domains and situations represented by the text. Terms that are contextually related in the structure of text are considered contextually related in the world represented by the text.
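As a rough illustration of this step, the sketch below converts text to a sequence of terms and assigns each ordered term pair a weight that grows as the terms appear closer together. The window size, the proximity weighting, and the names `tokenize` and `build_model` are assumptions chosen for the example; the source does not specify Perilog's actual measure of contextual association.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_model(text, window=4):
    """Return {(term_a, term_b): weight} for ordered pairs at most `window` terms apart.

    Assumption: a pair at distance d contributes (window - d + 1), so adjacent
    terms count most. This is an illustrative metric, not Perilog's.
    """
    terms = tokenize(text)
    model = defaultdict(float)
    for i, a in enumerate(terms):
        for d in range(1, window + 1):
            if i + d >= len(terms):
                break
            model[(a, terms[i + d])] += window - d + 1
    return dict(model)

model = build_model("Reduced rest led to crew fatigue", window=3)
for pair, weight in sorted(model.items(), key=lambda kv: -kv[1]):
    print(pair, weight)
```

Pairs such as ("reduced", "rest") and ("crew", "fatigue") receive the highest weight because the terms are adjacent, while terms farther apart than the window are not linked at all.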

Modeling: Each Perilog model consists of a network of contextually associated terms. A Perilog model can represent any body of text, from an entire database to a short phrase, and it can represent any domain, sub-domain, situation, situational detail, topic, or subtopic.

Relevance Ranking: Perilog quantifies the similarity of any two models by comparing their paired terms and contextual measurements. One model’s features are used as relevance-ranking criteria and compared to a collection of models, enabling the models in the collection to be ranked according to their relevance to the criteria. By ranking a collection of models on every model in the collection, an association matrix can be created to provide data for input to clustering methods.
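A minimal sketch of this kind of model comparison is shown below, assuming each model is a dictionary that maps term pairs to contextual weights. The similarity measure (summing the smaller weight of each shared pair), the function names, and the toy data are assumptions for illustration; the source does not give Perilog's actual formula.

```python
def similarity(criteria, model):
    """Score two models by their shared term pairs.

    Assumption: each shared pair contributes the smaller of its two weights.
    """
    shared = criteria.keys() & model.keys()
    return sum(min(criteria[p], model[p]) for p in shared)

def rank(criteria, collection):
    """Return (name, score) tuples, most relevant first."""
    scores = [(name, similarity(criteria, m)) for name, m in collection.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)

def association_matrix(collection):
    """Compare every model with every other model, e.g. as input to clustering."""
    names = sorted(collection)
    rows = [[similarity(collection[a], collection[b]) for b in names] for a in names]
    return names, rows

# Hypothetical models: term pair -> contextual weight
docs = {
    "report_a": {("crew", "fatigue"): 3.0, ("reduced", "rest"): 2.0},
    "report_b": {("engine", "failure"): 3.0},
    "report_c": {("crew", "fatigue"): 1.0},
}
query = {("crew", "fatigue"): 3.0}

ranking = rank(query, docs)
names, matrix = association_matrix(docs)
print(ranking)
```

Here the query model acts as the relevance-ranking criteria, and the matrix built by `association_matrix` is symmetric because the assumed similarity measure does not depend on argument order.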

Keyword and Phrase Search: From a user-specified database, Perilog retrieves documents that contain one or more user-specified keywords or phrases in typical or selected contexts, and ranks the documents by their relevance to the keywords or phrases in context. The most relevant documents are automatically highlighted and displayed in a Web browser window, allowing the user to scroll through and review them.

Each of the documents is accompanied by a list of related words or phrases that contribute to the relevance of the document. Experienced users can refer to these relations to understand which features were interpreted as contributing to the relevance of the document. In some cases, this can lead the user to modify or fine-tune the search strategy.

Phrase Generation: To aid a search, Perilog can produce a list of phrases from the database that contain a user-specified word or phrase that can be used to suggest queries for phrase searches. To generate phrases, the user provides a word or phrase that is to be contained in each of the output phrases. Perilog builds phrases around this input, based on its phrase models. The resulting phrases are displayed on the computer screen or can be redirected to a file. The phrases are sorted based on an estimate of their prominence in the document set.
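The general shape of phrase generation can be sketched as below. This example is an assumption-laden stand-in: it collects word n-grams that contain the user's input and sorts them by raw frequency, whereas Perilog builds phrases from its own phrase models and prominence estimate, which the source does not detail. The function name `generate_phrases` and its parameters are hypothetical.

```python
import re
from collections import Counter

def generate_phrases(documents, seed, max_len=3):
    """List phrases of 2..max_len terms that contain `seed`, most frequent first.

    Assumption: "prominence" is approximated by how often the phrase occurs.
    """
    seed_terms = seed.lower().split()
    k = len(seed_terms)
    counts = Counter()
    for doc in documents:
        terms = re.findall(r"[a-z0-9]+", doc.lower())
        for n in range(max(2, k), max_len + 1):
            for i in range(len(terms) - n + 1):
                gram = terms[i:i + n]
                # Keep the n-gram only if the seed occurs inside it as a contiguous run
                if any(gram[j:j + k] == seed_terms for j in range(n - k + 1)):
                    counts[" ".join(gram)] += 1
    return counts.most_common()

# Hypothetical document set
docs = [
    "Crew reported reduced rest period before duty.",
    "The rest period was short.",
    "Reduced rest period again.",
]
phrases = generate_phrases(docs, "rest", max_len=2)
for phrase, count in phrases:
    print(count, phrase)
```

The output list could then be reviewed by the user to pick candidate queries for a phrase search.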

Phrase Discovery: Further aiding the relevance of search results, Perilog can find phrases that are related to topics of interest. For example, given a topic such as “fatigue,” Perilog can discover related phrases such as “rest period,” “reduced rest,” “duty period,” and “crew scheduling.” Phrase discovery can help users understand the variations and scope of topics in a document set. The discovered phrases also can be used selectively as input to a Perilog phrase search, enabling retrieval of documents that contain particular topical variations.

The first step in phrase discovery is to perform a keyword or phrase search. Next, phrases are automatically extracted from the most relevant documents. The phrases produced at this point may be useful, but further processing will improve the results. From these phrases, topical phrases are distilled by a combination of manual and automated methods. The refined set of topical phrases then can be used to query the database, using phrase search. The cycle of phrase extraction and search is repeated to produce a final set of documents. If documents relevant to the topic are available in the document set, this final collection of documents will be highly relevant to the topic.

The main product is a list of topical phrases that are extracted from the final collection of documents.
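The search-and-extract cycle described above can be sketched in simplified form. Everything in this example is an assumption made for illustration: the search is plain phrase matching rather than Perilog's contextual search, phrase extraction is limited to two-term phrases without stopwords, the manual refinement step is only marked by a comment, and the names `search`, `extract_phrases`, and `discover_phrases` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list, just large enough for the toy corpus below
STOPWORDS = {"the", "a", "of", "to", "and", "was", "in", "after", "by"}

def terms_of(text):
    return re.findall(r"[a-z]+", text.lower())

def search(documents, phrases):
    """Return documents containing any phrase, ordered by number of phrase hits."""
    scored = []
    for doc in documents:
        padded = " " + " ".join(terms_of(doc)) + " "
        hits = sum(padded.count(" " + p + " ") for p in phrases)
        if hits:
            scored.append((hits, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]

def extract_phrases(documents, top_n=5):
    """Extract the most frequent two-term phrases that contain no stopwords."""
    counts = Counter()
    for doc in documents:
        terms = terms_of(doc)
        for a, b in zip(terms, terms[1:]):
            if a not in STOPWORDS and b not in STOPWORDS:
                counts[a + " " + b] += 1
    return [phrase for phrase, _ in counts.most_common(top_n)]

def discover_phrases(documents, topic, cycles=2, top_docs=3):
    """Repeat the search / extract cycle, starting from a single topic word."""
    phrases = [topic.lower()]
    for _ in range(cycles):
        relevant = search(documents, phrases)[:top_docs]
        # A real workflow would refine the phrases here, partly by hand
        phrases = extract_phrases(relevant) or phrases
    return phrases

# Hypothetical document set
corpus = [
    "Crew fatigue followed a reduced rest period.",
    "Fatigue was reported after the duty period was extended.",
    "Engine failure occurred during climb.",
]
result = discover_phrases(corpus, "fatigue")
print(result)
```

Starting from the single word "fatigue," the toy cycle surfaces phrases such as "rest period" and "duty period" from the matching documents and ignores the unrelated engine report.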


Why it is better

Perilog overcomes many of the fundamental flaws that hinder more basic search engines:

  • Reliance on the “bag of words” model, in which all words are treated equally, and relevance determined only by the frequency with which a keyword or phrase appears in the text
  • Word ambiguity, or the inability to distinguish between different meanings of the same word (e.g., “bark”), producing ambiguous results for such queries
  • Term mismatch, in which the search engine returns only a fraction of the relevant results because the user's query terms differ from the terms used in the documents (users select the same term to describe an object less than 20 percent of the time)
  • Query drift, when automatic expansion of queries results in unexpected and incorrect results

Perilog helps improve search results by addressing each of these flaws, determining the contextual association of words and word pairs in documents.

The network model that Perilog builds from these contextual associations enables it to answer questions such as:

  • What are the most prominent topics in this document (even if the user has never seen the document before)?
  • What are the most relevant sections of the document regarding a particular topic?
  • What other topics are related most closely to this topic?

Patent numbers: 6,823,333; 6,741,981; 6,697,793; and 6,721,728.

© 2010, Fuentek, LLC. All Rights Reserved.