1. Articles in category: WSD

    1. Meta search engine

      A computer-implemented meta search engine and search method. In accordance with this method, a query is forwarded to one or more third-party search engines, and the responses from the third-party search engine or engines are parsed in order to extract information regarding the documents matching the query. The full text of the documents matching the query is downloaded, and the query terms in the documents are located. The text surrounding the query terms is extracted, and that text is displayed.
      Mentions: Reuters Yahoo Lycos
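A minimal Python sketch of the last two steps in the meta search engine abstract above: locating query terms in a downloaded document and extracting the surrounding text. The function name `extract_snippets`, the `window` parameter, and the whole-word, case-insensitive matching are illustrative assumptions that the abstract does not specify; forwarding the query and downloading documents are left out.

```python
import re

def extract_snippets(text, query_terms, window=40):
    """Return the text surrounding each occurrence of any query term."""
    snippets = []
    for term in query_terms:
        # Assumption: match the term as a whole word, ignoring case.
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            # Keep up to `window` characters on each side of the match.
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            snippets.append(text[start:end].strip())
    return snippets

doc = ("The meta search engine forwards a query to several engines "
       "and then downloads each matching document.")
snippets = extract_snippets(doc, ["query"], window=10)
```

A character window is the simplest choice here; a sentence- or token-based window would be an equally plausible reading of "the text surrounding the query terms".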
    2. Linguistic disambiguation system and method using string-based pattern training to learn to resolve ambiguity sites

      A linguistic disambiguation system and method create a knowledge base by training on patterns in strings that contain ambiguity sites. The string patterns are described by a set of reduced regular expressions (RREs) or very reduced regular expressions (VRREs). The knowledge base utilizes the RREs or VRREs to resolve ambiguity based upon the strings in which the ambiguity occurs. The system is trained on a training set, such as a properly labeled corpus. Once trained, the system may then apply the knowledge base to raw input strings that contain ambiguity sites. The system uses the RRE- and VRRE-based knowledge base ...
    3. Information generation and retrieval method based on standardized format of sentence structure and semantic structure and system using the same

      The present invention relates to an information generation and retrieval apparatus based on a standardized format of sentence structure and semantic structure, a method thereof, and a computer-readable recording medium for recording a program for implementing the method. The method for generating and retrieving information, for use in an apparatus for generating and retrieving information based on standardized formats of sentence structure and semantic structure, comprises a first step of transforming a natural language sentence (information and knowledge) described by an information provider into a conceptual graph depending on standardized formats of sentence structure and semantic structure and ...
      Mentions: Korea Yahoo Paris
    4. Method and system for finding a query-subset of events within a master-set of events

      A method and system are provided for determining similarity between a first event set, which includes a first plurality of event types, and a second event set, which includes a second plurality of event types. Observed events are randomly mapped to a multidimensional vector $Q$, and query events are mapped to a multidimensional query vector $q$. The two sets are deemed similar when $\|Q - q\| \leq SV$, where $SV$ is a predetermined similarity value.
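The similarity test in the event-set abstract above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: each event type is given a fixed Gaussian random vector, an event set is mapped to the sum of its events' vectors, and the norm is taken to be Euclidean. The names `make_projection`, `map_events`, and `is_similar` are hypothetical.

```python
import math
import random

def make_projection(event_types, dim=8, seed=0):
    """Assign each event type a fixed random vector (assumed Gaussian)."""
    rng = random.Random(seed)
    return {etype: [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for etype in event_types}

def map_events(events, projection):
    """Map a collection of events to one vector by summing their random vectors."""
    dim = len(next(iter(projection.values())))
    vec = [0.0] * dim
    for event in events:
        for i, component in enumerate(projection[event]):
            vec[i] += component
    return vec

def is_similar(Q, q, sv):
    """Return True when the Euclidean norm of Q - q is at most sv."""
    return math.dist(Q, q) <= sv

projection = make_projection(["login", "logout", "error"])
Q = map_events(["login", "error", "logout"], projection)  # observed events
q = map_events(["login", "error"], projection)            # query events
```

With this mapping, `Q - q` is just the vector of the one unmatched event, so the threshold `sv` bounds how much unmatched activity is tolerated.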
    5. Terminology translation for unaligned comparable corpora using category based translation probabilities

      The invention relates to a method and apparatus for generating translations of natural language terms from a first language to a second language. A plurality of terms are extracted from unaligned comparable corpora of the first and second languages. Comparable corpora are sets of documents in different languages that come from the same domain and have similar genre and content. Unaligned documents are not translations of one another and are not linked in any other way. By accessing monolingual thesauri of the first and second languages, a category is assigned to each extracted term. Then, category-to-category translation probabilities are estimated ...
    6. System and method for matching a textual input to a lexical knowledge base and for utilizing results of that match

      The present invention can be used in a natural language processing system to determine a relationship (such as similarity in meaning) between two textual segments. The relationship can be identified or determined based on logical graphs generated from the textual segments. A relationship between first and second logical graphs is determined. This is accomplished regardless of whether there is an exact match between the first and second logical graphs. In one embodiment, the first graph represents an input textual discourse unit. The second graph, in one embodiment, represents information in a lexical knowledge base (LKB). The input graph can be ...
    7. Techniques for controlling distribution of information from a secure domain

      Techniques for controlling distribution of information from a secure domain by automatically detecting outgoing messages which violate security policies corresponding to the secure domain. Semantic models are constructed for one or more message categories and for the outgoing messages. The semantic model of an outgoing message is compared with the semantic models of the message categories to determine a degree of similarity between the semantic models. The outgoing message is classified based on the degree of similarity obtained from the comparison. A determination is made, based on the classification of the outgoing message, if distribution of the outgoing message would ...
    8. System and method for personalized search, information filtering, and for generating recommendations utilizing statistical latent class models

      The disclosed system implements a novel method for personalized filtering of information and automated generation of user-specific recommendations. The system uses a statistical latent class model, also known as Probabilistic Latent Semantic Analysis, to integrate data including textual and other content descriptions of items to be searched, user profiles, demographic information, query logs of previous searches, and explicit user ratings of items. The disclosed system learns one or more statistical models based on available data. The learning may be reiterated once additional data is available. The statistical model, once learned, is utilized in various ways: to make predictions about item ...
    10. Type-based selection of rules for semantically disambiguating words

      In semantically disambiguating words, where more than one disambiguation applies to the context in which a word occurs, a rule can be selected based on the type of information from which it was obtained. The rules can be derived from different types of information in a corpus such as a dictionary, and rules can be selected in accordance with a prioritization of the types of information.
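A small Python sketch of the selection step described in the type-based rule selection abstract above: when several disambiguation rules apply to a word in context, pick the one whose source type has the highest priority. The `Rule` class, the type names, and the ordering in `PRIORITY` are hypothetical; the abstract says only that rules are prioritized by the type of information they were derived from.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    sense: str        # sense the rule assigns to the ambiguous word
    source_type: str  # kind of dictionary information the rule came from

# Assumed ordering, highest priority first (not taken from the source).
PRIORITY = ["collocation", "example", "definition"]

def select_rule(applicable):
    """Pick the applicable rule whose source type ranks highest, or None."""
    return min(applicable,
               key=lambda rule: PRIORITY.index(rule.source_type),
               default=None)

rules = [Rule("bank/finance", "definition"), Rule("bank/river", "example")]
chosen = select_rule(rules)
```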
    11. Method of and apparatus for processing an input text, method of and apparatus for performing an approximate translation and storage medium

      A method of processing an input text comprising a plurality of words is provided. The method comprises the steps of deriving from the input text a plurality of sets such that each set comprises at least one of the words of the input text, all of the words of each set are present in the input text, and the words of each set containing more than one word, if any, constitute a collocation; assigning to each set a unique relative rank; comparing each set in order of decreasing relative rank with the input text; and selecting each set, all of ...
      Mentions: London Grenoble
    12. Method and apparatus for measuring the degree of polysemy in polysemous words

      A system and apparatus are disclosed for identifying polysemous terms and for measuring their degree of polysemy. A polysemy index provides a quantitative measure of how polysemous a word is. A list of words can be ranked by their polysemy indices, with the most polysemous words appearing at the top of the list. A polysemy evaluation process collects a set of terms near a target term. Inter-term distances of the set of terms occurring near the target term are computed, and the multi-dimensional distance space is reduced to two dimensions. The two-dimensional representation is converted into radial coordinates. Isotonic ...
    13. Natural language processing system for semantic vector representation which accounts for lexical ambiguity

      A natural language processing system uses unformatted naturally occurring text and generates a subject vector representation of the text, which may be an entire document or a part thereof such as its title, a paragraph, clause, or a sentence therein. The subject codes which are used are obtained from a lexical database and the subject code(s) for each word in the text is looked up and assigned from the database. The database may be a dictionary or other word resource which has a semantic classification scheme as designators of subject domains. Various meanings or senses of a word may ...
    14. System for parametric text to text language translation

      The present invention is a system for translating text from a first source language into a second target language. The system assigns probabilities or scores to various target-language translations and then displays or makes otherwise available the highest scoring translations. The source text is first transduced into one or more intermediate structural representations. From these intermediate source structures a set of intermediate target-structure hypotheses is generated. These hypotheses are scored by two different models: a language model which assigns a probability or score to an intermediate target structure, and a translation model which assigns a probability or score to the ...
    15. Method and system for natural language translation

      The present invention is a system for translating text from a first source language into a second target language. The system assigns probabilities or scores to various target-language translations and then displays or makes otherwise available the highest scoring translations. The source text is first transduced into one or more intermediate structural representations. From these intermediate source structures a set of intermediate target-structure hypotheses is generated. These hypotheses are scored by two different models: a language model which assigns a probability or score to an intermediate target structure, and a translation model which assigns a probability or score to the ...
    16. Document information retrieval using global word co-occurrence patterns

      A method and apparatus access relevant documents based on a query. A thesaurus of word vectors is formed for the words in the corpus of documents. The word vectors represent global lexical co-occurrence patterns and relationships between word neighbors. Document vectors, which are formed from the combination of word vectors, are in the same multi-dimensional space as the word vectors. A singular value decomposition is used to reduce the dimensionality of the document vectors. A query vector is formed from the combination of word vectors associated with the words in the query. The query vector and document vectors are compared ...
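The co-occurrence retrieval abstract above can be illustrated with a short Python sketch. It builds word vectors from co-occurrence counts, sums them into document and query vectors, and ranks documents against the query. Several choices are assumptions rather than details from the source: the context `window` size, summation as the way of combining word vectors, and cosine similarity as the comparison (the abstract is truncated before it says how vectors are compared). The singular value decomposition step is omitted to keep the sketch dependency-free.

```python
import math
from collections import Counter, defaultdict

def build_word_vectors(docs, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def combine(tokens, vectors):
    """Form a document or query vector by summing its words' vectors."""
    total = Counter()
    for token in tokens:
        total.update(vectors.get(token, {}))
    return total

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank(query, docs, vectors):
    """Return (score, document) pairs, best match first."""
    qv = combine(query.lower().split(), vectors)
    scored = [(cosine(qv, combine(d.lower().split(), vectors)), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

docs = ["river bank water flow", "bank loan money interest"]
vectors = build_word_vectors(docs)
ranking = rank("money", docs, vectors)
```

Because the query and the documents live in the same co-occurrence space, a document can match a query through shared neighbors even when it does not contain the query word itself.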
    17. Method for natural language data processing using morphological and part-of-speech information

      An enhancement and retrieval method for natural language data using a computer is disclosed. The method includes executing linguistic analysis upon a text corpus file to derive morphological and part-of-speech information as well as lexical variants corresponding to respective corpus words. The derived linguistic information is then used to construct an enhanced text corpus file. A query text file is linguistically analyzed to construct a plurality of trigger token morphemes which are then used to construct a search mask stream which is correlated with the enhanced text corpus file. A match between the search mask stream and the enhanced corpus file ...
    18. Method for document retrieval and for word sense disambiguation using neural networks

      A method for storing and searching documents, also useful in disambiguating word senses, and a method for generating a dictionary of context vectors are disclosed. The dictionary of context vectors provides a context vector for each word stem in the dictionary. A context vector is a fixed-length list of component values corresponding to a list of word-based features, the component values being an approximate measure of the conceptual relationship between the word stem and the word-based feature. Documents are stored by combining the context vectors of the words remaining in the document after uninteresting words are removed. The summary vector obtained ...