1. Articles in category: NER

    409-415 of 415 « 1 2 ... 15 16 17 18
    1. System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted AND (WAND)

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique operates at two levels. A search query includes a search operator consisting of a plurality of search sub-expressions, each having an associated weight value. The search engine returns a document or documents having a weight value ...
    2. System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique operates at two levels. Also disclosed are a system, method and computer program product for processing document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of ...
    3. Method and system for segmenting and identifying events in images using spoken annotations

      A method for automatically organizing digitized photographic images into events based on spoken annotations comprises the steps of: providing natural-language text based on spoken annotations corresponding to at least some of the photographic images; extracting predetermined information from the natural-language text that characterizes the annotations of the images; segmenting the images into events by examining each annotation for the presence of certain categories of information which are indicative of a boundary between events; and identifying each event by assembling the categories of information into event descriptions. The invention further comprises the step of summarizing each event by selecting and arranging ...
    4. Method and apparatus providing capitalization recovery for text

      A method for capitalizing text in a document includes processing a reference corpus to construct a plurality of dictionaries of capitalized terms, where the plurality of dictionaries include a singleton dictionary and a phrase dictionary. Each record in the singleton dictionary contains a word in lowercase, a range of phrase lengths m:n for capitalized phrases that the word begins, where m is a minimum phrase length and n is a maximum phrase length, and where each record in the phrase dictionary includes a multi-word phrase in lowercase. The method adds proper capitalization to an input monocase document by capitalizing ...
    5. ACM SIGKDD Explorations special issue on NLP and Text Mining

      The June 2005 issue of ACM SIGKDD Explorations is a special issue on Natural Language Processing and Text Mining. Read full text at http://www.acm.org/sigs/sigkdd/explorations/issue.php?issue=current. Articles include "Text Mining and Natural Language Processing: Introduction for the Special Issue" by Anne Kao, Steve Poteet; "Mining Knowledge from Text Using Information Extraction" by Raymond J. Mooney, Razvan Bunescu; "Instance Filtering for Entity Recognition" by Alfio Massimiliano Gliozzo, Claudio Giuliano, Raffaella [...]
    6. Probabilistic record linkage model derived from training data

      A method of training a system from examples achieves high accuracy by finding the optimal weighting of different clues indicating whether two data items, such as database records, should be matched or linked. The trained system provides three possible outputs when presented with two data items: "yes", "no", or "I don't know" (human intervention required). A maximum entropy model can be used to determine whether the two records should be linked or matched. Using the trained maximum entropy model, a high probability indicates that the pair should be linked, a low probability indicates that the pair should not be ...
    7. System for Chinese tokenization and named entity recognition

      A system (100, 200) for tokenization and named entity recognition of an ideographic language is disclosed. In the system, a word lattice is generated for a string of ideographic characters using finite state grammars (150) and a system lexicon (240). Segmented text is generated by determining word boundaries in the string of ideographic characters using the word lattice dependent upon a contextual language model (152A) and one or more entity language models (152B). One or more named entities are recognized in the string of ideographic characters using the word lattice dependent upon the contextual language model (152A) and the one or ...