Articles in category: NER

    409-423 of 423 « 1 2 ... 15 16 17 18
    1. Incorporating Dictionary Features into Conditional Random Fields for Gene/Protein Named Entity Recognition

      Biomedical Named Entity Recognition (BioNER) is an important preliminary step for biomedical text mining. Previous researchers built dictionaries of gene/protein names from online databases and incorporated them into machine learning models as features, but the effects were very limited. This paper gives a quality assessment of four dictionaries derived from online resources, and investigates the impact of two factors (i.e., dictionary coverage and noisy terms) that may lead to the poor performance of dictionary features. Experiments are performed by comparing the performance of the external dictionaries and a dictionary derived from the GENETAG corpus, using Conditional Random Fields (CRFs) with ...
      Mentions: China Genetag Dalian
    2. Synchronicity

      Google and Microsoft are both active in the Named Entity Recognition (NER) field, and more notably, in Named Entity Disambiguation. This task consists of “disambiguating between multiple named entities that can be denoted by the same proper name” (Bunescu and Pasca 2006). For instance, politicians, Internet entrepreneurs and criminals share the name of James Clark. [...]
    3. F-measure versus Accuracy

      I had a bit of a revelation a few years ago. In retrospect, it's obvious. And I'm hoping someone else out there hasn't realized this because otherwise I'll feel like an idiot. The realization was that F-measure (for a binary classification problem) is not invariant under label switching. That is, if you just change which class it is that you call "positive" and which it is that you call "negative", then your overall F-measure will change. What this means is that you have to be careful, when usin
      Mentions: Pepsi
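The label-switching point in item 3 can be checked numerically. The following is a minimal sketch, assuming a small toy data set invented here for illustration (the labels "A" and "B" and the helper names `f1` and `accuracy` are not from the article). It computes F-measure twice on the same predictions, once with each class treated as "positive", and compares both with accuracy.

```python
def f1(gold, pred, positive):
    """F-measure (F1) for a binary task, treating `positive` as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero (or undefined)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(gold, pred):
    """Fraction of items labeled correctly; does not depend on a positive class."""
    return sum(1 for g, p in zip(gold, pred) if g == p) / len(gold)


# Toy data (an assumption for illustration): 2 items of class A, 8 of class B.
gold = ["A", "A"] + ["B"] * 8
pred = ["A", "B"] + ["B"] * 7 + ["A"]

f1_a = f1(gold, pred, positive="A")  # rare class treated as positive
f1_b = f1(gold, pred, positive="B")  # frequent class treated as positive
acc = accuracy(gold, pred)

print(f"F1 with A positive: {f1_a:.3f}")
print(f"F1 with B positive: {f1_b:.3f}")
print(f"Accuracy:           {acc:.3f}")
```

On this toy data the two F-measure values differ even though the predictions are identical, while accuracy is a single number that stays the same whichever class is called positive.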
    4. Business Intelligence and Text Analytics

      With this kind of news becoming more frequent, it’s safe to say that named entity recognition technologies are playing an increasingly significant role in business intelligence (BI) and enterprise search (ES): “The marriage of business intelligence and text analytics is starting to have a profound impact on companies in several industries, including health care, insurance and [...]
    5. System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted And (wand)

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator consisting of a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value ...
    6. System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed are a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of ...
    7. Method and system for segmenting and identifying events in images using spoken annotations

      A method for automatically organizing digitized photographic images into events based on spoken annotations comprises the steps of: providing natural-language text based on spoken annotations corresponding to at least some of the photographic images; extracting predetermined information from the natural-language text that characterizes the annotations of the images; segmenting the images into events by examining each annotation for the presence of certain categories of information which are indicative of a boundary between events; and identifying each event by assembling the categories of information into event descriptions. The invention further comprises the step of summarizing each event by selecting and arranging ...
    8. Method and apparatus providing capitalization recovery for text

      A method for capitalizing text in a document includes processing a reference corpus to construct a plurality of dictionaries of capitalized terms, where the plurality of dictionaries include a singleton dictionary and a phrase dictionary. Each record in the singleton dictionary contains a word in lowercase, a range of phrase lengths m:n for capitalized phrases that the word begins, where m is a minimum phrase length and n is a maximum phrase length, and where each record in the phrase dictionary includes a multi-word phrase in lowercase. The method adds proper capitalization to an input monocase document by capitalizing ...
    9. ACM SIGKDD Explorations special issue on NLP and Text Mining

      The June 2005 issue of ACM SIGKDD Explorations is a special issue on Natural Language Processing and Text Mining. Read full text at http://www.acm.org/sigs/sigkdd/explorations/issue.php?issue=current. Contents: "Text Mining and Natural Language Processing: Introduction for the Special Issue" (Anne Kao, Steve Poteet); "Mining Knowledge from Text Using Information Extraction" (Raymond J. Mooney, Razvan Bunescu); "Instance Filtering for Entity Recognition" (Alfio Massimiliano Gliozzo, Claudio Giuliano, Raffaella [...]
    10. Probabilistic record linkage model derived from training data

      A method of training a system from examples achieves high accuracy by finding the optimal weighting of different clues indicating whether two data items such as database records should be matched or linked. The trained system provides three possible outputs when presented with two data items: yes, no or I don't know (human intervention required). A maximum entropy model can be used to determine whether the two records should be linked or matched. Using the trained maximum entropy model, a high probability indicates that the pair should be linked, a low probability indicates that the pair should not be ...
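The three-way output described in item 10 (yes, no, or I don't know) can be illustrated with a probability and two thresholds. This is a minimal sketch under stated assumptions, not the patented method: the clue names, weights, bias and thresholds below are all hypothetical values chosen for illustration, and the probability is computed with a logistic function, which is the form a maximum entropy model takes when there are two outcomes and binary features.

```python
import math

# Hypothetical clue weights and bias (assumptions for illustration only;
# in a trained system these would be learned from labeled record pairs).
WEIGHTS = {"same_surname": 3.0, "same_birth_year": 2.0, "same_zip": 1.5}
BIAS = -3.0


def link_probability(active_clues):
    """Probability that two records match, given the set of clues that fired."""
    score = BIAS + sum(WEIGHTS[c] for c in active_clues if c in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def decide(probability, lower=0.1, upper=0.9):
    """Map a match probability to one of three outputs.

    The thresholds `lower` and `upper` are hypothetical; pairs that fall
    between them are deferred to a human reviewer.
    """
    if probability >= upper:
        return "yes"
    if probability <= lower:
        return "no"
    return "I don't know"


all_clues = decide(link_probability({"same_surname", "same_birth_year", "same_zip"}))
no_clues = decide(link_probability(set()))
one_clue = decide(link_probability({"same_surname"}))

print(all_clues, no_clues, one_clue, sep=" | ")
```

Widening the gap between the two thresholds sends more pairs to human review and leaves fewer automatic decisions, which is the trade-off the "I don't know" output exists to expose.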
    11. System for Chinese tokenization and named entity recognition

      A system (100, 200) for tokenization and named entity recognition of ideographic language is disclosed. In the system, a word lattice is generated for a string of ideographic characters using finite state grammars (150) and a system lexicon (240). Segmented text is generated by determining word boundaries in the string of ideographic characters using the word lattice dependent upon a contextual language model (152A) and one or more entity language models (152B). One or more named entities is recognized in the string of ideographic characters using the word lattice dependent upon the contextual language model (152A) and the one or ...