1. Articles in category: NER

    361-381 of 381
    1. State-of-the-art anonymisation of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Jun 28. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymisation of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN: We introduce ...
    2. Rapidly retargetable approaches to de-identification in medical records.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. Authors: Wellner B, Huyck M, Mardis S, Aberdeen J, Morgan A, Peshkin L, Yeh A, Hitzeman J, Hirschman L. OBJECTIVE: This paper describes a successful approach to de-identification that was developed to participate in a recent AMIA-sponsored challenge evaluation. METHOD: Our approach focused on rapid adaptation of two existing toolkits for named entity recognition, Carafe and LingPipe. RESULTS: The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only ...
    3. State-of-the-art anonymization of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):574-80. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN ...
    4. Semi-Supervised Named Entity Recognition

      YooName originates from the PhD research titled Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision. The thesis was successfully defended at University of Ottawa, Canada, and is now available online. Here’s the abstract: * * * Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper [...]
    5. A conditional random fields approach to biomedical named entity recognition

      Abstract: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full text features and external resource features. All features incorporated in this system are described in detail, and the impacts of different feature sets on the performance of the system are evaluated. In order to improve the performance of the system, post-processing modules are exploited to deal with the abbreviation phenomena, cascaded named entity and boundary ...
    6. Domain adaptation vs. transfer learning

      The standard classification setting is an input distribution p(X) and a label distribution p(Y|X). Roughly speaking, domain adaptation (DA) is the problem that occurs when p(X) changes between training and test. Transfer learning (TL) is the problem that occurs when p(Y|X) changes between training and test. In other words, in DA the input distribution changes but the labels remain the same; in TL, the input distribution stays the same, but the labels change. The two problems are clearly quite si
    7. Incorporating Dictionary Features into Conditional Random Fields for Gene/Protein Named Entity Recognition

      Biomedical Named Entity Recognition (BioNER) is an important preliminary step for biomedical text mining. Previous researchers built dictionaries of gene/protein names from online databases and incorporated them into machine learning models as features, but the effects were very limited. This paper gives a quality assessment of four dictionaries derived from online resources, and investigates the impacts of two factors (i.e., dictionary coverage and noisy terms) that may lead to the poor performance of dictionary features. Experiments are performed by comparing performances of the external dictionaries and a dictionary derived from GENETAG corpus, using Conditional Random Fields (CRFs) with ...
      Mentions: China Genetag Dalian
    8. Synchronicity

      Google and Microsoft are both active in the Named Entity Recognition (NER) field, and more notably, in Named Entity Disambiguation. This task consists of “disambiguating between multiple named entities that can be denoted by the same proper name” (Bunescu and Pasca 2006). For instance, politicians, Internet entrepreneurs and criminals share the name of James Clark. [...]
    9. F-measure versus Accuracy

      I had a bit of a revelation a few years ago. In retrospect, it's obvious. And I'm hoping someone else out there hasn't realized this because otherwise I'll feel like an idiot. The realization was that F-measure (for a binary classification problem) is not invariant under label switching. That is, if you just change which class it is that you call "positive" and which it is that you call "negative", then your overall F-measure will change. What this means is that you have to be careful, when usin
      Mentions: Pepsi
    10. Business Intelligence and Text Analytics

      With this kind of news becoming more frequent, it’s safe to say that named entity recognition technologies are playing an increasingly significant role in business intelligence (BI) and enterprise search (ES): “The marriage of business intelligence and text analytics is starting to have a profound impact on companies in several industries, including health care, insurance and [...]
    11. System, method and computer program product for performing unstructured information management and automatic text analysis, including a search operator functioning as a Weighted And (wand)

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. A search query includes a search operator consisting of a plurality of search sub-expressions each having an associated weight value. The search engine returns a document or documents having a weight value ...
    12. System, method and computer program product for performing unstructured information management and automatic text analysis, and providing multiple document views derived from different document tokenizations

      Disclosed is a system architecture, components and a searching technique for an Unstructured Information Management System (UIMS). The UIMS may be provided as middleware for the effective management and interchange of unstructured information over a wide array of information sources. The architecture generally includes a search engine, data storage, analysis engines containing pipelined document annotators and various adapters. The searching technique makes use of a two-level searching technique. Also disclosed is a system, method and computer program product to process document data. The method includes inputting a document and operating at least one text analysis engine that comprises a plurality of ...
    13. Method and system for segmenting and identifying events in images using spoken annotations

      A method for automatically organizing digitized photographic images into events based on spoken annotations comprises the steps of: providing natural-language text based on spoken annotations corresponding to at least some of the photographic images; extracting predetermined information from the natural-language text that characterizes the annotations of the images; segmenting the images into events by examining each annotation for the presence of certain categories of information which are indicative of a boundary between events; and identifying each event by assembling the categories of information into event descriptions. The invention further comprises the step of summarizing each event by selecting and arranging ...
    14. Method and apparatus providing capitalization recovery for text

      A method for capitalizing text in a document includes processing a reference corpus to construct a plurality of dictionaries of capitalized terms, where the plurality of dictionaries include a singleton dictionary and a phrase dictionary. Each record in the singleton dictionary contains a word in lowercase, a range of phrase lengths m:n for capitalized phrases that the word begins, where m is a minimum phrase length and n is a maximum phrase length, and where each record in the phrase dictionary includes a multi-word phrase in lowercase. The method adds proper capitalization to an input monocase document by capitalizing ...
    15. ACM SIGKDD Explorations special issue on NLP and Text Mining

      The June 2005 issue of ACM SIGKDD Explorations is a special issue on Natural Language Processing and Text Mining. Read full text at http://www.acm.org/sigs/sigkdd/explorations/issue.php?issue=current Contents: "Text Mining and Natural Language Processing: Introduction for the Special Issue", by Anne Kao, Steve Poteet; "Mining Knowledge from Text Using Information Extraction", by Raymond J. Mooney, Razvan Bunescu; "Instance Filtering for Entity Recognition", by Alfio Massimiliano Gliozzo, Claudio Giuliano, Raffaella [...]
    16. Probabilistic record linkage model derived from training data

      A method of training a system from examples achieves high accuracy by finding the optimal weighting of different clues indicating whether two data items such as database records should be matched or linked. The trained system provides three possible outputs when presented with two data items: yes, no or I don't know (human intervention required). A maximum entropy model can be used to determine whether the two records should be linked or matched. Using the trained maximum entropy model, a high probability indicates that the pair should be linked, a low probability indicates that the pair should not be ...
    17. System for Chinese tokenization and named entity recognition

      A system (100, 200) for tokenization and named entity recognition of ideographic language is disclosed. In the system, a word lattice is generated for a string of ideographic characters using finite state grammars (150) and a system lexicon (240). Segmented text is generated by determining word boundaries in the string of ideographic characters using the word lattice dependent upon a contextual language model (152A) and one or more entity language models (152B). One or more named entities are recognized in the string of ideographic characters using the word lattice dependent upon the contextual language model (152A) and the one or ...
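The "F-measure versus Accuracy" excerpt in the listing above observes that F-measure for a binary problem is not invariant under label switching, while the excerpt itself is cut off before giving an example. The following is a minimal sketch of that observation, not taken from the article: the confusion counts are made up for illustration, and the helper names `f1` and `accuracy` are hypothetical.

```python
def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(tp, fp, fn, tn):
    """Fraction of all examples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical confusion counts with class A treated as "positive".
tp, fp, fn, tn = 10, 5, 5, 80

# Calling the other class "positive" swaps TP with TN and FP with FN.
f1_a = f1(tp, fp, fn)   # class A as positive
f1_b = f1(tn, fn, fp)   # class B as positive

acc_a = accuracy(tp, fp, fn, tn)
acc_b = accuracy(tn, fn, fp, tp)

print(f"F1 with A positive: {f1_a:.4f}")
print(f"F1 with B positive: {f1_b:.4f}")
print(f"Accuracy either way: {acc_a:.4f} / {acc_b:.4f}")
```

With these assumed counts the two F1 values differ because true negatives never enter the F1 formula, whereas accuracy treats both classes symmetrically and does not change.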
  1. Categories

    1. Default:

      Discourse, Entailment, Machine Translation, NER, Parsing, Segmentation, Semantic, Sentiment, Summarization, WSD
  2. Popular Articles

  3. Organizations in the News

    1. (2 articles) NLP
    2. (2 articles) NER
    3. (2 articles) BMC Med Inform Decis Mak
    4. (2 articles) POS
    5. (1 article) European Union
    6. (1 article) API
    7. (1 article) Markov
    8. (1 article) ICT
    9. (1 article) Genia
    10. (1 article) Sbar
    11. (1 article) Faculty of Mathematics
    12. (1 article) RDF
  4. Locations in the News

    1. (1 article) Slovenia
    2. (1 article) Ljubljana
  5. People in the News

    1. (1 article) Jozef Stefan Institute
    2. (1 article) Denny JC