1. Articles in category: NER

    337-360 of 374 « 1 2 ... 12 13 14 15 16 »
    1. Natural Language Processing in aid of FlyBase curators.

      BMC Bioinformatics. 2008 Apr 14;9(1):193. Authors: Karamanis N, Seal R, Lewin I, McQuilton P, Vlachos A, Gasperin C, Drysdale R, Briscoe T. ABSTRACT: BACKGROUND: Despite increasing interest in applying Natural Language Processing (NLP) to biomedical text, whether this technology can facilitate tasks such as database curation remains unclear. RESULTS: PaperBrowser is the first NLP-powered interface that was developed under a user-centered approach to improve the way in which FlyBase curators navigate an article. In this paper, we first discuss how observing curators at work informed the design ...
    2. Cost-Effective Web Search in Bootstrapping for Named Entity Recognition

      In this paper, we propose a cost-effective search strategy framework to extract keywords in the same semantic class from the Web. Constructing a dictionary based on the bootstrapping technique is one promising approach to harnessing knowledge scattered around the Web. Open web application programming interfaces (APIs) are powerful tools for the knowledge-gathering process. However, we have to consider the cost of API calls because too many queries can overload the search engines, and the search engines also limit the number of API calls. Our goal is to optimize a search strategy that can collect as many new words as possible with the ...
    3. Labeling Categories and Relationships in an Evolving Social Network

      Modeling and naming general entity-entity relationships is challenging in the construction of social networks. Given a seed denoting a person name, we utilize the Google search engine, an NER (Named Entity Recognizer) parser, and the CODC (Co-Occurrence Double Check) formula to construct an evolving social network. For each entity pair in the network, we try to label their categories and relationships. Firstly, we utilize the Open Directory Project (ODP) resource, which is the largest human-edited directory of the web, to build a directed graph, and then use three ranking algorithms, PageRank, HITS, and a Markov chain random process, to extract potential categories defined in ...
      Mentions: Markov Taipei Google
    4. An Executable Survey Of Advances In Biomedical Named Entity Recognition.

      BANNER: an executable survey of advances in biomedical named entity recognition. Pac Symp Biocomput. 2008;:652-63. Authors: Leaman R, Gonzalez G. There has been an increasing amount of research on biomedical named entity recognition, the most basic text extraction problem, resulting in significant progress by different research teams around the world. This has created a need for a freely-available, open source system implementing the advances described in the literature. In this paper we present BANNER, an open-source, executable survey of advances in biomedical named entity recognition, intended to serve as a benchmark for the field. BANNER is implemented ...
    5. NER Demos on the Web

      Here’s a list of demos for Named Entity Recognition technologies: YooName (this is our demo); LingPipe, Alias-i; Cognitive Computation Group, University of Illinois at Urbana-Champaign; FreeLing, Open-Source Suite of Language Analyzers; NET, University of Colorado; POSBIOTM/W (biomedical), PosTech; ClearForest, Reuters; TriFeed, TriFeed Ltd.; FactMine (for Dutch language), University of Groningen in The Netherlands; Aventinus (for Swedish language), University of Gothenburg; Natural [...]
    6. Classifier subset selection for biomedical named entity recognition

      Abstract: A classifier ensembling approach is considered for the biomedical named entity recognition task. A vote-based classifier selection scheme having an intermediate level of search complexity between static classifier selection and real-valued and class-dependent weighting approaches is developed. Assuming that the reliability of the predictions of each classifier differs among classes, the proposed approach is based on selection of the classifiers by taking into account their individual votes. A wide set of classifiers, each based on a different set of features and modeling parameter settings, is generated for this purpose. A genetic algorithm is developed so as to label the predictions of ...
    7. A web-based Bengali news corpus for named entity recognition

      Abstract: The rapid development of language resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. A tagged Bengali news corpus has been developed from the web archive of a widely read Bengali newspaper. A web crawler retrieves the web pages in Hyper Text Markup Language (HTML) format from the news archive. At present, the corpus contains approximately 34 million wordforms. Named Entity Recognition (NER) systems based on pattern-based shallow parsing, with or without linguistic knowledge, have been developed using a part of this corpus. The NER system that uses linguistic knowledge has ...
    8. What is a Named Entity?

      To our surprise, when it comes to defining the task of Named Entity Recognition (NER), nobody seems to question including temporal expressions and measures. This probably deserves some historic consideration, since the domain was popularized by information extraction competitions where, clearly, the date and the money generated by the event were crucial. But we receive [...]
      Mentions: London German Person
    9. Domain Information for Fine-Grained Person Name Categorization

      Named Entity Recognition has become the basis of many Natural Language Processing applications. However, the existing coarse-grained named entity recognizers are insufficient for complex applications such as Question Answering, Internet search engines or ontology population. In this paper, we propose a domain distribution approach according to which names that occur in the same domains belong to the same fine-grained category. For our study, we generate a relevant domain resource by mapping and ranking the words from the WordNet glosses to their WordNetDomains. This approach allows us to capture the semantic information of the context around the named entity and thus to ...
    10. Kernel approaches for genic interaction extraction.

      Bioinformatics. 2008 Jan 1;24(1):118-26. Authors: Kim S, Yoon J, Yang J. MOTIVATION: Automatic knowledge discovery and efficient information access such as named entity recognition and relation extraction between entities have recently become critical issues in the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long range dependencies, discontiguous word patterns and semantic relations for which the ...
    11. Semantic taxonomy induction from heterogenous evidence

      Semantic Taxonomy Induction from Heterogenous Evidence. Rion Snow (Computer Science Department, Stanford University, Stanford, CA 94305; rion@cs.stanford.edu), Daniel Jurafsky (Linguistics Department, Stanford University, Stanford, CA 94305; jurafsky@stanford.edu), Andrew Y. Ng (Computer Science Department, Stanford University, Stanford, CA 94305; ang@cs.stanford.edu). Abstract: We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on ind
    12. Named entity (NE) interface for multiple client application programs

      The present invention is a named entity (NE) interface to a linguistic analysis layer. The interface exposes each input sentence to the NE recognizers of all applications and returns all recognized NEs. Thus, the present invention can accommodate NEs which dynamically change in the applications, because each input string will be handed to the applications for NE recognition. The present invention also includes a data structure which is a normalized form of recognized NEs.
    13. Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron.

      Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron. Michael Collins, AT&T Labs-Research, Florham Park, New Jersey (mcollins@research.att.com). Abstract: This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorit
    14. Machine Learning Methods in Natural Language Processing

      Machine Learning Methods in Natural Language Processing. Michael Collins, MIT CSAIL. Some NLP Problems: information extraction (named entities; relationships between entities); finding linguistic structure (part-of-speech tagging; parsing); machine translation. Common Themes: need to learn mapping from one discrete structure to another: strings to hidden state sequences (named-entity extraction, part-of-speech tagging); strings to strings (machine translation); strings to underlying trees: Pa
    15. Discriminative Reranking for Natural Language Parsing.

      Discriminative Reranking for Natural Language Parsing. Michael Collins and Terry Koo, Massachusetts Institute of Technology. This paper considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength
    16. State-of-the-art anonymisation of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Jun 28. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymisation of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN: We introduce ...
    17. Rapidly retargetable approaches to de-identification in medical records.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. Authors: Wellner B, Huyck M, Mardis S, Aberdeen J, Morgan A, Peshkin L, Yeh A, Hitzeman J, Hirschman L. OBJECTIVE: This paper describes a successful approach to de-identification that was developed to participate in a recent AMIA-sponsored challenge evaluation. METHOD: Our approach focused on rapid adaptation of two existing toolkits for named entity recognition, Carafe and LingPipe. RESULTS: The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only ...
    18. State-of-the-art anonymization of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):574-80. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN ...
    19. Semi-Supervised Named Entity Recognition

      YooName originates from the PhD research titled Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision. The thesis was successfully defended at the University of Ottawa, Canada, and is now available online. Here’s the abstract: Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper [...]
    20. A conditional random fields approach to biomedical named entity recognition

      Abstract: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full text features and external resource features. All features incorporated in this system are described in detail, and the impacts of different feature sets on the performance of the system are evaluated. In order to improve the performance of the system, post-processing modules are exploited to deal with the abbreviation phenomena, cascaded named entity and boundary ...
    21. Domain adaptation vs. transfer learning

      The standard classification setting is an input distribution p(X) and a label distribution p(Y|X). Roughly speaking, domain adaptation (DA) is the problem that occurs when p(X) changes between training and test. Transfer learning (TL) is the problem that occurs when p(Y|X) changes between training and test. In other words, in DA the input distribution changes but the labels remain the same; in TL, the input distribution stays the same, but the labels change. The two problems are clearly quite si
    22. Incorporating Dictionary Features into Conditional Random Fields for Gene/Protein Named Entity Recognition

      Biomedical Named Entity Recognition (BioNER) is an important preliminary step for biomedical text mining. Previous researchers built dictionaries of gene/protein names from online databases and incorporated them into machine learning models as features, but the effects were very limited. This paper gives a quality assessment of four dictionaries derived from online resources, and investigates the impacts of two factors (i.e., dictionary coverage and noisy terms) that may lead to the poor performance of dictionary features. Experiments are performed by comparing performances of the external dictionaries and a dictionary derived from the GENETAG corpus, using Conditional Random Fields (CRFs) with ...
      Mentions: China Genetag Dalian
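
The distinction drawn in the "Domain adaptation vs. transfer learning" entry above can be made concrete with a small synthetic sketch. Everything in it is an illustrative assumption rather than something taken from the linked article: the Gaussian inputs, the two threshold labeling rules, and the helper names (`sample`, `mean_x`, `source_rule_accuracy`) are hypothetical. It only shows which of p(X) and p(Y|X) moves in each setting, following the definitions given in that entry.

```python
import random


def label_rule_source(x):
    # Assumed p(Y|X) seen at training time: label 1 when x is positive.
    return 1 if x > 0.0 else 0


def label_rule_changed(x):
    # Assumed alternative p(Y|X): label 1 only when x exceeds 1.
    return 1 if x > 1.0 else 0


def sample(n, mean, label_rule, seed):
    # Draw n inputs from a unit-variance Gaussian p(X) and label each one.
    rng = random.Random(seed)
    return [(x, label_rule(x)) for x in (rng.gauss(mean, 1.0) for _ in range(n))]


def mean_x(data):
    # Empirical mean of the inputs, a crude summary of p(X).
    return sum(x for x, _ in data) / len(data)


def source_rule_accuracy(data):
    # How often the training-time labeling rule agrees with the given labels.
    return sum(label_rule_source(x) == y for x, y in data) / len(data)


# Training data: inputs centered at 0, labeled with the source rule.
train = sample(1000, 0.0, label_rule_source, seed=0)

# Domain adaptation (as defined in the entry): p(X) shifts, p(Y|X) is unchanged.
da_test = sample(1000, 2.0, label_rule_source, seed=1)

# Transfer learning (as defined in the entry): p(X) is unchanged, p(Y|X) changes.
tl_test = sample(1000, 0.0, label_rule_changed, seed=2)

print("mean x:", mean_x(train), mean_x(da_test), mean_x(tl_test))
print("source-rule accuracy:", source_rule_accuracy(da_test), source_rule_accuracy(tl_test))
```

In this toy setup the source labeling rule stays fully consistent with the domain-adaptation test set, because only the inputs moved, whereas it disagrees with part of the transfer-learning test set, because the labels themselves were redefined. A learned model would only approximate the source rule, so in practice a shift in p(X) can still hurt accuracy.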