1. Articles in category: NER

    385-408 of 415
    1. A web-based Bengali news corpus for named entity recognition

      Abstract: The rapid development of language resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. A tagged Bengali news corpus has been developed from the web archive of a widely read Bengali newspaper. A web crawler retrieves the web pages in Hyper Text Markup Language (HTML) format from the news archive. At present, the corpus contains approximately 34 million wordforms. Named Entity Recognition (NER) systems based on pattern-based shallow parsing, with or without linguistic knowledge, have been developed using a part of this corpus. The NER system that uses linguistic knowledge has ...
    2. What is a Named Entity?

      To our surprise, when it comes to defining the task of Named Entity Recognition (NER), nobody seems to question including temporal expressions and measures. This probably deserves some historic consideration, since the domain was popularized by information extraction competitions where, clearly, the date and the money generated by the event were crucial. But we receive [...]
      Mentions: London German Person
    3. Domain Information for Fine-Grained Person Name Categorization

      Named Entity Recognition became the basis of many Natural Language Processing applications. However, the existing coarse-grained named entity recognizers are insufficient for complex applications such as Question Answering, Internet Search engines or Ontology population. In this paper, we propose a domain distribution approach according to which names which occur in the same domains belong to the same fine-grained category. For our study, we generate a relevant domain resource by mapping and ranking the words from the WordNet glosses to their WordNetDomains. This approach allows us to capture the semantic information of the context around the named entity and thus to ...
    4. Kernel approaches for genic interaction extraction.

      Bioinformatics. 2008 Jan 1;24(1):118-26. Authors: Kim S, Yoon J, Yang J. MOTIVATION: Automatic knowledge discovery and efficient information access such as named entity recognition and relation extraction between entities have recently become critical issues in the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long range dependencies, discontiguous word patterns and semantic relations for which the ...
    5. Semantic taxonomy induction from heterogenous evidence

      Rion Snow, Computer Science Department, Stanford University, Stanford, CA 94305, rion@cs.stanford.edu; Daniel Jurafsky, Linguistics Department, Stanford University, Stanford, CA 94305, jurafsky@stanford.edu; Andrew Y. Ng, Computer Science Department, Stanford University, Stanford, CA 94305, ang@cs.stanford.edu. Abstract: We propose a novel algorithm for inducing semantic taxonomies. Previous algorithms for taxonomy induction have typically focused on ind
    6. Named entity (NE) interface for multiple client application programs

      The present invention is a named entity (NE) interface to a linguistic analysis layer. The interface exposes each input sentence to the NE recognizers of all applications and returns all recognized NEs. Thus, the present invention can accommodate NEs which dynamically change in the applications, because each input string will be handed to the applications for NE recognition. The present invention also includes a data structure which is a normalized form of recognized NEs.
    7. Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron.

      Michael Collins, AT&T Labs-Research, Florham Park, New Jersey, mcollins@research.att.com. Abstract: This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorit
    8. Machine Learning Methods in Natural Language Processing

      Michael Collins, MIT CSAIL. Some NLP Problems: information extraction (named entities, relationships between entities); finding linguistic structure (part-of-speech tagging, parsing); machine translation. Common Themes: need to learn a mapping from one discrete structure to another. Strings to hidden state sequences: named-entity extraction, part-of-speech tagging. Strings to strings: machine translation. Strings to underlying trees: Pa
    9. Discriminative Reranking for Natural Language Parsing.

      Michael Collins and Terry Koo, Massachusetts Institute of Technology. This paper considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength
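      The reranking setup described in this excerpt can be sketched roughly as follows. This is only an illustration under stated assumptions: the linear scoring form, the feature name `right_branching`, the weight values, and the function `rerank` are all hypothetical and are not taken from the paper, where the second model's weights would be learned from training data.

      ```python
      import math

      def rerank(candidates, weights, base_weight=1.0):
          """Reorder candidate parses with a second, feature-based model.

          candidates: list of (parse, base_probability, feature_dict) tuples,
                      as might come from a base parser (hypothetical format).
          weights:    feature name -> weight (hand-set here for illustration).
          """
          def score(candidate):
              _, base_prob, features = candidate
              # Combine the base parser's log-probability with weighted features.
              feature_score = sum(weights.get(name, 0.0) * value
                                  for name, value in features.items())
              return base_weight * math.log(base_prob) + feature_score

          return sorted(candidates, key=score, reverse=True)

      # Two candidate parses; the base parser prefers parse_a.
      candidates = [
          ("parse_a", 0.6, {"right_branching": 0.0}),
          ("parse_b", 0.4, {"right_branching": 1.0}),
      ]

      initial = rerank(candidates, weights={})                        # base ranking only
      reranked = rerank(candidates, weights={"right_branching": 1.0}) # extra evidence
      print(initial[0][0], reranked[0][0])
      ```

      With no feature weights the order follows the base probabilities; once the assumed feature carries enough weight, the second candidate moves to the top.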
    10. State-of-the-art anonymisation of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Jun 28. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymisation of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN: We introduce ...
    11. Rapidly retargetable approaches to de-identification in medical records.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):564-73. Authors: Wellner B, Huyck M, Mardis S, Aberdeen J, Morgan A, Peshkin L, Yeh A, Hitzeman J, Hirschman L. OBJECTIVE: This paper describes a successful approach to de-identification that was developed to participate in a recent AMIA-sponsored challenge evaluation. METHOD: Our approach focused on rapid adaptation of existing toolkits for named entity recognition using two existing toolkits, Carafe and LingPipe. RESULTS: The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only ...
    12. State-of-the-art anonymization of medical records using an iterative machine learning framework.

      J Am Med Inform Assoc. 2007 Sep-Oct;14(5):574-80. Authors: Szarvas G, Farkas R, Busa-Fekete R. OBJECTIVE: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available for non-hospital researchers as well, to facilitate research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act. DESIGN ...
    13. Semi-Supervised Named Entity Recognition

      YooName originates from the PhD research titled Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision. The thesis was successfully defended at University of Ottawa, Canada, and is now available online. Here’s the abstract: * * * Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper [...]
    14. A conditional random fields approach to biomedical named entity recognition

      Abstract: Named entity recognition is a fundamental task in biomedical data mining. In this letter, a named entity recognition system based on CRFs (Conditional Random Fields) for biomedical texts is presented. The system makes extensive use of a diverse set of features, including local features, full text features and external resource features. All features incorporated in this system are described in detail, and the impacts of different feature sets on the performance of the system are evaluated. In order to improve the performance of the system, post-processing modules are exploited to deal with the abbreviation phenomena, cascaded named entity and boundary ...
    15. Domain adaptation vs. transfer learning

      The standard classification setting is an input distribution p(X) and a label distribution p(Y|X). Roughly speaking, domain adaptation (DA) is the problem that occurs when p(X) changes between training and test. Transfer learning (TL) is the problem that occurs when p(Y|X) changes between training and test. In other words, in DA the input distribution changes but the labels remain the same; in TL, the input distribution stays the same, but the labels change. The two problems are clearly quite si
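      The distinction drawn in this excerpt can be written out symbolically. The subscripts "train" and "test" are notation introduced here for illustration; the excerpt itself only speaks of p(X) and p(Y|X) changing between training and test.

      ```latex
      \begin{align*}
      \text{Domain adaptation (DA):} \quad
        & p_{\text{train}}(X) \neq p_{\text{test}}(X),
        & p_{\text{train}}(Y \mid X) &= p_{\text{test}}(Y \mid X) \\
      \text{Transfer learning (TL):} \quad
        & p_{\text{train}}(X) = p_{\text{test}}(X),
        & p_{\text{train}}(Y \mid X) &\neq p_{\text{test}}(Y \mid X)
      \end{align*}
      ```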
    16. Incorporating Dictionary Features into Conditional Random Fields for Gene/Protein Named Entity Recognition

      Biomedical Named Entity Recognition (BioNER) is an important preliminary step for biomedical text mining. Previous researchers built dictionaries of gene/protein names from online databases and incorporated them into machine learning models as features, but the effects were very limited. This paper gives a quality assessment of four dictionaries derived from online resources, and investigates the impacts of two factors (i.e., dictionary coverage and noisy terms) that may lead to the poor performance of dictionary features. Experiments are performed by comparing performances of the external dictionaries and a dictionary derived from GENETAG corpus, using Conditional Random Fields (CRFs) with ...
      Mentions: China Genetag Dalian
    17. Synchronicity

      Google and Microsoft are both active in the Named Entity Recognition (NER) field, and more notably, in Named Entity Disambiguation. This task consists of “disambiguating between multiple named entities that can be denoted by the same proper name” (Bunescu and Pasca 2006). For instance, politicians, Internet entrepreneurs and criminals share the name of James Clark. [...]
    18. F-measure versus Accuracy

      I had a bit of a revelation a few years ago. In retrospect, it's obvious. And I'm hoping someone else out there hasn't realized this because otherwise I'll feel like an idiot. The realization was that F-measure (for a binary classification problem) is not invariant under label switching. That is, if you just change which class it is that you call "positive" and which it is that you call "negative", then your overall F-measure will change. What this means is that you have to be careful, when usin
      Mentions: Pepsi
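      The point made in the "F-measure versus Accuracy" excerpt, that F-measure for a binary problem changes when the positive and negative labels are swapped while accuracy does not, can be checked with a small sketch. The toy labels "A" and "B", the data, and the helper names `f_measure` and `scores` are made up for illustration and do not come from the article.

      ```python
      def f_measure(tp, fp, fn):
          """Balanced F-measure (F1) from counts for the class treated as positive."""
          if tp == 0:
              return 0.0
          precision = tp / (tp + fp)
          recall = tp / (tp + fn)
          return 2 * precision * recall / (precision + recall)

      def scores(gold, pred, positive):
          """Return (F-measure, accuracy) when `positive` is the positive class."""
          pairs = list(zip(gold, pred))
          tp = sum(g == positive and p == positive for g, p in pairs)
          fp = sum(g != positive and p == positive for g, p in pairs)
          fn = sum(g == positive and p != positive for g, p in pairs)
          accuracy = sum(g == p for g, p in pairs) / len(pairs)
          return f_measure(tp, fp, fn), accuracy

      # Toy, imbalanced data: two "A" items and eight "B" items.
      gold = ["A", "A"] + ["B"] * 8
      pred = ["A", "B"] + ["B"] * 7 + ["A"]

      f_a, acc_a = scores(gold, pred, positive="A")
      f_b, acc_b = scores(gold, pred, positive="B")
      print(f_a, acc_a)  # F-measure with "A" as positive, accuracy
      print(f_b, acc_b)  # F-measure with "B" as positive, same accuracy
      ```

      On this toy data the F-measure is 0.5 with "A" as the positive class and 0.875 with "B" as the positive class, while accuracy is 0.8 either way.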
    19. Business Intelligence and Text Analytics

      With this kind of news becoming more frequent, it’s safe to say that named entity recognition technologies are playing an increasingly significant role in business intelligence (BI) and enterprise search (ES): “The marriage of business intelligence and text analytics is starting to have a profound impact on companies in several industries, including health care, insurance and [...]