1. Articles in category: NER

    1. Text Mining for Discovery of Host–Pathogen Interactions

      Text processing systems now supplement the information needs of professionals across a variety of industries. Applications such as relationship extraction, information retrieval, document summarization, question answering, and multilingual machine translation demonstrate practical utility in terms of accuracy and speed. Significant drivers behind these advances stem from performance improvements in underlying technologies such as syntactic parsing, named entity recognition, and semantic interpretation. Text mining consolidates these and other language processing technologies to extract meaningful information. This chapter surveys the field of biomedical text mining and develops a case study to illustrate the underlying resources that are available, as well as the ...
    2. Natural language interface for driving adaptive scenarios

      A "Natural Language Script Interface" (NLSI) provides an interface and query system for automatically interpreting natural language inputs to select, execute, and/or otherwise present one or more scripts or other code to the user for further user interaction. In other words, the NLSI manages a pool of scripts or code, available from one or more local and/or remote sources, as a function of the user's natural language inputs. The NLSI operates either as a standalone application or as a component integrated into existing applications. Natural language inputs may be received from a variety of sources, and include ...
    3. Segmentation of strings into structured records

      A system for segmenting strings into component parts for use with a database management system. A reference table of string records is segmented into multiple substrings corresponding to database attributes. The substrings within an attribute are analyzed to provide a state model that assumes a beginning, a middle and an ending token topology for that attribute. A null token takes into account an empty attribute component, and copying of states allows for erroneous token insertions and misordering. Once the model is created from the clean data, the process breaks or parses an input record into a sequence of tokens. The ...
    4. Beyond genes, proteins, and abstracts: Identifying scientific claims from full-text biomedical articles.

      J Biomed Inform. 2009 Nov 6. Authors: Blake C. Massive increases in electronically available text have spurred a variety of natural language processing methods to automatically identify relationships from text; however, existing annotated collections comprise only bioinformatics (gene-protein) or clinical informatics (treatment-disease) relationships. This paper introduces the Claim Framework that reflects how authors across the biomedical spectrum communicate findings in empirical studies. The Framework captures different levels of evidence by differentiating between explicit and implicit claims, and by capturing underspecified claims such as correlations, comparisons, and observations. The results ...
    5. Named Entity Recognition Experiments on Turkish Texts

      Named entity recognition (NER) is one of the main information extraction tasks and research on NER from Turkish texts is known to be rare. In this study, we present a rule-based NER system for Turkish which employs a set of lexical resources and pattern bases for the extraction of named entities including the names of people, locations, organizations together with time/date and money/percentage expressions. The domain of the system is news texts and it does not utilize important clues of capitalization and punctuation since they may be missing in texts obtained from the Web or the output of ...
    6. Design of an Interface for Interactive Topic Detection and Tracking

      This paper presents the design of a new interface for interactive Topic Detection and Tracking (TDT) called Ievent. It is composed of 3 main views: a Cluster View, a Document View, and a Named Entity View, supporting the user in identifying new events and tracking them in a news stream. The interface has also been designed to test the usefulness of named entity recognition in interactive TDT. We report some initial findings from a user study on the effectiveness of our novel interface. Content Type: Book Chapter. DOI: 10.1007/978-3-642-04957-6_20. Authors: Masnizah Mohd, University of Strathclyde, Glasgow, UK; Fabio Crestani, University of ...
    7. Using Answer Retrieval Patterns to Answer Portuguese Questions

      Esfinge is a general domain Portuguese question answering system which has been participating in QA@CLEF since 2004. It uses the information available in the “official” document collections used in QA@CLEF (newspaper text and Wikipedia) and information from the Web as an additional resource when searching for answers. As for external tools, Esfinge uses a syntactic analyzer, a morphological analyzer and a named entity recognizer. This year an alternative approach to retrieve answers was tested: whereas in previous years, search patterns were used to retrieve relevant documents, this year a new type of search patterns ...
    8. Highly Multilingual News Analysis Applications

      The publicly accessible Europe Media Monitor (EMM) family of applications (http://press.jrc.it/overview.html) gather and analyse an average of 80,000 to 100,000 online news articles per day in up to 43 languages. Through the extraction of meta-information in these articles, they provide an aggregated view of the news; they allow users to monitor trends and to navigate the news over time and even across languages. EMM-NewsExplorer additionally collects historical information about persons and organisations from the multilingual news, generates co-occurrence and quotation-based social networks, and more. All EMM applications were entirely developed at, and are being ...
    9. An Iterative Model for Discovering Person Coreferences Using Name Frequency Estimates

      In this paper we present an approach to person coreference in a large collection of news, based on two main hypotheses. First, coreference is an iterative process, where the easy cases are addressed first and are then made available as an incrementally enriched resource for resolving more difficult cases. Second, at each iteration coreference between two person names is established according to a probabilistic model, where a number of features (e.g. frequency of first and last names) are taken into account. The approach does not assume any prior knowledge about persons mentioned in the collection and requires basic linguistic ...
    10. Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish

      This paper introduces a new set of tools and resources for Polish which cover all the steps required to transform a raw unrestricted text into a reasonable input for a parser. This includes (1) a large-coverage morphological lexicon, developed thanks to the IPI PAN corpus as well as a lexical acquisition technique, and (2) multiple tools for spelling correction, segmentation, tokenization and named entity recognition. This processing chain is also able to deal with the XCES format both as input and output, making it possible to improve XCES corpora such as the IPI PAN corpus itself. This allows us to give ...
    11. Method and system for displaying time-series data and correlated events derived from text mining

      FIELD OF THE INVENTION: The present invention generally relates to a method and system for displaying time-series data and correlated events. More specifically, the present invention relates to a method and system for displaying time-series data and correlated events derived from text mining. BACKGROUND OF THE INVENTION: Numerical serial data, such as the prices of stocks on any given date, is commonly presented graphically on a chart. For example, financial serial data is commonly presented in the fo
    12. Cascaded classifiers for confidence-based chemical named entity recognition.

      BMC Bioinformatics. 2008;9 Suppl 11:S4. Authors: Corbett P, Copestake A. BACKGROUND: Chemical named entities represent an important facet of biomedical text. RESULTS: We have developed a system to use character-based n-grams, Maximum Entropy Markov Models and rescoring to recognise chemical names and other such entities, and to make confidence estimates for the extracted entities. An adjustable threshold allows the system to be tuned to high precision or high recall. At a threshold set for balanced precision and recall, we were able to extract named entities at an F ...
    13. Accelerating the annotation of sparse named entities by dynamic sentence selection.

      BMC Bioinformatics. 2008;9 Suppl 11:S8. Authors: Tsuruoka Y, Tsujii J, Ananiadou S. BACKGROUND: Previous studies of named entity recognition have shown that a reasonable level of recognition accuracy can be achieved by using machine learning models such as conditional random fields or support vector machines. However, the lack of training data (i.e. annotated corpora) makes it difficult for machine learning-based named entity recognizers to be used in building practical information extraction systems. RESULTS: This paper presents an active learning-like framework for reducing the human ...
    14. Knowledge Discovery via Machine Learning for Neurodegenerative Disease Researchers

      The ever-increasing size of the biomedical literature makes more precise information retrieval and tapping into implicit knowledge in scientific literature a necessity. In this chapter, first, three new variants of the expectation–maximization (EM) method for semisupervised document classification (Machine Learning 39:103–134, 2000) are introduced to refine biomedical literature meta-searches. The retrieval performance of a multi-mixture per class EM variant with Agglomerative Information Bottleneck clustering (Slonim and Tishby (1999) Agglomerative information bottleneck. In Proceedings of NIPS-12) using Davies–Bouldin cluster validity index (IEEE Transactions on Pattern Analysis and Machine Intelligence 1:224–227, 1979), rivaled the state-of-the-art transductive support ...
    15. Knowledge Discovery via Machine Learning for Neurodegenerative Disease Researchers.

      Methods Mol Biol. 2009;569:173-96. Authors: Ozyurt IB, Brown GG. The ever-increasing size of the biomedical literature makes more precise information retrieval and tapping into implicit knowledge in scientific literature a necessity. In this chapter, first, three new variants of the expectation-maximization (EM) method for semisupervised document classification (Machine Learning 39:103-134, 2000) are introduced to refine biomedical literature meta-searches. The retrieval performance of a multi-mixture per class EM variant with Agglomerative Information Bottleneck clustering (Slonim and Tishby (1999) Agglomerative information bottleneck. In Proceedings of NIPS-12) using Davies-Bouldin cluster ...
    16. System, and method for interactive browsing

      FIELD OF THE INVENTION: The present invention generally relates to information technology, and more particularly, to a system and method for interactively browsing information. DESCRIPTION OF RELATED ART: As more and more electronic documents are stored in computers, it becomes important to manage the documents and retrieve information effectively. At present, there are primarily three ways to acquire information. The first one is taxonomy. Taxonomy typically organizes a large scale of documents into a
    17. Identification of Chemical Entities in Patent Documents

      Biomedical literature is an important source of information for chemical compounds. However, different representations and nomenclatures for chemical entities exist, which makes references to chemical entities ambiguous. Many systems already exist for gene and protein entity recognition, but very few exist for chemical entities. The main reason for this is the lack of a corpus for training named entity recognition systems and performing evaluation. In this paper we present a chemical entity recognizer that uses a machine learning approach based on conditional random fields (CRF) and compare the performance with dictionary-based approaches using several terminological resources. For the training and ...
    18. A Comparison of Performance of Sequential Learning Algorithms on the Task of Named Entity Recognition for Indian Languages

      We have taken up the issue of named entity recognition for Indian languages by presenting a comparative study of two sequential learning algorithms, viz. Conditional Random Fields (CRF) and Support Vector Machine (SVM). Though we only have results for Hindi, the features used are language independent, and hence the same procedure could be applied to tag the named entities in other Indian languages like Telugu, Bengali, Marathi etc. that have the same number of vowels and consonants. We have used CRF++ for implementing the CRF algorithm and Yamcha for implementing the SVM algorithm. The results show a superiority of CRF over SVM and ...