1. Articles in category: Segmentation

    745-768 of 834
    1. Cross-Lingual Retrieval of Identical News Events by Near-Duplicate Video Segment Detection

      Recently, for reusing large quantities of accumulated news video, technology for news topic searching and tracking has become necessary. Moreover, since we need to understand a certain topic from various viewpoints, we focus on identical event detection in various news programs from different countries. Currently, text information is generally used to retrieve news video. However, cross-lingual retrieval is complicated by machine translation performance and different viewpoints and cultures. In this paper, we propose a cross-lingual retrieval method for detecting identical news events that exploits image information together with text information. In an experiment, we verified the effectiveness of making use ...
      Mentions: Japan Tokyo Nagoya
    2. Learning Word Segmentation Rules for Tag Prediction

      In our previous work we introduced a hybrid, GA&ILP-based approach for learning stem-suffix segmentation rules from an unmarked list of words. Evaluation of the method was made difficult by the lack of word corpora annotated with their morphological segmentation. Here the hybrid approach is evaluated indirectly, on the task of tag prediction. A pair of stem-tag and suffix-tag lexicons is obtained by the application of that approach to an annotated lexicon of word-tag pairs. The two lexicons are then used to predict the tags of unseen words in two ways, (1) by using only the stem and suffix generated ...
    3. Joint Inference in Information Extraction

      Hoifung Poon Pedro Domingos Department of Computer Science and Engineering University of Washington Seattle, WA 98195-2350, U.S.A. {hoifung, pedrod}@cs.washington.edu Abstract The goal of information extraction is to extract database records from text or semi-structured sources. Traditionally, information extraction proceeds by first segmenting each candidate record separately, and then merging records that refer to the same entities. While computational
    4. A Chinese Segmentation and Tagging Module Based on the Interpolated Probabilistic Model

      Chinese is a challenging language for natural language processing. Unlike languages such as English and Portuguese, Chinese text processing must begin with segmentation, because a Chinese sentence has no delimiters that mark word boundaries. Chinese processing also raises many ambiguity problems, such as segmentation ambiguities, unknown words, and part-of-speech ambiguities. In segmentation and tagging, one of the main tasks is to identify unknown words and recognize proper nouns, and research effort is being devoted to this particular problem. In this paper, an integrated application with segmentation and tagging ability has ...
      Mentions: Macau
    5. Compression of annotated nucleotide sequences.

      IEEE/ACM Trans Comput Biol Bioinform. 2007 Jul-Sep;4(3):447-57. Authors: Korodi G, Tabus I. This article introduces an algorithm for the lossless compression of DNA files that contain annotation text in addition to the nucleotide sequence. First, a grammar is specifically designed to capture the regularities of the annotation text. A reversible transformation uses the grammar rules to represent the original file equivalently as a collection of parsed segments and a sequence of decisions made by the grammar parser. This decomposition enables the efficient use of state-of-the-art encoders for processing the ...
      Mentions: Finland Tampere
    6. Text extraction and document image segmentation using matched wavelets and MRF model.

      IEEE Trans Image Process. 2007 Aug;16(8):2117-28. Authors: Kumar S, Gupta R, Khanna N, Chaudhury S, Joshi SD. In this paper, we propose a novel scheme for extracting the textual areas of an image using globally matched wavelet filters. A clustering-based technique has been devised for estimating globally matched wavelet filters using a collection of ground-truth images. We have extended our text extraction scheme to the segmentation of document images into text, background, and picture components (which include graphics and continuous ...
    7. Large-scale evaluation of a medical cross-language information retrieval system.

      Medinfo. 2007;12(Pt 1):392-6. Authors: Markó K, Daumke P, Schulz S, Klar R, Hahn U. We propose an approach to multilingual medical document retrieval in which complex word forms are segmented according to medically relevant morpho-semantic criteria. At its core lies a multilingual dictionary, in which entries are equivalence classes of subwords, i.e. semantically minimal units. Using two different standard test collections for the medical domain, we evaluate our approach for six languages covered by our system. PMID: 17911746 [PubMed - indexed for MEDLINE]
    8. Finding Temporal Order in Discharge Summaries

      Finding Temporal Order in Discharge Summaries Philip Bramsen†, Pawan Deshpande†, Yoong Keok Lee‡, MS and Regina Barzilay†, PhD Massachusetts Institute of Technology (MIT), Cambridge, MA† DSO National Laboratories, Singapore‡ Abstract A method for automatic analysis of time-oriented clinical narratives would be of significant practical import for medical decision making, data modeling and biomedical research. This paper proposes a robust corpus-based approach for temporal analysis of medical disc
    9. Minimum Cut Model for Spoken Lecture Segmentation

      Minimum Cut Model for Spoken Lecture Segmentation Igor Malioutov Massachusetts Institute of Technology igorm@csail.mit.edu Regina Barzilay Massachusetts Institute of Technology regina@csail.mit.edu Abstract We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves beyond localized comparisons and takes into account long-range cohesion dependencies. Our results demonstrate tha
    10. Translation correlation device

      A confirmation link edition unit receives a confirmation link specified by a user. A paragraph correlation unit respectively divides an English text and a Japanese text into a plurality of paragraphs according to the specified confirmation link. A segment correlation calculation unit correlates an English segment to a Japanese segment for each paragraph. A correlation edition unit provides a user the correspondence obtained by the segment correlation calculation unit, and edits the correspondence according to a correction instruction from the user if any.
    11. A Discriminative Model Corresponding to Hierarchical HMMs

      Hidden Markov Models (HMMs) are very popular generative models for sequence data. Recent work has, however, shown that on many tasks, Conditional Random Fields (CRFs), a type of discriminative model, perform better than HMMs. We propose Hierarchical Hidden Conditional Random Fields (HHCRFs), a discriminative model corresponding to hierarchical HMMs (HHMMs). HHCRFs model the conditional probability of the states at the upper levels given observations. The states at the lower levels are hidden and marginalized in the model definition. We have developed two algorithms for the model: a parameter learning algorithm that needs only the states at the upper levels in ...
    12. Systems and methods for providing online fast speaker adaptation in speech recognition

      A system (230) performs speaker adaptation when performing speech recognition. The system (230) receives an audio segment and identifies the audio segment as a first audio segment or a subsequent audio segment associated with a speaker turn. The system (230) then decodes the audio segment to generate a transcription associated with the first audio segment when the audio segment is the first audio segment and estimates a transformation matrix based on the transcription associated with the first audio segment. The system (230) decodes the audio segment using the transformation matrix to generate a transcription associated with the subsequent audio segment ...
    13. Method and apparatus for browsing document content

      A computer-implemented method is provided that includes receiving a document and determining a file type for the document. In addition, the document is segmented into blocks of text as a function of the file type and at least one keyword and a summary is generated for the document.
    14. Tokenizer for a natural language processing system

      The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
    15. Device system and method for determining document similarities and differences

      Two documents are processed to facilitate visual mapping and comparison. These documents comprise document subsections and the subsections comprise document subsection headers associated therewith. At least one of the first document subsection headers is juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping. This header mapping is established by: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and, in relation to the subsection mapping and the association between the document subsections and the subsection headers, further ...
    16. Method of extracting important terms, phrases, and sentences

      A computer extracts important terms, phrases or sentences from a document that it segments. The computer generates a square sum matrix from the document segments. The computer determines the importance of a given term, phrase or sentence on the basis of eigenvectors and eigenvalues of the matrix. The computer thereby selects the important terms, phrases or sentences related to the central concepts of the document.
      Mentions: lamda
    17. Knowledge based computer aided diagnosis

      A method of extracting computer graphical objects including at least one vessel structure from a data volume of a portion of an anatomy, the method comprising the steps of: utilizing knowledge based image processing to locate centerlines and utilizing an active surface technique to extract the outer surface of said vessel structure; and storing co-ordinate information of said outer surface for subsequent display.
      Mentions: Andrew Blake
    18. Evaluating distinctiveness of document

      Two document sets are compared in natural language processing and the distinctiveness of each constituent element (such as a sentence, term or phrase) of one document set is evaluated by dividing both the target and comparison documents into document segments, constructing the sentence vector of each document segment whose components are the occurring frequencies of terms occurring in the document segment, and projecting all the sentence vectors of both the documents on a projection axis to find a projection axis which maximizes a ratio equal to: (squared sum of projected values originating from the target document)/(squared sum of projected ...
    19. Method for named-entity recognition and verification

      A method for named-entity (NE) recognition and verification is provided. The method can extract at least one to-be-tested segment from an article according to a text window, and use a predefined grammar to parse the extracted segments and remove ill-formed ones. Then, a statistical verification model is used to calculate the confidence measurement of each to-be-tested segment to determine whether the segment contains a named entity. If the confidence measurement is less than a predefined threshold, the to-be-tested segment will be rejected. Otherwise, it will be accepted.
      Mentions: Taipei
    20. 7171350

      A method for named-entity (NE) recognition and verification is provided. The method can extract at least one to-be-tested segment from an article according to a text window, and use a predefined grammar to parse the extracted segments and remove ill-formed ones. Then, a statistical verification model is used to calculate the confidence measurement of each to-be-tested segment to determine whether the segment contains a named entity. If the confidence measurement is less than a predefined threshold, the to-be-tested segment will be rejected. Otherwise, it will be accepted.

    21. Method and apparatus for expanding dictionaries during parsing

      A method is provided for parsing text in a corpus. The method includes hypothesizing a possible new entry for a dictionary based on a first segment of text. A successful parse is then formed for the first segment of text using the possible new entry. Based on the successful parse, the dictionary is changed to include the new entry. The new entry in the dictionary is then used to parse a second segment of text.
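
The dictionary-expansion procedure summarized in the last abstract above (hypothesize a new entry from a first segment, keep it if the segment then parses, reuse it on a second segment) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the patent: the "parser" is a greedy longest-match word segmenter over unspaced text, candidate entries are simply substrings tried shortest first, and the names `parse`, `hypothesize_entry`, and `expand_and_parse` are hypothetical.

```python
def parse(segment, dictionary):
    """Greedy longest-match segmentation (a stand-in for a real parser).

    Returns the list of tokens, or None when some position cannot be
    matched by any dictionary entry.
    """
    tokens, i = [], 0
    while i < len(segment):
        for j in range(len(segment), i, -1):  # try the longest match first
            if segment[i:j] in dictionary:
                tokens.append(segment[i:j])
                i = j
                break
        else:
            return None  # no entry starts at position i
    return tokens


def hypothesize_entry(segment, dictionary):
    """Try substrings (shortest first) as possible new dictionary entries.

    Returns (entry, tokens) for the first candidate that yields a
    successful parse, or (None, None) if no single new entry suffices.
    """
    n = len(segment)
    for length in range(1, n + 1):
        for start in range(0, n - length + 1):
            candidate = segment[start:start + length]
            if candidate in dictionary:
                continue
            tokens = parse(segment, dictionary | {candidate})
            if tokens is not None:
                return candidate, tokens
    return None, None


def expand_and_parse(first, second, dictionary):
    """Expand the dictionary from `first`, then parse `second` with it."""
    if parse(first, dictionary) is None:
        entry, _ = hypothesize_entry(first, dictionary)
        if entry is not None:
            dictionary = dictionary | {entry}  # keep the confirmed entry
    return dictionary, parse(second, dictionary)


if __name__ == "__main__":
    words = {"the", "sat", "ran"}
    words, tokens = expand_and_parse("thedogsat", "thedogran", words)
    print(sorted(words), tokens)
```

In this toy setting, a hypothesis is accepted only when it makes the whole first segment parse, which mirrors the abstract's condition that the dictionary changes only after a successful parse. A real system would presumably rank competing hypotheses rather than take the first one found.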