1. Articles in category: Segmentation

    841-864 of 919
    1. Minimum Cut Model for Spoken Lecture Segmentation

      Igor Malioutov (Massachusetts Institute of Technology, igorm@csail.mit.edu) and Regina Barzilay (Massachusetts Institute of Technology, regina@csail.mit.edu). Abstract: We consider the task of unsupervised lecture segmentation. We formalize segmentation as a graph-partitioning task that optimizes the normalized cut criterion. Our approach moves beyond localized comparisons and takes into account long-range cohesion dependencies. Our results demonstrate ...
    2. Translation correlation device

      A confirmation link editing unit receives a confirmation link specified by a user. A paragraph correlation unit divides an English text and a Japanese text each into a plurality of paragraphs according to the specified confirmation link. A segment correlation calculation unit correlates an English segment with a Japanese segment for each paragraph. A correlation editing unit presents the correspondence obtained by the segment correlation calculation unit to the user and edits the correspondence according to any correction instruction from the user.
    3. A Discriminative Model Corresponding to Hierarchical HMMs

      Hidden Markov Models (HMMs) are very popular generative models for sequence data. Recent work has, however, shown that on many tasks, Conditional Random Fields (CRFs), a type of discriminative model, perform better than HMMs. We propose Hierarchical Hidden Conditional Random Fields (HHCRFs), a discriminative model corresponding to hierarchical HMMs (HHMMs). HHCRFs model the conditional probability of the states at the upper levels given observations. The states at the lower levels are hidden and marginalized in the model definition. We have developed two algorithms for the model: a parameter learning algorithm that needs only the states at the upper levels in ...
    4. Systems and methods for providing online fast speaker adaptation in speech recognition

      A system (230) performs speaker adaptation when performing speech recognition. The system (230) receives an audio segment and identifies the audio segment as a first audio segment or a subsequent audio segment associated with a speaker turn. The system (230) then decodes the audio segment to generate a transcription associated with the first audio segment when the audio segment is the first audio segment and estimates a transformation matrix based on the transcription associated with the first audio segment. The system (230) decodes the audio segment using the transformation matrix to generate a transcription associated with the subsequent audio segment ...
    5. Method and apparatus for browsing document content

      A computer-implemented method is provided that includes receiving a document and determining a file type for the document. In addition, the document is segmented into blocks of text as a function of the file type and at least one keyword and a summary is generated for the document.
    6. Tokenizer for a natural language processing system

      The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
    7. Device system and method for determining document similarities and differences

      Two documents are processed to facilitate visual mapping and comparison. These documents comprise document subsections and the subsections comprise document subsection headers associated therewith. At least one of the first document subsection headers is juxtaposed relative to an output of second document subsection headers mapping thereto, to visually emphasize a header mapping. This header mapping is established by: mapping the first document subsections relative to the second document subsections based on identifying substantial similarities therebetween, to establish a subsection mapping therebetween; and, in relation to the subsection mapping and the association between the document subsections and the subsection headers, further ...
    8. Method of extracting important terms, phrases, and sentences

      A computer extracts important terms, phrases or sentences from a document that it segments. The computer generates a square sum matrix from the document segments. The computer determines the importance of a given term, phrase or sentence on the basis of eigenvectors and eigenvalues of the matrix. The computer thereby selects the important terms, phrases or sentences related to the central concepts of the document.
      Mentions: lamda
    9. Knowledge based computer aided diagnosis

      A method of extracting computer graphical objects including at least one vessel structure from a data volume of a portion of an anatomy, the method comprising the steps of: utilizing knowledge based image processing to locate centerlines and utilizing an active surface technique to extract the outer surface of said vessel structure; and storing co-ordinate information of said outer surface for subsequent display.
      Mentions: Andrew Blake
    10. Evaluating distinctiveness of document

      Two document sets are compared in natural language processing and the distinctiveness of each constituent element (such as a sentence, term or phrase) of one document set is evaluated by dividing both the target and comparison documents into document segments, constructing for each document segment a sentence vector whose components are the frequencies of the terms occurring in that segment, and projecting all the sentence vectors of both documents onto a projection axis to find the axis that maximizes a ratio equal to: (squared sum of projected values originating from the target document)/(squared sum of projected ...
    11. Method for named-entity recognition and verification

      A method for named-entity (NE) recognition and verification is provided. The method extracts at least one to-be-tested segment from an article according to a text window, and uses a predefined grammar to parse the to-be-tested segments and remove ill-formed ones. A statistical verification model is then used to calculate a confidence measurement for each to-be-tested segment to determine whether the segment contains a named entity. If the confidence measurement is less than a predefined threshold, the to-be-tested segment is rejected. Otherwise, it is accepted.
      Mentions: Taipei
    12. 7171350

      A method for named-entity (NE) recognition and verification is provided. The method extracts at least one to-be-tested segment from an article according to a text window, and uses a predefined grammar to parse the to-be-tested segments and remove ill-formed ones. A statistical verification model is then used to calculate a confidence measurement for each to-be-tested segment to determine whether the segment contains a named entity. If the confidence measurement is less than a predefined threshold, the to-be-tested segment is rejected. Otherwise, it is accepted.

    13. Method and apparatus for expanding dictionaries during parsing

      A method is provided for parsing text in a corpus. The method includes hypothesizing a possible new entry for a dictionary based on a first segment of text. A successful parse is then formed for the first segment of text using the possible new entry. Based on the successful parse, the dictionary is changed to include the new entry. The new entry in the dictionary is then used to parse a second segment of text.
    14. Streaming video bookmarks

      A method, apparatus and systems for bookmarking an area of interest of stored video content are provided. As a viewer is watching a video and finds an area of interest, they can bookmark the particular segment of the video and then return to that segment with relative simplicity. This can be accomplished by pressing a button, clicking with a mouse or otherwise sending a signal to a device for marking a particular location of the video that is of interest. Frame identifiers can also be used to select a desired video from an index and to then retrieve the video ...
      Mentions: Boston Intel San Jose
    15. Sentence segmentation method and sentence segmentation apparatus, machine translation system, and program product using sentence segmentation method

      Provides a highly accurate sentence segmentation process in natural language processing by estimating the parts of speech of words in the text to be processed. Dictionary data is used to perform a sentence segmentation process on the text. If it cannot be determined through use of the dictionary data whether the text should be broken into sentences, the parts of speech of the words constituting the text are estimated and a further sentence segmentation process is performed based on the result of the estimation.
      Mentions: Italy Digital Viterbi
    16. Systems and methods for determining the topic structure of a portion of text

      Systems and methods for determining the topic structure of a document including text utilize a Probabilistic Latent Semantic Analysis (PLSA) model and select segmentation points based on similarity values between pairs of adjacent text blocks. PLSA forms a framework for both text segmentation and topic identification. The use of PLSA provides an improved representation for the sparse information in a text block, such as a sentence or a sequence of sentences. Topic characterization of each text segment is derived from PLSA parameters that relate words to "topics", latent variables in the PLSA model, and "topics" to text segments. A system ...
    17. Method and apparatus for adapting a class entity dictionary used with language models

      A method and apparatus are provided for augmenting a language model with a class entity dictionary based on corrections made by a user. Under the method and apparatus, a user corrects an output that is based in part on the language model by replacing an output segment with a correct segment. The correct segment is added to a class of segments in the class entity dictionary and a probability of the correct segment given the class is estimated based on an n-gram probability associated with the output segment and an n-gram probability associated with the class. This estimated probability is ...
    18. Method and system for segmenting and identifying events in images using spoken annotations

      A method for automatically organizing digitized photographic images into events based on spoken annotations comprises the steps of: providing natural-language text based on spoken annotations corresponding to at least some of the photographic images; extracting predetermined information from the natural-language text that characterizes the annotations of the images; segmenting the images into events by examining each annotation for the presence of certain categories of information which are indicative of a boundary between events; and identifying each event by assembling the categories of information into event descriptions. The invention further comprises the step of summarizing each event by selecting and arranging ...
    19. Systems and methods for displaying interactive topic-based text summaries

      Techniques for displaying interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text. A hierarchical ...
    20. Method and apparatus for determining unbounded dependencies during syntactic parsing

      A method is provided for identifying non-local relationships between licensing elements in a text segment and a word or phrase external to the text segment during a syntactic parse. Under the method, certain syntactic rules for combining words or phrases with text segments indicate that there is a possibility that the word or phrase being combined with the text segment will fill a gap in a relationship within the text segment. Based on this possibility, the text segment is searched to determine if there are any unfilled gaps in the text segment. Under some embodiments, if an unfilled gap is ...
    21. Tokenizer for a natural language processing system

      The present invention is a segmenter used in a natural language processing system. The segmenter segments a textual input string into tokens for further natural language processing. In accordance with one feature of the invention, the segmenter includes a tokenizer engine that proposes segmentations and submits them to a linguistic knowledge component for validation. In accordance with another feature of the invention, the segmentation system includes language-specific data that contains a precedence hierarchy for punctuation. If proposed tokens in the input string contain punctuation, they can illustratively be broken into subtokens based on the precedence hierarchy.
    22. Identifying, processing and caching object fragments in a web environment

      A method, apparatus and computer program product for identifying and creating persistent object fragments from a named object. For example, a digital content description of a named digital object can be dynamically parsed, and persistent fragment identities created and maintained to facilitate caching. Named digital objects include but are not limited to: Web pages described in XML, SGML, and HTML. The object description is revised by replacing each object fragment with its newly created persistent identity. The revised object description is then sent to the requesting node. Depending upon the properties of a fragment, this can either enable the fragment ...
      Mentions: Microsoft Newark
    23. Task/domain segmentation in applying feedback to command control

      An apparatus for responding to a current user command associated with one of a plurality of task/domains. The apparatus comprises: a digital storage device that stores cumulative feedback data gathered from multiple users during previous operations of the apparatus and segregated in accordance with the plurality of task/domains; a first digital logic device that determines the current task/domain with which the current user command is associated; a second digital logic device that determines a current response to the current user command on the basis of that portion of the stored cumulative feedback data associated with the current ...
    24. Method, computer program product, and system for automatic class generation with simultaneous customization and interchange capability

      A database definition, logical database view, extended field definition and control statement information are accessed to build an in-memory representation of selective information contained therein. Utilizing this in-memory representation, a class in one form is automatically generated and customized wherein this class is used to access a hierarchical database responsive to a hierarchical database access request from an application.