1. Articles in category: Segmentation

    529-548 of 548
    1. Bootstrapping sense characterizations of occurrences of polysemous words in dictionary representations of a lexical knowledge base in computer memory

      The present invention is directed to characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary. In a preferred embodiment, the representation of the dictionary is made up of a plurality of text segments containing word occurrences having a word sense characterization and word occurrences not having a word sense characterization. The embodiment first selects a plurality of the dictionary text segments that each contain a first word. The embodiment then identifies from among the selected text segments a first and a second occurrence of a second word. The identified second occurrence of the ...
    2. Identifying language and character set of data representing text

      The present invention provides a facility for identifying the unknown language of text represented by a series of data values in accordance with a character set that associates character glyphs with particular data values. The facility first generates a characterization that characterizes the series of data values in terms of the occurrence of particular data values in the series of data values. For each of a plurality of languages, the facility then retrieves a model that models the language in terms of the statistical occurrence of particular data values in representative samples of text in that language. The facility then ...
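The two steps the abstract does describe (characterizing the input by the data values that occur in it, and holding one statistical model per language) can be sketched as below. This is an illustrative sketch only, not the patented facility: the use of byte bigrams, the cosine comparison, and the names `characterize`, `cosine`, `identify`, `SAMPLES` and `MODELS` are all assumptions, and the abstract is cut off before it says how the characterization is compared with the models.

```python
from collections import Counter
from math import sqrt


def characterize(data: bytes, n: int = 2) -> Counter:
    """Count how often each run of n consecutive byte values occurs."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count profiles (assumed comparison)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def identify(data: bytes, models: dict[str, Counter]) -> str:
    """Return the language whose model is most similar to the input profile."""
    profile = characterize(data)
    return max(models, key=lambda lang: cosine(profile, models[lang]))


# Tiny hypothetical "representative samples"; a real model needs far more text.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the other thing",
    "german": "der schnelle braune fuchs springt über den faulen hund und dann noch das",
}
MODELS = {lang: characterize(text.encode("latin-1")) for lang, text in SAMPLES.items()}

print(identify("the dog and the fox".encode("latin-1"), MODELS))
```

Because the profile is built from raw byte values rather than decoded characters, the same approach could in principle tell character sets apart as well as languages, provided a model exists for each language and character set pair.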
    3. Machine assisted translation tools utilizing an inverted index and list of letter n-grams

      A translation memory for computer assisted translation based upon an aligned file having a number of source language text strings paired with target language text strings. A posting vector file includes a posting vector associated with each source language text string in the aligned file. Each posting vector includes a document identification number corresponding to a selected one of the source language text strings in the aligned file and a number of entropy weight values, each of the number of weight values corresponding to a unique letter n-gram that appears in the selected source language text string. Preferably, the translation ...
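A rough sketch of the posting vector idea follows: each source string gets a document identification number and one weight per unique letter n-gram. The abstract does not give the weighting formula, so the log-entropy weight used here is an assumption, as are the trigram size, the scoring rule in `best_match`, and every function name.

```python
from collections import Counter, defaultdict
from math import log


def letter_ngrams(text: str, n: int = 3) -> Counter:
    """Count the letter n-grams of a lower-cased string."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def build_posting_vectors(aligned):
    """Return one (doc_id, {ngram: weight}) pair per source string.

    The weight is a log-entropy weight (an assumption): 1.0 for an n-gram
    confined to one string, falling to 0.0 for an n-gram spread evenly
    over every string in the aligned file.
    """
    per_doc = [letter_ngrams(source) for source, _target in aligned]
    totals = Counter()
    for grams in per_doc:
        totals.update(grams)
    entropy_sum = defaultdict(float)
    for grams in per_doc:
        for gram, tf in grams.items():
            p = tf / totals[gram]
            entropy_sum[gram] += p * log(p)
    n_docs = len(aligned)

    def weight(gram):
        return 1.0 if n_docs < 2 else 1.0 + entropy_sum[gram] / log(n_docs)

    return [(doc_id, {g: weight(g) for g in grams})
            for doc_id, grams in enumerate(per_doc)]


def best_match(query, aligned, vectors):
    """Return the aligned pair whose source shares the most n-gram weight."""
    grams = set(letter_ngrams(query))
    doc_id, _ = max(vectors,
                    key=lambda v: sum(w for g, w in v[1].items() if g in grams))
    return aligned[doc_id]


ALIGNED = [
    ("Open the file", "Ouvrir le fichier"),
    ("Close the file", "Fermer le fichier"),
    ("Print the page", "Imprimer la page"),
]
VECTORS = build_posting_vectors(ALIGNED)
print(best_match("open file", ALIGNED, VECTORS))
```

Matching on letter n-grams rather than whole words lets a query retrieve a stored string even when the two differ in inflection or spelling, which is the usual motivation for fuzzy lookup in a translation memory.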
      Mentions: Unicode Trados
    4. Bootstrapping sense characterizations of occurrences of polysemous words in dictionaries

      The present invention is directed to characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary. In a preferred embodiment, the representation of the dictionary is made up of a plurality of text segments containing word occurrences having a word sense characterization and word occurrences not having a word sense characterization. The embodiment first selects a plurality of the dictionary text segments that each contain a first word. The embodiment then identifies from among the selected text segments a first and a second occurrence of a second word. The identified second occurrence of the ...
    5. Bootstrapping sense characterizations of occurrences of polysemous words

      The present invention is directed to characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary. In a preferred embodiment, the representation of the dictionary is made up of a plurality of text segments containing word occurrences having a word sense characterization and word occurrences not having a word sense characterization. The embodiment first selects a plurality of the dictionary text segments that each contain a first word. The embodiment then identifies from among the selected text segments a first and a second occurrence of a second word. The identified second occurrence of the ...
    6. Character strings reading device

      A character strings reading device for reading character strings from input image data comprises cut-out recognition means for cutting out a segment corresponding to one character from the image data to perform individual character recognition every segment, a recognition result buffer for storing a recognition result of the cut-out recognition means, word searching means for searching a word string candidate corresponding to a combination of character candidates in the recognition result buffer, a word string candidate buffer for storing a search result of the word searching means, check portion determining means for determining a check target portion and a presumed ...
    7. Prosodic databases holding fundamental frequency templates for use in speech synthesis

      Prosodic databases hold fundamental frequency templates for use in a speech synthesis system. Prosodic database templates may hold fundamental frequency values for syllables in a given sentence. These fundamental frequency values may be applied in synthesizing a sentence of speech. The templates are indexed by tonal pattern markings. A predicted tonal marking pattern is generated for each sentence of text that is to be synthesized, and this predicted pattern of tonal markings is used to locate a best-matching template. The templates are derived by calculating fundamental frequencies on a per-syllable basis for sentences that are spoken by a human trainer ...
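The lookup step, in which a predicted tonal marking pattern is used to locate a best-matching template, might look like the sketch below. The "H"/"L" marking alphabet, the frequency values in hertz, and the mismatch count used as the matching criterion are all hypothetical; the abstract specifies none of them.

```python
# Hypothetical database: tonal marking pattern -> one F0 value (Hz) per syllable.
TEMPLATES = {
    ("H", "L", "H"): [210.0, 180.0, 205.0],
    ("L", "L", "H"): [175.0, 170.0, 200.0],
    ("H", "H", "L", "L"): [215.0, 210.0, 180.0, 165.0],
}


def mismatch(a, b):
    """Count differing positions, plus the difference in length (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def best_template(predicted):
    """Return the stored pattern closest to the predicted one, with its F0 values."""
    key = min(TEMPLATES, key=lambda pattern: mismatch(pattern, predicted))
    return key, TEMPLATES[key]


pattern, f0_values = best_template(("H", "L", "L"))
print(pattern, f0_values)
```

The returned values would then be applied syllable by syllable when the sentence is synthesized.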
    8. Method and apparatus for creating a searchable digital video library and a system and method of using such a library

      An apparatus and method of creating a digital library from audio data and video images. The method includes the steps of transcribing the audio data and marking the transcribed audio data with a first set of time-stamps and indexing the transcribed audio data. The method also includes the steps of digitizing the video data and marking the digitized video data with a second set of time-stamps related to the first set of time-stamps and segmenting the digitized video data into paragraphs according to a set of rules. The steps of storing the indexed audio data and the digitized video data ...
    9. Methods for controlling the generation of speech from text representing one or more names

      Improved automated synthesis of human audible speech from text is disclosed. Performance enhancement of the underlying text comprehensibility is obtained through prosodic treatment of the synthesized material, improved speaking rate treatment, and improved methods of spelling words or terms for the system user. Prosodic shaping of text sequences appropriate for the discourse in large groupings of text segments, with prosodic boundaries developed to indicate conceptual units within the text groupings, is implemented in a preferred embodiment.
    10. System with collaborative interface agent

      The present invention relates to a discourse manager which permits effective collaboration between a user and a computer agent. The system operates according to a theory of collaborative discourse between humans, with the computer agent playing the same role as a human collaborator. The present invention does not concern the internal operation of a particular agent, but relates rather to the structures for managing a collaborative discourse between any type of agent and the user. The discourse manager includes a memory in which application-specific recipes are stored and a memory in which the discourse state is stored. Each recipe specifies ...
    11. Compilation of weighted finite-state transducers from decision trees

      A method for automatically converting a decision tree into one or more weighted finite-state transducers. Specifically, the method in accordance with an illustrative embodiment of the present invention processes one or more terminal (i.e., leaf) nodes of a given decision tree to generate one or more corresponding weighted rewrite rules. Then, these weighted rewrite rules are processed to generate weighted finite-state transducers corresponding to the one or more terminal nodes of the decision tree. In this manner, decision trees may be advantageously compiled into weighted finite-state transducers, and these transducers may then be used directly in various speech and ...
    12. Text processor

      A text enhancement method and apparatus for the presentation of text for improved human reading. The method includes extracting text specific attributes from machine readable text and varying the text presentation in accordance with the attributes. The preferred embodiment of the method: extracts parts of speech and punctuation from a sentence, applies folding rules which use the parts of speech to determine folding points, uses the folding points to divide the sentence into text segments, applies horizontal displacement rules to determine horizontal displacement for the text segments, and presents the text segments each on a new line and having the ...
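The folding pipeline in this abstract (parts of speech, then folding points, then text segments, then horizontal displacement) can be illustrated with a small sketch. It assumes the sentence arrives already tagged, and the folding rule (break before prepositions and conjunctions, and after commas) and the fixed indentation step are invented for the example rather than taken from the patent.

```python
# Assumed folding rule: start a new segment before these part-of-speech tags.
FOLD_BEFORE = {"IN", "CC"}  # preposition, coordinating conjunction


def fold(tagged):
    """Split a list of (word, tag) pairs into text segments at folding points."""
    segments, current = [], []
    for word, tag in tagged:
        if current and tag in FOLD_BEFORE:
            segments.append(current)
            current = []
        current.append(word)
        if tag == ",":  # also fold after a comma
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def present(segments, step=2):
    """Put each segment on its own line, shifted right by an assumed fixed step."""
    lines = []
    for depth, segment in enumerate(segments):
        text = " ".join(segment).replace(" ,", ",")
        lines.append(" " * (step * depth) + text)
    return "\n".join(lines)


TAGGED = [("The", "DT"), ("cat", "NN"), ("sat", "VBD"), ("on", "IN"),
          ("the", "DT"), ("mat", "NN"), ("and", "CC"), ("slept", "VBD")]
print(present(fold(TAGGED)))
```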
    13. Knowledge-based speech recognition system and methods having frame length computed based upon estimated pitch period of vocalic intervals

      Knowledge based speech recognition apparatus and methods are provided for translating an input speech signal to text. The speech recognition apparatus captures an input speech signal, segments it based on the detection of pitch period, and generates a series of hypothesized acoustic feature vectors for the input speech signal that characterizes the signal in terms of primary acoustic events, detectable vowel sounds and other acoustic features. The apparatus and methods employ a largely speaker-independent dictionary based upon the application of phonological and phonetic/acoustic rules to generate acoustic event transcriptions against which the series of hypothesized acoustic feature vectors are ...
    14. Dialogue-sound processing apparatus and method

      A dialogue-sound processing apparatus of the present invention generates a discourse structure representing the flow of dialogue from fragmentary spoken utterances. In the dialogue-sound processing apparatus, the speech fragments of the dialogue-sound are input through a sound input section. A clue extraction section extracts a clue, which is a word or prosodic feature representing the flow of dialogue, from the speech fragments. An utterance function rule memory section stores utterance function rules, each of which is a correspondence relation between a clue and the utterance function representing its pragmatic effect on the flow of dialogue. An utterance function extraction section assigns the utterance function to the clue ...
      Mentions: Tokyo Osaka
    15. Speech synthesizer having an acoustic element database

      A speech synthesis method employs an acoustic element database that is established from phonetic sequences occurring in an interval of a speech signal. In establishing the database, trajectories are determined for each of the phonetic sequences containing a phonetic segment that corresponds to a particular phoneme. A tolerance region is then identified based on a concentration of trajectories that correspond to different phoneme sequences. The acoustic elements for the database are formed from portions of the phonetic sequences by identifying cut points in the phonetic sequences which correspond to time points along the respective trajectories proximate the tolerance region. In ...
    16. Machine assisted translation tools

      A translation memory for computer assisted translation based upon an aligned file having a number of source language text strings paired with target language text strings. A posting vector file includes a posting vector associated with each source language text string in the aligned file. Each posting vector includes a document identification number corresponding to a selected one of the source language text strings in the aligned file and a number of entropy weight values, each of the number of weight values corresponding to a unique letter n-gram that appears in the selected source language text string. Preferably, the translation ...
      Mentions: Unicode Trados
    17. System and method for skimming digital audio/video data

      A method for skimming digital audio and video data, wherein the video data is partitioned into video segments and the audio data has been transcribed, comprises selecting representative frames from each of the video segments. The representative frames are combined to form an assembled video sequence. Keywords contained in the corresponding transcribed audio data are identified and extracted. The extracted keywords are assembled into an audio track. The assembled video sequence and audio track are output together. An apparatus for carrying out the disclosed method is also disclosed.
    18. Robust language processor for segmenting and parsing-language containing multiple instructions

      Apparatus and method are provided for segmenting, parsing, interpreting and formatting the content of instructions such as air traffic control instructions. Output from a speech recognizer is so processed to produce such instructions in a structured format such as for input to other software. There are two main components: an instruction segmenter and a robust parser. In the instruction segmenter, the recognized text produced by the speech recognizer is segmented into independent instructions and each instruction is processed. The instruction segmenter receives a recognized air traffic control or other instruction, and segments it into individual commands. Utterances or other language ...
    19. Natural language processing system and method for parsing a plurality of input symbol sequences into syntactically or pragmatically correct word messages

      A Natural Language Processing System utilizes a symbol parsing layer in combination with an intelligent word parsing layer to produce a syntactically or pragmatically correct output sentence or other word message. Initially, a plurality of polysemic symbol sequences are input through a keyboard segmented into a plurality of semantic, syntactic, or pragmatic segments including agent, action and patient segments, for example. One polysemic symbol sequence, including a plurality of polysemic symbols, is input from each of the three segments of the keyboard. A symbol parsing device, in a symbol parsing layer, then parses each of the plurality of symbols in ...
    20. Method for segmenting a text into words

      A method of segmenting a text into words in which a dictionary search is made while using a character string in the text as a search key, and it is checked whether a word retrieved from the dictionary can be grammatically connected to another word adjacent thereto or not. Segmentation processing is carried out using only words registered in a word dictionary, processing for identifying an unknown word is carried out when the segmentation processing comes to a deadlock, and then the segmentation processing is continued for that portion of the text which follows the identified unknown word.
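The procedure in this abstract (dictionary lookup keyed on a character string, a grammatical connection check against the adjacent word, and unknown-word handling at a deadlock) can be sketched as follows. The toy dictionary, the tag set, the connection table, and the rule that an unknown word extends until a dictionary word can start are all assumptions made for the example; the sketch uses greedy longest-match lookup and does not backtrack.

```python
# Hypothetical word dictionary: surface form -> part-of-speech tag.
DICTIONARY = {"the": "DET", "cat": "NOUN", "sat": "VERB", "on": "PREP", "mat": "NOUN"}
ALL_TAGS = set(DICTIONARY.values())

# Hypothetical connection table: tag of the previous word -> tags allowed next.
CAN_FOLLOW = {
    None: ALL_TAGS,        # start of text
    "UNKNOWN": ALL_TAGS,   # anything may follow an unknown word
    "DET": {"NOUN"},
    "NOUN": {"VERB", "PREP"},
    "VERB": {"PREP", "DET", "NOUN"},
    "PREP": {"DET", "NOUN"},
}


def candidates(text, pos, prev_tag):
    """Dictionary words starting at pos that may follow prev_tag, longest first."""
    found = []
    for end in range(len(text), pos, -1):
        tag = DICTIONARY.get(text[pos:end])
        if tag is not None and tag in CAN_FOLLOW[prev_tag]:
            found.append((text[pos:end], tag))
    return found


def segment(text):
    """Segment text into (word, tag) pairs using dictionary words where possible."""
    result, pos, prev_tag = [], 0, None
    while pos < len(text):
        found = candidates(text, pos, prev_tag)
        if found:
            word, tag = found[0]
        else:
            # Deadlock: grow an unknown word until a dictionary word can start.
            end = pos + 1
            while end < len(text) and not candidates(text, end, "UNKNOWN"):
                end += 1
            word, tag = text[pos:end], "UNKNOWN"
        result.append((word, tag))
        pos += len(word)
        prev_tag = tag
    return result


print(segment("thexylosatonthemat"))
```

In the second test case below, `sat` is in the dictionary but cannot follow a determiner under the assumed connection table, so it is treated as unknown; that is the grammatical check rejecting a dictionary hit.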