1. Articles in category: Segmentation

    529-552 of 692 « 1 2 ... 20 21 22 23 24 25 26 ... 27 28 29 »
    1. Secure Distributed Human Computation

      In Peha’s Financial Cryptography 2004 invited talk, he described the Cyphermint PayCash system (see www.cyphermint.com), which allows people without bank accounts or credit cards (a sizeable segment of the U.S. population) to automatically and instantly cash checks, pay bills, or make Internet transactions through publicly-accessible kiosks. Since PayCash offers automated financial transactions and since the system uses (unprotected) kiosks, security is critical. The kiosk must decide whether a person cashing a check is really the person to whom the check was made out, so it takes a digital picture of the person cashing the check and ...
    2. Blade clearance control

      A turbine engine has a circumferentially segmented shroud within a case structure. Each shroud segment is mounted for movement between an inboard position and an outboard position. One or more springs bias the shroud segments toward their inboard positions. One or more valves are positioned to vent one or more volumes so as to counter the spring bias to shift the shroud segments to their outboard positions.
      Mentions: HPC
    3. Cleaning, Segmenting, and Spell-Checking Text

      When extracting text from different sources, you commonly end up with “noise” characters and unwanted whitespace. So you need tools to help you clean up this extracted text. For many applications, you’ll also want to segment text by identifying the boundaries of sentences and to spell-check text using a single suggestion or a list of suggestions. In this chapter, you’ll learn how to remove HTML tags, extract full text from an XML file, segment text into sentences, perform stemming and spell-checking, and recognize and remove noise characters. Content Type: Book Chapter. DOI: 10.1007/978-1-4302-2352-8_2. Book: Scripting Intelligence. DOI: 10 ...
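The cleanup steps this teaser names (removing HTML tags, removing noise characters and extra whitespace, and segmenting into sentences) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the chapter's own code: the function names `strip_html`, `clean_text`, and `split_sentences` are hypothetical, and the sentence rule is a deliberately naive assumption that will mis-split abbreviations such as "Dr. Smith". Stemming and spell-checking are not covered.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags around them."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(markup):
    """Return the text content of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)


def clean_text(text):
    """Replace ASCII control characters with spaces, then collapse whitespace."""
    text = re.sub(r"[\x00-\x1f\x7f]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive rule: break after . ! or ? when followed by space and a capital."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)


sample = "<p>Text  extraction is\tnoisy.</p><p>Clean it first! Then segment it.</p>"
sentences = split_sentences(clean_text(strip_html(sample)))
for sentence in sentences:
    print(sentence)
```

Running the steps in this order matters: whitespace is normalized before segmentation so that the sentence rule only has to handle single spaces.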
    4. Building a Morphosyntactic Lexicon and a Pre-syntactic Processing Chain for Polish

      This paper introduces a new set of tools and resources for Polish which cover all the steps required to transform a raw unrestricted text into a reasonable input for a parser. This includes (1) a large-coverage morphological lexicon, developed thanks to the IPI PAN corpus as well as a lexical acquisition technique, and (2) multiple tools for spelling correction, segmentation, tokenization and named entity recognition. This processing chain is also able to deal with the XCES format both as input and output, which makes it possible to improve XCES corpora such as the IPI PAN corpus itself. This allows us to give ...
    5. Method of vector analysis for a document

      The invention provides a document representation method and a document analysis method including extraction of important sentences from a given document and/or determination of similarity between two documents. The inventive method detects terms that occur in the input document, segments the input document into document segments, each segment being an appropriately sized chunk, and generates document segment vectors, each vector having as its elements values determined by the occurrence frequencies of the terms occurring in the document segments. The method further calculates eigenvalues and eigenvectors of a square sum matrix in which a rank of the respective document segment vector ...
      Mentions: Tokyo Inventiona
    6. Towards the quantification of the semantic information encoded in written language. (arXiv:0907.1558v2 [physics.soc-ph] Cross Listed)

      Written language is a complex communication signal capable of conveying information encoded in the form of ordered sequences of words. Beyond the local order ruled by grammar, semantic and thematic structures affect long-range patterns in word usage. Here, we show that a direct application of information theory quantifies the relationship between the statistical distribution of words and the semantic content of the text. We show that there is a characteristic scale, roughly around a few thousand words, which establishes the typical size of the most informative segments in written language. Moreover, we find that the words whose contributions to the ...
    7. Google Translator Kit: Automated Translation Meets Crowdsourcing

      Only a handful of blogs picked up on Google's fresh Translator Toolkit, which the company launched yesterday by means of a blog post, but this new service really deserves a second look, if only because Wikimedia apparently sees the tool as something that could "change the way Wikipedia grows in other languages". You can read an extensive review of the product over at Google Blogoscoped, but ...
    8. Google Puts Human Touch Into Machine Translation

      Google has announced the launch of the Google Translator Toolkit, an editor designed to give translators an easy means of bringing the "human touch" to machine translation, which everybody knows is often flawed. Michael Galvez and Sanjay Bhansali of the Google Translator Toolkit team explain how the toolkit works: For example, if an Arabic-speaking reader wants to translate a Wikipedia™ article ...
    9. The importance of input representations

      As some of you know, I run a (machine learning) reading group every semester. This summer we're doing "assorted" topics, which basically means students pick a few papers from the past 24 months that are related and present on them. The week before I went out of town, we read two papers about inferring features from raw data; one was a deep learning approach; the other was more Bayesian. (As a total aside, I found it funny that in the latter paper they talk a lot about trying to find independent features, but in all cog sci papers I ...
    10. Using source-channel models for word segmentation

      BACKGROUND OF THE INVENTION: The present invention relates to segmenting text. In particular, the present invention relates to segmenting text that is not delimited by spaces. In many languages, such as Chinese and Japanese, it is difficult to segment characters into words because the words are not delimited by spaces. Methods used in the past to perform such segmentation can roughly be classified into dictionary-based methods or statistical-based methods. In dictionary-based methods, substrings of c
    11. Conceptual world representation natural language understanding system and method

      A portion of the disclosure recited in the specification contains material which is subject to copyright protection. This application includes a compact disk appendix containing source code listings that list instructions for a system and method by which the present invention may be practiced in a computer system. Two identical copies of the source code listing, volume name L&C, comprising 959 files, 6,598,598 bytes, are provided on compact disks created on Jul. 4, 2002. The copyright owner has n
    12. Image-based document indexing and retrieval

      A system that facilitates document retrieval and/or indexing is provided. A component receives an image of a document, and a search component searches data store(s) for a match to the document image. The match is performed over word-level topological properties of images of documents stored in the data store(s).
    13. Word processing with artificial language validation

      BACKGROUND: The present invention relates to data processing by digital computer, and more particularly to word processing. Word processing systems (also referred to as word processors) allow users to create documents, primarily textual documents that might otherwise be prepared on a typewriter. Users can also edit, print or save the documents using the word processor. Such documents will be referred to as word processing documents. Modern word processors offer a greater range of functions than the f
    14. Chinese character-based parser

      BACKGROUND OF THE INVENTION. 1. Technical Field: The present invention relates to data processing and, in particular, to parsing Chinese character streams. Still more particularly, the present invention provides word segmentation, part-of-speech tagging and parsing for Chinese characters. 2. Description of Related Art: There are many natural language processing (NLP) applications, such as machine translation (MT) and question answering systems, that use structural information of a sentence. As word segm
    15. Effects of Repair Support Agent for Accurate Multilingual Communication

      Translation repair plays an important role in intercultural communication using machine translation. It can be used to create messages that have very few translation mistakes. However, translation repair is a laborious task. It is important to carry out translation repair efficiently. Therefore, we propose a repair support agent that provides the segments that have not been translated accurately. We perform experiments on the translation repair efficiency to evaluate the effectiveness of the repair support agent. The results of these experiments are as follows. (1) Providing inaccurately translated segments improves the ability to detect inaccurate segments. (2) The inaccurate-judgment rate can ...
    16. Method and apparatus for window matching in delta compressors

      The present invention relates generally to data compression and, more particularly, to a method for efficient window partition matching in delta compressors to enhance compression performance based on the idea of modeling a dataset with the frequencies of its n-grams. BACKGROUND OF THE INVENTION: Compression programs routinely limit the data to be compressed together in segments called windows. The process of doing this is called windowing. Delta compression techniques were developed to compress a t
      Mentions: lamda Bell Labs
    17. Development and evaluation of a clinical note section header terminology.

      AMIA Annu Symp Proc. 2008;:156-60. Authors: Denny JC, Miller RA, Johnson KB, Spickard A. Clinical documentation is often expressed in natural language text, yet providers often use common organizations that segment these notes in sections, such as history of present illness or physical examination. We developed a hierarchical section header terminology, supporting mappings to LOINC and other vocabularies; it contained 1109 concepts and 4332 synonyms. Physicians evaluated it compared to LOINC and the Evaluation and Management billing schema using a randomly selected corpus of history and physical ...
    18. Systems and methods for interactive topic-based text summarization

      INCORPORATION BY REFERENCE: This Application incorporates by reference: entitled "SYSTEMS AND METHODS FOR DETERMINING THE TOPIC STRUCTURE OF A PORTION OF TEXT" by I. Tsochantaridis et al., filed Mar. 22, 2002 as U.S. patent application Ser. No. 10/103,053; entitled "SYSTEMS AND METHODS FOR DISPLAYING INTERACTIVE TOPIC BASED TEXT SUMMARIES" by F. Chen et al., filed Dec. 16, 2002, as U.S. patent application Ser. No. 10/319,545; entitled "SYSTEMS AND METHODS FOR SENTENCE BASED INTERACTIVE TOPIC BASED T