1. Articles in category: Machine Translation

    1. Stat-XFER: A General Search-Based Syntax-Driven Framework for Machine Translation

      The CMU Statistical Transfer Framework (Stat-XFER) is a general framework for developing search-based syntax-driven machine translation (MT) systems. The framework consists of an underlying syntax-based transfer formalism along with a collection of software components designed to facilitate the development of a broad range of MT research systems. The main components are a general language-independent runtime transfer engine and decoder, along with several different tools for creating the various underlying language-pair-specific resources that are required for building a specific MT system for any given language pair. We describe the general framework, its unique properties and features, and its application to the ...
    2. Bilingual Segmentation for Alignment and Translation

      We propose a method that bilingually segments sentences in languages with no clear delimiter for word boundaries. In our model, we first convert the search for the segmentation into a sequential tagging problem, allowing for a polynomial-time dynamic-programming solution, and incorporate a control to balance the monolingual and bilingual information at hand. Our bilingual segmentation algorithm, which integrates a monolingual language model and a statistical translation model, is devised to tokenize sentences more suitably for bilingual applications such as word alignment and machine translation. Empirical results show that bilingually motivated segmenters outperform purely monolingual ones in both the word-aligning (12% reduction ...
    3. Dynamic Translation Memory: Using Statistical Machine Translation to Improve Translation Memory Fuzzy Matches

      Professional translators of technical documents often use Translation Memory (TM) systems in order to capitalize on the repetitions frequently observed in these documents. TM systems typically exploit not only complete matches between the source sentence to be translated and some previously translated sentence, but also so-called fuzzy matches, where the source sentence has some substantial commonality with a previously translated sentence. These fuzzy matches can be very worthwhile as a starting point for the human translator, but the translator then needs to manually edit the associated TM-based translation to accommodate the differences with the source sentence to be translated. If ...
    4. Statistical Machine Translation into a Morphologically Complex Language

      In this paper, we present the results of our investigation into phrase-based statistical machine translation from English into Turkish – an agglutinative language with very productive inflectional and derivational word-formation processes. We investigate different representational granularities for morphological structure and find that (i) representing both Turkish and English at the morpheme-level but with some selective morpheme-grouping on the Turkish side of the training data, (ii) augmenting the training data with “sentences” comprising only the content words of the original training data to bias root word alignment, and with highly-reliable phrase-pairs from an earlier corpus-alignment (iii) re-ranking the n-best morpheme-sequence outputs of ...
    5. n-Best Reranking for the Efficient Integration of Word Sense Disambiguation and Statistical Machine Translation

      Although it has always been thought that Word Sense Disambiguation (WSD) can be useful for Machine Translation, only recently have efforts been made towards integrating both tasks to prove that this assumption is valid, particularly for Statistical Machine Translation (SMT). While different approaches have been proposed and results have started to converge in a positive way, it is not yet clear how these applications should be integrated to allow the strengths of both to be exploited. This paper aims to contribute to the recent investigation on the usefulness of WSD for SMT by using n-best reranking to efficiently integrate WSD with ...
    6. Translation Paraphrases in Phrase-Based Machine Translation

      In this paper we present an analysis of a phrase-based machine translation methodology that integrates paraphrases obtained from an intermediary language (French) for translations between Spanish and English. The purpose of the research presented in this document is to find out how much extra information (i.e. improvements in translation quality) can be found when using Translation Paraphrases (TPs). In this document we present an extensive statistical analysis to support conclusions.

      Content Type: Book Chapter. DOI: 10.1007/978-3-540-78135-6_33. Authors: Francisco Guzmán, ITESM Campus Monterrey Center for Intelligent Systems, Mexico; Leonardo Garrido, ITESM Campus Monterrey Center for Intelligent Systems, Mexico. Book Series: Lecture ...
      Mentions: Mexico
    7. Learning Finite State Transducers Using Bilingual Phrases

      Statistical Machine Translation is receiving more and more attention every day due to the success that the phrase-based alignment models are obtaining. However, despite their power, state-of-the-art systems using these models present a series of disadvantages that lessen their effectiveness in working environments where temporal or spatial computational resources are limited. A finite-state framework represents an interesting alternative because it constitutes an efficient paradigm where quality and real-time factors are properly integrated in order to build translation devices that may be of help for their potential users. Here, we describe a way to use the bilingual information in a phrase-based ...
      Mentions: Valencia
    8. Combining Textual and Visual Information for Semantic Labeling of Images and Videos

      Semantic labeling of large volumes of image and video archives is difficult, if not impossible, with the traditional methods due to the huge amount of human effort required for manual labeling used in a supervised setting. Recently, semi-supervised techniques which make use of annotated image and video collections have been proposed as an alternative to reduce the human effort. In this direction, different techniques, which are mostly adapted from the information retrieval literature, are applied to learn the unknown one-to-one associations between visual structures and semantic descriptions. When the links are learned, the range of application areas is wide, including better retrieval ...
      Mentions: Ankara