1. Articles in category: Segmentation

    553-576 of 675
    1. Question Answering from Lecture Videos Based on Automatically-Generated Learning Objects

      In the past decade, we have witnessed a dramatic increase in the availability of online academic lecture videos. There are technical problems in the use of recorded lectures for learning: the problem of easy access to the multimedia lecture video content and the problem of finding the semantically appropriate information very quickly. The retrieval of audiovisual lecture recordings is a complex task comprising many objects. In our solution, speech recognition is applied to create a tentative and deficient transcription of the lecture video recordings. The transcription and the words from the PowerPoint slides are sufficient to generate semantic metadata ...
      Mentions: Germany Potsdam
    2. A Joint Segmenting and Labeling Approach for Chinese Lexical Analysis

      This paper introduces an approach which jointly performs a cascade of segmentation and labeling subtasks for Chinese lexical analysis, including word segmentation, named entity recognition and part-of-speech tagging. Unlike the traditional pipeline manner, the cascaded subtasks are conducted in a single step simultaneously, therefore error propagation could be avoided and the information could be shared among multi-level subtasks. In this approach, Weighted Finite State Transducers (WFSTs) are adopted. Within the unified framework of WFSTs, the models for each subtask are represented and then combined into a single one. Thereby, through one-pass decoding the joint optimal outputs for multi-level processes will ...
    3. An unsupervised machine learning approach to segmentation of clinician-entered free text.

      AMIA Annu Symp Proc. 2007;:811-5. Authors: Wrenn J, Stetson PD, Johnson SB. Natural language processing, an important tool in biomedicine, fails without successful segmentation of words and sentences. Tokenization is a form of segmentation that identifies boundaries separating semantic units, for example words, dates and numbers, within a text. We sought to construct a highly generalizable tokenization algorithm with no prior knowledge of characters or their function, based on the inherent statistical properties of token and sentence boundaries. Tokenizing clinician-entered free text, we achieved precision and recall of ...
    4. System for identifying paraphrases using machine translation

      BACKGROUND OF THE INVENTION. The present invention deals with identifying paraphrases in text. More specifically, the present invention deals with using machine translation techniques to identify and generate paraphrases. The recognition and generation of paraphrases is a key facet to many applications of natural language processing systems. Being able to identify that two different pieces of text are equivalent in meaning enables a system to behave much more intelligently. A fundamental goal of wor
    5. Evaluating machine translation with LFG dependencies

      Abstract: In this paper we show how labelled dependencies produced by a Lexical-Functional Grammar parser can be used in Machine Translation evaluation. In contrast to most popular evaluation metrics based on surface string comparison, our dependency-based method does not unfairly penalize perfectly valid syntactic variations in the translation, shows less bias towards statistical models, and the addition of WordNet provides a way to accommodate lexical differences. In comparison with other metrics on a Chinese–English newswire text, our method obtains high correlation with human scores, both on a segment and system level. Content Type: Journal Article. DOI: 10.1007/s10590-008-9038-1. Authors: Karolina ...
    6. An Evidence-Based Approach to Handle Semantic Heterogeneity in Interoperable Distributed User Models

      Nowadays, the idea of personalization is regarded as crucial in many areas. This requires quick and robust approaches for developing reliable user models. The next generation user models will be distributed (segments of the user model will be stored by different applications) and interoperable (systems will be able to exchange and use user model fractions to enrich user experiences). We propose a new approach to deal with one of the key challenges of interoperable distributed user models - semantic heterogeneity. The paper presents algorithms to automate the user model exchange across applications based on evidential reasoning and advances in the Semantic ...
    7. The canonical processes of a dramatized approach to information presentation

      Abstract: This paper describes the application “Carletto the spider” in terms of the mapping with the canonical processes of media production. “Carletto the spider” is a character-based guide to a historical site and implements the Dramatour approach for the design of drama-based interactive presentations. Dramatization makes presentations more engaging, thus improving the reception of the content by the user. The major technical issue of the approach is the segmentation of the presentation into audiovisual units that are edited on-the-fly in a way that guarantees dramatic continuity while adapting to the user response. We describe the workflow of the application and ...
    8. System and method for performing analysis on word variants

      BACKGROUND OF THE INVENTION. The present invention is related to natural language processing. More particularly, the present invention is related to natural language systems and methods for processing words and associated word-variant forms, such as verb-clitic forms, in one of a range of languages, for example Spanish, when they are encountered in a textual input. Numerous natural language processing applications rely upon a lexicon for operation. Such applications include word breaking (for search
    9. Dynamic Browsing of Audiovisual Lecture Recordings Based on Automated Speech Recognition

      The number of digital lecture video recordings has increased dramatically since recording technology became available. The accessibility and the search inside of this large archive are limited and difficult. Manual annotation and segmentation is time-consuming and useless. A promising approach is based on using the audio layer of a lecture recording to get information about the lecture contents. In this paper, we are presenting a retrieval method and a user-interface based on existing recorded lectures. A deficient transcription from a speech recognition engine (SRE) is sufficient for browsing in the video-archive. A user-interface for dynamic browsing of the e-learning contents ...
      Mentions: Germany Potsdam GmbH
    10. Chinese Word Segmentation for Terrorism-Related Contents

      In order to analyze security and terrorism related content in Chinese, it is important to perform word segmentation on Chinese documents. There are many previous studies on Chinese word segmentation. The two major approaches are statistic-based and dictionary-based approaches. The pure statistic methods have lower precision, while the pure dictionary-based methods cannot deal with new words and are restricted to the coverage of the dictionary. In this paper, we propose a hybrid method that avoids the limitations of both approaches. Through the use of a suffix tree and mutual information (MI) with the dictionary, our segmenter, called IASeg, achieves a high ...
    11. Speech and sliding text aided sign retrieval from hearing impaired sign news videos

      Abstract: The objective of this study is to automatically extract annotated sign data from the broadcast news recordings for the hearing impaired. These recordings present an excellent source for automatically generating annotated data: in news for the hearing impaired, the speaker also signs with the hands as she talks. On top of this, there is also corresponding sliding text superimposed on the video. The video of the signer can be segmented with the help of either the speech or both the speech and the text, generating segmented and annotated sign videos. We call this application Signiary, and aim to ...
      Mentions: Berlin Heidelberg
    12. System and method for semantic video segmentation based on joint audiovisual and text analysis

      System and method for partitioning a video into a series of semantic units where each semantic unit relates to a generally complete thematic topic. A computer implemented method for partitioning a video into a series of semantic units wherein each semantic unit relates to a theme or a topic, comprises dividing a video into a plurality of homogeneous segments, analyzing audio and visual content of the video, extracting a plurality of keywords from the speech content of each of the plurality of homogeneous segments of the video, and detecting and merging a plurality of groups of semantically related and temporally ...
    13. A Survey of Chinese Text Similarity Computation

      There is no natural delimiter between words in Chinese texts. Moreover, Chinese is a semotactic language with complicated structures focusing on semantics. Its differences from Western languages bring more difficulties in Chinese word segmentation and more challenges in Chinese natural language understanding. How to compute Chinese text similarity with high precision, high recall and low cost is a very important but challenging task. Many researchers have studied it for a long time. In this paper, we examine existing Chinese text similarity measures, including measures based on statistics and semantics. Our work provides insights into the advantages and disadvantages of each ...
    14. Semi-joint Labeling for Chinese Named Entity Recognition

      Named entity recognition (NER) is an essential component of text mining applications. In Chinese sentences, words do not have delimiters; thus, incorporating word segmentation information into an NER model can improve its performance. Based on the framework of dynamic conditional random fields, we propose a novel labeling format, called semi-joint labeling which partially integrates word segmentation information and named entity tags for NER. The model enhances the interaction of segmentation tags and NER achieved by traditional approaches. Moreover, it allows us to consider interactions between multiple chains in a linear-chain model. We use data from the SIGHAN 2006 NER bakeoff ...
      Mentions: Taipei Taiwan Hsinchu
    15. Recognizing Biomedical Named Entities in Chinese Research Abstracts

      Most research on biomedical named entity recognition has focused on English texts, e.g., MEDLINE abstracts. However, recent years have also seen significant growth of biomedical publications in other languages. For example, the Chinese Biomedical Bibliographic Database has collected over 3 million articles published after 1978 from 1600 Chinese biomedical journals. We present here a Conditional Random Field (CRF) based system for recognizing biomedical named entities in Chinese texts. Viewing Chinese sentences as sequences of characters, we trained and tested the CRF model using a manually annotated corpus containing 106 research abstracts (481 sentences in total). The features we used ...
    16. A Statistical Model for Topic Segmentation and Clustering

      This paper presents a statistical model for discovering topical clusters of words in unstructured text. The model uses a hierarchical Bayesian structure and it is also able to identify segments of text which are topically coherent. The model is able to assign each segment to a particular topic and thus categorizes the corresponding document to potentially multiple topics. We present some initial results indicating that the word topics discovered by the proposed model are more consistent compared to other models. Our early experiments show that our model clustering performance compares well with other clustering models on a real text corpus ...
    17. Method and apparatus for implementing Q&A function and computer-aided authoring

      The present invention provides a method for implementing Q&A function for an electronic document, a method for computer-aided authoring, a method for browsing an electronic document, a computer-aided authoring apparatus, a browser capable of providing Q&A function, a method for providing Q&A service utilizing computers and a system for providing Q&A service. Said method for implementing Q&A function for an electronic document includes: when the writer is writing an electronic document, generating Q&A information used for Q&A function so that the reliability of the generated Q&A information is ensured by the writer; saving said Q&A information in correspondence with said ...
    18. Systems and methods for sentence based interactive topic-based text summarization

      Techniques for determining sentence based interactive topic-based summarization are provided. A text to be summarized is segmented. Discrete keyword, key-phrase, n-gram, sentence and other sentence constituent based summaries are generated based on statistical measures for each text segment. Interactive topic-based summaries are displayed with human sensible omitted text indicators such as alternate colors, fonts, sounds, tactile elements or other human sensible display characteristics useful in indicating omitted text. Individual and/or combinations of discrete keyword, key-phrase, n-gram, sentence, noun phrase and sentence constituent based summaries are dynamically displayed to provide an overview of topic and subtopic development within a text ...
    19. Video segmentation using Maximum Entropy Model

      Abstract: Detecting objects of interest from a video sequence is a fundamental and critical task in automated visual surveillance. Most current approaches only focus on discriminating moving objects by background subtraction whether or not the objects of interest can be moving or stationary. In this paper, we propose layers segmentation to detect both moving and stationary target objects from surveillance video. We extend the Maximum Entropy (ME) statistical model to segment layers with features, which are collected by constructing a codebook with a set of codewords for each pixel. We also indicate how the training models are used for the ...
    20. Pattern-Based Extraction of Addresses from Web Page Content

      Extraction of addresses and location names from Web pages is a challenging task for search engines. Traditional information extraction and natural language processing models remain unsuccessful in the context of the Web because of the uncontrolled heterogeneous nature of the Web resources as well as the effects of HTML and other markup tags. We describe a new pattern-based approach for extraction of addresses from Web pages. Both HTML and vision-based segmentations are used to increase the quality of address extraction. The proposed system uses several address patterns and a small table of geographic knowledge to hit addresses and then itemize them ...
    21. Power shifts in web-based translation memory

      Abstract: Web-based translation memory (TM) is a recent and little-studied development that is changing the way localisation projects are conducted. This article looks at the technology that allows for the sharing of TM databases over the internet to find out how it shapes the translator’s working environment. It shows that so-called pre-translation, until now the standard way for clients to manage translation tasks with freelancers, is giving way to web-interactive translation. Thus, rather than interacting with their own desktop databases as before, translators now interface with each other through server-based translation memories, so that a newly entered term or ...
      Mentions: Australia
    22. Hybrid Markov Logic Networks

      Jue Wang, Pedro Domingos. Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, U.S.A. {juewang, pedrod}@cs.washington.edu. Abstract: Markov logic networks (MLNs) combine first-order logic and Markov networks, allowing us to handle the complexity and uncertainty of real-world problems in a single consistent framework. However, in MLNs all variables and features are discrete, while most real-world applications also contain continu
    23. A Clustering Analysis for Target Group Identification by Locality in Motor Insurance Industry

      A deep understanding of different aspects of business performance and operations is necessary for a leading insurance company to maintain its position on the market and make further development. This chapter presents a clustering analysis for target group identification by locality, based on a case study in the motor insurance industry. Soft computing techniques have been applied to understand the business and customer patterns by clustering data sets sourced from policy transactions and policyholders’ profiles. Self organizing map clustering and k-means clustering are used to perform the segmentation tasks in this study. Such clustering analysis can also be employed as ...
    24. Segmentation-Driven Offline Handwritten Chinese and Arabic Script Recognition

      The market of handwriting recognition applications is increasing rapidly due to continuous advancement in OCR technology. This paper summarizes our recent efforts on offline handwritten Chinese script recognition using a segmentation-driven approach. We address two essential problems, namely isolated character recognition and establishment of the probabilistic segmentation model. To improve the isolated character recognition accuracy, we propose a heteroscedastic linear discriminant analysis algorithm to extract more discrimination information from original character features, and implement a minimum classification error learning scheme to optimize classifier parameters. In the segmentation stage, information from three different sources, namely geometric layout, character recognition confidence, and ...