    1. Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives.

      J Am Med Inform Assoc. 2013 Mar-Apr;20(2):356-62

      Authors: Jindal P, Roth D

      Abstract OBJECTIVE: This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document into coherent entities. MATERIALS AND METHODS: A knowledge-intensive approach for coreference resolution is employed.

    2. Methods for automated essay analysis

      Systems and methods for creating a mathematical model for use in identifying discourse elements are described. A plurality of first essays relating to a particular subject are received, where each first essay is in an electronic format. Annotations for each first essay are received, where each annotation identifies at least one discourse element. Features are identified with a processor, where each feature is exhibited by at least one identified discourse element. Empirical frequencies are computed with a processor, where each empirical frequency relates to the presence of a feature with respect to the identified discourse elements across the plurality of ...
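
      The empirical-frequency step described above can be illustrated with a minimal sketch. It is not the method of the abstract itself: the data layout (a list of `(label, features)` pairs), the feature names, and the choice to normalize by the number of elements sharing a label are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def empirical_frequencies(elements):
    """Estimate how often each feature occurs within each discourse-element label.

    `elements` is a hypothetical list of (label, features) pairs, one per
    annotated discourse element. Returns {label: {feature: fraction}}.
    """
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    for label, features in elements:
        label_counts[label] += 1
        # Count each feature at most once per element.
        for feature in set(features):
            feature_counts[label][feature] += 1
    return {
        label: {
            feature: count / label_counts[label]
            for feature, count in feature_counts[label].items()
        }
        for label in label_counts
    }


# Hypothetical annotations: labels and feature names are invented for the example.
annotated = [
    ("thesis", {"first_paragraph", "has_modal"}),
    ("thesis", {"first_paragraph"}),
    ("conclusion", {"last_paragraph", "has_connective"}),
]
freqs = empirical_frequencies(annotated)
print(freqs["thesis"])
```

      Frequencies of this kind could then feed a probabilistic model that scores candidate discourse elements in unseen essays; how the described system actually uses them is not stated in the truncated abstract.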
    3. Automatic discourse connective detection in biomedical text.

      J Am Med Inform Assoc. 2012 Sep-Oct;19(5):800-8

      Authors: Ramesh BP, Prasad R, Miller T, Harrington B, Yu H

      Abstract OBJECTIVE: Relation extraction in biomedical text mining systems has largely focused on identifying clause-level relations, but increasing sophistication demands the recognition of relations at discourse level. A first step in identifying discourse relations involves the detection of discourse connectives: words or phrases used in text to express discourse relations. In this study supervised machine-learning approaches were developed and evaluated for automatically identifying discourse connectives in biomedical text. MATERIALS AND ...
    4. A Novel System for Unlabeled Discourse Parsing in the RST Framework

      This paper presents UDRST, an unlabeled discourse parsing system in the RST framework. UDRST consists of a segmentation model and a parsing model. The segmentation model exploits subtree features to rerank N-best outputs of a base segmenter, which uses syntactic and lexical features in a CRF framework. In the parsing model, we present two algorithms for building a discourse tree from a segmented text: an incremental algorithm and a dual decomposition algorithm. Our system achieves 77.3% in the unlabeled score on the standard test set of the RST Discourse Treebank corpus, an improvement of 5.0% over HILDA [6 ...
    5. Conceptual recurrence plots: revealing patterns in human discourse.

      IEEE Trans Vis Comput Graph. 2012 Jun;18(6):988-97

      Authors: Angus D, Smith A, Wiles J

      Abstract Human discourse contains a rich mixture of conceptual information. Visualization of the global and local patterns within this data stream is a complex and challenging problem. Recurrence plots are an information visualization technique that can reveal trends and features in complex time series data. The recurrence plot technique works by measuring the similarity of points in a time series to all other points in the same time series and plotting the results ...
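
      The basic recurrence plot idea, comparing every point of a series with every other point, can be sketched in a few lines. This is a generic illustration and not the conceptual recurrence plot of the paper: the scalar series, the absolute-difference distance, and the `threshold` parameter are assumptions chosen to keep the example small.

```python
def recurrence_matrix(series, threshold):
    """Return an n x n 0/1 matrix; entry (i, j) is 1 when points i and j
    of the series lie within `threshold` of each other."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def render(matrix):
    """Draw the matrix as text, using '#' for recurrent pairs and '.' otherwise."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in matrix)


# Toy series that alternates between values near 0 and values near 1.
series = [0.0, 1.0, 0.1, 1.1, 0.0]
matrix = recurrence_matrix(series, threshold=0.2)
print(render(matrix))
```

      For discourse data, the scalar distance would be replaced by a similarity measure between utterances, for example one based on shared concepts; the choice of measure is left open here.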
    6. On the Syntax and Translation of Finnish Discourse Clitics

      Finnish has a set of morphemes called discourse clitics, which attach to words and express things like contrasting and reminding. This paper builds a formal grammar to specify the syntax and morphology of these clitics. The grammar is written in GF, Grammatical Framework, which has a distinction between abstract syntax (tree structures) and concrete syntax (surface structures such as strings). The abstract syntax of clitics defines their contribution to the discourse semantics of sentences, in particular the topic-focus structure. The concrete syntax defines the realization in Finnish. We also show another concrete syntax, for English, which makes it possible to ...
    7. A computational model for the identification and assessment of structural similarities in argumentative discourses

      Abstract Contemporary argumentation systems provide limited or no support for argument and related information processing. This paper presents a generic computational model that is able to identify and assess structural similarities in argumentative discourses. Focusing on the structure of such discourses, we sketch representative scenarios where the proposed model can be applied to a wide range of argumentation systems in order to define, elaborate and mine meaningful argumentation patterns. We argue that the proposed model contributes to both theoretical and practical aspects of argumentation.

      Content Type: Journal Article. Pages: 1-23. DOI: 10.1007/s10844-012-0212-9. Authors: George Gkotsis, Industrial Management and Information Systems Lab ...
    8. Multi-Dimensional Analysis of Political Language

      This paper presents a method for the valuation of discourses from different linguistic perspectives: lexical, syntactic and semantic. We describe a platform, the Discourse Analysis Tool (DAT), which integrates a range of language processing tools with the intent to build complex characterisations of political discourse. The idea behind this construction is that the vocabulary and the clause structure of the sentence betray the speaker’s level of culture, while the semantic classes mentioned in a speech characterise the speaker’s orientation. When the object of study is political discourse, an investigation along these dimensions could bring out features ...
    9. A Computational Analysis of Collective Discourse. (arXiv:1204.3498v1 [cs.SI])

      This paper is focused on the computational analysis of collective discourse, a collective behavior seen in non-expert content contributions in online social media. We collect and analyze a wide range of real-world collective discourse datasets from movie user reviews to microblogs and news headlines to scientific citations. We show that all these datasets exhibit diversity of perspective, a property seen in other collective systems and a criterion in wise crowds. Our experiments also confirm that the network of different perspective co-occurrences exhibits the small-world property with high clustering of different perspectives. Finally, we show that non-expert contributions in collective discourse ...
    10. Multiple Level of Referents in Information State

      As we strive for sophisticated machine translation and reliable information extraction, we have launched a subproject pertaining to the practical elaboration of “intensional” levels of discourse referents in the framework of a representational dynamic discourse semantics, the DRT-based [14] [2], and the implementation of the resulting representations within a complete model of communicating interpreters’ minds as it is captured formally by means of functions σ, α, λ and κ [5]. We show analyses of chiefly Hungarian linguistic data, which range from revealing the complex semantic contribution of small affixes through pointing out the multiply intensional nature of certain (pre)verbs to ...
    11. A Summarization Strategy of Chinese News Discourse

      Due to the problem of information overloading, automatic text summarization is becoming more and more necessary. This paper proposes a strategy for Chinese news discourse summarization based on veins theory. This method can produce a summary of an original text without requiring its full semantic interpretation, but instead relying on the discourse structure.

      Content Type: Book Chapter. Pages: 389-394. DOI: 10.1007/978-3-642-28308-6_53. Authors: Deliang Wang, School of Foreign Languages and Literatures, Beijing Normal University, Beijing, China. Book Series: Advances in Intelligent and Soft Computing. Print ISSN: 1867-5662. Book Series Volume: 145/2012. Book: Proceedings of the 2011 2nd International Congress on Computer ...
    12. Contested Collective Intelligence: Rationale, Technologies, and a Human-Machine Annotation Study

      Abstract  We propose the concept of Contested Collective Intelligence (CCI) as a distinctive subset of the broader Collective Intelligence design space. CCI is relevant to the many organizational contexts in which it is important to work with contested knowledge, for instance, due to different intellectual traditions, competing organizational objectives, information overload or ambiguous environmental signals. The CCI challenge is to design sociotechnical infrastructures to augment such organizational capability. Since documents are often the starting points for contested discourse, and discourse markers provide a powerful cue to the presence of claims, contrasting ideas and argumentation, discourse and rhetoric provide an annotation ...
    13. Processing Coordinated Structures in PENG Light

      PENG Light is a controlled natural language designed to write unambiguous specifications that can be translated automatically via discourse representation structures into a formal target language. Instead of writing axioms in a formal language, an author writes a specification and the associated background axioms directly in controlled natural language. In this paper, we first review the controlled natural language PENG Light and show how a discourse representation structure is generated for sentences written in PENG Light. We then discuss two different solutions of how discourse representation structures can be implemented for coordinated structures. Finally, we show how an efficient implementation ...
    14. Discourse Segmentation for Sentence Compression

      Earlier studies have raised the possibility of summarizing at the level of the sentence. This simplification should help in adapting textual content to a limited space. Therefore, sentence compression is an important resource for automatic summarization systems. However, few studies consider sentence-level discourse segmentation for the compression task; to our knowledge, none do so for Spanish. In this paper, we study the relationship between discourse segmentation and compression for sentences in Spanish. We use a discourse segmenter and observe to what extent the passages deleted by annotators fit the discourse structures detected by the system. The main idea is to ...
    15. Processing Text-Technological Resources in Discourse Parsing

      Discourse parsing of complex text types such as scientific research articles requires the analysis of an input document on linguistic and structural levels that go beyond traditionally employed lexical discourse markers. This chapter describes a text-technological approach to discourse parsing. Discourse parsing with the aim of providing a discourse structure is seen as the addition of a new annotation layer for input documents marked up on several linguistic annotation levels. The discourse parser generates discourse structures according to the Rhetorical Structure Theory. An overview of the knowledge sources and components for parsing scientific journal articles is given. The parser’s ...
    16. The biomedical discourse relation bank.

      BMC Bioinformatics. 2011;12:188

      Authors: Prasad R, McRoy S, Frid N, Joshi A, Yu H

      Abstract BACKGROUND: Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource. RESULTS: We have developed the Biomedical Discourse Relation Bank (BioDRB), in which we have annotated explicit and implicit discourse relations in 24 ...
    17. Learning Content Selection Rules for Generating Object Descriptions in Dialogue. (arXiv:1109.2136v1 [cs.CL])

      A fundamental requirement of any task-oriented dialogue system is the ability to generate object descriptions that refer to objects in the task domain. The subproblem of content selection for object descriptions in task-oriented dialogue has been the focus of much previous work and a large number of models have been proposed. In this paper, we use the annotated COCONUT corpus of task-oriented design dialogues to develop feature sets based on Dale and Reiter's (1995) incremental model, Brennan and Clark's (1996) conceptual pact model, and Jordan's (2000b) intentional influences model, and use these feature sets in a machine learning experiment to ...
    18. On the Explicit and Implicit Spatiotemporal Architecture of Narratives of Personal Experience

      Expanding on recent research into the predictability of explicit linguistic spatial information relative to features of discourse structure, we present the results of several machine learning studies which leverage rhetorical relations, events, temporal information, text sequence, and both explicit and implicit linguistic spatial information in three different corpora of narrative discourses. On average, classifiers predict figure, ground, spatial verb and preposition and frame of reference to 75% accuracy, rhetorical relations to 72% accuracy, and events to 76% accuracy (all values have statistical significance above majority class baselines). These results hold independent of the number of authors, subject matter, length and ...
    19. Event in Compositional Dynamic Semantics. (arXiv:1108.5017v1 [cs.CL])

      We present a framework which constructs an event-style discourse semantics. The discourse dynamics are encoded in continuation semantics and various rhetorical relations are embedded in the resulting interpretation of the framework. We assume that discourse and sentence are distinct semantic objects that play different roles in meaning evaluation. Moreover, two sets of composition functions, for handling different discourse relations, are introduced. The paper first gives the necessary background and motivation for event and dynamic semantics, then introduces the framework with detailed examples.
    20. Event in Compositional Dynamic Semantics

      We present a framework which constructs an event-style discourse semantics. The discourse dynamics are encoded in continuation semantics and various rhetorical relations are embedded in the resulting interpretation of the framework. We assume that discourse and sentence are distinct semantic objects that play different roles in meaning evaluation. Moreover, two sets of composition functions, for handling different discourse relations, are introduced. The paper first gives the necessary background and motivation for event and dynamic semantics, then introduces the framework with detailed examples.

      Content Type: Book Chapter. Pages: 219-234. DOI: 10.1007/978-3-642-22221-4_15. Authors: Sai Qian, LORIA & INRIA Nancy Grand-Est, BP 239, 54506 ...
    21. From Connectives to Argumentative Markers: A Quest for Markers of Argumentative Moves and of Related Aspects of Argumentative Discourse

      Abstract  In this paper, I explore the potential of systematically studying the linguistic surface of discourse for the purposes of identifying markers of argumentative moves and other related categories, such as types of arguments and argumentative strategies. Such a list of argumentative markers can prove useful for the (semi)automatic treatment of a large corpus of texts. After reviewing literature on the linguistic realization of argumentative moves as well as literature on the subject of discourse markers, it becomes clear that the search for representative items of argumentative markers cannot be restricted to those elements marking relations but that it ...
    22. Towards a Discourse-driven Taxonomic Inference Model

      This chapter describes ongoing work, the goal of which is to create a discourse-driven inference model, as well as to construct resources using such a model. The data processed consists of texts from two encyclopedias of the medical domain. Stylistic properties characteristic of encyclopedia entries, such as layout-based features alongside semantic (conceptual) document structuring, constitute the mechanisms underlying the inference model. Three parts of the model are explained in detail, providing experimental results that are based on language processing techniques: (i) identifying taxonomic document structure by machine learning; (ii) discourse-driven construction of text–hypothesis pairs for examining types of ...
