1. Articles in category: Semantic

    3961-3984 of 4058 « 1 2 ... 163 164 165 166 167 168 169 »
    1. Semantic representation of Korean numeral classifier and its ontology building for HLT applications

      Abstract  The complexity of Korean numeral classifiers demands semantic as well as computational approaches that employ natural language processing (NLP) techniques. The classifier is a universal linguistic device, having the two functions of quantifying and classifying nouns in noun phrase constructions. Many linguistic studies have focused on the fact that numeral classifiers afford decisive clues to categorizing nouns. However, few studies have dealt with the semantic categorization of classifiers and their semantic relations to the nouns they quantify and categorize in building ontologies. In this article, we propose the semantic recategorization of the Korean numeral classifiers in the context of ...
    2. A large-scale classification of English verbs

      Abstract  Lexical classifications have proved useful in supporting various natural language processing (NLP) tasks. The largest verb classification for English is Levin’s (1993) work which defines groupings of verbs based on syntactic and semantic properties. VerbNet (VN) (Kipper et al. 2000; Kipper-Schuler 2005)—an extensive computational verb lexicon for English—provides detailed syntactic-semantic descriptions of Levin classes. While the classes included are extensive enough for some NLP use, they are not comprehensive. Korhonen and Briscoe (2004) have proposed a significant extension of Levin’s classification which incorporates 57 novel classes for verbs not covered (comprehensively) by Levin. Korhonen and ...
    3. Categorizing Unknown Words: A Decision Tree-Based Misspelling Identifier

      This paper introduces a robust, portable system for categorizing unknown words. It is based on a multi-component architecture where each component is responsible for identifying one class of unknown words. The focus of this paper is the component that identifies spelling errors. The misspelling identifier uses a decision tree architecture to combine multiple types of evidence about the unknown word. The misspelling identifier is evaluated using data from live closed captions, a genre replete with a wide variety of unknown words. Book chapter. DOI: 10.1007/3-540-46695-9_11. Authors: Janine Toole, Simon Fraser University Natural Language Lab, School of Computing ...
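The evidence-combination idea can be sketched as follows. The features, thresholds, and toy lexicon below are illustrative stand-ins (a hand-built tree), not the paper's learned decision tree:

```python
import difflib

# Toy lexicon; a real system would use a full dictionary.
LEXICON = ["caption", "window", "network", "language"]

def features(word, lexicon=LEXICON):
    """Illustrative evidence features for an unknown word (not the paper's set)."""
    close = difflib.get_close_matches(word, lexicon, n=1, cutoff=0.0)
    sim = difflib.SequenceMatcher(None, word, close[0]).ratio() if close else 0.0
    return {
        "lexicon_similarity": sim,                    # closeness to a known word
        "has_digit": any(c.isdigit() for c in word),  # codes/IDs are rarely typos
        "length": len(word),
    }

def is_misspelling(word):
    """A tiny hand-built decision tree over the evidence features."""
    f = features(word)
    if f["has_digit"]:                  # branch 1: digits suggest a code, not a typo
        return False
    if f["lexicon_similarity"] >= 0.8:  # branch 2: near-miss of a known word
        return True
    return False
```

A learned tree would induce such branches automatically from labeled examples instead of hard-coding them.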
    4. Correction of Medical Handwriting OCR Based on Semantic Similarity

      This paper presents a method for correcting handwriting Optical Character Recognition (OCR) output based on semantic similarity. Different ways of extracting semantic similarity measures from a corpus are analysed, with the best results achieved for the combination of the text-window context and the Rank Weight Function. An algorithm is proposed that selects the word sequence with the highest internal similarity. The method was trained on and applied to a corpus of real medical documents written in Polish. Book chapter. DOI: 10.1007/978-3-540-77226-2_45. Authors: Bartosz Broda, Institute of Applied Informatics, Wrocław University of Technology, Poland; Maciej ...
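A minimal sketch of the word-sequence selection step, assuming a toy pairwise similarity table in place of the corpus-derived measures (the names `COOC`, `sim`, and `best_sequence` are hypothetical):

```python
from itertools import product

# Toy pairwise similarities; the paper derives these from a corpus
# (text-window co-occurrence plus the Rank Weight Function).
COOC = {("blood", "pressure"): 0.9, ("flood", "pressure"): 0.1,
        ("blood", "test"): 0.8}

def sim(a, b):
    return COOC.get((a, b), COOC.get((b, a), 0.0))

def best_sequence(candidates):
    """Pick one OCR candidate per position, maximizing internal similarity."""
    def internal_sim(seq):
        return sum(sim(seq[i], seq[j])
                   for i in range(len(seq)) for j in range(i + 1, len(seq)))
    return max(product(*candidates), key=internal_sim)
```

The brute-force search over all candidate combinations is only for illustration; real sentences require a more efficient selection strategy.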
    5. Automatic building of an ontology on the basis of text corpora in Thai

      Abstract  This paper presents a methodology for automatically learning ontologies from Thai text corpora by extracting terms and relations. A shallow parser is used to chunk texts, on which we identify taxonomic relations with the help of cues: lexico-syntactic patterns and item lists. The main advantage of the approach is that it simplifies the task of concept and relation labeling, since cues help identify the ontological concepts and hint at their relations. However, these techniques pose certain problems, i.e. cue-word ambiguity, item-list identification, and numerous candidate terms. We also propose a methodology to solve these ...
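A lexico-syntactic cue of the kind described can be sketched in a few lines. The English "such as" pattern below is purely illustrative, since the paper's cues target Thai text:

```python
import re

# Illustrative English cue; the paper's cues are Thai-specific.
HYPONYM_CUE = re.compile(r"(\w+?)s such as ((?:\w+(?:, )?)+(?: and \w+)?)")

def extract_isa(text):
    """Extract (hyponym, hypernym) pairs from one lexico-syntactic pattern."""
    pairs = []
    for m in HYPONYM_CUE.finditer(text):
        hypernym = m.group(1)
        for hyponym in re.split(r", | and ", m.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs
```

The cue-word ambiguity mentioned in the abstract shows up immediately in practice: a pattern like this also fires on non-taxonomic uses of "such as", which is why the paper needs disambiguation steps.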
    6. Semantic authoring, runtime and training environment

      A system for developing semantic schema for natural language processing has a semantic runtime engine and a semantic authoring tool. The semantic runtime engine is adapted to map a natural language input to a semantic schema and to return the mapped results to an application domain. The semantic authoring tool is adapted to receive user input for defining the semantic schema and to interact with the semantic runtime engine to test the semantic schema against a query.
    7. Automatic Character Assignation

      This article outlines a simple method for parsing an ASCII-format dramatic work from the Project Gutenberg Corpus into separate characters. The motivation for the program is an upcoming study in computational stylistics and characterization in drama. Various previous approaches involving interactive media are examined, and the parser is evaluated by comparing its output to data annotated by hand and parsed automatically by the Opensourceshakespeare.org project parser. An acceptable level of accuracy is achieved, and ways to improve accuracy to very high levels are identified. Book chapter. DOI: 10.1007/978-1-84800-094-0_25. Authors: Gerard Lynch, Computational Linguistics Group, Department of ...
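A minimal version of such a speaker parser, assuming the common Gutenberg convention that a speech opens with an upper-case name followed by a period (the regex and function names are ours, not the paper's):

```python
import re
from collections import defaultdict

# Assumed Gutenberg convention: a speech opens with an upper-case
# name followed by a period, e.g. "  HAMLET. To be, or not to be:".
SPEAKER = re.compile(r"^\s*([A-Z][A-Z ]+)\.\s*(.*)$")

def split_by_character(lines):
    """Assign each line of an ASCII play to the character speaking it."""
    speeches = defaultdict(list)
    current = None
    for line in lines:
        m = SPEAKER.match(line)
        if m:
            current = m.group(1).strip()
            if m.group(2):
                speeches[current].append(m.group(2))
        elif current and line.strip():
            speeches[current].append(line.strip())
    return dict(speeches)
```

Stage directions, act headings, and inconsistent speaker formatting are exactly the cases where a real parser needs the extra handling the article evaluates.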
    8. Semantic Web Technologies for Enhancing Intelligent DSS Environments

      The next-generation Web, called the Semantic Web (SW), has lately been receiving much attention from the research and development communities globally. Many software designers, developers, and vendors have recently begun exploring the use of SW technologies in developing intelligent Web-based Decision Support Systems (DSS), since these technologies provide an attractive, application-neutral, platform-neutral Web environment that operates on top of the existing Web without modifying it. They are envisioned to provide machine interpretation and processing of existing Web information. With these powerful potential advantages, there is a need for DSS designers and developers to ...
    9. A Flexible Framework To Experiment With Ontology Learning Techniques

      Ontology learning refers to extracting conceptual knowledge from several sources and building an ontology from scratch, or enriching or adapting an existing ontology. It uses methods from a diverse spectrum of fields such as natural language processing, artificial intelligence, and machine learning. However, a crucial challenge is to quantitatively evaluate the usefulness and accuracy of both individual techniques and combinations of techniques when applied to ontology learning. It is an interesting problem because no comparative studies have been published. We are developing a flexible framework for ontology learning from text which provides a cyclical process that involves the successive application of ...
    10. Content-Based Recommendation Services for Personalized Digital Libraries

      This paper describes the possible use of advanced content-based recommendation methods in the area of Digital Libraries. Content-based recommenders analyze documents previously rated by a target user and build a profile that is exploited to recommend new documents of interest. One of the main limitations of traditional keyword-based approaches is that they are unable to capture the semantics of the user's interests, due to natural language ambiguity. We developed a semantic recommender system, called ITem Recommender, able to disambiguate documents before using them to learn the user profile. The Conference Participant Advisor service relies on the profiles learned by ITem Recommender to ...
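The profile-then-match idea behind content-based recommenders (the keyword-based baseline the paper improves on) can be sketched with raw term counts and cosine similarity; function names are illustrative:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts; a real system would use TF-IDF weights."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(liked_docs, candidates):
    """Rank candidates against a profile built from positively rated docs."""
    profile = Counter()
    for d in liked_docs:
        profile.update(vectorize(d))
    return max(candidates, key=lambda d: cosine(profile, vectorize(d)))
```

The paper's contribution is to disambiguate word senses before this matching step, so that "bank" in a finance profile does not match "river bank" documents.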
    11. Effectiveness of Methods for Syntactic and Semantic Recognition of Numeral Strings: Tradeoffs Between Number of Features and Length of Word N-Grams

      This paper describes and compares methods based on N-grams (specifically trigrams and pentagrams), together with five features, for recognising the syntactic and semantic categories of numeral strings representing money, numbers, dates, etc., in texts. The system employs three interpretation processes: word-N-gram construction with a tokeniser; rule-based processing of numeral strings; and N-gram-based classification. We extracted numeral strings from 1,111 online newspaper articles. For numeral-string interpretation, we chose 112 (10%) of the 1,111 articles to provide unseen test data (1,278 numeral strings) and used the remaining 999 articles to provide 11,525 numeral strings ...
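The rule-based component for numeral strings might look like the sketch below; the regular expressions and category labels are illustrative guesses, not the paper's actual rules:

```python
import re

# Illustrative rules only; the paper combines rules like these with
# word-N-gram features learned from newspaper text.
RULES = [
    ("money",  re.compile(r"^\$\d[\d,]*(\.\d+)?$")),
    ("date",   re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}$")),
    ("number", re.compile(r"^\d[\d,]*(\.\d+)?$")),
]

def classify_numeral(token):
    """First-match rule-based category for a numeral string."""
    for label, pattern in RULES:
        if pattern.match(token):
            return label
    return "other"
```

Rules alone cannot resolve ambiguous strings such as "1984" (year or quantity), which is where the surrounding word N-grams come in.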
    12. A Within-Frame Ontological Extension on FrameNet: Application in Predicate Chain Analysis and Question Answering

      An ontological extension of the frames in FrameNet is presented in this paper. The general conceptual relations between frame elements, in conjunction with the existing characteristics of this lexical resource, suggest a more sophisticated semantic analysis of lexical chains (e.g. predicate chains) exploited in many text-understanding applications. In particular, we have investigated its benefit for meaning-aware question answering when combined with an inference strategy. On the basis of our case analysis, the proposed knowledge representation mechanism over the frame elements of FrameNet has been shown to have an impact on answering natural language questions. Book chapter. DOI: 10.1007 ...
    13. Using Clustering for Web Information Extraction

      This paper introduces an approach that achieves automated data extraction from semi-structured Web pages by clustering. Both HTML tags and the textual features of text tokens are considered for similarity comparison. The first clustering process groups similar text tokens into text clusters, and the second clustering process groups similar data tuples into tuple clusters. A tuple cluster is a strong candidate for a repetitive data region. Book chapter. DOI: 10.1007/978-3-540-76928-6_43. Authors: Le Phong Bao Vuong, School of Mathematics, Statistics and Computer Science, Victoria University of Wellington, PO Box 600, Wellington, New Zealand; Xiaoying Gao, School of Mathematics ...
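A toy version of the token-clustering step, assuming a similarity that mixes tag-path overlap with a coarse text type; the weights, threshold, and greedy grouping are invented for illustration:

```python
def token_features(tag_path, text):
    """A token's features: its HTML tag path plus a coarse text type."""
    ttype = "number" if text.replace(",", "").replace(".", "").isdigit() else "string"
    return (tuple(tag_path), ttype)

def similarity(a, b):
    """Mix of tag-path overlap (Jaccard) and text-type agreement."""
    (path_a, type_a), (path_b, type_b) = a, b
    shared = len(set(path_a) & set(path_b)) / len(set(path_a) | set(path_b))
    return 0.7 * shared + 0.3 * (type_a == type_b)

def cluster(tokens, threshold=0.8):
    """Greedy grouping: a token joins a cluster only if similar to all members."""
    clusters = []
    for t in tokens:
        for c in clusters:
            if all(similarity(t, other) >= threshold for other in c):
                c.append(t)
                break
        else:
            clusters.append([t])
    return clusters
```

Tokens that share a `table/tr/td` path and a text type end up together, which is exactly the kind of cluster that signals a repetitive data region.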
    14. A Knowledge-Based Approach to Named Entity Disambiguation in News Articles

      Named entity disambiguation has been one of the main challenges in Information Extraction research and Semantic Web development. It has therefore attracted much research effort, with various methods introduced for different domains, scopes, and purposes. In this paper, we propose a new approach that is not limited to particular entity classes and does not require well-structured texts. The novelty is that it exploits the relations between co-occurring entities in a text, as defined in a knowledge base, for disambiguation. Combined with class weighting and coreference resolution, our knowledge-based method outperforms the KIM system on this problem. Implemented algorithms and conducted ...
    15. Learning Implicit User Interests Using Ontology and Search History for Personalization

      The key to providing a robust context for personalized information retrieval is to build a library which gathers the user's long-term and short-term interests and then to use it in the retrieval process in order to deliver results that better meet the user's information needs. In this paper, we present an enhanced approach for learning a semantic representation of the underlying user's interests using the search history and a predefined ontology. The basic idea is to learn the user's interests by collecting evidence from his search history and to represent them conceptually using the concept ...
    16. OntoGame: Towards Overcoming the Incentive Bottleneck in Ontology Building

      Despite significant advances in ontology learning, building ontologies remains a task that depends heavily on human intelligence, both as a source of domain expertise and for producing a consensual conceptualization. This means that individuals need to contribute time, and sometimes other resources, to an ontology project. We can now observe a sharp contrast in user interest between two branches of Web activity: while the “Web 2.0” movement thrives on an unprecedented volume of contributions from Web users, we witness a substantial lack of user involvement in ontology projects for the Semantic Web. We assume that one cause of the ...
    17. Automatic Annotation in Data Integration Systems

      CWSD (Combined Word Sense Disambiguation) is an algorithm for the automatic annotation of structured and semi-structured data sources. Instead of being targeted at textual data sources like most traditional WSD algorithms, CWSD can exploit knowledge from the structure of data sources together with the lexical knowledge associated with schema elements (terms in the following). We integrated CWSD into the MOMIS system (Mediator EnvirOnment for Multiple Information Sources) [1], a framework designed for the integration of data sources, where the lexical annotation of terms was previously performed manually by the user. CWSD combines a structural disambiguation algorithm that starts ...
    18. Taxonomy Construction Using Compound Similarity Measure

      Taxonomy learning is one of the major steps in the ontology learning process. Manual construction of taxonomies is a time-consuming and cumbersome task. Recently many researchers have focused on automatic taxonomy learning, but the quality of the generated taxonomies is still not satisfactory. In this paper we propose a new compound similarity measure. The measure is based on both knowledge-poor and knowledge-rich approaches to word similarity. We also used a neural network model to combine several similarity methods. We have compared our method with a simple syntactic similarity measure; our measure considerably improves the precision and recall of automatically generated ...
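The combination idea can be sketched as a weighted sum of a knowledge-poor score and a knowledge-rich score. Here the weight is a fixed constant, whereas the paper learns the combination with a neural network; the specific measures chosen below (cosine and a Wu-Palmer-style depth formula) are our assumptions:

```python
import math

def distributional_sim(vec_a, vec_b):
    """Knowledge-poor score: cosine over corpus co-occurrence vectors."""
    dot = sum(vec_a.get(k, 0) * v for k, v in vec_b.items())
    na = math.sqrt(sum(v * v for v in vec_a.values()))
    nb = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (na * nb) if na and nb else 0.0

def path_sim(depth_lca, depth_a, depth_b):
    """Knowledge-rich score: Wu-Palmer-style similarity from taxonomy depths."""
    return 2 * depth_lca / (depth_a + depth_b)

def compound_sim(vec_a, vec_b, depth_lca, depth_a, depth_b, w=0.5):
    # Fixed weight for illustration; the paper learns the combination
    # with a neural network instead.
    return w * distributional_sim(vec_a, vec_b) + (1 - w) * path_sim(depth_lca, depth_a, depth_b)
```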
    19. Semantic Matching Based on Enterprise Ontologies

      Semantic Web technologies have in recent years also started to find their way into the world of commercial enterprises. Enterprise ontologies can be used as a basis for determining the relevance of information with respect to the enterprise, and the interests of individuals can be expressed by means of the enterprise ontology. The main contribution of our approach is the integration of point-set distance measures with a modified semantic distance measure for pair-wise concept-distance calculation. Our combined measure can be used to determine the intra-ontological distance between sub-ontologies. Book chapter. DOI: 10.1007/978-3-540-76848-7_76. Authors: Andreas Billig, Jönköping University ...
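One standard point-set distance that fits this setting is the Hausdorff distance over pairwise concept distances. The sketch below uses a toy numeric distance in place of the paper's modified semantic distance; which point-set measure the paper actually adopts is not stated in the snippet:

```python
def hausdorff(set_a, set_b, dist):
    """Symmetric Hausdorff distance between two sets, given a pairwise distance.

    Each point is matched to its nearest neighbour in the other set;
    the result is the worst such nearest-neighbour distance.
    """
    d_ab = max(min(dist(a, b) for b in set_b) for a in set_a)
    d_ba = max(min(dist(a, b) for a in set_a) for b in set_b)
    return max(d_ab, d_ba)
```

With `dist` replaced by a concept-level semantic distance, the same function compares two sub-ontologies as sets of concepts.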
    20. Labeling Data Extracted from the Web

      We consider finding descriptive labels for anonymous, structured datasets, such as those produced by state-of-the-art Web wrappers. We give a probabilistic model to estimate the affinity between attributes and labels, and describe a method that uses a Web search engine to populate the model. We discuss a method for finding good candidate labels for unlabeled datasets. Ours is the first unsupervised labeling method that does not rely on mining the HTML pages containing the data. Experimental results with data from 8 different domains show that our methods achieve high accuracy even with very few search engine accesses. Book chapter ...
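The affinity model might be populated as sketched below, with stub hit counts standing in for real search-engine result counts; the PMI-style normalization and all names are our assumptions, not necessarily the paper's exact model:

```python
# Stub hit counts standing in for search-engine result counts; the paper's
# model is populated by issuing real queries.
HITS = {("Tom Hanks", "actor"): 900, ("Tom Hanks", "director"): 150,
        ("Tom Hanks",): 1000, ("actor",): 5000, ("director",): 3000}

def affinity(value, label):
    """PMI-style affinity between an attribute value and a candidate label."""
    joint = HITS.get((value, label), 0)
    return joint / (HITS[(value,)] * HITS[(label,)])

def best_label(values, labels):
    """Label whose summed affinity over the column's values is highest."""
    return max(labels, key=lambda lab: sum(affinity(v, lab) for v in values))
```

Summing affinities over several values of the same column is what lets a handful of search accesses per value still produce a confident column label.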
    21. Automatic Feeding of an Innovation Knowledge Base Using a Semantic Representation of Field Knowledge

      In this paper, considering a particular application field, innovation, we propose an automatic system to feed an innovation knowledge base (IKB) from texts located on the Web. To facilitate the extraction of concepts from texts, we distinguish two knowledge types in our work: primitive knowledge and definite knowledge. Each is represented separately. Primitive knowledge is directly extracted from natural-language texts and temporarily stored in a specific base called the TKB (Temporary Knowledge Base). The input to the IKB is the knowledge filtered from the TKB by specified rules. After each filtering step, the TKB ...
    22. Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors

      An apparatus and method are disclosed for producing a semantic representation of information in a semantic space. The information is first represented in a table that stores values which indicate a relationship with predetermined categories. The categories correspond to dimensions in the semantic space. The significance of the information with respect to the predetermined categories is then determined. A trainable semantic vector (TSV) is constructed to provide a semantic representation of the information. The TSV has dimensions equal to the number of predetermined categories and represents the significance of the information relative to each of the predetermined categories. Various types ...
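The patent's trainable semantic vector can be illustrated as a per-category significance vector searched by cosine similarity; the category set, normalization, and function names below are illustrative, not the patented construction:

```python
import math

CATEGORIES = ["sports", "finance", "science"]  # hypothetical category set

def semantic_vector(category_counts):
    """One dimension per category; values give relative significance."""
    raw = [category_counts.get(c, 0) for c in CATEGORIES]
    total = sum(raw) or 1
    return [v / total for v in raw]

def nearest(query_counts, corpus):
    """Search: return the corpus item whose vector is closest to the query."""
    q = semantic_vector(query_counts)
    def cos(v):
        dot = sum(a * b for a, b in zip(q, v))
        nq = math.sqrt(sum(a * a for a in q))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nq * nv) if nq and nv else 0.0
    return max(corpus, key=lambda item: cos(semantic_vector(item[1])))[0]
```

Because every item lives in the same fixed category space, the one vector representation supports clustering, classification, and search alike, which is the point of the claim.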