1. Articles in category: WSD

    73-96 of 369 « 1 2 3 4 5 6 7 ... 14 15 16 »
    1. Methods for Automatic WSD

      Research in automatic word sense disambiguation has a history as long as that of computational linguistics itself. In this chapter, we take a two-dimensional approach to review the development and state of the art of the field, by the knowledge sources used for disambiguation on the one hand, and the algorithmic mechanisms with which the knowledge sources are actually deployed on the other. The trend for the latter is relatively clear, correlating closely with the historical development of many other natural language processing subtasks, where conventional knowledge-based methods gradually give way to scalable, corpus-based statistical and supervised methods. While the ...
    2. The Psychology of WSD

      How do humans resolve semantically ambiguous words? As it happens, psycholinguistic studies offer no direct answer. Nevertheless, by probing the organisation of words in the mental lexicon and the access of words, particularly those with multiple meanings, in the human mind, useful hints might be found. In this chapter, we focus our attention on the cognitive aspects of word sense disambiguation. We first review the psychological findings on the mental lexicon, including the storage of words, the representation of meanings, and sense distinction. Mechanisms of lexical access will then be discussed, especially with reference to the ...
    3. Sense Concreteness and Lexical Activation

      Psycholinguistic evidence has thus suggested the differential processing of concrete and abstract concepts by the human mind. This chapter further explores the mental lexicon with respect to the concreteness and abstractness of concepts based on word association data. Since lexical resources including computational semantic lexicons play a critical role in automatic word sense disambiguation, we aim at investigating to what extent such concreteness distinction is modelled in existing lexical resources. It was observed that concrete and abstract noun senses tend to exhibit consistently different lexical activation patterns, and the results suggest that sense concreteness may serve as a possible alternative ...
    4. Lessons Learned from Evaluation

      The performance evaluation of word sense disambiguation systems has only been more or less standardised in the last decade with the first three SENSEVAL and the more recent SEMEVAL exercises. These exercises have pointed to the superiority of supervised methods using multiple knowledge sources and ensembles of classifiers. Behind the apparently plateaued performance of state-of-the-art systems, some fundamental issues including sense granularity, sparseness of sense-tagged data, and contribution to real applications, still remain. But more importantly, evaluation results also suggest that there is something about the target words themselves which is responsible for the differential performance among systems trained on ...
    5. Lexical Sensitivity of WSD: An Outlook

      We have tried to show in the previous chapters that while ensembles of classifiers based on supervised learning methods trained on multiple contextual features have proved superior in current mainstream automatic word sense disambiguation, and while their performance has apparently reached a plateau, there are still considerable unknowns as far as the lexical sensitivity of the task is concerned. We have also suggested that these under-explored parts cannot be adequately addressed from the computational perspective alone, as they probably involve some intrinsic properties of words and senses, like concept concreteness, which may be cognitively based ...
    6. Lexical Functions and Their Applications

      As a concept, the lexical function (LF) was introduced within the framework of the Meaning-Text Theory (MTT) presented in (Mel’čuk, 1974, 1996) in order to describe the lexical restrictions and preferences of words in choosing their “companions” when expressing certain meanings in text. Here we give a brief account of the fundamental concepts and statements of MTT as the context of LFs. Indeed, the formalism of lexical functions has been one of the parts of MTT that has attracted the most attention from specialists in general linguistics and, in particular, computational linguistics. A lot of research began in the area of natural ...
    7. Performance Analysis of Case Based Word Sense Disambiguation with Minimal Features Using Neural Network

      In this paper, the performance of case-based word sense disambiguation attained with two different sets of knowledge features, bigrams and trigrams, is analyzed to identify the better choice for word sense disambiguation. Many knowledge features can be used to resolve an ambiguous word, such as part of speech (PoS), collocations, bag of words, and noun-verb relations. Here, the ambiguity of a word is resolved with contexts of only two or three elements, referred to as bigrams and trigrams. Two different representations of the bigram, pre-bigram and post-bigram, and three different forms of the trigram, pre-trigram, in-trigram and post-trigram, are considered for disambiguation. Relevant knowledge features ...
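The pre-/post-bigram and trigram context representations mentioned above can be sketched as follows; the function name and feature labels are illustrative assumptions, not the paper's code.

```python
# Hypothetical sketch: pre-/post-bigram and trigram context features
# around an ambiguous token at index i. Feature names are assumptions.
def context_features(tokens, i):
    """Return the bigram/trigram features available for the token at index i."""
    feats = {}
    if i >= 1:
        feats["pre_bigram"] = (tokens[i - 1], tokens[i])
    if i + 1 < len(tokens):
        feats["post_bigram"] = (tokens[i], tokens[i + 1])
    if i >= 2:
        feats["pre_trigram"] = (tokens[i - 2], tokens[i - 1], tokens[i])
    if i >= 1 and i + 1 < len(tokens):
        feats["in_trigram"] = (tokens[i - 1], tokens[i], tokens[i + 1])
    if i + 2 < len(tokens):
        feats["post_trigram"] = (tokens[i], tokens[i + 1], tokens[i + 2])
    return feats
```

A classifier would then be trained on whichever of these representations proves most discriminative for the target word.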
    8. Semantic Similarity Functions in Word Sense Disambiguation

      This paper presents a method of improving the results of automatic Word Sense Disambiguation by generalizing nouns appearing in a disambiguated context to concepts. A corpus-based semantic similarity function is used for that purpose, substituting occurrences of particular nouns with a set of the most closely related similar words. We show that this approach may be applied to both supervised and unsupervised WSD methods and in both cases leads to an improvement in disambiguation accuracy. We evaluate the proposed approach by conducting a series of lexical sample WSD experiments on both a domain-restricted dataset and a general, balanced Polish-language text ...
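A minimal sketch of the noun-generalization step described above, assuming a precomputed similarity ranking per noun (the function name and data layout are hypothetical):

```python
# Illustrative sketch (names hypothetical): generalize context nouns by
# replacing each with its k most similar words from a corpus-based
# similarity function, before handing the context to a WSD method.
def generalize_context(context_nouns, similar, k=3):
    """similar: dict mapping a noun to a similarity-ranked list of related words.
    Nouns without an entry are kept as-is."""
    expanded = []
    for noun in context_nouns:
        expanded.extend(similar.get(noun, [noun])[:k])
    return expanded
```

The expanded context is then used in place of the raw nouns, so that rare context words contribute through their better-attested neighbours.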
    9. An Efficient Feature Frequency-Based Approach to Tackle Cross-Lingual Word Sense Disambiguation

      The Cross-Lingual Word Sense Disambiguation (CLWSD) problem is a challenging Natural Language Processing (NLP) task that consists of selecting the correct translation of an ambiguous word in a given context. Different approaches have been proposed to tackle this problem, but they are often complex and need tuning and parameter optimization. In this paper, we propose a new classifier, Selected Binary Feature Combination (SBFC), for the CLWSD problem. The underlying hypothesis of SBFC is that a translation is a good classification label for new instances if the features that occur frequently in the new instance also occur frequently in the training ...
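The stated hypothesis behind SBFC can be illustrated with a toy frequency-count classifier: pick the translation label whose training instances share the most frequent features with the new instance. This is a sketch of the idea, not the published algorithm.

```python
from collections import Counter, defaultdict

# Toy illustration of the SBFC hypothesis (details assumed): a translation
# is a good label for a new instance if the instance's features also occur
# frequently with that label in the training data.
def train_counts(training):
    """training: iterable of (feature_set, label) pairs.
    Returns per-label feature frequency counts."""
    counts = defaultdict(Counter)
    for feats, label in training:
        counts[label].update(feats)
    return counts

def predict(counts, feats):
    """Score each label by the summed training frequency of the overlapping features."""
    return max(counts, key=lambda lbl: sum(counts[lbl][f] for f in feats))
```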
    10. Kannada Word Sense Disambiguation Using Association Rules

      Disambiguating polysemous words is one of the major issues in Machine Translation. A word may have many senses, and selecting the most appropriate sense for an ambiguous word in a sentence is a central problem in Machine Translation, because each sense of a word in a source-language sentence may generate a different target-language sentence. Knowledge-based and corpus-based methods are usually applied to the disambiguation task. In the present paper, we propose an algorithm to disambiguate Kannada polysemous words using association rules. We built Kannada corpora using web resources. The corpora are divided into training and ...
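One simple way to realise association rules for WSD, sketched here under assumed data structures (this is not the paper's algorithm), is to mine "context word implies sense" rules from sense-tagged examples and keep those whose confidence exceeds a threshold:

```python
from collections import Counter

# Toy sketch: mine "context word -> sense" association rules from
# sense-tagged examples, keeping rules with confidence >= min_conf.
def mine_rules(examples, min_conf=0.8):
    """examples: iterable of (context_words, sense) pairs."""
    pair = Counter()   # (context word, sense) co-occurrence counts
    word = Counter()   # context word counts
    for ctx, sense in examples:
        for w in set(ctx):
            pair[(w, sense)] += 1
            word[w] += 1
    # confidence of "w -> sense" = count(w, sense) / count(w)
    return {w: s for (w, s), n in pair.items() if n / word[w] >= min_conf}
```

At disambiguation time, a matching rule in the context selects the sense directly; unmatched words fall back to a default strategy.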
    11. An Automatic Approach for Mapping Product Taxonomies in E-Commerce Systems

      The recent explosion of Web shops has made the user task of finding the desired products an increasingly difficult one. One way to solve this problem is to offer an integrated access to product information on the Web, for which an important component is the mapping of product taxonomies. In this paper, we introduce CMAP, an algorithm that can be used to map one product taxonomy to another product taxonomy. CMAP employs word sense disambiguation techniques and lexical and structural similarity measures in order to find the best matching categories. The performance on precision, recall, and the F1-measure is ...
    12. An Automated Approach to Product Taxonomy Mapping in E-Commerce

      Due to the ever-growing amount of information available on Web shops, it has become increasingly difficult to get an overview of Web-based product information. There are clear indications that better search capabilities, such as the exploitation of annotated data, are needed to keep online shopping transparent for the user. For example, annotations can help present information from multiple sources in a uniform manner. This paper proposes an algorithm that can autonomously map heterogeneous product taxonomies for Web shop data integration purposes. The proposed approach uses word sense disambiguation techniques, approximate lexical matching, and a mechanism that deals with composite categories. Our ...
    13. A Linguistic Approach for Semantic Web Service Discovery

      We propose a Semantic Web Service Discovery framework for finding semantically annotated Web services by using natural language processing techniques. The framework searches through a set of annotated Web services for matches with a user query, which consists of keywords, so that knowledge about semantic languages is not required. For matching keywords with Semantic Web service descriptions given in Web Service Modeling Ontology (WSMO), techniques like part-of-speech tagging, lemmatization, and word sense disambiguation are used. Three different matching algorithms are defined and evaluated for their ability to do exact matching and approximate matching between the user query and Web Service ...
    14. Applying Deep Belief Networks to Word Sense Disambiguation. (arXiv:1207.0396v1 [cs.CL])

      In this paper, we applied a novel learning algorithm, namely Deep Belief Networks (DBN), to word sense disambiguation (WSD). A DBN is a probabilistic generative model composed of multiple layers of hidden units. A DBN uses Restricted Boltzmann Machines (RBMs) for greedy layer-by-layer pretraining; a separate fine-tuning step is then employed to improve the discriminative power. We compared DBN with various state-of-the-art supervised learning algorithms for WSD such as Support Vector Machines (SVM), Maximum Entropy models (MaxEnt), Naive Bayes classifiers (NB) and Kernel Principal Component Analysis (KPCA). We used all words in the given paragraph, surrounding ...
    15. Word sense disambiguation using emergent categories

      Disclosed herein is a computer implemented method and system for word sense disambiguation in a natural language sentence. The natural language sentence is parsed for identifying possible parts of speech for each term and identifying possible phrase structures. Terms comprising one or more linguistic roles are identified. The possible sense combinations for the terms with linguistic roles are identified. Emergent categories are applied to identify possible valid senses for each of the terms with identified linguistic roles. Linguistic role pairs are identified from among the terms identified with linguistic roles. The correspondence functions with the correspondence function types matching the ...
    16. Word Sense Disambiguation as an Integer Linear Programming Problem

      We present an integer linear programming model of word sense disambiguation. Given a sentence, an inventory of possible senses per word, and a sense relatedness measure, the model assigns to the sentence’s word occurrences the senses that maximize the total pairwise sense relatedness. Experimental results show that our model, with two unsupervised sense relatedness measures, compares well against two other prominent unsupervised word sense disambiguation methods. Content Type: Book Chapter. Pages: 33-40. DOI: 10.1007/978-3-642-30448-4_5. Authors: Vicky Panagiotopoulou, Department of Informatics, Athens University of Economics and Business, Greece; Iraklis Varlamis, Department of Informatics and Telematics, Harokopio University, Athens, Greece; Ion Androutsopoulos, Department of ...
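The objective described above, one sense per word occurrence chosen to maximize total pairwise relatedness, can be illustrated by brute-force enumeration; the paper instead encodes it as an integer linear program, which scales better.

```python
from itertools import product

# Brute-force illustration of the objective only (the paper solves it as
# an ILP): choose one sense per word occurrence so that the sum of
# pairwise sense relatedness over all occurrence pairs is maximal.
def best_assignment(sense_inventory, relatedness):
    """sense_inventory: list of candidate-sense lists, one per word occurrence.
    relatedness: callable scoring a pair of senses."""
    best, best_score = None, float("-inf")
    for assignment in product(*sense_inventory):
        score = sum(relatedness(a, b)
                    for i, a in enumerate(assignment)
                    for b in assignment[i + 1:])
        if score > best_score:
            best, best_score = assignment, score
    return best
```

Enumeration is exponential in the number of ambiguous words, which is precisely why an ILP formulation (with binary sense-selection variables) is attractive.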
    17. FASTSUBS: An Efficient Admissible Algorithm for Finding the Most Likely Lexical Substitutes Using a Statistical Language Model. (arXiv:1205.5407v1 [cs.CL])

      Lexical substitutes have found use in the context of word sense disambiguation, unsupervised part-of-speech induction, paraphrasing, machine translation, and text simplification. Using a statistical language model to find the most likely substitutes in a given context is a successful approach, but the cost of a naive algorithm is proportional to the vocabulary size. This paper presents the Fastsubs algorithm which can efficiently and correctly identify the most likely lexical substitutes for a given context based on a statistical language model without going through most of the vocabulary. The efficiency of Fastsubs makes large scale experiments based on lexical substitutes feasible ...
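The naive baseline that FASTSUBS improves on, scoring every vocabulary word in context with a language model, can be sketched as follows; the `lm_score` interface is an assumption, and FASTSUBS itself avoids scanning most of the vocabulary.

```python
# Naive substitute ranking (cost proportional to vocabulary size):
# score each candidate word in the given context with a language model
# and keep the top k. The lm_score interface is assumed for illustration.
def top_substitutes(lm_score, vocab, left, right, k=5):
    """lm_score: callable scoring a full token sequence (higher = more likely).
    left/right: context tokens around the substitution slot."""
    scored = sorted(vocab, key=lambda w: lm_score(left + [w] + right), reverse=True)
    return scored[:k]
```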
    18. An RDF-Based Model for Linguistic Annotation

      This paper proposes the application of the RDF framework to the representation of linguistic annotations. We argue that RDF is a suitable data model to capture multiple annotations on the same text segment, and to integrate multiple layers of annotations. As well as using RDF for this purpose, the main contribution of the paper is an OWL ontology, called TELIX (Text Encoding and Linguistic Information eXchange), which models annotation content. This ontology builds on the SKOS XL vocabulary, a W3C standard for representation of lexical entities as RDF graphs. We extend SKOS XL in order to capture lexical relations between ...
    19. LODifier: Generating Linked Data from Unstructured Text

      The automated extraction of information from text and its transformation into a formal description is an important goal in both Semantic Web research and computational linguistics. The extracted information can be used for a variety of tasks such as ontology generation, question answering and information retrieval. LODifier is an approach that combines deep semantic analysis with named entity recognition, word sense disambiguation and controlled Semantic Web vocabularies in order to extract named entities and relations between them from text and to convert them into an RDF representation which is linked to DBpedia and WordNet. We present the architecture of our ...
    20. SCHEMA - An Algorithm for Automated Product Taxonomy Mapping in E-commerce

      This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. For finding candidate map categories and determining the path-similarity we propose a node matching function that is based on the Levenshtein distance. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is ...
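SCHEMA's node matching is said to build on the Levenshtein distance; here is a minimal implementation of that building block (not SCHEMA's full node-matching function, which also involves path similarity and a dissimilarity penalty):

```python
# Minimal Levenshtein edit distance using a rolling row of the DP table.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

Category names with a small edit distance relative to their length are plausible match candidates; the Damerau variant mentioned in the abstract additionally counts adjacent transpositions as one edit.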
    21. Exploiting domain information for Word Sense Disambiguation of medical documents.

      J Am Med Inform Assoc. 2012 Mar-Apr;19(2):235-40. Authors: Stevenson M, Agirre E, Soroa A. OBJECTIVE: Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. DESIGN: The authors proposed and implemented several methods to extract lists of key terms ...
    22. Improving the Efficiency of Document Clustering and Labeling Using Modified FPF Algorithm

      Document clustering is an effective tool to manage information overload. By grouping similar documents together, we enable a human observer to quickly browse large document collections and make it possible to easily grasp the distinct topics and subtopics. In this paper we survey the most important problems and techniques related to text information retrieval: document pre-processing and filtering, and word sense disambiguation. Further, we present text clustering using a modified FPF algorithm and compare our clustering algorithm against FPF, the most widely used algorithm in the text clustering context. We then introduce the problem of cluster labeling: cluster labeling is achieved ...
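For context, the classic furthest-point-first (FPF) seeding that such clustering algorithms build on repeatedly picks the point furthest from the centers chosen so far. A sketch under an assumed distance function (the paper's modification is not shown):

```python
# Classic furthest-point-first (Gonzalez-style) center selection:
# start from an arbitrary point, then greedily add the point with the
# largest distance to its nearest already-chosen center.
def fpf_centers(points, k, dist):
    centers = [points[0]]
    d = [dist(p, centers[0]) for p in points]  # distance to nearest center
    while len(centers) < k:
        i = max(range(len(points)), key=d.__getitem__)
        centers.append(points[i])
        d = [min(d[j], dist(points[j], points[i])) for j in range(len(points))]
    return centers
```

Each remaining point is then assigned to its nearest center, giving a 2-approximation for the k-center objective.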
    23. Creating a system for lexical substitutions from scratch using crowdsourcing

      This article describes the creation and application of the Turk Bootstrap Word Sense Inventory for 397 frequent nouns, which is a publicly available resource for lexical substitution. This resource was acquired using Amazon Mechanical Turk. In a bootstrapping process with massive collaborative input, substitutions for target words in context are elicited and clustered by sense; then, more contexts are collected. Contexts that cannot be assigned to a current target word’s sense inventory re-enter the bootstrapping loop and get a supply of substitutions. This process yields a sense inventory with its granularity determined by substitutions as opposed to psychologically ...
    24. Identifying Concepts on Specific Domain by a Unsupervised Graph-Based Approach

      This paper presents an unsupervised approach to Word Sense Disambiguation on a specific domain, which automatically assigns the right sense to a given ambiguous word. The proposed approach relies on the integration of two sources of information: context and semantic similarity. The experiments were carried out on the English test data of SemEval 2010 and evaluated with a variety of measures that analyze the connectivity of the graph structure. The obtained results were evaluated using precision and recall and compared with the results of SemEval 2010. The approach is currently under test with other semantic similarity measures, and preliminary results look promising. ...