1-24 of 74
    1. .Ps

      Statistical Methods in Natural Language Processing. Michael Collins, AT&T Labs-Research. Overview. Some NLP problems: information extraction (named entities, relationships between entities, etc.); finding linguistic structure (part-of-speech tagging, "chunking", parsing). Techniques: log-linear (maximum-entropy) taggers; probabilistic context-free grammars (PCFGs); PCFGs with enriched non-terminals; discriminative methods (conditional MRFs, perceptron algorithms, kernel methods).
    2. Slides

      Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with the Perceptron Algorithm. Michael Collins, AT&T Labs-Research. THE BASIC PROBLEM: Hispaniola/NNP quickly/RB became/VB an/DT important/JJ base/?? from which Spain expanded its empire into the rest of the Western Hemisphere. There are many possible tags in the position ??. We need to learn a function from (context, tag) pairs to a real value indicating the "plausibility" of the tag in this context.
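The tagging problem sketched in this entry, learning a function from (context, tag) pairs to a real value, can be illustrated with a linear model over indicator features. Everything below is an assumption for illustration, not code from the paper: the feature templates, weight dictionary, and function names are invented.

```python
# Illustrative sketch: score candidate tags for one position with a linear
# model, score(context, tag) = w . phi(context, tag).
def features(context, tag):
    # Hypothetical indicator features pairing the tag with surrounding words.
    prev_word, word, next_word = context
    return {
        ("word+tag", word, tag): 1.0,
        ("prev+tag", prev_word, tag): 1.0,
        ("next+tag", next_word, tag): 1.0,
    }

def score(weights, context, tag):
    # Dot product between the weight vector and the active features.
    return sum(weights.get(f, 0.0) * v for f, v in features(context, tag).items())

def best_tag(weights, context, tagset):
    # The most "plausible" tag is the one with the highest score.
    return max(tagset, key=lambda t: score(weights, context, t))
```

With weights set by some training procedure (such as the perceptron updates this paper studies), `best_tag` would resolve the `??` position in the example sentence.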
    3. .Pdf

      Statistical Methods in Natural Language Processing. Michael Collins, AT&T Labs-Research. Overview. Some NLP problems: information extraction (named entities, relationships between entities, etc.); finding linguistic structure (part-of-speech tagging, "chunking", parsing). Techniques: log-linear (maximum-entropy) taggers; probabilistic context-free grammars (PCFGs); PCFGs with enriched non-terminals; discriminative methods (conditional MRFs, perceptron algorithms, kernel methods). Some …
    4. ACL/EACL 97

      THREE GENERATIVE, LEXICALISED MODELS FOR STATISTICAL PARSING. MICHAEL COLLINS, UNIVERSITY OF PENNSYLVANIA. BACKGROUND: S = a sentence; T = a parse tree for the sentence. A statistical model estimates P(T|S). The best parse under the model is then T_best = argmax_T P(T|S). Parameters are trained from an annotated corpus (e.g. the Penn Treebank). PROBABILISTIC CONTEXT-FREE GRAMMARS: T_best = argmax_T P(T|S) = argmax_T P(T,S)/P(S) = argmax_T P(T,S). For a tree with n …
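The PCFG scoring idea in these slides, that the best parse maximizes the product of rule probabilities (P(S) is constant over candidate trees for a fixed sentence), can be sketched in a few lines. The tree encoding and the grammar probabilities below are invented for illustration.

```python
import math

# Toy sketch of the slides' equations: P(T, S) is the product of the rule
# probabilities used in T, and the best parse maximizes that product.
# Trees are nested tuples (label, child, child, ...), with strings as words.
def log_prob(tree, rule_probs):
    label, *children = tree
    # The rule at this node: label -> labels (or words) of its children.
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    lp = math.log(rule_probs[(label, rhs)])
    for child in children:
        if not isinstance(child, str):
            lp += log_prob(child, rule_probs)
    return lp

def best_parse(candidates, rule_probs):
    # argmax_T P(T, S) over a candidate list, in log space for stability.
    return max(candidates, key=lambda t: log_prob(t, rule_probs))
```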
    5. EM (Expectation Maximization) Algorithm

      The EM Algorithm. Michael Collins. In fulfillment of the Written Preliminary Exam II requirement, September 1997. Contents: 1 Introduction; 2 Preliminaries; 2.1 Notation; 2.2 Maximum-likelihood Estimation; 2.2.1 An example; 2.3 Sufficient Statistics; …
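The EM iteration surveyed in this report can be illustrated on a standard toy problem: a mixture of two biased coins. The E-step computes a posterior over which coin produced each sample; the M-step re-estimates the biases from the expected counts. This is a generic illustration, not code from the report.

```python
# Toy EM: each sample is the number of heads in n flips of one of two coins,
# chosen unseen; the biases theta_a, theta_b are unknown.
def em_two_coins(samples, n, theta_a, theta_b, iters=100):
    for _ in range(iters):
        heads_a = flips_a = heads_b = flips_b = 0.0
        for h in samples:
            # E-step: likelihood of h heads under each coin (the binomial
            # coefficient cancels in the posterior, so it is omitted).
            like_a = theta_a ** h * (1 - theta_a) ** (n - h)
            like_b = theta_b ** h * (1 - theta_b) ** (n - h)
            post_a = like_a / (like_a + like_b)
            heads_a += post_a * h
            flips_a += post_a * n
            heads_b += (1 - post_a) * h
            flips_b += (1 - post_a) * n
        # M-step: maximum-likelihood re-estimates from the expected counts.
        theta_a, theta_b = heads_a / flips_a, heads_b / flips_b
    return theta_a, theta_b
```

Run on samples that split into a heads-heavy and a tails-heavy group, the two estimates separate toward the two underlying biases.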
    6. Parsing with a Single Neuron: Convolution Kernels for Natural Language Problems

      Parsing with a Single Neuron: Convolution Kernels for Natural Language Problems. Michael Collins† and Nigel Duffy‡. †AT&T Labs-Research, Florham Park, New Jersey. mcollins@research.att.com. ‡Department of Computer Science, University of California at Santa Cruz. nigeduff@cs.ucsc.edu. Abstract: This paper introduces new training criteria and algorithms for NLP problems, based on the Support Vector Machine (SVM) approach to classification problems. SVMs can be applied efficiently to a feature vector …
    7. Three Generative, Lexicalised Models for Statistical Parsing

      Michael Collins*. Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104, U.S.A. mcollins@gradient.cis.upenn.edu. Abstract: In this paper we first propose a new statistical parsing model, which is a generative model of lexicalised context-free grammar. We then extend the model to include a probabilistic treatment of both subcategorisation and wh-movement. Results on Wall Street Journal text …
    8. A New Statistical Parser Based on Bigram Lexical Dependencies

      Michael John Collins*. Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, 19104, U.S.A. mcollins@gradient.cis.upenn.edu. Abstract: This paper describes a new statistical parser which is based on probabilities of dependencies between head-words in the parse tree. Standard bigram probability estimation techniques are extended to calculate probabilities of dependencies between pairs of words. Tests …
    9. Prepositional Phrase Attachment through a Backed-off Model

      Prepositional Phrase Attachment through a Backed-Off Model. Michael Collins and James Brooks. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104. {mcollins, jbrooks}@gradient.cis.upenn.edu. Abstract: Recent work has considered corpus-based or statistical approaches to the problem of prepositional phrase attachment ambiguity. Typically, ambiguous verb phrases of the form v np1 p np2 are resolved through a model which considers values of the four head words …
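The back-off idea in this abstract, resolving v np1 p np2 from counts over the four head words and falling back to sparser tuples when the full tuple is unseen, can be sketched as follows. The exact back-off sequence and the default here are simplified assumptions, not the paper's formulation.

```python
from collections import Counter
from itertools import combinations

# Estimate p(noun attachment | v, n1, p, n2) from co-occurrence counts,
# always keeping the preposition p in the back-off tuples.
def backed_off_prob(attach_counts, total_counts, v, n1, p, n2):
    context = (v, n1, n2)
    for k in (3, 2, 1, 0):  # how many context words to keep alongside p
        num = den = 0
        for subset in combinations(context, k):
            key = subset + (p,)
            num += attach_counts[key]
            den += total_counts[key]
        if den > 0:
            return num / den
    return 1.0  # no evidence at any level: default to noun attachment
```

Counts would come from a parsed corpus; `Counter` returns 0 for unseen keys, which is what makes the back-off cascade simple.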
    10. Semantic Tagging using a Probabilistic Context Free Grammar

      Michael Collins*. Dept. of Computer and Information Science, University of Pennsylvania, 200 South 33rd Street, Philadelphia, PA 19104. mcollins@gradient.cis.upenn.edu. Scott Miller, BBN Technologies, 70 Fawcett Street, Cambridge, MA 02138. szmiller@bbn.com. Abstract: This paper describes a statistical model for extraction of events at the sentence level, or "semantic tagging", typically the first level of processing in Information Extraction …
    11. A Statistical Parser for Czech

      Michael Collins*. AT&T Labs-Research, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932. mcollins@research.att.com. Jan Hajič, Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic. hajic@ufal.mff.cuni.cz. Lance Ramshaw, BBN Technologies, 70 Fawcett St., Cambridge, MA 02138. lramshaw@bbn.com. Christoph Tillmann, Lehrstuhl für Informatik VI, RWTH Aachen, D-52056 Aachen, Germany. tillmann@informatik.rwth-aachen.de. Abstract: …
    12. Answer Extraction.

      Answer Extraction. Steven Abney, Michael Collins, Amit Singhal. AT&T Shannon Laboratory, 180 Park Ave., Florham Park, NJ 07932. {abney, mcollins, singhal}@research.att.com. Abstract: Information retrieval systems have typically concentrated on retrieving a set of documents which are relevant to a user's query. This paper describes a system that attempts to retrieve a much smaller section of text, namely, a direct answer to a user's question. The SMART IR system is used to extract a ranked set of passages …
    13. Unsupervised Models for Named Entity Classification

      Michael Collins and Yoram Singer. AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932. {mcollins, singer}@research.att.com. Abstract: This paper discusses the use of unlabeled examples for the problem of named entity classification. A large number of rules is needed for coverage of the domain, suggesting that a fairly large number of labeled examples should be required to train a classifier. However, we show that the use of unlabeled data …
    14. Logistic Regression, AdaBoost and Bregman Distances

      Proceedings of the Thirteenth Annual Conference on Computational Learning Theory, 2000. Michael Collins, AT&T Labs-Research, Shannon Laboratory, 180 Park Avenue, Room A253, Florham Park, NJ 07932. mcollins@research.att.com. Robert E. Schapire, AT&T Labs-Research, Shannon Laboratory, 180 Park Avenue, Room A203, Florham Park, NJ 07932. schapire@research.att.com. Yoram Singer, School of Computer Science & Engineering, Hebrew University, Jerusalem 91904 …
    15. A Generalization of Principal Component Analysis to the Exponential Family

      A Generalization of Principal Component Analysis to the Exponential Family. Michael Collins, Sanjoy Dasgupta, Robert E. Schapire. AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932. {mcollins, dasgupta, schapire}@research.att.com. Abstract: Principal component analysis (PCA) is a commonly applied technique for dimensionality reduction. PCA implicitly minimizes a squared loss function, which may be inappropriate for data that is not real-valued, such as binary-valued data. This paper draws …
    16. Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods

      Parameter Estimation for Statistical Parsing Models: Theory and Practice of Distribution-Free Methods. Michael Collins, AT&T Labs-Research. mcollins@research.att.com. Abstract: A fundamental problem in statistical parsing is the choice of criteria and algorithms used to estimate the parameters in a model. The predominant approach in computational linguistics has been to use a parametric model with some variant of maximum-likelihood estimation. The assumptions under which maximum-likelihood estimation …
    17. Discriminative Reranking for Natural Language Parsing

      Michael Collins. MCOLLINS@RESEARCH.ATT.COM. AT&T Labs-Research, Rm A-253, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932. Abstract: This paper considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking …
    18. Convolution Kernels for Natural Language

      Michael Collins, AT&T Labs-Research, 180 Park Avenue, Florham Park, NJ 07932. mcollins@research.att.com. Nigel Duffy, Department of Computer Science, University of California at Santa Cruz. nigeduff@cse.ucsc.edu. Abstract: We describe the application of kernel methods to Natural Language Processing (NLP) problems. In many NLP tasks the objects being modeled are strings, trees, graphs or other discrete structures which require some mechanism to convert them into feature …
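For the tree case, the mechanism alluded to in this abstract can be illustrated with the all-subtrees kernel: K(T1, T2) counts the subtrees two parse trees share, computed in polynomial time from a recursive quantity C(n1, n2) over node pairs. The nested-tuple encoding of trees below is an assumption for illustration.

```python
# Trees are (label, child, child, ...) tuples, with plain strings as words.
def production(node):
    # The rule at a node: its label plus the labels (or words) of its children.
    label, *children = node
    return (label, tuple(c if isinstance(c, str) else c[0] for c in children))

def nodes(tree):
    out = [tree]
    for child in tree[1:]:
        if not isinstance(child, str):
            out.extend(nodes(child))
    return out

def common_subtrees(n1, n2):
    # C(n1, n2): 0 if the productions differ; otherwise each non-terminal
    # child pair contributes a factor (1 + C(child1, child2)), since a shared
    # subtree either stops at that child or continues into it.
    if production(n1) != production(n2):
        return 0
    total = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):
            total *= 1 + common_subtrees(c1, c2)
    return total

def tree_kernel(t1, t2):
    # Sum C over all node pairs: the total number of shared subtrees.
    return sum(common_subtrees(a, b) for a in nodes(t1) for b in nodes(t2))
```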
    19. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms.

      Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. Michael Collins, AT&T Labs-Research, Florham Park, New Jersey. mcollins@research.att.com. Abstract: We describe new algorithms for training tagging models, as an alternative to maximum-entropy models or conditional random fields (CRFs). The algorithms rely on Viterbi decoding of training examples, combined with simple additive updates. We describe theory justifying the algorithms through a …
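The training loop described in this abstract, Viterbi-decoding each training sentence under the current weights and making a simple additive update on mistakes, can be sketched as follows. The feature templates (word/tag and tag-bigram indicators) are illustrative assumptions, not the paper's.

```python
from collections import defaultdict

def seq_features(words, tags):
    # Global feature vector of a tagged sentence, as sparse counts.
    feats = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("emit", word, tag)] += 1   # word/tag indicator
        feats[("trans", prev, tag)] += 1  # tag-bigram indicator
        prev = tag
    return feats

def viterbi(words, tagset, w):
    pi = {"<s>": 0.0}  # best score of any partial sequence ending in tag t
    backpointers = []
    for word in words:
        new_pi, bp = {}, {}
        for t in tagset:
            prev = max(pi, key=lambda p: pi[p] + w[("trans", p, t)])
            new_pi[t] = pi[prev] + w[("trans", prev, t)] + w[("emit", word, t)]
            bp[t] = prev
        backpointers.append(bp)
        pi = new_pi
    tags = [max(pi, key=pi.get)]
    for bp in reversed(backpointers[1:]):
        tags.append(bp[tags[-1]])
    return list(reversed(tags))

def train_perceptron(data, tagset, epochs=5):
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tagset, w)
            if pred != gold:  # additive update: promote gold, demote prediction
                for f, v in seq_features(words, gold).items():
                    w[f] += v
                for f, v in seq_features(words, pred).items():
                    w[f] -= v
    return w
```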
    20. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron

      New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. Michael Collins, AT&T Labs-Research, Florham Park, New Jersey. mcollins@research.att.com. Nigel Duffy, iKuni Inc., 3400 Hillview Ave., Building 5, Palo Alto, CA 94304. nigeduff@cs.ucsc.edu. Abstract: This paper introduces new learning algorithms for natural language processing based on the perceptron algorithm. We show how the algorithms can be efficiently applied to exponential sized representations …
    21. Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron.

      Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron. Michael Collins, AT&T Labs-Research, Florham Park, New Jersey. mcollins@research.att.com. Abstract: This paper describes algorithms which rerank the top N hypotheses from a maximum-entropy tagger, the application being the recovery of named-entity boundaries in a corpus of web data. The first approach uses a boosting algorithm for ranking problems. The second approach uses the voted perceptron algorithm. Both algorithms …
    22. Machine Learning Methods in Natural Language Processing

      Machine Learning Methods in Natural Language Processing. Michael Collins, MIT CSAIL. Some NLP problems: information extraction (named entities, relationships between entities); finding linguistic structure (part-of-speech tagging, parsing); machine translation. Common themes: need to learn a mapping from one discrete structure to another: strings to hidden state sequences (named-entity extraction, part-of-speech tagging); strings to strings (machine translation); strings to underlying trees …
    23. Machine Learning Methods in Natural Language Processing

      Machine Learning Methods in Natural Language Processing. Michael Collins, MIT CSAIL. Some NLP problems: information extraction (named entities, relationships between entities); finding linguistic structure (part-of-speech tagging, parsing); machine translation. Common themes: need to learn a mapping from one discrete structure to another: strings to hidden state sequences (named-entity extraction, part-of-speech tagging); strings to strings (machine translation); strings to underlying trees …
    24. Corrective language modeling for large vocabulary ASR with the perceptron algorithm.

      CORRECTIVE LANGUAGE MODELING FOR LARGE VOCABULARY ASR WITH THE PERCEPTRON ALGORITHM. Brian Roark†, Murat Saraclar†, and Michael Collins‡. †AT&T Labs-Research, 180 Park Ave., Florham Park, NJ 07932, USA. ‡MIT Artificial Intelligence Laboratory, Room NE43-723, 200 (545) Technology Square, MIT Building NE43, Cambridge, MA 02139. {roark, murat}@research.att.com, mcollins@ai.mit.edu. ABSTRACT: This paper investigates error-corrective language modeling using the perceptron algorithm on word lattices. The …