Contents
1. What is Part of Speech (POS)?
2. Information Extraction
2.2 POS Closed Classes (English)
2.4 POS Ambiguity in News Headlines
3.3 Derived Tags (Closed Class)
4.1 Why Automatically POS tag?
AKA word classes, morphological classes, syntactic categories
Nouns, verbs, adjectives, etc.
POS tells us quite a bit about a word and its neighbours:
Given this:
Obtain this:
Many steps are involved, but first we need to identify the nouns (Brasilia, capital), adjectives (Brazilian), verbs (founded) and numbers (1960).
Open vs closed classes: how readily do POS categories take on new words? Only a few classes are open:
Nouns
Verbs
Adjectives
Adverbs
Prepositions (in, on, with, for, of, over,…)
Particles
Determiners
Pronouns
Conjunctions
Modal verbs
And some more…
Many word types belong to multiple classes
POS depends on context
Compare:
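The context-dependence above can be sketched with a tiny, made-up lexicon; "back" is a classic example of a word type belonging to several classes (the word choices and tag sets here are illustrative, not from a real tagged corpus):

```python
# A tiny, hypothetical lexicon: many word types belong to multiple POS classes.
LEXICON = {
    "back": {"NN", "VB", "JJ", "RB"},  # "the back door" (JJ), "win them back" (RB), ...
    "wombat": {"NN"},
    "eat": {"VB"},
}

def possible_tags(word):
    """Look up every POS class a word type can belong to."""
    return sorted(LEXICON.get(word.lower(), set()))

print(possible_tags("back"))    # ambiguous: context is needed to pick one
print(possible_tags("wombat"))  # unambiguous
```

Only context (the neighbouring words) can resolve which of the four tags "back" carries in a given sentence.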
A compact representation of POS information
Major English tagsets
At least one tagset for all major languages
NN noun
VB verb
JJ adjective
RB adverb
DT determiner
CD cardinal number
IN preposition
PRP personal pronoun
MD modal
CC coordinating conjunction
RP particle
WP wh-pronoun
TO to
NN (noun singular, wombat)
VB (verb infinitive, eat)
JJ (adjective, nice)
RB (adverb, fast)
PRP (pronoun personal, I)
WP (wh-pronoun, what)
Important for morphological analysis, e.g. lemmatisation
For some applications, we want to focus on certain POS
Very useful features for certain classification tasks
POS tags can offer word sense disambiguation
Can use them to create larger structures (parsing; lecture 14–16)
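One of the applications above — focusing on certain POS — can be sketched as a simple filter over a tagged sentence; the tagged input and the choice of nouns and adjectives as "content words" are illustrative assumptions:

```python
# Sketch: given a POS-tagged sentence, keep only certain classes
# (here nouns and adjectives) as features for a downstream task.
tagged = [("the", "DT"), ("Brazilian", "JJ"), ("capital", "NN"),
          ("was", "VB"), ("founded", "VB"), ("in", "IN"), ("1960", "CD")]

def keep_pos(tagged_tokens, wanted=("NN", "JJ")):
    """Return only the words whose tag starts with one of `wanted`."""
    return [w for w, t in tagged_tokens if t.startswith(wanted)]

print(keep_pos(tagged))  # ['Brazilian', 'capital']
```

Because `str.startswith` accepts a tuple, `"NN"` also matches plural (`NNS`) and proper-noun (`NNP`) variants of the tag.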
Rule-based taggers
Statistical taggers
Typically starts with a list of possible tags for each word
Often includes other lexical information, e.g. verb subcategorisation (its arguments)
Apply rules to narrow down to a single tag
Large systems have 1000s of constraints
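A minimal sketch of the rule-based pipeline described above — lexicon lookup, then constraints that narrow the tag set; the lexicon entries and the single constraint are made up for illustration (real systems have thousands of such rules):

```python
# Rule-based tagger sketch: start from a lexicon of possible tags,
# then apply hand-written constraints to narrow each word's tag set.
LEXICON = {"the": {"DT"}, "can": {"MD", "NN", "VB"}, "rusts": {"NN", "VB"}}

def tag(words):
    tags = [set(LEXICON.get(w, {"NN"})) for w in words]
    for i in range(1, len(words)):
        # Illustrative constraint: a word directly after a determiner
        # cannot be a modal or a base-form verb.
        if tags[i - 1] == {"DT"}:
            narrowed = tags[i] - {"MD", "VB"}
            if narrowed:
                tags[i] = narrowed
    return [sorted(t) for t in tags]

print(tag(["the", "can", "rusts"]))
```

Note that "rusts" stays ambiguous here: one rule is not enough to narrow every word to a single tag, which is why large systems need so many constraints.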
Assign most common tag to each word type
Requires a corpus of tagged words
“Model” is just a look-up table
But actually quite good, ~90% accuracy
Often considered the baseline for more complex approaches
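The baseline above — assign each word type its most common training tag — really is just a look-up table; a minimal sketch, using a tiny made-up corpus in place of a real tagged one:

```python
from collections import Counter, defaultdict

# Baseline "most common tag" tagger: the model is just a look-up table
# built from a (tiny, made-up) tagged corpus.
corpus = [("the", "DT"), ("dog", "NN"), ("runs", "VB"),
          ("the", "DT"), ("run", "NN"), ("was", "VB"),
          ("dogs", "NN"), ("run", "VB"), ("run", "VB")]

counts = defaultdict(Counter)
for word, tag in corpus:
    counts[word][tag] += 1

# Each word type maps to its single most frequent tag.
lookup = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def baseline_tag(words, default="NN"):
    """Tag each word with its most frequent training tag (default for unseen words)."""
    return [(w, lookup.get(w, default)) for w in words]

print(baseline_tag(["the", "dogs", "run"]))
```

Note "run" gets VB here because VB outnumbers NN for it in the training data; this per-type decision, with no context, is what already reaches roughly 90% accuracy on real corpora.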
Use a standard discriminative classifier (e.g. logistic regression, neural network), with features:
But can suffer from error propagation: wrong predictions from previous steps affect the next ones
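A sketch of the feature extraction such a classifier might use, tagging left to right; the exact feature set is a hypothetical choice, but it shows why errors propagate — the `prev_tag` feature comes from the classifier's own earlier prediction:

```python
def features(words, i, prev_tag):
    """Feature dict for a discriminative tagger (illustrative feature set):
    the word itself, its suffix, capitalisation, and the tag already
    predicted for the previous word."""
    w = words[i]
    return {
        "word=" + w.lower(): 1,
        "suffix3=" + w[-3:]: 1,
        "prev_tag=" + prev_tag: 1,   # a wrong earlier prediction feeds in here
        "is_capitalised": int(w[0].isupper()),
    }

feats = features(["Brasilia", "was", "founded"], 2, prev_tag="VBD")
print(sorted(feats))
```

These dicts would be fed to, e.g., a logistic regression over tags; because each step conditions on the previous predicted tag rather than the true one, one mistake can cascade down the sentence.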
A basic sequential (or structured) model
Like sequential classifiers, uses both the previous tag and lexical evidence
Unlike classifiers, considers all possibilities for the previous tag
Unlike classifiers, treats previous-tag evidence and lexical evidence as independent of each other
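The independence assumption above means a tag sequence is scored as a product of transition probabilities P(tag | previous tag) and emission probabilities P(word | tag); a minimal sketch with made-up probability tables (finding the best sequence efficiently is the next lecture's topic):

```python
# HMM scoring sketch: transition and emission probabilities are treated
# as independent and multiplied. All numbers are invented for illustration.
transition = {("<s>", "DT"): 0.6, ("DT", "NN"): 0.7, ("NN", "VB"): 0.4}
emission = {("DT", "the"): 0.5, ("NN", "dog"): 0.01, ("VB", "runs"): 0.02}

def score(words, tags):
    """P(tags, words) = product over i of P(t_i | t_{i-1}) * P(w_i | t_i)."""
    p, prev = 1.0, "<s>"  # "<s>" marks the start of the sentence
    for w, t in zip(words, tags):
        p *= transition.get((prev, t), 0.0) * emission.get((t, w), 0.0)
        prev = t
    return p

print(score(["the", "dog", "runs"], ["DT", "NN", "VB"]))
```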
Next lecture!
Huge problem in morphologically rich languages (e.g. Turkish)
Can use words we’ve seen only once (hapax legomena) as a best guess for words we’ve never seen before
Can use sub-word representations to capture morphology (look for common affixes)
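Both ideas above can be sketched together: collect suffix statistics from hapax legomena, then use them to guess tags for unseen words. The hapax list and the choice of two-character suffixes are illustrative assumptions:

```python
from collections import Counter, defaultdict

# Sketch: hapax legomena (words seen exactly once in training) tell us
# which tags go with which suffixes; use that to tag unseen words.
tagged_hapaxes = [("slowly", "RB"), ("quickly", "RB"), ("greenish", "JJ"),
                  ("walked", "VBD"), ("jumped", "VBD")]

suffix_tags = defaultdict(Counter)
for word, tag in tagged_hapaxes:
    suffix_tags[word[-2:]][tag] += 1   # final two characters as a crude affix

def guess_tag(word, default="NN"):
    """Guess a tag for an unseen word from its final two characters."""
    c = suffix_tags.get(word[-2:])
    return c.most_common(1)[0][0] if c else default

print(guess_tag("happily"))  # unseen, but "-ly" was seen on adverbs
```

For morphologically rich languages, longer or learned sub-word units (e.g. character n-grams) play the same role on a much larger scale.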