from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
text1.concordance("Dick")
Displaying 25 of 84 matches:
                                     Dick by Herman Melville 1851 ] ETYMOLOGY
must be the same that some call Moby Dick ." " Moby Dick ?" shouted Ahab . " D
e that some call Moby Dick ." " Moby Dick ?" shouted Ahab . " Do ye know the w
 Death and devils ! men , it is Moby Dick ye have seen -- Moby Dick -- Moby Di
it is Moby Dick ye have seen -- Moby Dick -- Moby Dick !" " Captain Ahab ," sa
ck ye have seen -- Moby Dick -- Moby Dick !" " Captain Ahab ," said Starbuck ,
 Captain Ahab , I have heard of Moby Dick -- but it was not Moby Dick that too
 of Moby Dick -- but it was not Moby Dick that took off thy leg ?" " Who told
 my hearties all round ; it was Moby Dick that dismasted me ; Moby Dick that b
s Moby Dick that dismasted me ; Moby Dick that brought me to this dead stump I
white whale ; a sharp lance for Moby Dick !" " God bless ye ," he seemed to ha
 white whale ? art not game for Moby Dick ?" " I am game for his crooked jaw ,
l whaleboat ' s bow -- Death to Moby Dick ! God hunt us all , if we do not hun
hunt us all , if we do not hunt Moby Dick to his death !" The long , barbed st
owels to feel fear ! CHAPTER 41 Moby Dick . I , Ishmael , was one of that crew
ividualizing tidings concerning Moby Dick . It was hardly to be doubted , that
on must have been no other than Moby Dick . Yet as of late the Sperm Whale fis
ident ignorantly gave battle to Moby Dick ; such hunters , perhaps , for the m
g and piling their terrors upon Moby Dick ; those things had gone far to shake
ies , which eventually invested Moby Dick with new terrors unborrowed from any
rmen recalled , in reference to Moby Dick , the earlier days of the Sperm Whal
ngs were ready to give chase to Moby Dick ; and a still greater number who , c
 was the unearthly conceit that Moby Dick was ubiquitous ; that he had actuall
their superstitions ; declaring Moby Dick not only ubiquitous , but immortal (
 shaped lower jaw beneath him , Moby Dick had reaped away Ahab ' s leg , as a
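For readers who want to see what a concordance view amounts to, here is a minimal sketch in plain Python. It is not NLTK's implementation: the function name `concordance_lines`, the 36-character context width, and the case-insensitive matching are all assumptions made for illustration, and the token list is a toy stand-in for `text1`.

```python
def concordance_lines(tokens, word, width=36):
    """Return one keyword-in-context line per match of `word` (case-insensitive)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == word.lower():
            # At most `width` tokens are needed to fill `width` characters.
            left = " ".join(tokens[max(0, i - width):i])[-width:]
            right = " ".join(tokens[i + 1:i + 1 + width])[:width]
            # Right-justify the left context so every keyword starts in the same column.
            lines.append("%s %s %s" % (left.rjust(width), tok, right))
    return lines

# Toy token list (hypothetical data, not the real corpus).
tokens = "some call him Moby Dick , and Moby Dick is a whale".split()
lines = concordance_lines(tokens, "dick")
for line in lines:
    print(line)
```

Each printed line keeps the keyword in a fixed column, which is why the real output above truncates words at both edges.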
text2.similar("father")
mother sister brother heart wife own name family face house sex
feelings cousin son engagement mind head being attachment choice
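The idea behind `similar` can be approximated as "words that occur between the same neighbours". The sketch below is a simplified stand-in, not NLTK's algorithm: `similar_words` is a hypothetical helper, it ranks candidates only by the number of distinct (previous, next) contexts they share with the target, and NLTK's own ranking may differ.

```python
from collections import Counter, defaultdict

def similar_words(tokens, target, n=5):
    """Rank words by how many (previous, next) contexts they share with `target`."""
    contexts = defaultdict(set)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        contexts[word.lower()].add((prev.lower(), nxt.lower()))
    target = target.lower()
    target_contexts = contexts[target]
    scores = Counter()
    for word, ctx in contexts.items():
        shared = len(ctx & target_contexts)
        if word != target and shared:
            scores[word] = shared
    return [word for word, _ in scores.most_common(n)]

# Toy sentence (hypothetical data): "father" and "mother" both appear as "my _ said".
tokens = "my father said and my mother said and my dog ran".split()
print(similar_words(tokens, "father"))
```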
text2.common_contexts(["monstrous","very"])
a_pretty is_pretty a_lucky am_glad be_glad
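Each item in the output above has the form `previous_next`: a pair of neighbouring words inside which both query words occur. A minimal sketch of that computation follows, using only the standard library; `shared_contexts` is a hypothetical name and the lowercasing is an assumption, so treat it as an illustration rather than NLTK's code.

```python
from collections import defaultdict

def shared_contexts(tokens, words):
    """Return 'prev_next' strings for contexts in which every word in `words` occurs."""
    contexts = defaultdict(set)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        contexts[word.lower()].add((prev.lower(), nxt.lower()))
    # Keep only the contexts common to all query words.
    common = set.intersection(*[contexts[w.lower()] for w in words])
    return sorted("%s_%s" % pair for pair in common)

# Toy sentence (hypothetical data).
tokens = "she is a very pretty girl and he is a monstrous pretty boy".split()
print(shared_contexts(tokens, ["monstrous", "very"]))
```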
text1.dispersion_plot(["big", "very", "whale", "pretty"])
text4.dispersion_plot(["man","woman"])
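A dispersion plot shows where in the text each word occurs, measured as a token offset from the start. The sketch below computes those offsets and prints a coarse text-only strip instead of a matplotlib figure. The helper names `word_offsets` and `text_strip` are hypothetical, and matching is assumed to be case-sensitive.

```python
def word_offsets(tokens, word):
    """0-based token positions at which `word` occurs (case-sensitive)."""
    return [i for i, tok in enumerate(tokens) if tok == word]

def text_strip(tokens, word, bins=20):
    """One-line rendering: '|' marks a bin that contains at least one occurrence."""
    strip = ["."] * bins
    for i in word_offsets(tokens, word):
        strip[i * bins // len(tokens)] = "|"
    return "".join(strip)

# Toy text (hypothetical data): "man" clusters early, "woman" late.
tokens = ("man " * 3 + "filler " * 14 + "woman " * 3).split()
for w in ("man", "woman"):
    print("%-6s %s" % (w, text_strip(tokens, w)))
```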
number = sorted(set(text3))
len(number)
2789
from __future__ import division  # ensure true (floating-point) division
len(text3)/len(set(text3))
16.050197203298673
100 * text5.count("lol")/len(text5)
1.5640968673628082
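def lexical_diversity(text):
    # average number of times each distinct token is used
    return len(text)/len(set(text))

def percentage(count, total):
    # share of `total` taken up by `count`, as a percentage
    return 100 * count / total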
print lexical_diversity(text2)
print percentage(text2.count("sense"),len(text2))
20.7194497293
0.0218963666158
ex1= ['Monty', 'Python', 'and', 'the', 'Holy', 'Grail','.']
print sorted(ex1)
print len(ex1)
print lexical_diversity(ex1)
print percentage(ex1.count("."),len(ex1))
['.', 'Grail', 'Holy', 'Monty', 'Python', 'and', 'the']
7
1.0
14.2857142857
fdist1 = FreqDist(text2)
fdist1.keys()[:50]
[u'succour', u'four', u'woods', u'hanging', u'woody', u'conjure', u'looking', u'eligible', u'scold', u'unsuitableness', u'meadows', u'stipulate', u'leisurely', u'bringing', u'disturb', u'internally', u'hostess', u'mohrs', u'persisted', u'Does', u'succession', u'tired', u'cordially', u'pulse', u'elegant', u'second', u'sooth', u'shrugging', u'abundantly', u'errors', u'forgetting', u'contributed', u'fingers', u'increasing', u'exclamations', u'hero', u'leaning', u'Truth', u'here', u'china', u'hers', u'natured', u'substance', u'unwillingness', u'pretensions', u'reports', u'NOT', u'NOW', u'divide', u'sweetest']
fdist1['sweetest']
2
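Note that the 50 keys listed above are in no particular order, so they are not the 50 most frequent words. A frequency distribution is essentially a mapping from token to count, and the sketch below uses the standard library's `collections.Counter` on a toy token list to show lookup and ranking. It assumes nothing about NLTK beyond that analogy; with a recent NLTK, `FreqDist` should offer the same `most_common` method, but check your installed version.

```python
from collections import Counter

# Toy token list (hypothetical data standing in for text2).
tokens = "the cat and the dog and the bird".split()

fdist = Counter(tokens)          # token -> number of occurrences
print(fdist["the"])              # count for a single token
print(fdist.most_common(2))      # tokens ranked by decreasing count
```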
fdist1.plot(50, cumulative=True)
len(text2)
141576
fdist1.hapaxes()
[u'succour', u'woody', u'conjure', u'unsuitableness', u'meadows', u'stipulate', u'leisurely', u'hostess', u'mohrs', u'sooth', u'shrugging', u'abundantly', u'errors', u'forgetting', u'exclamations', u'hero', u'Truth', u'substance', u'reports', u'Three', u'summons', u'forbore', u'cherishing', u'impartiality', u'females', u'successful', u'irksome', u'pursue', u'complaining', u'significancy', u'feature', u'embellishments', u'hop', u'abstraction', u'OWN', u'entrusted', u'keeps', u'nonsensical', u'restriction', u'unexhilarating', u'unworthiness', u'concurrence', u'wrought', u'fir', u'unforeseen', u'recognised', u'auditors', u'endeavoring', u'admirable', u'enrich', u'wooded', u'rumour', u'blushes', u'ankle', u'uninfluenced', u'interfering', u'preceded', u'whip', u'toleration', u'literature', u'diction', u'excepting', u'mend', u'Vanity', u'sheet', u'estimate', u'unstudied', u'tempting', u'breed', u'Please', u'clamorous', u'blossoms', u'ranging', u'project', u'uncouth', u'amusing', u'Mansion', u'loitered', u'tempted', u'Extend', u'shuffling', u'theme', u'bliss', u'touched', u'stammered', u'relish', u'esteeming', u'expediency', u'unexpectedly', u'playfulness', u'rings', u'score', u'scorn', u'unobtrusiveness', u'sacrificing', u'refinement', u'demanded', u'argue', u'adapted', u'Willing', u'Comparisons', u'tallest', u'exactness', u'incommode', u'worked', u'foreplanned', u'conditioned', u'anticipating', u'Valley', u'dryness', u'fairly', u'boiled', u'qualifications', u'confusedly', u'redeem', u'echoed', u'misconstruction', u'beset', u'exclaiming', u'rushing', u'ham', u'obedient', u'disagreement', u'birth', u'replace', u'remind', u'misled', u'beneficial', u'honeysuckles', u'fox', u'witty', u'losing', u'memorable', u'bowing', u'Think', u'shaken', u'nought', u'Scotland', u'respectably', u'administer', u'beings', u'despised', u']', u'denoting', u'safeguard', u'humbled', u'mighty', u'juvenile', u'Sit', u'restorative', u'acacia', u'unlover', u'protested', u'solicitation', u'dotted', 
u'circumstanced', u'intend', u'intent', u'rolling', u'beamed', u'whoever', u'indolent', u'vex', u'recurrence', u'fullest', u'speculation', u'occurrence', u'celebrated', u'funeral', u'commonly', u'Born', u'Frosts', u'reclaim', u'ESTEEM', u'prefer', u'obstinacy', u'humouring', u'undergone', u'wicket', u'sisterly', u'resuscitation', u'Those', u'loving', u'refrain', u'odious', u'militated', u'Pope', u'Midsummer', u'believes', u'indolence', u'stole', u'deserves', u'poking', u'cavil', u'dove', u'stress', u'canvassing', u'inhabiting', u'swollen', u'Encouraged', u'unsolicited', u'LESS', u'quarrelled', u'tore', u'derive', u'haughty', u'Folly', u'bashful', u'overpowering', u'establishing', u'bowling', u'prudently', u'retailed', u'giddy', u'astray', u'investigation', u'Offended', u'complicated', u'remainder', u'patronage', u'crossness', u'undesirable', u'jumbled', u'conscientiously', u'habitation', u'partook', u'city', u'2', u'stuffed', u'JOHN', u'representing', u'Seven', u'Exert', u'depressed', u'coats', u'KNEW', u'tasted', u'jewels', u'tastes', u'lurking', u'BOTH', u'Half', u'coincide', u'opposite', u'discerning', u'horridly', u'impoverishing', u'blockhead', u'bright', u'transact', u'uppermost', u'dispute', u'dissimilar', u'condemning', u'Having', u'Unaccountable', u'borrow', u'landlord', u'noisier', u'CATCHING', u'mutton', u'refreshed', u'apiece', u'rapacious', u'thickly', u'cramps', u'plantation', u'stretch', u'west', u'braving', u'practised', u'hilarity', u'possessor', u'endured', u'.)--', u'regularity', u'fame', u'parliament', u'reanimate', u'unemployed', u'sterling', u'defy', u'entanglement', u'devolved', u'veal', u'judiciously', u'scrape', u'vague', u'doubtingly', u'stranger', u'discarded', u'militate', u'divine', u'restoring', u'destroys', u'biased', u";'", u'edtions', u'meal', u'practicable', u'image', u'widower', u'dawned', u'lounging', u'sealed', u'DRAW', u'imbibed', u'survived', u'buying', u'abused', u'pull', u'rage', u'abuses', u'darker', u'Hon', u'accents', 
u'stealing', u'associating', u'ay', u'blinded', u'Gracious', u'accosted', u'mass', u'original', u'curate', u'caused', u'reasoning', u'improperly', u'Biddy', u'Precious', u'Conversation', u'LOOK', u'desertion', u'honourably', u'Drury', u'outdone', u'mound', u'cessation', u'regiment', u'crowned', u'6', u'inquisitiveness', u'spoilt', u'Relate', u'foundations', u'keeping', u'gallop', u'unbiased', u'salts', u'reproachfully', u'respective', u'imminent', u'enlarge', u'relinquished', u'disgraced', u'sympathize', u'imagery', u'passages', u'incessantly', u'installed', u'signs', u'shuddering', u'propose', u'likeness', u'assemblies', u'Newton', u'truths', u'upstairs', u'disclaiming', u'grandmothers', u'exorbitant', u'embellishment', u'candlelight', u'silks', u'denoted', u'impudence', u'risen', u'rises', u'II', u'owners', u'decently', u'afflictions', u'instigation', u'rooted', u'transgressed', u'disordered', u'heads', u'threatening', u'Twill', u'demonstrations', u'reprobate', u'extorted', u'reliance', u'interrupting', u'unequal', u'accommodations', u'defended', u'surpassed', u'degraded', u'occupation', u'wrapt', u'detaining', u'waistcoats', u'blameable', u'remedy', u'closely', u'compass', u'cruelly', u'enemy', u'proclaim', u'potent', u'outwardly', u'seconded', u'hauteur', u'premises', u'Against', u'untouched', u'retarded', u'bely', u'archness', u'publishing', u'proprietor', u'39', u'hardness', u'slyly', u'Cold', u'Concern', u'indelicacy', u'beautifully', u'vanish', u'renounced', u'shorten', u'failure', u'doat', u'infamous', u'oddest', u'comprise', u'Ungracious', u'detract', u'THREE', u'conclusions', u'servilely', u'admission', u'parents', u'depravity', u'reverted', u'emergency', u'emergence', u'projects', u'stylish', u'disorder', u'palm', u'curious', u'eclat', u'novelty', u'religion', u'seclusion', u'discontented', u'unintentional', u'awaited', u'appropriate', u'repaid', u'spending', u'occupy', u'unknowingly', u'considerably', u'undeserving', u'patronised', u'nieces', 
u'unpremeditated', u'genial', u'reconcile', u'defined', u'presided', u'surveying', u'stiffly', u'invalid', u'condolence', u'livings', u'prosecution', u'anticipations', u'rescued', u'indecorous', u'scratch', u'broader', u'amiss', u'carelessly', u'resources', u'panting', u'detested', u'Dennison', u'inelegant', u'incoherently', u'imply', u'henceforth', u'flowing', u'scenery', u'rascally', u'gathered', u'Gibson', u'scornfully', u'desiring', u'cheered', u'encroachments', u'pencil', u'laboured', u'MADAM', u'bodily', u'foolishly', u'retreated', u'streamed', u'purposes', u'await', u'preferring', u'huswifes', u'counter', u'alloy', u'recreating', u'chosen', u'imperfection', u'spoiling', u'unbounded', u'forfeiting', u'billiard', u'conformity', u'undertaking', u'traced', u'scanty', u'slightingly', u'thirteen', u'irritation', u'wander', u'alighted', u'blown', u'alleged', u'Farm', u'malady', u'enforcement', u'stomach', u'HERS', u'tortured', u'torrent', u'ingenious', u'separations', u'gently', u'fourteenth', u'viewed', u'patroness', u'manor', u'courtesy', u'wounded', u'bedroom', u'unconnected', u'trivial', u'Grandeur', u'conciliate', u'quickened', u'riding', u'handle', u'undivided', u'Whether', u'familiar', u'listener', u'Once', u'contemptible', u'taxed', u'guessing', u'allusion', u'incurable', u'packed', u'illaudable', u'destiny', u'insulting', u'quickest', u'barbarously', u'observable', u'whispering', u'meantime', u'powered', u'poured', u'feather', u'>', u'hateful', u'banish', u'LUCY', u'westerly', u'Supported', u've', u':--"', u'immoderately', u'pangs', u'romance', u'feminine', u'covenant', u'ball', u'monopolize', u'expand', u'philippic', u'patterns', u'afflicted', u'clergyman', u'fluctuating', u'goings', u'incessant', u'descendant', u'chairs', u'inconstant', u'recognition', u'disbelief', u'apprehended', u'hoarded', u'undoubtingly', u'drift', u'repaired', u'merited', u'concession', u'diabolical', u'cultivated', u'roads', u'quarrelling', u'Early', u'prepossessing', u'stocks', 
u'justifying', u'encouragements', u'-?"', u'MIND', u'delivery', u'grate', u'chained', u'detestably', u'cordial', u'rightly', u'nurses', u'contribute', u'faintness', u'denote', u'disengagement', u'effected', u'expensiveness', u'efficacy', u'excellencies', u'trials', u'compares', u'behold', u'illusion', u'dismiss', u'surplice', u'abhor', u'unspeakable', u'saves', u'oldest', u'effectually', u'sellers', u'disobedient', u'immoveable', u'FAITH', u'Strange', u'incompatible', u'Sally', u'Brown', u'admirers', u'equals', u'Queen', u'Hush', u'angrily', u'uncivil', u'negative', u'knoll', u'administering', u'striving', u'ere', u'transparency', u'blooming', u'fanciful', u'unwarily', u'indubitable', u'taverns', u'nipped', u'frequency', u'befallen', u'production', u'uncordial', u'break', u'band', u'bank', u'rocks', u'lifted', u'Confess', u'Parrys', u'remorse', u'medicine', u'disagreements', u'mourning', u'disputes', u'forcibly', u'detecting', u'unfulfilled', u'caprice', u'festival', u'footsteps', u'Going', u'thick', u'pardonable', u'seduction', u'compromise', u'dogs', u'Preparation', u'splendidly', u'contrasted', u'interests', u'encumbered', u'maxims', u'completed', u'circles', u'exercised', u'extending', u'accounted', u'respectfully', u'wrung', u'painfully', u'guidance', u'deepest', u'adequate', u'warmest', u'yielded', u'covering', u'exchanged', u'rung', u'observer', u'discussions', u'draws', u'revealment', u'PARTIES', u'confessedly', u'blights', u'packages', u'climate', u'cod', u'negotiation', u'collecting', u'widen', u'seizure', u'dawdled', u'petty', u'attacked', u'inconsolable', u'deterred', u'Westminster', u'east', u'aim', u'Esteem', u'sting', u'slighting', u'1811', u'bedchamber', u'acquitting', u'fuss', u'inheritor', u'swell', u'hang', u'confiding', u'blamed', u'Mind', u'Mine', u'mingle', u'slighter', u'adding', u'belongs', u'tricking', u'retreat', u'invaluable', u'critique', u'Epicurism', u'comments', u'illustration', u'impenetrable', u'newspapers', u'banker', u'horizon', 
u'Priory', u'rendering', u'amidst', u'unchanging', u'editions', u'unusually', u'TWO', u'evidence', u'subsist', u'stake', u'holding', u'test', u'brothers', u'assiduous', u'paces', u'Bishop', u'beds', u'songs', u'contributing', u'mounted', u'disapproves', u'gigs', u'disapproved', u'puppyism', u'feebly', u'blast', u'feeble', u'Just', u'altering', u'agitate', u'niggardly', u'helpless', u'foregoing', u'uniform', u'Imagine', u'During', u'appeal', u'muslin', u'merriment', u'pillow', u'retired', u'captivate', u'Pity', u'club', u'ninny', u'clue', u'commissioned', u'Down', u'hears', u'gales', u'Dullness', u'malevolence', u'economy', u'superintend', u'?)', u'flourish', u'lifting', u'candles', u'crept', u'playful', u'vicinity', u'inflicted', u'Concealing', u'Norfolk', u'hall', u'wont', u'concerto', u'Misses', u'bursts', u'furnishing', u'em', u'directing', u'naming', u'shown', u'perfections', u'oftenest', u'disapprobation', u'temporizing', u'Impatient', u'intruding', u'counteracted', u'promotion', u'occupations', u'omitted', u'comprised', u'Till', u'forwarding', u'lightened', u'artless', u'rugged', u'respectability', u'renewing', u'spraining', u'twould', u'unconquerable', u'pattern', u'dispersed', u'ELINOR', u'whiled', u'honours', u'Pardon', u'protestations', u'suspects', u'3', u'despatching', u'prettiest', u'emigrant', u"'--", u'reasonableness', u'reluctantly', u'comer', u'shamefully', u'madness', u'dispatch', u'Eager', u'thistles', u'muttered', u'olives', u'external', u'countless', u'winks', u'trick', u'bias', u'hens', u'worry', u'northward', u'indefatigable', u'Supposing', u'thunderbolt', u'constitutional', u'charged', u'speed', u'politely', u'execution', u'miracle', u'verbal', u'zealously', u'duration', u'capability', u'passions', u'unlocked', u'garret', u'carefulness', u'tuition', u'fondness', u'Get', u'boldly', u'sedulously', u'persecution', u'Add', u".'--", u'stare', u'forwarded', u'start', u'cats', u'drains', u'pitched', u'copied', u'toned', u'intents', u'remembrances', 
u'humiliations', u'sheath', u'Law', u'THERE', u'endeavors', u'bulk', u'moonlight', u'Dearest', u'alleviation', u'expatiate', u'pique', u'bequeath', u'referring', u'confidential', u'souls', u'abatement', u'-?', u'streets', u'chuckle', u'induce', u'witnesses', u'Thunderbolts', u'apologized', u'witticisms', u'loose', u'answers', u'praises', u'inspired', u'soldier', u'Clarke', u'attendant', u'ash', u'injure', u'mysterious', u'Abundance', u'St', u'producing', u'nine', u'spontaneous', u'history', u'claimed', u'weakening', u'resettled', u'tries', u'imaginations', u'Cassino', u'daggers', u'contrives', u'dispersing', u'dream', u'systems', u'differed', u'friendliest', u'forbear', u'food', u'ye', u'atoning', u'OCCASION', u'Gentleman', ...]
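Hapaxes are simply the tokens that occur exactly once. A minimal standard-library sketch follows; the function name mirrors the method above, but this is an illustration on toy data, not NLTK's code.

```python
from collections import Counter

def hapaxes(tokens):
    """Return the tokens that occur exactly once."""
    counts = Counter(tokens)
    return [word for word, n in counts.items() if n == 1]

# Toy token list (hypothetical data).
tokens = "the cat and the dog and the bird".split()
print(sorted(hapaxes(tokens)))
```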