Coursera Natural Language Processing Specialization, Course 02: Natural Language Processing with Probabilistic Models, Notes for Week 01

Natural Language Processing with Probabilistic Models

Course Certificate

These are study notes for the course Natural Language Processing with Probabilistic Models. If anything here infringes on your rights, please contact me and it will be removed.

Week 01 Autocorrect and Minimum Edit Distance

Learn about autocorrect, minimum edit distance, and dynamic programming, then build your own spellchecker to correct misspelled words!

Learning Objectives


  • Word probabilities
  • Dynamic programming
  • Minimum edit distance
  • Autocorrect

Overview

You use autocorrect every day. When you send a friend a text message, or when you mistype a search query, autocorrect works behind the scenes to fix the sentence for you. This week you will also learn about minimum edit distance, which is the minimum number of edits needed to change one word into another. Along the way you will learn about dynamic programming, an important programming concept that comes up frequently in interviews and can be used to solve many optimization problems.
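
To make the idea of minimum edit distance concrete, here is a minimal dynamic-programming sketch. The function name `min_edit_distance` and the default unit cost for insert, delete, and substitute are assumptions made for illustration; the costs are parameters so that other weightings can be tried.

```python
def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Dynamic-programming edit distance between two strings.

    The cost values are illustrative assumptions, not taken from the notes.
    """
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cheapest way to turn source[:i] into target[:j]
    dist = [[0] * cols for _ in range(rows)]

    # First column: delete every character of the source prefix.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + del_cost
    # First row: insert every character of the target prefix.
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + ins_cost

    # Each cell reuses the three neighbouring sub-results.
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                       # delete
                dist[i][j - 1] + ins_cost,                       # insert
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # keep or substitute
            )
    return dist[-1][-1]


print(min_edit_distance("play", "stay"))
```

With unit costs, turning "play" into "stay" takes two substitutions. Each table cell is computed once from already-solved subproblems, which is what makes this dynamic programming rather than brute-force search.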

Autocorrect

Autocorrect is used everywhere: on your phone, your tablet, and your computer.

To implement autocorrect in this week’s assignment, you have to follow these steps:

  • Identify a misspelled word

  • Find strings that are n edits away (these may not be real words)

  • Filter candidates (keep only the real words from the previous step)

  • Calculate word probabilities (choose the word that is most likely to occur in that context)
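
The last step can be sketched as follows. This is a simplified illustration that scores each candidate by its relative frequency in a corpus, which ignores the surrounding context; the counts, the candidate set, and the helper name `word_probability` are all hypothetical.

```python
from collections import Counter

# Hypothetical word counts; in practice these come from a real corpus.
word_counts = Counter({"the": 6, "then": 2, "than": 1, "they": 1})
total = sum(word_counts.values())


def word_probability(word):
    """Relative frequency of `word`; words never seen get probability 0."""
    return word_counts[word] / total


# Hypothetical candidates that survived the filtering step.
candidates = {"then", "than"}
best = max(candidates, key=word_probability)
print(best, word_probability(best))
```

Here "then" wins because it occurs more often than "than" in the toy counts.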

Building the model

  1. Identify the misspelled word

To identify a misspelled word, check whether it is in the vocabulary. If it is not, it is probably a typo.
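
A minimal sketch of this check, assuming the vocabulary is stored as a set of lowercase words. The vocabulary contents and the helper name `is_probably_misspelled` are hypothetical.

```python
# Hypothetical vocabulary of known lowercase words.
vocab = {"red", "pink", "blue", "yellow", "orange"}


def is_probably_misspelled(word, vocab):
    """Flag a word as a likely typo when it is absent from the vocabulary."""
    return word.lower() not in vocab


print(is_probably_misspelled("bleu", vocab))
print(is_probably_misspelled("Blue", vocab))
```

Using a set keeps each lookup at constant time on average, which matters once the vocabulary grows to many thousands of words.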

  2. Find strings n edit distance away
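
One way to generate every string that is a single edit away is sketched below. It assumes four edit types (delete a letter, swap two adjacent letters, replace a letter, insert a letter) and a lowercase English alphabet; the function name `edits_one_away` is hypothetical.

```python
import string


def edits_one_away(word, alphabet=string.ascii_lowercase):
    """Return all strings reachable from `word` with exactly one edit."""
    # Every way to cut the word into a left and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]

    deletes = {left + right[1:] for left, right in splits if right}
    swaps = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    replaces = {
        left + c + right[1:] for left, right in splits if right for c in alphabet
    }
    inserts = {left + c + right for left, right in splits for c in alphabet}

    # Replacing a letter with itself reproduces the word, so drop it.
    return (deletes | swaps | replaces | inserts) - {word}


print(sorted(edits_one_away("to"))[:5])
```

Most of the results are not real words, which is why a filtering step is needed afterwards. Strings that are n edits away can be obtained by applying the same function again to each result.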

  3. Filter candidates

In this step, take all the strings generated above and keep only the real words, that is, the ones you can find in your vocabulary.
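
If both the generated strings and the vocabulary are held as sets, the filter reduces to a set intersection. The sample data below is hypothetical.

```python
# Hypothetical vocabulary and hypothetical strings from the previous step.
vocab = {"dear", "deer", "near", "tear"}
generated = {"dear", "dxar", "deer", "derr", "ear"}

# Keep only the generated strings that are real words.
candidates = generated & vocab
print(sorted(candidates))
```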

Lab: Building the vocabulary

NLP Course 2 Week 1 Lesson: Building The Model - Lecture Exercise 01

Estimated Time: 10 minutes

Vocabulary Creation

Create a tiny vocabulary from a tiny corpus


It’s time to start small!

Imports and Data

# imports
import re # regular expression library; for tokenization of words
from collections import Counter # collections library; counter: dict subclass for counting hashable objects
import matplotlib.pyplot as plt # for data visualization
# the tiny corpus of text
text = 'red pink pink blue blue yellow ORANGE BLUE BLUE PINK'
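
From this corpus, a vocabulary could be built roughly as follows. This is a sketch and not necessarily how the lab continues: it assumes lowercasing before tokenization and a simple `\w+` regular expression as the tokenizer.

```python
import re
from collections import Counter

text = 'red pink pink blue blue yellow ORANGE BLUE BLUE PINK'

# Lowercase first so that 'BLUE' and 'blue' count as the same word.
words = re.findall(r'\w+', text.lower())

# Counter maps each distinct word to its number of occurrences.
counts = Counter(words)

# The vocabulary is the set of distinct words.
vocab = set(words)

print(counts.most_common())
print(len(vocab))
```

The corpus has ten tokens but only five distinct words once case is ignored, with "blue" the most frequent at four occurrences.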