NLTK pos_tag(): Get the Part-of-Speech of Words in Sentence – NLTK Tutorial

admin

4 years ago

To get the part-of-speech of each word in a sentence, we can use the nltk pos_tag() function. In this tutorial, we will show you how to use it.

Preliminary

To use pos_tag() in nltk, we should import it first.

from nltk import word_tokenize, pos_tag

Then we can start to extract the part-of-speech of a word.

Tokenizing words in a sentence

First, we should split the sentence into words. Here is a tutorial:

Tokenizing or Splitting Words and Sentences From String Using NLTK – NLTK Tutorial

s = 'TutorialExample.com is a programming tutorial site'

wx = word_tokenize(s)

Get the part-of-speech of a word

We will pass the list of tokens to nltk pos_tag() to tag each word with its part-of-speech.

pos = pos_tag(wx)
print(pos)

Running this code, we will get:

[('TutorialExample.com', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('programming', 'VBG'), ('tutorial', 'JJ'), ('site', 'NN')]

We can see that pos is a python list of tuples. Each tuple holds a word and its part-of-speech tag.
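Because each item is a (word, tag) tuple, the result can be unpacked directly in a loop or a list comprehension. Here is a minimal sketch that reuses the output shown above as literal data, so it runs without nltk. It assumes the tags follow the Penn Treebank convention, in which noun tags start with NN.

```python
# Tagged output copied from the example above.
pos = [('TutorialExample.com', 'NNP'), ('is', 'VBZ'), ('a', 'DT'),
       ('programming', 'VBG'), ('tutorial', 'JJ'), ('site', 'NN')]

# Each item unpacks into the word and its part-of-speech tag.
for word, tag in pos:
    print(word, tag)

# Keep only the words whose tag starts with 'NN' (assumed to mark nouns).
nouns = [word for word, tag in pos if tag.startswith('NN')]
print(nouns)  # ['TutorialExample.com', 'site']
```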

Notice

pos_tag() expects a list of words, not a single string. If we pass one word as a plain string, it will not be tagged as a word. Look at this example code:

pos = pos_tag('TutorialExample.com')
print(pos)

Running this code, it will output:

[('T', 'NNP'), ('u', 'JJ'), ('t', 'NN'), ('o', 'IN'), ('r', 'NN'), ('i', 'VBP'), ('a', 'DT'), ('l', 'NN'), ('E', 'NNP'), ('x', 'VBZ'), ('a', 'DT'), ('m', 'JJ'), ('p', 'NN'), ('l', 'NN'), ('e', 'NN'), ('.', '.'), ('c', 'VB'), ('o', 'JJ'), ('m', 'NN')]

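We can see that pos_tag() treats its argument as a sequence of tokens. A python string is a sequence of characters, so each character is tagged as a separate word. To tag a single word, put it in a list, for example pos_tag(['TutorialExample.com']).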
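One way to guard against this mistake is to wrap a bare string in a list before tagging. The sketch below is only an illustration: ensure_token_list() is a hypothetical helper, not part of nltk. The pos_tag() call is wrapped in try/except so the script still runs when nltk or its tagger data is not installed.

```python
def ensure_token_list(tokens):
    """Hypothetical helper: wrap a bare string in a list so it counts as one token."""
    if isinstance(tokens, str):
        return [tokens]
    return list(tokens)

# A string is itself a sequence of characters, which is why the
# tagger received 'T', 'u', 't', ... in the output above.
print(list('Tutorial'))  # ['T', 'u', 't', 'o', 'r', 'i', 'a', 'l']

# Wrapped in a list, the whole word is a single token.
tokens = ensure_token_list('TutorialExample.com')
print(tokens)  # ['TutorialExample.com']

try:
    from nltk import pos_tag
    # The word is now tagged as one token; the tag depends on the tagger model.
    print(pos_tag(tokens))
except (ImportError, LookupError):
    # nltk is not installed, or its tagger data has not been downloaded.
    pass
```

A list of words passes through the helper unchanged, so it can be applied to either kind of input.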