NLTK (Natural Language Toolkit) is one of the leading Python packages for Natural Language Processing. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with libraries for tokenization, lemmatization, stemming, tagging, parsing, stopword filtering and text classification.
NLTK is free and open source, and is available for Linux, Windows and macOS. This tutorial explains how to install NLTK.
Installing NLTK
Open the command prompt and run the following command.
Installation with Python 2.X –
pip install nltk
Installation with Python 3.X –
pip3 install nltk
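To confirm that pip actually installed the package, one option is to ask Python for the installed version. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later); the helper name installed_version is hypothetical and is not part of NLTK.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if NLTK is installed, otherwise None.
print("nltk version:", installed_version("nltk"))
```

If this prints None, the pip command probably ran against a different Python installation than the one you are using now.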
Installing NLTK Data
NLTK comes with many corpora, trained models, etc. After installing the NLTK package, install the NLTK data with the following commands.
First, open the Python interpreter by running python2 or python3 at the command prompt.
Then run the commands below at the Python interpreter prompt.
>>> import nltk
>>> nltk.download()
The NLTK Downloader window should open. Select an individual package or a collection, then click the Download button to start downloading the selected items.
Once the NLTK data has finished downloading, verify that NLTK works by running the following code.
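On a machine without a graphical desktop, the same data can be fetched by passing a package identifier to nltk.download() instead of using the window. The sketch below downloads a resource only when nltk.data.find() cannot locate it. The helper name ensure_nltk_resource and its find/download parameters are hypothetical, and the "punkt" identifier is an assumption: newer NLTK releases may expect a different identifier such as "punkt_tab".

```python
def ensure_nltk_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only when resource_path is not already available.

    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be defined before NLTK is installed
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)  # raises LookupError when the resource is missing
        return False
    except LookupError:
        download(package_id)
        return True


# Example call (assumes the tokenizer models are published as "punkt"):
# ensure_nltk_resource("tokenizers/punkt", "punkt")
```

The find and download parameters exist only so the helper can be exercised without network access. In normal use they are left at their defaults.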
>>> from nltk.tokenize import sent_tokenize, word_tokenize
>>> data = "All work and no play makes jack a dull boy, all work and no play"
>>> print(word_tokenize(data))
['All', 'work', 'and', 'no', 'play', 'makes', 'jack', 'a', 'dull', 'boy', ',', 'all', 'work', 'and', 'no', 'play']
. . .