Introduction to the exciting world of Chatbots


Hi there! Now that we have started working our way through the NLP field, I thought it would be a good idea to put those techniques to work in an interactive tool such as a chatbot. Building one is a good way to grasp the meaning and the potential of the techniques introduced in the earlier tutorials.

 

However, chatbots can become quite complex to code. So that we don't skip any steps, in this tutorial we will first build a very basic chatbot with the NLTK package to understand the fundamentals. In future tutorials we will keep improving this first draft, up to the point of building a small chatbot app for iOS.

 

But enough talking, let's start coding 😃.

 

First, let's download the required libraries:

 

 

Now that we are set up, as in the previous NLP tutorials, we have to define a scraper function in order to:

  • retrieve raw text from any Wikipedia page
  • process this data into the right format
  • convert the formatted text corpus into a list of sentences and a list of words
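To make these three steps concrete, here is a minimal sketch of such a scraper. It is not the exact code of this tutorial: the names `ParagraphExtractor`, `html_to_corpus` and `scrape_wikipedia` are made up for the illustration, and to stay dependency-free it uses the standard library HTML parser and a simple regex sentence splitter as a stand-in for NLTK's tokenizers. Lowercasing the text and dropping citation markers like [12] are also assumptions about what "the right format" means.

```python
import re
import urllib.request
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the text found inside <p> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False
            self.chunks.append(" ")  # keep paragraphs separated

    def handle_data(self, data):
        if self.in_paragraph:
            self.chunks.append(data)


def html_to_corpus(html):
    """Turn an HTML page into (list of sentences, list of words)."""
    parser = ParagraphExtractor()
    parser.feed(html)
    text = "".join(parser.chunks).lower()      # assumption: lowercase everything
    text = re.sub(r"\[[0-9]*\]", " ", text)    # drop citation markers like [12]
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
    # Naive splitter: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    words = re.findall(r"[a-z0-9']+", text)
    return sentences, words


def scrape_wikipedia(url):
    """Download a page and convert it (needs network access)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "chatbot-tutorial-sketch"}
    )
    with urllib.request.urlopen(request) as response:
        html = response.read().decode("utf-8")
    return html_to_corpus(html)
```

Keeping the download step separate from `html_to_corpus` means the text processing can be tried on any HTML string without hitting the network.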

 

Then, to clean our inputs properly and finalize the pre-processing of our data, we have to create two functions: one to perform lemmatization and one to remove punctuation, which could otherwise introduce noise later on when we compare the user's queries with the corpus.
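As a rough sketch of those two functions, assuming the lemmatizer is any object with a `lemmatize(word)` method: the names `lemmatize_tokens`, `normalize` and `SuffixLemmatizer` are hypothetical, and `SuffixLemmatizer` is only a toy stand-in so that the snippet runs without any downloaded data. It is not a real lemmatizer.

```python
import string

# Translation table that deletes every ASCII punctuation character.
REMOVE_PUNCTUATION = str.maketrans("", "", string.punctuation)


def lemmatize_tokens(tokens, lemmatizer):
    """Lemmatize each token with the given lemmatizer object."""
    return [lemmatizer.lemmatize(token) for token in tokens]


def normalize(text, lemmatizer):
    """Lowercase, strip punctuation, split on whitespace, then lemmatize."""
    cleaned = text.lower().translate(REMOVE_PUNCTUATION)
    return lemmatize_tokens(cleaned.split(), lemmatizer)


class SuffixLemmatizer:
    """Toy stand-in (NOT a real lemmatizer): drops a trailing 's'."""

    def lemmatize(self, word):
        if len(word) > 3 and word.endswith("s"):
            return word[:-1]
        return word


print(normalize("Chatbots answer questions, right?", SuffixLemmatizer()))
# ['chatbot', 'answer', 'question', 'right']
```

With NLTK installed and the WordNet data downloaded, `nltk.stem.WordNetLemmatizer()` exposes the same `lemmatize(word)` method and can be passed in place of the toy class.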

 

 

Now that we have pre-processed our data, it's time to get down to the main part of the chatbot and see how we can make it answer queries from the end user. To do so, we will use two basic principles, document similarity and cosine similarity, through the scikit-learn library, and generate a response based on the similarity between the words entered by the user and the words in the corpus. For word-count vectors, a cosine similarity of 1 means the two texts use the same words in the same proportions, while 0 means they share no words at all.
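To get a feel for cosine similarity before handing the work to a library, here is a small hand-rolled sketch on plain word counts. The function name `cosine_similarity_counts` is made up for the example, and splitting on whitespace is a deliberate simplification.

```python
import math
from collections import Counter


def cosine_similarity_counts(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Dot product over the words the two texts have in common.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity_counts("the cat sat", "the cat sat"))  # close to 1.0
print(cosine_similarity_counts("the cat", "a dog"))            # 0.0
```

The score only depends on the direction of the vectors, not their length, so a short query can still match a long sentence well.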

 

In other words, for each user query, we will ask the computer to scan the query, tokenize it and, based on the occurrences of each of those tokens in the list of sentences obtained from our wiki article, build a ranking in order to return a response in which those "query tokens" are prevalent.
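Here is one way this ranking could look with scikit-learn, as a sketch under a few assumptions: the function name `generate_response` and the fallback message are made up, the sentences are weighted with TF-IDF, and the vectorizer's default tokenizer with English stop words is used rather than a custom lemmatizing tokenizer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

FALLBACK = "I am sorry, I could not understand you."  # hypothetical wording


def generate_response(query, sentences):
    """Return the corpus sentence most similar to the query."""
    if not sentences:
        return FALLBACK
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the sentences plus the query so they share one vocabulary.
    matrix = vectorizer.fit_transform(list(sentences) + [query])
    n = len(sentences)
    # One row of scores: the query (last row) against every sentence.
    scores = cosine_similarity(matrix[n:], matrix[:n]).flatten()
    best = scores.argmax()
    if scores[best] == 0:
        return FALLBACK  # no token in common with any sentence
    return sentences[best]
```

For example, with a corpus containing "a chatbot is a program that simulates conversation.", the query "what is a chatbot" would return that sentence, since "chatbot" is the only query token left after stop-word removal.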

 

Also, so that our chatbot does not sound too robot-like, it's nice to introduce some greeting phrases:
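A possible sketch of this greeting step is shown here. The word lists and the name `greet` are just examples chosen for the illustration, so feel free to swap in your own phrases.

```python
import random

# Example lists (assumptions): extend them as you like.
GREETING_INPUTS = ("hello", "hi", "hey", "greetings")
GREETING_RESPONSES = ("Hi there!", "Hello!", "Hey, nice to see you!")


def greet(sentence):
    """Return a random greeting if the sentence contains one, else None."""
    for word in sentence.lower().split():
        # Ignore punctuation stuck to the word, e.g. "hello!".
        if word.strip("!?.,") in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)
    return None
```

Returning `None` when no greeting is found makes it easy to fall through to the similarity-based answer.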

 

 

Finally, once everything is in place, the only thing left is to specify, through an if/else statement, how we want our chatbot to respond at the start and at the end of the conversation depending on the user's query.
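The conversation loop could be sketched roughly as follows. The names `chat_turn` and `run_chat`, the exit words and the messages are assumptions made for the example, and `answer` stands for whatever function produces your chatbot's replies.

```python
EXIT_WORDS = ("bye", "quit", "exit")  # hypothetical list of exit commands


def chat_turn(user_input, answer):
    """Handle one user message; return (reply, keep_going)."""
    text = user_input.strip().lower()
    if text in EXIT_WORDS:
        return "Bye! Take care.", False
    else:
        return answer(text), True


def run_chat(answer):
    """Greet the user, then answer queries until an exit word is typed."""
    print("Hello! Ask me something about the article, or type 'bye' to leave.")
    keep_going = True
    while keep_going:
        message, keep_going = chat_turn(input("> "), answer)
        print(message)
```

Keeping the if/else in `chat_turn` separate from the `input()` loop makes the decision logic easy to try out on its own, for instance `chat_turn("bye", str.upper)` returns `("Bye! Take care.", False)`.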

 

And that's it, you now have a ready-to-use chatbot!

 

(-> As always, the full code is available here on the website's GitHub page 😉)

 

However, as I said in the introduction, this chatbot is pretty basic, and its accuracy will vary greatly depending on the wiki article you choose and the questions you ask. If you want to pursue this project, I invite you to check out the upcoming tutorials once they are available. In the meantime, take care and happy coding! 😀