The following post uses spaCy to do all of the NLP work. We will cover a number of NLP tasks that are foundational building blocks for more complex syntactic modelling.


Tokenization takes any body of text (more specifically, a spaCy Doc) and breaks it down either to the sentence level or to the word level.

import spacy

# the 'en' shortcut is deprecated; load the small English model instead
nlp = spacy.load('en_core_web_sm')
doc = nlp('I am flying to Frisco')
print([w.text for w in doc])
# ['I', 'am', 'flying', 'to', 'Frisco']
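The word-level split above has a sentence-level counterpart. As a minimal sketch, assuming only that spaCy itself is installed (no trained model download), a blank pipeline plus the rule-based sentencizer component is enough to split a Doc into sentences on punctuation:

```python
import spacy

# a blank English pipeline has no trained components;
# the rule-based 'sentencizer' splits sentences on punctuation
nlp = spacy.blank('en')
nlp.add_pipe('sentencizer')

doc = nlp('I am flying to Frisco. The flight leaves at noon.')
sentences = [sent.text for sent in doc.sents]
print(sentences)
# ['I am flying to Frisco.', 'The flight leaves at noon.']
```

A trained model such as en_core_web_sm does this too, using its dependency parse rather than punctuation rules, which is more robust on messy text.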


Lemmatization reduces a word to its root form, or lemma. The example below gives a better idea of what that looks like.

doc = nlp('this product integrates both libraries for downloading and applying patches')
for token in doc:
    print(token.text, token.lemma_)

this this
product product
integrates integrate
both both
libraries library
for for
downloading download
and and
applying apply
patches patch

The line integrates integrate is a perfect example: the inflected verb integrates is mapped back to its dictionary form, integrate.
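To see why this matters in practice, here is a plain-Python sketch of keyword matching after lemmatization. The lemma table below is hand-written for illustration only (it is a stand-in for spaCy's lemmatizer, not its actual output mechanism); the point is that once every token is reduced to its lemma, matching inflected variants becomes a simple lookup:

```python
# hand-written lemma table for illustration; in practice these pairs
# would come from token.lemma_ as in the spaCy example above
lemmas = {'integrates': 'integrate', 'libraries': 'library',
          'downloading': 'download', 'applying': 'apply',
          'patches': 'patch'}

def normalize(tokens):
    # fall back to the surface form when no lemma is known
    return [lemmas.get(t, t) for t in tokens]

keywords = {'integrate', 'patch'}
tokens = ['integrates', 'both', 'libraries', 'and', 'patches']
hits = [t for t in normalize(tokens) if t in keywords]
print(hits)
# ['integrate', 'patch']
```

Without the normalization step, a search for "integrate" would miss "integrates" entirely.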