3 Comments
BabbleOn

I couldn't figure out how to handle abbreviations (Dr., St., etc.) without using an NLP library like spaCy or NLTK. Is there a simpler approach?

Ardit Sulce

How about creating a list of common abbreviations, then checking whether each word that ends with "." is in that list? If it isn't, treat it as the last word of a sentence.

abbreviations = {"dr.", "mr.", "mrs.", "ms.", "st.", "etc.", "e.g.", "i.e.", "vs."}
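A minimal sketch of how that check could be turned into a sentence splitter. The function name `split_sentences` is hypothetical, and the sketch assumes words are separated by whitespace and that only "." ends a sentence:

```python
# Same set as above, repeated so the sketch runs on its own.
abbreviations = {"dr.", "mr.", "mrs.", "ms.", "st.", "etc.", "e.g.", "i.e.", "vs."}


def split_sentences(text):
    """Split text into sentences, skipping periods that belong to known abbreviations.

    Hypothetical helper: assumes whitespace-separated words and "." as the
    only sentence terminator.
    """
    sentences = []
    current = []
    for word in text.split():
        current.append(word)
        # A word ending in "." closes the sentence unless it is a known abbreviation.
        if word.endswith(".") and word.lower() not in abbreviations:
            sentences.append(" ".join(current))
            current = []
    # Keep any trailing words that never reached a closing period.
    if current:
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Smith lives on Main St. in Boston. He works with Mr. Jones."))
```

One limitation to be aware of: an abbreviation that really does end a sentence (for example a sentence finishing with "etc.") will not trigger a split, so the next sentence gets merged into it.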

BabbleOn

Good point. Looking into it, there seem to be fewer than 100 common English abbreviations and contractions. Not as bad as I initially thought.
