
A (very) brief introduction to Natural Language Processing

NLP
Computational reading, deciphering and "understanding" of human languages through machine learning

Natural Language Processing (NLP) covers a broad range of techniques that apply computational analytical methods to textual content, which provide means of categorizing and quantifying text.

(Saldaña 2018)

Tokenization:
splitting text into meaningful elements
(Arnold and Tilton 2015)
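
As a rough illustration of tokenization, the sketch below splits a sentence into lowercase word tokens with a regular expression. The `tokenize` function and its pattern are assumptions made for this example only. Real NLP packages use more careful, language-specific rules.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (a simple regex-based sketch)."""
    # One or more letters/digits, optionally followed by an apostrophe part ("doesn't").
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

tokens = tokenize("The quick brown fox doesn't jump.")
print(tokens)
```

Punctuation is dropped and contractions stay together. Whether that is the right choice depends on the research question.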

Lemmatization:
"using the 'dictionary form' of words, versus whatever inflected form is actually present in the text"
Quinn Dombrowski
https://github.com/multilingual-dh/nlp-resources
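
A toy sketch of the idea: each token is looked up in a small table that maps inflected forms to their dictionary form. The `LEMMAS` table and the `lemmatize` function are hypothetical and hand-made for this example. Actual lemmatizers rely on full dictionaries or trained models for each language.

```python
# Hypothetical, hand-made lookup table: inflected form -> dictionary form.
LEMMAS = {
    "was": "be",
    "were": "be",
    "is": "be",
    "mice": "mouse",
    "ran": "run",
    "running": "run",
}

def lemmatize(tokens):
    """Replace each token with its dictionary form if known, else keep it."""
    return [LEMMAS.get(token, token) for token in tokens]

print(lemmatize(["the", "mice", "were", "running"]))
```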

Named entity recognition (NER):
identifying the parts of a text or corpus that can be assigned to predefined categories (e.g. people, places, organizations)
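
To make the idea concrete, here is a minimal sketch that looks up known names in a small list (a "gazetteer") and reports their category. The `GAZETTEER` entries and the `find_entities` function are invented for this example. Tools such as the Stanford Named Entity Recognizer use trained statistical models instead of a fixed list, so they can also recognize names they have not seen before.

```python
import re

# Hypothetical gazetteer: known names and their categories.
GAZETTEER = {
    "Ada Lovelace": "PERSON",
    "London": "LOCATION",
    "Royal Society": "ORGANIZATION",
}

def find_entities(text):
    """Return (name, category) pairs for gazetteer names, in order of appearance."""
    found = []
    for name, label in GAZETTEER.items():
        # \b keeps "London" from matching inside a longer word.
        for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((match.start(), name, label))
    return [(name, label) for _, name, label in sorted(found)]

print(find_entities("Ada Lovelace lived in London."))
```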

Sentiment analysis:
using natural language processing, text analysis, and computational linguistics to identify, extract, quantify, and study affective states and subjective information.
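
One simple approach, sketched below, counts words from a positive list and a negative list and subtracts one count from the other. The word lists and the `sentiment_score` function are assumptions for this example. Lexicon counting of this kind ignores negation, irony and context, which is why many tools use trained models.

```python
import re

# Hypothetical mini-lexicons; real sentiment lexicons contain thousands of entries.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}

def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives

print(sentiment_score("A great film with a boring ending, but I love it"))
```

A score above zero suggests a mostly positive text, below zero a mostly negative one.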

What are NLP packages and applications useful for?

• Keyword extraction
• Named entity recognition
• Clustering documents in a corpus
• Comparing document(s) to a reference corpus
• Topic modelling
• Sentiment / opinion analysis
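
As a small illustration of the first item, keyword extraction can be approximated by counting the most frequent words once very common words are removed. The `STOPWORDS` set and the `top_keywords` function are assumptions for this sketch. Dedicated packages usually weight words against a reference corpus and do not rely on raw counts.

```python
import re
from collections import Counter

# Hypothetical, very short stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "it"}

def top_keywords(text, n=3):
    """Return the n most frequent words that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

print(top_keywords("The cat and the dog. The cat sleeps.", n=1))
```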

NLP packages and applications

Stanford NLP / Named Entity Recognizer

https://nlp.stanford.edu/software/CRF-NER.html


**Task:** try the Stanford Named Entity Recognizer online:
http://corenlp.run/

  • Open the "Once_Upon_a_Time_in_America" plain text file from the "Wikipedia Movie Summaries" dataset in a text editor
  • Copy the text and paste it into the NER text box
  • Click "Submit"
  • Scroll down in your browser to see the identified parts of speech and named entities (e.g. people, places). Does it work?

Wordseer

Wordseer is a Java-based text analysis environment that combines NLP and visualization.
https://wordseer.berkeley.edu/


Other NLP packages

Multilingual NLP