Natural Language Processing in Python using NLTK

Sean Boisen, <[myfirstname]@logos.com>

LinuxFest Northwest 2008, April 26


Slides at http://semanticbible.org/other/talks/2008/nltk/nltk.html


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

But First, a Word from our Sponsors ...

Goals

This talk sits at the intersection of an entire technical field, a sophisticated programming language, and a complex toolkit: I can only cover so much.

Intended Audience

Outline

Overview: Why Me?

Overview: Why Python?

Overview: What is Natural Language Processing?

Overview: What is Natural Language Processing? (2)

Why we need NLP

Annoying Questions (Computational) Linguists Hear

Overview: The Natural Language Toolkit

Overview: The Natural Language Toolkit (2)

So What Can You Do With NLTK?

So What Else Can You Do With NLTK?

(We won't have time to cover these today)

NLTK Corpora

Useful Utilities: nltk.probability

Useful Utilities: nltk.evaluate

Useful Utilities: nltk.evaluate (2)

c:\Python24\Lib\site-packages\nltk>evaluate.py
---------------------------------------------------------------------------
Reference = ['DET', 'NN', 'VB', 'DET', 'JJ', 'NN', 'NN', 'IN', 'DET', 'NN']
Test      = ['DET', 'VB', 'VB', 'DET', 'NN', 'NN', 'NN', 'IN', 'DET', 'NN']
Confusion matrix:
    | D         |
    | E I J N V |
    | T N J N B |
----+-----------+
DET | 3 0 0 0 0 |
 IN | 0 1 0 0 0 |
 JJ | 0 0 0 1 0 |
 NN | 0 0 0 3 1 |
 VB | 0 0 0 0 1 |
----+-----------+
(row = reference; col = test)
Accuracy: 0.8
---------------------------------------------------------------------------
Reference = set(['VB', 'DET', 'JJ', 'NN', 'IN'])
Test = set(['VB', 'DET', 'NN', 'IN'])
Precision: 1.0
Recall: 0.8
F-Measure: 0.888888888889
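
The figures in the output above can be checked by hand. The following is a minimal pure-Python sketch, not the toolkit's own code: the function names (`accuracy`, `precision`, `recall`, `f_measure`, `confusion_counts`) and the `alpha` weighting parameter are assumptions chosen for illustration, and the tag lists are copied from the demo output. It assumes non-empty inputs.

```python
from collections import Counter

# Tag sequences copied from the demo output above.
reference_tags = ['DET', 'NN', 'VB', 'DET', 'JJ', 'NN', 'NN', 'IN', 'DET', 'NN']
test_tags = ['DET', 'VB', 'VB', 'DET', 'NN', 'NN', 'NN', 'IN', 'DET', 'NN']


def accuracy(reference, test):
    """Fraction of positions where the test tag equals the reference tag."""
    if len(reference) != len(test):
        raise ValueError("reference and test must have the same length")
    matches = sum(1 for r, t in zip(reference, test) if r == t)
    return matches / len(reference)


def confusion_counts(reference, test):
    """Count (reference tag, test tag) pairs, i.e. the confusion matrix cells."""
    return Counter(zip(reference, test))


def precision(reference_set, test_set):
    """Fraction of test items that also appear in the reference set."""
    return len(reference_set & test_set) / len(test_set)


def recall(reference_set, test_set):
    """Fraction of reference items that also appear in the test set."""
    return len(reference_set & test_set) / len(reference_set)


def f_measure(reference_set, test_set, alpha=0.5):
    """Weighted harmonic mean of precision and recall (alpha is hypothetical)."""
    p = precision(reference_set, test_set)
    r = recall(reference_set, test_set)
    if p == 0 or r == 0:
        return 0.0
    return 1.0 / (alpha / p + (1 - alpha) / r)


reference_set = set(reference_tags)
test_set = set(test_tags)

print("Accuracy:", accuracy(reference_tags, test_tags))
print("Precision:", precision(reference_set, test_set))
print("Recall:", recall(reference_set, test_set))
print("F-Measure:", f_measure(reference_set, test_set))
```

Eight of the ten tags match, which gives the accuracy of 0.8. The test set contains four of the five reference tags and nothing else, which gives a precision of 1.0 and a recall of 0.8. Each cell of the confusion matrix is one entry in `confusion_counts(reference_tags, test_tags)`, keyed by a (reference, test) pair.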

Chunking

Chatbots

Other NLP Applications (not in NLTK)

Resources