CSC 9010: Special Topic
Natural Language Processing
Spring, 2005, Thurs, 6:15-9:00

Description: Does your software need to understand English? Natural language processing (NLP) has been a research topic of interest to computer scientists for many decades; it is one of the capabilities clearly required by the test for artificial intelligence proposed by Alan Turing in 1950. It is only in the past 10 years or so, however, that this research has matured enough to have significant practical application. The web has given NLP work a substantial impetus; it both increases by orders of magnitude the text material available electronically and highlights how impossible it is to deal with all of the material manually. NLP techniques now underlie a variety of functions.

Course Objectives: This course gives an overview of NLP, establishing a basic grounding in morphology, syntax, semantics, and pragmatics. Both knowledge-based and statistical approaches are explored. Practical applications, such as text mining, will be considered and used to motivate some of the more theoretical material. Topics to be covered in more depth will be determined in part by the interests of the class. Hands-on exercises using Python and the Natural Language Toolkit (http://nltk.sourceforge.net) reinforce the material; student presentations give participants the opportunity to look in more depth at an area of interest.

Textbook: SPEECH and LANGUAGE PROCESSING: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by Daniel Jurafsky and James H. Martin

Prerequisite: Design and Analysis of Algorithms or permission of instructor.

Questions: email Dr. Papalaskari at map@villanova.edu or Dr. Matuszek at paula.a.matuszek@gsk.com