CSC 5930/9010, Fall 2013
Special Topic: Text Mining
Thurs 6:15 - 9:00, Mendel G87
Dr. Paula Matuszek


This course is an ongoing exploration, and the syllabus may be modified significantly as the semester progresses.

Aug 29: Intro. Presentation.  Lab.  Assignment 1.

Sept 5: Text Features, NLTK. Presentation.  Lab.  Links from slides.  Assignment 2.

Sept 12:   Classifying documents. Presentation.  Lab.  Python Example, Input.  Python Example, Classifying.  PIAZZA info.

Sept 19:   More on classifying documents. Presentation.  Lab.  Assignment 3.  Python Example, Classifying with BOW.

Sept 26:  Clustering. Presentation.  Lab.  KMeans Example.  KMeans Longer Example.  NLTK demo code for KMeans and GAAC.

Project Information 

Oct 3:   Information Extraction, GATE.   Information Extraction Presentation.  GATE Overview Presentation.  Lab.  Assignment 4 

Oct 10:  GATE, ANNIE, JAPE.   ANNIE Presentation.  Lab.  Assignment 5 

Oct 17: Fall break 

Oct 24:  Midterm  

Oct 31:  Machine Learning using GATE.  Presentation.  Link for Lab materials.  Assignment 6 

Paper Review Information (CSC 9010 only) 

Nov 7:   Sentiment Analysis Using GATE.  Presentation.  Link for Lab materials.  Assignment 7 

Nov 14:  Text Summarization, Conclusion. Summarization Presentation.  Lab.  Summarization Lab Links.  Conclusion. 

Project Descriptions.

Nov 21:  Projects

           Matt Kotwicki: Carrot2

           Michelle Duncombe: IBM Content Analytics-LanguageWare.

           Todd Giang: Text mining digital class notes

           Kelly Gremban: Information Visualization of Text Mining results

Dec 5:   Projects.

           Eric Hagman: Doing sentiment analysis on student reviews.

           Joseph Quadrino: Reddit Comment Analysis and Classification

           Akhila Yarlagadda: Oracle Data Miner.

           Dinesh Paladugu: Weka.

           Sri Varsha Devineni: Clustering methods demo.

           Rohitha Kiran Kodali: Demonstrating Weka.

           Harrison Stern: Analysis of presidential speeches.

Dec 12:   Projects.

           Mike Jancola: Clustering biblical texts.

           Sherin Ambrammadom Basheer: KNIME Text Mining plugins.

           Matt Marzin: Text Mining's Application in Cyber Security.

           Kurt Lehmer: Manual vs Automatic Classification - Case Study Using 10-K Reports.

           Philip Williams: Sentiment analysis of tweets related to four companies.

           Sreevidya Pothineni: Rapid Miner.

           Siva Sindhuri Yenamaladoddi: Orange Text Mining

Dec 19:   Final. Time is 6PM-8:30PM, in G87.