Name(s):

Date:

 

 

Lab 7 - Designing a Classifier

Evolution and Learning in Computational and Robotic Agents
MSE 2400   Dr. Tom Way

Introduction

Worth

What to Hand In

Due

Lab Steps

Part 1 - Create a Basic Gender Classifier

  1. Open the IDLE Python editor, type in the following program, and save it as "gender1.py":

    def gender_of(name):
        # make name all lowercase
        name = name.lower()

        # an empty name has no letters to check
        if name == '':
            return 'unknown'

        # check last letter of name
        if name[-1] in ['a', 'e', 'i']:
            return 'female'
        elif name[-1] in ['k', 'n', 'o', 'r', 's', 't']:
            return 'male'
        else:
            return 'unknown'

    # Main program
    while True:
        # Have the user type in a name
        name = input('Enter a name (or exit)>')
        name = name.strip()

        # Exit if that's what they want
        if name.lower() == 'exit':
            break

        # Otherwise, print the name and gender
        else:
            print('{} is {}'.format(name, gender_of(name)))
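
    For example, a sample run might look like this (each answer follows directly from the letter rules above):

        Enter a name (or exit)>Maria
        Maria is female
        Enter a name (or exit)>Mark
        Mark is male
        Enter a name (or exit)>Tom
        Tom is unknown
        Enter a name (or exit)>exit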
     
  2. Run the program and try it out a number of times. Look over the code and try to understand what it is doing. (Hint: it looks at a specific letter in each name and uses that letter to guess whether the name is male or female.) Write down some notes on how you think it is trying to determine the gender of a name.

Part 2 - Improving the Basic Gender Classifier

  1. Using the same program, see if you can determine why it does poorly sometimes and how you might make it better. Think about the names of people you know and what features of those names could be used to tell whether it is a male or female name. Do some quick research online about names if you like. Then, jot down a few notes about how you think the program could be more accurate:







     
  2. Write a brief hypothesis about how the accuracy of the program will change if you incorporate the ideas you came up with.







     
  3. Improve the program with one or more of your ideas by adding to or modifying the code. For assistance, refer to one of the online tutorials we looked at in a previous lab (Computer Science Circles) or do a Google search for easy Python tutorials (be sure to look for "Python 3" or "Python 3.4" rather than "Python 2" or "Python 2.7", as there are slight differences between the older and newer versions). One possible direction is sketched below.
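
    One common improvement is to check the last two letters of a name before falling back on a single letter. Here is a minimal sketch of that idea; the ending lists below are illustrative guesses, not researched facts, so substitute whatever your own notes above suggest:

    def gender_of(name):
        name = name.lower()

        # guard against empty input
        if name == '':
            return 'unknown'

        # two-letter endings (illustrative lists -- tune these yourself)
        if name[-2:] in ['yn', 'ah', 'ia', 'sa']:
            return 'female'
        elif name[-2:] in ['ck', 'rd', 'us', 'do']:
            return 'male'

        # otherwise fall back to the original single-letter rule
        if name[-1] in ['a', 'e', 'i']:
            return 'female'
        elif name[-1] in ['k', 'n', 'o', 'r', 's', 't']:
            return 'male'
        else:
            return 'unknown'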
     
  4. How did your improved version do compared to the original? Write down whether or not your hypothesis was correct, including anything you noticed about the accuracy getting better or worse and why you think it happened.













     
  5. Demonstrate your new version to the instructor or TA and have them initial here: ____________

Part 3 - Using Naïve Bayes to Classify Gender

  1. Save the gender2.py program to a folder where you can find it. Then open it in IDLE.
  2. Briefly review the code, and notice that it is more complex than the previous gender classification code. Read through the code, looking at the comments to get a better understanding of what it is doing. A rough sketch of the core approach follows.
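
    The exact code in gender2.py will differ, but the core of this style of classifier, following section 1.1 of Chapter 6 of the NLTK Book, looks roughly like the sketch below (the TRAIN_SIZE and TEST_SIZE names are assumptions chosen to match the names this lab mentions):

    import random
    import nltk
    from nltk.corpus import names   # requires nltk.download('names') once

    TRAIN_SIZE = 500   # how many labeled names to train on
    TEST_SIZE = 500    # how many held-out names to test on

    def gender_features(name):
        # the one feature the classifier learns from: the last letter
        return {'last_letter': name[-1].lower()}

    # build a shuffled list of (name, gender) pairs from the names corpus
    labeled = ([(n, 'male') for n in names.words('male.txt')] +
               [(n, 'female') for n in names.words('female.txt')])
    random.shuffle(labeled)

    featuresets = [(gender_features(n), g) for (n, g) in labeled]
    train_set = featuresets[:TRAIN_SIZE]
    test_set = featuresets[TRAIN_SIZE:TRAIN_SIZE + TEST_SIZE]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print('Classifier Accuracy:', nltk.classify.accuracy(classifier, test_set))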
  3. Run the program and observe its behavior. In particular, make a note here about the initial information: number of male and female names and the Classifier Accuracy:




     
  4. Continue testing the program by trying some of the same names you used in Part 2 (above) and write down how the accuracy of this version compares. Does it do a better job at classifying names by gender?






     
  5. Try modifying TRAIN_SIZE and TEST_SIZE to be larger or smaller, and run the program again. Do this a few times with different sizes, comparing the Classifier Accuracy each time, and create a list below of the TRAIN_SIZE, TEST_SIZE, and Classifier Accuracy for each of your tests.









     
  6. What did you observe about the effects, if any, of changing TRAIN_SIZE and TEST_SIZE on Classifier Accuracy? If you saw an effect, what do you think is the reason?

Part 4 - Improving the Naïve Bayes Gender Classifier

  1. Write notes about how you might be able to improve the Classifier Accuracy of this classifier. Use a combination of common sense, careful thinking and analysis of what is happening, and information you might glean by looking over sections 1.1 and 1.2 of Chapter 6 of the NLTK Book.









     
  2. Modify the program, trying one or more of your ideas. You will probably need to experiment and try a few things to see what works, what helps, and what has no impact on accuracy.

    Focusing on selecting the best features to use for classification is an excellent approach. See if you can add to the gender_features function. There are ideas in section 1.2 of the NLTK Book and in the comment section all the way at the end of the gender2.py program.

    You might also decide to set TRAIN_SIZE and TEST_SIZE to whatever values you found worked best in the previous experiment. One illustrative sketch of added features follows.
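
    As one illustration, a richer gender_features might look like the sketch below, along the lines of section 1.2 of the NLTK Book (the feature names here are made up for illustration; adapt them to whatever gender2.py actually uses):

    def gender_features(name):
        name = name.lower()
        return {
            'suffix1': name[-1:],      # last letter
            'suffix2': name[-2:],      # last two letters
            'first_letter': name[0],
            'length': len(name),
        }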
     
  3. Test out a few of your ideas, and keep a record (a list) below of what you tried and how it changed the accuracy. Try to get the best accuracy you can. Your list should have a brief Description of the modification and the Classifier Accuracy that resulted from the modification.










     
  4. What was the approach you discovered that produced the best Classifier Accuracy, and why do you think it worked? Be prepared to share what you discovered during a class discussion.






     
  5. Demonstrate your new version to the instructor or TA and have them initial here: ____________

Part 5 - Building a Sentiment Analysis Classifier for Tweets

  1. Download and save the tweet_classify.py program.
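
    Since you will be reading and editing this program, it helps to know its general shape. The sketch below is a guess at that shape, not the actual file; the real tweets list, features, and variable names may differ, though TWEET_PCT matches the name this lab uses:

    import nltk

    TWEET_PCT = 80.0   # percent of the labeled tweets used for training

    # tiny invented training set of (tweet text, label) pairs
    tweets = [
        ('I love this so much', 'positive'),
        ('worst traffic ever today', 'negative'),
        ('what a great game last night', 'positive'),
        ('I hate waiting in line', 'negative'),
        ('this is awesome news', 'positive'),
        ('that movie was terrible', 'negative'),
    ]

    def tweet_features(text):
        # mark which words appear in the tweet
        return {word: True for word in text.lower().split()}

    featuresets = [(tweet_features(t), label) for (t, label) in tweets]
    cutoff = int(len(featuresets) * TWEET_PCT / 100.0)
    train_set, test_set = featuresets[:cutoff], featuresets[cutoff:]

    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print('Accuracy:', nltk.classify.accuracy(classifier, test_set))
    classifier.show_most_informative_features(5)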
     
  2. Run the program and observe its behavior, and note its Accuracy here:  ____________
     
  3. Experiment by changing the TWEET_PCT a number of times, anywhere between 0.0 and 100.0, and try to determine the best percentage of tweets to use for training. List the results of your experiments here, with the TWEET_PCT you used and the resulting Accuracy.















     
  4. Using the best TWEET_PCT you found, run the program again and record information about the top 5 Most Informative Features exactly as they are displayed. You will use these for comparison later. For more details, uncomment the call to show_tweet_results.









     

Part 6 - Improving the Tweet Data

  1. Look through the list of training and testing tweets, called tweets. Think about any experience you have with Twitter and writing or reading tweets, and edit the list of tweets to be more reflective of what real tweets are like. Change and add enough good tweets, some positive and some negative, so the list has at least 25 tweets in it.
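
    For example, entries like these read more like real tweets (invented examples; keep whatever exact tuple format the tweets list in tweet_classify.py actually uses, and merge your additions into that list):

    # sample additions -- merge these into the existing tweets list
    more_tweets = [
        ('omg best concert ever #blessed', 'positive'),
        ("can't believe how good this pizza is", 'positive'),
        ('ugh flight delayed AGAIN #fail', 'negative'),
        ('on hold with customer service for an hour, unbelievable', 'negative'),
    ]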

    Demonstrate your enhanced version to the instructor or TA and get initials: ____________
     
  2. Make sure TWEET_PCT is set to the best value you found before, run the program again with this new data, and record the new Accuracy.



     
  3. If your Accuracy was reduced, repeat the experiment above to find the best TWEET_PCT and leave it set at that value, recording the results of these additional experiments here (list TWEET_PCT and Accuracy).








     
  4. Again, record information about the top 5 Most Informative Features exactly as they are displayed. Compare with earlier important features and note any changes you observe.












     

Part 7 - Adding Data Automatically

  1. Now, you're going to add more training data automatically. Find the section in the code that is "commented out". In IDLE, highlight the entire section, starting with "### START HERE" all the way down to, and including, "### END HERE". Once highlighted, go to the menu and choose Format->Uncomment Region. The section of code will now be uncommented and ready to run.
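
    The contents of that commented-out section aren't reproduced in this handout, so the sketch below is only a guess at how extra movie-review data might be pulled in with NLTK; MORE matches the variable name this lab uses, but everything else is an assumption:

    from nltk.corpus import movie_reviews   # requires nltk.download('movie_reviews') once

    MORE = 100   # number of additional reviews to add from each category

    # build (review text, label) pairs from the first MORE reviews per category
    extra = []
    for fileid in movie_reviews.fileids('pos')[:MORE]:
        extra.append((' '.join(movie_reviews.words(fileid)), 'positive'))
    for fileid in movie_reviews.fileids('neg')[:MORE]:
        extra.append((' '.join(movie_reviews.words(fileid)), 'negative'))

    print(len(extra), 'additional labeled examples ready to append to the training data')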
     
  2. Run the program again and observe the additional output. Note any change in the Accuracy reported here:





     
  3. Because the additional training (and testing) data come from movie reviews, their content will be slightly different from that of a tweet, so don't be surprised if the results are worse rather than significantly better. Now, experiment with at least 5 different values of the MORE variable (try values like 10, 50, 100, 200, 500, etc.) to see whether raising or lowering that number helps, and try to find a "best" value for MORE, which is the number of additional items of training data that are added. Record your results here, listing the value for MORE and the corresponding Accuracy.















     
  4. What was the best value you found for MORE and what were the 5 Most Informative Features this time?



     
  5. Overall, how do you think the classifier does at reporting sentiment? How accurate is it, and do you feel that is good enough? Why or why not?












     
  6. Do you think this tool could be useful if you needed to evaluate a continuous stream of tweets in real time? Why or why not? Where do you think it does well and where does it fall short?


















     
  7. Finally, demonstrate your final version to the instructor or TA and get initials: ____________