AI, NLP & Robotics Group

The Artificial Intelligence, Natural Language Processing and Robotics Group strives to apply these areas to interesting and difficult problems. We are looking for interesting projects to add to our efforts. This group shares many overlapping interests with the ACT Lab's Entertainment Technology Group.


Project: Sentiment Analysis & Sentiment Tracking

Purpose: Develop an analysis technique and web-based application to measure and track Internet-expressed sentiment on a particular topic or search term.

Researchers: Tom Way, Alexis Price, Carsen Schulz, Manav Thadani

Applications: Finance, political science, social networking, marketing, mass communication, psychology, etc.

Description:

Assuming that the Internet, or a coherent subset of it, can be viewed as a single entity, or a set of entities, with a measurable point of view, this project aims to create software that measures that point of view by performing NLP-based sentiment analysis.

This project has an initial goal of analyzing how public opinion affects the financial performance of top industry competitors. We will use data from Twitter, Facebook, and news sources discussing specific companies and feed that data into a sentiment analysis model. The output will compare the resulting sentiments to various financial performance indicators such as stock price, moving average, price-to-earnings ratio and others.
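One way the comparison between sentiment and a financial indicator might look is sketched below. This is a minimal illustration and not the project's actual method: it assumes one sentiment score and one closing price per day, and the function names and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def moving_average(values, window):
    """Simple moving average: one value for each full window of `window` points."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [mean(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Made-up placeholder data: daily sentiment scores and closing prices.
sentiment = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6]
price = [10.0, 10.4, 10.2, 11.0, 10.8, 11.3]

# Smooth both series with the same window before comparing them.
print(pearson(moving_average(sentiment, 3), moving_average(price, 3)))
```

A correlation near +1 or -1 would only suggest that the two series move together; it would not by itself show that opinion drives performance.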

Secondarily, the processed data from searches for a particular term will be reused to reduce traffic to the APIs of Twitter, Facebook, etc., enabling refinement of algorithms without impact on third-party data sources. Longitudinal data will also be collected so that attitudes and perceptions, as expressed collectively by the Internet (or a subset of it), can be tracked and analyzed.
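The reuse of processed search data could be sketched as a small cache placed in front of whatever code calls the third-party API. Everything here is an assumption made for illustration: the class name, the one-hour default lifetime, and the `fetch` callable that stands in for a real API client.

```python
import time


class TermCache:
    """Caches results per search term so repeated searches skip the remote API."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch            # hypothetical callable: term -> results
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._store = {}               # normalized term -> (timestamp, results)

    def get(self, term):
        key = term.strip().lower()     # treat "Acme" and " acme " as one term
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]            # still fresh: no API traffic
        results = self._fetch(key)     # missing or stale: one API call
        self._store[key] = (now, results)
        return results
```

Because stored results are returned unchanged, an analysis algorithm can be rerun many times on the same data while the remote service is contacted only when an entry is missing or has expired.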

Incubator idea:

This idea was listed on our Idea Incubator page for a while:

How Are We Today? - Design a web-based application that gathers news content from a wide variety of sources, performs frequency analysis on the content and determines what the general mood of the world is on that day. Steps will include creating a web application that retrieves the text from an online news source, ranks the occurrence of all words, displays a "word cloud," and creates a list of criteria words that are used to measure the mood expressed by the retrieved news (happy, sad, etc.). The application can be configured and targeted to other domains beyond news, such as politics (what is the political mood of the web? what is the political bias of a web site?), celebrities (how does the web feel about Tiger Woods?), specific countries (what does Europe think of the U.S.?), etc. A similar project was called NewsMood.
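The word-ranking and criteria-word steps of this idea could be sketched roughly as follows. The criteria words below are hypothetical examples chosen only for illustration, and retrieving the news text is left out.

```python
import re
from collections import Counter

# Hypothetical criteria words per mood; a real list would need careful curation.
MOOD_WORDS = {
    "happy": {"win", "celebrate", "joy", "success"},
    "sad": {"loss", "mourn", "tragedy", "grief"},
}


def word_counts(text):
    """Count every word in the text, ignoring case and punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def mood_scores(text, mood_words=MOOD_WORDS):
    """For each mood, total the occurrences of its criteria words."""
    counts = word_counts(text)
    return {mood: sum(counts[w] for w in words)
            for mood, words in mood_words.items()}


text = "Fans celebrate a big win. A win after a tragedy."
print(word_counts(text).most_common(3))   # ranking that could feed a word cloud
print(mood_scores(text))
```

Targeting another domain, such as politics, would then amount to passing a different `mood_words` dictionary.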

Research projects:

  • Sentiment Tracker - Online tool, created by Tom Carpenter, that extracts sentiment from Twitter posts and graphs the results. Current efforts are reimplementation to improve sentiment analysis results, archiving of results, and comparative graphing of multiple tracks.
  • Amazon Sentiment Analyzer - Tool under development by Rohitha Kodali to compare Amazon product review content with the number of stars assigned to each.
  • Web-based Meta-search application - Develop a web application using Google or another malleable platform to search for a desired term, retrieve a sample of results, collate and process the results, and then display the results in a "word cloud" or other meaningful representation.
  • Database back-end for analysis - Construct a database back-end for the meta-search and analysis application so that daily or momentary results can be archived and later retrieved and compared. In addition, implement graphical and quantitative ways to display the changes in sentiment over time.
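A database back-end of the kind described in the last item could start from something like the sketch below. It assumes SQLite and one score per term per day; the table name, column names, and function names are hypothetical.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the archive; `day` holds ISO dates such as 2021-09-27."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sentiment (
               term  TEXT NOT NULL,
               day   TEXT NOT NULL,
               score REAL NOT NULL,
               PRIMARY KEY (term, day)
           )"""
    )
    return conn


def record(conn, term, day, score):
    """Store one day's score, replacing any earlier value for that term and day."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO sentiment (term, day, score) VALUES (?, ?, ?)",
            (term, day, score),
        )


def history(conn, term):
    """Return (day, score) rows for a term, oldest first."""
    cur = conn.execute(
        "SELECT day, score FROM sentiment WHERE term = ? ORDER BY day", (term,)
    )
    return cur.fetchall()


def daily_change(conn, term):
    """Return (day, change from the previous archived score) pairs."""
    rows = history(conn, term)
    return [(d2, s2 - s1) for (_, s1), (d2, s2) in zip(rows, rows[1:])]
```

The output of `history` could feed a graph of sentiment over time, and `daily_change` gives one simple quantitative view of how it shifts.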

Related Papers

  • Daniel Loureiro, Goreti Marreiros, and Jose Neves. "Sentiment Analysis of News Titles; The Role of Entities and a New Affective Lexicon" - Previous sentiment analysis approaches rely on pre-set corpora or knowledge bases. This project tries to change that by combining information from Wikipedia with Facebook profile information [Maybe this sort of approach could help avoid inaccurate analysis of events like the “Seven Minutes of Terror”]. The idea is to create an affective lexicon via machine learning, not entering all the words by hand. References to tools:
    • Commonly used affective lexicons: General Inquirer and Ortony’s Affective Lexicon (p 2)
    • WordNet – “Lexical database for the English language with over 150,000 nouns, verbs, adjectives and adverbs organized by a number of semantic relations such a synonym sets (known as synsets), hierarchies, and others.” (p 3)
    • ConceptNet – “A common sense knowledge base with over a million facts organized as different concepts, interconnected through a set of relations, such as ‘UsedFor’, ‘IsA’, “Desires”, and more.” (p 3) (Classifies sentences into one of Ekman’s six basic categories of emotions)
    • SenseNet – “used WordNet to detect polarity values of words and sentence-level textual data … count[s] the positive and negative senses from the definitions in WordNet.” (p 3)
    • Stanford Parser – “typically used to examine the grammatical structure of sentences. The Stanford Parser is a probabilistic parser that produces the most likely analysis based on knowledge gained from hand-parsed sentences.” (p 5)
    • Affective Norms for English Words (ANEW) – “corpus provides an extensive word list (1034 words) with numerical ratings (1-9) compiled by several experimental focus groups. It considers three dimensions of affect: valence, arousal and dominance. Ratings for each range semantically from ‘pleasant’ to ‘unpleasant’, ‘excited’ to ‘calm’ and ‘controlling’ to ‘submissive’, respectively. … Low valences are interpreted as negative and high valences as positive” (p 7)
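As a rough illustration of how a rated word list such as ANEW might be used, the sketch below averages the valence of the rated words found in a text. The ratings are placeholders invented for this example and are not actual ANEW values; treating 5.0 as the neutral midpoint of the 1-9 scale is also an assumption.

```python
import re

# Placeholder valence ratings on a 1-9 scale; NOT actual ANEW data.
SAMPLE_VALENCE = {"happy": 8.0, "win": 8.0, "loss": 2.0, "terror": 2.0}


def mean_valence(text, lexicon=SAMPLE_VALENCE):
    """Average valence of the words that have a rating, or None if none do."""
    words = re.findall(r"[a-z]+", text.lower())
    rated = [lexicon[w] for w in words if w in lexicon]
    return sum(rated) / len(rated) if rated else None


def polarity(valence, midpoint=5.0):
    """Interpret low valence as negative and high valence as positive."""
    if valence is None:
        return "unknown"
    if valence > midpoint:
        return "positive"
    if valence < midpoint:
        return "negative"
    return "neutral"
```

A purely word-level score like this ignores context, which is the kind of limitation the entity-based approach in the paper is meant to address.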

Resources


Project: Automatic Image Description

Purpose: Develop an image analysis and description generation system

Researchers: Tom Way

Researcher Alumni: Sandeep Vodapally

Applications: Assistive technology, target acquisition, security & monitoring

Description:

Blind and low-vision computer users are faced with a quandary.  The popularity of the Internet has led to an explosion of fancy graphics, beautiful (and ugly) web site layout and design, and easily available digital photographs numbering in the millions, none of which are accessible to this group of users.

Our current research plans are to again pursue the issues, technologies and solutions for providing efficient and meaningful access to graphical information for blind computer users. Immediate goals are to explore automated image analysis and feature detection, combined with speech generation, to create a technique for automatic generation of image descriptions. One important component of this analysis is determining how to recognize 3-dimensional objects from the real world in inherently 2-dimensional images. This work will benefit blind and low-vision computer users, with extension to military and commercial low-light and nighttime navigation and communication.

Research projects:

  • Using 3-D image data in 2-D recognition tasks - Making use of our department's 3-D image capture equipment to generate data representations for 3-D objects, the goal is to create a database of objects that can be used in 2-D recognition tasks, and to discover algorithms for recognizing objects in 2-D images when no 3-D data is available.
  • Implementation of a prototype image analysis application - A Java program that will load an image, analyze the content of the image looking for visual clues that match objects in its visual database, infer positional and relational information about these detected objects, and generate a description of the objects. An additional feature will be speech synthesis output of the object descriptions.

Tasks:

  • Generate 3-D image data for common objects, and separate 2-D images of common objects
  • Identify and acquire available research software for image analysis
  • Research techniques for image description extraction
  • Design prototype system, implement, experiment
  • Identify target publications, write, submit

Resources


Project: Writing Analyzer

Purpose: Develop a writing level analyzer.

Researchers: Tom Way

Applications: Support for plagiarism detection and writing improvement tools.

Approach:

An effective way to evaluate writing is to measure the apparent difficulty of the content to better guide the writing to match the desired target difficulty level. This measure is also useful for analyzing textual material to locate inconsistencies within the document to determine the likelihood of plagiarism.
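The project does not name a specific difficulty measure. As one possible starting point, the sketch below uses the Flesch-Kincaid grade level formula, with a crude vowel-group syllable counter that is only an approximation. Scoring each paragraph separately is one hypothetical way to look for the inconsistencies mentioned above; all function names are illustrative.

```python
import re


def count_syllables(word):
    """Rough heuristic: count runs of vowels, with at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def grade_level(text):
    """Flesch-Kincaid grade level of a passage (higher means harder)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def paragraph_levels(document):
    """Grade level of each blank-line-separated paragraph, in order."""
    return [grade_level(p) for p in document.split("\n\n") if p.strip()]
```

A paragraph whose level differs sharply from its neighbors would only be a cue for closer inspection, not evidence of plagiarism.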

Resources:

Tasks:

  • Locate example source code or detailed algorithm explanations.
  • Create basic application framework
  • Create analyzer class or classes
  • Integrate and test
  • Refactor to make analyzer classes portable to other applications, such as the SNITCH plagiarism detection tool.

Project: Junkbots

Purpose: Develop robots using recycled materials.

Researchers: None at present

Researcher Alumni: Luis Ahumada

Applications: Promoting AI and robotics for fun and education

Approach:

With the technology boom has come an explosion of junked computer components, including monitors, keyboards, memory chips and mice. This project will begin with the development of "Frankenmice," robots built using surplus analog computer mice. Relying initially on web resources and articles in the popular press, this project will explore techniques for developing these and other robotic devices from recycled materials. The end result of this exploration will be the development of a detailed guidebook, or perhaps even a kit, that provides step-by-step instructions on developing a robot from recycled parts. Other avenues of exploration could lead to applications of AI to these robots with a goal of developing practical robotics for daily life, no small challenge.

Resources:

Tasks:

  • Acquire surplus mice
  • Create a prototype junkbot
  • Experiment with uses
  • Develop a guidebook
  • Conduct research using the robots with the goal of a research paper targeted to an education-related journal or conference

 

updated: 09/27/21

actlab.csc.villanova.edu