Simulation & Tools Group

This group is exploring the use of various simulation techniques for research and education, and the development of independent software tools that solve problems in non-computer-science domains.


Project: Snitch! Spotting & Neutralizing Internet Theft by CHeaters

Purpose: Create an application that scans student technical research papers to detect instances of plagiarism from the Internet.

Researchers: Tom Way

Research Alumni: Sebastian Niezgoda, Joseph Bruno, Purushotham Ch

Description:

Snitch is a Java application that scans the text of a student paper, identifies passages that might be plagiarized, searches the Internet for web pages containing those passages, and finally presents an HTML version of the original paper with embedded links to the matching sources.
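The final reporting step can be illustrated with a minimal sketch. This is not Snitch's actual code: the class name HtmlReport, the render method, and the idea of passing matches as a map from passage text to source URL are all assumptions made for illustration. The sketch links only the first occurrence of each passage and skips passages that overlap an earlier link.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical report generator: wraps matched passages in links to their sources. */
class HtmlReport {

    /** A matched region of the paper and the URL it was found at (illustrative type). */
    private static final class Span {
        final int start;
        final int end;
        final String url;

        Span(int start, int end, String url) {
            this.start = start;
            this.end = end;
            this.url = url;
        }
    }

    /** Escapes the characters that are special in HTML text and attribute values. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Renders the paper as HTML; matches maps a suspect passage to its source URL. */
    static String render(String paper, Map<String, String> matches) {
        List<Span> spans = new ArrayList<>();
        for (Map.Entry<String, String> m : matches.entrySet()) {
            if (m.getKey().isEmpty()) continue;
            int at = paper.indexOf(m.getKey());   // first occurrence only
            if (at >= 0) spans.add(new Span(at, at + m.getKey().length(), m.getValue()));
        }
        spans.sort((a, b) -> Integer.compare(a.start, b.start));

        StringBuilder html = new StringBuilder("<html><body><p>");
        int pos = 0;
        for (Span s : spans) {
            if (s.start < pos) continue;          // skip passages overlapping an earlier link
            html.append(escape(paper.substring(pos, s.start)));
            html.append("<a href=\"").append(escape(s.url)).append("\">")
                .append(escape(paper.substring(s.start, s.end))).append("</a>");
            pos = s.end;
        }
        html.append(escape(paper.substring(pos))).append("</p></body></html>");
        return html.toString();
    }
}
```

Locating spans in the original text before escaping keeps a passage from accidentally matching inside a URL or a previously inserted link, which a naive search-and-replace on the HTML output could do.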

Tools

Resources

References

Current Tasks

  • Develop a Java class that performs Flesch-Kincaid Grade Level analysis of textual input. The test application should let the user select a text file, open it, analyze it, and display appropriate statistics, such as the grade level of each sentence and paragraph and counts of characters, words, sentences, and paragraphs.
  • Create an example program that converts a Microsoft Word document into a text document.
  • Create an example program that converts a PDF document into a text document.
  • Locate as many search APIs as possible and design examples of how to use them in a Java program.
  • Determine whether Snitch can make use of the newer Google web application API. Google is phasing out its SOAP API, so it will no longer be feasible to make Google searches from a standard Java application. In other words, could Snitch be made into a web-based application rather than a stand-alone application?
  • Explore use of the Java 2 BreakIterator class (java.text.BreakIterator) for managing input tokenization.
  • Investigate using MOSS to create a user interface for programming project plagiarism detection.
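
The Flesch-Kincaid and BreakIterator tasks above can be sketched together. The grade level formula is the standard one: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. Everything else here is an assumption for illustration: the class name FleschKincaid, the method names, and especially the syllable counter, which is a rough vowel-group heuristic rather than a dictionary-based count and will miscount some words.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Sketch of a Flesch-Kincaid Grade Level calculator (hypothetical class name). */
class FleschKincaid {

    /** Splits text at the boundaries reported by the iterator, dropping blank pieces. */
    private static List<String> pieces(BreakIterator it, String text) {
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (!piece.isEmpty()) result.add(piece);
        }
        return result;
    }

    static List<String> sentences(String text) {
        return pieces(BreakIterator.getSentenceInstance(Locale.US), text);
    }

    /** Keeps only tokens that start with a letter or digit, so punctuation is dropped. */
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        for (String token : pieces(BreakIterator.getWordInstance(Locale.US), text)) {
            if (Character.isLetterOrDigit(token.codePointAt(0))) result.add(token);
        }
        return result;
    }

    /** Heuristic (an assumption): count vowel groups, ignore one trailing silent 'e'. */
    static int countSyllables(String word) {
        String w = word.toLowerCase(Locale.US);
        int count = 0;
        boolean prevVowel = false;
        for (int i = 0; i < w.length(); i++) {
            boolean vowel = "aeiouy".indexOf(w.charAt(i)) >= 0;
            if (vowel && !prevVowel) count++;
            prevVowel = vowel;
        }
        if (w.endsWith("e") && count > 1) count--;
        return Math.max(count, 1);
    }

    /** Returns the grade level, or NaN when the text has no words or sentences. */
    static double gradeLevel(String text) {
        List<String> words = words(text);
        int sentenceCount = sentences(text).size();
        if (words.isEmpty() || sentenceCount == 0) return Double.NaN;
        int syllables = 0;
        for (String w : words) syllables += countSyllables(w);
        return 0.39 * words.size() / sentenceCount
                + 11.8 * syllables / words.size()
                - 15.59;
    }

    public static void main(String[] args) {
        String text = "The cat sat on the mat. Plagiarism detection requires careful analysis.";
        System.out.println("Characters: " + text.length());
        System.out.println("Words: " + words(text).size());
        System.out.println("Sentences: " + sentences(text).size());
        for (String s : sentences(text)) {
            System.out.println(gradeLevel(s) + "  " + s);   // per-sentence grade level
        }
        System.out.println("Overall grade level: " + gradeLevel(text));
    }
}
```

Paragraph counts and file selection are left out of the sketch; paragraphs could plausibly be obtained by splitting on blank lines before applying the same methods to each one.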

Project Plan

  • Download & install Jigloo to help with UI design and development
  • Refine user interface: better design, better functionality, prettier, make buttons same size
  • Create better HTML report generation, add viewer
  • Refactor code as needed, make more object-oriented, improve windowing approach (document->paragraphs->sentences->words)
  • Find a way to handle Word doc and PDF input files
  • Devise better candidate selection algorithm, write up specification for it, see if the Flesch scale or other analysis techniques are applicable
  • Research approaches to plagiarism detection, both automated and manual
  • Alpha release candidate goal: Summer 2008

Project: Algorithms & Data Structures for Business Analysis

Purpose: Develop the theory and framework for a proprietary business analysis approach

Researchers: Tom Way, Mike Peterson (Univ. of Delaware)

Description:

We have developed a k-layers, massively interconnected data structure and analysis framework for use in Dr. Peterson's organizational culture research. This technology has been implemented in a software tool that provides a flexible and powerful means of manipulating large data sets, enabling a sophisticated, concept-cluster-based, stimulus-response analysis. The analysis algorithm and data structure significantly improve upon earlier analysis methods, making it possible to complete the complex analysis in a matter of hours rather than days or weeks.

Current plans are to fully develop the software prototype tool, and to refine the data structures and algorithms used in the analysis to improve the tool's efficiency.

Tasks:

  • Identify salient technical innovations from the project
  • Prepare write-up of the technical aspects

updated: 10/01/09

actlab.csc.villanova.edu