Rehabilitation Computing Group

Using computing to assist the communications needs of people with disabilities is an area with vast potential.  This group explores uses of computers to enable improved communication for blind and deaf students, although the solutions developed have applicability beyond this domain.


Project: An Accessible, Tremor-filtering, Pointing Device

Purpose: Develop an application that smooths the jerky motion of a pointing device that is caused by an essential tremor or other tremor-causing conditions.

Researchers: Tom Way

Research Alumni: Tim Mizas (related Wiimote research), Andrew Miller, Anthony Dovelle, John Truitt

Applications: Assistive technology, assistive device

Description:

Tremors resulting from essential tremor, familial tremor, Parkinson's disease, or other conditions affect a significant portion of the population. In public speaking situations, such as when giving a PowerPoint-based presentation, a tremor can greatly reduce a speaker's ability to use a laser pointing device.

Our research involves creating a computer application that can filter positional data from a Nintendo Wii remote control, or Wiimote, to smooth out the jerky motion induced by a tremor.

Research projects:

  • Motion filtering algorithms - Experiment with a variety of approaches to filtering coordinate data, such as positional comparison, averaging, and sampling, including development of adaptive approaches, to process input values with a short-term, high-variance profile and produce smoothed output values.
  • Implementation of a prototype motion smoothing pointing device - Develop a C# program that processes mouse motion captured from a Wiimote being used as a pointing device, and produces smooth pointing behavior regardless of the degree of variance in the motion of the pointing device.
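
The averaging and adaptive approaches named above can be illustrated with a minimal sketch. It is written in Python for brevity rather than the C# of the prototype, and it is not the project's actual filter. The class names, window size, and smoothing constants (`min_alpha`, `max_alpha`, `jump`) are all assumptions chosen for illustration.

```python
from collections import deque
import math


class MovingAverageFilter:
    """Average the most recent `window` pointer positions."""

    def __init__(self, window=8):
        self.points = deque(maxlen=window)  # old samples drop off automatically

    def update(self, x, y):
        self.points.append((x, y))
        n = len(self.points)
        return (sum(p[0] for p in self.points) / n,
                sum(p[1] for p in self.points) / n)


class AdaptiveSmoother:
    """Exponential smoothing whose weight grows with the distance moved.

    Small displacements (assumed to be tremor) are damped heavily, while
    large displacements (assumed to be intentional) pass through faster.
    """

    def __init__(self, min_alpha=0.05, max_alpha=0.6, jump=40.0):
        self.min_alpha = min_alpha  # weight for tiny movements
        self.max_alpha = max_alpha  # weight for movements >= `jump` pixels
        self.jump = jump
        self.state = None           # last smoothed position

    def update(self, x, y):
        if self.state is None:
            self.state = (float(x), float(y))
            return self.state
        sx, sy = self.state
        dist = math.hypot(x - sx, y - sy)
        t = min(dist / self.jump, 1.0)
        alpha = self.min_alpha + t * (self.max_alpha - self.min_alpha)
        self.state = (sx + alpha * (x - sx), sy + alpha * (y - sy))
        return self.state
```

Each raw coordinate read from the pointing device would be passed to `update`, and the returned pair used as the cursor position. The fixed-window average trades responsiveness for smoothness, while the adaptive variant tries to keep deliberate movements quick.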


Project: Affordable Speech Recognition for the Classroom

Purpose: Develop a real-time continuous speech recognition application for use by college students to assist with note-taking.

Researchers: Tom Way

Research Alumni: Richard Kheir, Louis Bevilacqua

Applications: Assistive technology, security & monitoring

Description:

A principal difficulty with speech recognition software, and therefore with its broad acceptance, has been its accuracy rate.  Even with an achievable 98% accuracy rate, automatic speech recognition (ASR) is unacceptable for general applications such as real-time transcription. Industry predictions are that accuracy will be significantly boosted within 5-10 years.

For the deaf and hard-of-hearing, the task of listening is difficult, requires an extreme amount of attention, and can be aided by a sign-language interpreter.  Software ASR is an attractive alternative, allowing more freedom and independence.  Although a 2% inaccuracy rate is dreadful in the business world, it is entirely workable in the context of an automatic speech recognition interpreter.  It is likely that higher degrees of inaccuracy, perhaps 10% or more, would not be unworkable, given the assistive (rather than transcriptive) nature of the task.

We are exploring the use of Java tools and off-the-shelf, low-cost ASR systems to solve this problem.

A prototype version has been developed for testing in a classroom lecture setting. It uses a desktop or laptop computer, a wireless headset microphone, and the Microsoft Speech Recognition software bundled with Office.  Experiments have been conducted, and additional experiments are planned, to compare speaker-trained ASR with and without use of an automatically captured domain dictionary.
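
The page does not say how accuracy is measured in these experiments. One common choice is word error rate (WER), the word-level edit distance between a reference transcript and the recognizer output, divided by the reference length. The sketch below assumes that metric; the function name is hypothetical. If accuracy is taken to be 1 - WER, the 98% figure quoted above corresponds to a WER of 0.02.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

Computing this score for the same lecture, once with and once without the domain dictionary, would give a direct comparison of the two conditions.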

Research projects:

  • Development of a Java-based or MS Speech Recognition-based continuous speech recognition application (see phases of research below).

Phases of research:

  • Phase 1 - desktop & laptop version
    • Investigate Java or Microsoft automatic speech recognition software, underlying algorithms, etc.
    • Project: Create simple test Java code using COTS software as proof-of-concept
    • Create basic prototype implementation "from scratch", emphasize ease-of-use and maintenance
    • Enhance prototype to full working version, test in classroom conditions
    • Write up & submit
    • Create distribution for use by deaf & hearing-impaired users
  • Phase 2 - distributed client-server system
    • Develop a server and applet communication system to enable easy distribution of real-time text notes in lab-based or wireless-enabled classrooms.
    • Develop a client transmitter that sends the transcription output of the speech recognition system to a server (above) for distribution.
    • Conduct user experiments to measure effectiveness, accuracy, etc.
    • Write up & submit
    • Apply for funding, if applicable, and continue development
  • Phase 3 - PDA version
    • Identify and acquire Java support for PDAs
    • Continue to enhance our Java ASR code, try to develop as independent library (open source?)
    • Compact implementation to work under PDA architectural constraints (memory, power, speed, etc.)
    • Implement and experiment (lab experiments or human subjects or both)
    • Write up & submit
    • Create a distribution version
    • Continue to improve & develop as needed
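
The Phase 2 idea of a server that distributes transcription text to connected clients can be sketched as follows. This is an illustration in Python using plain TCP sockets, not the Java server and applet system described above. The class name `NoteServer` and the newline-delimited text protocol are assumptions.

```python
import socket
import threading


class NoteServer:
    """Accept client connections and send each transcript line to all of them."""

    def __init__(self, host="127.0.0.1", port=0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.sock.bind((host, port))          # port 0 lets the OS pick a free port
        self.sock.listen()
        self.address = self.sock.getsockname()
        self.clients = []
        self.lock = threading.Lock()
        threading.Thread(target=self._accept_loop, daemon=True).start()

    def _accept_loop(self):
        while True:
            try:
                conn, _ = self.sock.accept()
            except OSError:                   # listening socket was closed
                return
            with self.lock:
                self.clients.append(conn)

    def broadcast(self, line):
        """Send one line of recognized text to every connected client."""
        data = (line + "\n").encode("utf-8")
        with self.lock:
            for conn in list(self.clients):
                try:
                    conn.sendall(data)
                except OSError:               # drop clients that disconnected
                    self.clients.remove(conn)
                    conn.close()

    def close(self):
        self.sock.close()
        with self.lock:
            for conn in self.clients:
                conn.close()
            self.clients.clear()
```

In this arrangement the client transmitter would call `broadcast` with each phrase produced by the speech recognizer, and each student machine would connect to `server.address` and display the lines as they arrive.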

Example training session (4/25/05):

This training was conducted in my office using the wireless headset microphone.

  • Saved untrained profile
  • Performed initial training using "Introduction to Microsoft Speech Recognition" default training session (10:00)
  • Performed full training using all Microsoft-provided training data (additional 60:00)
    • Aesop's Fables - 7:00
    • Bill Gates describes "The Road Ahead" second edition - 3:00
    • Excerpts from "The Problems of Philosophy" by Bertrand Russell - 14:00
    • Excerpts from "The Fall of the House of Usher" by Edgar Allan Poe - 9:00
    • Excerpts from "SUMMER" by Edith Wharton - 12:00
    • Excerpts from "The War of the Worlds" by H.G. Wells - 8:00
    • The Wizard of Oz by L. Frank Baum - 7:00
  • Performed domain training using text gathered from course-related material (additional 10:00)
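
The domain training step depends on gathering vocabulary from course material. The sketch below shows one simple way such a domain dictionary might be captured automatically: count the words in the course text and keep those that recur and are not already in a list of common words. The function name, the `common_words` set, and the `min_count` threshold are assumptions. The page does not describe the capture method actually used.

```python
import re
from collections import Counter


def capture_domain_terms(course_text, common_words, min_count=2):
    """Return recurring words from course_text that are not in common_words.

    Terms are ordered from most to least frequent.
    """
    # Words of two or more characters; apostrophes and hyphens allowed inside.
    words = re.findall(r"[a-z][a-z'-]+", course_text.lower())
    counts = Counter(w for w in words if w not in common_words)
    return [word for word, count in counts.most_common() if count >= min_count]
```

The resulting list could then be supplied to the recognizer as additional vocabulary before the lecture.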

Project: Firefox Imaging Plug-ins

Purpose: Develop a Firefox browser extension that creates tactile-ready images and textual descriptions of images

Researchers: Tom Way

Description:

Blind and low-vision computer users are faced with a quandary.  The popularity of the Internet has led to an explosion of fancy graphics, beautiful (and ugly) web site layout and design, and easily available digital photographs numbering in the millions, none of which are accessible to this group of users.  The lab director's M.S. thesis, "Automatic Generation of Tactile Images," was an early investigation of how images could be made accessible to blind and low-vision computer users through creation of tactile graphics or "tactics."  By applying sophisticated image processing techniques and using tactile imaging technology, significant strides were made in developing a framework for creating meaningful "tactic" representations.  This work, started in 1995, has been carried on by Dr. Ken Barner and many of his students at the University of Delaware.

This work involves using the Firefox plug-in extension API to create a browser extension that will perform appropriate processing of each image on a web page. The result will be the first step in providing broader access, automatically, to the blind and low-vision computer-using population.
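
The page does not specify which processing the extension applies to each image. As one plausible illustration, the sketch below reduces a grayscale image to a binary outline using a Sobel edge detector, since outlines are a natural candidate for raised lines on a tactile display. This is a swapped-in, simplified technique, not the method of the thesis, and it is written in Python rather than as Firefox extension code. The function name and threshold are assumptions.

```python
def sobel_edges(gray, threshold=128):
    """Return a binary edge map (1 = edge) for a grayscale image.

    `gray` is a list of equal-length rows of 0-255 intensities.
    Border pixels are left as 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 1
    return edges
```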

Secondarily, this work involves creating or extending a Firefox plug-in to generate textual descriptions of images. Initially, this is likely to be rudimentary, with the main goal being to add some initial proof-of-concept level of textual description where none is present. The approach can be a combination of scraping and summarizing nearby text and alt text, and performing simple detection of objects within the image, or otherwise analyzing shapes and colors in the image. This is a very difficult problem, so the first step is important but will necessarily be incomplete.
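
The scraping half of this approach can be sketched with Python's standard `html.parser` module. This illustrates only the fallback logic (use alt text when present, otherwise borrow nearby text). It is not Firefox extension code and does not attempt object detection. The class and function names and the "Image near text:" wording are assumptions.

```python
from html.parser import HTMLParser


class ImageContextParser(HTMLParser):
    """Record each <img> with its alt text and the text just before and after it."""

    def __init__(self):
        super().__init__()
        self.images = []
        self._recent_text = ""  # last non-empty text seen so far

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr = dict(attrs)
            self.images.append({
                "src": attr.get("src") or "",
                "alt": (attr.get("alt") or "").strip(),
                "before": self._recent_text,
                "after": "",
            })

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text:
            return
        for info in self.images:
            if not info["after"]:     # first text following the image
                info["after"] = text
        self._recent_text = text


def describe(info, max_len=80):
    """Prefer existing alt text; otherwise fall back to nearby page text."""
    if info["alt"]:
        return info["alt"]
    nearby = " ".join(t for t in (info["before"], info["after"]) if t)
    if nearby:
        return "Image near text: " + nearby[:max_len]
    return "Image (no description available)"


def describe_images(html):
    parser = ImageContextParser()
    parser.feed(html)
    parser.close()
    return [describe(info) for info in parser.images]
```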


updated: 01/17/12

actlab.csc.villanova.edu