CSC 4598-001 Machine Translation Fall 2018

SYLLABUS


Meetings
MW 3:00-4:15, Mendel Science Center TBD
Instructor Dr. Tom Way
160A Mendel Science Center

Email:   thomas.way@villanova.edu
Skype:  DrTomWay
Phone:  (610) 519-5033
Office hours (See my web site)
Teaching Assistant none
Textbook None. We will rely on online resources and handouts.
Web site
http://www.csc.villanova.edu/~tway and follow the link for CSC 4598
Catalog description
Exploration of the field of machine translation, the science behind Google Translate. Topics covered include automated computer systems that translate human language using statistical approaches and a wide variety of digital representation transformations, such as readability and sentiment analysis, spam filtering, plagiarism detection, and other natural language processing techniques. Students will gain hands-on experience building machine translation systems using real-world data and formulating and investigating research questions in machine translation. Course typically includes collaboration with a non-Computer Science course on interdisciplinary, team-based, student projects.
Course description

This course is an introduction to the field of machine translation, including the related and broader field of computer-aided translation. The course is novel in that it will also involve interdisciplinary learning, with materials from, and possibly team-based interactions with, faculty and students in other, non-technical courses at Villanova. This unique offering affords all involved students and faculty the opportunity to accumulate and apply expertise from their respective disciplines to develop approaches and machine translation tools, much in the way such collaboration is done in academic research and the software industry.

Machine translation in the traditional sense, as a subfield of "computational linguistics," involves the use of an automated computer system to convert, or translate, text in one human language to the equivalent text in a different human language. More broadly, the task of machine translation involves using an automated computer system to translate any form of representation into another form, such as text into speech, speech into text, automatic description of images, cryptography, spam filtering, sentiment analysis, readability analysis, word clouds, or any other form of attempting to understand one representation of knowledge, especially language, in another form.

Machine translation uses computers to turn one form of information into another.

Among the approaches we will study are the three major paradigms of machine translation of language: word-based translation, phrase-based translation, and syntax-based translation. Students will gain hands-on experience with building machine translation systems and working with real-world data, and they will learn how to formulate and investigate research questions in machine translation.
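
As a rough illustration of the first paradigm, word-based translation can be sketched as a dictionary lookup applied one word at a time. The tiny French-to-English lexicon and the function name below are hypothetical, chosen only for this example; they are not course materials.

```python
# Toy French-to-English lexicon. The entries are illustrative assumptions,
# not data from the course.
LEXICON = {"le": "the", "chat": "cat", "noir": "black", "dort": "sleeps"}


def translate_word_by_word(sentence, lexicon=LEXICON):
    """Translate each whitespace-separated word independently.

    Words missing from the lexicon are kept and wrapped in angle
    brackets so they are easy to spot in the output.
    """
    translated = []
    for token in sentence.lower().split():
        translated.append(lexicon.get(token, "<" + token + ">"))
    return " ".join(translated)


print(translate_word_by_word("Le chat noir dort"))
```

The output keeps the French word order ("the cat black sleeps"), which hints at why phrase-based and syntax-based approaches look beyond single words.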

Class meetings and assignments will vary and may include lecture, video, projects, case studies, hands-on programming, experimentation with software tools, presentations, and seminar-style discussions.

The very best of the work done in the semester will be carried forward in collaboration with your instructor, and could lead to submission of research papers to prestigious conferences and journals. Successful publication in conferences may even lead to travel to conferences to present the results of your work.

Above all, this course will attempt to be educational, fun, entertaining and meaningful. Class will always be a safe place to explore ideas, devise novel approaches, collaborate with others, make mistakes, solve problems and learn a whole lot while you create some pretty cool pieces of software and even pick up some non-Computer Science knowledge along the way.

Projects

The types of projects and activities we will pursue in class are likely to include:

  • Learn to program in the Python and JavaScript programming languages.
  • Create a word-based translator for a language with very simple translation rules, such as Pig Latin, Verlan, Ubbi Dubbi, and Pirate.
  • Create more advanced translators for a variety of language translations.
  • Evaluate and use online tools that perform various translations, such as Google Translate and the many other similar tools.
  • Use the Natural Language Toolkit and the Python programming language to create cool tools of all sorts, including phrase- and syntax-based translators.
  • Create research-oriented machine translation tools to assist students in FRE 1140 to perform a variety of language-related tasks from simple to complex.
  • Use JavaScript and HTML5 to explore other approaches for creating web-based machine translation tools, such as sentiment analyzers, word clouds, readability analyzers and cryptography tools.
  • And more...
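
To give a feel for the simple word-based translators mentioned in the list above, here is a minimal Python sketch of a Pig Latin translator. It assumes one common, simplified rule set (words starting with a vowel get "way"; otherwise the leading consonants move to the end, followed by "ay"). The function names are hypothetical, and a class project may well use different rules.

```python
VOWELS = "aeiou"


def pig_latin_word(word):
    """Translate a single word using a simplified Pig Latin rule set."""
    if not word.isalpha():
        # Simplification: leave numbers and punctuated tokens untouched.
        return word
    lower = word.lower()
    if lower[0] in VOWELS:
        return lower + "way"
    # Move the leading consonant cluster to the end and add "ay".
    for i, ch in enumerate(lower):
        if ch in VOWELS:
            return lower[i:] + lower[:i] + "ay"
    # No vowel found (for example "rhythm"): just append "ay".
    return lower + "ay"


def pig_latin(text):
    """Translate a whole sentence, one whitespace-separated word at a time."""
    return " ".join(pig_latin_word(w) for w in text.split())


print(pig_latin("machine translation is fun"))
```

Because every word is handled independently, the same structure could be adapted to other rule-based languages in the list by swapping out the per-word function.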
Learning Objectives & Outcomes
Objective 1: Survey the subject of Machine Translation through lecture, reading, discussion, online web resources, and videos.
  • Outcome 1: Students will demonstrate an understanding of approaches and techniques used in Machine Translation by critically evaluating online references, leading and participating in classroom discussions, completing homework assignments, and conducting brief lab activities.
  • Outcome 2: Students will apply the knowledge gained to complete a number of programming projects done individually, as part of a small team, and possibly in collaboration with students in a non-technical course.

Objective 2: Provide experience individually and as a team member on an interdisciplinary team in a number of practical tasks including researching topics, evaluating information, designing and conducting experiments, and programming software tools relating to one or more specific aspects of Machine Translation.

  • Outcome 1: Students will demonstrate an ability to collaborate with teams to explore, design and experiment.
  • Outcome 2: Students with a technical background (e.g., Computer Science) will demonstrate an understanding of the software design and implementation of Machine Translation tools, applying computer science ideas to the research and analysis needs of topics within a non-technical course.
  • Outcome 3: Students with a non-technical background (e.g., a collaborating, non-technical course) will demonstrate an understanding of the use of various Machine Translation software tools and techniques to analyze, understand and possibly solve a variety of environmental issues, and will gain an improved understanding of computing and computational thinking while learning computer science and programming concepts.
Grading policy
Grading will be based on cumulative learning activities, assignments, participation, contribution to class, and ultimately your personal productivity as it relates to projects. Participation in particular will be vital!

25%  Assignments (homework, lab, reading analysis, research tasks)
10%  Midterm exam or project
40%  Projects (individual, team, interdisciplinary and collaborative)
15%  Final project demonstrating overall understanding
10%  Participation (attendance, class discussion, intellectual contribution to class)

Final grades
92 A     88 B+    78 C+    68 D+
90 A-    84 B     74 C     64 D
         80 B-    70 C-    60 D-
Attendance
Attendance is mandatory.
Makeup Policy
Missed or late assignments, exams or projects will not be accepted without prior excuse. Each case will be handled separately based on its own merits. Each student is responsible for what is covered and assigned in any classes they miss. Abuse of this policy will result in a loss of leniency.
Late Assignment Policy
No assignments will be accepted late without the direct consent of the instructor prior to the due date of the assignment. The typical penalty is 10% off for each day an assignment is late. Absolutely no assignments will be accepted beyond the date of the final exam.
Academic Integrity
All students are expected to uphold Villanova’s Academic Integrity Policy and Code. Any incident of academic dishonesty will be reported to the Dean of the College of Liberal Arts and Sciences for disciplinary action. For the College’s statement on Academic Integrity, you should consult the Enchiridion. You may view the university’s Academic Integrity Policy and Code, as well as other useful information related to writing papers, at the Academic Integrity Gateway web site. Severe academic penalties will be imposed for violations of this policy, such as receiving at a minimum 0% credit for an assignment, or at the maximum a failing grade for the course, at the discretion of the instructor.
Office of Disabilities and Learning Support Services
It is the policy of Villanova to make reasonable academic accommodations for qualified individuals with disabilities. You must present verification and register with the Learning Support Office by contacting 610-519-5176 or via email. For physical access or temporary disabling conditions, please contact the Office of Disability Services at 610-519-4095 or email. Registration is needed in order to receive accommodations.

Last updated: 05/17/2018