CSC 9010-003 
Seminar: Big Data and Data Science

Fall, 2014

Table of Contents

About the Course

Welcome to the graduate seminar on big data and data science. This is a hot topic and a pretty new one. It's also the first time this course has been offered at Villanova, so some of the material on these web pages is tentative. As the semester rolls on, though, the planning will solidify.

The primary goal of this seminar is to become acquainted with various aspects of big data and to become literate in the field, rather than to become an expert in a particular system. Since it's a seminar, most of our meetings will consist of a student presenting a topic to her/his classmates and instructor. Of course, presenters should make every effort to acquire hands-on experience in their particular topic (and to help any interested classmates do the same). In this seminar setting we will be teaching one another and learning from our fellows.

As we progress in the course, important references (mostly websites) will be presented as called for by the upcoming presentation topics. But here's a good one to look at: ACM's website. Chase the links there to the Membership and Student Membership pages. We, as part of the Villanova community, have free access to ACM's Digital Library (and similar resources). But student membership (which should cost under $20) also entitles you to many free online O'Reilly/Safari books, including, e.g., Hadoop: The Definitive Guide (3rd Ed), by Tom White.

Since this course is largely experimental, note that much of this syllabus information, including the grading rubric, is tentative and subject to change.

News Bulletins and Upcoming Deadlines

  • Here's some information about the final exam.

Instructor

    Dr. Don Goelman
    Mendel Science Center 162A 
    Office hours: TWTh 9:30-11:00 and by appointment (Sorry if the times are inconvenient: I have a late afternoon class. We can figure something out if you need to see me in person.)


    CSC 8490 (Database Systems)


    There is no text book for this seminar.


    Your own presentation                                        100
    Your preparation for presentations by guests and classmates   40
    In-class group work                                           40
    Final examination (December 9)                               100
    Other (participation, etc.)                                   10
    TOTAL                                                        290

    Typically, break points for letter grades are 90, 80, 70, etc. Upper ends of B and C ranges are "+" grades. Lower ends of A and B ranges are "-" grades.
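For anyone who wants to check the arithmetic, here is a minimal sketch of how the point values listed above could be turned into a letter grade. It rests on assumptions that are not stated in this syllabus: that the break points are percentages of the 290-point total, and that "etc." continues to 60 for a D. The names MAX_POINTS, percentage and base_letter are made up for illustration, and the sketch ignores "+" and "-" grades because their exact ranges aren't given here.

```python
# Maximum points per component, taken from the grading list above.
MAX_POINTS = {
    "presentation": 100,
    "preparation": 40,
    "group_work": 40,
    "final_exam": 100,
    "other": 10,
}


def percentage(earned):
    """Return earned points as a percentage of the 290-point total."""
    total_possible = sum(MAX_POINTS.values())
    total_earned = sum(earned[name] for name in MAX_POINTS)
    return 100.0 * total_earned / total_possible


def base_letter(pct):
    """Map a percentage to a base letter grade.

    Assumption: cutoffs are percentages (90, 80, 70, and 60 for the
    'etc.'). Plus and minus grades are deliberately left out.
    """
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if pct >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    scores = {
        "presentation": 85,
        "preparation": 36,
        "group_work": 34,
        "final_exam": 80,
        "other": 8,
    }
    pct = percentage(scores)
    print(f"{pct:.1f}% -> {base_letter(pct)}")
```

Since the break points are only "typical", treat the result as an estimate rather than a final grade.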

    The Topic You Present

    Here are my expectations of you:

    1. References

      Submit links to at least two important references for your topic. Indicate their relevance to it. This needs to be done by the Tuesday before the week you present.

    2. Questions

      Submit six questions on the readings, for your classmates. Three should be easy, and three should be more challenging. I'll choose one of the easy ones for a quick quiz just before your presentation. And one of the six questions will appear on the final exam. The due date is the same as the one for the references.

    3. Answers to Questions

      Submit the six answers, just to me, at the same time as the questions. Please include the locations in the references where you've taken the answers from.

    4. Long Summary

      Summarize your topic and readings in your own words. The summary itself should be three typewritten pages (double-spaced), and the bibliography should begin on the fourth page. This is due on Friday, December 12 (after the final exam). However, you should probably have completed it the week after your presentation. You'll be graded on content and style. For the latter, you might want to consult with the Villanova Writing Center.

      Soon after they come in, I'll create and disseminate to the class a booklet for the seminar, including all these student writeups.

    5. Presentation Materials

      The day before your presentation, you should have created (and sent me the URL of) a simple website on which you've posted your materials (PDF files, PPT files, YouTube clips, screenshots of implementations, whatever is relevant). I'll link it to this page for your classmates to access.

    6. Presentation Itself

      It should be engaging and informative. Where relevant, it should include ties to other topics, such as how it would be classified as a NoSQL system (e.g., key-value) and which CAP properties apply. Note: if you will be presenting from a Mac (as opposed to a PC), you may need a connector for our room's projector. Let me know if you'd like me to bring one for you.

    For items 1, 2, 4 and 5 above, use the BigData/index.html file in your html subdirectory on our UNIX cluster. It's been set up for you (find it in the "Student Sites" link in the Table of Contents, above). Of course, you'll need to edit it to suit your material. Let Swathi and me know if you have questions about this.

    The Topics You Attend

    Here are my expectations of you:

    1. Summary

      Read the references provided by the speaker, and write a brief (two-paragraph) summary. This is due at 6 PM on the day of her/his presentation. I'll get back to you with instructions for submission.

    2. Quiz

      Just prior to the presentation, I'll administer a short (5-minute!) quiz on one of the three "easy" questions.

    3. Final Exam

      One of the six questions suggested by the presenter will appear on the final exam.

    Ethics for Computer Scientists

    You should familiarize yourself with the details of ACM's code of ethics and of Villanova University's policy on academic integrity. You can find the latter here, and also here, including the links to departmental and university policies. For our course, this would include not sharing with your classmates your written answers to the questions you submit about your presentation.

    Students with Disabilities

    It is the policy of Villanova to make reasonable academic accommodations for qualified individuals with disabilities. If you are a person with a disability, please check out the Office of Disability Services' Web site.

    About the Final Exam

    The final exam will consist of 20 questions, each worth 5 points. Here is the pool from which those questions will be chosen. The list now includes some contributions from Pete's and Krishna's presentations.


    Thanks much to gurus Steve, Pete D., Anton, David, Juan and Najib.

    Last updated: Dec 2, 2014