Welcome to the graduate seminar on big data and data science. This is a hot topic and a pretty new one. It's also the first time this course has been offered at Villanova, so some of the material on these web pages is tentative. As the semester rolls on, though, the planning will solidify.
The primary goal of this seminar is to become acquainted with various aspects of big data and to become literate in the field, rather than to become an expert in a particular system. Since it's a seminar, most of our meetings will consist of a student presenting a topic to her/his classmates and instructor. Of course, presenters should make every effort to acquire hands-on experience in their particular topic (and to help any interested classmates do the same). In this seminar setting we will be teaching one another and learning from our fellows.
As we progress in the course, important references (mostly websites) will be presented as called for by the upcoming presentation topics. But here's a good one to look at: ACM's website. Chase the links there to the Membership and Student Membership pages. We, as part of the Villanova community, have access to ACM's Digital Library (and similar resources) free of charge. But student membership (should be under $20) also entitles you to many free online O'Reilly/Safari books, including, e.g., Hadoop: The Definitive Guide (3rd Ed.), by Tom White.
Since this course is largely experimental, note that much of this syllabus information, including the grading rubric, is tentative and subject to change.
Mendel Science Center 162A
TWTh 9:30-11:00 and by appointment (Sorry if the times are inconvenient: I have a late afternoon class. We can figure something out if you need to see me in person.)
There is no textbook for this seminar.
|Your own presentation||100|
|Your preparation for presentations by guests and classmates||40|
|In-class group work||40|
|Final examination (December 9)||100|
|Other (participation, etc.)||10|
|TOTAL||290|
Typically, break points for letter grades are 90%, 80%, 70%, etc. of the total points. Upper ends of the B and C ranges are "+" grades. Lower ends of the A and B ranges are "-" grades.
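As a rough illustration of how the point totals and break points combine, here is a minimal Python sketch. It assumes that the break points are percentages of the 290 total points and that "etc." continues to 60 for a D. The names `COMPONENT_POINTS`, `percentage`, and `base_letter` are hypothetical. Plus and minus grades are left out because the exact cutoffs are not stated, and actual grading may differ since the rubric is tentative.

```python
# Point values copied from the grading table above.
COMPONENT_POINTS = {
    "presentation": 100,
    "preparation": 40,
    "group_work": 40,
    "final_exam": 100,
    "other": 10,
}
TOTAL_POINTS = sum(COMPONENT_POINTS.values())  # 290

# 90, 80, 70 come from the syllabus; 60 for a D is an assumption ("etc.").
BREAK_POINTS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def percentage(earned):
    """Return the percentage of the total points earned.

    `earned` maps component names to points earned in that component.
    """
    for name, points in earned.items():
        if name not in COMPONENT_POINTS:
            raise KeyError(f"unknown component: {name}")
        if not 0 <= points <= COMPONENT_POINTS[name]:
            raise ValueError(f"points out of range for {name}: {points}")
    return 100.0 * sum(earned.values()) / TOTAL_POINTS


def base_letter(pct):
    """Map a percentage to a letter grade, ignoring +/- modifiers."""
    for cutoff, letter in BREAK_POINTS:
        if pct >= cutoff:
            return letter
    return "F"
```

For example, a student with 90, 36, 36, 90 and 9 points in the five components has 261 of 290 points, and `base_letter(percentage(...))` places that student in the A range under these assumptions.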
Here are my expectations of you when you are the presenter:
1. Submit links to at least two important references for your topic. Indicate their relevance to it. This needs to be done by the Tuesday before the week you present.
2. Submit six questions on the readings, for your classmates. Three should be easy, and three should be more challenging. I'll choose one of the easy ones for a quick quiz just before your presentation. And one of the six questions will appear on the final exam. The due date is the same as the one for the references.
3. Summarize your topic and readings in your own words. The summary itself should be three typewritten pages (double-spaced), and the bibliography should begin on the fourth page. This is due on Friday, December 12 (after the final exam). However, you should probably have completed it the week after your presentation. You'll be graded on content and style. For the latter, you might want to consult with the Villanova Writing Center.
4. Soon after they come in, I'll create and disseminate to the class a booklet for the seminar, including all these student writeups.
5. The day before your presentation, you should have created (and sent me the URL of) a simple website on which you've posted your materials (PDF files, PPT files, YouTube clips, screenshots of implementations, whatever is relevant). I'll link it to this page for your classmates to access.
6. Your presentation should be engaging and informative. It should include ties to other topics, such as how your system would be classified as a NoSQL system (e.g., key-value) (if relevant), and which CAP property/ies apply (if relevant). Note: if you will be presenting from a Mac (as opposed to a PC), you may need a connector for our room's projector. Let me know if you'd like me to bring one for you.
For numbers 1, 2, 5 and 6 above, use the BigData/index.html file in your html subdirectory on our UNIX cluster. It's been set up for you (find it in the "Student Sites" link in the Table of Contents, above). Of course, you'll need to edit it to suit your material. Let Swathi and me know if you have questions about this.
Here are my expectations of you when a classmate or guest is presenting:
Read the references provided by the speaker, and write a brief (two-paragraph) summary. This is due at 6 PM on the day of her/his presentation. Again, I'll get back to you with instructions for submission.
Just prior to the presentation, I'll administer a short (5-minute!) quiz on one of the three "easy" questions.
One of the six questions suggested by the presenter will appear on the final exam.
You should familiarize yourself with the details of ACM's code of ethics and of Villanova University's policy on academic integrity. You can find the latter here, and also here, including the links to departmental and university policies. For our course, this would include not sharing with your classmates your written answers to the questions you submit about your presentation.
It is the policy of Villanova to make reasonable academic accommodations for qualified individuals with disabilities. If you are a person with a disability, please check out the Office of Disability Services' Web site.
The final exam will consist of 20 questions, each worth 5 points. Here is the pool from which those questions will be chosen. The list now includes some contributions from Pete and Krishna's presentations.
Thanks much to gurus Steve, Pete D., Anton, David, Juan and Najib.
Last updated: Dec 2, 2014