CSC 4480: Database Systems, Homework

CSC 4480-001: Principles of Database Systems

Spring, 2018
Homework

Homework #01 (article review; due Jan 29)
Homework #02 (ERD, Chen style; due Feb 5)
Homework #03 (ERD, Merise style; due Feb 5)
Homework #04 (convert ERD to RDB; due Feb 12)
Homework #05 (Relational Algebra; due Feb 26)
Homework #06 (SQL1, due Mar 12)
Homework #07 (SQL2, due Mar 19)
Homework #08 (EERD, due Mar 26)
Homework #09 (UML, due Mar 26)
Homework #10 (EERD->RDB, due Mar 26)
Homework #11 (JDBC, due Apr 9)
Homework #12 (PL/SQL, due Apr 9)
Homework #13 (FD's and English, due Apr 23)
Homework #14 (Minimal Cover, due Apr 30)
Homework #15 (Normal Form Analysis, due Apr 30)
Homework #16-17 (Decompositions, due May 2, but NOT FOR SUBMISSION)
Homework #18 (NoSQL, due May 7)

hw #1:

On this assignment you will read and report on an article or website. Here's the topic to research: One of the hot discussion topics these days is "NoSQL" data stores. Some folks feel that we should no longer be using (or teaching!) relational databases and SQL.

Find an article or website that addresses this issue. It shouldn't be a Wikipedia entry, but it's OK if you find it through Wikipedia (even though googling it is probably sufficient). Write a brief (font size at least 12, limit one page) summary of the article. Include whatever details you can glean as to date, author, etc. Address these points:

date of article (if available)
author
journal
website
type of database user most likely to find this of interest, according to the classification in Chapter 1 of our text
how the article expands the acronym "NoSQL"
Which side of the issue supported by this article
What you think

Remember, for this and all homework submissions (unless some graphics are necessary) remember to use a word processor. Please submit your assignments by email, either in the body of your message or as an attachment. It must be postmarked no later than class time on the due date, unless otherwise specified. Send it to the TA (once we get one), with cc to me.

hw #2:

Draw two ER diagrams for the enterprise described below: one using standard (Chen) notation (that one is hw #2), and one using ordered pair (Merise) notation (this one being hw #3). Indicate clearly which is which. Remember to indicate keys and relationship classifications in the diagram. Use your common sense on any classifications that aren't made explicit in the specs. Don't add any more ingredients than are described here. If you can, use software (e.g., MS-Visio, freeware or Paint) to generate the diagram. At any rate, make sure it's very legible.

If you have questions about these specs, I'll play the role of the client who will resolve them.

So here's the enterprise: It's part of a university setting, with students, departments, faculty and internships. Each student has a unique sID, and also a sName and a gpa. A department has a unique dName and is located in a particular building. A faculty member is identified by a fID and also has a fName and a rank. Each internship has a IID unique to itself and was taken a particular semester. A student has a unique major, which is a department, and may also have minors. For each such minor we note the signup date of that student. Each faculty member must be affiliated with a department. For each internship there's a unique student who registered for it (but the student might take more than one internhip over his/her career) and a unique supervising faculty member (who might also supervise other student internships).

hw #3:

Draw an ER diagram in the Merise style for the same enterprise.

hw #4:

Consider this figure. Convert it algorithmically to a relational database schema. Your deliverable is a set of relation schemas, with primary and foreign keys indicated in one of the usual ways.

hw #5:

Assuming the relational database schema for COMPANY that we've been using (e.g., Fig. 9.2 in our text), formulate the following queries in the relational algebra. Remember, they must be logically correct. It's not enough if they happen to work on the sample instance in the text. Also, remember that this enterprise has both "managers" and "supervisors," and how the terms are different.

Give the last names, ssns and birth dates of all female employees.

Give the project names that employee Katie Barta is working on.

Give the project names that don't have any workers with first name Meghan.

Give the locations that have some department(s) there, but not the research department.

hw #6:

Same as #5, but this time formulate in SQL.

Give the last names, ssns and birth dates of all female employees.

Give the project names that employee Katie Barta is working on.

Give the project names that don't have any workers with first name Meghan.

Give the locations that have some department(s) there, but not the research department.

hw #7:

For the same COMPANY database, formulate these in SQL as well. And unless it's explicitly requested, assume we don't want any duplicated rows. But don't use distinct if it's not necessary. Finally, rename any column in the output if it's not aesthetic (eg, "AVG(SALARY)" isn't aesthetic. "Average salary" would look better).

Give each department's name, together with its payroll (the sum of its employees' salaries).
Give each department's name, together with the number of locations it has.
For each project, give its name, together with the average salary of the women who work on it. Don't worry about division by zero: you may assume every department has at least one woman in it.
For each supervisor who has at least one dependent, give the supervisor's last name, ssn, and the number of dependents s/he has.

hw #8:

Design an EER model with the usual conventions that uses subclass structures wherever appropriate for the following situation: Persons have ssn's, names, and dates of birth. They may be Congresspersons, entertainers, both or neither. If they're Congresspersons, then we keep track of when they entered the Congress. If they're entertainers, then we note how many fans they have. Congresspersons must be either Senators or Representatives. If they're Senators, then we want to know which state they represent. If they're Representatives, then what's their district? Entertainers must be actors or musicians, but they could be both. For actors we need their primary medium; for musicians, their primary instrument. Finally, entertainers may be supporters of various Congresspersons.

hw #9:

Take the EERD in this figure, and convert it to a UML class diagram. You may omit methods and data types.

hw #10:

For the same figure, convert it to the relational data model. Implement the Artists-Fans specialization by using strategy (A) (the superclass and each subclass get a relation schema) and the leadVocals-Instrumentalists one by strategy (C) or (D) (flatten the hierarchy). For this stage, include the "extensional database": all table names, primary keys and foreign keys.

Next, provide SQL code for any one of the "intensional" views other than the "Artists" one.

Finally, take one constraint other than a "disjointness" one, tell which constraint it is, and write SQL code for enforcing it.

hw #11:

Assume the usual "COMPANY" schema. Write a Java class called hw11 that will take three command-line arguments: a username, a password and an employee lastname It connects to ORACLE on csdb and prints out the project names and their locations that that employee works on. Remember to prefix table names by 'DBTA' (the owner of these tables) so that you can check your work.

hw #12:

Create a PL/SQL named procedure called hw12 that does something similar to the Java code in homework #11. The difference is that this time the last name 'Wong' is hard-coded in the procedure. In other words, the procedure prints out the project names and locations that are worked on by the employee(s) with last name 'Wong'. Here's some guidance: Write and submit a file called MAKEhw12.sql. It's analogous to MakeEG.sql in the downloads. When run by the "start" command within Oracle, the "hw12" named procedure is then created.

hw #13:

As we've seen, one of the key tasks in database development is the translation between nontechnical and technical language. In the case of functional dependencies, suppose, e.g., we have the schema

Things = {ambience, brightness, coherence, diligence}

Don't worry about the attributes' semantics: they're not supposed to mean anything. Now consider the two English sentences

a) Things of different brightness must have different ambience.
b) Things with the same coherence could still have different diligences.

These can be translated into the language of FD's as follows (do you see why? If not, don't try this assignment yet! Instead, think back on the "logic" part of the Discrete Structures course. You might also wish to clarify things with Mohamed and me.):

a) a->b holds.
b) c->d fails.

Now take each of the following statements (don't impute any "meaning" to them, nor are the four scenarios necessarily consistent), and translate it similarly into an assertion that some FD holds or fails.

1) Things with different diligences could not have the same ambience.
2) It's possible for things with different coherence to have the same brightness.
3) Things with different coherences could have the same ambience.
4) It's possible for things with different brightnesses and the same coherence to have the same diligence.

hw #14:

Suppose we're given the following set of FD's for schema ABCDEFGH. Find a minimal cover for it. Give reasons as you go along.

A->E
AD->BE
AC->E
E->B
BG->F
BE->D
BDH->E
F->A
D->H
CD->A

hw #15:

Suppose now this time the schema Widgets = {area, brightness, changeability, density, energy}
has the FD's
e->b
e->a
cd->e
cd->a
a->b

a) This is not a minimal set of FD's. Explain.
b) Display graphically a minimal set equivalent to the given set.
c) How many keys are there for this set?
d) List them all.
e) Take one of those keys, and prove it's one.
f) What is the highest normal form the schema is in? Explain.
g) Give a scenario for an insert anomaly caused by this low NF.

hw #16-17:

This assignment is longish, so it is equivalent to two regular ones. Assume the schema R = {a, b, c, d, e, f, g, h, i, j}. Below are three sets F of functional dependencies and three decompositions D. In each case, explain with reasons whether D is FDP and whether it is LJD, and give the NF's of its components.

Take F as follows:
fd1)ch->adg
fd2)dg->f
fd3)chj->bi
fd4)bg->e
D = {dfg,acdgh,bchij,beg}
Take F as follows:
fd1)c->b
fd2)b->je
fd3)g->fh
fd4)d->gi
fd5)dc->a
D = {acdg,gi,bej,fgh,bc}
Take F as follows:
fd1)i->de
fd2)d->cg
fd3)b->aj
fd4)hi->f
fd5)h->b
D = {abhj,cdg,defhi}

hw #18:

This is a follow up on your first homework in the course and hence brings us "full circle." Dig up your initial writeup, re-read the article you reported on, and use one page to address these points:

Which side of the issue did you support in HW1?

In view of what you've learned about relational and NoSQL systems, which side do you support now? Refer to Slide #5 ("5 Problems with RDBMS") in MarkLogic's "Introduction to NoSQL" presentation.

Even though this is an individual assignment, what's your opinion on how well your term project would work as a NoSQL data store? Explain.

Back to home page