Cheat sheet

CS 500, Database Theory, Summer 2014

Homework 1: Relational and ER Models
Due at 5pm on Monday, July 14

SOLUTION

Description

This assignment covers the following topics:

(1) The relational model
(2) The ER model

I do not expect you to use any special software to draw ER diagrams. The easiest way is to print out this assignment, draw the diagrams by hand, scan them in, and submit. I am intentionally leaving room for solutions. Note that you will be expected to draw ER diagrams on the midterm and/or final exam, which will be given electronically. This homework is a practice run for how you would complete and submit assignments of this kind.

Grading

This assignment is made up of 5 problems, collectively worth 100 points, or 5% of the overall course grade. If this assignment is submitted late, you will receive no credit.

Submission instructions

Submit your assignment using the submission script on tux.cs.drexel.edu. I strongly prefer having your submission as 1 PDF file for this assignment.

• Create a directory (referred to as dirName below) that will hold your submission file. Directory name and location are immaterial, as long as the directory is on tux.
• Execute /home/julia/cs500/bin/submit <dirName> to submit.
• You may submit multiple times before the deadline. Only your last submission will be graded.

This assignment is to be completed individually. Please consult the course syllabus for a description of our plagiarism policy.


Part 1 (30pts): From business rules to ER diagrams

A recording studio needs help designing its database. The studio stores information about musicians and albums. Draw an ER diagram describing the studio’s database for each of the two scenarios described below. Assume that the only business rules that hold are those stated below, and that no additional business rules hold. Clearly mark all key and participation constraints.

(a) 10pts. Each musician who records at the studio has a social security number (ssn) and a name, and no two musicians have the same ssn. Musicians form bands. A band is described by a unique name and has at least one musician as a member.

Bands record albums, which have a title and a year of production. Each album is recorded by exactly one band, and no two albums have the same title and the same production year. Each album is produced by exactly one musician. (Don’t worry about whether that musician is a member of the recording band.)

Albums are made up of songs, described by their titles. You may assume that, if the studio no longer wants to store information about an album, then it also does not store the songs belonging to that album. Naturally, each song belongs to exactly one album, and all songs on the same album have different titles.

Solution: the line connecting BANDS and member_of is bold; the lines connecting ALBUMS to produce and record and SONGS to belong_to are bold and have an arrowhead.

       

[ER diagram: entity sets MUSICIANS (ssn, name), BANDS (name), ALBUMS (title, year), and SONGS (title); relationship sets member_of, record, produce, and belong_to.]

(b) 10 pts. Each musician who records at the studio has a social security number (ssn), a name, an address, and a phone number. Poorly paid musicians often share the same address. A phone number may be associated with at most one address (if it’s a landline). Each musician has exactly one address and at least one phone number.

Albums are described by a title and a recording year, and no two albums have both the same title and the same year. Albums are recorded by one or several musicians. Each album has exactly one musician who acts as its producer. A particular musician may produce zero, one or several albums.

Solution: The line connecting ALBUMS and recorded_by is bold; the line connecting ALBUMS to produced_by is bold and has an arrowhead; the line connecting MUSICIANS to have is bold; the line connecting MUSICIANS to live_at is bold and has an arrowhead; the line connecting PHONES to associated_with is not bold and has an arrowhead.

       

[ER diagram: entity sets MUSICIANS (ssn, name), ALBUMS (title, year), ADDRESSES (street, state, zip), and PHONES (area, num); relationship sets produced_by, recorded_by, live_at, associated_with, and have.]

(c) 5 pts. Smooth Air Company operates multiple flights between certain pairs of cities daily. Each flight has an origin city, a destination city, a scheduled departure time, and flight duration. You may assume that each city is uniquely identified by its name.

Solution: There are 2 ways to draw this ER diagram.

[ER diagram, first way: entity sets ORIGIN (name), DESTINATION (name), and DEPARTURE (time, date), connected by the relationship set flight.]

[ER diagram, second way: entity sets CITY (name) and DEPARTURE (time, date), connected by the relationship set flight, in which CITY participates in two roles, origin and destination.]

(d) 5 pts. Animal species live in habitats. Each animal species belongs to exactly one habitat and each habitat is home to at least 1 animal species. You may assume that each habitat and each species are uniquely identified by their names.

Solution: The line connecting Animals and have is bold and has an arrowhead. The line connecting Habitats and have is bold.

[ER diagram: entity sets ANIMALS (genus, species) and HABITATS (name); relationship set have.]

Part 2 (20 points): From Create table statements to ER diagrams

(a) 10pts. Consider the create table statements below. Draw an ER diagram from which these create table statements could have been derived. Be sure to mark any key and participation constraints.

    create table Shelf (
        A number primary key,
        B number not null unique,
        C number
    );

    create table Bag (
        D number primary key,
        A number not null,
        foreign key (A) references Shelf(A)
    );

Solution:

[ER diagram: entity sets BAG (D) and SHELF (A, B, C); relationship set some_rel.]

This is one possible solution. Another correct solution is one in which B is the key of the entity set Shelf.

Note that a key constraint holds over the entity set Bag (because D is the primary key in the table Bag), and that a participation constraint holds over Bag (because A, a foreign key relating Bag to Shelf, is not null).

Intuitively, this ER diagram represents a relationship set like: a Bag is placed on a Shelf. Each Bag can be found on exactly one Shelf, and a Shelf may hold 0, 1 or several Bags. This is an example of a many-to-one relationship set.
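The many-to-one reading of Shelf and Bag can be tried out directly. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for the course database; SQLite accepts the `number` type name but enforces foreign keys only after `PRAGMA foreign_keys = ON`. The helper `accepted` and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    create table Shelf (
        A number primary key,
        B number not null unique,
        C number
    );
    create table Bag (
        D number primary key,
        A number not null,
        foreign key (A) references Shelf(A)
    );
""")


def accepted(sql):
    """Run one statement and report whether the constraints allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "shelf 1": accepted("insert into Shelf values (1, 10, null)"),
    "shelf 2, left empty": accepted("insert into Shelf values (2, 20, null)"),
    "bag 100 on shelf 1": accepted("insert into Bag values (100, 1)"),
    "bag 101 on shelf 1": accepted("insert into Bag values (101, 1)"),
    "bag 102 on no shelf": accepted("insert into Bag values (102, null)"),
    "bag 103 on a missing shelf": accepted("insert into Bag values (103, 99)"),
}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Two bags may share a shelf and a shelf may stay empty, while a bag without a shelf, or on a shelf that does not exist, is rejected.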

(b) 10 pts. Consider the create table statements below. Draw an ER diagram from which these create table statements could have been derived. Be sure to mark any key and participation constraints.

    create table Pocket (
        X number primary key,
        Y number not null unique,
        Z number,
        foreign key (Y) references Drawer(Y)
    );

    create table Drawer (
        Y number primary key
    );

Solution: These create table statements model a one-to-one relationship set relating Pocket and Drawer.

[ER diagram: entity sets DRAWER (Y) and POCKET (X, Z); relationship set some_rel.]

Note that both key and participation constraints hold over the entity set Pocket: the key constraint because X is a primary key in Pocket, and the participation constraint because Y in Pocket is not null. A key constraint holds over the entity set Drawer, because Y is designated as unique in Pocket. No participation constraint holds over Drawer: a Drawer not related to a Pocket is one for which there is a tuple in Drawer, but its Y value does not appear in the table Pocket.

There is no good intuitive interpretation of this relationship set, perhaps something like: the contents of a Drawer can be placed into at most one Pocket (some drawers are too large and so their contents do not fit into any pocket). Each pocket is designated to hold the contents of exactly one drawer.
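The one-to-one behaviour of Pocket and Drawer can be checked the same way. This is a sketch under the same assumptions as before: SQLite via Python's sqlite3 module stands in for the course database, foreign keys are switched on explicitly, and Drawer is created first so that the referenced table exists. The helper `accepted` and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    create table Drawer (
        Y number primary key
    );
    create table Pocket (
        X number primary key,
        Y number not null unique,
        Z number,
        foreign key (Y) references Drawer(Y)
    );
""")


def accepted(sql):
    """Run one statement and report whether the constraints allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "drawer 1": accepted("insert into Drawer values (1)"),
    "drawer 2, no pocket": accepted("insert into Drawer values (2)"),
    "pocket 10 for drawer 1": accepted("insert into Pocket values (10, 1, null)"),
    "pocket 11 also for drawer 1": accepted("insert into Pocket values (11, 1, null)"),
    "pocket 12 for no drawer": accepted("insert into Pocket values (12, null, null)"),
    "pocket 13 for a missing drawer": accepted("insert into Pocket values (13, 9, null)"),
}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

The unique constraint on Y is what turns the many-to-one pattern into one-to-one: a second pocket for drawer 1 is rejected, while drawer 2 is free to have no pocket at all.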

Problem 3 (10pts): Keys

Consider an entity set Person, with attributes social security number (ssn), name, nickname, address, and date of birth (dob). Assume that the following conditions hold: (1) no two persons have the same ssn; (2) no two persons have the same combination of name, address, and dob. Further, assume that all persons have an ssn, a name and a dob, but that some persons don’t have a nickname or an address.

(a) 5pts. List all candidate keys and all superkeys for this entity set. How many candidate keys and how many superkeys are there?

Solution: There are 2 candidate keys: (ssn) and (name, address, dob).

There are 16 superkeys in addition to the candidate keys themselves, listed below. (A candidate key is a minimal superkey, so counting the 2 candidate keys as well gives 18 superkeys in total.)

First, we list all sets of attributes that include the first candidate key, (ssn), as a proper subset. There are exactly 4 + 6 + 4 + 1 = 15 such superkeys.

    (ssn, name), (ssn, nickname), (ssn, address), (ssn, dob),
    (ssn, name, nickname), (ssn, name, address), (ssn, name, dob),
    (ssn, nickname, address), (ssn, nickname, dob), (ssn, address, dob),
    (ssn, name, nickname, address), (ssn, name, nickname, dob),
    (ssn, name, address, dob), (ssn, nickname, address, dob),
    (ssn, name, nickname, address, dob)

Next, we list all sets of attributes that include the second candidate key, (name, address, dob), as a proper subset. There are exactly 2 + 1 = 3 such superkeys.

    (ssn, name, address, dob), (name, nickname, address, dob),
    (ssn, name, nickname, address, dob)

Note that 2 superkeys, (ssn, name, address, dob) and (ssn, name, nickname, address, dob), include both candidate keys as subsets and so appear in both lists. Thus, there are a total of 15 + 3 - 2 = 16 such superkeys for this relation.

(b) 5 pts. Write a create table statement that defines a relation appropriate for this entity set.

Solution: (Lengths of the varchar fields are unimportant)

    create table Person (
        ssn char(11) primary key,
        name varchar(64),
        nickname varchar(32),
        address varchar(128),
        dob date,
        unique (name, address, dob)
    );

Another option is to designate ssn as unique, and (name, address, dob) as a primary key.

    create table Person (
        ssn char(11) unique,
        name varchar(64),
        nickname varchar(32),
        address varchar(128),
        dob date,
        primary key (name, address, dob)
    );

Note, however, that primary key columns cannot be null, so this second option does not admit persons without an address. Under the assumptions stated above, the first option is the safer choice.
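Returning to part (a), the superkey count can be cross-checked by brute force: a set of attributes is a superkey exactly when it contains at least one candidate key. The sketch below is an illustration only, with made-up names (`all_superkeys`, `non_minimal`, `with_both`). It counts the candidate keys themselves as superkeys, so it reports the total alongside the number of superkeys that are not candidate keys.

```python
from itertools import combinations

ATTRIBUTES = ("ssn", "name", "nickname", "address", "dob")
CANDIDATE_KEYS = [frozenset({"ssn"}), frozenset({"name", "address", "dob"})]


def all_superkeys(attributes, candidate_keys):
    """Return every attribute set that contains at least one candidate key."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(key <= frozenset(combo) for key in candidate_keys):
                found.append(combo)
    return found


superkeys = all_superkeys(ATTRIBUTES, CANDIDATE_KEYS)
# Superkeys that are not themselves candidate keys (the ones listed in the solution).
non_minimal = [s for s in superkeys if frozenset(s) not in CANDIDATE_KEYS]
# Superkeys that contain both candidate keys at once.
with_both = [s for s in superkeys if all(key <= frozenset(s) for key in CANDIDATE_KEYS)]

print(len(superkeys), len(non_minimal), len(with_both))  # prints: 18 16 2
```

The 16 non-minimal superkeys match the inclusion-exclusion count 15 + 3 - 2 in the solution.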

Problem 4 (10pts): Schemas and instances

Consider an instance of relation Foo. Below, we ask you to write three create table statements. Each create table statement must define a primary key.

Foo (A, B, C, D)

    A   B     C    D
    1   Ann   23   3
    2   Bob   23   4
    3   Joe   20   3
    4   Bob   20   4

(a) 5 pts. Write two different create table statements for which the instance of Foo is legal. Note that taking the first statement and simply reordering columns does not give a different create table statement.

Solution:

    create table Foo (
        A number primary key,
        B char(3),
        C number,
        D number
    );

    create table Foo (
        A number,
        B char(3),
        C number,
        D number,
        primary key (C, D)
    );

Other solutions are possible as well.

(b) 5 pts. Write a create table statement that would make the instance of Foo above illegal.

Solution:

    create table Foo (
        A number,
        B char(3) primary key,
        C number,
        D number
    );

Other solutions are possible as well.
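Whether an instance is legal for a schema can be tested by simply loading it. The following sketch assumes SQLite (via Python's sqlite3 module) as a stand-in database and loads the four Foo rows under each of the three create table statements above; the function name `is_legal` and the dictionary labels are illustrative, not part of the assignment.

```python
import sqlite3

ROWS = [(1, "Ann", 23, 3), (2, "Bob", 23, 4), (3, "Joe", 20, 3), (4, "Bob", 20, 4)]

SCHEMAS = {
    "primary key (A)":
        "create table Foo (A number primary key, B char(3), C number, D number)",
    "primary key (C, D)":
        "create table Foo (A number, B char(3), C number, D number, primary key (C, D))",
    "primary key (B)":
        "create table Foo (A number, B char(3) primary key, C number, D number)",
}


def is_legal(ddl, rows):
    """Create Foo in a fresh in-memory database and try to load every row."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(ddl)
        conn.executemany("insert into Foo values (?, ?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


legality = {label: is_legal(ddl, ROWS) for label, ddl in SCHEMAS.items()}
for label, ok in legality.items():
    print(f"{label}: {'legal' if ok else 'illegal'}")
```

The third schema fails because the value Bob appears twice in column B, which a primary key on B does not allow.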

Problem 5 (10pts): Foreign keys

Consider the relation schemas below, with the primary key of each listed alongside. Suppose that each mayor governs exactly one city, and that each governor governs exactly one state.

    City (name, state, population, elevation)    primary key: (name, state)
    State (name, region)                         primary key: (name)
    Mayor (name, city, state, party)             primary key: (name)
    Governor (name, state, party)                primary key: (name)

(a) 7 pts. Write create table statements that encode these relation schemas and business rules with the right foreign key constraints.

Solution:

    create table City (
        name varchar(64),
        state varchar(32),
        population number,
        elevation number,
        primary key (name, state)
    );

    create table State (
        name varchar(32) primary key,
        region varchar(32)
    );

    create table Mayor (
        name varchar(128) primary key,
        city varchar(64) not null,
        state varchar(32) not null,
        party varchar(32),
        foreign key (city, state) references City(name, state)
    );

    create table Governor (
        name varchar(128) primary key,
        state varchar(32) not null,
        party varchar(32),
        foreign key (state) references State(name)
    );

(b) 3 pts. In what order would you drop these tables? Give all valid sequences.

Solution: All sequences in which Mayor is dropped before City and Governor is dropped before State are valid. There are 6 such sequences:

    Governor, Mayor, City, State
    Governor, Mayor, State, City
    Governor, State, Mayor, City
    Mayor, Governor, City, State
    Mayor, Governor, State, City
    Mayor, City, Governor, State
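The drop-order rule (a referencing table must be dropped before the table it references) can be enumerated mechanically. This is a small sketch with hypothetical names (`DROP_FIRST`, `is_valid`) that tries every ordering of the four tables and keeps those in which Mayor precedes City and Governor precedes State.

```python
from itertools import permutations

TABLES = ("City", "State", "Mayor", "Governor")
# (referencing table, referenced table): the first must be dropped before the second.
DROP_FIRST = [("Mayor", "City"), ("Governor", "State")]


def is_valid(order):
    """True if every referencing table appears before the table it references."""
    return all(order.index(child) < order.index(parent) for child, parent in DROP_FIRST)


valid_orders = [order for order in permutations(TABLES) if is_valid(order)]

for order in valid_orders:
    print(", ".join(order))
print(len(valid_orders), "valid sequences")
```

Each of the two rules independently rules out half of the 24 orderings, so 24 / 4 = 6 orderings survive.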

Part 6 (20pts): Translating ER models to relational schemas

Consider the ER diagrams below. For each diagram, write SQL statements (create table) that implement the constraints specified by the diagram. Create as many tables as required. Briefly explain which constraints are captured in your relational implementation, and in what way. If a constraint cannot be implemented, state that explicitly in your explanation. You will not receive full credit without an explanation.

(a) 5 pts.

[ER diagram: entity sets FOREST (name) and TREE (id); relationship set made_of.]

(The line connecting FOREST and made_of is bold; the line connecting made_of and TREE is not bold and has an arrowhead.)

Solution:

    create table Forest (
        name varchar(128) primary key
    );

    create table Made_Of_Trees (
        id number primary key,
        forest_name varchar(128),
        foreign key (forest_name) references Forest(name)
    );

We cannot model the participation constraint on FOREST in a relational schema. Thus, our tables above only model the key constraint on TREE, which states that a tree belongs to at most one forest. This constraint is implemented by making id in Made_Of_Trees a primary key in that table. Trees that do not belong to any forest will still appear in this relation, but will have the value of forest_name set to null.
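The point that the participation constraint on FOREST is not captured can be observed by inserting a forest that has no trees. The sketch below assumes SQLite through Python's sqlite3 module, with foreign keys switched on explicitly; the forest names, tree ids, and the helper `accepted` are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    create table Forest (
        name varchar(128) primary key
    );
    create table Made_Of_Trees (
        id number primary key,
        forest_name varchar(128),
        foreign key (forest_name) references Forest(name)
    );
""")


def accepted(sql):
    """Run one statement and report whether the constraints allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "forest Sherwood": accepted("insert into Forest values ('Sherwood')"),
    "forest Bare, with no trees": accepted("insert into Forest values ('Bare')"),
    "tree 1 in Sherwood": accepted("insert into Made_Of_Trees values (1, 'Sherwood')"),
    "tree 2 in no forest": accepted("insert into Made_Of_Trees values (2, null)"),
    "tree 1 again, in Bare": accepted("insert into Made_Of_Trees values (1, 'Bare')"),
    "tree 3 in a missing forest": accepted("insert into Made_Of_Trees values (3, 'Nowhere')"),
}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

The treeless forest is accepted, which is exactly the constraint the tables cannot express, while placing tree 1 in a second forest is rejected by the primary key on id.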

(b) 5 pts.

[ER diagram: entity set PERSON (ssn, name, dob); relationship set mother_of, in which PERSON participates in two roles, child and mother.]

(The line annotated with “child” has an arrowhead and is not bold.)

Solution:

    create table Person (
        ssn char(11) primary key,
        name varchar(128),
        dob date
    );

    create table Mother_Of (
        child_ssn char(11) primary key,
        mother_ssn char(11),
        foreign key (child_ssn) references Person(ssn),
        foreign key (mother_ssn) references Person(ssn)
    );

The ER diagram specifies that a person has at most one mother (key constraint). The key constraint is implemented by designating child_ssn as primary key in the relation Mother_Of.

(c) 5 pts.

[ER diagram: entity sets MONARCH (name) and COUNTRY (name); relationship set rules.]

(The line connecting MONARCH and rules is bold and has an arrowhead; the line connecting rules and COUNTRY is not bold and has an arrowhead.)

Solution: The ER diagram specifies a one-to-one relationship set rules. In other words, it specifies that each monarch rules exactly one country (key and participation constraint), and that each country is ruled by at most one monarch (key constraint). We can encode this ER diagram in the following two relational tables.

    create table Country (
        name varchar(128) primary key
    );

    create table Monarch_Rules (
        name varchar(128) primary key,
        country_name varchar(128) not null unique,
        foreign key (country_name) references Country(name)
    );

The constraint that each monarch rules exactly one country is modeled by the primary key in Monarch_Rules, and by the not null constraint on country_name in Monarch_Rules. The constraint that each country is ruled by at most one monarch is captured by the combination of a foreign key linking together Monarch_Rules and Country, and by the unique constraint on country_name in Monarch_Rules. Countries that are not ruled by monarchs appear in Country but not in Monarch_Rules.

(d) 5 pts.

[ER diagram: entity sets ANIMALS (genus, species) and HABITATS (name); relationship set inhabit, with attribute population.]

(Neither of the lines is bold, and neither has an arrowhead.)

Solution: These two entity sets and the many-to-many relationship set inhabit are modeled with 3 create table statements.

    create table Animals (
        genus varchar(128),
        species varchar(128),
        primary key (genus, species)
    );

    create table Habitats (
        name varchar(128) primary key
    );

    create table Inhabit (
        genus varchar(128),
        species varchar(128),
        habitat varchar(128),
        population number,
        primary key (genus, species, habitat),
        foreign key (genus, species) references Animals(genus, species),
        foreign key (habitat) references Habitats(name)
    );
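The many-to-many behaviour of Inhabit can be exercised with a few rows. This is a minimal sketch, again assuming SQLite via Python's sqlite3 module with foreign keys enabled; the animals, habitats, and population figures are invented sample data, and `accepted` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    create table Animals (
        genus varchar(128),
        species varchar(128),
        primary key (genus, species)
    );
    create table Habitats (
        name varchar(128) primary key
    );
    create table Inhabit (
        genus varchar(128),
        species varchar(128),
        habitat varchar(128),
        population number,
        primary key (genus, species, habitat),
        foreign key (genus, species) references Animals(genus, species),
        foreign key (habitat) references Habitats(name)
    );
    insert into Animals values ('Panthera', 'leo'), ('Panthera', 'tigris');
    insert into Habitats values ('savanna'), ('forest');
""")


def accepted(sql):
    """Run one statement and report whether the constraints allowed it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "leo in savanna": accepted("insert into Inhabit values ('Panthera', 'leo', 'savanna', 100)"),
    "leo in forest": accepted("insert into Inhabit values ('Panthera', 'leo', 'forest', 5)"),
    "tigris in forest": accepted("insert into Inhabit values ('Panthera', 'tigris', 'forest', 40)"),
    "leo in savanna again": accepted("insert into Inhabit values ('Panthera', 'leo', 'savanna', 7)"),
    "unknown animal": accepted("insert into Inhabit values ('Felis', 'catus', 'forest', 1)"),
}

for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

One animal may appear in several habitats and one habitat may hold several animals; only a repeated (genus, species, habitat) combination or a reference to an unknown animal is rejected.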