Databases

Table of Contents

The relational model

Concepts: schema, state

Database schema tree

Data types

all table entries are data values that conform to some selection of data types.

example data types:

domain val(D) of a type D is the set of possible values for that type

with SQL, we can create our own application-specific domains (new data types). examples:

Relation schema

relation schema s (schema of single relation) defines:

written as: s = { A₁ : D₁, ..., An : Dn }

SQL notation:

CREATE TABLE exercises
    (CAT    CHAR(1),
     ENO    NUMERIC(2),
     TOPIC  VARCHAR(40),
     MAXPT  NUMERIC(2))

a relational database schema S defines:

so S = ({R₁, ..., Rm}, sch, C)

Tuples

used to formalize table rows. so e.g. in table exercises: (’H’, 1, ’Rel.Alg.’, 10)

Database states

let a database schema ({R₁, ..., Rm}, schema, C) be given.

a database state I for this database schema defines for every relation name Ri to a finite set of tuples I(Ri) w.r.t the relation schema Schema(Ri)

Null values

relational model allows missing attribute values.

these are represented by NULL. not the same as 0 or an empty string.

used to model scenarios like:

without null values, you'd have to split a relation into specific relations, like student_with_email and student_without_email. or the users would have to make up a fake value, which would not be uniform and you'd have no idea how to query.

problems:

so declaring many attributes as NOT NULL simplifies the program, but only do that if you're sure that there's always a value.

Constraints

the database should be an image of the relevant subset of the real world. plain definition of tables often allows too many database states.

Integrity constraints (IC): conditions which every database state has to satisfy, restricting the set of possible database states.

In CREATE TABLE, possible constraints:

Why specify constraints:

Keys

key of relation R is attribute A that uniquely identifies tuples in R. this refers to all possible database states, not just the current one.

key constraint is satisfied in database state iff all tuples have different values for A.

once something is declared as a key, the DBMS refuses insertion of tuples with duplicate key values.. this refers to all possible database states, not just the current one.

keys can consist of several attributes, then they are composite keys. if columns A,B form a composite key, there cannot be two tuples t ≠ u which agree in both attributes. keys should never change.

a key constraint becomes weaker (less restrictive) if attributes are added to it.

a key is minimal if no attribute can be removed without destroying the key constraint.

a relation may have more than one minimal key. one is the primary key, which cannot be null. other keys are alternate/secondary.

choosing a primary key: single simple attribute, never updated

Foreign keys

use key attributes to uniquely reference a tuple, like a pointer. denoted with arrows:

RESULTS (SID → STUDENTS,
        (CAT, ENO) → EXERCISES,
        POINTS)
        STUDENTS (SID,FIRST,LAST,EMAIL)
        EXERCISES (CAT,ENO,TOPIC,MAXPT)

to refer from relation R to tuples of S, add primary key attributes of S to attributes of R. only stable if the logical 'address' of a tuple does not change.

implements a one-to-many relationship.

an existence guarantee is needed.

foreign key constraints in SQL:

CREATE TABLE RESULTS (
    ...
    FOREIGN KEY (SID) REFERENCES STUDENTS(SID)
)

this ensures that every referenced row exists, which ensures referential integrity of the database.

these operations violate the constraints:

only keys may be referenced. a table with composite key must be referenced by composite key that has same number of attributes.