Superkey
In the relational data model a superkey is a set of attributes that uniquely identifies each tuple of a relation.[1][2] Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, non-key attributes are functionally dependent on the superkey.
The set of all attributes is always a superkey (the trivial superkey). Tuples in a relation are by definition unique, with duplicates removed after each operation, so the set of all attributes is always uniquely valued for every tuple. A candidate key (or minimal superkey) is a superkey from which no attribute can be removed without losing the uniqueness property; that is, no proper subset of it is also a superkey.[3]
For example, in an employee schema with attributes employeeID, name, job, and departmentID, if employeeID values are unique then employeeID combined with any or all of the other attributes can uniquely identify tuples in the table. Each combination, {employeeID}, {employeeID, name}, {employeeID, name, job}, and so on is a superkey. {employeeID} is a candidate key, since no proper subset of its attributes is also a superkey. {employeeID, name, job, departmentID} is the trivial superkey.
If attribute set K is a superkey of relation R, then at all times it is the case that the projection of R over K has the same cardinality as R itself.
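The cardinality property can be checked mechanically on a single relation instance. The following is a minimal sketch, not a definitive implementation: the helper names project and has_same_cardinality and the sample employee rows are hypothetical, chosen only to match the attribute names of the employee schema above.

```python
def project(rows, attrs):
    """Project dict-shaped rows onto attrs, dropping duplicates (set semantics)."""
    ordered = sorted(attrs)  # fixed attribute order so tuples are comparable
    return {tuple(row[a] for a in ordered) for row in rows}


def has_same_cardinality(rows, key_attrs):
    """True if the projection over key_attrs has as many tuples as the relation."""
    all_attrs = set().union(*(row.keys() for row in rows))
    return len(project(rows, key_attrs)) == len(project(rows, all_attrs))


# Hypothetical sample data, for illustration only.
employees = [
    {"employeeID": 1, "name": "Ada", "job": "Engineer", "departmentID": 10},
    {"employeeID": 2, "name": "Ada", "job": "Analyst", "departmentID": 10},
    {"employeeID": 3, "name": "Grace", "job": "Engineer", "departmentID": 20},
]

print(has_same_cardinality(employees, {"employeeID"}))          # True
print(has_same_cardinality(employees, {"employeeID", "name"}))  # True
print(has_same_cardinality(employees, {"name"}))                # False
```

A True result only shows that the attribute set is consistent with being a superkey for this particular instance; it does not establish that the constraint holds for the schema.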
Example
Monarch Name | Monarch Number | Royal House |
---|---|---|
Edward | II | Plantagenet |
Edward | III | Plantagenet |
Richard | III | Plantagenet |
Henry | IV | Lancaster |
First, list out all the sets of attributes:
- {}
- {Monarch Name}
- {Monarch Number}
- {Royal House}
- {Monarch Name, Monarch Number}
- {Monarch Name, Royal House}
- {Monarch Number, Royal House}
- {Monarch Name, Monarch Number, Royal House}
Second, eliminate every set that does not satisfy the superkey requirement of uniquely identifying each tuple. For example, {Monarch Name, Royal House} cannot be a superkey because two distinct tuples share the same attribute values (Edward, Plantagenet):
- (Edward, II, Plantagenet)
- (Edward, III, Plantagenet)
Finally, after elimination, the remaining sets of attributes are the only possible superkeys in this example:
- {Monarch Name, Monarch Number} (this is also the candidate key)
- {Monarch Name, Monarch Number, Royal House}
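The enumerate-and-eliminate procedure can be sketched in a few lines of Python. This is an illustrative sketch under the assumption that only the four tuples of the table are considered; the function names all_attribute_sets and identifies_every_row are hypothetical.

```python
from itertools import combinations

ATTRS = ("Monarch Name", "Monarch Number", "Royal House")
ROWS = [
    ("Edward", "II", "Plantagenet"),
    ("Edward", "III", "Plantagenet"),
    ("Richard", "III", "Plantagenet"),
    ("Henry", "IV", "Lancaster"),
]


def all_attribute_sets(attrs):
    """Yield every subset of attrs, from the empty set up to the full set."""
    for size in range(len(attrs) + 1):
        for combo in combinations(attrs, size):
            yield frozenset(combo)


def identifies_every_row(rows, attrs, subset):
    """True if no two distinct rows agree on all attributes in subset."""
    idx = [i for i, a in enumerate(attrs) if a in subset]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(set(rows))


# Step 2: keep only the attribute sets that pass the uniqueness test.
superkeys = [s for s in all_attribute_sets(ATTRS)
             if identifies_every_row(ROWS, ATTRS, s)]

# A candidate key is a superkey with no proper subset that is also a superkey.
candidate_keys = [s for s in superkeys
                  if not any(other < s for other in superkeys)]

for s in superkeys:
    print(sorted(s))
```

On this instance the sketch keeps exactly the two sets listed above, and candidate_keys contains only {Monarch Name, Monarch Number}. The enumeration visits all 2^n subsets, so it is practical only for small attribute sets.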
In reality, superkeys cannot be determined simply by examining one set of tuples in a relation. A superkey defines a functional dependency constraint of a relation schema which must hold for all possible instance relations of that relation schema.
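At the schema level, a superkey test is usually phrased in terms of declared functional dependencies rather than sample tuples. The sketch below uses the standard attribute-closure computation: a set of attributes is a superkey if its closure under the dependencies covers the whole schema. The dependency employeeID → {name, job, departmentID} is an assumption based on the employee example earlier in the article, and the function names closure and is_superkey are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of frozensets, read as lhs -> rhs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its left side is fully contained.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """True if attrs determines every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


# Assumed schema and dependency, taken from the employee example.
SCHEMA = {"employeeID", "name", "job", "departmentID"}
FDS = [(frozenset({"employeeID"}), frozenset({"name", "job", "departmentID"}))]

print(is_superkey({"employeeID"}, SCHEMA, FDS))          # True
print(is_superkey({"employeeID", "name"}, SCHEMA, FDS))  # True
print(is_superkey({"name", "job"}, SCHEMA, FDS))         # False
```

Because the test depends only on the declared dependencies, its answer holds for every legal instance of the schema, which is the distinction the paragraph above draws.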
References
- Date, Christopher (2015). "Codd's First Relational Papers: A Critical Analysis" (PDF). warwick.ac.uk. Retrieved 2020-01-04.
Note that the extract allows a "relation" to have any number of primary keys, and moreover that such keys are allowed to be "redundant" (better: reducible). In other words, what the paper calls a primary key is what later (and better) became known as a superkey, and what the paper calls a nonredundant (better: irreducible) primary key is what later became known as a candidate key or (better) just a "key".
- Introduction to Database Management Systems. Tata McGraw-Hill. 2005. p. 77. ISBN 9780070591196.
no two tuples in any legal relation
- Saiedian, H. (1996-02-01). "An Efficient Algorithm to Compute the Candidate Keys of a Relational Database Schema". The Computer Journal. 39 (2): 124–132. doi:10.1093/comjnl/39.2.124. ISSN 0010-4620.
Further reading
- Silberschatz, Abraham (2011). Database System Concepts (6th ed.). McGraw-Hill. pp. 45–46. ISBN 978-0-07-352332-3.
External links
- Relation Database terms of reference, Keys: An overview of the different types of keys in an RDBMS