One place for hosting & domains

      Constraints

      How To Use Constraints in SQL


      Introduction

      When designing an SQL database, there may be cases where you want to impose restrictions on what data can be added to certain columns in a table. SQL makes this possible through the use of constraints. After applying a constraint to a column or table, any attempts to add data to the column or table that doesn’t align with the constraint will fail.

      Different SQL implementations have their own unique ways of dealing with constraints. This guide provides an overview of the syntax that many database management systems use to manage constraints, using MySQL in examples throughout.

      Prerequisites

      In order to follow this guide, you will need a computer running some type of relational database management system (RDBMS) that uses SQL. The instructions and examples in this guide were validated using the following environment:

      Note: Please note that many RDBMSs use their own unique implementations of SQL. Although the commands outlined in this tutorial will work on most RDBMSs, the exact syntax or may differ if you test them on a system other than MySQL.

      It will also be helpful to have a general understanding of what SQL constraints are and how they function. For an overview of this concept, you can follow our article on Understanding SQL Constraints.

      You’ll also need a database you can use to practice creating tables with constraints. If you don’t have such a testing database, see the following Connecting to MySQL and Setting up a Sample Database section for details on how to create one.

      Connecting to and Setting up a Sample Database

      In case your SQL database system runs on a remote server, SSH into your server from your local machine:

      Then open up the MySQL server prompt, replacing sammy with the name of your MySQL user account:

      Create a database named constraintsDB:

      • CREATE DATABASE constraintsDB;

      If the database was created successfully, you’ll receive output like this:

      Output

      Query OK, 1 row affected (0.01 sec)

      To select the constraintsDB database, run the following USE statement:

      Output

      Database changed

      With that, you’re ready to follow the rest of the guide and begin learning about how to create and manage constraints in SQL.

      Creating Tables with Constraints

      Typically, you define constraints during a table’s creation. The following CREATE TABLE syntax creates a table named employeeInfo with three columns: empId, empName, and empPhoneNum. The statement also applies a UNIQUE constraint to the empId column. This will prevent any rows in the table must from having identical values in this column:

      • CREATE TABLE employeeInfo (
      • empId int UNIQUE,
      • empName varchar(30),
      • empPhoneNum int
      • );

      This statement defines the UNIQUE constraint immediately after the empId column, meaning that the constraint applies only to that column. If you were to try adding any data to this table, the DBMS will check the existing contents of only the empId to ensure that any new values you add to empId are in fact unique. This is what’s referred to as a colum-level constraint.

      You can also apply the constraint outside of the column definitions. The following example creates a table named racersInfo with three columns: racerId, racerName, and finish. Below the column definitions, it also applies a CHECK constraint to finish column to ensure that every racer has a finish greater than or equal to 1 (since no racer can place below first place):

      • CREATE TABLE racersInfo (
      • racerId int,
      • finish int,
      • racerName varchar(30),
      • CHECK (finish > 0)
      • );

      Because the constraint is applied outside of any individual column definition, you need to specify the name of the columns you want the constraint to apply to in parentheses. Any time you specify a constraint outside of the definition of a single column, it’s known as a table-level constraint. Column-level constraints only apply to individual columns, but table constraints like this can apply to or reference multiple columns.

      Naming Constraints

      Whenever you define a constraint, your RDBMS generates a name for it automatically. This name is used to reference the constraint in error messages in commands used to manage constraints.

      Sometimes, though, it’s convenient for database administrators to provide their own name for a constraint. Automatically-generated constraint names generally aren’t descriptive, so providing a name yourself can help you to remember a constraint’s purpose.

      To name a constraint, precede the constraint type with the CONSTRAINT keyword followed by the name of your choice. This example statement recreates the racersInfo table, renaming it to newRacersInfo and adding noNegativeFinish as the name for the CHECK constraint:

      • CREATE TABLE newRacersInfo (
      • racerId int,
      • finish int,
      • racerName varchar(30),
      • CONSTRAINT noNegativeFinish
      • CHECK (finish >= 1)
      • );

      Note: If you don’t set a name for a constraint, or you do but forget it later on, you’ll likely be able to find the name by consulting your database management system’s information schemas. Many modern database systems and clients even provide a shortcut to display internal CREATE statements that indicate a constraint’s name.

      Here are links to the official documentation for the relevant shortcut for MySQL and PostgreSQL:

      • MySQL: MySQL includes the SHOW CREATE TABLE statement, which returns the entire CREATE TABLE statement that created the named table:
      • SHOW CREATE TABLE table_name;
      • PostgreSQL: The PostgreSQL client psql has a number of options you can use to reveal information about a given table. The d option returns metadata of the named table:

      Managing Constraints

      In MySQL, you can add constraints to existing tables as well as delete them with ALTER TABLE statements.

      For example, the following command adds a UNIQUE constraint to the empName column in the employeeInfo table created previously:

      • ALTER TABLE employeeInfo ADD UNIQUE (empName);

      When adding a constraint to an existing table, you can also use the CONSTRAINT keyword to provide a name to identify the constraint. This example adds a UNIQUE constraint named uID to the racerId column from the racersInfo table created previously:

      • ALTER TABLE racersInfo ADD CONSTRAINT uID UNIQUE (racerId);

      If, before adding a constraint like this, you inserted any records that would violate the condition of the new constraint, the ALTER TABLE statement will fail.

      To delete a constraint, use the DROP CONSTRAINT syntax, followed by the name of the constraint you want to delete. This command deletes the racersPK constraint created in the previous command:

      • ALTER TABLE racersInfo DROP CONSTRAINT uID;

      Conclusion

      By reading this guide, you learned how to add and delete constraints to columns and tables using SQL. While the commands shown here should work on most relational databases, be aware that every SQL database uses its own unique implementation of the language. You should consult your DBMS’s official documentation for a more complete description of each command and their full sets of options.

      If you’d like to learn more about working with SQL, we encourage you to check out the other tutorials in this series on How To Use SQL.



      Source link

      Understanding SQL Constraints


      Introduction

      When designing a database, there may be times when you want to put limits on what data is allowed in certain columns. For example, if you’re creating a table that will hold information on skyscrapers, you may want the column holding each building’s height to prohibit negative values.

      Relational database management systems (RDBMSs) allow you to control what data gets added to a table with constraints. A constraint is a special rule that applies to one or more columns — or to an entire table — that restricts what changes can be made to a table’s data, whether through an INSERT, UPDATE, or DELETE statement.

      This article will review in detail what constraints are and how they’re used in RDBMSs. It will also walk through each of the five constraints defined in the SQL standard and explain their respective functions.

      What Are Constraints?

      In SQL, a constraint is any rule applied to a column or table that limits what data can be entered into it. Any time you attempt to perform an operation that changes that data held in a table — such as an INSERT, UPDATE, or DELETE statement — the RDBMS will test whether that data violates any existing constraints and, if so, return an error.

      Database administrators often rely on constraints to ensure that a database follows a set of defined business rules. In the context of a database, a business rule is any policy or procedure that a business or other organization follows and that its data must adhere to as well. For instance, say you’re building a database that will catalog a client’s store inventory. If the client specifies that each product record should have a unique identification number, you could create a column with a UNIQUE constraint that will ensure no two entries in that column are the same.

      Constraints are also helpful with maintaining data integrity. Data integrity is a broad term that’s often used to describe the overall accuracy, consistency, and rationality of data held in a database, based on its particular use case. Tables in a database are often closely related, with columns in one table being dependent on the values in another. Because data entry is often prone to human error constraints are useful in cases like this, as they can help ensure that no incorrectly entered data could impact such relationships and thus harm the database’s data integrity.

      Imagine you’re designing a database with two tables: one for listing current students at a school and another for listing members of that school’s basketball team. You could apply a FOREIGN KEY constraint to a column in the basketball team table which refers to a column in the school table. This will establish a relationship between the two tables by requiring any entry to the team table to refer to an existing entry in the students table.

      Users define constraints when they first create a table, or they can add them later on with an ALTER TABLE statement as long as it doesn’t conflict with any data already in the table. When you create a constraint, the database system will generate a name for it automatically, but in most SQL implementations you can add a custom name for any constraint. These names are used to refer to constraints in ALTER TABLE statements when changing or removing them.

      The SQL standard formally defines just five constraints:

      • PRIMARY KEY
      • FOREIGN KEY
      • UNIQUE
      • CHECK
      • NOT NULL

      Note: Many RDBMSs include the DEFAULT keyword, which is used to define a default value for a column other than NULL if no value is specified when inserting a row. The documentation of some of these database management systems refer to DEFAULT as a constraint, as their implementations of SQL use a DEFAULT syntax similar to that of constraints like UNIQUE or CHECK. However, DEFAULT technically is not a constraint since it doesn’t restrict what data can be entered into a column.

      Now that you have a general understanding of how constraints are used, let’s take a closer look at each of these five constraints.

      PRIMARY KEY

      The PRIMARY KEY constraint requires every entry in the given column to be both unique and not NULL, and allows you to use that column to identify each individual row in the table

      In the relational model, a key is a column or set of columns in a table in which every value is guaranteed to be unique and to not contain any NULL values. A primary key is a special key whose values are used to identify individual rows in a table, and the column or columns that comprise the primary key can be used to identify the table throughout the rest of the database.

      This is an important aspect of relational databases: with a primary key, users don’t need to know where their data is physically stored on a machine and their DBMS can keep track of each record and return them on an ad hoc basis. In turn, this means that records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.

      You can create a primary key in SQL with the PRIMARY KEY constraint, which is essentially a combination of the UNIQUE and NOT NULL constraints. After defining a primary key, the DBMS will automatically create an index associated with it. An index is a database structure that helps to retrieve data from a table more quickly. Similar to an index in a textbook, queries only have to review entries from the indexed column to find the associated values. This is what allows the primary key to act as an identifier for each row in the table.

      A table can only have one primary key but, like regular keys, a primary key can comprise more than one column. With that said, a defining feature of primary keys is that they use only the minimal set of attributes needed to uniquely identify each row in a table. To illustrate this idea, imagine a table that stores information about students at a school using the following three columns:

      • studentID: used to hold each student’s unique identification number
      • firstName: used to hold each student’s first name
      • lastName: used to hold each student’s last name

      It’s possible that some students at the school could share a first name, making the firstName column a poor choice of a primary key. The same is true for the lastName column. A primary key consisting of both the firstName and lastName columns could work, but there’s still a possibility that two students could share a first and last name.

      A primary key consisting of the studentID and either the firstName or lastName columns could work, but since each student’s identification number is already known to be unique, including either of the name columns in the primary key would be superfluous. So in this case the minimal set of attributes that can identify each row, and would thus be a good choice for the table’s primary key, is just the studentID column on its own.

      If a key is made up of observable application data (that is, data that represents real world entities, events, or attributes) it’s referred to as a natural key. If the key is generated internally and doesn’t represent anything outside the database, it’s known as a surrogate or synthetic key. Some database systems recommend against using natural keys, as even seemingly constant data points can change in unpredictable ways.

      FOREIGN KEY

      The FOREIGN KEY constraint requires that every entry in the given column must already exist in a specific column from another table.

      If you have two tables that you’d like to associate with one another, one way you can do so is by defining a foreign key with the FOREIGN KEY constraint. A foreign key is a column in one table (the “child” table) whose values come from a key in another table (the “parent”). This is a way to express a relationship between two tables: the FOREIGN KEY constraint requires that values in the column on which it applies must already exist in the column that it references.

      The following diagram highlights such a relationship between two tables: one used to record information about employees at a company and another used to track the company’s sales. In this example, the primary key of the EMPLOYEES table is referenced by the foreign key of the SALES table:

      Diagram example of how the EMPLOYEE table's primary key acts as the SALES table's foreign key

      If you try to add a record to the child table and the value entered into the foreign key column doesn’t exist in the parent table’s primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.

      Oftentimes, a table’s foreign key is the parent table’s primary key, but this isn’t always the case. In most RDBMSs, any column in the parent table that has a UNIQUE or PRIMARY KEY constraint applied to it can be referenced by the child table’s foreign key.

      UNIQUE

      The UNIQUE constraint prohibits any duplicate values from being added to the given column.

      As its name implies, a UNIQUE constraint requires every entry in the given column to be a unique value. Any attempt to add a value that already appears in the column will result in an error.

      UNIQUE constraints are useful for enforcing one-to-one relationships between tables. As mentioned previously, you can establish a relationship between two tables with a foreign key, but there are multiple kinds of relationships that can exist between tables:

      • one-to-one: Two tables are said to have a one-to-one relationship if rows in the parent table are related to one and only one row in the child table
      • one-to-many: In a many-to-any relationship, a row in the parent table can relate to multiple rows in the child table, but each row in the child table can only relate to one row in the parent
      • many-to-many: If rows in the parent table can relate to multiple rows in the child table, and vice versa, the two are said to have a many-to-many relationship

      By adding a UNIQUE constraint to a column on which a FOREIGN KEY constraint has been applied, you can ensure that each entry in the parent table appears only once in the child, thereby establishing a one-to-one relationship between the two tables.

      Note that you can define UNIQUE constraints at the table level as well as the column level. When defined at the table level, a UNIQUE constraint can apply to more than one column. In cases like this, each column included in the constraint can have duplicate values but every row must have a unique combination of values in the constrained columns.

      CHECK

      A CHECK constraint defines a requirement for a column, known as a predicate, that every value entered into it must meet.

      CHECK constraint predicates are written in the form of an expression that can evaluate to either TRUE, FALSE, or potentially UNKNOWN. If you attempt to enter a value into a column with a CHECK constraint and the value causes the predicate to evaluate to TRUE or UNKNOWN (which happens for NULL values), the operation will succeed. However, if the expression resolves to FALSE, it will fail.

      CHECK predicates often rely on a mathematical comparison operator (like <, >, <=, OR >=) to limit the range of data allowed into the given column. For instance, one common use for CHECK constraints is to prevent certain columns holding negative values in cases where a negative value wouldn’t make sense, as in the following example.

      This CREATE TABLE statement creates a table named productInfo with columns for each product’s name, identification number, and price. Because it wouldn’t make sense for a product to have a negative price, this statement imposes a CHECK constraint on the price column to ensure that it only contains positive values:

      • CREATE TABLE productInfo (
      • productID int,
      • name varchar(30),
      • price decimal(4,2)
      • CHECK (price > 0)
      • );

      Not every CHECK predicate must use a mathematical comparison operator. Typically, you can use any SQL operator that can evaluate to TRUE, FALSE, or UNKNOWN in a CHECK predicate, including LIKE, BETWEEN, IS NOT NULL, and others. Some SQL implementations, but not all, even allow you to include a subquery in a CHECK predicate. Be aware, though, that most implementations do not allow you to reference another table in a predicate.

      NOT NULL

      The NOT NULL constraint prohibits any NULL values from being added to the given column.

      In most implementations of SQL, if you add a row of data but don’t specify a value for a certain column, the database system will by default represent the missing data as NULL. In SQL, NULL is a special keyword used to represent an unknown, missing, or otherwise unspecified value. However, NULL is not a value itself but instead the state of an unknown value.

      To illustrate this difference, imagine a table used to track clients at a talent agency that has columns for each client’s first and last names. If a client goes by a mononym — like “Cher”, “Usher”, or “Beyoncé” — the database administrator might only enter the mononym in the first name column, causing the DBMS to enter NULL in the last name column. The database doesn’t consider the client’s last name to literally be “Null.” It just means that the value for that row’s last name column is unknown or the field doesn’t apply for that particular record.

      As its name implies, the NOT NULL constraint prevents any values in the given column from being NULL. This means that for any column with a NOT NULL constraint, you must specify a value for it when inserting a new row. Otherwise, the INSERT operation will fail.

      Conclusion

      Constraints are essential tools for anyone looking to design a database with a high level of data integrity and security. By limiting what data gets entered into a column, you can ensure that relationships between tables will be maintained correctly and that the database adheres to the business rules that define its purpose.

      For more detailed information on how to create and manage SQL constraints, you can review our guide on How To Use Constraints in SQL. If you’d like to learn more about SQL in general, we encourage you to check out our series on How To Use SQL.



      Source link