Select (SQL)

This is an old revision of this page, as edited by Troels Arvin at 12:37, 3 February 2006 (oops in wikilinks). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A SELECT statement in SQL returns a result set of records from one or more tables.

It is used to retrieve zero or more rows from one or more tables in a database. In most applications, SELECT is the most commonly used Data Manipulation Language (DML) command. A SELECT query describes the desired result set, but does not specify which physical operations must be executed to produce it. Translating the query into an optimal "query plan" is left to the database system, more specifically to the query optimiser.
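The separation between the query and its plan can be observed in practice. The following is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as the database system; EXPLAIN QUERY PLAN is SQLite-specific, and the index name idx_t_c1 is made up for this illustration. The same query text returns the same rows before and after an index is added, while the reported plan changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, "a"), (2, "b")])

query = "SELECT * FROM T WHERE C1 = 1"


def plan_details(sql):
    # SQLite-specific: the last column of each EXPLAIN QUERY PLAN row
    # is a human-readable description of one step of the plan.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


rows_before = conn.execute(query).fetchall()
plan_before = plan_details(query)

# Adding an index (hypothetical name) changes how the rows can be found,
# not which rows the query describes.
conn.execute("CREATE INDEX idx_t_c1 ON T (C1)")

rows_after = conn.execute(query).fetchall()
plan_after = plan_details(query)

print(rows_before == rows_after)  # same result set
print(plan_before)                # plan without the index
print(plan_after)                 # plan once the index exists
```

The exact wording of the plan descriptions differs between SQLite versions, so it is printed rather than relied upon here.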

Commonly available keywords related to SELECT include:

  • WHERE – used to identify which rows are to be retrieved, or passed on to GROUP BY.
  • GROUP BY – used to combine rows with related values into elements of a smaller set of rows.
  • HAVING – used to identify which of the rows produced by a GROUP BY are to be retrieved.
  • ORDER BY – used to identify which columns are used to sort the resulting data.
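As a minimal sketch of how the four keywords combine in one statement, the following runs a query through Python's standard sqlite3 module. The orders table, its columns and its data are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only to illustrate the keywords.
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("bob", 2), ("cy", 50), ("cy", -1)],
)

clause_rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount > 0          -- filters individual rows before grouping
    GROUP BY customer         -- combines the remaining rows per customer
    HAVING SUM(amount) >= 10  -- filters the grouped rows
    ORDER BY total DESC       -- sorts the final result
    """
).fetchall()

print(clause_rows)  # [('cy', 50), ('ann', 40)]
```

The row ('cy', -1) is dropped by WHERE before grouping, and the group for bob (total 7) is dropped by HAVING after grouping.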

Examples

Table "T" (the same in every example):

C1  C2
1   a
2   b

Query: SELECT * FROM T;
Result:
C1  C2
1   a
2   b

Query: SELECT C1 FROM T;
Result:
C1
1
2

Query: SELECT * FROM T WHERE C1 = 1;
Result:
C1  C2
1   a

Query: SELECT * FROM T ORDER BY C1 DESC;
Result:
C1  C2
2   b
1   a

Given a table T, the query SELECT * FROM T; will result in all the elements of all the rows of the table being shown.

With the same table, the query SELECT C1 FROM T; will result in the elements from the column C1 of all the rows of the table being shown; in relational algebra terms, a projection will be performed.

With the same table, the query SELECT * FROM T WHERE C1 = 1; will result in all the elements of all the rows where the value of column C1 is 1 being shown; in relational algebra terms, a selection will be performed, because of the WHERE keyword.

The last query SELECT * FROM T ORDER BY C1 DESC; will output the same rows as the first query, but in descending order of C1, because of the ORDER BY keyword with C1 as the sort key. This query has no WHERE keyword, so every row is returned. Multiple ORDER BY items can be specified, separated by commas (e.g. ORDER BY C1 ASC, C2 DESC), to further refine sorting.
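The four example queries can be tried out with a short script. This is a sketch using Python's standard sqlite3 module with an in-memory database. The second table U is hypothetical; it is added only because T has no repeated C1 values, so it cannot show the effect of a second ORDER BY item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C1 INTEGER, C2 TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [(1, "a"), (2, "b")])

# All columns of all rows.
all_rows = conn.execute("SELECT * FROM T").fetchall()

# Projection: only column C1.
projection = conn.execute("SELECT C1 FROM T").fetchall()

# Selection: only the rows where C1 is 1.
selection = conn.execute("SELECT * FROM T WHERE C1 = 1").fetchall()

# Same rows as the first query, in descending order of C1.
descending = conn.execute("SELECT * FROM T ORDER BY C1 DESC").fetchall()

# Hypothetical table U with repeated C1 values, to show two sort keys.
conn.execute("CREATE TABLE U (C1 INTEGER, C2 TEXT)")
conn.executemany(
    "INSERT INTO U VALUES (?, ?)", [(2, "a"), (1, "a"), (2, "b"), (1, "b")]
)
two_keys = conn.execute("SELECT * FROM U ORDER BY C1 ASC, C2 DESC").fetchall()

print(selection)   # [(1, 'a')]
print(descending)  # [(2, 'b'), (1, 'a')]
print(two_keys)    # [(1, 'b'), (1, 'a'), (2, 'b'), (2, 'a')]
```

Without ORDER BY, SQL does not guarantee any particular row order, which is why only the ordered and single-row results are shown with their expected output.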

Limiting result rows

In ISO SQL, result sets may be limited by using

  • cursors, or
  • window functions introduced into the SELECT statement.

Several window functions exist. ROW_NUMBER() OVER may be used for a simple limit on the returned rows. E.g., to return no more than ten rows:

SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
    columns
  FROM tablename
) AS foo
WHERE rownumber <= 10

The above code is dangerous, because more than one row might qualify for the 10th position: if several rows tie on the sort key at that point, which of them is returned is arbitrary.

The RANK() OVER window function acts like ROW_NUMBER, but may return more than n rows in case of tie conditions. E.g., to return the top-10 youngest persons:

SELECT * FROM (
  SELECT
    RANK() OVER (ORDER BY age ASC) AS ranking,
    person_id,
    person_name,
    age
  FROM person
) AS foo
WHERE ranking <= 10

The above code could return more than ten rows, e.g. if two or more people tie for the 10th position, or if the eleven youngest people are all of the same age.
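The difference between the two window functions shows up as soon as there is a tie at the cut-off. The sketch below assumes SQLite 3.25 or later (the first version with window functions), accessed through Python's standard sqlite3 module. It uses a limit of three instead of ten, and a small hypothetical person table, to keep the data short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: Cat and Dan tie on age at the third position.
conn.execute("CREATE TABLE person (person_id INTEGER, person_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Ann", 20), (2, "Bob", 21), (3, "Cat", 22), (4, "Dan", 22), (5, "Eve", 30)],
)

# ROW_NUMBER numbers the rows 1, 2, 3, ... so exactly three rows pass the
# filter, but whether Cat or Dan is the third one is not determined.
by_row_number = conn.execute(
    """
    SELECT person_name FROM (
      SELECT ROW_NUMBER() OVER (ORDER BY age ASC) AS rownumber, person_name
      FROM person
    ) AS foo
    WHERE rownumber <= 3
    """
).fetchall()

# RANK gives tied rows the same rank (1, 2, 3, 3, 5 here), so both Cat and
# Dan pass the filter and four rows come back.
by_rank = conn.execute(
    """
    SELECT person_name FROM (
      SELECT RANK() OVER (ORDER BY age ASC) AS ranking, person_name
      FROM person
    ) AS foo
    WHERE ranking <= 3
    """
).fetchall()

print(len(by_row_number))  # 3
print(len(by_rank))        # 4
```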

Not all DBMSes support the window functions mentioned above; with those that do not, non-standard syntax has to be used. Below, variants of the simple limit query for different DBMSes are listed:

Vendor      Limit syntax
DB2         (Supports the standard)
Informix    SELECT FIRST 10 * FROM T
Microsoft   (Supports the standard, since SQL Server 2005)
MySQL       SELECT * FROM T LIMIT 10
PostgreSQL  SELECT * FROM T LIMIT 10
Oracle      (Supports the standard)
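The LIMIT form can be tried without installing a database server. The following is a minimal sketch using SQLite through Python's standard sqlite3 module; SQLite is not one of the vendors listed above, but it accepts the same LIMIT clause as shown for MySQL and PostgreSQL. The single-column table and its 25 rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical contents: 25 rows numbered 1 to 25.
conn.execute("CREATE TABLE T (C1 INTEGER)")
conn.executemany("INSERT INTO T VALUES (?)", [(i,) for i in range(1, 26)])

# At most ten rows; without ORDER BY, which ten is up to the DBMS.
limited = conn.execute("SELECT * FROM T LIMIT 10").fetchall()

# Combined with ORDER BY, the limit picks the ten rows with the lowest C1.
first_ten = conn.execute("SELECT * FROM T ORDER BY C1 ASC LIMIT 10").fetchall()

print(len(limited))  # 10
print(first_ten[0], first_ten[-1])  # (1,) (10,)
```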
