What Does DISTINCT in SQL Statements Do?
The DISTINCT keyword modifies the results of the specified SQL statement ensuring that it returns unique rows in its result set. For example, suppose we have a list of pets' names and their species in a pet store and we want to know how many species of animals are in the store. Assume we have a table with two columns – name and species – with the data as shown in the diagram on the left.
Let's run the following SQL statement on the table:
SELECT species from pets;
This would give us the result in the following form:
You can see the repetition is quite disturbing. Using the DISTINCT modifier like so:
SELECT DISTINCT species from pets;
We get our refined results as:
In the second case our results are more palatable as we do not get repetition in the result set. What about running the following query?
SELECT name, species from pets;
This would return the entire table as is. If we are to run this distinct version:
SELECT DISTINCT name, species from pets;
This would also return the same result as though the DISTINCT keyword was never used. What happened here? Here the SQL database engine has evaluated the entire column as a single unit and is looking for distinct combinations factoring all rows. In this case, DISTINCT does not make a difference. It will still be wise to use the DISTINCT version regardless as you do not know how many rows you are dealing with. Your store could house thousands of pets and your data entry clerk could as well make a double entry and you would then need to filter out his additional entry.