Getting Rid of Unwanted Rows
Extracting data from an SQL database can often be a daunting task, especially when you have thousands of records to contend with. There are a few different methods that you can use to get rid of unwanted data in your results. With these techniques, you can learn how to filter out unwanted rows in an SQL query.
The WHERE clause
Mastering the WHERE clause is the first step towards removing unwanted tuples from your output. This clause can be appended to almost any SQL command, especially the typical CRUD commands INSERT, SELECT, UPDATE and DELETE.
The statement below extracts all the rows in the ‘users’ table:
SELECT * FROM users;
But what if we only want to retrieve the users whose first name is William? Then, assuming a column called ‘firstName’, we could use the following WHERE clause to limit our results:
SELECT * FROM users WHERE firstName = ‘William’;
Identifying unique characteristics of the data you require and then using the WHERE clause to exclude entries which don’t match that criteria is a great way to filter out unwanted rows.
Removing Duplicate Rows
Using a WHERE clause to filter your SQL query is useful for limiting the range of data you retrieve, but what if you want to know how many unique items there are in your database?
SELECT salary FROM employees;
If two different people in a company earn the same amount of money, then the query above will return two rows. If you’re trying to find out the range of salaries in your company then this might result in you having to wade through a large amount of identical data. Instead what we need to do is make sure each salary is only shown once. This can be done using the ‘distinct’ keyword.
SELECT distinct salary FROM employees;
With the addition of the distinct keyword only a single row is returned for each different salary, regardless of the how many people actually earn that much. The nature of the distinct keyword means that it can only be used on SELECT queries, however it is a powerful tool for filtering out duplicate content.
Inner Joins are used when working with data stored in two separate tables, and will only return a row when there is a match in both tables. A useful example would be to consider a system with two tables, ‘customers’ and ‘orders’. A customer is identified by a unique number, which forms a foreign key in the orders table. This allows us to associate an order with a customer. If we then wanted to see all the orders made by a customer with the ID of ‘5’ we could use a query like this:
SELECT * FROM customers INNER JOIN orders ON customers.UID = orders.cust_Id WHERE UID = ‘5’;
This will filter out all customer records where the unique identifier isn’t 5, and any orders which are not associated with that UID. The output set is limited by data in an external table, and hence a tedious manual search is avoided.