Structured Query Language, or SQL, is a programming language used to communicate with databases. It is used to retrieve, update, and manage data within a relational database management system (RDBMS). One of the most powerful features of SQL is the subquery. A subquery, also known as a nested query, is a query that is embedded within another query. This article will provide an in-depth look at SQL subqueries, their uses, and limitations.
1. What Is a Subquery?
The subquery, also called the inner query, runs against one or more tables, and the outer query then uses its result to filter rows, compute values, or act as an intermediate table. A subquery can be used in the SELECT, FROM, WHERE, and HAVING clauses of a SQL statement.
2. Types of Subqueries
There are four types of subqueries in SQL: single-row subquery, multiple-row subquery, multiple-column subquery, and correlated subquery.
2.1 Single-Row Subquery
A single-row subquery returns exactly one row, so the outer query can compare against it with a single-value operator such as =, <, or >. For example:
SELECT employee_name, salary
FROM employees
WHERE salary = (SELECT MAX(salary) FROM employees);
This query returns the name and salary of the employee with the highest salary, or of every such employee if several are tied.
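To try the single-row subquery on real rows, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up for illustration; the table and column names follow the example above.

```python
import sqlite3

# Hypothetical sample data: two employees are tied for the top salary.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, salary INTEGER);
    INSERT INTO employees VALUES ('Ana', 9000), ('Ben', 9000), ('Cy', 5000);
""")

# The inner query produces one value (the maximum salary);
# the outer query keeps every row that matches it.
top_earners = conn.execute("""
    SELECT employee_name, salary
    FROM employees
    WHERE salary = (SELECT MAX(salary) FROM employees)
    ORDER BY employee_name
""").fetchall()

print(top_earners)
```

With this data the outer query keeps both tied employees, which shows that "single-row" describes the subquery, not the final result.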
2.2 Multiple-Row Subquery
A multiple-row subquery returns more than one row, so the outer query must compare against it with a set operator such as IN, ANY, or ALL rather than a plain =. For example:
SELECT *
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 'US');
This query returns all the employees who work in a department located in the US.
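The multiple-row case can be checked the same way. This is a sketch with invented sample rows, run through Python's sqlite3 module; only the table and column names come from the article.

```python
import sqlite3

# Hypothetical sample data: two departments in the US, one elsewhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT, location_id TEXT);
    CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales', 'US'), (20, 'Support', 'US'), (30, 'Research', 'DE');
    INSERT INTO employees VALUES ('Ana', 10), ('Ben', 20), ('Cy', 30);
""")

# The inner query returns two department IDs (10 and 20);
# IN keeps each employee whose department_id is one of them.
us_employees = conn.execute("""
    SELECT employee_name
    FROM employees
    WHERE department_id IN (SELECT department_id FROM departments WHERE location_id = 'US')
    ORDER BY employee_name
""").fetchall()

print(us_employees)
```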
2.3 Multiple-Column Subquery
A multiple-column subquery returns more than one column per row. The outer query compares a matching list of columns against it, typically with IN. For example:
SELECT employee_name, department_id, salary
FROM employees
WHERE (department_id, salary) IN (SELECT department_id, MAX(salary) FROM employees GROUP BY department_id);
This query returns the highest-paid employee (or employees, if tied) in each department. Row comparisons of this kind work in PostgreSQL, MySQL, Oracle, and SQLite, but not in SQL Server.
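A subquery that returns several columns can be matched against a row value, that is, a parenthesized list of columns in the outer query. The sketch below assumes SQLite 3.15 or later (which supports row values) and uses invented sample rows; not every database accepts this syntax.

```python
import sqlite3

# Hypothetical sample data: two employees in each of two departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, salary INTEGER, department_id INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 9000, 10), ('Cy', 5000, 10),
        ('Ben', 7000, 20), ('Dee', 6000, 20);
""")

# The inner query returns two columns per row: a department and its top salary.
# The outer query keeps employees whose (department_id, salary) pair matches one of those rows.
top_per_department = conn.execute("""
    SELECT employee_name, department_id, salary
    FROM employees
    WHERE (department_id, salary) IN (
        SELECT department_id, MAX(salary) FROM employees GROUP BY department_id
    )
    ORDER BY department_id
""").fetchall()

print(top_per_department)
```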
2.4 Correlated Subquery
A correlated subquery is a subquery that references a column from the outer query. It is used when the inner query depends on the value of a column in the outer query. For example:
SELECT employee_name, salary
FROM employees e
WHERE salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id);
This query returns the name and salary of employees who earn more than the average salary of their department. The inner query depends on the department_id column from the outer query.
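The following sketch runs the correlated subquery on a few invented rows with Python's sqlite3 module. The inner alias `i` is added here only to make the correlation explicit; it is not part of the original query.

```python
import sqlite3

# Hypothetical sample data: department 10 averages 7000, department 20 averages 6500.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, salary INTEGER, department_id INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 9000, 10), ('Cy', 5000, 10),
        ('Ben', 7000, 20), ('Dee', 6000, 20);
""")

# The inner query is conceptually re-evaluated for each outer row e,
# because it refers to e.department_id.
above_average = conn.execute("""
    SELECT e.employee_name, e.salary
    FROM employees e
    WHERE e.salary > (
        SELECT AVG(i.salary) FROM employees i WHERE i.department_id = e.department_id
    )
    ORDER BY e.employee_name
""").fetchall()

print(above_average)
```

Ben earns less than Ana's department average but still qualifies, because each employee is compared only with the average of their own department.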
3. Subquery Operators
There are several operators that can be used with subqueries in SQL. These operators include IN, NOT IN, EXISTS, NOT EXISTS, ANY, and ALL.
3.1 IN Operator
The IN operator checks whether a value is in a list of values. With a subquery, it tests a value from the outer query against the set of values the inner query returns. For example:
SELECT employee_name
FROM employees
WHERE department_id IN (SELECT department_id FROM departments WHERE department_name LIKE 'Sales%');
This query returns the names of employees who work in a department whose name starts with “Sales”.
3.2 NOT IN Operator
The NOT IN operator is used to check if a value is not in a list of values. It is the opposite of the IN operator. For example:
SELECT employee_name
FROM employees
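WHERE department_id NOT IN (SELECT department_id FROM departments WHERE location_id = 'US');
This query returns the names of employees who do not work in a department located in the US. Be careful with NULLs: if the subquery returns even one NULL, NOT IN can never evaluate to true, so the query returns no rows at all. Employees whose own department_id is NULL are excluded as well.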
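NOT IN behaves unexpectedly when the subquery's result contains a NULL, because a comparison with NULL is neither true nor false. The sketch below illustrates this with a different, made-up question (which departments have no employees), using Python's sqlite3 module; the data is hypothetical.

```python
import sqlite3

# Hypothetical sample data: Ben has no department, so the subquery returns a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'Support');
    INSERT INTO employees VALUES ('Ana', 10), ('Ben', NULL);
""")

# Naive version: "20 NOT IN (10, NULL)" is unknown, so Support is silently dropped.
naive = conn.execute("""
    SELECT department_name FROM departments
    WHERE department_id NOT IN (SELECT department_id FROM employees)
""").fetchall()

# Filtering NULLs out of the subquery restores the intended result.
filtered = conn.execute("""
    SELECT department_name FROM departments
    WHERE department_id NOT IN (
        SELECT department_id FROM employees WHERE department_id IS NOT NULL
    )
""").fetchall()

print(naive, filtered)
```

Filtering NULLs inside the subquery is one way around the problem; rewriting the condition with NOT EXISTS is another common choice.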
3.3 EXISTS Operator
The EXISTS operator is used to check if a subquery returns any rows. It returns true if the subquery returns any rows, and false if it does not. For example:
SELECT *
FROM departments d
WHERE EXISTS (SELECT * FROM employees e WHERE e.department_id = d.department_id);
This query returns all the departments that have at least one employee.
3.4 NOT EXISTS Operator
The NOT EXISTS operator is used to check if a subquery does not return any rows. It returns true if the subquery does not return any rows, and false if it does. For example:
SELECT department_name
FROM departments d
WHERE NOT EXISTS (SELECT * FROM employees e WHERE e.department_id = d.department_id);
This query returns the names of departments that do not have any employees.
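EXISTS and NOT EXISTS split the outer rows into two complementary groups. Here is a small sketch with invented rows, run through Python's sqlite3 module. It uses `SELECT 1` in the subquery, a common convention, since EXISTS only tests whether any row comes back and ignores the selected columns.

```python
import sqlite3

# Hypothetical sample data: Sales has one employee, Support has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'Support');
    INSERT INTO employees VALUES ('Ana', 10);
""")

# Same correlated subquery, negated in the second query.
query = """
    SELECT department_name
    FROM departments d
    WHERE {op} (SELECT 1 FROM employees e WHERE e.department_id = d.department_id)
"""
staffed = conn.execute(query.format(op="EXISTS")).fetchall()
unstaffed = conn.execute(query.format(op="NOT EXISTS")).fetchall()

print(staffed, unstaffed)
```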
3.5 ANY Operator
The ANY operator compares a value with each value returned by a subquery, using a comparison operator such as =, >, or <. It returns true if the comparison holds for at least one of those values. For example:
SELECT employee_name, salary
FROM employees
WHERE salary > ANY (SELECT salary FROM employees WHERE job_id = 'SA_REP');
This query returns the name and salary of employees who earn more than at least one sales representative, that is, more than the lowest-paid one.
3.6 ALL Operator
The ALL operator also compares a value with each value returned by a subquery. It returns true if the comparison holds for every one of those values (and also when the subquery returns no rows). For example:
SELECT employee_name, salary
FROM employees
WHERE salary > ALL (SELECT salary FROM employees WHERE job_id = 'SA_REP');
This query returns the name and salary of employees who earn more than every sales representative, that is, more than the highest-paid one.
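One way to build intuition for `> ANY` and `> ALL` is to rewrite them with MIN and MAX. The sketch below does exactly that, because SQLite (used here through Python's sqlite3 module) does not implement ANY or ALL. The rewrite is a swapped-in technique, not the operators themselves, and it matches them here only because the subquery returns at least one row and no NULLs. The sample rows, including the job IDs other than 'SA_REP', are invented.

```python
import sqlite3

# Hypothetical sample data: sales representatives earn 6000 and 8000.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, salary INTEGER, job_id TEXT);
    INSERT INTO employees VALUES
        ('Ana', 9000, 'MGR'), ('Ben', 6000, 'SA_REP'), ('Cy', 8000, 'SA_REP'),
        ('Dee', 7000, 'IT'), ('Eli', 5000, 'IT');
""")

# "salary > ANY (...)" means "greater than the smallest value in the list".
more_than_any = conn.execute("""
    SELECT employee_name, salary FROM employees
    WHERE salary > (SELECT MIN(salary) FROM employees WHERE job_id = 'SA_REP')
    ORDER BY employee_name
""").fetchall()

# "salary > ALL (...)" means "greater than the largest value in the list".
more_than_all = conn.execute("""
    SELECT employee_name, salary FROM employees
    WHERE salary > (SELECT MAX(salary) FROM employees WHERE job_id = 'SA_REP')
    ORDER BY employee_name
""").fetchall()

print(more_than_any)
print(more_than_all)
```

Note that Cy, a sales representative, passes the ANY test because she earns more than Ben, while only Ana passes the ALL test.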
4. Subquery Examples
Subqueries can be used in a variety of ways to retrieve and manipulate data from a database. Here are a few examples:
4.1 Subquery in the SELECT Clause
SELECT employee_name, (SELECT department_name FROM departments WHERE departments.department_id = employees.department_id) AS department_name
FROM employees;
This query returns the name of each employee and the name of the department they work in.
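A subquery in the SELECT list must return a single value per outer row. The sketch below, using Python's sqlite3 module and invented rows, also shows what happens when no department matches: the employee is still listed, with NULL (None in Python) in place of the name. The `AS department_name` alias is added here for readability.

```python
import sqlite3

# Hypothetical sample data: Ben's department_id (99) has no row in departments.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT);
    CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales');
    INSERT INTO employees VALUES ('Ana', 10), ('Ben', 99);
""")

# The scalar subquery runs once per employee and yields one value or NULL.
with_department = conn.execute("""
    SELECT employee_name,
           (SELECT department_name FROM departments
            WHERE departments.department_id = employees.department_id) AS department_name
    FROM employees
    ORDER BY employee_name
""").fetchall()

print(with_department)
```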
4.2 Subquery in the WHERE Clause
SELECT *
FROM employees
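WHERE department_id = (SELECT department_id FROM departments WHERE department_name = 'Sales');
This query returns all the employees who work in the Sales department. Because = expects a single value, the subquery must return at most one row; most databases raise an error if it returns more.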
4.3 Subquery in the FROM Clause
SELECT department_name, COUNT(*)
FROM (SELECT department_id, department_name FROM departments WHERE location_id = 'US') d
JOIN employees e
ON d.department_id = e.department_id
GROUP BY department_name;
This query returns the number of employees in each department located in the US.
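A subquery in the FROM clause acts as a temporary table (often called a derived table) that the rest of the query can join against. Here is a minimal sketch with invented rows, run through Python's sqlite3 module; the `AS employee_count` alias is an addition for readability.

```python
import sqlite3

# Hypothetical sample data: Sales and Support are in the US, Research is not.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER, department_name TEXT, location_id TEXT);
    CREATE TABLE employees (employee_name TEXT, department_id INTEGER);
    INSERT INTO departments VALUES (10, 'Sales', 'US'), (20, 'Support', 'US'), (30, 'Research', 'DE');
    INSERT INTO employees VALUES ('Ana', 10), ('Ben', 10), ('Cy', 20), ('Dee', 30);
""")

# The derived table d holds only US departments; the join and GROUP BY run on top of it.
us_headcount = conn.execute("""
    SELECT d.department_name, COUNT(*) AS employee_count
    FROM (SELECT department_id, department_name
          FROM departments WHERE location_id = 'US') d
    JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_name
    ORDER BY d.department_name
""").fetchall()

print(us_headcount)
```

Because this is an inner join, a US department with no employees would not appear in the result at all.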
4.4 Subquery in the HAVING Clause
SELECT department_id, AVG(salary)
FROM employees
GROUP BY department_id
HAVING AVG(salary) > (SELECT AVG(salary) FROM employees);
This query returns the department IDs and average salaries for departments where the average salary is greater than the overall average salary.
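The HAVING example can be verified on a few invented rows with Python's sqlite3 module. In this sketch the overall average salary is 5000, and only one department's average exceeds it.

```python
import sqlite3

# Hypothetical sample data: department 10 averages 7000, department 20 averages 3000.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_name TEXT, salary INTEGER, department_id INTEGER);
    INSERT INTO employees VALUES
        ('Ana', 9000, 10), ('Cy', 5000, 10),
        ('Ben', 4000, 20), ('Dee', 2000, 20);
""")

# The subquery computes one overall average; HAVING filters the grouped rows against it.
above_overall = conn.execute("""
    SELECT department_id, AVG(salary)
    FROM employees
    GROUP BY department_id
    HAVING AVG(salary) > (SELECT AVG(salary) FROM employees)
""").fetchall()

print(above_overall)
```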
5. Conclusion
Subqueries are a powerful tool in SQL that allow for complex queries and data manipulation. They can be used in a variety of ways to retrieve and compare data from multiple tables, and to filter and manipulate results. By understanding subqueries and their operators, you can greatly expand your SQL querying capabilities and unlock new insights from your data.
FAQs
- What is a subquery in SQL?
- A subquery is a query within another query in SQL. It can be used to retrieve and compare data from multiple tables, and to filter and manipulate results.
- What are the operators that can be used with subqueries in SQL?
- The operators that can be used with subqueries in SQL include IN, NOT IN, EXISTS, NOT EXISTS, ANY, and ALL.
- What is the difference between the IN and NOT IN operators?
- The IN operator checks if a value is in a list of values, while the NOT IN operator checks if a value is not in a list of values.
- What is the EXISTS operator used for?
- The EXISTS operator is used to check if a subquery returns any rows.
- How can subqueries be used to manipulate data in SQL?
- Subqueries can be used to retrieve and compare data from multiple tables, and to filter and manipulate results. They can be used in the SELECT, WHERE, FROM, and HAVING clauses of a SQL query.