🇯🇵 日本語 | 🇺🇸 English | 🇪🇸 Español | 🇵🇹 Português | 🇹🇭 ไทย | 🇨🇳 中文

【SQL】Complete Guide to Using the GROUP BY Clause! Aggregate Your Data Freely

Whether it's analyzing website traffic, sales on an e-commerce site, or user behavior in an app, "aggregation" is an essential task when handling data. Wouldn't it be helpful to instantly extract information like "number of products per category", "number of users per prefecture", or "total monthly sales"? The SQL GROUP BY clause is the key to making that happen.

The GROUP BY clause is a powerful tool that divides the data in a table into groups based on the values of a specific column, and applies aggregate functions (COUNT, SUM, AVG, etc.) to each group. At first glance, it may seem a bit complicated, but once you master it, your data analysis capabilities will expand dramatically.

This article is designed so even beginners aiming to become web creators can follow along. We’ll explain everything from the basics of GROUP BY to advanced techniques like using the HAVING clause and grouping by multiple columns—complete with plenty of "copy-paste ready code examples". Let's experience the fun of "grouping and aggregating data" together!


Preparation: Create Sample Data for Analysis

The best way to learn SQL is to actually try it out. For this article, we'll use a products table representing a fictional e-commerce site's product list. The SQL statements below create the table and insert sample data. All examples in this article use this table.

-- Drop the table if it exists (so you can re-run it)
DROP TABLE IF EXISTS products;

-- Create the products table
CREATE TABLE products (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  category TEXT NOT NULL,
  prefecture TEXT NOT NULL,
  price INTEGER NOT NULL,
  stock_quantity INTEGER NOT NULL
);

-- Insert data
INSERT INTO products (id, name, category, prefecture, price, stock_quantity) VALUES
(1, 'Amazing T-Shirt', 'Apparel', 'Tokyo', 3000, 50),
(2, 'Best Jeans Ever', 'Apparel', 'Okayama', 12000, 30),
(3, 'Magic Keyboard', 'PC Accessories', 'Tokyo', 8500, 20),
(4, 'Glowing Mouse', 'PC Accessories', 'Kanagawa', 4500, 45),
(5, 'Master’s Knife', 'Kitchen Goods', 'Niigata', 9800, 15),
(6, 'Ultimate Frying Pan', 'Kitchen Goods', 'Niigata', 6200, 25),
(7, 'Intro to SQL', 'Books', 'Tokyo', 2800, 100),
(8, 'Web Design for Beginners', 'Books', 'Tokyo', 3200, 80),
(9, 'Comfy Sneakers', 'Apparel', 'Tokyo', 7800, 60);

This table contains data like product ID, name, category, place of production (prefecture), price, and stock quantity. Note that we’ve intentionally included duplicate entries—for example, 3 items in the "Apparel" category and 4 items from "Tokyo". This is exactly what makes GROUP BY meaningful when aggregating data.


[Basic] Count the number of products by category using `GROUP BY`

Let’s start with the most basic use of GROUP BY. Suppose we want to find out how many types of products there are for each product category. To do this, we group the category column using GROUP BY, and use the familiar COUNT() function to count the number of rows in each group.

SELECT
  category,
  COUNT(*) AS product_count
FROM
  products
GROUP BY
  category;

Execution result:

category       | product_count
---------------|---------------
PC Accessories | 2
Apparel        | 3
Kitchen Goods  | 2
Books          | 2

We successfully retrieved the number of products per category! Internally, the SQL engine groups rows with the same category value (like 'Apparel', 'PC Accessories', etc.), then applies COUNT(*) to each group. Also note how AS product_count gives an alias to the result column.


[Advanced 1] Combine with other aggregate functions like `SUM` and `AVG`

The great thing about GROUP BY is that it’s not limited to COUNT(). You can freely combine it with other aggregate functions like SUM() (sum), AVG() (average), MAX() (maximum), and MIN() (minimum).

Let’s try calculating the “average price by category.”

SELECT
  category,
  AVG(price) AS average_price
FROM
  products
GROUP BY
  category;

Execution result:

category       | average_price
---------------|---------------
PC Accessories | 6500
Apparel        | 7600
Kitchen Goods  | 8000
Books          | 3000

As you can see, simply changing the aggregate function allows you to analyze different aspects of each group. For example, if you want to know the “total stock per category,” use SUM(stock_quantity).

SELECT
  category,
  SUM(stock_quantity) AS total_stock
FROM
  products
GROUP BY
  category;

Execution result:

category       | total_stock
---------------|-------------
PC Accessories | 65
Apparel        | 140
Kitchen Goods  | 40
Books          | 180

This makes it clear at a glance that the “Books” category has the most stock, while “Kitchen Goods” has the least.


[Advanced 2] Narrowing Down Aggregated Results with `HAVING` Clause

There are many cases where you want to further filter the results after using GROUP BY. For example, "I want to display only the categories that have 3 or more products."

Note that the regular WHERE clause is used to filter individual rows before aggregation, so it cannot be used for aggregated results like COUNT(*).

The `HAVING` clause is used to specify conditions on groups after aggregation.

SELECT
  category,
  COUNT(*) AS product_count
FROM
  products
GROUP BY
  category
HAVING
  COUNT(*) >= 3;

Execution result:

category | product_count
---------|---------------
Apparel  | 3

By adding the condition HAVING COUNT(*) >= 3, only the "Apparel" category, which has 3 products, was returned in the results.

`WHERE` is used before grouping, `HAVING` is used after grouping. Be sure to remember this difference.


[Advanced 3] Grouping by Multiple Columns

You can also use GROUP BY with multiple columns to create more detailed groups. For example, "I want to know the number of products for each combination of category and production location."

Simply specify multiple column names separated by commas in the GROUP BY clause.

SELECT
  category,
  prefecture,
  COUNT(*) AS product_count
FROM
  products
GROUP BY
  category, prefecture;

Execution result:

category   | prefecture | product_count
-----------|------------|---------------
Peripherals| Kanagawa   | 1
Peripherals| Tokyo      | 1
Apparel    | Okayama    | 1
Apparel    | Tokyo      | 2
Kitchen    | Niigata    | 2
Books      | Tokyo      | 2

You can see that "Apparel in Okayama" and "Apparel in Tokyo" are treated as separate groups. This allows for more detailed analysis, such as "There are 2 types of apparel products in Tokyo."


[Hands‑On Section] Execute SQL Freely in the Browser!

The theory is over! It's time for you to actually write SQL, run it, and verify the results.

Copy the entire HTML code below, save it as a file named sql_groupby_test.html (or similar) on your PC, and open it in a web browser. You'll get your own SQL execution environment with the products table embedded.

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>SQL GROUP BY Execution Environment</title>
  <style>
    body { font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif; line-height: 1.7; color: #333; max-width: 800px; margin: 2rem auto; padding: 0 1rem; }
    h1 { color: #2c3e50; }
    textarea { width: 100%; height: 180px; font-family: "SF Mono", "Consolas", monospace; font-size: 16px; padding: 12px; border: 1px solid #ccc; border-radius: 6px; box-sizing: border-box; margin-bottom: 1rem; }
    button { background-color: #3498db; color: white; border: none; padding: 12px 22px; font-size: 16px; border-radius: 6px; cursor: pointer; transition: background-color 0.2s; }
    button:hover { background-color: #2980b9; }
    button:disabled { background-color: #bdc3c7; cursor: not-allowed; }
    #result-container { margin-top: 2rem; border: 1px solid #ddd; padding: 1rem; border-radius: 6px; background: #fdfdfd; min-height: 50px; }
    #error-message { color: #e74c3c; font-weight: bold; }
    table { border-collapse: collapse; width: 100%; margin-top: 1rem; }
    th, td { border: 1px solid #ddd; padding: 10px; text-align: left; }
    th { background-color: #f2f2f2; }
    tr:nth-child(even) { background-color: #f9f9f9; }
  </style>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.10.3/sql-wasm.js"></script>
</head>
<body>

  <h1>Try out SQL in Your Browser!</h1>
  <p>Enter SQL statements in the textarea below and click the “Execute” button. Try out the queries introduced in this article!</p>

  <textarea id="sql-input">-- Display average price and max price by prefecture
SELECT
  prefecture,
  AVG(price) AS "Average Price",
  MAX(price) AS "Max Price"
FROM
  products
GROUP BY
  prefecture;</textarea>
  
  <button id="execute-btn">Execute</button>
  
  <div id="result-container">
    <p id="error-message"></p>
    <div id="result-output"></div>
  </div>

  <script>
    const sqlInput = document.getElementById('sql-input');
    const executeBtn = document.getElementById('execute-btn');
    const errorMsg = document.getElementById('error-message');
    const resultOutput = document.getElementById('result-output');

    let db;

    async function initDb() {
      executeBtn.disabled = true;
      executeBtn.textContent = 'Initializing DB...';
      try {
        const SQL = await initSqlJs({
          locateFile: file => `https://cdnjs.cloudflare.com/ajax/libs/sql.js/1.10.3/${file}`
        });
        db = new SQL.Database();
        
        const setupSql = `
          DROP TABLE IF EXISTS products;
          CREATE TABLE products (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            category TEXT NOT NULL,
            prefecture TEXT NOT NULL,
            price INTEGER NOT NULL,
            stock_quantity INTEGER NOT NULL
          );
          INSERT INTO products (id, name, category, prefecture, price, stock_quantity) VALUES
          (1, 'Amazing T‑Shirt', 'Apparel', 'Tokyo', 3000, 50),
          (2, 'Best Jeans', 'Apparel', 'Okayama', 12000, 30),
          (3, 'Magic Keyboard', 'PC Accessories', 'Tokyo', 8500, 20),
          (4, 'Glowing Mouse', 'PC Accessories', 'Kanagawa', 4500, 45),
          (5, 'Artisan Knife', 'Kitchen Goods', 'Niigata', 9800, 15),
          (6, 'Ultimate Frypan', 'Kitchen Goods', 'Niigata', 6200, 25),
          (7, 'Intro to SQL', 'Books', 'Tokyo', 2800, 100),
          (8, 'Web Design Basics', 'Books', 'Tokyo', 3200, 80),
          (9, 'Comfy Sneakers', 'Apparel', 'Tokyo', 7800, 60);
        `;
        db.run(setupSql);
        
        executeBtn.disabled = false;
        executeBtn.textContent = 'Execute';
        resultOutput.innerHTML = '<p>Ready! Feel free to experiment with SQL.</p>';

      } catch (err) {
        errorMsg.textContent = 'Database initialization failed: ' + err.message;
        console.error(err);
      }
    }

    function executeSql() {
      if (!db) return;
      
      const sql = sqlInput.value;
      errorMsg.textContent = '';
      resultOutput.innerHTML = '';

      try {
        const results = db.exec(sql);
        if (results.length === 0) {
          resultOutput.innerHTML = '<p>Query succeeded but returned no results (e.g., INSERT, UPDATE).</p>';
          return;
        }
        
        results.forEach(result => {
          const table = document.createElement('table');
          const thead = document.createElement('thead');
          const tbody = document.createElement('tbody');
          
          const headerRow = document.createElement('tr');
          result.columns.forEach(colName => {
            const th = document.createElement('th');
            th.textContent = colName;
            headerRow.appendChild(th);
          });
          thead.appendChild(headerRow);
          
          result.values.forEach(row => {
            const bodyRow = document.createElement('tr');
            row.forEach(cellValue => {
              const td = document.createElement('td');
              td.textContent = cellValue === null ? 'NULL' : (typeof cellValue === 'number' ? cellValue.toLocaleString() : cellValue);
              bodyRow.appendChild(td);
            });
            tbody.appendChild(bodyRow);
          });
          
          table.appendChild(thead);
          table.appendChild(tbody);
          resultOutput.appendChild(table);
        });

      } catch (err) {
        errorMsg.textContent = 'SQL Error: ' + err.message;
        console.error(err);
      }
    }

    executeBtn.addEventListener('click', executeSql);
    
    initDb();
  </script>
</body>
</html>

[Challenge Task]


Common Pitfall for Beginners: Rules of the `SELECT` Clause

When using GROUP BY, there's a common error beginners often encounter. That error is the rule: "You can only use columns in the SELECT clause that are either specified in the GROUP BY clause or wrapped in an aggregate function."

For example, the following SQL will result in an error:

-- This is an error example
SELECT
  category,
  name -- ← Column not specified in GROUP BY or aggregate function
FROM
  products
GROUP BY
  category;

Why does this cause an error? Think about it logically. When you use GROUP BY category to combine rows for "Apparel", for example, the name column has multiple values, like "Amazing T-shirt", "Best Jeans", "Comfortable Sneakers". The SQL engine cannot decide which value to display, so it returns an error.

The rows grouped by GROUP BY represent the original multiple rows as a single row. Therefore, you can only include in the SELECT clause columns that identify the group (in this case, category) or aggregate functions (like COUNT(*) or SUM(price)).


Related Technique: Sorting Results with `ORDER BY`

Often, you'll want to sort the results from a GROUP BY query in a particular order. In that case, you can use the familiar ORDER BY clause. ORDER BY comes after GROUP BY and HAVING.

For example, let's sort the "number of products by category" query by the product count in descending order.

SELECT
  category,
  COUNT(*) AS product_count
FROM
  products
GROUP BY
  category
ORDER BY
  product_count DESC;

Execution Result:

category   | product_count
-----------|---------------
Apparel    | 3
PC Accessories | 2
Kitchen Goods | 2
Books      | 2

By using ORDER BY product_count DESC, we sorted the product_count column in descending order, making it easy to see which category has the most products.


Summary

In this article, we learned about the powerful GROUP BY clause for aggregating data.

Mastering GROUP BY opens the door to the world of data analysis, where you can not only retrieve data but also interpret trends and patterns hidden within it. Try experimenting with various aggregations in the hands-on section of this article, and you'll experience the power of GROUP BY. This will add a valuable "data aggregation skill" to your skillset as a web creator.