• support@answerspoint.com

Using LIMIT within GROUP BY to get N results per group?

594
SELECT
year, id, rate
FROM h
WHERE year BETWEEN 2000 AND 2009
AND id IN (SELECT rid FROM table2)
GROUP BY id, year
ORDER BY id, rate DESC

yields:

year    id  rate
2006    p01 8
2003    p01 7.4
2008    p01 6.8
2001    p01 5.9
2007    p01 5.3
2009    p01 4.4
2002    p01 3.9
2004    p01 3.5
2005    p01 2.1
2000    p01 0.8
2001    p02 12.5
2004    p02 12.4
2002    p02 12.2
2003    p02 10.3
2000    p02 8.7
2006    p02 4.6
2007    p02 3.3

What I'd like is only the top 5 results for each id:

2006    p01 8
2003    p01 7.4
2008    p01 6.8
2001    p01 5.9
2007    p01 5.3
2001    p02 12.5
2004    p02 12.4
2002    p02 12.2
2003    p02 10.3
2000    p02 8.7

Is there a way to do this using some kind of LIMIT like modifier that works within the GROUP BY?

2Answer


0

This can be done in MySQL, but it is not as simple as adding a LIMIT clause. Here is an article that explains the problem in detail:

How to select the first/least/max row per group in SQL

It's a good article - he introduces an elegant but naïve solution to the "Top N per group" problem, and then gradually improves on it.

  • answered 2 years ago
  • G John

0

You could use GROUP_CONCAT aggregated function to get all years into a single column, grouped by id and ordered by rate:

SELECT   id, GROUP_CONCAT(year ORDER BY rate DESC) grouped_year
FROM     yourtable
GROUP BY id

Result:

-----------------------------------------------------------
|  ID | GROUPED_YEAR                                      |
-----------------------------------------------------------
| p01 | 2006,2003,2008,2001,2007,2009,2002,2004,2005,2000 |
| p02 | 2001,2004,2002,2003,2000,2006,2007                |
-----------------------------------------------------------

And then you could use FIND_IN_SET, that returns the position of the first argument inside the second one, eg.

SELECT FIND_IN_SET('2006', '2006,2003,2008,2001,2007,2009,2002,2004,2005,2000');
1

SELECT FIND_IN_SET('2009', '2006,2003,2008,2001,2007,2009,2002,2004,2005,2000');
6

Using a combination of GROUP_CONCAT and FIND_IN_SET, and filtering by the position returned by find_in_set, you could then use this query that returns only the first 5 years for every id:

SELECT
  yourtable.*
FROM
  yourtable INNER JOIN (
    SELECT
      id,
      GROUP_CONCAT(year ORDER BY rate DESC) grouped_year
    FROM
      yourtable
    GROUP BY id) group_max
  ON yourtable.id = group_max.id
     AND FIND_IN_SET(year, grouped_year) BETWEEN 1 AND 5
ORDER BY
  yourtable.id, yourtable.year DESC;
  • answered 2 years ago
  • G John

Your Answer

    Facebook Share