Use of Inner joins in Hive Query.

I have 2 tables product & psales, data in the tables is as shown below

select * from psales;
+-------------+---------------+--+
| psales.pid | psales.sales |
+-------------+---------------+--+
| 1 | 10 |
| 1 | 125 |
| 1 | 350 |
| 2 | 15 |
| 2 | 44 |
| 2 | 183 |
| 3 | 167 |
| 3 | 88 |
| 1 | 72 |
| 4 | 91 |
+-------------+---------------+--+

select * from product;
+--------------+----------------+--+
| product.pid | product.pname |
+--------------+----------------+--+
| 1 | p1 |
| 2 | p2 |
| 3 | p3 |
| 4 | p4 |
+--------------+----------------+--+

Objective is to get the product with 2nd highest combined sales.

Here is the query i'm currently using to get the product with highest combined sales (working fine)
select p1.pname, p1.total_sales from (select p.pid as pid, p.pname as pname, s.sales as sales,
sum(s.sales) over (partition by p.pid order by p.pid) as total_sales from product p
inner join psales s on (p.pid = s.pid) order by total_sales desc) p1 limit 1;

When i try to get the row_num in the inner query, it gives me error shown below :
select p1.pname as pname, p1.total_sales as total_sales, row_num() over (partition by pname order by pname) as rownum from (select p.pid as pid, p.pname as pname, s.sales as sales, sum(s.sales) over (partition by p.pid order by p.pid) as total_sales, row_num() over (partition by p.pid) as rownum from product p inner join psales s on (p.pid = s.pid)
order by total_sales desc) p1 where rownum =2;
Error: Error while compiling statement: FAILED: SemanticException Failed to breakup Windowing invocations into Groups. At least 1 group must only depend on input columns. Also check for circular dependencies. Underlying error: Invalid function row_num (state=1200,code=2000)

Please guide me.
Thanks in advance.

Tagged:
Sign In or Register to comment.