hadoop - Hive left join on nearest date
I am trying to join two tables in Hive using a key and the nearest date between the two tables at the time of the join. For example, below are the two input tables:
<----------table a------------->    <------------table b------------>
a_id  a_date      changed_col       b_id  b_date      b_value  a_id
****  **********  ***********       ****  **********  *******  ****
a01   2017-03-20  abc               b01   2017-04-02  200      a01
a01   2017-04-01  xyz               b01   2017-04-04  500      a01
a01   2017-04-05  lll
However, when I left join table b with table a, each row of b should pick up the nearest lower date in table a for the same key (a_id). Below is the expected output table:
b_id  b_date      a_id  a_date      changed_col  b_value
****  **********  ****  **********  ***********  *******
b01   2017-04-02  a01   2017-04-01  xyz          200
b01   2017-04-04  a01   2017-04-01  xyz          500
Any help is appreciated. Thanks.
select  b.b_id
       ,b.b_date
       ,b.a_id
       ,a.a_date
       ,a.changed_col
       ,b.b_value
from    b
        left join  (select  *
                    from   (select  b.b_id
                                   ,a.a_date
                                   ,a.changed_col
                                   ,row_number () over
                                    (
                                        partition by  b.b_id
                                        order by      a.a_date desc
                                    ) as rn
                            from    b
                                    join  a
                                    on    a.a_id = b.a_id
                            where   a.a_date <= b.b_date
                           ) t
                    where   rn = 1
                   ) a
        on      a.b_id = b.b_id
+------+------------+------+------------+-------------+---------+
| b_id | b_date     | a_id | a_date     | changed_col | b_value |
+------+------------+------+------------+-------------+---------+
| b01  | 2017-04-02 | a01  | 2017-04-01 | xyz         | 200     |
| b01  | 2017-04-04 | a01  | 2017-04-01 | xyz         | 500     |
+------+------------+------+------------+-------------+---------+
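One thing worth checking before reusing the query above: it partitions by b_id alone, so it keeps a single row of a per b_id. That matches the sample because both b01 rows resolve to the same a_date, but it may not hold once the same b_id has rows whose nearest dates differ. The sketch below is a possible variant, not the answerer's method: it partitions by (b_id, b_date) and joins back on both columns. It assumes (b_id, b_date) identifies a row of b. To make it runnable without a cluster, it uses Python's sqlite3 module as a stand-in for Hive (SQLite 3.25+ is assumed for window functions, and dates are stored as ISO text). The third b row is hypothetical and only there to show the difference.

```python
import sqlite3

# Assumption: SQLite stands in for Hive; ISO-8601 date strings compare
# correctly as plain text, so no date type is needed for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (a_id text, a_date text, changed_col text);
    create table b (b_id text, b_date text, b_value integer, a_id text);

    insert into a values ('a01', '2017-03-20', 'abc'),
                         ('a01', '2017-04-01', 'xyz'),
                         ('a01', '2017-04-05', 'lll');

    insert into b values ('b01', '2017-04-02', 200, 'a01'),
                         ('b01', '2017-04-04', 500, 'a01');

    -- Hypothetical extra row (not in the question) whose nearest date differs.
    insert into b values ('b01', '2017-04-06', 900, 'a01');
""")

# Same shape as the answer's query, but the ranking is done per
# (b_id, b_date) and the outer join matches on both columns.
QUERY = """
select  b.b_id, b.b_date, b.a_id, a.a_date, a.changed_col, b.b_value
from    b
        left join (select *
                   from  (select b.b_id, b.b_date, a.a_date, a.changed_col,
                                 row_number() over (
                                     partition by b.b_id, b.b_date
                                     order by     a.a_date desc
                                 ) as rn
                          from   b join a on a.a_id = b.a_id
                          where  a.a_date <= b.b_date
                         ) t
                   where rn = 1
                  ) a
        on  a.b_id = b.b_id and a.b_date = b.b_date
order by b.b_date
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With the two rows from the question this returns the same result as the table above; the hypothetical 2017-04-06 row picks up 2017-04-05 / lll instead of forcing every b01 row onto one date. The SQL text itself uses only constructs that Hive also supports (row_number, derived tables, equality join conditions), but it has not been run on Hive here.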