hadoop - Hive left join on nearest date


I am trying to join two tables in Hive using a key and the nearest date in the two tables at the time of the join. For example, below are the two input tables:

<----------table a------------->            <------------table b------------>
a_id    a_date      changed_col             b_id    b_date      b_value a_id
****    ******      ***********             ****    ******      ******* ****
a01     2017-03-20  abc                     b01     2017-04-02  200     a01
a01     2017-04-01  xyz                     b01     2017-04-04  500     a01
a01     2017-04-05  lll

However, when I left join table b to table a, each row of b should pick up the row of table a with the same key (a_id) whose a_date is the nearest date on or before b_date. Below is the expected output table:

b_id    b_date      a_id    a_date      changed_col  b_value
****    ******      ****    ******      ***********  *******
b01     2017-04-02  a01     2017-04-01  xyz          200
b01     2017-04-04  a01     2017-04-01  xyz          500

Any help is appreciated. Thanks.

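select  b.b_id
       ,b.b_date
       ,b.a_id
       ,a.a_date
       ,a.changed_col
       ,b.b_value

from                b

        left join  (select  *

                    from   (select  b.b_id
                                   ,b.b_date
                                   ,a.a_date
                                   ,a.changed_col

                                    -- rank the candidate rows of a for each row of b, latest a_date first
                                   ,row_number () over
                                    (
                                        partition by b.b_id, b.b_date
                                        order by     a.a_date desc
                                    ) as rn

                            from            b

                                    join    a

                                    on      a.a_id = b.a_id

                            -- kept in WHERE so it also runs on Hive versions
                            -- that accept only equality conditions in ON
                            where   a.a_date <= b.b_date
                           ) t

                    where   rn = 1
                   ) a

        on          a.b_id   = b.b_id
                and a.b_date = b.b_date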

+------+------------+------+------------+-------------+---------+
| b_id |   b_date   | a_id |   a_date   | changed_col | b_value |
+------+------------+------+------------+-------------+---------+
| b01  | 2017-04-02 | a01  | 2017-04-01 | xyz         |     200 |
| b01  | 2017-04-04 | a01  | 2017-04-01 | xyz         |     500 |
+------+------------+------+------------+-------------+---------+
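The result above can be checked without a Hive cluster. The following is a minimal sketch, not a definitive implementation: it loads the question's sample rows into an in-memory SQLite database and runs the same row_number approach. SQLite standing in for Hive is an assumption made only for convenience, dates are stored as ISO strings (which compare correctly as text), and the function name nearest_earlier_join is hypothetical. The window is partitioned by both b_id and b_date so that every row of b gets its own nearest earlier row of a.

```python
import sqlite3

# Sample rows copied from the question's two input tables.
A_ROWS = [
    ("a01", "2017-03-20", "abc"),
    ("a01", "2017-04-01", "xyz"),
    ("a01", "2017-04-05", "lll"),
]
B_ROWS = [
    ("b01", "2017-04-02", 200, "a01"),
    ("b01", "2017-04-04", 500, "a01"),
]

# For every b row, rank the a rows with the same a_id and a_date <= b_date,
# latest first, keep rank 1, then left join that back to b so that b rows
# without any earlier a row are still returned (with NULLs).
QUERY = """
select  b.b_id, b.b_date, b.a_id, a.a_date, a.changed_col, b.b_value
from    b
        left join (select *
                   from  (select  b.b_id, b.b_date, a.a_date, a.changed_col,
                                  row_number() over (partition by b.b_id, b.b_date
                                                     order by a.a_date desc) as rn
                          from    b
                                  join a on a.a_id = b.a_id
                          where   a.a_date <= b.b_date) t
                   where rn = 1) a
        on  a.b_id = b.b_id and a.b_date = b.b_date
order by b.b_date
"""


def nearest_earlier_join(a_rows, b_rows):
    """Hypothetical helper: build both tables in memory and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table a (a_id text, a_date text, changed_col text)")
    conn.execute("create table b (b_id text, b_date text, b_value integer, a_id text)")
    conn.executemany("insert into a values (?, ?, ?)", a_rows)
    conn.executemany("insert into b values (?, ?, ?, ?)", b_rows)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in nearest_earlier_join(A_ROWS, B_ROWS):
        print(row)
```

Partitioning by b_id alone would keep a single row of a per b_id, which gives the same answer on this sample but would assign an a_date later than b_date to the earlier b row as soon as a later b row qualifies for a newer a row.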
