Tutorial - Apache Hive - Apache Software Foundation
https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-Joins


The Hive documentation lists only a few of the ways joins can be combined.
One of them is LEFT SEMI JOIN:


In order to check the existence of a key in another table, the user can use LEFT SEMI JOIN as illustrated by the following example.

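-- In a LEFT SEMI JOIN the right-hand table (pv) may only be referenced
-- in the ON clause, so the date filter is placed there instead of in WHERE.
INSERT OVERWRITE TABLE pv_users
SELECT u.*
FROM user u LEFT SEMI JOIN page_view pv
  ON (pv.userid = u.id AND pv.date = '2008-03-03');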

Suppose there are two tables, A and B.

A
id name
1  abc
2  edf

B
id city
1  taipei
2  ku
1  yl

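With LEFT SEMI JOIN, each row of A is returned at most once, no matter how many rows in B match it. Here id 1 appears twice in B, but the result contains id 1 only once, so the join has a de-duplicating effect. A plain inner join would return id 1 twice. Only columns from A can be selected, because B is used solely to test whether a matching key exists.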
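
To make the difference concrete, here is a small sketch using the sample data above. The table names a and b and their column types are assumptions chosen for illustration. INSERT ... VALUES needs Hive 0.14 or later, and the IN subquery form needs Hive 0.13 or later. Row order in the results is not guaranteed.

```sql
-- Hypothetical tables mirroring the sample data A and B above.
CREATE TABLE a (id INT, name STRING);
CREATE TABLE b (id INT, city STRING);

INSERT INTO TABLE a VALUES (1, 'abc'), (2, 'edf');
INSERT INTO TABLE b VALUES (1, 'taipei'), (2, 'ku'), (1, 'yl');

-- Inner join: id 1 matches two rows in b, so it is returned twice.
-- Result: 3 rows -> (1, abc), (1, abc), (2, edf)
SELECT a.id, a.name
FROM a JOIN b ON (a.id = b.id);

-- Left semi join: a row of a is returned once if any match exists in b.
-- Result: 2 rows -> (1, abc), (2, edf)
SELECT a.id, a.name
FROM a LEFT SEMI JOIN b ON (a.id = b.id);

-- The same existence check written as an IN subquery.
SELECT a.id, a.name
FROM a
WHERE a.id IN (SELECT b.id FROM b);
```

Note that b.city cannot appear in the SELECT list of the semi join, since the right-hand table is only available inside the ON clause.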

cf.

Hive Join (translated from the Hive wiki) - ggjucheng - 博客园
http://www.cnblogs.com/ggjucheng/archive/2013/01/15/2860723.html

 

Posted by peicheng on 痞客邦 (PIXNET)