[hive][apache] hive basic instruction－FLASHC

Introduction to Hive

Origins
- Introduction to Hadoop core
- HDFS
- MapReduce: characteristics and architecture
Development
- Development at Facebook
- Qubole
The Qubole Data Service (QDS) is a Software-as-a-Service analytics platform running on leading cloud offerings like AWS.
- Companies using Hive
* Baidu
* Facebook
* Qubole

Usage share
Why teams have shifted to using Hive

Tips
- Running multiple queries against the same table
(Making Multiple Passes over the Same Data)
Instead of running several queries that each scan the same table, a rewrite can achieve the same result using a single pass through the source history table.
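
As a rough illustration of the single-pass idea, the sketch below assembles a Hive multi-table insert, where one `FROM` clause feeds several `INSERT` clauses so the source is read only once. The helper `build_multi_insert`, the target tables `sales` and `credits`, and the `action` column are hypothetical names chosen for illustration; only the source table name `history` comes from the note above.

```python
def build_multi_insert(source, targets):
    """Return a HiveQL multi-table insert that reads `source` once.

    `targets` maps each destination table to the WHERE predicate that
    selects its rows (hypothetical tables and predicates).
    """
    # A single FROM clause is shared by every INSERT clause that follows it.
    lines = [f"FROM {source}"]
    for table, predicate in targets.items():
        lines.append(f"INSERT OVERWRITE TABLE {table} SELECT * WHERE {predicate}")
    return "\n".join(lines) + ";"


statement = build_multi_insert(
    "history",
    {
        "sales": "action = 'purchased'",
        "credits": "action = 'returned'",
    },
)
print(statement)
```

Written as two separate `INSERT ... SELECT` queries, the same work would scan `history` twice; here the statement names the source table once.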

HDFS was designed for many millions of large files, not billions of small files.
Each partition corresponds to a directory that usually contains multiple files, so creating too many partitions produces a large number of small files.
MapReduce processing converts a job into multiple tasks.
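
To make the file-count concern concrete, here is a back-of-the-envelope sketch. Every number in it is a hypothetical assumption (one year of data, 10 files per partition, the same file count in every partition); it only shows how finer partitioning multiplies the number of files HDFS has to track.

```python
def estimate_file_count(num_partitions, files_per_partition):
    """Estimate how many files a partitioned table holds in HDFS."""
    return num_partitions * files_per_partition


# Hypothetical table: one year of data, 10 files written per partition.
FILES_PER_PARTITION = 10
by_day = estimate_file_count(365, FILES_PER_PARTITION)
by_hour = estimate_file_count(365 * 24, FILES_PER_PARTITION)

print(f"partitioned by day:  {by_day} files")
print(f"partitioned by hour: {by_hour} files")
```

Under these assumptions, moving from daily to hourly partitions raises the file count from 3,650 to 87,600 without adding any data, so each file becomes correspondingly smaller.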

Another solution is to use two levels of partitions along different dimensions. For example, the first partition might be by day and the second-level partition might be by geographic region, such as the state.
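
A minimal sketch of what two-level partitioning looks like on disk: Hive stores each partition as nested `key=value` directories under the table's directory. The table name `weblogs`, the column names `day` and `state`, and the sample values are hypothetical; such a table would be declared with a clause like `PARTITIONED BY (day INT, state STRING)`.

```python
def partition_path(table_dir, partition_values):
    """Build the directory for one partition of a Hive table.

    `partition_values` is an ordered list of (column, value) pairs,
    outermost partition level first.
    """
    parts = [f"{column}={value}" for column, value in partition_values]
    return "/".join([table_dir.rstrip("/")] + parts)


# Hypothetical table stored under Hive's default warehouse directory.
path = partition_path(
    "/user/hive/warehouse/weblogs",
    [("day", "20110102"), ("state", "CA")],
)
print(path)  # /user/hive/warehouse/weblogs/day=20110102/state=CA
```

A query that filters on `day` alone can skip every other day's directory, and adding a filter on `state` narrows the scan to a single subdirectory.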

The primary reason to avoid normalization is to minimize disk seeks, such as those typically required to navigate foreign key relations.
Denormalizing duplicates data and raises the risk of inconsistency, but when you have tens of terabytes to many petabytes of data, optimizing speed makes these limitations worth accepting.

peicheng
