Hive Basics

Hadoop Hive

1) Hive was born in 2007.

2) 2014: Hive 0.13.0 became very popular (the first relatively stable release).

3) 2015: Hive 1.2.0 (a relatively minor upgrade).

4) 2016: Hive 2.1.0 (many functions updated).

1.1 Hive metadata management

1) Hive models and processes data through metadata, presenting it in table form so that it can be managed as a data warehouse.

2) Hive provides a structure similar to a relational database, but Hive itself is not a storage system: its metadata is stored in a relational database, not on Hadoop.

3) MySQL provides the metadata storage for Hive.

1.2 Hive’s architecture

Hive is designed for processing very large volumes of data.

Metastore: a relational database (holds the metadata)

Driver: the core driver

Server: accepts JDBC connections

1.3 Commonly used commands:

1) Beeline mode

2) Command line mode

describe formatted table name (shows detailed table metadata)
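As a rough illustration, here are a few statements that can be typed in either mode. This sketch assumes that "formatted" above refers to DESCRIBE FORMATTED; the table name some_table is hypothetical.

```sql
-- List the databases and the tables of the current database.
SHOW DATABASES;
SHOW TABLES;

-- Show columns plus detailed metadata (location, table type, storage format).
-- some_table is a placeholder name, not taken from the original notes.
DESCRIBE FORMATTED some_table;
```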

1.4 Data types

string (general purpose), decimal (numeric calculation)

Complex data types

array: ['apple', 'orange', 'mango']

map (mapping): ['a':'apple', 'o':'orange'], key-value pairs

For example, a map can record how many points each skill has reached.

struct: can contain multiple fields (columns)
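The types above could be declared and read roughly as follows. This is only a sketch: the table employee_demo and its column names are invented for illustration.

```sql
-- Hypothetical table mixing simple and complex types.
CREATE TABLE IF NOT EXISTS employee_demo (
  name    STRING,
  salary  DECIMAL(10,2),
  fruits  ARRAY<STRING>,                    -- e.g. apple, orange, mango
  skills  MAP<STRING, INT>,                 -- skill name -> points reached
  address STRUCT<city:STRING, zip:STRING>   -- several named fields
);

-- Arrays are read by position, maps by key, structs by field name.
SELECT
  name,
  fruits[0]     AS first_fruit,
  skills['sql'] AS sql_points,
  address.city  AS city
FROM employee_demo;

-- Literal values can be built with constructor functions.
SELECT
  array('apple', 'orange', 'mango'),
  map('a', 'apple', 'o', 'orange'),
  named_struct('city', 'Paris', 'zip', '75001');
```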

1.5 Metadata table structure

database: corresponds to a subfolder

table: corresponds to a subfolder

partition: corresponds to a folder; it divides the table's data so that a query can read only the folders it needs

bucket: corresponds to a data file (a portion of the data, separated at the file level); bucketing determines how the data is distributed and optimizes queries and joins

row: corresponds to one line of a data file (the file viewed horizontally)

view: a saved query that presents an image of the data; no data is stored

index: corresponds to folders and files
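A minimal sketch of how partitions, buckets and views appear in table definitions. All table, column and view names here are hypothetical, and the bucket count of 4 is an arbitrary choice.

```sql
-- Each distinct dt value gets its own folder under the table folder,
-- for example .../page_views_demo/dt=2016-06-01/
-- Inside each partition, rows are distributed into 4 bucket files
-- according to a hash of user_id.
CREATE TABLE IF NOT EXISTS page_views_demo (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (user_id) INTO 4 BUCKETS;

-- A view stores only the query definition, never the data.
CREATE VIEW IF NOT EXISTS page_views_one_day_demo AS
SELECT user_id, url
FROM page_views_demo
WHERE dt = '2016-06-01';
```

Filtering on the partition column (dt here) lets Hive skip the folders of the other partitions.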

1.6 Hive database

Create a database: create database database_name

Switch database: use database_name

Default path: /user/hive/warehouse
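For example, with a hypothetical database called demo_db and an unchanged default configuration:

```sql
-- Creates the folder /user/hive/warehouse/demo_db.db when the default
-- warehouse path is in effect (demo_db is an illustrative name).
CREATE DATABASE IF NOT EXISTS demo_db;

-- Make it the current database for the statements that follow.
USE demo_db;

-- Show the database's location and other properties.
DESCRIBE DATABASE demo_db;
```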

1.7 Hive tables (external and internal tables)

1) External tables: created with the external keyword, adding location 'address' to the create statement. Deleting the table does not delete the data.

2) Internal tables (managed tables): the data is completely managed by Hive. Deleting the table (metadata) also deletes the data.
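The difference between the two kinds of table can be sketched as follows. The table names and the path /data/demo/logs are hypothetical.

```sql
-- External table: Hive records only the metadata and reads the files
-- that already live at the given location.
CREATE EXTERNAL TABLE IF NOT EXISTS logs_ext_demo (line STRING)
LOCATION '/data/demo/logs';

-- Internal (managed) table: the files live under the warehouse
-- directory and belong to Hive.
CREATE TABLE IF NOT EXISTS logs_managed_demo (line STRING);

-- Dropping the external table removes the metadata only;
-- the files under /data/demo/logs stay in place.
DROP TABLE IF EXISTS logs_ext_demo;

-- Dropping the managed table removes the metadata and the data files.
DROP TABLE IF EXISTS logs_managed_demo;
```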

Hive table creation statement

Required: row format delimited (delimited means the row is split by delimiters)

Column separator: fields terminated by '|'

Array element separator: collection items terminated by ','

Map key-value separator: map keys terminated by ':'
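Put together, the delimiter clauses might be used like this. It is a sketch only: the table and column names are made up, and the sample line is invented to match the delimiters.

```sql
-- Hypothetical table stored as delimited text.
CREATE TABLE IF NOT EXISTS employee_delimited_demo (
  name   STRING,
  fruits ARRAY<STRING>,
  skills MAP<STRING, INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'            -- separates columns
  COLLECTION ITEMS TERMINATED BY ','  -- separates array elements and map entries
  MAP KEYS TERMINATED BY ':'          -- separates a key from its value
STORED AS TEXTFILE;

-- A line of input that matches this layout:
--   alice|apple,orange|sql:80,java:60
```

The collection-items delimiter also separates map entries, so a single character serves both the array column and the map column.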