Simon Technology Blog

Category: Hadoop

Hadoop is a distributed system infrastructure developed by the Apache Software Foundation. It lets users write distributed programs without understanding the underlying details of the distributed system, and make full use of a cluster for high-speed computing and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which suits applications with large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream. The core of the Hadoop framework is HDFS plus MapReduce: HDFS provides storage for massive amounts of data, while MapReduce provides computation over that data.
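To make the MapReduce half of that description concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps for a word count. It only illustrates the programming model and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "hadoop processes data"]))
```

In a real cluster the map and reduce steps run on different machines and the shuffle moves data between them over the network; the sketch keeps everything in memory to show only the data flow.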

Hive data type

Basic types

Type name    Size       Minimum value    Maximum value    Example
TINYINT      1 byte     -128             127              100Y
SMALLINT     2 bytes
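The minimum and maximum values in the table follow from the storage size. As a small sketch, assuming two's-complement signed integers, the range for a given byte width can be computed like this (signed_range and HIVE_INT_SIZES are names invented for the example, and the sizes are the two given in the excerpt):

```python
from typing import Dict, Tuple


def signed_range(size_bytes: int) -> Tuple[int, int]:
    """Return (min, max) for a two's-complement signed integer of this width."""
    if size_bytes < 1:
        raise ValueError("size_bytes must be at least 1")
    bits = 8 * size_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Sizes as listed in the excerpt above.
HIVE_INT_SIZES: Dict[str, int] = {"TINYINT": 1, "SMALLINT": 2}

if __name__ == "__main__":
    for name, size in HIVE_INT_SIZES.items():
        low, high = signed_range(size)
        print(f"{name}: {size} byte(s), {low} to {high}")
```

The Y suffix in the example value 100Y is how Hive marks an integer literal as TINYINT.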

October 12, 2021 | By Simo | Hadoop | data, hive, type

Hadoop – How to extract the first tuple from a bag generated in Pig (whose size may vary)?

I’m generating a “bag” whose size (the number of tuples in the bag) may vary. From it, I want to dynamically extract the first element. How should I do it? Accordi
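As a rough illustration of the question (not Pig code), a bag can be modeled in Python as a collection of tuples. Because a Pig bag is unordered, "first" is only well defined once an ordering is chosen, so this sketch takes an optional sort key and also handles the empty bag. The function name first_tuple is hypothetical.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def first_tuple(
    bag: Iterable[Tuple[Any, ...]],
    sort_key: Optional[Callable[[Tuple[Any, ...]], Any]] = None,
) -> Optional[Tuple[Any, ...]]:
    """Return the first tuple of a bag of any size, or None if it is empty.

    If sort_key is given, the bag is ordered by it first, so the result
    does not depend on the arbitrary order in which tuples arrive.
    """
    items = sorted(bag, key=sort_key) if sort_key is not None else bag
    return next(iter(items), None)


if __name__ == "__main__":
    bag = [(3, "c"), (1, "a"), (2, "b")]
    print(first_tuple(bag, sort_key=lambda t: t[0]))
    print(first_tuple([]))
```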

October 12, 2021 | By Simo | Hadoop | Different, Group, Medium, Of a group, possible, Size

Hadoop-HDFS-Storage Model – Architecture Model – Role Introduction

October 12, 2021 | By Simo | Hadoop | architecture, Hadoop, HDFS, Introduction, model, role, storage

How Couchbase achieves strong consistency

I searched for an explanation of how Couchbase achieves strong consistency within the cluster. Is all of this the result of using membase? Couchbase IS membase, btw.

October 12, 2021 | By Simo | Hadoop | Consistency, Couchbase, How, implement, powerful

Hive Data Warehouse Client GUI Tools

1. Hive’s official website introduces three graphical tools that can connect to HiveServer2 through JDBC on Windows: SQuirreL SQL Client, Oracle SQL Developer, and DbVisualizer.

October 12, 2021 | By Simo | Hadoop | client, hive, Interface, Number, tool, warehouse

Why does spilling happen in Hadoop?

I am very new to Hadoop and still in the learning phase.

I noticed that a spill occurs whenever the MapOutputBuffer reaches 80% of its capacity in the shuffle and sort phase (I think this can also be conf
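The 80% figure can be turned into a byte count. The sketch below assumes the buffer size and threshold come from the MapReduce properties mapreduce.task.io.sort.mb (default 100) and mapreduce.map.sort.spill.percent (default 0.80). It is a simplification: the real map output buffer also tracks record metadata, which this ignores, and the function names are invented for the example.

```python
def spill_threshold_bytes(sort_mb: int = 100, spill_percent: float = 0.80) -> int:
    """Bytes of buffered map output at which a spill to disk would start.

    sort_mb       -- assumed to mirror mapreduce.task.io.sort.mb
    spill_percent -- assumed to mirror mapreduce.map.sort.spill.percent
    """
    if sort_mb <= 0:
        raise ValueError("sort_mb must be positive")
    if not 0.0 < spill_percent <= 1.0:
        raise ValueError("spill_percent must be in (0, 1]")
    return round(sort_mb * 1024 * 1024 * spill_percent)


def should_spill(used_bytes: int, sort_mb: int = 100, spill_percent: float = 0.80) -> bool:
    """True once the buffered bytes reach the spill threshold."""
    return used_bytes >= spill_threshold_bytes(sort_mb, spill_percent)


if __name__ == "__main__":
    print(spill_threshold_bytes())         # threshold with the default settings
    print(should_spill(90 * 1024 * 1024))  # 90 MB buffered in a 100 MB buffer
```

Spilling before the buffer is completely full lets the map task keep writing output into the remaining space while a background thread sorts and writes the spilled portion.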

October 12, 2021 | By Simo | Hadoop | Hadoop, spilling, why, will occur

Hadoop – Excluding partition fields from a SELECT query in Hive

Suppose I have the following table definition in Hive (the actual table has about 65 columns):

CREATE EXTERNAL TABLE S .TEST (
COL1 STRING,
COL2 STRING
)
PARTITIONED BY (extract_date STRING
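One way to approach this, sketched here under assumptions, is to generate the SELECT statement from the column list and leave out the partition columns. The helper name select_without_partitions and the table name my_db.test are hypothetical; the column names are the ones shown in the excerpt.

```python
from typing import Iterable, List


def select_without_partitions(
    table: str, columns: Iterable[str], partition_columns: Iterable[str]
) -> str:
    """Build a SELECT that lists every column except the partition columns."""
    excluded = {name.lower() for name in partition_columns}
    kept: List[str] = [name for name in columns if name.lower() not in excluded]
    if not kept:
        raise ValueError("no columns left after excluding partition columns")
    column_list = ", ".join(f"`{name}`" for name in kept)
    return f"SELECT {column_list} FROM {table}"


if __name__ == "__main__":
    query = select_without_partitions(
        "my_db.test",                      # hypothetical table name
        ["COL1", "COL2", "extract_date"],  # columns from the excerpt
        ["extract_date"],
    )
    print(query)
```

Generating the list is practical when a table has dozens of columns. Hive also documents a regex column specification, such as SELECT `(extract_date)?+.+` FROM a table, which in recent versions requires hive.support.quoted.identifiers to be set to none; check the behavior on your Hive version before relying on it.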

October 12, 2021 | By Simo | Hadoop | fields, Subza

Hadoop – Using Java to set FsPermission on a directory recursively

Hi, I have a test program that loads a file to HDFS at the path user/user1/data/app/type/file.gz. This test program is run multiple times by multiple users, so I want to set the file permissions t

October 12, 2021 | By Simo | Hadoop | DIR, FSPERMISSION, Hadoop, Java, Recurns, setting, way

Common Hive statements

1. Hive introduction: As I understand it, Hive is a data warehouse built on HDFS that provides a SQL-like language (basically the same as standard SQL, with some differences). It allows engi

October 12, 2021 | By Simo | Hadoop | common, hive, statement

Hadoop – How to make S3DistCp preserve newlines

I have millions of small one-line S3 files that I want to merge together. I have the s3distcp syntax, but I found that after merging the files, the merged output does not contain newline characters.
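The symptom is what happens when one-line files without a trailing newline are concatenated byte for byte. The following local sketch (plain Python, not S3DistCp itself, and with a made-up function name) shows the difference between naive concatenation and a merge that terminates every record with a newline:

```python
from typing import Iterable


def merge_records(contents: Iterable[str]) -> str:
    """Concatenate file contents, making sure each one ends with a newline."""
    merged = []
    for text in contents:
        if not text:
            continue  # skip empty files so they do not add blank lines
        merged.append(text if text.endswith("\n") else text + "\n")
    return "".join(merged)


if __name__ == "__main__":
    files = ["record one", "record two\n", "record three"]
    print(repr("".join(files)))         # naive merge: records run together
    print(repr(merge_records(files)))   # one record per line
```

If the source files can be written with a trailing newline in the first place, a plain concatenation gives the same one-record-per-line result.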

October 12, 2021 | By Simo | Hadoop | DISTCP, Hadoop, How, Merge, newline, S3DistCP

© Simon Technology Blog 2025