Category: Hadoop

Hadoop is a distributed systems infrastructure developed by the Apache Software Foundation. Users can develop distributed programs without understanding the underlying details of the distributed system, and can make full use of a cluster's power for high-speed computation and storage. Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault tolerant and is designed to be deployed on low-cost hardware; it provides high-throughput access to application data and is suitable for applications with large data sets. HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream. The core of the Hadoop framework is HDFS and MapReduce: HDFS provides storage for massive amounts of data, while MapReduce provides computation over that data.
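The MapReduce model mentioned above can be sketched in a few lines of plain Python. This is only an in-memory illustration of the map, shuffle, and reduce steps, not Hadoop's actual API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["HDFS stores data", "MapReduce processes data"]))
```

In a real cluster the input lines would come from blocks stored in HDFS, and the map and reduce steps would run in parallel on different nodes; here everything runs in one process to keep the idea visible.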

Hive Basics

Hadoop Hive:

1) Hive was born in 2007.

2) 2014: Hive 0.13.0 became very popular (the first relatively stable release).

3) 2015: Hive 1.2.0 (essentially just an upgrade).

4) 2016: Hive 2.1.

October 12, 2021, by Simo, in Hadoop. Tags: foundation, hive, Knowledge

A C Client Program Simulating the Telnet Protocol

First of all, we must understand the Telnet protocol. The following two blogs gave me some initial ideas.
https://www.cnblogs.com/liang-ling/p/5833489.html This one is a relatively basic introduction.

October 12, 2021, by Simo, in Hadoop. Tags: Agreement, C language, client, program, simulation, telnet

Separator problem with Sqoop imports and exports

Use the following Sqoop 1 command when importing:

sqoop import --connect jdbc:oracle:thin:@ip:port/ORCL --username user --password pwd --table db.table --target-dir /path --delete-target-dir -m

October 12, 2021, by Simo, in Hadoop. Tags: export, import, problem, separator, Sqoop, time, utilization

HBase (2) – Build Standalone HBase

Building HBase: Standalone HBase. 1. Description of the build method: the setup of a single-node standalone HBase. A standalone instance has all
HBase daemons (the Master, RegionServers, and ZooKeeper) runni

October 12, 2021, by Simo, in Hadoop. Tags: built, HBase, Second, Standalone

Event – How do I debug my event observer in Magento?

My observer is not being called. I want to know how the event is dispatched so that I can debug it. This is a breakdown of Magento's Mage_Core_Model_App::dispatchEvent() call, which

October 12, 2021, by Simo, in Hadoop. Tags: China, debugging, Event, How, Magento, my, observation

Big Data Query Tool Presto, 10 Times Faster than Hive

At present, the most popular big data query engine is Hive. It is a SQL-like query tool built on MapReduce. It translates the input SQL query into MapReduce jobs, which can greatly reduce the threshold for using

October 12, 2021, by Simo, in Hadoop. Tags: big, cutlery, data, deployment, Double, FAST, hive, Presto, query

hadoop fs -cp says the file does not exist?

The new.txt file definitely exists; I don't know why, when I try to copy it into the HDFS directory, it says the file does not exist.

deepak@deepak:/$ cd $HOME/fs
deepak@deepak:~/fs$ ls
new.txt
deepak@deepak:

October 12, 2021, by Simo, in Hadoop. Tags: cp, existence, file, fs, Hadoop, say

Hadoop sequential data access

According to Hadoop: The Definitive Guide:

HDFS is a filesystem designed for storing very large files with
streaming or sequential data access patterns
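As a rough illustration of a sequential (streaming) access pattern, the sketch below reads a file front to back in fixed-size chunks without ever seeking backwards. It uses an ordinary local file rather than HDFS, and the helper names (`stream_chunks`, `count_bytes`) and the 64 KiB chunk size are assumptions made for this example only.

```python
import os
import tempfile


def stream_chunks(path, chunk_size=64 * 1024):
    """Yield the file's contents in order, one fixed-size chunk at a time."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # always reads the next bytes, never seeks
            if not chunk:
                break
            yield chunk


def count_bytes(path, chunk_size=64 * 1024):
    """Consume the whole stream once and return the total number of bytes."""
    return sum(len(chunk) for chunk in stream_chunks(path, chunk_size))


if __name__ == "__main__":
    # Create a throwaway local file to read back sequentially.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 150_000)
    try:
        print(count_bytes(tmp.name))
    finally:
        os.remove(tmp.name)
```

The point of the pattern is that the whole data set is read once from start to end, so throughput matters more than the latency of reaching any single record.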

What is streaming or sequenti

October 12, 2021, by Simo, in Hadoop. Tags: Access, data, Hadoop, order

Hadoop pseudo-distributed mode

Creating the virtual machine and basic Linux configuration are skipped here; this post records the key configuration for building a pseudo-distributed Hadoop cluster on a single node.

Get the hadoop

October 12, 2021, by Simo, in Hadoop. Tags: distributed, Hadoop, Pseudo, set

Several ways to import data into HBase

This post introduces two ways to import data: one is based on Hive, and the other generates HFiles from plain files.

This method requires a supporting jar package:
Download link:

October 12, 2021, by Simo, in Hadoop. Tags: data, Guide, HBase, methods, several