Simon Technology Blog

Tag: Hadoop

Hadoop – Apache Hive regexp_extract UDF

I came across a piece of code in Apache Hive, regexp_extract(input,'[0-9]*',0). Can someone explain what this code does? Thank you. Starting from the Hive manual DDL, it returns the
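To make the question above concrete, here is a minimal Python sketch that approximates how Hive's regexp_extract(subject, pattern, index) behaves: it finds the first match and returns the requested group, or an empty string when nothing matches. It assumes the intended pattern is '[0-9]*', and it uses Python's re module as a stand-in for the Java regex engine that Hive actually uses, so edge cases may differ.

```python
import re


def regexp_extract(subject, pattern, index):
    """Approximate Hive's regexp_extract: return group `index` of the first match."""
    # Hive returns NULL when the subject or the pattern is NULL.
    if subject is None or pattern is None:
        return None
    match = re.search(pattern, subject)
    # Hive returns an empty string when the pattern does not match at all.
    if match is None:
        return ""
    # Index 0 is the whole match; 1, 2, ... are the capture groups.
    return match.group(index)


if __name__ == "__main__":
    print(repr(regexp_extract("123abc", "[0-9]*", 0)))  # '123'
    # '*' allows zero digits, so the first match is the empty string at position 0.
    print(repr(regexp_extract("abc123", "[0-9]*", 0)))  # ''
    # '+' requires at least one digit, so the search moves on to '123'.
    print(repr(regexp_extract("abc123", "[0-9]+", 0)))  # '123'
    print(repr(regexp_extract("foothebar", "foo(.*?)(bar)", 2)))  # 'bar'
```

In other words, with index 0 the call returns the leading run of digits in the input, and an empty string when the input does not start with a digit.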

October 12, 2021 | By Simo | Hadoop | apache, extract, Hadoop, hive, regexp, UDF

MapReduce on Hadoop reports “output file already exists”

I ran a WordCount example with MapReduce for the first time, and it worked. Then I stopped the cluster, restarted it a little later, and followed the same steps. This error was displayed:

10P:/$
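MapReduce's FileOutputFormat refuses to write into an output directory that already exists, so rerunning the same job without removing the earlier output is a common cause of this message. The sketch below builds the HDFS shell command that deletes a previous output directory before a rerun. The function names and the example path are hypothetical and not taken from the original post; the command is only executed when an hdfs client is found on PATH.

```python
import shutil
import subprocess
from typing import List


def build_cleanup_command(output_dir: str) -> List[str]:
    """Return the HDFS shell command that recursively deletes output_dir."""
    # -r recurses into the directory; -f suppresses the error if it is already gone.
    return ["hdfs", "dfs", "-rm", "-r", "-f", output_dir]


def clean_output(output_dir: str) -> bool:
    """Delete output_dir on HDFS; return False when no hdfs client is installed."""
    if shutil.which("hdfs") is None:
        return False
    subprocess.run(build_cleanup_command(output_dir), check=True)
    return True


if __name__ == "__main__":
    # Hypothetical output path, used only for illustration.
    print(" ".join(build_cleanup_command("/user/hadoop/wordcount/output")))
```

An alternative that avoids deleting anything is to give each run a fresh output directory name.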

October 12, 2021 | By Simo | Hadoop | already existed, file, Hadoop, MapReduce, Output, said

hadoop fs -cp says the file does not exist?

The file new.txt definitely exists; I don't know why, when I try to put it into an HDFS directory, it says the file does not exist.

deepak@deepak:/$ cd $HOME/fs
deepak@deepak:~/fs$ ls
new.txt
deepak@deepak:

October 12, 2021 | By Simo | Hadoop | cp, existence, file, fs, Hadoop, say

Hadoop sequential data access

According to Hadoop: The Definitive Guide:

HDFS is a filesystem designed for storing very large files with
streaming or sequential data access patterns

What is streaming or sequenti
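As a rough illustration of the access pattern named in the quote, the following Python sketch reads a local file front to back in fixed-size chunks (streaming, or sequential, access) and contrasts it with jumping to an arbitrary offset (random access). It is a local-filesystem analogy only, not the HDFS API; the helper names, the tiny chunk size and the demo file contents are made up for the example.

```python
import os
import tempfile

CHUNK_SIZE = 4  # deliberately tiny so the chunks are easy to inspect


def read_sequentially(path, chunk_size=CHUNK_SIZE):
    """Yield the file's bytes from start to end in chunks, never seeking."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


def read_at(path, offset, length):
    """Random access, for contrast: jump to an offset and read a few bytes."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def make_demo_file():
    """Create a small temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"0123456789")
    return path


if __name__ == "__main__":
    demo = make_demo_file()
    print(list(read_sequentially(demo)))  # [b'0123', b'4567', b'89']
    print(read_at(demo, 6, 2))            # b'67'
    os.remove(demo)
```

A workload that scans whole files from beginning to end, as in the first function, is the kind of pattern the quoted passage describes.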

October 12, 2021 | By Simo | Hadoop | Access, data, Hadoop, order

Hadoop pseudo-distributed mode

This post skips virtual machine creation and basic Linux configuration and records only the key configuration for building a pseudo-distributed Hadoop cluster on a single node.

Get the hadoop

October 12, 2021 | By Simo | Hadoop | distributed, Hadoop, Pseudo, set

Deploying Hadoop 3 on a Mac (pseudo-distributed)

Environment information
Operating system: macOS Mojave 10.14.6
JDK: 1.8.0_211 (installation location: /Library/Java/JavaVirtualMachines/jdk1.8.0_211.jdk/Contents/Home)
Hadoop: 3.2.1

In “S

September 24, 2021 | By Simo | Macbook | deployment, distributed, Hadoop, Hadoop3, Mac, Pseudo

(Big release) The fastest way to run fully distributed Hadoop

1. Prepare the virtual machines

Clone 3 Linux virtual machines, each with only a CentOS minimal installation.

Network allocation table

Host name | IP address
hadoop1 | <

August 22, 2021 | By Simo | Hadoop | complete, distributed, fastest, Hadoop, Heavy pound, run

9. Hadoop HDFS Overview

1. Background and definition of HDFS

Background

As the amount of data grows ever larger, a single system can no longer store all of it, so you need to all

August 22, 2021 | By Simo | Hadoop | Hadoop, HDFS, overview

6. Hadoop operating mode (fully distributed) (Part 1)

Note: Fully distributed mode is what is used in real production and development.

1) Prepare 3 client machines (disable the firewall, set a static IP, set the host name)

2) Install JDK

3) Configure environment variables

August 22, 2021 | By Simo | Hadoop | complete, distributed, Hadoop, mode, run
