Simon Technology Blog

Tag: Hadoop

Hadoop on Linux: stand-alone and pseudo-distributed installation and configuration

Step 1: Create a hadoop user and grant it the required permissions

(1) On a fresh Linux system installed from CentOS-7-x86_64-DVD-1708.iso, if the initial user is root rather than a hadoop user, then one needs to

October 16, 2021 | By Simo | Category: Linux | Tags: configuration, Distribution, Hadoop, installation, linux, middle, Pseudo, stand-alone

Installing a Hadoop Cluster on CentOS 7

Prepare three virtual machines with the IP addresses 192.168.220.10 (master), 192.168.220.11 (slave1), and 192.168.220.12 (slave2).

Have jdk-6u45-linux-x64.bin and hadoop-1.2.1-bin.tar.gz ready, placed in the /usr/

October 14, 2021 | By Simo | Category: Centos | Tags: centos, cluster, Hadoop, install
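For the three-node layout in the post above, each machine usually has to resolve the others by name. The following is only a minimal sketch: the IP addresses come from the post, while using the labels master, slave1, and slave2 as hostnames and the helper `render_hosts_entries` are assumptions made for illustration, not steps taken from the original article.

```python
# Hypothetical helper that renders /etc/hosts-style lines for the three nodes.
# The IPs are from the post; treating the labels as hostnames is an assumption.
NODES = {
    "master": "192.168.220.10",
    "slave1": "192.168.220.11",
    "slave2": "192.168.220.12",
}


def render_hosts_entries(nodes):
    """Return one 'IP<TAB>hostname' line per node, in insertion order."""
    return "\n".join(f"{ip}\t{name}" for name, ip in nodes.items())


hosts_block = render_hosts_entries(NODES)
print(hosts_block)
```

The printed lines could then be appended to /etc/hosts on every node, if name-based addressing is what the cluster configuration expects.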

Hadoop – 'SparkContext' object has no attribute 'textfile'

I tried to load the file using the following code:

textdata = sc.textfile('hdfs://localhost:9000/file.txt')

Error message:

AttributeError: 'SparkContext' object has no attribute 'textfil

October 12, 2021 | By Simo | Category: Hadoop | Tags: Hadoop, no, object, Property, Sparkcontext, TextFile
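The error in the SparkContext question above is most likely a capitalization issue: Python attribute names are case-sensitive, and the PySpark method is spelled `textFile`. The sketch below reproduces the failure without a Spark installation by using `FakeSparkContext`, a hypothetical stand-in that only mimics the method name and does not read from HDFS.

```python
class FakeSparkContext:
    """Hypothetical stand-in for pyspark.SparkContext (method name only)."""

    def textFile(self, path):
        # A real SparkContext would return an RDD; this returns a label instead.
        return f"RDD for {path}"


sc = FakeSparkContext()

# Lowercase 'textfile' does not exist, so Python raises AttributeError.
try:
    sc.textfile("hdfs://localhost:9000/file.txt")
except AttributeError as exc:
    error = str(exc)
    print("failed:", error)

# The camelCase spelling resolves to the method.
textdata = sc.textFile("hdfs://localhost:9000/file.txt")
print(textdata)
```

With a real `SparkContext`, the same one-character change (`sc.textFile(...)`) should be enough to get past the AttributeError.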

Hadoop: does each mapper process one file or several?

Does a mapper process multiple files at the same time, or can it only process one file at a time? I want to know the default behavior.

> By default, a typical MapReduce job follows one inp

October 12, 2021 | By Simo | Category: Hadoop | Tags: each, file, Hadoop, Mapping, middle, multiple, one
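To make the mapper question above concrete: with the standard file-based input formats, a map task processes one input split, and a split does not span files, so several small files normally mean several mappers. The sketch below estimates the mapper count under stated assumptions: the split size equals a 128 MB block, all files are non-empty, and Hadoop's tolerance for a slightly oversized last split is ignored. The function name `estimate_map_tasks` is hypothetical.

```python
import math

# Assumed split size in bytes (128 MB, a common HDFS block size).
SPLIT_SIZE = 128 * 1024 * 1024
MB = 1024 * 1024


def estimate_map_tasks(file_sizes, split_size=SPLIT_SIZE):
    """Rough estimate: each non-empty file yields ceil(size / split_size) splits."""
    return sum(math.ceil(size / split_size) for size in file_sizes)


small_files = [10 * MB, 10 * MB, 10 * MB]  # three small files
one_large_file = [300 * MB]                # one file spanning several blocks

small_count = estimate_map_tasks(small_files)
large_count = estimate_map_tasks(one_large_file)
print(small_count, large_count)
```

Input formats that pack several files into one split (for example a combining input format) change this picture, which is why the estimate is only a default-case approximation.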

Hadoop – What is the difference between FirstInFirstOutPrioritizer and OldestFlowFileFirstPrioritizer?

The user guide https://nifi.apache.org/docs/nifi-docs/html/user-guide.html has the following detailed information about the prioritizers. Please help me understand these differences and provide a

October 12, 2021 | By Simo | Category: Hadoop | Tags: between, Difference, FirstInfirstoutprioritizer, Hadoop, NiFi, OldestflowFilefirstPrioritizer, what
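One way to read the two prioritizers in the question above: FirstInFirstOutPrioritizer orders FlowFiles by when they reached the connection, while OldestFlowFileFirstPrioritizer orders them by how long they have existed in the dataflow as a whole. The sketch below models that reading with plain sorting. It is not NiFi code; the `FlowFile` class and its two timestamp fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlowFile:
    name: str
    entered_flow_at: int   # hypothetical: when the FlowFile entered the dataflow
    entered_queue_at: int  # hypothetical: when it reached this connection


queue = [
    FlowFile("a", entered_flow_at=5, entered_queue_at=20),
    FlowFile("b", entered_flow_at=1, entered_queue_at=30),
    FlowFile("c", entered_flow_at=9, entered_queue_at=10),
]

# First in, first out: earliest arrival at this connection goes first.
fifo_order = [f.name for f in sorted(queue, key=lambda f: f.entered_queue_at)]

# Oldest FlowFile first: earliest entry into the dataflow goes first.
oldest_first_order = [f.name for f in sorted(queue, key=lambda f: f.entered_flow_at)]

print(fifo_order)          # ['c', 'a', 'b']
print(oldest_first_order)  # ['b', 'a', 'c']
```

The two orders differ whenever a FlowFile that has been in the flow for a long time arrives late at a particular connection, as "b" does here.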

Hadoop HDFS: Storage Model, Architecture Model, and Introduction to Roles

October 12, 2021 | By Simo | Category: Hadoop | Tags: architecture, Hadoop, HDFS, Introduction, model, role, storage

Why does spilling happen in Hadoop?

I am very new to the Hadoop system and still in the learning phase.

I noticed that a spill occurs whenever the MapOutputBuffer reaches 80% in the shuffle and sort phases (I think this can also be conf

October 12, 2021 | By Simo | Category: Hadoop | Tags: Hadoop, spilling, why, will occur
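The 80% figure in the question above corresponds to a fill threshold on the map-side sort buffer: once the buffer passes it, a background spill to disk starts so the mapper can keep writing. A small arithmetic sketch, assuming the commonly documented defaults of a 100 MB buffer (`mapreduce.task.io.sort.mb`) and a 0.80 threshold (`mapreduce.map.sort.spill.percent`); the function name `spill_threshold_bytes` is hypothetical.

```python
def spill_threshold_bytes(sort_buffer_mb=100, spill_percent=0.80):
    """Bytes of buffered map output at which a spill would begin (assumed defaults)."""
    return round(sort_buffer_mb * 1024 * 1024 * spill_percent)


default_threshold = spill_threshold_bytes()
larger_buffer = spill_threshold_bytes(sort_buffer_mb=200)

print(default_threshold)  # 83886080 bytes, i.e. 80 MB of a 100 MB buffer
print(larger_buffer)      # 167772160 bytes, i.e. 160 MB of a 200 MB buffer
```

Under these assumptions, a map task that emits more than roughly 80 MB of output spills at least once, which is why raising the buffer size is a common first step when spills are frequent.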

Hadoop: using Java to set FsPermission on a directory recursively

Hi, I have a test program that loads a file to HDFS at the path user/user1/data/app/type/file.gz. This test program is run multiple times by multiple users, so I want to set the file permissions t

October 12, 2021 | By Simo | Category: Hadoop | Tags: DIR, FSPERMISSION, Hadoop, Java, Recurns, setting, way

Hadoop – How to make S3DistCp preserve line breaks

I have millions of small one-line S3 files that I want to merge together. I have the s3distcp syntax, but I found that after merging the files, the merged set does not contain newline characters.

October 12, 2021 | By Simo | Category: Hadoop | Tags: DISTCP, Hadoop, How, Merge, newline, S3DistCP
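A plausible explanation for the merge question above is that the one-line source files have no trailing newline, so plain byte concatenation runs the records together. The sketch below shows that effect on in-memory byte strings and one way to guard against it. It does not call S3DistCp; the sample records and the helper `merge_lines` are hypothetical.

```python
def merge_lines(chunks):
    # Concatenate byte chunks, adding a newline to any chunk that lacks one.
    return b"".join(c if c.endswith(b"\n") else c + b"\n" for c in chunks)


# Hypothetical one-line files; only the second ends with a newline.
files = [b"record-1", b"record-2\n", b"record-3"]

naive = b"".join(files)      # what a plain concatenation produces
merged = merge_lines(files)  # newline-terminated records

print(naive)   # b'record-1record-2\nrecord-3'
print(merged)  # b'record-1\nrecord-2\nrecord-3\n'
```

If this is the cause, making sure each source file ends with a newline before merging, or post-processing the merged output, would be the kind of fix to look for.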

Hadoop – Hive query stuck at 99%

I use a left join to insert records in Hive. The query works when I set limit 1, but when I query all records, the reduce job stays at 99%.

Insert overwrite table tablename select a.id, b.name from a le

October 12, 2021 | By Simo | Category: Hadoop | Tags: 99%, Hadoop, hive, inquiry, stay
