Hadoop Sqoop (examples), Day 1
Sqoop is an open-source tool mainly used to transfer data between Hadoop and traditional relational databases (e.g. MySQL): it can import data from a relational database into Hadoop's HDFS, or export HDFS data back into a relational database.
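The export direction (HDFS back into MySQL) is not shown in the examples below, so here is a minimal sketch. It assumes a `customers` table already exists in the `test` database with columns matching the records, and that the HDFS files are comma-delimited; adjust names and delimiter to your setup.

```shell
#!/bin/bash
# Sketch: export HDFS files into an existing MySQL table.
# The target table must already exist; Sqoop does not create it.
sqoop export \
  --connect jdbc:mysql://localhost:3306/test \
  --table customers \
  --username root \
  --password ok \
  --export-dir /data/user/customers \
  --input-fields-terminated-by ','
```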
Example 1: (import a MySQL table into HDFS)
Create a script: vim sqoop-customer.sh
#!/bin/bash
sqoop import \
  --connect jdbc:mysql://localhost:3306/<database-name> \
  --table customers \
  --username root \
  --password ok \
  --target-dir /data/user
# --table: customers is the MySQL table name
# --target-dir: HDFS directory the imported data is written to
Example 2: (filter rows into HDFS with a where clause)
#!/bin/bash
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --table customers \
  --where "customer_id < 10" \
  --username root \
  --password ok \
  --delete-target-dir \
  --target-dir /data/user/customers
# --delete-target-dir: delete the target directory first if it already exists
Example 3: (select specific columns into HDFS with --columns)
#!/bin/bash
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --table customers \
  --columns "customer_id,customer_fname,customer_lname" \
  --username root \
  --password ok \
  --delete-target-dir \
  --target-dir /data/user/customers
Example 4: (import the result of a free-form query with --query)
#!/bin/bash
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --query 'select * from customers where customer_id = 1 AND $CONDITIONS' \
  --username root \
  --password ok \
  --split-by customer_id \
  --delete-target-dir \
  --target-dir /data/user/customers
# --query must contain the literal token $CONDITIONS, which Sqoop replaces
# with a split predicate at runtime; --split-by is required with --query
# when running more than one mapper, and --target-dir is mandatory.
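The quoting around `$CONDITIONS` matters: Sqoop substitutes that literal token at runtime, so the query string must reach Sqoop with the token intact — in bash that means single quotes, or escaping the dollar sign inside double quotes. A quick shell-only illustration (no Sqoop involved):

```shell
# With double quotes the shell expands $CONDITIONS (normally unset) to an
# empty string before Sqoop ever sees the query; with single quotes the
# literal token survives for Sqoop to substitute.
double_quoted="select * from customers where customer_id = 1 AND $CONDITIONS"
single_quoted='select * from customers where customer_id = 1 AND $CONDITIONS'
echo "$double_quoted"   # the AND clause has been emptied out
echo "$single_quoted"   # $CONDITIONS is preserved
```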
Example 5: (incrementally append new data)
#!/bin/bash
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --table customers \
  --username root \
  --password ok \
  --where 'customer_id < 100' \
  --incremental append \
  --check-column customer_id \
  --last-value 10 \
  --target-dir /data/user/customers
# --incremental append: import only rows whose --check-column value is
# greater than --last-value (here: customer_id > 10)
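To see what the incremental options select, here is a toy illustration in plain shell and awk (no Sqoop): only rows whose check-column value exceeds the saved last-value are picked up on the next run. The file path and id values are made up for the demo.

```shell
# Simulate a check column: ids up to last-value 10 were imported earlier;
# three new rows (11, 12, 15) have since been added in MySQL.
printf '%s\n' 5 8 10 11 12 15 > /tmp/customer_ids.txt
last_value=10
# Append mode keeps only rows with check-column > last-value:
awk -v lv="$last_value" '$1 > lv' /tmp/customer_ids.txt
# prints 11, 12 and 15, one per line
```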