Recently, for work, I needed to use Hue to schedule tasks and Sqoop to import data. The import script is as follows:
sqoop import --connect jdbc:oracle:thin:@ip:port/db --username user --password pwd --query "select col1,col2 from db.table where \$CONDITIONS" --target-dir /user/kjxydata/src/LT_READER_${date_time} --delete-target-dir -m 1 --null-string '\\N' --null-non-string '\\N' --as-textfile --fields-terminated-by "\t" --hive-drop-import-delims
However, running it in Hue produced a runtime error.
There are several pitfalls when writing Sqoop import statements in Hue:
- 1. Do not put the leading sqoop in the command box; start directly from import.
- 2. Using --query in the command box is a problem. Hue calls Oozie, and when Oozie parses the command it splits the SQL after --query into multiple arguments instead of treating it as a single argument, so the command cannot be parsed at runtime.
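The second pitfall can be reproduced outside Hue. The following Python sketch is only an illustration: it assumes the command box text is broken up on spaces, and the variable names are made up for the example. It shows that the value following --query ends up as a fragment of the SQL rather than the whole statement.

```python
# Illustration only: mimic a command string being split on every space.
# The connection string is the placeholder used in the post.
command = (
    'import --connect jdbc:oracle:thin:@ip:port/db '
    '--query "select col1,col2 from db.table where $CONDITIONS" -m 1'
)

# Naive split: quotes are not honoured, so the SQL is cut into pieces.
naive_args = command.split(" ")
query_value = naive_args[naive_args.index("--query") + 1]

# What Sqoop actually needs: the whole SQL statement as one argument.
intended_args = [
    "import",
    "--connect", "jdbc:oracle:thin:@ip:port/db",
    "--query", "select col1,col2 from db.table where $CONDITIONS",
    "-m", "1",
]

print("value seen after --query:", query_value)
print("argument count, naive vs intended:", len(naive_args), len(intended_args))
```

With the naive split, the token after --query is just the opening quote plus the word select, which is why the statement has to be passed some other way.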
For the second problem, there are currently two solutions:
- 1. Use ssh in Hue to run the script directly.
- 2. Leave the command box empty and enter the command in the parameter boxes instead.
To keep all my Sqoop jobs in the same form, I personally use the second method. The specific setup is shown in the figure; note that the entire query statement is written in a single arg.
One more point about select col1,col2 from db.table where \$CONDITIONS: if you use --query in Sqoop, the statement must include where $CONDITIONS. In a shell script, remember to escape the dollar sign with a backslash (\$CONDITIONS), but do not add the backslash in the parameter box.
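The rules above (no leading sqoop, one arg per token, the whole query in one arg, no backslash before $CONDITIONS) can be sketched as a small Python helper. script_to_arg_rows is a hypothetical name for illustration, not part of Hue, Oozie, or Sqoop, and the command is a shortened version of the one in this post.

```python
import shlex

def script_to_arg_rows(script_line):
    """Turn a shell-style sqoop command into one value per parameter row.

    Hypothetical helper: it only illustrates the rules described in the post.
    """
    # shlex honours quotes, so the quoted SQL stays in one token.
    tokens = shlex.split(script_line)
    # Rule 1: the leading "sqoop" is not entered in Hue.
    if tokens and tokens[0] == "sqoop":
        tokens = tokens[1:]
    # Rule 3: the shell escape before $CONDITIONS is dropped in the parameter box.
    return [t.replace("\\$CONDITIONS", "$CONDITIONS") for t in tokens]

script_line = (
    r'sqoop import --connect jdbc:oracle:thin:@ip:port/db '
    r'--query "select col1,col2 from db.table where \$CONDITIONS" -m 1'
)

rows = script_to_arg_rows(script_line)
for row in rows:
    print(row)
```

Each printed line corresponds to one parameter row, and the SQL statement occupies exactly one of them.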
For details, please refer to this Cloudera question; the part after the XML explains it more clearly.