Hive query performance is poor

I am joining 3 large Hive tables (billion-row tables). I have collected all the statistics, but performance is still poor (the query takes 40 minutes).

Are there any parameters I can set at the Hive prompt for better performance?

When I run the query, the output looks like this:

Sep 4, 2015 7:40:23 AM INFO: parquet.hadoop.ParquetInputFormat: Total input paths to process: 1
Sep 4, 2015 7:40:23 AM INFO: parquet.hadoop.ParquetFileReader: reading another 1 footers

All tables were created in BigSQL with the storage clause "STORED AS PARQUETFILE".

Also, how can I suppress the job progress details when running a Hive query?

About the Hive version:

hive> set system:sun.java.command;
system:sun.java.command=org.apache.hadoop.util.RunJar /opt/ibm/biginsights/hive/lib/hive-cli-0.12.0.jar org.apache.hadoop.hive.cli.CliDriver -hiveconf hive.aux.jars.path=file:///opt/ibm/biginsights/hive/lib/hive-hbase-handler-0.12.0.jar,file:///opt/ibm/biginsights/hive/lib/hive-contrib-0.12.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-client-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-common-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-hadoop2-compat-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-prefix-tree-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-protocol-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/hbase-server-0.96.0.jar,file:///opt/ibm/biginsights/hive/lib/htrace-core-2.01.jar,file:///opt/ibm/biginsights/hive/lib/zookeeper-3.4.5.jar,file:///opt/ibm/biginsights/sheets/libext/piggybank.jar,file:///opt/ibm/biginsights/sheets/libext/pig-0.11.1.jar,file:///opt/ibm/biginsights/sheets/libext/avro-1.7.4.jar,file:///opt/ibm/biginsights/sheets/libext/opencsv-1.8.jar,file:///opt/ibm/biginsights/sheets/libext/json-simple-1.1.jar,file:///opt/ibm/biginsights/sheets/libext/joda-time-1.6.jar,file:///opt/ibm/biginsights/sheets/libext/bigsheets.jar,file:///opt/ibm/biginsights/sheets/libext/bigsheets-serdes-1.0.0.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-column-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-common-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-encoding-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-generator-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-hadoop-bundle-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-hive-bundle-1.3.2.jar,file:///opt/ibm/biginsights/lib/parquet/parquet-mr/parquet-thrift-1.3.2.jar,file:///opt/ibm/biginsights/hive/lib/guava-11.0.2.jar

Koushik, this question I asked a month ago should give you good insight into the performance of ORC vs. Parquet.

Let me ask you this: what is the structure of your data? Is it nested or flat? If it is flat data, such as data ingested from an RDBMS, ORC is the better choice because it stores a lightweight index along with the data, which makes data retrieval faster.
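If you want to try this, here is a rough sketch of copying one of the Parquet tables into ORC from the Hive prompt. The table names `sales_parquet` and `sales_orc` are made up for illustration, and I am assuming your Hive 0.12 build has ORC support enabled and that the tables are readable from Hive as well as BigSQL; I have not checked this against a BigInsights install.

```sql
-- Hypothetical names: sales_parquet is one of the existing Parquet tables,
-- sales_orc is the new ORC copy created from it.
CREATE TABLE sales_orc
STORED AS ORC
AS
SELECT * FROM sales_parquet;

-- Assumption: this setting lets Hive use the ORC index to skip data
-- when a query filters on a column. Verify it is available in your build.
SET hive.optimize.index.filter=true;
```

After the copy, point the join at the ORC tables and compare the run time with the Parquet version before converting everything.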

Hope this helps.
