Hadoop - Sqoop import as Avro error
Stack: HDP-2.3.2.0-2950, installed using Ambari 2.1.
I am trying to import a SQL Server table into HDFS:
[sqoop@l1038lab root]$ sqoop import --connect 'jdbc:sqlserver://dbserver;database=dbname' --username someusername --password somepassword --as-avrodatafile --table dimsampledesc --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 --verbose
There is one error in the output:
Writing Avro schema file: /tmp/sqoop-sqoop/compile/bbbd98974f09b50a9335cedde30f73a5/dimsampledesc.avsc
16/05/09 13:09:00 DEBUG mapreduce.DataDrivenImportJob: Could not move Avro schema file to code output directory.
java.io.FileNotFoundException: Destination directory '.' does not exist [createDestDir=true]
        at org.apache.commons.io.FileUtils.moveFileToDirectory(FileUtils.java:2865)
        at org.apache.sqoop.mapreduce.DataDrivenImportJob.writeAvroSchema(DataDrivenImportJob.java:146)
        at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:92)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
        at org.apache.sqoop.manager.SQLServerManager.importTable(SQLServerManager.java:163)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
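The DEBUG line itself is non-fatal (the .avsc stayed behind in the local compile directory), but it can be avoided. A hedged workaround sketch, assuming the standard Sqoop 1.4.x codegen options --outdir and --bindir: run the job from a directory the sqoop user can write to, and name the output location for generated artifacts explicitly instead of relying on the current directory:

```shell
# Sketch of a workaround (assumes Sqoop 1.4.x --outdir/--bindir codegen
# options). The FileNotFoundException on destination directory '.' means
# Sqoop could not move the generated .avsc into the current working
# directory, so run from a writable one and name the output dir explicitly.
GEN_DIR=/tmp/sqoop-gen
mkdir -p "$GEN_DIR"
cd "$GEN_DIR"
# Guarded so the sketch is a no-op on hosts without Sqoop on the PATH.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect 'jdbc:sqlserver://dbserver;database=dbname' \
    --username someusername --password somepassword \
    --as-avrodatafile \
    --table dimsampledesc \
    --outdir "$GEN_DIR" \
    --bindir "$GEN_DIR" \
    --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016
fi
```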
The contents of /tmp/sqoop-sqoop/compile/bbbd98974f09b50a9335cedde30f73a5/:
[sqoop@l1038lab root]$ ls -lrt /tmp/sqoop-sqoop/compile/bbbd98974f09b50a9335cedde30f73a5/
total 104
-rw-r--r--. 1 sqoop hadoop 61005 May  9 13:08 dimsampledesc.java
-rw-r--r--. 1 sqoop hadoop 28540 May  9 13:08 dimsampledesc.class
-rw-r--r--. 1 sqoop hadoop  9568 May  9 13:08 dimsampledesc.jar
-rw-r--r--. 1 sqoop hadoop  3659 May  9 13:09 dimsampledesc.avsc
The contents of the warehouse directory:
[sqoop@l1038lab root]$ hadoop fs -ls /dataload/tohdfs/reio/odpdw/may2016
Found 1 items
drwxr-xr-x   - sqoop hdfs          0 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc
[sqoop@l1038lab root]$ hadoop fs -ls /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc
Found 7 items
-rw-r--r--   3 sqoop hdfs          0 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/_SUCCESS
-rw-r--r--   3 sqoop hdfs       2660 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00000.avro
-rw-r--r--   3 sqoop hdfs    5039870 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00001.avro
-rw-r--r--   3 sqoop hdfs    1437143 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00002.avro
-rw-r--r--   3 sqoop hdfs    1486327 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00003.avro
-rw-r--r--   3 sqoop hdfs     595550 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00004.avro
-rw-r--r--   3 sqoop hdfs       4792 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00005.avro
I copied the .avsc and the other files to HDFS manually:
[sqoop@l1038lab root]$ hadoop fs -copyFromLocal /tmp/sqoop-sqoop/compile/d039c1b0b2a2b224d65943df1de34cdd/* /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/
Now all the files are in one place:
[sqoop@l1038lab root]$ hadoop fs -ls /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/
Found 11 items
-rw-rw-rw-   3 sqoop hdfs       3659 2016-05-09 13:49 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.avsc
-rw-rw-rw-   3 sqoop hdfs      28540 2016-05-09 13:49 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.class
-rw-rw-rw-   3 sqoop hdfs       9568 2016-05-09 13:49 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.jar
-rw-rw-rw-   3 sqoop hdfs      61005 2016-05-09 13:49 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.java
-rw-rw-rw-   3 sqoop hdfs          0 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/_SUCCESS
-rw-rw-rw-   3 sqoop hdfs       2660 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00000.avro
-rw-rw-rw-   3 sqoop hdfs    5039870 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00001.avro
-rw-rw-rw-   3 sqoop hdfs    1437143 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00002.avro
-rw-rw-rw-   3 sqoop hdfs    1486327 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00003.avro
-rw-rw-rw-   3 sqoop hdfs     595550 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00004.avro
-rw-rw-rw-   3 sqoop hdfs       4792 2016-05-09 13:09 /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00005.avro
I then created a Hive table and described it:
hive> CREATE EXTERNAL TABLE dimsampledesc
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    > STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    > TBLPROPERTIES ('avro.schema.url'='hdfs://l1031lab.sss.se.com:8020/dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.avsc');
OK
Time taken: 0.166 seconds
hive> DESCRIBE FORMATTED dimsampledesc;
OK
# col_name              data_type       comment
smapiname_ver           string
smapicolname            string
charttype               int
x_indexet               int
y_indexet               int
x_tick                  string
y_tick                  string
x_tickrange             string
x_tickrangefrom         string
x_tickrangetom          string
y_tickrange             string
y_tickrangefrom         string
y_tickrangetom          string
indexcount              int
x_indexcount            int
y_indexcount            int
x_symbol                string
x_symbolname            string
x_symboldescr           string
y_symbol                string
y_symbolname            string
y_symboldescr           string
smapiname               string
incorrect_ver_fl        boolean

# Detailed Table Information
Database:               odp_dw_may2016
Owner:                  hive
CreateTime:             Mon May 09 14:46:40 CEST 2016
LastAccessTime:         UNKNOWN
Protect Mode:           None
Retention:              0
Location:               hdfs://l1031lab.sss.se.com:8020/apps/hive/warehouse/odp_dw_may2016.db/dimsampledesc
Table Type:             EXTERNAL_TABLE
Table Parameters:
        COLUMN_STATS_ACCURATE   false
        EXTERNAL                TRUE
        avro.schema.url         hdfs://l1031lab.sss.se.com:8020/dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.avsc
        numFiles                0
        numRows                 -1
        rawDataSize             -1
        totalSize               0
        transient_lastDdlTime   1462798000

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.avro.AvroSerDe
InputFormat:            org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.416 seconds, Fetched: 56 row(s)
But no data is found:
hive> SELECT * FROM dimsampledesc;
OK
Time taken: 0.098 seconds
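A plausible reason for the empty result is visible in the DESCRIBE FORMATTED output: the table's Location is the default warehouse path (/apps/hive/warehouse/odp_dw_may2016.db/dimsampledesc, with numFiles 0), while the .avro part files sit under /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc. A hedged sketch of a fix, using the standard ALTER TABLE ... SET LOCATION statement (database and table names taken from the output above):

```shell
# Point the external table at the directory that actually holds the .avro
# part files. Guarded so this sketch is a no-op where the hive CLI is absent.
DATA_DIR='hdfs://l1031lab.sss.se.com:8020/dataload/tohdfs/reio/odpdw/may2016/dimsampledesc'
if command -v hive >/dev/null 2>&1; then
  hive -e "ALTER TABLE odp_dw_may2016.dimsampledesc SET LOCATION '${DATA_DIR}';
           SELECT COUNT(*) FROM odp_dw_may2016.dimsampledesc;"
fi
```

Equivalently, the CREATE EXTERNAL TABLE statement could have carried a LOCATION clause pointing at that directory from the start.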
The schema file:
[sqoop@l1038lab root]$ hadoop fs -cat /dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/dimsampledesc.avsc
{
  "type" : "record",
  "name" : "dimsampledesc",
  "doc" : "Sqoop import of dimsampledesc",
  "fields" : [
    { "name" : "smapiname_ver", "type" : [ "null", "string" ], "default" : null, "columnName" : "smapiname_ver", "sqlType" : "12" },
    { "name" : "smapicolname", "type" : [ "null", "string" ], "default" : null, "columnName" : "smapicolname", "sqlType" : "12" },
    { "name" : "charttype", "type" : [ "null", "int" ], "default" : null, "columnName" : "charttype", "sqlType" : "4" },
    { "name" : "x_indexet", "type" : [ "null", "int" ], "default" : null, "columnName" : "x_indexet", "sqlType" : "4" },
    { "name" : "y_indexet", "type" : [ "null", "int" ], "default" : null, "columnName" : "y_indexet", "sqlType" : "4" },
    { "name" : "x_tick", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_tick", "sqlType" : "-9" },
    { "name" : "y_tick", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_tick", "sqlType" : "-9" },
    { "name" : "x_tickrange", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_tickrange", "sqlType" : "-9" },
    { "name" : "x_tickrangefrom", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_tickrangefrom", "sqlType" : "-9" },
    { "name" : "x_tickrangetom", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_tickrangetom", "sqlType" : "-9" },
    { "name" : "y_tickrange", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_tickrange", "sqlType" : "-9" },
    { "name" : "y_tickrangefrom", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_tickrangefrom", "sqlType" : "-9" },
    { "name" : "y_tickrangetom", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_tickrangetom", "sqlType" : "-9" },
    { "name" : "indexcount", "type" : [ "null", "int" ], "default" : null, "columnName" : "indexcount", "sqlType" : "4" },
    { "name" : "x_indexcount", "type" : [ "null", "int" ], "default" : null, "columnName" : "x_indexcount", "sqlType" : "4" },
    { "name" : "y_indexcount", "type" : [ "null", "int" ], "default" : null, "columnName" : "y_indexcount", "sqlType" : "4" },
    { "name" : "x_symbol", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_symbol", "sqlType" : "-9" },
    { "name" : "x_symbolname", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_symbolname", "sqlType" : "-9" },
    { "name" : "x_symboldescr", "type" : [ "null", "string" ], "default" : null, "columnName" : "x_symboldescr", "sqlType" : "-9" },
    { "name" : "y_symbol", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_symbol", "sqlType" : "-9" },
    { "name" : "y_symbolname", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_symbolname", "sqlType" : "-9" },
    { "name" : "y_symboldescr", "type" : [ "null", "string" ], "default" : null, "columnName" : "y_symboldescr", "sqlType" : "-9" },
    { "name" : "smapiname", "type" : [ "null", "string" ], "default" : null, "columnName" : "smapiname", "sqlType" : "12" },
    { "name" : "incorrect_ver_fl", "type" : [ "null", "boolean" ], "default" : null, "columnName" : "incorrect_ver_fl", "sqlType" : "-7" }
  ],
  "tableName" : "dimsampledesc"
}
What is the root cause, and how should I proceed?
Use the same Avro schema file that Sqoop generated when you create tables on top of the data in Hive. You can extract the schema from the data files using avro-tools.jar.
Also check whether the table in SQL Server has the same data.
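For example, avro-tools has a getschema subcommand that prints the schema embedded in an Avro data file, which can be redirected into an .avsc and referenced from avro.schema.url (the jar path and version below are illustrative, not from the original post):

```shell
# Dump the schema embedded in one of the imported part files. Guarded so
# the sketch is a no-op where the jar or the hadoop CLI is unavailable.
AVRO_TOOLS=/tmp/avro-tools-1.8.1.jar   # illustrative path/version
PART=/dataload/tohdfs/reio/odpdw/may2016/dimsampledesc/part-m-00000.avro
if [ -f "$AVRO_TOOLS" ] && command -v hadoop >/dev/null 2>&1; then
  hadoop fs -copyToLocal "$PART" /tmp/part-m-00000.avro
  java -jar "$AVRO_TOOLS" getschema /tmp/part-m-00000.avro > /tmp/dimsampledesc.avsc
fi
```

Comparing that dumped schema with the manually copied .avsc confirms whether the two are in sync.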