
Sqoop Full Import of a MySQL Table into HDFS


I ran this on a three-node cluster; the master node hosts only the NameNode and the MySQL database.

1. Start the services

Start Hadoop so that the NameNode and DataNodes are running:

start-all.sh
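A quick way to confirm the daemons are up is to run jps on each node (exact daemon names will vary with your cluster layout; this is just a sanity check):

# Run on every node; in this setup the master should show NameNode,
# and each worker should show a DataNode (plus the YARN daemons if running)
jps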

2. Enter the Sqoop directory and run the import command

The following command imports the emp table from the MySQL database server into HDFS.

bin/sqoop import \
--connect jdbc:mysql://Master:3306/userdb \
--username root \
--password 000000 \
--delete-target-dir \
--target-dir /sqoopresult \
--table emp --m 1

Notes on the options:

- sqoop import is the fixed subcommand for imports.
- Master is the hostname of the master node (you can write localhost if MySQL is local); userdb is the selected MySQL database.
- --username / --password are the MySQL account and password.
- --delete-target-dir checks whether the target directory already exists on HDFS; if it does, it is deleted first so the import overwrites it.
- --target-dir /sqoopresult creates a directory named /sqoopresult on HDFS, into which the emp table is imported.
- emp is the table name in the MySQL database.
- --m 1 runs the import with a single map task.
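As the run log below warns, putting the password on the command line is insecure. A minimal variant of the same command that prompts for the password interactively with Sqoop's -P flag (connection details unchanged):

bin/sqoop import \
--connect jdbc:mysql://Master:3306/userdb \
--username root \
-P \
--delete-target-dir \
--target-dir /sqoopresult \
--table emp --m 1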

3. Common run errors

This step sometimes fails with an error such as:

ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@291ae is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@291ae is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.

This error is often caused by an outdated MySQL Connector/J driver. Check the driver jar under Sqoop's lib directory, upgrade it, and rerun the import on a fresh connection.
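A minimal sketch of swapping in a newer driver, assuming Sqoop is installed at /usr/local/src/sqoop (the path shown in the log below) and that you have already downloaded a newer Connector/J jar; the version number here is only illustrative:

# Remove the old MySQL driver jar from Sqoop's lib directory
rm /usr/local/src/sqoop/lib/mysql-connector-java-*.jar
# Copy in the newer Connector/J jar (version is an example)
cp mysql-connector-java-5.1.47.jar /usr/local/src/sqoop/lib/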

4. Sample run output

[root@master sqoop]# bin/sqoop import \
> --connect jdbc:mysql://Master:3306/userdb \
> --username root \
> --password 000000 \
> --delete-target-dir \
> --target-dir /sqoopresult \
> --table emp --m 1
Warning: /usr/local/src/sqoop/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/src/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
22/10/21 12:47:30 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
22/10/21 12:47:30 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
22/10/21 12:47:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
22/10/21 12:47:30 INFO tool.CodeGenTool: Beginning code generation
Fri Oct 21 12:47:30 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
22/10/21 12:47:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `emp` AS t LIMIT 1
22/10/21 12:47:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `emp` AS t LIMIT 1
22/10/21 12:47:31 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/src/hadoop-2.6.0
Note: /tmp/sqoop-root/compile/376a663c8523374b2cfb22cf997e628c/emp.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
22/10/21 12:47:33 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/376a663c8523374b2cfb22cf997e628c/emp.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/src/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/src/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
22/10/21 12:47:34 INFO tool.ImportTool: Destination directory /sqoopresult is not present, hence not deleting.
22/10/21 12:47:34 WARN manager.MySQLManager: It looks like you are importing from mysql.
22/10/21 12:47:34 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
22/10/21 12:47:34 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
22/10/21 12:47:34 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
22/10/21 12:47:34 INFO mapreduce.ImportJobBase: Beginning import of emp
22/10/21 12:47:34 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
22/10/21 12:47:34 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
22/10/21 12:47:34 INFO client.RMProxy: Connecting to ResourceManager at Master/10.7.75.205:8032
Fri Oct 21 12:47:42 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
22/10/21 12:47:42 INFO db.DBInputFormat: Using read commited transaction isolation
22/10/21 12:47:43 INFO mapreduce.JobSubmitter: number of splits:1
22/10/21 12:47:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1666320636626_0001
22/10/21 12:47:44 INFO impl.YarnClientImpl: Submitted application application_1666320636626_0001
22/10/21 12:47:44 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1666320636626_0001/
22/10/21 12:47:44 INFO mapreduce.Job: Running job: job_1666320636626_0001
22/10/21 12:47:51 INFO mapreduce.Job: Job job_1666320636626_0001 running in uber mode : true
22/10/21 12:47:51 INFO mapreduce.Job: map 0% reduce 0%
22/10/21 12:47:53 INFO mapreduce.Job: map 100% reduce 0%
22/10/21 12:47:53 INFO mapreduce.Job: Job job_1666320636626_0001 completed successfully
22/10/21 12:47:53 INFO mapreduce.Job: Counters: 32
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=0
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=100
        HDFS: Number of bytes written=133393
        HDFS: Number of read operations=131
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=9
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=1678
        Total time spent by all reduces in occupied slots (ms)=0
        TOTAL_LAUNCHED_UBERTASKS=1
        NUM_UBER_SUBMAPS=1
        Total time spent by all map tasks (ms)=1678
        Total vcore-seconds taken by all map tasks=1678
        Total megabyte-seconds taken by all map tasks=1718272
    Map-Reduce Framework
        Map input records=5
        Map output records=5
        Input split bytes=87
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=31
        CPU time spent (ms)=520
        Physical memory (bytes) snapshot=172040192
        Virtual memory (bytes) snapshot=3014025216
        Total committed heap usage (bytes)=60882944
    File Input Format Counters
        Bytes Read=0
    File Output Format Counters
        Bytes Written=151
22/10/21 12:47:53 INFO mapreduce.ImportJobBase: Transferred 130.2666 KB in 19.0367 seconds (6.8429 KB/sec)
22/10/21 12:47:53 INFO mapreduce.ImportJobBase: Retrieved 5 records.

You can check the result in the HDFS web UI.
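An equivalent check from the command line, using the target directory created above:

# List the files written by the import job
hdfs dfs -ls /sqoopresult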

 

The _SUCCESS file here is the marker that the MapReduce job finished successfully.

part-m-00000 means only the map phase ran, with no reduce phase (a map-only import); see the check below.
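To view the imported rows from the shell (the file name follows the part-m-NNNNN pattern just described):

hdfs dfs -cat /sqoopresult/part-m-00000

With more than one mapper (for example -m 4 together with --split-by on a numeric key column), each map task writes its own output file, part-m-00000 through part-m-00003.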
