当前位置:   article > 正文

数据采集-->kafka-->hdfs

数据采集-->kafka-->hdfs

数据采集到kafka

flume:

  1. a1.sources = r1
  2. a1.channels = c1
  3. a1.sources.r1.type = TAILDIR
  4. a1.sources.r1.filegroups = f1
  5. a1.sources.r1.filegroups.f1 = /opt/installs/flume1.9/job/a.log
  6. a1.sources.r1.positionFile = /opt/installs/flume1.9/job/taildir-kafka.json
  7. a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
  8. a1.channels.c1.kafka.bootstrap.servers =hadoop11:9092,hadoop12:9092,hadoop13:9092
  9. a1.channels.c1.kafka.topic = topica
  10. a1.channels.c1.parseAsFlumeEvent = false
  11. a1.sources.r1.channels = c1

执行命令:

flume-ng agent --conf conf  --name a1 --conf-file job/taildir-kafka.conf -Dflume.root.logger=INFO,console

 向a.log添加测试数据: 

消费者:

 数据从kafka到hdfs

flume:

 

  1. a1.sources = r1
  2. a1.channels = c1
  3. a1.sinks = k1
  4. a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
  5. a1.sources.r1.batchSize=5000
  6. a1.sources.r1.batchDurationMillis=2000
  7. a1.sources.r1.kafka.bootstrap.servers =hadoop11:9092,hadoop12:9092,hadoop13:9092
  8. a1.sources.r1.kafka.topics = topica
  9. a1.sources.r1.kafka.consumer.group.id = g1
  10. a1.channels.c1.type = memory
  11. a1.channels.c1.capacity=5000
  12. a1.channels.c1.transactionCapacity=5000
  13. a1.sinks.k1.type = hdfs
  14. a1.sinks.k1.batchSize = 5000
  15. a1.sinks.k1.hdfs.path = hdfs://hadoop11:8020/flume/date=%Y-%m-%d
  16. a1.sinks.k1.hdfs.useLocalTimeStamp = true
  17. a1.sinks.k1.hdfs.fileType = DataStream
  18. a1.sinks.k1.hdfs.round = true
  19. a1.sinks.k1.hdfs.rollInterval =0
  20. a1.sinks.k1.hdfs.rollSize = 1048576
  21. a1.sinks.k1.hdfs.rollCount = 0
  22. a1.sources.r1.channels = c1
  23. a1.sinks.k1.channel = c1

执行命令:

flume-ng agent --conf conf  --name a1 --conf-file job/kafka-hdfs.conf -Dflume.root.logger=INFO,console

向a.log添加测试数据: 

消费者:

hdfs:

 

声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/正经夜光杯/article/detail/1016919
推荐阅读
相关标签
  

闽ICP备14008679号