
Installing and Using Airflow


1. Installation environment

  1. CentOS-6.5
  2. Python-2.7.12
  3. setuptools-29.0.1
  4. pip-9.0.1

2. Compiling Python

    # Install the compiler, wget, and the development headers needed by Python and MySQL-python
    sudo yum install -y gcc
    sudo yum install -y gcc-c++
    sudo yum install -y wget
    sudo yum install -y mysql
    sudo yum install -y mysql-devel
    sudo yum install -y python-devel
    sudo yum install -y zlib-devel
    sudo yum install -y openssl-devel
    sudo yum install -y sqlite-devel
    # Download and unpack the Python source
    wget https://www.python.org/ftp/python/2.7.12/Python-2.7.12.tgz
    sudo mkdir /usr/local/python27
    sudo tar zxfv Python-2.7.12.tgz -C /usr/local/
    cd /usr/local/Python-2.7.12/
    # The build and install steps need write access to /usr/local (run them as root if necessary)
    ./configure --prefix=/usr/local/python27
    make
    make install
    # Keep the system Python 2.6 for yum and link the new interpreter
    sudo mv /usr/bin/python /usr/bin/python2.6
    sudo ln -sf /usr/local/python27/bin/python /usr/bin/python2.7
    # Edit /usr/bin/yum and change its first line to the following, so that yum keeps using Python 2.6
    sudo vim /usr/bin/yum
    #!/usr/bin/python2.6
    # Edit /etc/profile and append the next two lines, so that the new Python comes first on PATH
    sudo vim /etc/profile
    export PYTHON_HOME=/usr/local/python27
    export PATH=$PYTHON_HOME/bin:$PATH
    # Reload the profile in the current shell
    source /etc/profile
    # Install setuptools
    wget https://pypi.python.org/packages/59/88/2f3990916931a5de6fa9706d6d75eb32ee8b78627bb2abaab7ed9e6d0622/setuptools-29.0.1.tar.gz#md5=28ecfd0f2574b489b9a18343879a7324
    tar zxfv setuptools-29.0.1.tar.gz
    cd setuptools-29.0.1
    python setup.py install
    # Install pip
    wget https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
    tar zxfv pip-9.0.1.tar.gz
    cd pip-9.0.1
    python setup.py install
    pip install --upgrade pip
    # Install the MySQL module for Python
    wget https://pypi.python.org/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip#md5=654f75b302db6ed8dc5a898c625e030c
    unzip MySQL-python-1.2.5.zip
    cd MySQL-python-1.2.5
    python setup.py install
    # Third-party packages are installed under /usr/local/python27/lib/python2.7/site-packages
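
After the build, it can help to confirm which interpreter the `python` command now resolves to and where third-party packages will be installed. The following is a minimal sketch using only the standard library; the helper name `describe_interpreter` is made up for this example.

```python
import sys
import sysconfig


def describe_interpreter():
    """Return the version, executable path and site-packages directory of the running Python."""
    return {
        "version": "%d.%d.%d" % sys.version_info[:3],
        "executable": sys.executable,
        # "purelib" is the directory where pure-Python third-party packages are installed
        "site_packages": sysconfig.get_paths()["purelib"],
    }


if __name__ == "__main__":
    for key, value in sorted(describe_interpreter().items()):
        print("%s: %s" % (key, value))
```

When run with the freshly built interpreter, the `site_packages` entry is expected to point at the site-packages directory under /usr/local/python27.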

3. Installing Airflow

    Airflow can be installed easily with pip.

    # airflow needs a home, ~/airflow is the default,
    # but you can lay foundation somewhere else if you prefer
    # (optional)
    export AIRFLOW_HOME=/usr/local/airflow
    # install from pypi using pip
    pip install airflow
    pip install airflow[hive]
    # initialize the database
    airflow initdb
    # start the web server, default port is 8080
    airflow webserver -p 8080

4. Using MySQL as the metadata database

    # First install the MySQL client
    sudo yum install -y mysql
    sudo yum install -y mysql-devel
    # Run the following statements in the mysql client
    CREATE USER airflow;
    CREATE DATABASE airflow;
    CREATE DATABASE celery_result_airflow;
    GRANT all privileges on airflow.* TO 'airflow'@'%' IDENTIFIED BY 'airflow';
    GRANT all privileges on celery_result_airflow.* TO 'airflow'@'%' IDENTIFIED BY 'airflow';
    # Install the MySQL module for Python
    wget https://pypi.python.org/packages/a5/e9/51b544da85a36a68debe7a7091f068d802fc515a3a202652828c73453cad/MySQL-python-1.2.5.zip#md5=654f75b302db6ed8dc5a898c625e030c
    unzip MySQL-python-1.2.5.zip
    cd MySQL-python-1.2.5
    python setup.py install
    # Configure MySQL as the metadata store in the Airflow configuration file
    sudo vi $AIRFLOW_HOME/airflow.cfg
    # Change the database connection:
    sql_alchemy_conn = mysql://airflow:airflow@localhost:3306/airflow
    # The fields of the connection string are:
    # dialect+driver://username:password@host:port/database
    # Initialize the metadata database
    airflow initdb
    # Reset the metadata database (only when you want to start over; this drops the existing metadata)
    airflow resetdb
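
To make the `dialect+driver://username:password@host:port/database` layout concrete, here is a small sketch that splits a connection string into those fields using only the standard library. The function name `parse_sqlalchemy_url` is hypothetical, and this is not how SQLAlchemy itself parses URLs; it only illustrates the field layout.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2


def parse_sqlalchemy_url(url):
    """Split a dialect+driver://username:password@host:port/database URL into its fields."""
    parts = urlsplit(url)
    # The scheme is "dialect" or "dialect+driver"
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(parse_sqlalchemy_url("mysql://airflow:airflow@localhost:3306/airflow"))
```

For the connection string used above, the dialect is `mysql`, no driver is named, and both the username and the database are `airflow`.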

5. Installing the login module

    # Install the password module
    pip install airflow[password]
    # Enable authentication in the Airflow configuration file
    sudo vi $AIRFLOW_HOME/airflow.cfg
    [webserver]
    authenticate = True
    filter_by_owner = True
    auth_backend = airflow.contrib.auth.backends.password_auth

Run the following code in a Python shell to write the username and password to the metadata database

    import airflow
    from airflow import models, settings
    from airflow.contrib.auth.backends.password_auth import PasswordUser
    # Wrap a new user record so that the password is stored through the password backend
    user = PasswordUser(models.User())
    user.username = 'quzhengpeng'
    user.email = 'quzhengpeng@163.com'
    user.password = 'quzhengpeng'
    # Save the user to the metadata database
    session = settings.Session()
    session.add(user)
    session.commit()
    session.close()
    exit()

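6. Starting the scheduler daemon

    Airflow can monitor and schedule tasks in real time only while the scheduler daemon is running. Put task scripts in ${AIRFLOW_HOME}/dags and their execution status appears in the web UI.

    airflow scheduler

7. Starting the web service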

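    # Start the web server process
    airflow webserver -p 8080
    # Stop the firewall on CentOS 6
    sudo service iptables stop
    # Put SELinux into permissive mode on CentOS 6 (lasts until the next reboot)
    sudo setenforce 0
    # Stop the firewall on CentOS 7
    sudo systemctl stop firewalld.service
    # Prevent firewalld from starting at boot
    sudo systemctl disable firewalld.service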
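
Once the web server is running and the firewall no longer blocks it, a quick TCP check can confirm that the port is reachable. This is a minimal sketch with the standard `socket` module; the helper name `is_port_open` is an assumption made for illustration, and a successful connection only shows that something is listening, not that it is Airflow.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out
        return False
    conn.close()
    return True
```

For example, `is_port_open("localhost", 8080)` can be called on the Airflow host itself, or with the server's address from another machine to test the firewall settings.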

 

Celery+MySQL

    # Celery documentation: http://docs.jinkan.org/docs/celery/index.html
    # Celery 4.0.0 has some problems with Airflow, so install Celery 3
    pip install -U Celery==3.1.24
    pip install airflow[celery]

Edit the configuration file

    vi $AIRFLOW_HOME/airflow.cfg
    # Use the Celery executor
    [core]
    executor = CeleryExecutor
    # Use MySQL both as the message broker and as the result backend
    [celery]
    broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
    celery_result_backend = db+mysql://airflow:airflow@localhost:3306/airflow
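
Before starting the services, the edited settings can be sanity-checked with a short script. The sketch below is written for Python 3's `configparser` (an assumption; the rest of this guide uses Python 2.7) and checks only the three options shown above. The function name `check_celery_config` is hypothetical.

```python
from configparser import RawConfigParser


def check_celery_config(cfg_text):
    """Return a list of problems found in airflow.cfg-style text; an empty list means none were found."""
    # RawConfigParser avoids interpreting '%' characters in values as interpolation
    parser = RawConfigParser()
    parser.read_string(cfg_text)
    problems = []
    if not parser.has_option("core", "executor") or parser.get("core", "executor") != "CeleryExecutor":
        problems.append("[core] executor is not CeleryExecutor")
    for option in ("broker_url", "celery_result_backend"):
        if not parser.has_option("celery", option):
            problems.append("[celery] %s is missing" % option)
    return problems
```

A possible use is `check_celery_config(open(path).read())`, where `path` points at the airflow.cfg file being edited.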

Start Airflow

    airflow webserver -p 8080
    airflow scheduler
    # Run the worker as a non-root user
    airflow worker
    # Start the Celery web UI (Flower) to inspect Celery tasks
    airflow flower
    # Flower is then available at http://localhost:5555/

 

Celery+RabbitMQ

    wget http://www.rabbitmq.com/releases/rabbitmq-server/v3.6.5/rabbitmq-server-3.6.5-1.noarch.rpm
    # Install the packages RabbitMQ depends on
    sudo yum install -y erlang
    sudo yum install -y socat
    # If a RabbitMQ yum repository is configured, you can instead run: sudo yum install -y rabbitmq-server
    sudo rpm -ivh rabbitmq-server-3.6.5-1.noarch.rpm

Start the RabbitMQ service

    # Start the RabbitMQ service
    sudo service rabbitmq-server start
    # Or run the server directly
    sudo rabbitmq-server
    # Add the -detached option to run it in the background (note: a single dash)
    sudo rabbitmq-server -detached
    # Start the RabbitMQ service at boot
    sudo chkconfig rabbitmq-server on
    # Never stop the RabbitMQ server with kill; use the rabbitmqctl command instead
    sudo rabbitmqctl stop

Configure RabbitMQ

    # Create a RabbitMQ user
    sudo rabbitmqctl add_user airflow airflow
    # Create a RabbitMQ virtual host
    sudo rabbitmqctl add_vhost vairflow
    # Give the user the administrator role
    sudo rabbitmqctl set_user_tags airflow administrator
    # Allow the user to access the virtual host
    sudo rabbitmqctl set_permissions -p vairflow airflow ".*" ".*" ".*"
    # Enable the management web UI plugin (optional; Airflow does not need it)
    sudo rabbitmq-plugins enable rabbitmq_management

Edit the Airflow configuration file to enable Celery

    vi $AIRFLOW_HOME/airflow.cfg
    # Change the executor to CeleryExecutor
    executor = CeleryExecutor
    # Change broker_url
    broker_url = amqp://airflow:airflow@localhost:5672/vairflow
    # Format: transport://userid:password@hostname:port/virtual_host
    # Change celery_result_backend
    celery_result_backend = amqp://airflow:airflow@localhost:5672/vairflow
    # Format: transport://userid:password@hostname:port/virtual_host
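
The `transport://userid:password@hostname:port/virtual_host` format breaks if the password or virtual host contains characters such as `@` or `/`. The sketch below assembles the URL from its parts and percent-encodes them, assuming the broker URL parser decodes percent-encoding as URLs conventionally do. The function name `build_broker_url` is hypothetical.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def build_broker_url(userid, password, hostname, port, virtual_host, transport="amqp"):
    """Assemble transport://userid:password@hostname:port/virtual_host, percent-encoding special characters."""
    return "%s://%s:%s@%s:%d/%s" % (
        transport,
        quote(userid, safe=""),
        quote(password, safe=""),
        hostname,
        port,
        quote(virtual_host, safe=""),
    )


if __name__ == "__main__":
    print(build_broker_url("airflow", "airflow", "localhost", 5672, "vairflow"))
```

With the values used in this guide the result is identical to the `broker_url` shown above, because none of the parts contain special characters.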

Install the Celery and RabbitMQ modules for Airflow

    pip install airflow[celery]
    pip install airflow[rabbitmq]

 

Airflow uses DAGs (Directed Acyclic Graphs) to manage workflows.

    # Create the DAG
    from datetime import datetime, timedelta
    from airflow.models import DAG
    # Midnight seven days ago, used as the start date of the DAG
    seven_days_ago = datetime.combine(datetime.today() - timedelta(7), datetime.min.time())
    # Default arguments applied to every task in the DAG
    args = {
        'owner': 'airflow',
        'start_date': seven_days_ago,
        'email': ['airflow@airflow.com'],
        'email_on_failure': True,
        'email_on_retry': True,
        'retries': 3,
        'retry_delay': timedelta(seconds=60),
        'depends_on_past': True
    }
    # Run once a day at midnight; a run times out after 60 minutes
    dag = DAG(
        dag_id='dag',
        default_args=args,
        schedule_interval='0 0 * * *',
        dagrun_timeout=timedelta(minutes=60)
    )
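
The cron expression `'0 0 * * *'` means "every day at 00:00". To illustrate which schedule points fall between a start date and now, the sketch below enumerates the midnights in a time range using only the standard library. It does not use Airflow and does not model when the scheduler actually triggers each run; the function name `daily_midnights` is made up for this example.

```python
from datetime import datetime, timedelta


def daily_midnights(start, end):
    """Yield every midnight from start (inclusive) up to end (exclusive)."""
    current = datetime.combine(start.date(), datetime.min.time())
    if current < start:
        # start is later than midnight of its own day, so begin with the next midnight
        current += timedelta(days=1)
    while current < end:
        yield current
        current += timedelta(days=1)


if __name__ == "__main__":
    now = datetime.now()
    for point in daily_midnights(now - timedelta(days=7), now):
        print(point)
```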

Create tasks and add them to the DAG

    from airflow.operators.bash_operator import BashOperator
    from airflow.operators.dummy_operator import DummyOperator
    # A placeholder task that does nothing
    demo = DummyOperator(
        task_id='demo',
        dag=dag
    )
    # A task that runs a shell command
    last_execute = BashOperator(
        task_id='last_execute',
        bash_command='echo 1',
        dag=dag
    )

Configure task dependencies

    # last_execute runs after demo
    demo.set_downstream(last_execute)
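
Because the dependencies form a directed acyclic graph, there is always an order in which every task runs after the tasks it depends on. The sketch below illustrates that idea with Kahn's algorithm on a plain dictionary; it is not Airflow's scheduler, and the function name `execution_order` and the dictionary representation are assumptions made for illustration.

```python
from collections import deque


def execution_order(downstream):
    """Return task ids in an order that respects the dependencies (Kahn's algorithm).

    downstream maps each task id to the list of task ids that must run after it.
    """
    tasks = set(downstream)
    for children in downstream.values():
        tasks.update(children)
    # Count how many upstream tasks each task is waiting for
    indegree = {task: 0 for task in tasks}
    for children in downstream.values():
        for child in children:
            indegree[child] += 1
    ready = deque(sorted(task for task in tasks if indegree[task] == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(downstream.get(task, [])):
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(tasks):
        raise ValueError("dependency cycle detected; not a DAG")
    return order


if __name__ == "__main__":
    print(execution_order({"demo": ["last_execute"]}))
```

The dictionary `{"demo": ["last_execute"]}` mirrors the dependency set above: `demo` comes first and `last_execute` follows it.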

 


Reposted from: https://my.oschina.net/u/2297683/blog/751880
