博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
azkaban使用
阅读量:6297 次
发布时间:2019-06-22

本文共 3605 字,大约阅读时间需要 12 分钟。

新建一个text文件,a.job,打包成zip包传到azkaban即可

 

方式1:job流 

 1. a.job内容范例:

type=commandcommand=hive shellcommand=hive -e 'set hive.execution.engine=tez;'command=hive -e 'select * from hivedb.t_user limit 5;'

2.写法2:

zip包里放job和sql文件,如图

vip_ods.job:

type=commandcommand=hive -f vip_ods.sql

vip_ods.sql:

use hivedb;insert overwrite table hive_tb1 SELECT id,name from user;insert overwrite table hive_tb2 SELECT id,name from food;

以下是依赖关系阐述: 第二个job:bar.job依赖foo.job

# bar.jobtype=commanddependencies=foocommand=echo bar

 

方式2:flow流

以下摘自官网

Creating Flows

This section covers how to create your Azkaban flows using Azkaban Flow 2.0. Flow 1.0 will be deprecated in the future.

Flow 2.0 Basics

Step 1:

Create a simple file called flow20.project. Add azkaban-flow-version to indicate this is a Flow 2.0 Azkaban project:

azkaban-flow-version: 2.0

Step 2:

Create another file called basic.flow. Add a section called nodes, which will contain all the jobs you want to run. You need to specify name and type for all the jobs. Most jobs will require the config section as well. We will talk more about it later. Below is a simple example of a command job.

nodes:  - name: jobA type: command config: command: echo "This is an echoed text."

Step 3:

Select the two files you’ve already created and right click to compress them into a zip file called Archive.zip. You can also create a new directory with these two files and then cd into the new directory and compress: zip -r Archive.zip . Please do not zip the new directory directly.

Make sure you have already created a project on Azkaban ( See  ). You can then upload Archive.zip to your project through Web UI ( See  ).

Now you can click Execute Flow to test your first Flow 2.0 Azkaban project!

Job Dependencies

Jobs can have dependencies on each other. You can use dependsOn section to list all the parent jobs. In the below example, after jobA and jobB run successfully, jobC will start to run.

nodes:  - name: jobC type: noop # jobC depends on jobA and jobB dependsOn: - jobA - jobB - name: jobA type: command config: command: echo "This is an echoed text." - name: jobB type: command config: command: pwd

You can zip the new basic.flow and flow20.project again and then upload to Azkaban. Try to execute the flow and see the difference.

Job Config

Azkaban supports many job types. You just need to specify it in type, and other job-related info goes to config section in the format of key: value pairs. Here is an example for a Pig job:

nodes:  - name: pigJob type: pig config: pig.script: sql/pig/script.pig

You need to write your own pig script and put it in your project zip and then specify the path for the pig.script in the config section.

Flow Config

Not only can you configure individual jobs, you can also config the flow parameters for the entire flow. Simply add a config section at the beginning of the basic.flow file. For example:

---config:  user.to.proxy: foo failure.emails: noreply@foo.com nodes: - name: jobA type: command config: command: echo "This is an echoed text."

When you execute the flow, the user.to.proxy and failure.emails flow parameters will apply to all jobs inside the flow.

Embedded Flows

Flows can have subflows inside the flow just like job nodes. To create embedded flows, specify the type of the node as flow. For example:

nodes:  - name: embedded_flow type: flow config: prop: value nodes: - name: jobB type: noop dependsOn: - jobA - name: jobA type: command config: command: pwd

Download Examples

You can download the simple Flow 2.0 project zip examples to start playing with Azkaban:

 

转载于:https://www.cnblogs.com/xiaoliu66007/p/9641670.html

你可能感兴趣的文章
使用Swagger2构建强大的RESTful API文档(2)(二十三)
查看>>
Docker容器启动报WARNING: IPv4 forwarding is disabled. Networking will not work
查看>>
(转)第三方支付参与者
查看>>
程序员修炼之道读后感2
查看>>
DWR实现服务器向客户端推送消息
查看>>
js中forEach的用法
查看>>
Docker之功能汇总
查看>>
!!a标签和button按钮只允许点击一次,防止重复提交
查看>>
(轉貼) Eclipse + CDT + MinGW 安裝方法 (C/C++) (gcc) (g++) (OS) (Windows)
查看>>
还原数据库
查看>>
作业调度框架 Quartz.NET 2.0 beta 发布
查看>>
mysql性能的检查和调优方法
查看>>
项目管理中的导向性
查看>>
Android WebView 学习
查看>>
(转)从给定的文本中,查找其中最长的重复子字符串的问题
查看>>
HDU 2159
查看>>
spring batch中用到的表
查看>>
资源文件夹res/raw和assets的使用
查看>>
UINode扩展
查看>>
LINUX常用命令
查看>>