当前位置:   article > 正文

离线数仓——(DWD层订单表,DWD层订单产品表,DWS层订单明细表,DWS层订单明细宽表,ADS层订单分析表)_dws明细表

dws明细表

DWD层订单表及订单产品表

  1. DWD要处理的表有两张:订单表、订单产品表。

  2. 订单表是周期性事实表;为保留订单状态,可以使用拉链表进行处理;

  3. 订单产品表普通的事实表,用常规的方法进行处理;
    1. 如果有数据清洗、数据转换的业务需求,ODS => DWD
    2. 如果没有数据清洗、数据转换的业务需求,保留在ODS,不做任何变化【没有业务处理了】

  4. 订单状态:

    1. -3:用户拒收
    2. -2:未付款的订单
    3. -1:用户取消
    4. 0:待发货
    5. 1:配送中
    6. 2:用户确认收货
  5. 订单从创建到最终完成,是有时间限制的;业务上也不允许订单在一个月之后,状态仍然在发生变化;

订单表建表:

  1. 与维表不同,订单事实表的记录数非常多

  2. 订单有生命周期;订单的状态不可能永远处于变化之中(订单的生命周期一般在15天左右)

  3. 订单是一个拉链表,而且是分区表

  4. 分区的目的:订单一旦终止,不会重复计算

  5. 分区的条件:订单创建日期;保证相同的订单在用一个分区

    -- 订单事实表(拉链表)
    create table dwd.dwd_trade_orders(
    	`orderId` int,
    	`orderNo` string,
    	`userId` bigint,
    	`status` tinyint,
    	`productMoney` decimal,
    	`totalMoney` decimal,
    	`payMethod` tinyint,
    	`isPay` tinyint,
    	`areaId` int,
    	`tradeSrc` tinyint,
    	`tradeType` int,
    	`isRefund` tinyint,
    	`dataFlag` tinyint,
    	`createTime` string,
    	`payTime` string,
    	`modifiedTime` string,
    	`start_date` string,
    	`end_date` string
    ) COMMENT '订单事实拉链表'
    partitioned by (dt string)
    STORED AS PARQUET;
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23

订单表数据加载:

#!/bin/bash

source /etc/profile

if [ -n "$1" ]
then
	do_date=$1
else
	do_date=`date -d "-1 day" +%F`
fi

sql="
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
INSERT OVERWRITE TABLE dwd.dwd_trade_orders
partition(dt)
 SELECT orderId,
		orderNo,
		userId,
		status,
		productMoney,
		totalMoney,
		payMethod,
		isPay,
		areaId,
		tradeSrc,
		tradeType,
		isRefund,
		dataFlag,
		createTime,
		payTime,
		modifiedTime,
		case when modifiedTime is not null
			then from_unixtime(unix_timestamp(modifiedTime,'yyyy-MM-dd HH:mm:ss'),'yyyy-MM-dd')
			else from_unixtime(unix_timestamp(createTime,'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd')
			end as start_date,
			'9999-12-31' as end_date,
			from_unixtime(unix_timestamp(createTime, 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd') as dt
	FROM ods.ods_trade_orders
  WHERE dt='$do_date'

union all

 SELECT A.orderId,
		A.orderNo,
		A.userId,
		A.status,
		A.productMoney,
		A.totalMoney,
		A.payMethod,
		A.isPay,
		A.areaId,
		A.tradeSrc,
		A.tradeType,
		A.isRefund,
		A.dataFlag,
		A.createTime,
		A.payTime,
		A.modifiedTime,
		A.start_date,
		CASE WHEN B.orderid IS NOT NULL AND A.end_date > '$do_date'
				THEN date_add('$do_date', -1)
				ELSE A.end_date END AS end_date,
				from_unixtime(unix_timestamp(A.createTime, 'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd') as dt
	FROM (SELECT * FROM dwd.dwd_trade_orders WHERE dt>date_add('$do_date', -15)) A
			left outer join (SELECT * FROM ods.ods_trade_orders
WHERE dt='$do_date') B
			ON A.orderId = B.orderId;
"


hive -e "$sql"
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • 56
  • 57
  • 58
  • 59
  • 60
  • 61
  • 62
  • 63
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71
  • 72

DWS层订单明细表及订单明细宽表

  1. DWS层订单明细表及订单明细宽表可包含当天:
    1. 全国所有订单信息
    2. 全国、一级商品分类订单信息
    3. 全国、二级商品分类订单信息
    4. 大区所有订单信息
    5. 大区、一级商品分类订单信息
    6. 大区、二级商品分类订单信息
    7. 城市所有订单信息
    8. 城市、一级商品分类订单信息
    9. 城市、二级商品分类订单信息
  2. 计算上述数据需要的信息有:订单表、订单商品表、商品信息维表、商品分类维表、商家地域维表
    1. 订单表 => 订单id、订单状态
    2. 订单商品表 => 订单id、商品id、商家id、单价、数量
    3. 商品信息维表 => 商品id、三级分类id
    4. 商品分类维表 => 一级名称、一级分类id、二级名称、二级分类id、三级名称、三级分类id
    5. 商家地域维表 => 商家id、区域名称、区域id、城市名称、城市id
    6. 订单表、订单商品表、商品信息维表 => 订单id、商品id、商家id、三级分类id、单价、数量(订单明细表)
    7. 订单明细表、商品分类维表、商家地域维表 => 订单id、商品id、商家id、三级分类名称、三级分类名称、三级分类名称、单价、数量、区域、城市 => 订单明细宽表

dws_trade_orders(订单明细)由以下表轻微聚合而成

  1. dwd.dwd_trade_orders (拉链表、分区表)
  2. ods.ods_trade_order_product (分区表)
  3. dim.dim_trade_product_info(维表、拉链表)

dws_trade_orders_w(订单明细宽表)由以下表组成:

  1. ads.dws_trade_orders (分区表)
  2. dim.dim_trade_product_cat(分区表)
  3. dim.dim_trade_shops_org(分区表)
-- 订单明细表(轻度汇总事实表)。每笔订单的明细
DROP TABLE IF EXISTS dws.dws_trade_orders;
create table if not exists dws.dws_trade_orders(
	orderid string, -- 订单id
	cat_3rd_id string, -- 商品三级分类id
	shopid string, -- 店铺id
	paymethod tinyint, -- 支付方式
	productsnum bigint, -- 商品数量
	paymoney double, -- 订单商品明细金额
	paytime string -- 订单时间
)
partitioned by (dt string)
STORED AS PARQUET;


-- 订单明细表宽表
DROP TABLE IF EXISTS dws.dws_trade_orders_w;
create table if not exists dws.dws_trade_orders_w(
	orderid string, -- 订单id
	cat_3rd_id string, -- 商品三级分类id
	thirdname string, -- 商品三级分类名称
	secondname string, -- 商品二级分类名称
	firstname string, -- 商品一级分类名称
	shopid string, -- 店铺id
	shopname string, -- 店铺名
	regionname string, -- 店铺所在大区
	cityname string, -- 店铺所在城市
	paymethod tinyint, -- 支付方式
	productsnum bigint, -- 商品数量
	paymoney double, -- 订单明细金额
	paytime string -- 订单时间
)
partitioned by (dt string)
STORED AS PARQUET;
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34

DWS层订单明细表及订单明细宽表加载数据:

  1. dws_trade_orders/dws_trade_orders_w 中一笔订单可能出现多条记录
#!/bin/bash

source /etc/profile
if [ -n "$1" ]
then
	do_date=$1
else
	do_date=`date -d "-1 day" +%F`
fi

sql="
insert overwrite table dws.dws_trade_orders
partition(dt='$do_date')
 select t1.orderid as orderid,
		t3.categoryid as cat_3rd_id,
		t3.shopid as shopid,
		t1.paymethod as paymethod,
		t2.productnum as productsnum,
		t2.productnum*t2.productprice as pay_money,
		t1.paytime as paytime
	from (select orderid, paymethod, paytime
			from dwd.dwd_trade_orders
		   where dt='$do_date') T1
	
	 	left join
		
		(select orderid, productid, productnum, productprice from ods.ods_trade_order_product
			where dt='$do_date') T2
		on t1.orderid = t2.orderid

left join

		(select productid, shopid, categoryid
			from dim.dim_trade_product_info
		 where start_dt <= '$do_date'
			and end_dt >= '$do_date' ) T3
		on t2.productid=t3.productid;

insert overwrite table dws.dws_trade_orders_w
partition(dt='$do_date')
 select t1.orderid,
		t1.cat_3rd_id,
		t2.thirdname,
		t2.secondname,
		t2.firstname,
		t1.shopid,
		t3.shopname,
		t3.regionname,
		t3.cityname,
		t1.paymethod,
		t1.productsnum,
		t1.paymoney,
		t1.paytime
from (select orderid,
			 cat_3rd_id,
			 shopid,
			 paymethod,
			 productsnum,
			 paymoney,
			 paytime
		 from dws.dws_trade_orders
		where dt='$do_date') T1
join

	(select thirdid, thirdname, secondid, secondname,firstid, firstname
		 	from dim.dim_trade_product_cat
		 where dt='$do_date') T2
			on T1.cat_3rd_id = T2.thirdid
join

		(select shopid, shopname, regionname, cityname
			from dim.dim_trade_shops_org
			where dt='$do_date') T3
			on T1.shopid = T3.shopid
"

hive -e "$sql"
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • 56
  • 57
  • 58
  • 59
  • 60
  • 61
  • 62
  • 63
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71
  • 72
  • 73
  • 74
  • 75
  • 76
  • 77

ADS层订单分析表

ADS层订单分析表需求:计算当天

  1. 全国所有订单信息
  2. 全国、一级商品分类订单信息
  3. 全国、二级商品分类订单信息
  4. 大区所有订单信息
  5. 大区、一级商品分类订单信息
  6. 大区、二级商品分类订单信息
  7. 城市所有订单信息
  8. 城市、一级商品分类订单信息
  9. 城市、二级商品分类订单信息

ADS层订单分析表

create table if not exists ads.ads_trade_order_analysis(
	areatype string, -- 区域范围:区域类型(全国、大区、城市)
	regionname string, -- 区域名称
	cityname string, -- 城市名称
	categorytype string, -- 商品分类类型(一级、二级)
	category1 string, -- 商品一级分类名称
	category2 string, -- 商品二级分类名称
	totalcount bigint, -- 订单数量
	total_productnum bigint, -- 商品数量
	totalmoney double -- 支付金额
)
partitioned by (dt string)
row format delimited fields terminated by ',';
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13

ADS层订单分析表加载数据

  1. 1笔订单,有多个商品;多个商品有不同的分类;这会导致一笔订单有多个分类,它们是分别统计的;
#!/bin/bash

source /etc/profile

if [ -n "$1" ]
then
	do_date=$1
else
	do_date=`date -d "-1 day" +%F`
fi

sql="
with mid_orders as (
 select regionname,
		cityname,
		firstname category1,
		secondname category2,
		count(distinct orderid) as totalcount,
		sum(productsnum) as total_productnum,
		sum(paymoney) as totalmoney
	from dws.dws_trade_orders_w
  where dt='$do_date'
group by regionname, cityname, firstname, secondname
)
insert overwrite table ads.ads_trade_order_analysis partition(dt='$do_date')
 select '全国' as areatype,
		'' as regionname,
		'' as cityname,
		'' as categorytype,
		'' as category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders

union all

 select '全国' as areatype,
		'' as regionname,
		'' as cityname,
		'一级' as categorytype,
		category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by category1

union all

 select '全国' as areatype,
		'' as regionname,
		'' as cityname,
		'二级' as categorytype,
		'' as category1,
		category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by category2

union all

 select '大区' as areatype, 
		regionname,
		'' as cityname,
		'' as categorytype,
		'' as category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by regionname

union all

 select '大区' as areatype,
		regionname,
		'' as cityname,
		'一级' as categorytype,
		category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by regionname, category1

union all

 select '大区' as areatype,
		regionname,
		'' as cityname,
		'二级' as categorytype,
		'' as category1,
		category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by regionname, category2

union all

 select '城市' as areatype,
		'' as regionname,
		cityname,
		'' as categorytype,
		'' as category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by cityname

union all

 select '城市' as areatype,
		'' as regionname,
		cityname,
		'一级' as categorytype,
		category1,
		'' as category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by cityname, category1

union all

 select '城市' as areatype,
		'' as regionname,
		cityname,
		'二级' as categorytype,
		'' as category1,
		category2,
		sum(totalcount),
		sum(total_productnum),
		sum(totalmoney)
	from mid_orders
group by cityname, category2;
"
hive -e "$sql"
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
  • 10
  • 11
  • 12
  • 13
  • 14
  • 15
  • 16
  • 17
  • 18
  • 19
  • 20
  • 21
  • 22
  • 23
  • 24
  • 25
  • 26
  • 27
  • 28
  • 29
  • 30
  • 31
  • 32
  • 33
  • 34
  • 35
  • 36
  • 37
  • 38
  • 39
  • 40
  • 41
  • 42
  • 43
  • 44
  • 45
  • 46
  • 47
  • 48
  • 49
  • 50
  • 51
  • 52
  • 53
  • 54
  • 55
  • 56
  • 57
  • 58
  • 59
  • 60
  • 61
  • 62
  • 63
  • 64
  • 65
  • 66
  • 67
  • 68
  • 69
  • 70
  • 71
  • 72
  • 73
  • 74
  • 75
  • 76
  • 77
  • 78
  • 79
  • 80
  • 81
  • 82
  • 83
  • 84
  • 85
  • 86
  • 87
  • 88
  • 89
  • 90
  • 91
  • 92
  • 93
  • 94
  • 95
  • 96
  • 97
  • 98
  • 99
  • 100
  • 101
  • 102
  • 103
  • 104
  • 105
  • 106
  • 107
  • 108
  • 109
  • 110
  • 111
  • 112
  • 113
  • 114
  • 115
  • 116
  • 117
  • 118
  • 119
  • 120
  • 121
  • 122
  • 123
  • 124
  • 125
  • 126
  • 127
  • 128
  • 129
  • 130
  • 131
  • 132
  • 133
  • 134
  • 135
  • 136
  • 137
  • 138
  • 139
  • 140
  • 141
  • 142
  • 143
  • 144
  • 145
  • 146
  • 147
  • 148
  • 149
声明:本文内容由网友自发贡献,不代表【wpsshop博客】立场,版权归原作者所有,本站不承担相应法律责任。如您发现有侵权的内容,请联系我们。转载请注明出处:https://www.wpsshop.cn/w/从前慢现在也慢/article/detail/943111
推荐阅读
相关标签
  

闽ICP备14008679号