Author: Zizhun Guo
作者: 郭子谆
写于:
对于手机游戏来讲,活动礼包推送在游戏商业化运营中所占比重巨大,在正确的时间给正确的人推送正确的礼包(价格和内容)是需要技巧经验的技术活,我们不能期盼着用户始终对产品不离不弃,哪怕是我们认定的忠实用户,也需要不断的激励和回馈。因此,我们需要对活动推送的转化效果进行监控,以便于我们对活动推送的效果进行评估,从而优化活动推送的效果。
push_date | level_group | goods_price | total_users | login_users | paid_users | total_revenue | paid_percentage |
---|---|---|---|---|---|---|---|
2023-11-11 | 1-5 | 6 | 1000 | 500 | 100 | 600 | 20% |
2023-11-18 | 1-5 | 6 | 1200 | 600 | 130 | 780 | 20% |
以上报表为某次活动推送的转化报表,其中: push_date:活动推送日期 level_group:活动推送等级组 goods_price:活动推送礼包价格 total_users:活动推送用户总数 login_users:活动推送用户中登陆用户数 paid_users:活动推送用户中付费用户数 total_revenue:活动推送用户中付费用户的总收入 paid_percentage:活动推送用户中付费用户占比
对于案例中2023-11-11和2023-11-18两次活动推送,该报表记录了两次活动推送的转化效果,其中2023-11-11的活动推送等级组为1-5,礼包价格为6,活动推送用户总数为1000,登陆用户数为500,付费用户数为100,付费用户总收入为600,付费用户占比为20%。同理,2023-11-18的活动推送等级组为1-5,礼包价格为6,活动推送用户总数为1200,登陆用户数为600,付费用户数为130,付费用户总收入为780,付费用户占比为20%。
如何产出该报表,是本文的主要内容。
日志表: ods_game_activity_push_log
uid | level | createtime | endtime |
---|---|---|---|
1001 | 5 | 2023-11-18 08:00:00 | 2023-11-18 08:30:00 |
1002 | 7 | 2023-11-18 09:00:00 | 2023-11-18 09:45:00 |
1003 | 10 | 2023-11-18 10:00:00 | 2023-11-18 11:00:00 |
1004 | 12 | 2023-11-18 11:15:00 | 2023-11-18 12:15:00 |
该日志表记录了活动推送的用户,等级组,生效时间和失效时间,其中生效时间和失效时间是活动推送的生效时间区间,例如1001用户在2023-11-18 08:00:00到2023-11-18 08:30:00这个时间段内,可以看到活动推送的内容,而1001用户在2023-11-18 08:00:00之前或者2023-11-18 08:30:00之后,都无法看到活动推送的内容。
日志表: ods_user_login_log
uid | logintime | logouttime |
---|---|---|
1001 | 2023-11-18 07:45:00 | 2023-11-18 08:35:00 |
1002 | 2023-11-18 08:50:00 | 2023-11-18 09:50:00 |
1003 | 2023-11-18 09:55:00 | 2023-11-18 11:05:00 |
1004 | 2023-11-18 11:10:00 | 2023-11-18 12:20:00 |
1001 | 2023-11-18 12:30:00 | 2023-11-18 13:00:00 |
该日志表记录了用户的登陆和登出时间,其中登陆时间和登出时间是用户的登陆时间区间,例如1001用户在2023-11-18 07:45:00到2023-11-18 08:35:00这个时间段内,登陆了游戏,而1001用户在2023-11-18 07:45:00之前或者2023-11-18 08:35:00之后,都没有登陆游戏。
日志表: ods_order_log
uid | createtime | price | good |
---|---|---|---|
1001 | 2023-11-18 08:05:00 | 6 | Game Credits |
1002 | 2023-11-18 09:10:00 | 648 | Expansion Pack |
1003 | 2023-11-18 10:20:00 | 128 | Power Boost |
1004 | 2023-11-18 11:30:00 | 6.00 | Cosmetic Item |
1001 | 2023-11-18 08:40:00 | 18 | Subscription |
该日志表记录了用户的订单,其中订单时间是用户的订单时间,例如1001用户在2023-11-18 08:05:00购买了6元的游戏币,而1001用户在2023-11-18 08:05:00之前或者2023-11-18 08:05:00之后,都没有购买过6元的游戏币。
产出要求
解决方法
WITH A AS (
SELECT
,t.uid
,level
,createtime
,endtime
,SUM(IF(login_logout_time >= createtime AND login_logout_time <= endtime, 1, 0) as n_login
,IF(SUM(IF(login_logout_time >= createtime AND login_logout_time <= endtime, 1, 0)) > 0, 1, 0) AS bv_loggedin
FROM
db.ods_game_activity_push_log t
LEFT JOIN (
SELECT
uid
,logintime AS login_logout_time
FROM db.ods_user_login_log
WHERE dt BETWEEN '2023-10-10' AND '2023-10-17'
UNION
SELECT
uid
,logouttime AS login_logout_time
FROM db.ods_user_login_log
WHERE dt BETWEEN '2023-10-10' AND '2023-10-17'
) tt ON t.uid = tt.uid
WHERE
t.dt = '2023-10-10'
GROUP BY
t.uid
,level
,createtime
,endtime
),
INSERT OVERWRITE TABLE db.activity_push_uid_batch_logedin_order_result
PARTITION (push_date = '2023-10-10')
SELECT
A.*
,tt.createtime AS ordertime
,tt.price
,tt.goodid
FROM
A
LEFT JOIN (
SELECT
uid
,createtime
,price
,goodid
FROM
db.ods_order_log tt
WHERE
dt BETWEEN '2023-10-10' AND '2023-10-17'
) tt ON A.uid = tt.uid AND tt.createtime >= A.createtime AND tt.createtime <= A.endtime
;
如果任务不使用分区,扫描量会非常大,需要先通过date()function将timestamp转换成yyyy-mm-dd的格式,再做范围筛选,其实是完全没用利用到分区的功能。会plan出很多stage,每个stage都会扫描全表,这样的话,如果数据量很大,会导致任务执行时间很长,而且会占用很多资源,影响其他任务的执行。
-- SQL 1
CREATE TABLE db.activity_push_batch_uid AS
SELECT
uid
,level
,createtime
,endtime
,dense_rank() over(order by hour(createtime), minute(createtime)) as batch
FROM
db.ods_game_activity_push_log
WHERE
dt = '2023-10-10'
order by
createtime asc;
SQL 1: 使用活动推送日志’2023-10-10’的分区表,并按照createtime排序,生成批次号,统计粒度为小时-分钟
-- SQL 2
CREATE TABLE if not exists db.activity_push_batch_createtime AS
SELECT
batch
,createtime
,endtime
FROM
(
SELECT
createtime
,endtime
,batch
,row_number() over (partition by hour(createtime), minute(createtime) order by createtime asc) as rank
FROM
db.activity_push_batch_uid T
) t
WHERE
t.rank = 1
;
SQL 2: 使用SQL 1生成的批次号,按照批次号分组,取每组第一条记录的createtime和endtime,作为该批次的生效时间区间
-- SQL 3
CREATE TABLE db.bv_logint_logout uid AS
WITH A as (
SELECT
uid
,logintime as login_logout_time
FROM
db.ods_user_login_log
WHERE
dt >= '2023-10-10' and dt < '2023-10-13'
UNION
SELECT
uid
,logouttime as login_logout_time
FROM
db.ods_user_login_log
WHERE
dt >= '2023-10-10' and dt < '2023-10-13'
)
SELECT
DISTINCT
T.batch
,A.uid
FROM
A
INNER JOIN
db.activity_push_batch_createtime T
on
A.login_logout_time >= T.createtime
and A.login_logout_time <= T.endtime
;
SQL 3: 使用玩家登陆日志’2023-10-10’到’2023-10-13’的分区表, 得到登陆登出时间字段,使用SQL 2生成的批次号,过滤出每个批次的登陆登出记录,得到每个批次的登陆用户
-- SQL 4
CREATE TABLE guozizhun.game_god_grant_push_batch_order_20231020 AS
WITH A AS (
SELECT
uid
,createtime
,price
,goodid
FROM
db.ods_order_log
WHERE
dt >= '2023-10-10' and dt < '2023-10-13'
)
,
B AS (
SELECT
A.uid
,T.batch
,A.createtime
,A.price
,A.goodid
FROM
A
INNER JOIN
db.activity_push_batch_createtime T
on
A.createtime >= T.createtime
and A.createtime <= T.endtime
)
SELECT
A.*
,B.batch
,B.uid
,B.createtime as ordertime
,B.price
,B.goodid
,DATEDIFF(DATE(B.createtime), DATE(A.createtime)) as days_from_createtime
FROM
guozizhun.bv_logint_logout A
LEFT JOIN
B
on
A.uid = B.uid
and A.batch = B.batch
;
SQL 4: 使用玩家订单日志’2023-10-10’到’2023-10-13’的分区表, 得到订单时间字段,使用SQL 2生成的批次号,过滤出每个批次的订单记录,通过left join得到对应每个批次的订单记录,得到每个批次的订单用户
按照小时-分钟分组,并取每组第一条记录的时间作为该组时间,会造成误差,根据具体产出任务需要统计登陆转化和订单转化,
参考以下文档对误差定义,也可求助于大模型 https://docs.citrix.com/zh-cn/citrix-virtual-apps-desktops/director/data-retention.html
Copyright @ 2021 Zizhun Guo. All Rights Reserved.