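Uplift modeling uses a randomized scientific control not only to measure the effectiveness of an action, but also to build a predictive model of the incremental response to that action. It is a data mining technique applied mainly in financial services, telecommunications, and retail direct marketing, for up-selling, cross-selling, churn reduction, and customer retention.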

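Conventional propensity models and response models only assign a score to each target user. They do not ensure that acting on those scores maximizes the lift of a campaign, and they do not tell marketers which users are most likely to respond because of the campaign. A separate statistical model is therefore needed to locate the users whose response is clearly driven by the marketing campaign, that is, the "marketing-sensitive" users. The ultimate goal of an uplift model is to find the users most likely to be influenced by a campaign, thereby increasing the campaign's incremental response (r(test) - r(control)), the ROI (return on investment), and the overall market response rate.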
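As a small illustration of the quantity r(test) - r(control), the sketch below computes the observed uplift of a campaign from raw counts. The counts and the helper name `response_rate` are hypothetical and chosen only for illustration.

```python
def response_rate(responders, group_size):
    """Return the fraction of a group that responded."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return responders / group_size


# Hypothetical campaign counts (illustrative only, not real data).
r_test = response_rate(responders=120, group_size=1000)    # users who received the campaign
r_control = response_rate(responders=80, group_size=1000)  # users who did not

# Uplift is the incremental response attributable to the campaign.
uplift = r_test - r_control
print(f"r(test)={r_test:.3f}  r(control)={r_control:.3f}  uplift={uplift:.3f}")
```

A response model would rank users by r(test) alone. An uplift model ranks them by the difference, so users who would have responded anyway do not count toward it.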

The method for uplift modeling is described below.

(1) Build two logistic models:

Logit(Ptest(response|X,treatment=1))=a+b*X+c*treatment

Logit(Pcontrol(response|X,treatment=0))=a+b*X

(2) Subtract the two scores to obtain the uplift score:

Score=Ptest(response|X,treatment=1)-Pcontrol(response|X,treatment=0)
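The two steps above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the article's own code: the data is synthetic, the data-generating process is invented, and the names `model_test` and `model_control` are hypothetical. Here the two models are fitted separately on the treated rows and the control rows, so each has its own coefficients.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 2))
treatment = rng.integers(0, 2, size=n)  # 1 = test (treated), 0 = control

# Assumed data-generating process: the treatment only helps when X[:, 0] > 0.
base_logit = -1.0 + 0.5 * X[:, 1]
effect = 1.5 * (X[:, 0] > 0) * treatment
p = 1.0 / (1.0 + np.exp(-(base_logit + effect)))
y = rng.binomial(1, p)

# (1) Fit one logistic model per group.
model_test = LogisticRegression().fit(X[treatment == 1], y[treatment == 1])
model_control = LogisticRegression().fit(X[treatment == 0], y[treatment == 0])

# (2) Score = P_test(response | X) - P_control(response | X), for every row.
score = model_test.predict_proba(X)[:, 1] - model_control.predict_proba(X)[:, 1]

print("mean score where X0 > 0 :", score[X[:, 0] > 0].mean())
print("mean score where X0 <= 0:", score[X[:, 0] <= 0].mean())
```

Because the simulated treatment effect exists only for rows with a positive first feature, the average score should be higher in that group.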

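Training samples:

Reinforcement learning relies on feedback data, so timely, automatic updating of the training samples matters, especially for labels and real-time features. This is where reinforcement learning shows its advantage over conventional machine learning. Updating the training sample store with samples labeled by user feedback lets the model learn from that feedback promptly, which improves the algorithm's performance.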

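[Example 1] The experiment environment is Jupyter Notebook.

We will use simulated data. The goal is to predict uplift, that is, the difference in outcome probability that the treatment produces for each individual.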

In [1]:%pylab inline
import warnings
warnings.filterwarnings("ignore")
import pandas as pd

Output:

Populating the interactive namespace from numpy and matplotlib

1. Loading the data

2. Reshaping

Output:

done Node1
done Node2
done Node3
done Node4
done Node5
done Node6
done Node7
done Node8
done Node9
done Node10
done Node11
done Node12
done Node13
done Node14
done Node15
done Node17
done Node18
done Node19
done Node20

In [10]:train_df = df[df["train_test"]=="train"]
# Copy the test split because prediction columns are added to it later.
test_df = df[df["train_test"]=="test"].copy()

print(train_df.shape)
print(test_df.shape)

Output:

(7952, 81)
(2048, 81)

3. Two-model approach

Model uplift as the outcome probability in the target (treated) dataset minus the outcome probability in the control dataset:

In [11]:target = train_df[train_df["target_control"]=='target']
control = train_df[train_df["target_control"]=='control']

print(target.shape)
print(control.shape)

Output:

(3934, 81)
(4018, 81)

In [12]:target_X = target[features]
control_X = control[features]
target_Y = target[['outcome']]
control_Y = control[['outcome']]
test_X = test_df[features]

4. Training

5. Scoring

In [36]:test_df["proba_outcome_target"] = clf1.predict_proba(test_X)[:,1]
test_df["proba_outcome_control"] = clf2.predict_proba(test_X)[:,1]
# Uplift is just the difference between the two predicted probabilities.
test_df["uplift_1"] = test_df["proba_outcome_target"] - test_df["proba_outcome_control"]

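6. Class modification approach

The class modification approach works as follows:

· Stack the target and control data.

· Flip the outcome label of the control dataset.

· Train a single model on this modified label.

· Uplift is 2 times the predicted probability minus 1.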
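The four steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the notebook's own code: the data-generating process and the names `treated`, `z`, and `clf` are assumptions. The identity uplift = 2P - 1 relies on the target and control groups being roughly equal in size, which the sketch ensures by assigning treatment with probability 0.5.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 4000
X = rng.normal(size=(n, 2))
treated = rng.integers(0, 2, size=n)  # 1 = target group, 0 = control group

# Assumed data-generating process: the treatment only helps when X[:, 0] > 0.
logit = -1.0 + 1.5 * (X[:, 0] > 0) * treated
outcome = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Steps 1 and 2: target and control rows are already stacked in X;
# keep the outcome for target rows and flip it for control rows.
z = np.where(treated == 1, outcome, 1 - outcome)

# Step 3: train a single model on the modified label.
clf = LogisticRegression().fit(X, z)

# Step 4: with equal-size groups, P(z=1 | x) = 0.5 + 0.5 * uplift(x).
uplift = 2.0 * clf.predict_proba(X)[:, 1] - 1.0

print("mean uplift where X0 > 0 :", uplift[X[:, 0] > 0].mean())
print("mean uplift where X0 <= 0:", uplift[X[:, 0] <= 0].mean())
```

Compared with the two-model approach, this one fits a single classifier, so the model is trained on the uplift signal directly instead of on two separate response probabilities.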

7. Save predictions

In [44]:test_df.to_csv("/Users/uplift_predictions.csv")

