T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

《T-Finder: A Recommender System for Finding Passengers and Vacant Taxis》2013 IEEE TKDE Best Paper

Abstract

本文从众多的出租车轨迹数据中，学习了高收益出租车司机的驾驶行为和乘客的移动模式，提出了taxi-passenger recommender system（出租车-乘客推荐系统）:
taxi recommender 能够为出租车准确地预测接客等待时间和可能的乘客上车点,
passenger recommender 能够向乘客推荐附近可能的上车点
该系统分为offline 和 online 两部分:
offline:处理原始轨迹数据，检测和分类parking places，分割道路，通过机器学习的方法得到有关parking places和road segments的知识；
online:根据实时的gps和时间输入，运用offline所学习到的知识，推荐合理的上车点。

BASRLINE:

1、OVERVIEW

(1) Motivation

通过分析近一年的出租车轨迹数据，发现出租车行驶的路程越多，并不代表赚的越多
经验丰富的出租车司机的收入很高，而新手出租车司机的收入很低，这说明出租车载客是有经验和规律可寻的
于是我们可以用机器学习的方法探索内部的规律

(2) Preliminary

Definition 1 (Road Segment)：包含 direction 和 road level
Definition 2 (Route)：a sequence of connected road segments
Definition 3 (State)：occupied (O), cruising (C) ,parked (P) (taxi drivers wait somewhere for business)
Definition 4 (Trajectory and Trip):Trajectory是原始数据，Trip是整理后的数据，按照车的ID进行整理
Definition 5 (Action RP )：到达P点的路径是R：r1->r2>r3…

(3) Framework

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

Brief introduction for offline processing:

detect the parking places from GPS trajectories
segment the GPS trajectories according to Definition 4
map-match the GPS trajectories to road networks using the IVMM algorithm
learn the time-dependent statistical results based on a probabilistic model
to tackle the data sparseness problem, we devise a road segment clustering method and perform statistical learning on each road segment cluster instead of a single road segment

2、PARKING PLACES DETECTION

(1) Candidates Detection

对于每辆车的trip，我们把trip中的点做如下‘聚类’：

1、center所在的圆形区域的半径为 d

2、在圆形区域中，所有点离center的旅程时间不超过 t

满足以上条件的点作为一个类，否则作为下一个类，并继续搜索属于下一个类的点

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

贴上算法：

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

(2) Filtering

把parking places 和 traffic jams区分开，运用了如下特征，作为滤波设计的判据：

Spatial-Temporal features
POI feature
Collaborative feature

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

(3) Parking Place Clustering

由于许多Parking Place 有着相似的特性，有很大的概率可以归为一类，所以我们对这些Parking Place 做聚类是有必要的，文中所用的聚类方法是：

density based clustering method OPTICS

3、Recommender Systerm

given the current location Lc and time T0 of a taxi driver or a passenger, the taxi recommender provides the driver with top-k parking places and routes to these parking places while the passenger recommender suggests a set of road segments within a walking distance

对于出租车司机分别采取不同的策略，推荐系统给司机直接提供top-k parking place，以及提供一条最有可能在途中或者终点接到乘客的routes；对于乘客，系统给乘客推荐最有可能车经过的road segment，以步行时间和等的时间为代价优化。

3.1 Taxi Recommender

provide the taxi drivers with the best parking places and the routes to these parking places

为了实现出租车推荐系统的最优化，本节提出了如下四个cost or benefit 函数，我们的目的是使这些函数值达到最小或最大值：

The Probability of Picking up the Next Passenger
Duration Before the Next Trip T
Queue Length
Distance/Travel Time of the Next Trip DN, TN

3.2 Passenger Recommender

目的：

the passengers do not want to walk too long for hailing a taxi

解决方法（有两种方法，第二种方法需要用到road segment）:

If a passenger is close to at least one parking place, we provide him with the nearby parking places with the maximum expected queue length calculated with the method described in 4.1.3.

Otherwise, we suggest the passenger with the nearby road segments (in a walking distance) considering two criteria 1) the possibility to find avacant taxi and 2) the average waiting time.

对于第二种方法，本节提出了如下两个cost or benefit 函数，我们的目的是使这些函数值达到最小或最大值：

Probability of Finding a Vacant Taxi
Average Waiting Time

4、ROAD SEGMENT CLUSTERING

动机：

On some road segments, the data is too sparse to perform the statistical learning proposed in Section 4.

解决：

Instead of computing the probability on each road segment, we first conduct a road segment clustering to integrate the road segments with similar features

feature：

road structure & POIs

由于前文中road segment的结果过稀疏，不利于统计学习，所以我们需要对road segment的结果进行聚类，将稀疏的数据聚类，这里作者所用的方法是Bisected clustering，其本质是多次的2-means clustering，我们将聚类后的结果road segment clustering 代替前面的road segment，用于统计学习，计算公式中相应的变量也随之替换

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

5、ONLINE RECOMMENDATION

线上推荐系统就是利用前面训练得到的模型，结合当前的GPS和时间，得到最优的推荐结果，对于taxi和passenger分别有不同的策略：

1、taxi recommender：

思路：

1、根据出租车当前的位置和时间，找到附附近可能的parking place（P）

2、对每个P计算成功概率，以及各种期望

3、对这些值进行综合排序

4、设定一些阈值，用于筛选最合适的P

cost or benifit fun:

Most profitable
Fastest
Highest probability
Shortest queue

2、passenger recommender：

cost or benifit fun:

Probability of finding a vacant taxi
Average waiting time

6、SOME RESULT

反正就是各种liubi，放一些图

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

T-Finder: A Recommender System for Finding Passengers and Vacant Taxis（阅读笔记）20171225

《T-Finder: A Recommender System for Finding Passengers and Vacant Taxis》2013 IEEE TKDE Best Paper

Abstract

BASRLINE:

1、OVERVIEW

(1) Motivation

(2) Preliminary

(3) Framework

2、PARKING PLACES DETECTION

(1) Candidates Detection

(2) Filtering

(3) Parking Place Clustering

3、Recommender Systerm

3.1 Taxi Recommender

3.2 Passenger Recommender

4、ROAD SEGMENT CLUSTERING

5、ONLINE RECOMMENDATION

1、taxi recommender：

思路：

cost or benifit fun:

2、passenger recommender：

cost or benifit fun:

6、SOME RESULT

继续阅读

Machine Learning Foundation Lecure 01 The Learning Problem 学习笔记

Coursera Machine Learning 学习笔记（十）

Coursera Machine Learning 学习笔记（二）

Coursera Machine Learning 学习笔记（十一）

Coursera Machine Learning 学习笔记（四）

Coursera Machine Learning 学习笔记（九）

误差逆传播计算

Seq2Seq模型中的贪心搜索(Greedy Search)1. 引入2. 贪心算法：Greedy Search参考

机器学习的应用场景

机器学习 Machine Learning

Machine Learning Basis

Text Recognition with ML KitText Recognition with ML Kit

【吴恩达机器学习笔记】7支持向量机12支持向量机（Support Vector Machines）

scikit-learn中的SVM

ML - 贷款用户逾期情况分析6 - Final思路

SVM支持向量机二（Lagrange Duality）SVM支持向量机二（Lagrange Duality）