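In a production environment, large numbers of users visit a website every day, which generates a large volume of log files. How can you find out which URLs users visited today that they did not visit yesterday? The Python script below can do this.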
1. First, extract the URL portion from yesterday's and today's log files and save each set of URLs into its own file, one URL per line.
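The extraction step might look something like the sketch below. The post does not say what the logs look like, so this assumes the common/combined access log format used by Apache and nginx; the regular expression and the helper names `extract_urls` and `write_url_file` are illustrative, not part of the original script.

```python
import re

# ASSUMPTION: lines follow the common/combined access log format, e.g.
# 1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[\d.]+"')


def extract_urls(log_lines):
    """Return the request URL from every line that matches the assumed format."""
    urls = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:  # lines that do not match are skipped silently
            urls.append(match.group(1))
    return urls


def write_url_file(log_path, url_path):
    """Hypothetical helper: write one extracted URL per line to url_path."""
    with open(log_path) as log, open(url_path, 'w') as out:
        for url in extract_urls(log):
            out.write(url + '\n')
```

Calling `write_url_file` once for yesterday's log and once for today's would produce the two URL files that the comparison script expects. If your logs use a different format, the pattern needs to be adjusted accordingly.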
Then run the Python script below:
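# Ask for the URL files extracted from today's and yesterday's logs
U_F2 = input('please enter the URL file you have extracted today:')
U_F1 = input('please enter the URL file that was extracted yesterday:')
# Read each file into a set of lines (one URL per line)
with open(U_F2) as f2:
    s2 = set(f2)
with open(U_F1) as f1:
    s1 = set(f1)
U_F3 = input('please enter the file name of the data you want to store:')
# Write the URLs that appear today but not yesterday
with open(U_F3, 'w') as f3:
    f3.writelines(s2 - s1)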
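One caveat: the script above compares raw lines, newline included, so a URL on the last line of a file that lacks a trailing newline can be reported as new even though it appears in both files. The output order is also arbitrary, because sets are unordered. A possible variant that strips whitespace and sorts the result is sketched below; the function names `new_urls` and `write_new_urls` are made up for illustration.

```python
def new_urls(today_lines, yesterday_lines):
    """Return URLs seen today but not yesterday, sorted, without newlines."""
    # Strip surrounding whitespace and ignore blank lines before comparing
    today = {line.strip() for line in today_lines if line.strip()}
    yesterday = {line.strip() for line in yesterday_lines if line.strip()}
    return sorted(today - yesterday)


def write_new_urls(today_path, yesterday_path, out_path):
    """Hypothetical helper: compare two URL files and save the difference."""
    with open(today_path) as f2, open(yesterday_path) as f1:
        result = new_urls(f2, f1)
    with open(out_path, 'w') as f3:
        f3.writelines(url + '\n' for url in result)
    return result
```

Keeping the comparison in a function that accepts any iterable of lines makes it easy to check with small lists before pointing it at real files.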
