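1. Download the two archives heritrix-1.14.4-src.zip and heritrix-1.14.4.zip from crawler.archive.org/downloads.html and extract them. They are referred to below as the SRC package and the ZIP package.
2. In Eclipse, create a new Java project named Heritrix.
3. Copy the three folders org, com, and st from src/java in the SRC package into the project's src directory (that is, D:\eclipse\Heritrix\src).
4. Copy the resources folder from src in the SRC package to the project root, and copy the conf folder to the project root as well.
5. Copy the lib folder from the SRC package to the project root.
6. Copy the webapps folder from the ZIP package to the project root.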
7. In Eclipse, edit the heritrix.properties file in the project's conf folder as follows:
change @VERSION@ to 1.14.4
heritrix.cmdline.admin = admin:admin
heritrix.cmdline.port = 9090
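8. Right-click the Heritrix project and choose Build Path -> Configure Build Path -> Libraries tab -> Add JARs, select every .jar file in the lib directory, and click Finish.
9. Right-click Heritrix.java in the org.archive.crawler package under src and choose Run As -> Run Configurations -> Classpath -> Advanced (on the right) -> Add Folders -> select the conf folder in the project root -> Run.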
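If the launch fails, the folder layout produced by steps 3 to 7 is a likely place to look first. The following is a minimal sketch, not part of Heritrix, that lists which of the expected entries are missing under a project root. The class name LayoutCheck and the method missingEntries are hypothetical names chosen for this illustration; the entry list simply mirrors the steps above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Heritrix): checks the layout described in steps 3 to 7.
class LayoutCheck {
    // Entries that the copy steps are expected to place under the project root.
    static final String[] EXPECTED = {
        "src/org", "src/com", "src/st",
        "resources", "conf/heritrix.properties", "lib", "webapps"
    };

    // Returns the expected entries that do not exist under projectRoot.
    static List<String> missingEntries(Path projectRoot) {
        List<String> missing = new ArrayList<>();
        for (String entry : EXPECTED) {
            if (!Files.exists(projectRoot.resolve(entry))) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the project root as the first argument, e.g. D:\eclipse\Heritrix.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingEntries(root);
        if (missing.isEmpty()) {
            System.out.println("Layout looks complete.");
        } else {
            System.out.println("Missing: " + missing);
        }
    }
}
```

Run it with the project root as the only argument. An empty result only means the folders exist; it does not verify their contents or the Eclipse build path.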
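If the console prints output like the following, Heritrix has started successfully. Note that the paths and port in this sample differ from the values used in the steps above.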
07:33:40.174 EVENT Starting Jetty/4.2.23
07:33:40.215 WARN!! Delete existing temp dir C:\Users\gztzho\AppData\Local\Temp\Jetty_127_0_0_1_8080__ for WebApplicationContext[/,jar:file:/D:/workspace/MyHeritrix/webapps/admin.war!/]
07:33:40.294 EVENT Started WebApplicationContext[/,Heritrix Console]
07:33:40.358 EVENT Started SocketListener on 127.0.0.1:8080
07:33:40.359 EVENT [email protected]
Heritrix version: 1.14.4
Then open http://localhost:9090 in a browser to reach the console, and log in with the admin:admin credentials configured in step 7.
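The address to open depends on the heritrix.cmdline.port value set in step 7. As a rough sketch, the snippet below derives the console URL from the contents of heritrix.properties using java.util.Properties. The class name ConsoleUrl and the method fromProperties are hypothetical, and the fallback to 8080 is an assumption based on the port that appears in the sample log above, not a documented default.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper (not part of Heritrix): builds the console URL from the properties text.
class ConsoleUrl {
    // Assumption: 8080 is used when no port is configured, as seen in the sample log.
    static final String ASSUMED_DEFAULT_PORT = "8080";

    static String fromProperties(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        String port = props.getProperty("heritrix.cmdline.port", ASSUMED_DEFAULT_PORT).trim();
        if (port.isEmpty()) {
            port = ASSUMED_DEFAULT_PORT;
        }
        return "http://localhost:" + port;
    }

    public static void main(String[] args) throws IOException {
        // The same two settings that step 7 writes into conf/heritrix.properties.
        String sample = "heritrix.cmdline.admin = admin:admin\n"
                      + "heritrix.cmdline.port = 9090\n";
        System.out.println(fromProperties(new StringReader(sample)));
    }
}
```

To check a real project, pass a java.io.FileReader opened on conf/heritrix.properties instead of the StringReader. If the console answers on a different port than the one configured, the conf folder added to the classpath in step 9 is worth rechecking.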