10 Lucene: 03. Chinese Analyzers


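Topics covered: 12. Viewing an analyzer's output; 13. Introduction to Chinese analyzers; 14. Testing a Chinese analyzer; 15. Using a Chinese analyzer in code

16. Commonly used Field types

Download Lucene from the official website (click Download).

This example uses lucene-7.4.0.zip.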

Runtime requirement: JDK 9 or later according to the author (Lucene 7.x itself requires Java 8 at minimum).

Development tool: IntelliJ IDEA 2019.2.2

JAR files used in this introductory example:

commons-io-2.6.jar

lucene-core-7.4.0.jar

lucene-analyzers-common-7.4.0.jar

I. Environment setup and source code from the earlier tutorials

=====================================

10 Lucene: 01. Introduction to full-text search

10 Lucene: 02. Lucene getting-started example

=====================================

II. Viewing an analyzer's output


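// Imports needed at the top of the test class (JUnit 4 is assumed)
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.junit.Test;

@Test
public void testTokenStream() throws Exception {
    // 1) Create an Analyzer; here, a StandardAnalyzer
    Analyzer analyzer = new StandardAnalyzer();
    // 2) Call the analyzer's tokenStream method to obtain a TokenStream
    //    (the first argument is the field name, left empty here)
    TokenStream tokenStream = analyzer.tokenStream("", "The Spring Framework provides a comprehensive programming and configuration model for modern\n" +
            "Java-based enterprise applications - on any kind of deployment platform. ");
    // 3) Register a CharTermAttribute on the TokenStream; it acts like a pointer to the current token's text
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    // 4) Call reset() before consuming the stream; skipping this step throws an exception
    tokenStream.reset();
    // 5) Iterate over the tokens with a while loop
    while (tokenStream.incrementToken()) {
        System.out.println(charTermAttribute.toString());
    }
    // 6) Signal end-of-stream, then close the TokenStream
    tokenStream.end();
    tokenStream.close();
}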
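
To make the printed terms easier to interpret, here is a rough, standard-library-only sketch of the kind of pipeline an English analyzer applies: split on non-alphanumeric characters, lowercase, then drop stop words. It is not Lucene's implementation; the class name `SimpleAnalysisSketch` and its short stop list are assumptions made up for illustration, and the real StandardAnalyzer follows Unicode word-break rules and its own stop set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
final class SimpleAnalysisSketch {

    // Illustrative stop list (an assumption): a small subset of common English stop words.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "and", "for", "of", "on", "the"));

    // Returns the lowercased terms of the text, with stop words removed.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        // Split on every run of characters that are neither letters nor digits.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading delimiter produces one empty piece
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "The Spring Framework provides a comprehensive programming and configuration model";
        for (String term : analyze(text)) {
            System.out.println(term);
        }
    }
}
```

Run on that sample sentence, the sketch prints spring, framework, provides, comprehensive, programming, configuration and model, one per line. Words such as "The", "a" and "and" disappear, which is the same kind of filtering to look for in the real analyzer's output.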

III. Testing the Chinese analyzer


Copy the dictionary files and the XML configuration file into the src directory.
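
The post does not show the contents of the XML configuration file. As far as I know, IK Analyzer's configuration is a Java properties XML document whose entries list the extension dictionary and stop-word files. The sketch below parses such a document with the standard library only; the key names `ext_dict` and `ext_stopwords` reflect my understanding of IK's format, while the file names `hotword.dic` and `stopword.dic` and the class `IkConfigSketch` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical illustration class; not part of IK Analyzer.
final class IkConfigSketch {

    // Sample configuration (file names are assumptions, not taken from the post).
    static final String SAMPLE_CONFIG =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
            + "<properties>\n"
            + "  <comment>IK Analyzer extension settings</comment>\n"
            + "  <entry key=\"ext_dict\">hotword.dic;</entry>\n"
            + "  <entry key=\"ext_stopwords\">stopword.dic;</entry>\n"
            + "</properties>\n";

    // Returns the semicolon-separated file names stored under the given key.
    static List<String> listedFiles(String xml, String key) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> files = new ArrayList<>();
        for (String part : props.getProperty(key, "").split(";")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                files.add(name);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println("Extension dictionaries: " + listedFiles(SAMPLE_CONFIG, "ext_dict"));
        System.out.println("Stop-word dictionaries: " + listedFiles(SAMPLE_CONFIG, "ext_stopwords"));
    }
}
```

Whatever file names the configuration lists must exist on the classpath next to it, which is why the dictionaries and the XML file are copied into src together.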


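// Additional import needed for this test (the others match the previous example)
import org.wltea.analyzer.lucene.IKAnalyzer;

@Test
public void testTokenStreamCHN() throws Exception {
    // 1) Create an Analyzer; here, the third-party Chinese IKAnalyzer
    Analyzer analyzer = new IKAnalyzer();
    // 2) Call the analyzer's tokenStream method to obtain a TokenStream
    TokenStream tokenStream = analyzer.tokenStream("", "全文检索从最初的字符串匹配程序已经演进到能对超大文本java、java，语音、java，图像、活动影像等非结构化数据进行综合管理的大型软件。本教程只讨论文本检索。");
    // 3) Register a CharTermAttribute on the TokenStream; it acts like a pointer to the current token's text
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    // 4) Call reset() before consuming the stream; skipping this step throws an exception
    tokenStream.reset();
    // 5) Iterate over the tokens with a while loop
    while (tokenStream.incrementToken()) {
        System.out.println(charTermAttribute.toString());
    }
    // 6) Signal end-of-stream, then close the TokenStream
    tokenStream.end();
    tokenStream.close();
}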
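
Chinese text has no spaces between words, so a Chinese analyzer needs some way to decide where words begin and end. One classic dictionary-based approach is forward maximum matching, sketched below with the standard library only. This is not IK Analyzer's actual algorithm; the class `ForwardMaxMatchSketch`, its tiny dictionary, and the `maxWordLength` parameter are assumptions for illustration. The Chinese sample strings are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not how IK Analyzer is implemented.
final class ForwardMaxMatchSketch {

    // Greedily takes the longest dictionary word starting at each position;
    // falls back to a single character when nothing in the dictionary matches.
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> words = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word or one character long.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            words.add(text.substring(pos, end));
            pos = end;
        }
        return words;
    }

    public static void main(String[] args) {
        // Dictionary entries: "full-text search", "full text", "search", "text" (in Chinese).
        Set<String> dictionary = new HashSet<>(Arrays.asList(
                "\u5168\u6587\u68c0\u7d22", "\u5168\u6587", "\u68c0\u7d22", "\u6587\u672c"));
        // Input: the six characters of "full-text search" followed by "text".
        String text = "\u5168\u6587\u68c0\u7d22\u6587\u672c";
        System.out.println(segment(text, dictionary, 4));
        // With an empty dictionary every character becomes its own token.
        System.out.println(segment(text, new HashSet<>(), 4));
    }
}
```

With the sample dictionary, the input 全文检索文本 is split into 全文检索 and 文本, whereas with an empty dictionary it falls apart into six single-character tokens. That contrast is worth keeping in mind when comparing the IKAnalyzer output with what StandardAnalyzer produces for the same Chinese sentence.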

15. Using the Chinese analyzer in code

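// Imports needed at the top of the test class (JUnit 4 is assumed)
import java.io.File;
import java.nio.file.Path;
import org.apache.commons.io.FileUtils;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;
import org.wltea.analyzer.lucene.IKAnalyzer;

@Test
public void createIndexCHN() throws Exception {
    // 1. Create a Directory object that points at the index location.
    //    The index is stored on disk.
    Path path = new File("C:\\FFOutput\\index").toPath();
    Directory directory = FSDirectory.open(path);
    // 2. Create an IndexWriter on top of the Directory.
    //    The Chinese analyzer is configured here.
    IndexWriterConfig config = new IndexWriterConfig(new IKAnalyzer());
    IndexWriter indexWriter = new IndexWriter(directory, config);
    // 3. Read the files on disk and create one document per file
    File dir = new File("C:\\FFOutput\\searchsource");
    File[] files = dir.listFiles();
    if (files == null) {
        // listFiles() returns null when the path is not a readable directory
        indexWriter.close();
        throw new IllegalStateException("Not a readable directory: " + dir);
    }
    for (File file : files) {
        // File name
        String fileName = file.getName();
        // File path
        String filePath = file.getPath();
        // File content, read with the commons-io utility class
        String fileContent = FileUtils.readFileToString(file, "utf-8");
        // File size
        long fileSize = FileUtils.sizeOf(file);
        // Create the fields.
        // Argument 1: field name; argument 2: field value; argument 3: whether to store the value
        Field fieldName = new TextField("name", fileName, Field.Store.YES);
        Field fieldPath = new TextField("path", filePath, Field.Store.YES);
        Field fieldContent = new TextField("content", fileContent, Field.Store.YES);
        Field fieldSize = new TextField("size", fileSize + "", Field.Store.YES);
        // 4. Create the document object
        Document document = new Document();
        // Add the fields to the document
        document.add(fieldName);
        document.add(fieldPath);
        document.add(fieldContent);
        document.add(fieldSize);
        // 5. Write the document to the index
        indexWriter.addDocument(document);
    }
    // 6. Close the IndexWriter
    indexWriter.close();
}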
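
Why the analyzer passed to IndexWriterConfig matters can be seen with a toy inverted index: the terms the analyzer emits at indexing time are the only keys a later search can hit. The sketch below is a standard-library illustration under that assumption, not Lucene's data structure; `ToyInvertedIndex` and `whitespaceLowercase` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical illustration class; Lucene's real index is far more elaborate.
final class ToyInvertedIndex {

    private final Function<String, List<String>> analyzer;
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    ToyInvertedIndex(Function<String, List<String>> analyzer) {
        this.analyzer = analyzer;
    }

    // Analyzes the content and records the new document id under every term produced.
    int addDocument(String content) {
        int docId = nextDocId++;
        for (String term : analyzer.apply(content)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of the documents that contain exactly this term.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.emptySortedSet());
    }

    // A deliberately naive analyzer: lowercase, then split on whitespace.
    static List<String> whitespaceLowercase(String text) {
        List<String> terms = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex(ToyInvertedIndex::whitespaceLowercase);
        index.addDocument("Lucene full text search");
        index.addDocument("Spring and Lucene");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("spring"));
    }
}
```

With the whitespace analyzer, an unspaced Chinese sentence would be indexed as one long term, so a search for a single word inside it would find nothing. Plugging in an analyzer that segments Chinese into words is what makes those words searchable.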

======================

end
