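Today I'll walk through installing the dynamic-synonym plugin in Elasticsearch (ES). First, download the plugin source that matches your ES version from GitHub. The versions on GitHub generally support only a local dictionary file or a remote text dictionary; source that uses a database as the dictionary can be found on Gitee. The rough idea is to adjust a few configuration parameters, create a table to serve as the synonym dictionary, drop the packaged plugin jar into the es-plugins directory, and restart ES. But!!! It did not run for me at first, and I hit a lot of problems [crying]. I run ES in a Docker container, and the container kept reporting Java permission errors. I searched all over the web and pieced together a fix bit by bit, and I'm really happy it finally works!!!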

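Here is the overall approach:

  1. Download the source and modify the dynamic-synonym configuration
  2. Add the MySQL code
  3. Create a table for dynamic-synonym
  4. Modify the java.policy file in the Docker ES container **[very important]**
  5. Put the packaged jar into the {es-root}/es-plugins directory
  6. Restart the ES container with Docker
  7. Create a dynamic-synonym index in ES to test

**The plugin code I have already configured is given at the end of the article!!! Be sure to grab it!!!** You can jump straight to step 4 or 5, depending on your needs.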

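1. Download the source and modify the configuration

There are a great many forks of this plugin on GitHub, far too many to look through. I recommend starting from the original repository: https://github.com/bells/elasticsearch-analysis-dynamic-synonym. After downloading, switch to the branch that matches your ES version. My ES version is 7.12.1, so I checked out the corresponding branch and then adjusted the pom file.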

1.1 Modify the pom.xml file

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.bellszhu.elasticsearch</groupId>
    <artifactId>elasticsearch-analysis-dynamic-synonym</artifactId>
    <version>7.12.1</version>
    <packaging>jar</packaging>
    <name>elasticsearch-dynamic-synonym</name>
    <description>Analysis-plugin for synonym</description>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <elasticsearch.version>${project.version}</elasticsearch.version>
        <maven.compiler.target>1.8</maven.compiler.target>
        <elasticsearch.plugin.name>analysis-dynamic-synonym</elasticsearch.plugin.name>
        <elasticsearch.assembly.descriptor>${project.basedir}/src/main/assemblies/plugin.xml</elasticsearch.assembly.descriptor>
        <elasticsearch.plugin.classname>com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin</elasticsearch.plugin.classname>
        <elasticsearch.plugin.jvm>true</elasticsearch.plugin.jvm>
    </properties>

    <licenses>
        <license>
            <name>The Apache Software License, Version 2.0</name>
            <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
        </license>
    </licenses>

    <parent>
        <groupId>org.sonatype.oss</groupId>
        <artifactId>oss-parent</artifactId>
        <version>9</version>
    </parent>

    <scm>
        <connection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git</connection>
        <developerConnection>scm:git:git@github.com:bells/elasticsearch-analysis-dynamic-synonym.git
        </developerConnection>
        <url>https://github.com/bells/elasticsearch-analysis-dynamic-synonym</url>
    </scm>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>${elasticsearch.version}</version>
        </dependency>
        <dependency>
            <groupId>org.codelibs.elasticsearch.module</groupId>
            <artifactId>analysis-common</artifactId>
            <version>7.10.2</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.13.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.13</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.22</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.13.2</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.11.1</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.codelibs</groupId>
            <artifactId>elasticsearch-cluster-runner</artifactId>
            <version>7.10.2.0</version>
            <scope>test</scope>
        </dependency>
    </dependencies>


    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.3.2</version>
                <configuration>
                    <source>${maven.compiler.target}</source>
                    <target>${maven.compiler.target}</target>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.11</version>
                <configuration>
                    <includes>
                        <include>**/*Tests.java</include>
                    </includes>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-source-plugin</artifactId>
                <version>2.1.2</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
                        <goals>
                            <goal>jar</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <appendAssemblyId>false</appendAssemblyId>
                    <outputDirectory>${project.build.directory}/releases/</outputDirectory>
                    <descriptors>
                        <descriptor>${basedir}/src/main/assemblies/plugin.xml</descriptor>
                    </descriptors>
                    <archive>
                        <manifest>
                            <mainClass>fully.qualified.MainClass</mainClass>
                        </manifest>
                    </archive>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

When connecting to MySQL, pay attention to the MySQL driver jar: the driver class and JDBC URL differ between driver versions.
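To make the version difference concrete, here is a minimal sketch, not part of the plugin. Connector/J 5.x uses the driver class com.mysql.jdbc.Driver, while 6.x and 8.x use com.mysql.cj.jdbc.Driver and often need a serverTimezone parameter in the URL. The class name JdbcSettings and its method are hypothetical helpers for illustration only.

```java
// Illustrative helper (not part of the plugin): picks the driver class and
// URL shape for a given MySQL Connector/J major version.
final class JdbcSettings {
    final String driverClass;
    final String url;

    private JdbcSettings(String driverClass, String url) {
        this.driverClass = driverClass;
        this.url = url;
    }

    static JdbcSettings forConnectorMajor(int major, String host, int port, String db) {
        String base = "jdbc:mysql://" + host + ":" + port + "/" + db;
        if (major >= 6) {
            // Connector/J 6.x and 8.x: new driver class, explicit server time zone
            return new JdbcSettings("com.mysql.cj.jdbc.Driver", base + "?serverTimezone=GMT");
        }
        // Connector/J 5.x: legacy driver class, plain URL
        return new JdbcSettings("com.mysql.jdbc.Driver", base);
    }

    public static void main(String[] args) {
        JdbcSettings s = JdbcSettings.forConnectorMajor(8, "192.168.255.132", 3306, "word");
        System.out.println(s.driverClass + " -> " + s.url);
    }
}
```

With the 8.0.22 driver declared in the pom above, the values produced for major version 8 match the jdbc.driver and jdbc.url settings used later in this article.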

2. Add the MySQL code

2.1 Add the MySqlRemoteSynonymFile class

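// Place this class in the same package as SynonymFile and RemoteSynonymFile.
// The Elasticsearch and Lucene imports below assume ES 7.12.x.
import com.bellszhu.elasticsearch.plugin.DynamicSynonymPlugin;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.synonym.SynonymMap;
import org.elasticsearch.common.io.PathUtils;
import org.elasticsearch.env.Environment;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Properties;

public class MySqlRemoteSynonymFile implements SynonymFile {

    /** Name of the database configuration file. */
    private final static String DB_PROPERTIES = "jdbc-reload.properties";
    private static final Logger logger = LogManager.getLogger("dynamic-synonym");

    private String format;

    private boolean expand;

    private boolean lenient;

    private Analyzer analyzer;

    private Environment env;

    // Value of synonyms_path ("MySql" when this class is used)
    private String location;

    /** Property key: database URL. */
    private static final String JDBC_URL = "jdbc.url";
    /** Property key: JDBC driver class. */
    private static final String JDBC_DRIVER = "jdbc.driver";
    /** Property key: database user name. */
    private static final String JDBC_USER = "jdbc.user";
    /** Property key: database password. */
    private static final String JDBC_PASSWORD = "jdbc.password";

    /** Synonym version (last modification time) currently loaded on this node. */
    private LocalDateTime thisSynonymVersion = LocalDateTime.now();

    // Shared by all instances; opened lazily and never closed
    private static Connection connection = null;

    private Statement statement = null;

    private Properties props;

    private Path conf_dir;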

    MySqlRemoteSynonymFile(Environment env, Analyzer analyzer,
                           boolean expand, boolean lenient, String format, String location) {
        this.analyzer = analyzer;
        this.expand = expand;
        this.format = format;
        this.lenient = lenient;
        this.env = env;
        this.location = location;
        this.props = new Properties();

        // Resolve <directory containing the plugin jar>/config/jdbc-reload.properties
        Path filePath = PathUtils.get(new File(DynamicSynonymPlugin.class.getProtectionDomain().getCodeSource()
                .getLocation().getPath())
                .getParent(), "config")
                .toAbsolutePath();
        this.conf_dir = filePath.resolve(DB_PROPERTIES);

        // Load the database settings; try-with-resources closes the stream
        File configFile = conf_dir.toFile();
        try (InputStream input = new FileInputStream(configFile)) {
            props.load(input);
        } catch (FileNotFoundException e) {
            logger.error("database config file jdbc-reload.properties not found", e);
        } catch (IOException e) {
            logger.error("failed to load database config file jdbc-reload.properties", e);
        }
        // Sync the local version with the database so the first scheduled check does not reload needlessly
        isNeedReloadSynonymMap();
    }

    /**
     * Loads the synonym dictionary from MySQL into a SynonymMap.
     * @return SynonymMap
     */
    @Override
    public SynonymMap reloadSynonymMap() {
        try {
            logger.info("start reload MySQL synonyms from {}.", location);
            Reader rulesReader = getReader();
            SynonymMap.Builder parser = RemoteSynonymFile.getSynonymParser(rulesReader, format, expand, lenient, analyzer);
            return parser.build();
        } catch (Exception e) {
            // Passing the exception as the last argument logs the full stack trace
            logger.error("reload MySQL synonyms {} error!", location, e);
            throw new IllegalArgumentException(
                    "could not reload MySQL synonyms to build synonyms", e);
        }
    }

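    /**
     * Decides whether the synonyms need to be reloaded.
     * @return true if the version in MySQL differs from the one held locally
     */
    @Override
    public boolean isNeedReloadSynonymMap() {
        try {
            LocalDateTime mysqlLastModify = getMySqlSynonymLastModify();
            // null means the query failed or the table is empty: nothing to reload
            if (mysqlLastModify != null && !thisSynonymVersion.isEqual(mysqlLastModify)) {
                thisSynonymVersion = mysqlLastModify;
                return true;
            }
        } catch (Exception e) {
            logger.error("failed to check the synonym version", e);
        }
        return false;
    }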

    /**
     * Reads the synonym version (latest last_modify) from MySQL.
     * Used to decide whether the synonyms need to be reloaded.
     *
     * @return the latest modification time, or null if it could not be read
     */
    public LocalDateTime getMySqlSynonymLastModify() {
        ResultSet resultSet = null;
        LocalDateTime mysqlSynonymLastModify = null;
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            resultSet = statement.executeQuery(props.getProperty("jdbc.reload.swith.synonym.last_modify"));
            while (resultSet.next()) {
                Timestamp lastModify = resultSet.getTimestamp("last_modify");
                // MAX(last_modify) is NULL when the table is empty
                if (lastModify != null) {
                    mysqlSynonymLastModify = lastModify.toLocalDateTime();
                }
                // logger.info("last_modify in MySQL: {}, version on this node: {}", mysqlSynonymLastModify, thisSynonymVersion);
            }
        } catch (SQLException e) {
            logger.error("failed to query the synonym last_modify time", e);
        } finally {
            try {
                if (resultSet != null) {
                    resultSet.close();
                }
            } catch (SQLException e) {
                logger.error("failed to close the ResultSet", e);
            }
        }
        return mysqlSynonymLastModify;
    }

    /**
     * Queries the synonym rows from the database.
     * @return one entry per row of the words column
     */
    public ArrayList<String> getDbData() {
        ArrayList<String> arrayList = new ArrayList<>();
        ResultSet resultSet = null;
        try {
            if (statement == null) {
                statement = getConnection(props);
            }
            String sql = props.getProperty("jdbc.reload.synonym.sql");
            logger.info("querying the synonym list, SQL: {}", sql);
            resultSet = statement.executeQuery(sql);
            while (resultSet.next()) {
                String theWord = resultSet.getString("words");
                arrayList.add(theWord);
            }
        } catch (SQLException e) {
            logger.error("failed to query the synonym list", e);
        } finally {
            try {
                if (resultSet != null) {
                    resultSet.close();
                }
            } catch (SQLException e) {
                logger.error("failed to close the ResultSet", e);
            }
        }
        return arrayList;
    }

    /**
     * Builds the synonym rules text from the database rows.
     * @return Reader over the rules, one rule per line
     */
    @Override
    public Reader getReader() {

        StringBuilder sb = new StringBuilder();
        try {
            ArrayList<String> dbData = getDbData();
            for (String dbDatum : dbData) {
                logger.info("loading synonyms: {}", dbDatum);
                // Each row holds one group of synonyms, separated by ASCII commas
                sb.append(dbDatum)
                        .append(System.lineSeparator());
            }
        } catch (Exception e) {
            logger.error("failed to load synonyms", e);
        }
        return new StringReader(sb.toString());
    }

    /**
     * Opens (or reuses) the shared database connection and creates a Statement on it.
     * @param props settings loaded from jdbc-reload.properties
     * @return a Statement on the shared connection
     * @throws SQLException if the connection cannot be obtained
     */
    private static Statement getConnection(Properties props) throws SQLException {
        try {
            Class.forName(props.getProperty(JDBC_DRIVER));
        } catch (ClassNotFoundException e) {
            logger.error("failed to load the JDBC driver", e);
        }
        if (connection == null) {
            connection = DriverManager.getConnection(
                    props.getProperty(JDBC_URL),
                    props.getProperty(JDBC_USER),
                    props.getProperty(JDBC_PASSWORD));
        }
        return connection.createStatement();
    }
}
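The reload logic in the class above comes down to two small pieces: compare a locally remembered timestamp with MAX(last_modify) from the table, and turn the rows into one comma-separated rule per line. The following self-contained sketch isolates both without any database. SynonymReloadSketch and its methods are hypothetical names, and the null guard covers the case where the query returns no timestamp.

```java
import java.time.LocalDateTime;
import java.util.List;

// Stand-alone sketch of the version check and rule-text building; no JDBC involved.
final class SynonymReloadSketch {
    private LocalDateTime localVersion;

    SynonymReloadSketch(LocalDateTime initialVersion) {
        this.localVersion = initialVersion;
    }

    /** Returns true only when the database timestamp is known and differs from the local one. */
    boolean needsReload(LocalDateTime dbLastModify) {
        if (dbLastModify == null || dbLastModify.isEqual(localVersion)) {
            return false;
        }
        localVersion = dbLastModify; // remember the version we are about to load
        return true;
    }

    /** Joins the rows into rules text: one synonym group per line. */
    static String toRulesText(List<String> rows) {
        StringBuilder sb = new StringBuilder();
        for (String row : rows) {
            sb.append(row).append('\n');
        }
        return sb.toString();
    }
}
```

Because the table column is declared with ON UPDATE CURRENT_TIMESTAMP, inserting or updating a row moves MAX(last_modify) forward, which is what makes the next scheduled check return true.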

2.2 Add the MySQL option to getSynonymFile

Modify the resource-loading code in DynamicSynonymTokenFilterFactory:

SynonymFile getSynonymFile(Analyzer analyzer) {
    try {
        SynonymFile synonymFile;
        if ("MySql".equals(location)) {
            synonymFile = new MySqlRemoteSynonymFile(environment, analyzer, expand, lenient, format, location);
        } else if (location.startsWith("http://") || location.startsWith("https://")) {
            synonymFile = new RemoteSynonymFile(
                environment, analyzer, expand, lenient,  format, location);
        } else {
            synonymFile = new LocalSynonymFile(
                environment, analyzer, expand, lenient, format, location);
        }
        if (scheduledFuture == null) {
            scheduledFuture = pool.scheduleAtFixedRate(new Monitor(synonymFile),
                                                       interval, interval, TimeUnit.SECONDS);
        }
        return synonymFile;
    } catch (Exception e) {
        logger.error("failed to get synonyms: " + location, e);
        throw new IllegalArgumentException("failed to get synonyms : " + location, e);
    }
}
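The branch order above decides which loader handles a given synonyms_path value. As a rough illustration, the same decision can be written as a tiny stand-alone function. SynonymSource and Kind are hypothetical names that do not exist in the plugin.

```java
// Stand-alone sketch of how synonyms_path is routed to a loader.
final class SynonymSource {
    enum Kind { MYSQL, REMOTE, LOCAL }

    static Kind classify(String location) {
        if ("MySql".equals(location)) {
            return Kind.MYSQL;   // exact, case-sensitive match
        }
        if (location.startsWith("http://") || location.startsWith("https://")) {
            return Kind.REMOTE;  // remote text dictionary
        }
        return Kind.LOCAL;       // anything else is treated as a local file path
    }

    public static void main(String[] args) {
        System.out.println(classify("MySql"));
        System.out.println(classify("https://example.com/synonyms.txt"));
        System.out.println(classify("synonyms.txt"));
    }
}
```

One consequence worth noting: the comparison is case-sensitive, so a value such as "mysql" falls through to the local-file branch rather than the MySQL loader.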

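3. Create a table for dynamic-synonym

3.1 Create the database and table

My database is named word and the table is named synonym. (The Navicat export header below lists the source schema as auth; the JDBC URL used later in this article points at word.)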

/*
 Navicat Premium Data Transfer

 Source Server         : localhost
 Source Server Type    : MySQL
 Source Server Version : 50717
 Source Host           : localhost:3306
 Source Schema         : auth

 Target Server Type    : MySQL
 Target Server Version : 50717
 File Encoding         : 65001

 Date: 05/01/2022 17:01:31
*/

SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Table structure for synonym
-- ----------------------------
DROP TABLE IF EXISTS `synonym`;
CREATE TABLE `synonym`  (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `words` text CHARACTER SET utf8 COLLATE utf8_bin NULL COMMENT 'synonyms',
  `last_modify` timestamp(0) NULL DEFAULT CURRENT_TIMESTAMP(0) ON UPDATE CURRENT_TIMESTAMP(0) COMMENT 'last update time',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE = InnoDB AUTO_INCREMENT = 2 CHARACTER SET = utf8 COLLATE = utf8_bin ROW_FORMAT = Dynamic;

-- ----------------------------
-- Records of synonym
-- ----------------------------
INSERT INTO `synonym` VALUES (1, '西红柿,番茄,洋柿子', '2022-01-05 16:48:24');

SET FOREIGN_KEY_CHECKS = 1;

3.2 Edit the database connection configuration file

Add a config/jdbc-reload.properties file to the project, at the same level as the src directory.

# permission java.net.SocketPermission "*", "connect,resolve";
# CHCP 65001

jdbc.url=jdbc:mysql://192.168.255.132:3306/word?serverTimezone=GMT
jdbc.user=root
jdbc.driver=com.mysql.cj.jdbc.Driver
jdbc.password=123456
# Query that returns the synonym rows
jdbc.reload.synonym.sql=select words from synonym
# Query that returns the latest update time
jdbc.reload.swith.synonym.last_modify=SELECT MAX(last_modify) last_modify FROM synonym
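A misspelled or missing key in this file only shows up at runtime as a null passed to JDBC, so it can help to check the file before packaging. The sketch below is a hypothetical stand-alone checker, not part of the plugin. It uses the six keys exactly as the Java class reads them, including the spelling "swith", which must match the code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Stand-alone sketch: reports which keys expected by the plugin code are missing or blank.
final class JdbcReloadConfigCheck {
    static final String[] REQUIRED_KEYS = {
        "jdbc.url", "jdbc.user", "jdbc.driver", "jdbc.password",
        "jdbc.reload.synonym.sql", "jdbc.reload.swith.synonym.last_modify"
    };

    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

Passing the file contents as a string keeps the check independent of where the file lives; an empty result means every key the class reads is present.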

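4. Modify the java.policy file in the Docker ES container **[very important]**

I deploy ES with Docker. If ES is installed directly on Windows or CentOS, edit the java.policy file of the JDK that ES runs on, which in that case is the system JDK. For a Docker deployment, editing the host's JDK has no effect, because the ES container runs on its own bundled JDK, independent of the host system.

4.1 Locate java.policy

First enter the container with docker exec -it es /bin/bash, then cd /usr/share/elasticsearch/jdk/conf/security/ and find the java.policy file.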

[root@localhost ~]# docker exec -it es /bin/bash
[root@ee5fd3f35131 elasticsearch]# cd /usr/share/elasticsearch/jdk/conf/security/
[root@ee5fd3f35131 security]# ls
java.policy  java.security  policy
[root@ee5fd3f35131 security]# vi java.policy

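4.2 Modify the java.policy file

The full contents of the file follow. The entries at the end of the default grant block (SocketPermission "*", the three RuntimePermission lines, and AllPermission) are not part of a stock java.policy. Note that AllPermission on its own already covers every other permission and lifts the security manager's restrictions for all code in the JVM.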

//
// This system policy file grants a set of default permissions to all domains
// and can be configured to grant additional permissions to modules and other
// code sources. The code source URL scheme for modules linked into a
// run-time image is "jrt".
//
// For example, to grant permission to read the "foo" property to the module
// "com.greetings", the grant entry is:
//
// grant codeBase "jrt:/com.greetings" {
//     permission java.util.PropertyPermission "foo", "read";
// };
//
grant codeBase "file:${{java.ext.dirs}}/*" {
    permission java.security.AllPermission;
};

// default permissions granted to all domains
grant {
    // allows anyone to listen on dynamic ports
    permission java.net.SocketPermission "localhost:0", "listen";

    // "standard" properies that can be read by anyone
    permission java.util.PropertyPermission "java.version", "read";
    permission java.util.PropertyPermission "java.vendor", "read";
    permission java.util.PropertyPermission "java.vendor.url", "read";
    permission java.util.PropertyPermission "java.class.version", "read";
    permission java.util.PropertyPermission "os.name", "read";
    permission java.util.PropertyPermission "os.version", "read";
    permission java.util.PropertyPermission "os.arch", "read";
    permission java.util.PropertyPermission "file.separator", "read";
    permission java.util.PropertyPermission "path.separator", "read";
    permission java.util.PropertyPermission "line.separator", "read";
    permission java.util.PropertyPermission
                   "java.specification.version", "read";
    permission java.util.PropertyPermission "java.specification.vendor", "read";
    permission java.util.PropertyPermission "java.specification.name", "read";
    permission java.util.PropertyPermission
                   "java.vm.specification.version", "read";
    permission java.util.PropertyPermission
                   "java.vm.specification.vendor", "read";
    permission java.util.PropertyPermission
                   "java.vm.specification.name", "read";
    permission java.util.PropertyPermission "java.vm.version", "read";
    permission java.util.PropertyPermission "java.vm.vendor", "read";
    permission java.util.PropertyPermission "java.vm.name", "read";
    permission java.net.SocketPermission "*", "connect,resolve";
    permission java.lang.RuntimePermission "setContextClassLoader";
    permission java.lang.RuntimePermission "accessDeclaredMembers";
    permission java.lang.RuntimePermission "createClassLoader";
    permission java.security.AllPermission;
};
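Granting AllPermission in java.policy is what worked here, but it opens the JVM to all code. A narrower pattern that Elasticsearch 7.x plugins commonly use is to request only the needed permissions in a plugin-security.policy file shipped with the plugin and to wrap the sensitive call in AccessController.doPrivileged. The sketch below shows only the Java side, with a stand-in action instead of a real JDBC call. PrivilegedCalls is a hypothetical helper, and whether this alone is sufficient for this plugin on ES 7.12.1 was not verified for this article.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.function.Supplier;

// Stand-alone sketch: runs an action with the caller's own permissions,
// regardless of less-privileged code further up the call stack.
final class PrivilegedCalls {
    @SuppressWarnings({"removal", "deprecation"}) // AccessController is deprecated on recent JDKs
    static <T> T run(Supplier<T> action) {
        // The cast selects the PrivilegedAction overload of doPrivileged
        return AccessController.doPrivileged((PrivilegedAction<T>) action::get);
    }

    public static void main(String[] args) {
        // Stand-in for something like opening the JDBC connection
        String result = run(() -> "connected");
        System.out.println(result);
    }
}
```

In the plugin, a call such as DriverManager.getConnection would sit inside the action, paired with a matching SocketPermission entry in the plugin's own policy file.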

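5. Put the packaged jar into the {es-root}/es-plugins directory

5.1 Before packaging, double-check your ES version

The <version> in pom.xml (7.12.1 here) is also used as elasticsearch.version, so it must match the ES version you are running. With the assembly configuration in the pom above, running mvn package should write the plugin archive to target/releases/.

5.2 After packaging, unzip the archive and upload it to the ES plugins directory on the server

I deploy with Docker. On a local Windows install, just find the plugins directory and put the files there.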

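6. Restart the ES container with Docker

If ES is installed directly on the system, go to the elasticsearch/bin directory and restart it from there. I'm running it in a container.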

docker restart es

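After the container restarts, check its console output for problems. Permission errors almost always mean java.policy is not configured correctly. For database errors, create a small local Java project and try connecting with the same settings to see whether that works.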

docker logs -f es
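For the database case, a local connectivity check can be as small as the following sketch. It uses only java.sql, so the MySQL driver jar has to be on the classpath when you run it against a real server. JdbcPing is a hypothetical name, and the URL, user, and password are whatever you put in jdbc-reload.properties.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Stand-alone sketch: tries to open a JDBC connection and reports the outcome.
final class JdbcPing {
    static String check(String url, String user, String password) {
        try (Connection c = DriverManager.getConnection(url, user, password)) {
            // isValid waits up to 5 seconds for the server to answer
            return c.isValid(5) ? "OK" : "FAILED: connection is not valid";
        } catch (SQLException e) {
            return "FAILED: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        if (args.length < 3) {
            System.out.println("usage: JdbcPing <jdbc-url> <user> <password>");
            return;
        }
        System.out.println(check(args[0], args[1], args[2]));
    }
}
```

If this succeeds locally but the plugin still fails inside the container, the problem is more likely network reachability from the container or the permission setup than the JDBC settings themselves.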

7. Create a dynamic-synonym index in ES to test

PUT synonyms_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "synonym": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["synonym_custom"]
        }
      },
      "filter": {
        "synonym_custom": {
          "type": "dynamic_synonym",
          "synonyms_path": "MySql"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "synonym"
      }
    }
  }
}

GET /synonyms_index/_analyze
{
  "text": "西红柿",
  "analyzer": "synonym"
}

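If the response lists the synonyms alongside the original word, the plugin is working. Time to celebrate!!! When you are done, delete the test index:

DELETE synonyms_index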

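8. Summary

8.1 Source code

This took me about a day to get working. To save you the time, you can download the source I have already configured.

8.2 Closing notes

After a day of digging, I finally have a rough understanding of how ES plugins run, which lays the groundwork for autocomplete, search tuning, ad recommendation, and aggregation queries later on.

I'll write follow-up posts if I build those features. Finally, thank you all for your support.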
