String.replace() vs. String.replaceAll(): A Performance Comparison

Some commonly used Java APIs deserve a closer look, String.replace() and String.replaceAll() among them. This post examines the two using the Android 7.1 source code. They are declared as follows:

/**
     * Replaces each substring of this string that matches the literal target
     * sequence with the specified literal replacement sequence. The
     * replacement proceeds from the beginning of the string to the end, for
     * example, replacing "aa" with "b" in the string "aaa" will result in
     * "ba" rather than "ab".
     *
     * @param  target The sequence of char values to be replaced
     * @param  replacement The replacement sequence of char values
     * @return  The resulting string
     * @throws NullPointerException if <code>target</code> or
     *         <code>replacement</code> is <code>null</code>.
     * @since 
     */
    public String replace(CharSequence target, CharSequence replacement)

/**
     * Replaces each substring of this string that matches the given <a
     * href="../util/regex/Pattern.html#sum">regular expression</a> with the
     * given replacement.
     *
     * <p> An invocation of this method of the form
     * <i>str</i><tt>.replaceAll(</tt><i>regex</i><tt>,</tt> <i>repl</i><tt>)</tt>
     * yields exactly the same result as the expression
     *
     * <blockquote><tt>
     * {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#compile
     * compile}(</tt><i>regex</i><tt>).{@link
     * java.util.regex.Pattern#matcher(java.lang.CharSequence)
     * matcher}(</tt><i>str</i><tt>).{@link java.util.regex.Matcher#replaceAll
     * replaceAll}(</tt><i>repl</i><tt>)</tt></blockquote>
     *
     *<p>
     * Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in the
     * replacement string may cause the results to be different than if it were
     * being treated as a literal replacement string; see
     * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
     * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
     * meaning of these characters, if desired.
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     * @param   replacement
     *          the string to be substituted for each match
     *
     * @return  The resulting <tt>String</tt>
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 
     * @spec JSR-
     */
    public String replaceAll(String regex, String replacement)

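As the documentation shows:

(1) replace(): returns a copy of the input string in which every occurrence of the literal substring target is replaced with replacement.

(2) replaceAll(): returns a copy of the input string in which every substring that matches the regular expression regex is replaced with replacement.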
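
The difference between literal and regex matching is easiest to see on a target that contains a regex metacharacter. The following is a minimal sketch, not taken from the original post: the class and method names are made up for illustration, and only standard java.util.regex calls are used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class LiteralVsRegexDemo {
    // replace(): the target "." is treated as a literal dot.
    static String literalDots(String s) {
        return s.replace(".", "-");
    }

    // replaceAll(): the regex "." matches any character (except line terminators).
    static String regexDots(String s) {
        return s.replaceAll(".", "-");
    }

    // Quoting both arguments makes replaceAll() behave like a literal replacement.
    static String quotedReplaceAll(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(literalDots("a.b.c"));                // a-b-c
        System.out.println(regexDots("a.b.c"));                  // -----
        System.out.println(quotedReplaceAll("a.b.c", ".", "$")); // a$b$c
    }
}
```

Pattern.quote() protects the target and Matcher.quoteReplacement() protects the replacement, which is the escape hatch the Javadoc above points to for backslashes and dollar signs.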

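Clearly the two APIs serve different purposes, and replaceAll() is the more powerful one. Because replaceAll() has to process a regular expression, it should also be slower than replace(). But for the same task, how large is the gap? A simple experiment: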

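    // The article describes a scale of about one million iterations; the exact
    // loop bounds did not survive in the source, so this value is an assumption.
    final int iterations = 1_000_000;

    // Time the literal replacement.
    long now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "aabbbc".replace("b", "a");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    // Time the regex-based replacement of the same target.
    now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "aabbbc".replaceAll("b", "a");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));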

Results:

10-20 16:26:08.401 19518 19670 I TEST    : replace() : 2170
10-20 16:27:47.828 19518 19670 I TEST    : replaceAll() : 99427

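At a scale of about one million iterations, replaceAll() takes roughly 46 times as long as replace() (99427 ms vs. 2170 ms). (Thread scheduling is ignored here; timing relies only on the system timestamps taken at the start and end. Also, this example exists purely for comparison: in practice, when the string being replaced has length 1, the more suitable API is of course replace(char, char).)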
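
The Javadoc quoted earlier says that replaceAll(regex, repl) behaves like Pattern.compile(regex).matcher(str).replaceAll(repl), so the pattern is compiled on every call. The sketch below is one way to rerun the comparison on a plain JVM, adding a precompiled Pattern and the replace(char, char) overload as extra data points. It is not the original test code: the class name, helper names, and iteration counts are assumptions, and no timings are claimed for it.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

// Hypothetical timing harness; all names and counts here are illustrative.
final class ReplaceTiming {
    static final String INPUT = "aabbbc";
    // Compiled once and reused across calls.
    static final Pattern B = Pattern.compile("b");

    static String viaReplace(String s)     { return s.replace("b", "a"); }
    static String viaReplaceChar(String s) { return s.replace('b', 'a'); }
    static String viaReplaceAll(String s)  { return s.replaceAll("b", "a"); }
    static String viaPrecompiled(String s) { return B.matcher(s).replaceAll("a"); }

    // Applies op to INPUT `iterations` times and returns the elapsed milliseconds.
    static long timeMillis(UnaryOperator<String> op, int iterations) {
        int sink = 0; // accumulate result lengths so the results are actually used
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += op.apply(INPUT).length();
        }
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        if (sink < 0) System.out.println(sink); // never true here; keeps `sink` live
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000; // assumed scale, following the article's description

        // Brief warm-up so each variant has run before it is measured.
        timeMillis(ReplaceTiming::viaReplace, 10_000);
        timeMillis(ReplaceTiming::viaReplaceChar, 10_000);
        timeMillis(ReplaceTiming::viaReplaceAll, 10_000);
        timeMillis(ReplaceTiming::viaPrecompiled, 10_000);

        System.out.println("replace(String)   : " + timeMillis(ReplaceTiming::viaReplace, iterations) + " ms");
        System.out.println("replace(char)     : " + timeMillis(ReplaceTiming::viaReplaceChar, iterations) + " ms");
        System.out.println("replaceAll()      : " + timeMillis(ReplaceTiming::viaReplaceAll, iterations) + " ms");
        System.out.println("precompiled regex : " + timeMillis(ReplaceTiming::viaPrecompiled, iterations) + " ms");
    }
}
```

System.nanoTime() is used instead of System.currentTimeMillis() because it is intended for measuring elapsed time. Absolute numbers will differ from the Android logs above, since runtime and hardware differ.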

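Of course, replace() is the usual choice for replacing a fixed string, and a complex regular expression leaves no alternative to replaceAll(). The two overlap mainly in simple combinations, for example: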

replace("a","1").replace("b","1") vs replaceAll("[ab]","1")
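
A quick check that the two forms really produce the same string can be sketched as follows. The class and method names are hypothetical. The last method illustrates a caveat that is not discussed in the post: chained replace() calls run in sequence, so a later call also sees the output of an earlier one, and the equivalence only holds when the replacement text is not itself a later target.

```java
// Hypothetical demo class; the names are illustrative only.
final class ChainedVsCharClass {
    // Two literal passes over the string.
    static String chained(String s) {
        return s.replace("a", "1").replace("b", "1");
    }

    // One regex pass with a character class.
    static String single(String s) {
        return s.replaceAll("[ab]", "1");
    }

    // Caveat: the second call rewrites the "b" produced by the first call.
    static String chainedOverlap(String s) {
        return s.replace("a", "b").replace("b", "c");
    }

    public static void main(String[] args) {
        System.out.println(chained("abcab"));     // 11c11
        System.out.println(single("abcab"));      // 11c11
        System.out.println(chainedOverlap("ab")); // cc (a one-pass mapping a->b, b->c would give bc)
    }
}
```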

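Set code tidiness aside and look only at performance. Given the roughly 50-fold gap above, is 50 perhaps a break-even point? That is, would it take 50 chained replace() calls to cost as much as the single replaceAll() call that does the same job? Does such a break-even point exist at all? More experiments follow.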

Start with this code as a probe:

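    // The article states a scale of 100,000 for this run; the exact loop bounds
    // did not survive in the source, so the literal here is reconstructed from that.
    final int iterations = 100_000;

    // Ten chained literal replacements, a through j.
    long now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "abcdefghijklmnopqrstuvwxyz"
                .replace("a", "1")
                .replace("b", "1")
                .replace("c", "1")
                .replace("d", "1")
                .replace("e", "1")
                .replace("f", "1")
                .replace("g", "1")
                .replace("h", "1")
                .replace("i", "1")
                .replace("j", "1");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    // One regex replacement covering the same ten letters.
    now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "abcdefghijklmnopqrstuvwxyz".replaceAll("[a-j]", "1");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));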

With 100,000 iterations and a candidate break-even point of 10, the results are:

10-20 17:04:55.656 24206 24326 I TEST    : replace() : 3274
10-20 17:05:13.844 24206 24326 I TEST    : replaceAll() : 18188

replaceAll() still takes about 5.6 times as long; measured against a single replace() call, that is about 56 times.

Next, raise the candidate break-even point to 26 (all 26 letters of the English alphabet):

    // The loop bounds did not survive in the source. 100,000 is an assumption,
    // carried over from the previous run; the article does not restate the scale.
    final int iterations = 100_000;

    // Twenty-six chained literal replacements, a through z.
    long now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "abcdefghijklmnopqrstuvwxyz"
                .replace("a", "1")
                .replace("b", "1")
                .replace("c", "1")
                .replace("d", "1")
                .replace("e", "1")
                .replace("f", "1")
                .replace("g", "1")
                .replace("h", "1")
                .replace("i", "1")
                .replace("j", "1")
                .replace("k", "1")
                .replace("l", "1")
                .replace("m", "1")
                .replace("n", "1")
                .replace("o", "1")
                .replace("p", "1")
                .replace("q", "1")
                .replace("r", "1")
                .replace("s", "1")
                .replace("t", "1")
                .replace("u", "1")
                .replace("v", "1")
                .replace("w", "1")
                .replace("x", "1")
                .replace("y", "1")
                .replace("z", "1");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    // One regex replacement covering all twenty-six letters.
    now = System.currentTimeMillis();
    for (int i = 0; i < iterations; i++) {
        "abcdefghijklmnopqrstuvwxyz".replaceAll("[a-z]", "1");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));

Results:

10-20 17:02:50.440 22178 22248 I TEST    : replace() : 8232
10-20 17:03:21.954 22178 22248 I TEST    : replaceAll() : 31514

replaceAll() still takes nearly 4 times as long; measured against a single replace() call, that is about 100 times.
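
The ratios quoted for the three runs can be rechecked directly from the logged millisecond values. The small calculation below uses only the numbers from the logs above; the class and method names are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper for re-deriving the ratios from the logged timings.
final class SpeedupRatios {
    // How many times longer replaceAll() took than the replace() variant.
    static double ratio(long replaceAllMs, long replaceMs) {
        return (double) replaceAllMs / replaceMs;
    }

    // The same ratio scaled to a single replace() call, given how many calls were chained.
    static double perCall(long replaceAllMs, long replaceMs, int chainedCalls) {
        return ratio(replaceAllMs, replaceMs) * chainedCalls;
    }

    static String fmt(double value) {
        return String.format(Locale.ROOT, "%.1f", value);
    }

    public static void main(String[] args) {
        System.out.println(fmt(ratio(99427, 2170)));       // 45.8
        System.out.println(fmt(ratio(18188, 3274)));       // 5.6
        System.out.println(fmt(perCall(18188, 3274, 10))); // 55.6
        System.out.println(fmt(ratio(31514, 8232)));       // 3.8
        System.out.println(fmt(perCall(31514, 8232, 26))); // 99.5
    }
}
```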

As the regular expression itself grows, the time replaceAll() takes grows as well (18188 ms for [a-j] vs. 31514 ms for [a-z]).

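Even with all 26 letters covered, the break-even point has not been reached. So the conclusion is fairly safe: for simple replacements, judged on performance alone, replace() is clearly the better choice.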
