String.replace() vs. String.replaceAll(): a performance comparison

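Some commonly used Java APIs are worth a closer look, for example String.replace() and String.replaceAll(). This post examines the two using the Android 7.1 source code. They are defined as follows: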

/**
     * Replaces each substring of this string that matches the literal target
     * sequence with the specified literal replacement sequence. The
     * replacement proceeds from the beginning of the string to the end, for
     * example, replacing "aa" with "b" in the string "aaa" will result in
     * "ba" rather than "ab".
     *
     * @param  target The sequence of char values to be replaced
     * @param  replacement The replacement sequence of char values
     * @return  The resulting string
     * @throws NullPointerException if <code>target</code> or
     *         <code>replacement</code> is <code>null</code>.
     * @since 1.5
     */
    public String replace(CharSequence target, CharSequence replacement)

/**
     * Replaces each substring of this string that matches the given <a
     * href="../util/regex/Pattern.html#sum">regular expression</a> with the
     * given replacement.
     *
     * <p> An invocation of this method of the form
     * <i>str</i><tt>.replaceAll(</tt><i>regex</i><tt>,</tt> <i>repl</i><tt>)</tt>
     * yields exactly the same result as the expression
     *
     * <blockquote><tt>
     * {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#compile
     * compile}(</tt><i>regex</i><tt>).{@link
     * java.util.regex.Pattern#matcher(java.lang.CharSequence)
     * matcher}(</tt><i>str</i><tt>).{@link java.util.regex.Matcher#replaceAll
     * replaceAll}(</tt><i>repl</i><tt>)</tt></blockquote>
     *
     *<p>
     * Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in the
     * replacement string may cause the results to be different than if it were
     * being treated as a literal replacement string; see
     * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
     * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
     * meaning of these characters, if desired.
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     * @param   replacement
     *          the string to be substituted for each match
     *
     * @return  The resulting <tt>String</tt>
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     */
    public String replaceAll(String regex, String replacement)

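As the Javadoc shows:

(1) replace(): returns a copy of the input string in which every occurrence of the substring target is replaced with replacement. Both arguments are treated literally.

(2) replaceAll(): returns a copy of the input string in which every substring matching the regular expression regex is replaced with replacement.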
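The difference between literal and regex matching is easiest to see on input that contains regex metacharacters. The following is a minimal sketch in plain Java (not Android-specific); the class and method names are made up for illustration, while Pattern.quote and Matcher.quoteReplacement are the standard java.util.regex helpers, the latter being the one the Javadoc above recommends.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceSemantics {
    // Literal replacement: "." is just a dot.
    static String literalDots(String s) {
        return s.replace(".", "-");
    }

    // Regex replacement: an unescaped "." matches any character.
    static String regexDots(String s) {
        return s.replaceAll(".", "-");
    }

    // Regex replacement with the pattern quoted, so "." is literal again.
    static String quotedDots(String s) {
        return s.replaceAll(Pattern.quote("."), "-");
    }

    // "$" is special in a regex replacement string; quoteReplacement
    // makes "$1" a literal two-character replacement.
    static String literalDollar(String s) {
        return s.replaceAll("b", Matcher.quoteReplacement("$1"));
    }

    public static void main(String[] args) {
        System.out.println(literalDots("a.b.c"));      // a-b-c
        System.out.println(regexDots("a.b.c"));        // -----
        System.out.println(quotedDots("a.b.c"));       // a-b-c
        System.out.println("aaa".replace("aa", "b")); // ba (left to right, as the Javadoc says)
        System.out.println(literalDollar("abc"));      // a$1c
    }
}
```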

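Clearly the two APIs serve different use cases, and replaceAll() is the more powerful of the two. Because replaceAll() has to process a regular expression, it should also be slower than replace(). But how large is the gap when both are used for the same task? A simple experiment: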

    // Runs inside an Android method; Log is android.util.Log.
    long now = System.currentTimeMillis();
    // 1,000,000 iterations, the scale stated with the results
    for (int i = 0; i < 1000000; i++) {
        "aabbbc".replace("b", "a");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    now = System.currentTimeMillis();
    for (int i = 0; i < 1000000; i++) {
        "aabbbc".replaceAll("b", "a");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));

Result:

10-20 16:26:08.401 19518 19670 I TEST    : replace() : 2170
10-20 16:27:47.828 19518 19670 I TEST    : replaceAll() : 99427

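At a scale of one million iterations, replaceAll() takes about 46 times as long as replace() (99427 ms versus 2170 ms). (System thread scheduling is ignored here; timing relies only on the system timestamps at the start and end. Also, the example exists purely for comparison: in real code, when the string being replaced has length 1, the more suitable API is of course replace(char, char).)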
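For readers without an Android device, the same comparison can be approximated on a desktop JVM. This is only a sketch under stated assumptions: the class name, the timeMillis helper and the sink field are invented here, System.nanoTime() stands in for the timestamps used above, and the numbers it prints will differ from the Android 7.1 log lines. It also times replace(char, char), the variant mentioned in the parenthetical.

```java
import java.util.function.Supplier;

public class ReplaceTiming {
    // Accumulates result lengths so the JIT cannot discard the calls as dead code.
    static long sink;

    // Runs the task `iterations` times and returns the elapsed time in milliseconds.
    static long timeMillis(int iterations, Supplier<String> task) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += task.get().length();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // same order of magnitude as the experiment above
        long literal = timeMillis(n, () -> "aabbbc".replace("b", "a"));
        long regex = timeMillis(n, () -> "aabbbc".replaceAll("b", "a"));
        long chars = timeMillis(n, () -> "aabbbc".replace('b', 'a'));
        System.out.println("replace(CharSequence, CharSequence) : " + literal + " ms");
        System.out.println("replaceAll(String, String)          : " + regex + " ms");
        System.out.println("replace(char, char)                 : " + chars + " ms");
    }
}
```

A single cold run like this includes JIT warm-up, so it is suitable only for a rough comparison of orders of magnitude.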

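Of course, fixed strings are normally replaced with replace(), and complex regular expressions leave no choice but replaceAll(). The two overlap mainly in simple combinations such as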

replace("a", "1").replace("b", "1") vs replaceAll("[ab]", "1")

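Set code tidiness aside and look only at performance. Given the roughly 50-fold gap above, intuition suggests that 50 might be a break-even point: a case where replace() has to be called 50 times while replaceAll() needs only one call. Does such a break-even point exist? More experiments follow.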
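Before comparing speed, it helps to confirm that the two forms really produce the same output. A minimal sketch, with hypothetical class and method names:

```java
public class ChainVsCharClass {
    // Two literal passes over the string.
    static String chained(String s) {
        return s.replace("a", "1").replace("b", "1");
    }

    // One regex pass with a character class.
    static String charClass(String s) {
        return s.replaceAll("[ab]", "1");
    }

    public static void main(String[] args) {
        System.out.println(chained("abcab"));   // 11c11
        System.out.println(charClass("abcab")); // 11c11
    }
}
```

The equivalence holds here because the replacement "1" is never itself a target. When a replacement contains a later target, the chain changes it again: "ab".replace("a", "b").replace("b", "c") yields "cc".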

Start with this code as a probe:

    long now = System.currentTimeMillis();
    // 100,000 iterations, the scale stated with the results
    for (int i = 0; i < 100000; i++) {
        "abcdefghijklmnopqrstuvwxyz"
                .replace("a", "1")
                .replace("b", "1")
                .replace("c", "1")
                .replace("d", "1")
                .replace("e", "1")
                .replace("f", "1")
                .replace("g", "1")
                .replace("h", "1")
                .replace("i", "1")
                .replace("j", "1");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    now = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++) {
        "abcdefghijklmnopqrstuvwxyz".replaceAll("[a-j]", "1");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));

With 100,000 iterations and a candidate break-even point of 10 calls, the result is:

10-20 17:04:55.656 24206 24326 I TEST    : replace() : 3274
10-20 17:05:13.844 24206 24326 I TEST    : replaceAll() : 18188

replaceAll() still takes about 5.6 times as long; measured against a single replace() call, that is roughly 56 times.

Next, raise the candidate break-even point to 26 (all 26 letters of the English alphabet):

    long now = System.currentTimeMillis();
    // Iteration count assumed unchanged from the 10-call run (100,000);
    // the original figure is missing from the source
    for (int i = 0; i < 100000; i++) {
        "abcdefghijklmnopqrstuvwxyz"
                .replace("a", "1")
                .replace("b", "1")
                .replace("c", "1")
                .replace("d", "1")
                .replace("e", "1")
                .replace("f", "1")
                .replace("g", "1")
                .replace("h", "1")
                .replace("i", "1")
                .replace("j", "1")
                .replace("k", "1")
                .replace("l", "1")
                .replace("m", "1")
                .replace("n", "1")
                .replace("o", "1")
                .replace("p", "1")
                .replace("q", "1")
                .replace("r", "1")
                .replace("s", "1")
                .replace("t", "1")
                .replace("u", "1")
                .replace("v", "1")
                .replace("w", "1")
                .replace("x", "1")
                .replace("y", "1")
                .replace("z", "1");
    }
    Log.i("TEST", "replace() : " + (System.currentTimeMillis() - now));

    now = System.currentTimeMillis();
    for (int i = 0; i < 100000; i++) {
        "abcdefghijklmnopqrstuvwxyz".replaceAll("[a-z]", "1");
    }
    Log.i("TEST", "replaceAll() : " + (System.currentTimeMillis() - now));

Result:

10-20 17:02:50.440 22178 22248 I TEST    : replace() : 8232
10-20 17:03:21.954 22178 22248 I TEST    : replaceAll() : 31514

replaceAll() still takes nearly 4 times as long; measured against a single replace() call, that is about 100 times.

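As the regular expression widens (from [a-j] to [a-z], so that more characters match), replaceAll() also takes longer: 18188 ms versus 31514 ms.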
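The ratios quoted for the three runs follow directly from the logged timings. The sketch below only redoes that arithmetic; the class and method names are hypothetical, and the millisecond values are copied from the log lines above.

```java
import java.util.Locale;

public class TimingRatios {
    // How many times longer one replaceAll() loop took than the replace() loop.
    static double ratio(long replaceAllMs, long replaceMs) {
        return (double) replaceAllMs / replaceMs;
    }

    // The same ratio measured against a single replace() call in the chain.
    static double perCallRatio(long replaceAllMs, long replaceMs, int chainedCalls) {
        return ratio(replaceAllMs, replaceMs) * chainedCalls;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
                "1 call:   %.1fx", ratio(99427, 2170)));
        System.out.println(String.format(Locale.ROOT,
                "10 calls: %.1fx overall, %.1fx per call",
                ratio(18188, 3274), perCallRatio(18188, 3274, 10)));
        System.out.println(String.format(Locale.ROOT,
                "26 calls: %.1fx overall, %.1fx per call",
                ratio(31514, 8232), perCallRatio(31514, 8232, 26)));
    }
}
```

The overall ratio shrinks as the chain grows (about 46, then 5.6, then 3.8), but it never drops below 1, which is what a break-even point would require.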

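All 26 letters are now covered and the break-even point has still not been reached, so the conclusion is fairly safe: for simple replacements, judged on performance alone, replace() is clearly the better choice.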
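For cases where a regular expression is unavoidable, the Javadoc quoted at the top states that str.replaceAll(regex, repl) yields the same result as Pattern.compile(regex).matcher(str).replaceAll(repl). One possible mitigation, which the experiments above do not measure, is to compile the pattern once and reuse it. The sketch below shows the idea; the class, method and constant names are hypothetical, and any speed-up is an assumption that should be verified on the target platform.

```java
import java.util.regex.Pattern;

public class PrecompiledReplace {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]");

    // Form used in the experiments: the regex string is handed to String.replaceAll.
    static String viaString(String s) {
        return s.replaceAll("[a-z]", "1");
    }

    // Same result, reusing the precompiled Pattern.
    static String viaPattern(String s) {
        return LOWERCASE.matcher(s).replaceAll("1");
    }

    public static void main(String[] args) {
        String input = "abcdefghijklmnopqrstuvwxyz";
        System.out.println(viaString(input).equals(viaPattern(input))); // true
    }
}
```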
