The Difference Between replace and replaceAll in Java

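replace and replaceAll are two methods of the String class for replacing characters or substrings. Going by the names alone, it is easy to assume that replace substitutes a single match while replaceAll substitutes every match. That is not the case: both replace every match, and the real difference is whether the target is interpreted as a regular expression.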

1. Overview

2. Related classes: String, Pattern, Matcher

3. Related methods

3.1 Matcher

3.2 Pattern

3.3 String

4. Conclusion

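1. Overview

The String class provides four methods for replacing characters or substrings: the two overloads of replace, plus replaceAll and replaceFirst.

  • replace(char, char): replaces every occurrence; the parameters are of type char; does not involve Pattern or Matcher.
  • replace(CharSequence, CharSequence): replaces every occurrence; the parameters are any implementation of the CharSequence interface (such as String); does not support regular expressions. In the JDK source quoted below, it calls Pattern.compile in literal (non-regex) mode and then Matcher.replaceAll.
  • replaceAll(String, String): replaces every match; the parameters are of type String; supports regular expressions; calls Pattern.compile in regex mode and then Matcher.replaceAll.
  • replaceFirst(String, String): replaces only the first match; the parameters are of type String; supports regular expressions; calls Pattern.compile in regex mode and then Matcher.replaceFirst.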
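The four methods listed above can be compared side by side on one input. The following is a minimal sketch; the class name ReplaceOverviewDemo and the sample input are made up for illustration. The input contains dots precisely because "." is a literal character for replace but a metacharacter for replaceAll and replaceFirst.

```java
final class ReplaceOverviewDemo {
    static final String INPUT = "a.b.c";

    // char overload: every '.' character becomes '-'
    static String replaceChar() {
        return INPUT.replace('.', '-');
    }

    // CharSequence overload: "." is matched literally, every occurrence is replaced
    static String replaceLiteral() {
        return INPUT.replace(".", "-");
    }

    // replaceAll: "." is a regular expression that matches any character
    static String replaceAllRegex() {
        return INPUT.replaceAll(".", "-");
    }

    // replaceFirst: same regular expression, but only the first match is replaced
    static String replaceFirstRegex() {
        return INPUT.replaceFirst(".", "-");
    }

    public static void main(String[] args) {
        System.out.println(replaceChar());       // a-b-c
        System.out.println(replaceLiteral());    // a-b-c
        System.out.println(replaceAllRegex());   // -----
        System.out.println(replaceFirstRegex()); // -.b.c
    }
}
```

Both replace overloads change every dot, while replaceAll changes every character because the pattern "." matches anything.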

2. Related classes: String, Pattern, Matcher

  • The String class:
public final class String implements java.io.Serializable, Comparable<String>, CharSequence

The class that represents strings and provides their methods: The String class represents character strings. All string literals in Java programs, such as "abc", are implemented as instances of this class.

  • Pattern and Matcher

For details, see the following two blog posts:

The concept of capturing groups in regular expressions: https://blog.csdn.net/kofandlizi/article/details/7323863

An overview of Pattern and Matcher: https://blog.csdn.net/yin380697242/article/details/52049999

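In short, the Pattern class compiles a regular expression into a matching pattern, and the Matcher class uses the pattern held by a Pattern instance to match it against an input character sequence.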
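The division of labour between the two classes can be sketched as follows. The class name PatternMatcherDemo and the helper findAll are hypothetical names chosen for this example; Pattern.compile, Pattern.matcher, Matcher.find, and Matcher.group are the standard java.util.regex calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PatternMatcherDemo {
    // Compiles the regex once, then walks the input and collects every match.
    static List<String> findAll(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);   // compile step
        Matcher matcher = pattern.matcher(input);   // bind the pattern to one input
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {                    // advance to the next match
            matches.add(matcher.group());           // text of the current match
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1bb22ccc333")); // [1, 22, 333]
    }
}
```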

  • Call graph of the relevant methods in String, Pattern, and Matcher
[Figure: method call graph among String, Pattern, and Matcher]

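3. Related methods

3.1 Matcher

For details, see this blog post: https://www.cnblogs.com/SQP51312/p/6134324.html

  • Matcher(Pattern parent, CharSequence text);

The Matcher constructor. It has package-private access, so code outside the package cannot create Matcher instances directly.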

/**
 * All matchers have the state used by Pattern during a match.
 */
Matcher(Pattern parent, CharSequence text) {
    this.parentPattern = parent;
    this.text = text;

    // Allocate state storage
    int parentGroupCount = Math.max(parent.capturingGroupCount, 10);
    // groups is the storage used by groups: it holds the first and last
    // positions of each capturing group in the current match.
    groups = new int[parentGroupCount * 2];
    locals = new int[parent.localCount];

    // Put fields into initial states
    reset();
}
           
  • public Matcher appendReplacement(StringBuffer sb, String replacement);

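Appends to a StringBuffer the input text between the end of the previous match and the start of the current match, followed by the replacement string (with any group references expanded) in place of the current match. It returns this Matcher, not a string.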

/**
 * Implements a non-terminal append-and-replace step.
 *
 * <p> This method performs the following actions: </p>
 *
 * <ol>
 *
 * <li><p> It reads characters from the input sequence, starting at the
 * append position, and appends them to the given string buffer. It
 * stops after reading the last character preceding the previous match,
 * that is, the character at index {@link
 * #start()}&nbsp;<tt>-</tt>&nbsp;<tt>1</tt>. </p></li>
 *
 * <li><p> It appends the given replacement string to the string buffer.
 * </p></li>
 *
 * <li><p> It sets the append position of this matcher to the index of
 * the last character matched, plus one, that is, to {@link #end()}.
 * </p></li>
 *
 * </ol>
 *
 * <p> The replacement string may contain references to subsequences
 * captured during the previous match: Each occurrence of
 * <tt>${</tt><i>name</i><tt>}</tt> or <tt>$</tt><i>g</i>
 * will be replaced by the result of evaluating the corresponding
 * {@link #group(String) group(name)} or {@link #group(int) group(g)</tt>}
 * respectively. For <tt>$</tt><i>g</i><tt></tt>,
 * the first number after the <tt>$</tt> is always treated as part of
 * the group reference. Subsequent numbers are incorporated into g if
 * they would form a legal group reference. Only the numerals '0'
 * through '9' are considered as potential components of the group
 * reference. If the second group matched the string <tt>"foo"</tt>, for
 * example, then passing the replacement string <tt>"$2bar"</tt> would
 * cause <tt>"foobar"</tt> to be appended to the string buffer. A dollar
 * sign (<tt>$</tt>) may be included as a literal in the replacement
 * string by preceding it with a backslash (<tt>\$</tt>).
 *
 * <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * <p> This method is intended to be used in a loop together with the
 * {@link #appendTail appendTail} and {@link #find find} methods. The
 * following code, for example, writes <tt>one dog two dogs in the
 * yard</tt> to the standard-output stream: </p>
 *
 * <blockquote><pre>
 * Pattern p = Pattern.compile("cat");
 * Matcher m = p.matcher("one cat two cats in the yard");
 * StringBuffer sb = new StringBuffer();
 * while (m.find()) {
 *     m.appendReplacement(sb, "dog");
 * }
 * m.appendTail(sb);
 * System.out.println(sb.toString());</pre></blockquote>
 *
 * @param sb
 *        The target string buffer
 *
 * @param replacement
 *        The replacement string
 *
 * @return This matcher
 *
 * @throws IllegalStateException
 *         If no match has yet been attempted,
 *         or if the previous match operation failed
 *
 * @throws IllegalArgumentException
 *         If the replacement string refers to a named-capturing
 *         group that does not exist in the pattern
 *
 * @throws IndexOutOfBoundsException
 *         If the replacement string refers to a capturing group
 *         that does not exist in the pattern
 */

public Matcher appendReplacement(StringBuffer sb, String replacement) {

    // If no match, return error
    if (first < 0)
        throw new IllegalStateException("No match available");

    // Process substitution string to replace group references with groups
    int cursor = 0;
    StringBuilder result = new StringBuilder();

    while (cursor < replacement.length()) {  // 1start
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {  // 2start
            cursor++;
            nextChar = replacement.charAt(cursor);
            result.append(nextChar);
            cursor++;
        } else if (nextChar == '$') {  // 2end,3start
            // Skip past $
            cursor++;
            // A StringIndexOutOfBoundsException is thrown if
            // this "$" is the last character in replacement
            // string in current implementation, a IAE might be
            // more appropriate.
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {  // 4start
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {  // 5start
                    nextChar = replacement.charAt(cursor);
                    if (ASCII.isLower(nextChar) || ASCII.isUpper(nextChar) || ASCII.isDigit(nextChar)) {  // 6start
                        gsb.append(nextChar);
                        cursor++;
                    } else {  // 6end,7start
                        break;
                    }  // 7end
                }  // 5end
                if (gsb.length() == 0)
                    throw new IllegalArgumentException("named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException("named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (ASCII.isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException("capturing group name {" + gname + "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname))
                    throw new IllegalArgumentException("No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname);
                cursor++;
            } else {  // 4end,8start
                // The first number is always a group
                refNum = (int)nextChar - '0';
                if ((refNum < 0)||(refNum > 9))
                    throw new IllegalArgumentException("Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {  // 9start
                    if (cursor >= replacement.length()) {  // 10start
                        break;
                    }  // 10end
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if ((nextDigit < 0)||(nextDigit > 9)) {  // 11start
                        // not a number
                        break;
                    }  // 11end
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount() < newRefNum) {  // 12start
                        done = true;
                    } else {  // 12end,13start
                        refNum = newRefNum;
                        cursor++;
                    }  // 13end
                }  // 9end
            }  // 8end
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum));
        } else {  // 3end,14start
            result.append(nextChar);
            cursor++;
        }  // 14end
    }  // 1end
    // Append the intervening text
    sb.append(text, lastAppendPosition, first);
    // Append the match substitution
    sb.append(result);

    lastAppendPosition = last;
    return this;
}
           
  • public StringBuffer appendTail(StringBuffer sb);

Appends the part of the input that remains after the last match to a StringBuffer.

/**
 * Implements a terminal append-and-replace step.
 *
 * <p> This method reads characters from the input sequence, starting at
 * the append position, and appends them to the given string buffer. It is
 * intended to be invoked after one or more invocations of the {@link
 * #appendReplacement appendReplacement} method in order to copy the
 * remainder of the input sequence. </p>
 *
 * @param sb
 *        The target string buffer
 *
 * @return The target string buffer
 */
public StringBuffer appendTail(StringBuffer sb) {
    sb.append(text, lastAppendPosition, getTextLength());
    return sb;
}
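The find / appendReplacement / appendTail loop described in the Javadoc above can be tried out directly. This is a small sketch under assumed names (AppendReplacementDemo, swapPairs, and shout are invented for the example). The first helper uses group references ($1, $2) in the replacement. The second computes the replacement per match and wraps it in Matcher.quoteReplacement, so that any $ or \ in the computed text would be taken literally.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AppendReplacementDemo {
    // Rewrites each "key=value" pair as "value=key" via group references.
    static String swapPairs(String input) {
        Matcher m = Pattern.compile("(\\w+)=(\\w+)").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the expanded replacement.
            m.appendReplacement(sb, "$2=$1");
        }
        m.appendTail(sb); // copies whatever follows the last match
        return sb.toString();
    }

    // Upper-cases every word that starts with 'c'; the replacement depends on the match.
    static String shout(String input) {
        Matcher m = Pattern.compile("\\bc\\w*").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String upper = m.group().toUpperCase(Locale.ROOT);
            m.appendReplacement(sb, Matcher.quoteReplacement(upper));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapPairs("a=1, b=2;"));    // 1=a, 2=b;
        System.out.println(shout("one cat two cats")); // one CAT two CATS
    }
}
```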
           
  • public String replaceAll(String replacement);

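Replaces every matching subsequence with the given string. The method first resets the matcher and then looks for a match. If one is found, it creates a StringBuffer, calls appendReplacement in a loop for each match, then calls appendTail and returns the buffer's contents as a String. If nothing matches, it returns the input text unchanged.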

/**
 * Replaces every subsequence of the input sequence that matches the
 * pattern with the given replacement string.
 *
 * <p> This method first resets this matcher. It then scans the input
 * sequence looking for matches of the pattern. Characters that are not
 * part of any match are appended directly to the result string; each match
 * is replaced in the result by the replacement string. The replacement
 * string may contain references to captured subsequences as in the {@link
 * #appendReplacement appendReplacement} method.
 *
 * <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * <p> Given the regular expression <tt>a*b</tt>, the input
 * <tt>"aabfooaabfooabfoob"</tt>, and the replacement string
 * <tt>"-"</tt>, an invocation of this method on a matcher for that
 * expression would yield the string <tt>"-foo-foo-foo-"</tt>.
 *
 * <p> Invoking this method changes this matcher's state. If the matcher
 * is to be used in further matching operations then it should first be
 * reset. </p>
 *
 * @param replacement
 *        The replacement string
 *
 * @return The string constructed by replacing each matching subsequence
 *         by the replacement string, substituting captured subsequences
 *         as needed
 */
public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
           
  • public String replaceFirst(String replacement);

Replaces the first matching subsequence with the given string.

/**
 * Replaces the first subsequence of the input sequence that matches the
 * pattern with the given replacement string.
 *
 * <p> This method first resets this matcher. It then scans the input
 * sequence looking for a match of the pattern. Characters that are not
 * part of the match are appended directly to the result string; the match
 * is replaced in the result by the replacement string. The replacement
 * string may contain references to captured subsequences as in the {@link
 * #appendReplacement appendReplacement} method.
 *
 * <p>Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
 * the replacement string may cause the results to be different than if it
 * were being treated as a literal replacement string. Dollar signs may be
 * treated as references to captured subsequences as described above, and
 * backslashes are used to escape literal characters in the replacement
 * string.
 *
 * <p> Given the regular expression <tt>dog</tt>, the input
 * <tt>"zzzdogzzzdogzzz"</tt>, and the replacement string
 * <tt>"cat"</tt>, an invocation of this method on a matcher for that
 * expression would yield the string <tt>"zzzcatzzzdogzzz"</tt>. </p>
 *
 * <p> Invoking this method changes this matcher's state. If the matcher
 * is to be used in further matching operations then it should first be
 * reset. </p>
 *
 * @param replacement
 *        The replacement string
 * @return The string constructed by replacing the first matching
 *         subsequence by the replacement string, substituting captured
 *         subsequences as needed
 */
public String replaceFirst(String replacement) {
    if (replacement == null)
        throw new NullPointerException("replacement");
    reset();
    if (!find())
        return text.toString();
    StringBuffer sb = new StringBuffer();
    appendReplacement(sb, replacement);
    appendTail(sb);
    return sb.toString();
}
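To see Matcher.replaceAll and Matcher.replaceFirst next to each other, the sketch below reuses the a*b example from the Javadoc quoted above. The class MatcherReplaceDemo and its method names are illustrative only. Because both methods begin by resetting the matcher, calling them one after the other on the same Matcher instance should be safe here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherReplaceDemo {
    // Compiled once and shared; Pattern instances are immutable.
    static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Returns { result of replaceAll, result of replaceFirst } for one input.
    static String[] allThenFirst(String input, String replacement) {
        Matcher m = A_STAR_B.matcher(input);
        String all = m.replaceAll(replacement);     // resets, then replaces every match
        String first = m.replaceFirst(replacement); // resets again, replaces one match
        return new String[] { all, first };
    }

    public static void main(String[] args) {
        String[] results = allThenFirst("aabfooaabfooabfoob", "-");
        System.out.println(results[0]); // -foo-foo-foo-
        System.out.println(results[1]); // -fooaabfooabfoob
    }
}
```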
           

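3.2 Pattern

For details, see this blog post: http://www.cnblogs.com/SQP51312/p/6136304.html

  • private Pattern(String p, int f);

The Pattern constructor. Because it is private, outside code cannot instantiate Pattern directly and instead obtains an instance through Pattern.compile(regex).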

/**
 * This private constructor is used to create all Patterns. The pattern
 * string and match flags are all that is needed to completely describe
 * a Pattern. An empty pattern string results in an object tree with
 * only a Start node and a LastNode node.
 */
private Pattern(String p, int f) {
    pattern = p;
    flags = f;

    // to use UNICODE_CASE if UNICODE_CHARACTER_CLASS present
    if ((flags & UNICODE_CHARACTER_CLASS) != 0)
        flags |= UNICODE_CASE;

    // Reset group index count
    capturingGroupCount = 1;
    localCount = 0;

    if (pattern.length() > 0) {
        compile();
    } else {
        root = new Start(lastAccept);
        matchRoot = lastAccept;
    }
}
           
  • public Matcher matcher(CharSequence input);

Lets outside code obtain a Matcher instance for this pattern.

/**
 * Creates a matcher that will match the given input against this pattern.
 * </p>
 *
 * @param input
 *        The character sequence to be matched
 *
 * @return A new matcher for this pattern
 */
public Matcher matcher(CharSequence input) {
    if (!compiled) {
        synchronized(this) {
            if (!compiled)
                compile();
        }
    }
    Matcher m = new Matcher(this, input);
    return m;
}
           
  • public static Pattern compile(String regex, int flags);

Calls the Pattern constructor to create a Pattern instance.

public static Pattern compile(String regex, int flags) {
    return new Pattern(regex, flags);
}
           
  • public static Pattern compile(String regex);
public static Pattern compile(String regex) {
    return new Pattern(regex, 0);
}
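The flags argument is what separates regex mode from literal mode, which matters for the String methods in the next section. As a rough illustration (CompileFlagsDemo and its methods are names made up for this sketch), the same pattern text "a.c" behaves differently with and without Pattern.LITERAL:

```java
import java.util.regex.Pattern;

final class CompileFlagsDemo {
    // Default flags (0): "." is a metacharacter, so "a.c" also matches "abc".
    static String regexMode(String input) {
        return Pattern.compile("a.c").matcher(input).replaceAll("X");
    }

    // Pattern.LITERAL: the pattern text is matched character for character.
    static String literalMode(String input) {
        return Pattern.compile("a.c", Pattern.LITERAL).matcher(input).replaceAll("X");
    }

    public static void main(String[] args) {
        System.out.println(regexMode("abc a.c"));   // X X
        System.out.println(literalMode("abc a.c")); // abc X
    }
}
```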
           

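3.3 String

  • public String replace(char oldChar, char newChar);

String overloads the replace method: the arguments can be single characters, or instances of any class that implements the CharSequence interface (String is one of them). For character replacement, replace first scans for the first occurrence of oldChar. If there is one, it allocates a new buf array, copies the characters before that position, and then walks the rest of the source array, writing newChar into buf wherever oldChar appears.

Note: do not take the name at face value. As the source code shows, replace replaces every occurrence of the target character.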

/**
 * Returns a new string resulting from replacing all occurrences of
 * <code>oldChar</code> in this string with <code>newChar</code>.
 * <p>
 * If the character <code>oldChar</code> does not occur in the
 * character sequence represented by this <code>String</code> object,
 * then a reference to this <code>String</code> object is returned.
 * Otherwise, a new <code>String</code> object is created that
 * represents a character sequence identical to the character sequence
 * represented by this <code>String</code> object, except that every
 * occurrence of <code>oldChar</code> is replaced by an occurrence
 * of <code>newChar</code>.
 * <p>
 * Examples:
 * <blockquote><pre>
 * "mesquite in your cellar".replace('e', 'o')
 *         returns "mosquito in your collar"
 * "the war of baronets".replace('r', 'y')
 *         returns "the way of bayonets"
 * "sparring with a purple porpoise".replace('p', 't')
 *         returns "starring with a turtle tortoise"
 * "JonL".replace('q', 'x') returns "JonL" (no change)
 * </pre></blockquote>
 *
 * @param oldChar the old character.
 * @param newChar the new character.
 * @return a string derived from this string by replacing every
 *         occurrence of <code>oldChar</code> with <code>newChar</code>.
 */
public String replace(char oldChar, char newChar) {
    if (oldChar != newChar) {
        int len = value.length;
        int i = -1;
        char[] val = value; /* avoid getfield opcode */

        while (++i < len) {
            if (val[i] == oldChar) {
                break;
            }
        }
        if (i < len) {
            char buf[] = new char[len];
            for (int j = 0; j < i; j++) {
                buf[j] = val[j];
            }
            while (i < len) {
                char c = val[i];
                buf[i] = (c == oldChar) ? newChar : c;
                i++;
            }
            return new String(buf, true);
        }
    }
    return this;
}
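Two behaviours from the source above are easy to check: every occurrence is replaced, and when oldChar does not occur the method returns the very same String object. The snippet below is a small sketch; ReplaceCharDemo and its helpers are hypothetical names, and the inputs are taken from the Javadoc examples.

```java
final class ReplaceCharDemo {
    // Thin wrapper so the call can be exercised from main and from tests.
    static String swap(String s, char oldChar, char newChar) {
        return s.replace(oldChar, newChar);
    }

    // True when replace handed back the original object instead of a copy.
    static boolean returnsSameInstance(String s, char oldChar, char newChar) {
        return s.replace(oldChar, newChar) == s;
    }

    public static void main(String[] args) {
        // All three 'e' characters are replaced, not just the first one.
        System.out.println(swap("mesquite in your cellar", 'e', 'o')); // mosquito in your collar
        // 'q' does not occur, so no new String is created.
        System.out.println(returnsSameInstance("JonL", 'q', 'x'));     // true
    }
}
```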
           
  • public String replace(CharSequence target, CharSequence replacement);

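This overload of replace replaces every occurrence of a substring. In the source below, it does so by calling Matcher.replaceAll.

Note: the source shows that although Pattern.compile() is called, the flag passed is Pattern.LITERAL, so the target is matched as a literal string and not as a regular expression. The replacement is likewise passed through Matcher.quoteReplacement, so $ and \ in it have no special meaning.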

/**
 * Replaces each substring of this string that matches the literal target
 * sequence with the specified literal replacement sequence. The
 * replacement proceeds from the beginning of the string to the end, for
 * example, replacing "aa" with "b" in the string "aaa" will result in
 * "ba" rather than "ab".
 *
 * @param target The sequence of char values to be replaced
 * @param replacement The replacement sequence of char values
 * @return The resulting string
 * @throws NullPointerException if <code>target</code> or
 *         <code>replacement</code> is <code>null</code>.
 * @since 1.5
 */
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
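The practical consequence of the literal mode is that regex metacharacters in the target, and $ in the replacement, are ordinary characters for replace. The following sketch (ReplaceLiteralDemo, literal, and regexRejects are invented names) contrasts that with replaceAll, which rejects "+" as an invalid regular expression.

```java
import java.util.regex.PatternSyntaxException;

final class ReplaceLiteralDemo {
    // replace treats both target and replacement as plain text.
    static String literal(String s, String target, String replacement) {
        return s.replace(target, replacement);
    }

    // Reports whether replaceAll refuses the given text as a regex.
    static boolean regexRejects(String regex) {
        try {
            "1+1=2".replaceAll(regex, "-");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(literal("1+1=2", "+", "-"));      // 1-1=2
        System.out.println(literal("aaa", "aa", "b"));       // ba
        System.out.println(literal("total", "total", "$1")); // $1
        System.out.println(regexRejects("+"));               // true
    }
}
```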
           
  • public String replaceAll(String regex, String replacement);

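The replaceAll method replaces every match within a String, with both arguments of type String.

Note: the source shows that this method matches using a regular expression.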

/**
 * Replaces each substring of this string that matches the given <a
 * href="../util/regex/Pattern.html#sum">regular expression</a> with the
 * given replacement.
 *
 * <p> An invocation of this method of the form
 * <i>str</i><tt>.replaceAll(</tt><i>regex</i><tt>,</tt> <i>repl</i><tt>)</tt>
 * yields exactly the same result as the expression
 *
 * <blockquote><tt>
 * {@link java.util.regex.Pattern}.{@link java.util.regex.Pattern#compile
 * compile}(</tt><i>regex</i><tt>).{@link
 * java.util.regex.Pattern#matcher(java.lang.CharSequence)
 * matcher}(</tt><i>str</i><tt>).{@link java.util.regex.Matcher#replaceAll
 * replaceAll}(</tt><i>repl</i><tt>)</tt></blockquote>
 *
 *<p>
 * Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in the
 * replacement string may cause the results to be different than if it were
 * being treated as a literal replacement string; see
 * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
 * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
 * meaning of these characters, if desired.
 *
 * @param regex
 *        the regular expression to which this string is to be matched
 * @param replacement
 *        the string to be substituted for each match
 *
 * @return The resulting <tt>String</tt>
 *
 * @throws PatternSyntaxException
 *         if the regular expression's syntax is invalid
 *
 * @see java.util.regex.Pattern
 *
 * @since 1.4
 * @spec JSR-51
 */
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
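When replaceAll has to work with text that should not be interpreted, the Javadoc above points to Matcher.quoteReplacement; Pattern.quote plays the same role for the regex side. Here is a minimal sketch of both, with ReplaceAllQuotingDemo and its method names chosen only for this example. It also shows what can go wrong without quoting: a replacement of "$5" refers to a group that does not exist.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceAllQuotingDemo {
    // Quotes both sides so replaceAll behaves like a literal replacement.
    static String literalReplaceAll(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    // An unquoted "$5" is read as a reference to capturing group 5.
    static boolean rawDollarFails() {
        try {
            "price: X".replaceAll("X", "$5");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true; // the pattern "X" has no group 5
        }
    }

    public static void main(String[] args) {
        System.out.println(literalReplaceAll("a.b.c", ".", "$"));     // a$b$c
        System.out.println(literalReplaceAll("price: X", "X", "$5")); // price: $5
        System.out.println(rawDollarFails());                         // true
    }
}
```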
           
  • public String replaceFirst(String regex, String replacement);

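replaceFirst is the method String actually provides for partial replacement: it replaces only the first matching substring, by calling Matcher.replaceFirst.

Note: the source shows that this method matches using a regular expression.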

public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}
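A short comparison makes the "first match only" behaviour concrete. The class ReplaceFirstDemo and its helpers are hypothetical, and the inputs are arbitrary. Since replaceFirst uses a regular expression, group references such as $1 work in its replacement too.

```java
final class ReplaceFirstDemo {
    // Replaces only the first hyphen.
    static String firstHyphen(String s) {
        return s.replaceFirst("-", "+");
    }

    // Replaces every hyphen, for comparison.
    static String everyHyphen(String s) {
        return s.replaceAll("-", "+");
    }

    // Swaps the first two hyphen-separated numbers using group references.
    static String swapFirstPair(String s) {
        return s.replaceFirst("(\\d+)-(\\d+)", "$2/$1");
    }

    public static void main(String[] args) {
        System.out.println(firstHyphen("a-b-c"));        // a+b-c
        System.out.println(everyHyphen("a-b-c"));        // a+b+c
        System.out.println(swapFirstPair("2024-01-15")); // 01/2024-15
    }
}
```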
           

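4. Conclusion

| String method | Parameter type | What is replaced | Regex? | Pattern method called | Matcher method called |
|---|---|---|---|---|---|
| replace(char, char) | char | every occurrence | no | none | none |
| replace(CharSequence, CharSequence) | CharSequence | every occurrence | no | Pattern.compile (literal mode) | replaceAll |
| replaceAll | String | every match | yes | Pattern.compile (regex mode) | replaceAll |
| replaceFirst | String | first match only | yes | Pattern.compile (regex mode) | replaceFirst |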