Converting Between UTF-8 Chinese Text and Unicode Escapes in Java, and Decoding Percent-Encoded Chinese from the Browser Address Bar as UTF-8

1. Converting percent-encoded Chinese from the browser address bar back into Chinese

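// Requires: import java.io.UnsupportedEncodingException;
/**
     * Decodes a percent-encoded string (as shown in the browser address bar)
     * and interprets the resulting bytes as UTF-8.
     *
     * @param fromStr the percent-encoded source string
     * @return the decoded string, or null if a charset is not supported
     */
    private String is(String fromStr) {
        StringBuffer stringBufferResult = new StringBuffer();
        for (int i = 0; i < fromStr.length(); i++) {
            char chr = fromStr.charAt(i);
            if (chr == '%') {
                StringBuffer stringTmp = new StringBuffer();
                stringTmp.append(fromStr.charAt(i + 1)).append(fromStr.charAt(i + 2));
                // Parse the two hex digits into one byte value (0-255), held in a char
                stringBufferResult.append((char) (Integer.valueOf(stringTmp.toString(), 16).intValue()));
                i = i + 2; // skip the two hex digits just consumed
                continue;
            }
            stringBufferResult.append(chr);
        }

        String newStr = null; // charset conversion
        try {
            // ISO-8859-1 maps chars 0-255 to identical byte values, so the raw bytes
            // are recovered exactly. Cp1252 cannot encode some chars in 0x80-0x9F
            // and would replace them with '?', corrupting characters such as U+4E00.
            newStr = new String(stringBufferResult.toString().getBytes("ISO-8859-1"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return newStr;
    }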
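
For comparison, the same decoding can be sketched with the standard library's `java.net.URLDecoder`. This is only an illustration, not the article's method: the class name `PercentDecodeDemo` and the helper `decode` are made up for this example. One difference to keep in mind is that `URLDecoder` follows form-encoding rules and turns `+` into a space, which the hand-written loop above does not do.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; the names are illustrative, not from the article.
class PercentDecodeDemo {

    // Decodes a percent-encoded string, interpreting the bytes as UTF-8.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // %E4%B8%AD%E6%96%87 is the UTF-8 percent-encoding of U+4E2D U+6587.
        System.out.println(decode("%E4%B8%AD%E6%96%87"));
    }
}
```

If a literal `+` must survive decoding, it has to arrive as `%2B`, or the hand-written loop should be used instead.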

2. Encoding a string as Unicode escapes, and decoding it back

/**
     * Encodes a string into its Unicode-escaped form, e.g. "黃" to "\\u9EC4".
     * Converts characters outside printable ASCII to \\uxxxx escapes and
     * prefixes special characters with a backslash.
     *
     * @param theString   the string to encode
     * @param escapeSpace whether to escape spaces; when true, a backslash is written
     *                    before every space (a leading space is always escaped)
     * @return the Unicode-escaped string
     */
    public static String toEncodedUnicode(String theString, boolean escapeSpace) {
        int len = theString.length();
        int bufLen = len * 2;
        if (bufLen < 0) { // len * 2 overflowed
            bufLen = Integer.MAX_VALUE;
        }
        StringBuffer outBuffer = new StringBuffer(bufLen);
        for (int x = 0; x < len; x++) {
            char aChar = theString.charAt(x);
            // Handle common case first, selecting largest block that
            // avoids the specials below
            if ((aChar > 61) && (aChar < 127)) {
                if (aChar == '\\') {
                    outBuffer.append('\\');
                    outBuffer.append('\\');
                    continue;
                }
                outBuffer.append(aChar);
                continue;
            }
            switch (aChar) {
                case ' ':
                    if (x == 0 || escapeSpace) outBuffer.append('\\');
                    outBuffer.append(' ');
                    break;
                case '\t':
                    outBuffer.append('\\');
                    outBuffer.append('t');
                    break;
                case '\n':
                    outBuffer.append('\\');
                    outBuffer.append('n');
                    break;
                case '\r':
                    outBuffer.append('\\');
                    outBuffer.append('r');
                    break;
                case '\f':
                    outBuffer.append('\\');
                    outBuffer.append('f');
                    break;
                case '=': // Fall through
                case ':': // Fall through
                case '#': // Fall through
                case '!':
                    outBuffer.append('\\');
                    outBuffer.append(aChar);
                    break;
                default:
                    if ((aChar < 0x0020) || (aChar > 0x007e)) {
                        // A char is 16 bits; emit one hex digit per 4-bit group,
                        // from the highest group to the lowest
                        outBuffer.append('\\');
                        outBuffer.append('u');
                        outBuffer.append(toHex((aChar >> 12) & 0xF));
                        outBuffer.append(toHex((aChar >> 8) & 0xF));
                        outBuffer.append(toHex((aChar >> 4) & 0xF));
                        outBuffer.append(toHex(aChar & 0xF));
                    } else {
                        outBuffer.append(aChar);
                    }
            }
        }
        return outBuffer.toString();
    }
    /**
     * Decodes a Unicode-escaped string back into the characters it represents,
     * e.g. "\\u9EC4" to "黃".
     * Converts encoded \\uxxxx to unicode chars
     * and changes special saved chars to their original forms
     *
     * @param in  the character array holding the Unicode-escaped text
     * @param off the offset at which decoding starts
     * @param len the number of characters to decode
     * @return the decoded string
     * @throws IllegalArgumentException if a \\uxxxx escape contains a non-hex digit
     */
    public static String fromEncodedUnicode(char[] in, int off, int len) {
        char aChar;
        char[] out = new char[len]; // the output is never longer than the input
        int outLen = 0;
        int end = off + len;
        while (off < end) {
            aChar = in[off++];
            if (aChar == '\\') {
                aChar = in[off++];
                if (aChar == 'u') {
                    // Read the xxxx
                    int value = 0;
                    for (int i = 0; i < 4; i++) {
                        aChar = in[off++];
                        switch (aChar) {
                            case '0':
                            case '1':
                            case '2':
                            case '3':
                            case '4':
                            case '5':
                            case '6':
                            case '7':
                            case '8':
                            case '9':
                                value = (value << 4) + aChar - '0';
                                break;
                            case 'a':
                            case 'b':
                            case 'c':
                            case 'd':
                            case 'e':
                            case 'f':
                                value = (value << 4) + 10 + aChar - 'a';
                                break;
                            case 'A':
                            case 'B':
                            case 'C':
                            case 'D':
                            case 'E':
                            case 'F':
                                value = (value << 4) + 10 + aChar - 'A';
                                break;
                            default:
                                throw new IllegalArgumentException("Malformed \\uxxxx encoding.");
                        }
                    }
                    out[outLen++] = (char) value;
                } else {
                    // Map the escaped control characters back; any other escaped
                    // char (such as a backslash or '=') stands for itself
                    if (aChar == 't') {
                        aChar = '\t';
                    } else if (aChar == 'r') {
                        aChar = '\r';
                    } else if (aChar == 'n') {
                        aChar = '\n';
                    } else if (aChar == 'f') {
                        aChar = '\f';
                    }
                    out[outLen++] = aChar;
                }
            } else {
                out[outLen++] = (char) aChar;
            }
        }
        return new String(out, 0, outLen);
    }
    // Lookup table mapping a 4-bit value to its uppercase hex digit
    private static final char[] hexDigit = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A',
            'B', 'C', 'D', 'E', 'F'};
    private static char toHex(int nibble) {
        return hexDigit[(nibble & 0xF)];
    }
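
To see the round trip in one place, here is a compact, self-contained sketch of the same idea. It is a simplification, not the code above: it only handles backslashes and characters outside printable ASCII (no special treatment of spaces, `=`, `:`, `#` or `!`), and the names `UnicodeEscapeDemo`, `escape` and `unescape` are invented for this illustration. Malformed escapes are not validated carefully here; `Integer.parseInt` will simply throw a `NumberFormatException` on non-hex input.

```java
// Hypothetical demo class; the names are illustrative, not from the article.
class UnicodeEscapeDemo {

    // Doubles each backslash and replaces every char outside
    // printable ASCII (0x20..0x7E) with a backslash-u escape of 4 hex digits.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20 || c > 0x7E) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses escape(): a doubled backslash becomes one backslash,
    // and each 4-digit hex escape becomes the char it denotes.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean hasNext = c == '\\' && i + 1 < s.length();
            if (hasNext && s.charAt(i + 1) == '\\') {
                sb.append('\\');
                i += 2;
            } else if (hasNext && s.charAt(i + 1) == 'u' && i + 6 <= s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "a\\b \u9EC4"; // contains a backslash and one CJK char
        String escaped = escape(original);
        System.out.println(escaped);
        System.out.println(unescape(escaped).equals(original));
    }
}
```

Because both methods work on 16-bit `char` values, a character outside the Basic Multilingual Plane is written as two escapes, one per surrogate, and is reassembled correctly on decoding.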
