General-Purpose Code for Converting Between Numbers and Chinese Numerals

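Today I published my first article on CodeProject: http://www.codeproject.com/KB/cs/NumberToFromWord.aspx. This post explains the Chinese numeral conversion in particular.

The tool is extensible: it converts numbers to Chinese and Chinese back to numbers. The basic idea is to build a dictionary that maps each number to its names,

Dictionary<int, List<string>> NumberNameDict. For Chinese it is initialized with: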

            new Dictionary<int, List<string>>
            {
                {10, new List<string>{"十","拾"}},
                {20, new List<string>{"廿"}},
                {30, new List<string>{"卅"}},
                {100, new List<string>{"百", "佰"}},
                {1000, new List<string>{"千", "仟"}},
                {10000, new List<string>{"萬"}},
                {100000000, new List<string>{"億"}}
            },
            new List<string>() { "○一二三四五六七八九", "0123456789", "零壹貳叁肆伍陸柒捌玖" }

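These two arguments hold all the Chinese names. The List<string> is used only for Chinese: it stores the characters for the digits 0-9, and its first string is the default output format. From this data the class initializer builds the reverse-lookup dictionary, Dictionary<string, int> WordNameDict: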

            foreach (KeyValuePair<int, List<string>> kvp in NumberNameDict)
            {
                foreach (string s in kvp.Value)
                {
                    WordNameDict.Add(s, kvp.Key);
                    if (!AllWords.Contains(s))
                        AllWords.Add(s);
                }
            }
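
The loop above registers the names stored in NumberNameDict. As a minimal sketch of how the digit strings might be folded into the same lookup, the following assumes that each string is indexed by position, so the character at index i stands for the digit i. The class and method names are hypothetical and are not taken from the article's source.

```csharp
using System;
using System.Collections.Generic;

static class NumberDictionarySketch
{
    // Builds a word -> number lookup from group names and digit strings.
    // Assumption: in every digit string, the character at index i denotes digit i.
    public static Dictionary<string, int> BuildReverse(
        Dictionary<int, List<string>> numberNames, List<string> digitSets)
    {
        var wordToNumber = new Dictionary<string, int>();

        // Every alternative name of a number maps back to that number.
        foreach (KeyValuePair<int, List<string>> kvp in numberNames)
            foreach (string name in kvp.Value)
                wordToNumber[name] = kvp.Key;

        // Every digit character maps back to its position in the string.
        foreach (string digits in digitSets)
            for (int i = 0; i < digits.Length; i++)
                wordToNumber[digits[i].ToString()] = i;

        return wordToNumber;
    }
}
```

With the data shown earlier, both "百" and "佰" would resolve to 100, and "三", "3" and "叁" would all resolve to 3.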

In addition, a list of the group (place-value) numbers is built, protected List<int> groupNums:

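            groupNums = new List<int>();
            int max = NumberNameDict.Keys.Max();
            // Walk firstGroupNum, firstGroupNum*10, ... and keep the values that have a name.
            for (int i = firstGroupNum; i <= max; i *= 10)
            {
                if (NumberNameDict.ContainsKey(i))
                {
                    //GroupNameDict.Add(i, NumberNameDict[i][0]);
                    groupNums.Add(i);
                }
            }
            // Largest group first, because the conversion walks from the highest place value down.
            groupNums.Sort();
            groupNums.Reverse();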

The result is {100000000, 10000, 1000, 100, 10}, that is, the list of numbers corresponding to "億萬千百十".
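
To make the selection rule concrete, here is a small standalone sketch of the same idea. It assumes firstGroupNum is 10 (the article does not state its value) and uses a hypothetical helper name. Note that 20 and 30 have names but are never visited by the times-ten walk, so they are treated as plain numbers rather than groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupNumsSketch
{
    // Returns the named values reachable from firstGroupNum by repeated
    // multiplication by ten, ordered from largest to smallest.
    // firstGroupNum must be positive, otherwise the loop would not advance.
    public static List<int> BuildGroupNums(IEnumerable<int> keys, int firstGroupNum)
    {
        var keySet = new HashSet<int>(keys);
        int max = keySet.Max();
        var groups = new List<int>();

        // A long counter avoids int overflow when max is close to int.MaxValue.
        for (long i = firstGroupNum; i <= max; i *= 10)
        {
            if (keySet.Contains((int)i))
                groups.Add((int)i);
        }

        groups.Reverse(); // collected in ascending order, so reversing gives descending
        return groups;
    }
}
```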

Number to Chinese

To convert a number to Chinese, each nonzero part is converted first, with its place value resolved by walking through groupNums:

            List<string> sections = new List<string>();
            int remained = number;
            for (int i = 0; i < groupNums.Count; i++)
            {
                // Skip groups larger than what is left.
                if (remained < groupNums[i])
                    continue;
                // How many of this group the number contains, converted recursively.
                int whole = remained / groupNums[i];
                sections.Add(toWords(whole));
                // Group name, pluralized where the language needs it.
                if (ToPlural != null && whole != 1)
                    sections.Add(ToPlural(NumberNameDict[groupNums[i]][0]));
                else
                    sections.Add(NumberNameDict[groupNums[i]][0]);
                remained -= whole * groupNums[i];
                //if(remained != 0 && remained < 100)
                if (remained != 0 && NeedInsertAnd(number, remained))
                    sections.Add(AndWords[0]);
            }
            // Whatever is left is smaller than the lowest group.
            if (remained != 0)
                sections.Add(toWords(remained));

For Chinese, the final result is simply the concatenation of the separate strings above:

            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < sections.Count-1; i++)
            {
                sb.Append(sections[i] + Space);
            }
            sb.Append(sections.Last());
            return sb.ToString();
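
The two steps can be tried out with a stripped-down, standalone sketch. It is not the article's class: it hard-codes the Chinese groups, uses hypothetical names, and deliberately leaves out the "零" insertion handled by NeedInsertAnd and AndWords, so a number such as 102 comes out without the "零".

```csharp
using System;
using System.Text;

static class ChineseNumberSketch
{
    // Group values in descending order with their default names.
    static readonly (int Value, string Name)[] Groups =
    {
        (100000000, "億"), (10000, "萬"), (1000, "千"), (100, "百"), (10, "十")
    };

    // Default digit characters; index i is the digit i.
    const string Digits = "○一二三四五六七八九";

    public static string ToWords(int number)
    {
        if (number < 0)
            throw new ArgumentOutOfRangeException(nameof(number));
        if (number < 10)
            return Digits[number].ToString();

        var sb = new StringBuilder();
        int remained = number;
        foreach (var (value, name) in Groups)
        {
            if (remained < value)
                continue;
            int whole = remained / value;       // count of this group
            sb.Append(ToWords(whole)).Append(name);
            remained -= whole * value;
        }
        if (remained != 0)
            sb.Append(ToWords(remained));       // final digit below ten
        return sb.ToString();
    }
}
```

For example, ToWords(12345) yields "一萬二千三百四十五", the same string the full tool produces for that number.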

To customize the output characters, call the following method, which lets you specify particular characters.

        /// <summary>
        /// Converts the number to words, replacing default characters with those given in samples.
        /// </summary>
        /// <param name="number">The number</param>
        /// <param name="samples">
        /// The characters that shall be used to replace the default ones.
        /// <example>
        /// For example, 234002052 by default will be converted to "二億三千四百萬零二千零五十二",
        ///     but if samples is set to "佰零壹貳叁肆拾", then the output will be "貳億叁千肆佰萬零貳千零五拾貳".
        ///     Any character that appears in samples replaces the default one, thus "貳" replaces every "二" for the digit 2.
        /// </example>
        /// </param>
        /// <returns>The converted string in words.</returns>
        private string toWords(int number, string samples)
        {
            string result = ToWords(number);
            foreach (char ch in samples)
            {
                if (allCharacters.Contains(ch) && WordNameDict.ContainsKey(ch.ToString()))
                {
                    int digit = WordNameDict[ch.ToString()];
                    // Only single digits and group numbers can be substituted.
                    if (digit > 9 && !groupNums.Contains(digit))
                        continue;
                    string digitStr = NumberNameDict[digit][0];
                    // Nothing to do if the default is not a single character or is already ch.
                    if (digitStr.Length != 1 || digitStr[0] == ch)
                        continue;
                    result = result.Replace(digitStr[0], ch);
                }
            }
            return result;
        }
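
The substitution step can also be reproduced in isolation. The sketch below is an assumption-laden simplification: it hard-codes a small table of default characters instead of consulting NumberNameDict and WordNameDict, and its names are hypothetical. It is checked against the example given in the doc comment above.

```csharp
using System;
using System.Collections.Generic;

static class SampleReplacementSketch
{
    // Default digit characters and their formal alternatives, aligned by index.
    const string DefaultDigits = "○一二三四五六七八九";
    const string FormalDigits = "零壹貳叁肆伍陸柒捌玖";

    // Alternative group characters mapped to their defaults.
    static readonly Dictionary<char, char> GroupDefaults = new Dictionary<char, char>
    {
        { '拾', '十' }, { '佰', '百' }, { '仟', '千' }
    };

    // Returns the default character that ch may replace, or null if there is none.
    static char? DefaultFor(char ch)
    {
        if (GroupDefaults.TryGetValue(ch, out char group))
            return group;
        int index = FormalDigits.IndexOf(ch);
        if (index >= 0)
            return DefaultDigits[index];
        return null;
    }

    // Replaces default characters in words with each character listed in samples.
    public static string ApplySamples(string words, string samples)
    {
        foreach (char ch in samples)
        {
            char? def = DefaultFor(ch);
            if (def.HasValue && def.Value != ch)
                words = words.Replace(def.Value, ch);
        }
        return words;
    }
}
```

Calling ApplySamples("二億三千四百萬零二千零五十二", "佰零壹貳叁肆拾") gives "貳億叁千肆佰萬零貳千零五拾貳". The "五" is untouched because "伍" is not among the samples.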

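Chinese to Number

The biggest challenge is that the place value of each digit is hard to determine. Consider:

  • A higher group should appear before a lower one;
  • If a higher group appears after a lower one, the two should be multiplied together to form an even higher group.
  • A lower group appearing after a higher one confirms the number parsed before it.

So I use a stack, Stack<int>, to parse the number. The code is as follows.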

        /// <summary>
        /// Function to get number from split words.
        /// </summary>
        /// <param name="sectors">Words for each digit of the number</param>
        /// <returns>The number</returns>
        protected int fromWords(string[] sectors)
        {
            int result = 0, current, lastGroup = 1, temp, maxGroup = 1;
            Stack<int> stack = new Stack<int>();
            foreach (string s in sectors)
            {
                if (AllWords.Contains(s))
                {
                    if (AndWords.Contains(s))
                        continue;
                    if (WordNameDict.ContainsKey(s))
                    {
                        current = WordNameDict[s];
                        if (groupNums.Contains(current))
                        {
                            // The current group is at least as high as any group seen so far,
                            // so everything on the stack is summed and multiplied by it.
                            if (current >= maxGroup)
                            {
                                temp = stack.Pop();
                                while (stack.Count != 0)
                                {
                                    temp += stack.Pop();
                                }
                                temp *= current;
                                stack.Push(temp);
                                maxGroup *= current;
                                lastGroup = 1;
                            }
                            // The current group is higher than the last group:
                            // sum the smaller values on the stack, then multiply.
                            else if (current > lastGroup)
                            {
                                temp = 0;
                                while (stack.Peek() < current)
                                {
                                    temp += stack.Pop();
                                }
                                temp *= current;
                                stack.Push(temp);
                                lastGroup = current;
                            }
                            // Otherwise the group scales only the value pushed just before it.
                            else
                            {
                                temp = stack.Pop();
                                temp *= current;
                                stack.Push(temp);
                                lastGroup = current;
                            }
                        }
                        else
                        {
                            // A plain digit or number: keep it until a group word arrives.
                            stack.Push(current);
                        }
                    }
                }
                else
                    throw new Exception();
            }
            // The remaining stack entries are the parts of the number; add them up.
            do
            {
                result += stack.Pop();
            } while (stack.Count != 0);
            return result;
        }
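
To see the stack idea at work outside the class, here is a simplified standalone variant. It is not the method above: it handles only single Chinese characters, merges the three branches into one rule (pop and sum everything smaller than the group, then multiply), treats a bare group such as "十" as one of that group, and uses hypothetical names throughout.

```csharp
using System;
using System.Collections.Generic;

static class ChineseParseSketch
{
    static readonly Dictionary<char, int> Groups = new Dictionary<char, int>
    {
        { '十', 10 }, { '百', 100 }, { '千', 1000 }, { '萬', 10000 }, { '億', 100000000 }
    };

    // Default digit characters; index i is the digit i.
    const string Digits = "○一二三四五六七八九";

    public static int FromWords(string words)
    {
        var stack = new Stack<int>();
        foreach (char ch in words)
        {
            // Zero characters only mark a gap between groups.
            if (ch == '零' || ch == '○')
                continue;

            int digit = Digits.IndexOf(ch);
            if (digit > 0)
            {
                stack.Push(digit);
                continue;
            }

            if (!Groups.TryGetValue(ch, out int group))
                throw new FormatException("Unexpected character: " + ch);

            // Everything smaller than this group belongs to it: sum, then multiply.
            int lower = 0;
            while (stack.Count > 0 && stack.Peek() < group)
                lower += stack.Pop();
            stack.Push((lower == 0 ? 1 : lower) * group);
        }

        // The remaining entries are independent parts of the number.
        int result = 0;
        while (stack.Count > 0)
            result += stack.Pop();
        return result;
    }
}
```

For "二億三千四百萬零二千零五十二" the stack holds 200000000 and 34000000 after "萬" is read (3000 and 400 were popped, summed and multiplied by 10000), and the final sum is 234002052.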

To parse a string into a number, tryParse() is recommended.

        /// <summary>
        /// The main function to try to retrieve number from string of words.
        /// </summary>
        /// <param name="numberInWords">The original word string of number</param>
        /// <param name="result">The converted number if successful</param>
        /// <returns>TRUE if parse successfully.</returns>
        protected virtual bool tryParse(string numberInWords, out int result)
        {
            result = -1;
            try
            {
                string words = IsCaseSensitive ? numberInWords.ToLower() : numberInWords;
                string[] sectors = split(words);
                var contained = from s in sectors
                                where AllWords.Contains(s)
                                select s;
                result = fromWords(contained.ToArray());
                return true;
            }
            catch
            {
                return false;
            }
        }

The final conversion results between numbers and Chinese or English words are as follows:

5: 五  ==> 5

20: 廿  ==> 20

21: 二十一  ==> 21

99: 九十九  ==> 99

100: 一百  ==> 100

102: 一百零二  ==> 102

131: 一百三十一  ==> 131

356: 三百五十六  ==> 356

909: 九百零九  ==> 909

1000: 一千  ==> 1000

1021: 一千零二十一  ==> 1021

2037: 二千零三十七  ==> 2037

12345: 一萬二千三百四十五  ==> 12345

31027: 三萬一千零二十七  ==> 31027

40002: 四萬零二  ==> 40002

90010: 九萬零一十  ==> 90010

100232300: 一億零二十三萬二千三百  ==> 100232300

234002052: 二億三千四百萬零二千零五十二  ==> 234002052

5: five  ==> 5

20: twenty  ==> 20

21: twenty-one  ==> 21

99: ninety-nine  ==> 99

100: one hundred  ==> 100

102: one hundred and two  ==> 102

131: one hundred and thirty-one  ==> 131

356: three hundreds and fifty-six  ==> 356

909: nine hundreds and nine  ==> 909

1000: one thousand  ==> 1000

1021: one thousand and twenty-one  ==> 1021

2037: two thousands and thirty-seven  ==> 2037

12345: twelve thousands three hundreds and forty-five  ==> 12345

31027: thirty-one thousands and twenty-seven  ==> 31027

40002: forty thousands and two  ==> 40002

90010: ninety thousands and ten  ==> 90010

100232300: one hundred millions two hundreds and thirty-two thousands three hundreds  ==> 100232300

234002052: two hundreds and thirty-four millions two thousands and fifty-two  ==> 234002052

572030013: 五億七千貳佰零叁萬零壹拾叁  ==> 572030013

234002052: 貳億叁千肆佰萬零貳千零五拾貳  ==> 234002052

5: Five  ==> 5

20: Twenty  ==> 20

21: Twenty One  ==> 21

99: Ninety Nine  ==> 99

100: One Hundred  ==> 100

102: One Hundred And Two  ==> 102

131: One Hundred And Thirty One  ==> 131

356: Three Hundreds And Fifty Six  ==> 356

909: Nine Hundreds And Nine  ==> 909

1000: One Thousand  ==> 1000

1021: One Thousand And Twenty One  ==> 1021

2037: Two Thousands And Thirty Seven  ==> 2037

12345: Twelve Thousands Three Hundreds And Forty Five  ==> 12345

31027: Thirty One Thousands And Twenty Seven  ==> 31027

40002: Forty Thousands And Two  ==> 40002

90010: Ninety Thousands And Ten  ==> 90010

100232300: One Hundred Millions Two Hundreds And Thirty Two Thousands Three Hundreds  ==> 100232300

234002052: Two Hundreds And Thirty Four Millions Two Thousands And Fifty Two  ==> 234002052

第壹佰零八 張 = 108 

The source code is available here:

/Files/cruisoring/NumberFromWordSource.zip

/Files/cruisoring/NumberWordTestSource.zip 
