Bash String Processing (Compared with Java) - 2. String Representation (String Constants)

In Java

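You know this one already!

A character constant is written in single quotes: 'c'

A string constant is written in double quotes: "hello world"

Java character escapes

\u???? A Unicode character given as four hexadecimal digits

\??? A character given as one to three octal digits (\0 to \377)

\n Newline (\u000a) (Insert a newline in the text at this point.)

\t Horizontal tab (\u0009) (Insert a tab in the text at this point.)

\b Backspace (\u0008) (Insert a backspace in the text at this point.)

\r Carriage return (\u000d) (Insert a carriage return in the text at this point.)

\f Form feed (\u000c) (Insert a formfeed in the text at this point.)

\' Single quote (\u0027) (Insert a single quote character in the text at this point.)

\" Double quote (\u0022) (Insert a double quote character in the text at this point.)

\\ Backslash (\u005c) (Insert a backslash character in the text at this point.)

Note: a lot of material on the web is not necessarily correct, so it is better to read the Characters page on the Oracle site: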

http://download.oracle.com/javase/tutorial/java/data/characters.html

In Bash

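Uses of strings

In Bash, strings are everywhere. A string can serve as:

(1) a command;

(2) a command-line argument;

(3) the value of a variable;

(4) the file name in a redirection;

(5) an input string (here string, here document);

(6) the output of a command line.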
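The six roles in the list above can be seen together in one short script. This is only a sketch: the variable names (cmd, out_file and so on) are made up for illustration, and it assumes mktemp and cat are available.

```bash
#!/usr/bin/env bash
# Illustrative names only; none of them come from the article.

cmd=echo                      # (3) a string stored as the value of a variable
out_file=$(mktemp)            # (6) a string taken from the output of a command

"$cmd" hello > "$out_file"    # (1) command name, (2) argument, (4) redirection target

from_file=$(cat "$out_file")
from_here_string=$(cat <<< "hello")   # (5) input from a here string
from_here_doc=$(cat <<EOF
hello
EOF
)                                     # (5) input from a here document

rm -f "$out_file"
echo "$from_file $from_here_string $from_here_doc"
```

Run as a script, this prints hello three times on one line, once for each way the string travelled.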

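Overview of string representations

In Bash, a string constant can be written in several ways:

(1) with no quotes at all, for example hello

(2) enclosed in a pair of single quotes, for example 'hello'

(3) enclosed in a pair of double quotes, for example "hello"

(4) a dollar sign followed by a pair of single quotes, for example $'hello'

If the string contains no special characters, these forms are all equivalent.

Tip: these forms can be combined freely within one word; for example, h'e'"l"$'lo' is equivalent to hello.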

[[email protected] ~]# echo hello

hello

[[email protected] ~]# echo 'hello'

hello

[[email protected] ~]# echo "hello"

hello

[[email protected] ~]# echo $'hello'

hello

[[email protected] ~]# echo h'e'"l"$'lo'

hello

[[email protected] ~]#

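Questions about string representation

Before looking at the representations in more detail, try to answer the following questions:

Question 1: How do you write a string that contains a space?

Question 2: How do you write a string that contains a tab?

Question 3: How do you write a string that contains a single quote?

Question 4: How do you write a string that contains a double quote?

Question 5: How do you write a string that contains $?

Question 6: How do you write a string that contains #?

Question 7: How do you write a string that contains a newline, that is, a multi-line string?

Question 8: How do you write a string that contains a backslash?

Question 9: How do you write a string that contains an exclamation mark?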

No quotes (one of quoting mechanisms: the escape character)

When a string is not enclosed in quotes, every special character keeps its special effect. To remove that effect, escape the character with the escape character (\).

man bash says: A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. If a \<newline> pair appears, and the backslash is not itself quoted, the \<newline> is treated as a line continuation (that is, it is removed from the input stream and effectively ignored).

A space separates a command from its arguments, one argument from the next, and a variable assignment from the command that follows it, so a space inside a string has to be escaped.

[[email protected] ~]# STR=hello world

-bash: world: command not found

[[email protected] ~]# STR=hello\ world

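A newline is one of the separators between commands (the other is the semicolon). If a string is long, the newline can be escaped, which turns it into a line continuation.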

[[email protected] ~]# STR=hello\

> world

[[email protected] ~]#

Outside any quotes, a backslash followed by a character makes that character lose its special meaning. The pipe (|) is one example.

[[email protected] ~]# STR=hello|world

-bash: world: command not found

[[email protected] ~]# STR=hello\|world

[[email protected] ~]#
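The transcripts above show that the assignments succeed but never print the resulting values. A small sketch that does, assuming it is run as a Bash script (the variable names str and long are illustrative):

```bash
#!/usr/bin/env bash
# Without quotes, each special character needs its own backslash.
str=a\ b\|c\;d\&e\$f
echo "$str"

# A backslash-newline pair is removed entirely, so the two halves are joined.
long=hello\
world
echo "$long"
```

The first echo prints a b|c;d&e$f and the second prints helloworld: each backslash is consumed, and only the character after it remains.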

Single quotes (single quote, full quoting)

Advanced Bash-Scripting Guide: Chapter 3. Special Characters says: full quoting [single quote]. 'STRING' preserves all special characters within STRING. This is a stronger form of quoting than "STRING".

Within single quotes, every special character loses its special meaning. As a consequence, the single quote character itself cannot be written inside single quotes.

man bash QUOTING says: Enclosing characters in single quotes preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.

[[email protected] ~]# echo '$'

$

[[email protected] ~]# echo '`'

`

[[email protected] ~]# echo '\'

\

[[email protected] ~]# echo '!'

!

[[email protected] ~]# echo 'begin of string

> some strings

> end of string'

begin of string

some strings

end of string

[[email protected] ~]#
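Because a single quote cannot appear between single quotes, a common workaround is to end the quoted part, add the quote character by other means, and start a new quoted part. The sketch below shows two variants; the variable names s1 and s2 are illustrative.

```bash
#!/usr/bin/env bash
# 'it'  \'  's' : close the quoted part, add an escaped quote, reopen.
s1='it'\''s'

# Same idea, with the quote character protected by double quotes instead.
s2='it'"'"'s'

echo "$s1"
echo "$s2"
```

Both lines print it's. This works because adjacent quoted and unquoted pieces are joined into a single word.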

The following shows how single-quoted strings behave inside a script.

Bash script: test_single_quote.sh

#!/bin/bash

echo '$'

echo '`'

echo '\'

echo '!'

echo 'start of string
some strings
end of string'
      

[[email protected] ~]# chmod +x ./test_single_quote.sh

[[email protected] ~]# ./test_single_quote.sh

$

`

\

!

start of string

some strings

end of string

[[email protected] ~]#

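Double quotes (double quote, partial quoting)

Advanced Bash-Scripting Guide: Chapter 3. Special Characters says: partial quoting [double quote]. "STRING" preserves (from interpretation) most of the special characters within STRING.

Within double quotes, most special characters lose their special meaning, for example the comment character (#), the command separator (;), the pipe (|), and the wildcards (? and *).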

[[email protected] work191]# echo *

cgen ct08 hycu2 hyfc2 jlib keyc mdbc msgc sms web_spider xqt_admin xqt_demo xqt_server zjlt_mhr_files zjlt_mhr_server zjlt_mhr_user zjlt_mhr_web

[[email protected] work191]# echo "*"

*

[[email protected] work191]# echo #

[[email protected] work191]# echo "#"

#

[[email protected] work191]#

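Advanced Bash-Scripting Guide: Chapter 5. Quoting says: This prevents reinterpretation of all special characters within the quoted string -- except $, ` (backquote), and \ (escape). Keeping $ as a special character within double quotes permits referencing a quoted variable ("$variable"), that is, replacing the variable with its value

man bash says: Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, and \.

Within double quotes, the characters that remain special are the dollar sign ($), the backquote (`), the escape character (\), and the exclamation mark (!). All other special characters can be ignored. (Why the exclamation mark is on this list is explained later in this article.)

man bash says: The characters $ and ` retain their special meaning within double quotes.

Within double quotes, the dollar sign ($) references a variable and is replaced by the variable's value, for example:

"$VAR"   is replaced by the value of the variable VAR, but "$VAR123" is replaced by the value of the variable VAR123, so it is best to wrap the variable name in braces to avoid surprises;

"${VAR}"   is replaced by the value of the variable VAR, and "${VAR}123" is replaced by the value of VAR followed by 123;

"${VAR:-DEFAULT}"  is replaced by DEFAULT when VAR is undefined or empty, and by the value of VAR otherwise;

"$(command line)"  is equivalent to "`command line`" and is replaced by the standard output of the command line.

First, with the variable VAR undefined: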

[[email protected] ~]# echo "$VAR"

[[email protected] ~]# echo "${VAR}"

[[email protected] ~]# echo "$VAR123"

[[email protected] ~]# echo "${VAR}123"

123

[[email protected] ~]# echo "${VAR:-DEFAULT}"

DEFAULT

Now define the variable VAR.

[[email protected] ~]# VAR="Hello World"

[[email protected] ~]# echo "$VAR"

Hello World

[[email protected] ~]# echo "${VAR}"

Hello World

[[email protected] ~]# echo "$VAR123"

[[email protected] ~]# echo "${VAR}123"

Hello World123

[[email protected] ~]# echo "${VAR:-DEFAULT}"

Hello World

Now test substituting the output of a command.

[[email protected] ~]# echo "$(pwd)"

/root

[[email protected] ~]#
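One practical reason to reference a variable inside double quotes is that an unquoted expansion is split into words at spaces. The sketch below illustrates this under the default IFS; count_args is a hypothetical helper written for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

VAR="Hello World"
unquoted=$(count_args $VAR)      # the value is split at the space
quoted=$(count_args "$VAR")      # the value stays one argument

echo "unquoted: $unquoted, quoted: $quoted"
```

This prints unquoted: 2, quoted: 1, so "$VAR" passes the value along as a single string while $VAR does not.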

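Within double quotes, a pair of backquotes (`) is replaced by the standard output of the command line it encloses:

"`ls`" is equivalent to "$(ls)" and is replaced by the output of the ls command, that is, the names of the files in the current directory;

"`date +%F`" is equivalent to "$(date +%F)" and is replaced by the output of date +%F, that is, today's date, such as 2011-09-01.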

[[email protected] ~]# echo "`pwd`"

/root

[[email protected] ~]# echo "`date +%F`"

2011-09-02

[[email protected] ~]# echo "$(date +%F)"

2011-09-02

[[email protected] ~]#

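man bash says: The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>. A double quote may be quoted within double quotes by preceding it with a backslash.

Within double quotes, the escape character (\) makes it easy to build strings that contain special characters: put a \ in front of the special character to escape it.

\"  double quote: gives the quote its literal meaning

\$  dollar sign: gives the dollar sign its literal meaning (variable name following \$ will not be referenced)

\\  backslash, the escape character itself: gives the backslash its literal meaning

\`  backquote: gives the backquote its literal meaning

\<newline>  a backslash followed by a newline: the newline no longer ends the line and only acts as a line continuation.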

[[email protected] ~]# echo ""

[[email protected] ~]# echo "hello

> world"

hello

world

[[email protected] ~]#

[[email protected] ~]# echo "hello world\""

hello world"

[[email protected] ~]# echo "hello world\$" 

hello world$

[[email protected] ~]# echo "hello world\\"

hello world\

[[email protected] ~]# echo "hello world\`"

hello world`

[[email protected] ~]# echo "hello world\

> yes"

hello worldyes

[[email protected] ~]#
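The man page wording implies that a backslash before any other character is kept as it is inside double quotes. A short sketch makes this visible by checking string lengths; the variable names a, b and c are illustrative.

```bash
#!/usr/bin/env bash
a="a\nb"      # n is not in the list, so both the backslash and the n are kept
b="a\\b"      # \\ collapses to a single backslash
c="a\$b"      # \$ yields a literal dollar sign

# printf '%s\n' prints each value without interpreting backslashes itself.
printf '%s\n' "$a" "$b" "$c"
echo "${#a} ${#b} ${#c}"
```

The lengths are 4, 3 and 3: "a\nb" holds four characters, not a newline. To get a real newline from an escape sequence, the $'...' form is needed.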

Within double quotes, the meaning of the exclamation mark (!) depends on where it is used: on the interactive command line it is interpreted as a history command, while in a script it has no special meaning.

Advanced Bash-Scripting Guide: 5.1. Quoting Variables says: Encapsulating "!" within double quotes gives an error when used from the command line. This is interpreted as a history command. Within a script, though, this problem does not occur, since the Bash history mechanism is disabled then.

On the command line, the exclamation mark (!) is called the "history expansion character".

[[email protected] ~]# pwd

/root

[[email protected] ~]# echo "!"

-bash: !: event not found

[[email protected] ~]# echo "!pwd"

echo "pwd"

pwd

[[email protected] ~]#

An exclamation mark used in a script does not trigger history expansion.

Bash script: test_exclamation_mark.sh

#!/bin/bash

echo "!"

pwd
echo "!pwd"
      

[[email protected] root]# ./test_exclamation_mark.sh

!

/root

!pwd

[[email protected] root]#
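Whether history expansion is active can be checked from the shell itself, since $- lists the active option flags and H stands for history expansion. The sketch below also shows the single-quoted form, which keeps ! literal in either setting. The variable names hist and msg are illustrative.

```bash
#!/usr/bin/env bash
# $- lists the active shell options; "H" means history expansion is on.
case $- in
  *H*) hist=on ;;
  *)   hist=off ;;
esac
echo "history expansion: $hist"

# Single quotes keep "!" literal on the command line and in scripts alike.
msg='hello!'
echo "$msg"
```

On an interactive command line, set +H turns history expansion off and set -H turns it back on.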

$' ... ' (ANSI-C Quoting)

As described earlier, no character enclosed in single quotes has a special meaning, so a single quote cannot appear inside a pair of single quotes. With a $ in front of the opening quote, however, \ can be used for escaping, and the escapes have the same meaning as in the C language.

Advanced Bash-Scripting Guide: Chapter 3. Special Characters says: $' ... ' Quoted string expansion. This construct expands single or multiple escaped octal or hex values into ASCII or Unicode characters.

Advanced Bash-Scripting Guide: 5.2. Escaping says: The $' ... ' quoted string-expansion construct is a mechanism that uses escaped octal or hex values to assign ASCII characters to variables, e.g., quote=$'\042'.

man bash says: Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:

\a alert (bell)

\b backspace

\e an escape character (not ANSI C)

\f form feed

\n new line

\r carriage return

\t horizontal tab

\v vertical tab

\\ backslash

\' single quote

\nnn the eight-bit character whose value is the octal value nnn (one to three digits)

\xHH the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)

\cx a control-x character

The expanded result is single-quoted, as if the dollar sign had not been present.

One bell.

[[email protected] ~]# echo $'\a'

Three bells.

[[email protected] ~]# echo $'\a\a\a'

A single quote.

[[email protected] ~]# echo $'\''

'

ASCII characters written in octal (\101, \102 and \103 are A, B and C; \010 is a backspace).

[[email protected] ~]# echo $'\101\102\103\010'

ABC

[[email protected] ~]#

[[email protected] ~]#
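The escapes in the man page list can also be used to build strings that contain tabs, newlines, or characters given by their hexadecimal code. A brief sketch, with illustrative variable names:

```bash
#!/usr/bin/env bash
tab=$'a\tb'          # \t becomes one real tab character
hex=$'\x41\x42'      # hexadecimal escapes for A and B
lines=$'one\ntwo'    # \n becomes a real newline, giving a two-line string

echo "${#tab} $hex"
printf '%s\n' "$lines"
```

The first line of output is 3 AB (a, a tab and b make three characters), followed by one and two on separate lines.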

$"..." (Locale-Specific Translation)

This is a double-quoted string with a $ in front of it. I do not yet understand this part well enough to explain it, so I will leave it here; pointers from anyone who knows it well are welcome.

man bash says: A double-quoted string preceded by a dollar sign ($) will cause the string to be translated according to the current locale. If the current locale is C or POSIX, the dollar sign is ignored. If the string is translated and replaced, the replacement is double-quoted.
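The part of the man page text that is easy to check is the C locale case, where the dollar sign is simply ignored. The sketch below forces that locale; the variable name greeting is illustrative. As far as I understand, in other locales Bash looks the string up in a gettext message catalog selected through the TEXTDOMAIN and TEXTDOMAINDIR variables, and leaves it unchanged when no translation is found.

```bash
#!/usr/bin/env bash
# Assumption: forcing the C locale, where the man page says $ is ignored.
LC_ALL=C

greeting=$"hello world"
echo "$greeting"
```

In this setting the script prints hello world, exactly as a plain "hello world" would.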

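Answers

Question 1: How do you write a string that contains a space?

Question 2: How do you write a string that contains a tab?

Question 3: How do you write a string that contains a single quote?

Question 4: How do you write a string that contains a double quote?

Question 5: How do you write a string that contains $?

Question 6: How do you write a string that contains #?

Question 7: How do you write a string that contains a newline, that is, a multi-line string?

Question 8: How do you write a string that contains a backslash?

Question 9: How do you write a string that contains an exclamation mark?

Thank you for patiently reading all of the above. The answers to these nine questions will be published before long; if you are interested, please reply in the comments.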
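In the meantime, here is one possible set of answers, put together from the quoting rules described in this article. It is a sketch rather than the author's official answer, and the variable names q1 to q9 are made up for illustration.

```bash
#!/usr/bin/env bash
q1='hello world'        # 1. space: any quoting form, or hello\ world
q2=$'a\tb'              # 2. tab: ANSI-C quoting
q3="it's"               # 3. single quote: double quotes, or $'it\'s'
q4='say "hi"'           # 4. double quote: single quotes, or "say \"hi\""
q5='cost: $5'           # 5. dollar sign: single quotes, or "cost: \$5"
q6='#not a comment'     # 6. hash: any quoting form, or \#
q7=$'line1\nline2'      # 7. newline: $'\n', or a literal line break inside quotes
q8='C:\temp'            # 8. backslash: single quotes, or "C:\\temp"
q9='wow!'               # 9. exclamation mark: single quotes, or \! when unquoted

printf '%s\n' "$q1" "$q2" "$q3" "$q4" "$q5" "$q6" "$q7" "$q8" "$q9"
```

Most questions have more than one answer; the comments name an alternative for each.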

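Permalink: http://codingstandards.iteye.com/blog/1166282   (please credit the source when reposting)

Back to contents: A Practical Bash Guide for Java Programmers: String Processing (Contents)

Previous: Bash String Processing (Compared with Java) - 1. Declaring (String) Variables

Next: Bash String Processing (Compared with Java) - 3. Assigning Values to (String) Variables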