
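About the strtok function [repost]

Most of us have probably run into strtok(), yet it always seems to cause some trouble, so this post looks at it in detail.

First, the explanation on MSDN: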

char *strtok( char *strToken, const char *strDelimit );

Parameters

strToken

String containing token or tokens.

strDelimit

Set of delimiter characters.

Return Value

Returns a pointer to the next token found in strToken. They return NULL when no more tokens are found. Each call modifies strToken by substituting a null character for each delimiter that is encountered.

Remarks

The strtok function finds the next token in strToken. The set of characters in strDelimit specifies possible delimiters of the token to be found in strToken on the current call.

Security Note    These functions incur a potential threat brought about by a buffer overrun problem. Buffer overrun problems are a frequent method of system attack, resulting in an unwarranted elevation of privilege. For more information, see Avoiding Buffer Overruns.

On the first call to strtok, the function skips leading delimiters and returns a pointer to the first token in strToken, terminating the token with a null character. More tokens can be broken out of the remainder of strToken by a series of calls to strtok.

Each call to strtok modifies strToken by inserting a null character after the token returned by that call. To read the next token from strToken, call strtok with a NULL value for the strToken argument. The NULL strToken argument causes strtok to search for the next token in the modified strToken. The strDelimit argument can take any value from one call to the next so that the set of delimiters may vary.

Note   Each function uses a static variable for parsing the string into tokens. If multiple or simultaneous calls are made to the same function, a high potential for data corruption and inaccurate results exists. Therefore, do not attempt to call the same function simultaneously for different strings and be aware of calling one of these functions from within a loop where another routine may be called that uses the same function. However, calling this function simultaneously from multiple threads does not have undesirable effects.

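Confusing, right? Heh...

Simply put: after the function returns the first substring split off by a delimiter, pass NULL as the first argument on later calls and it returns the remaining substrings one by one.

Let's look at an example: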

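#include <stdio.h>
#include <string.h>

int main(void)
{
      char test1[] = "feng,ke,wei";   /* writable array */
      char *test2 = "feng,ke,wei";    /* pointer to a string literal; discussed further down */
      char *p;

      (void)test2;                    /* not used in this run */

      p = strtok(test1, ",");
      while (p)
      {
            printf("%s\n", p);
            p = strtok(NULL, ",");
      }
      return 0;
}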

Output:

feng

ke

wei

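Explanation:

The strtok function breaks a string into a series of tokens. A token is a run of characters separated by delimiters (delimiting characters, usually spaces or punctuation). Note that the tokens here are the substrings separated by the characters in the delimiter string.

For example, in a line of text each word can be treated as a token, with the space as the delimiter.

Breaking a string into tokens takes multiple calls to strtok (assuming the string contains more than one token). The first call takes two arguments: the string to tokenize and a string containing the characters that separate tokens (the delimiters). The statement tokenptr = strtok(string, " ")

assigns tokenptr a pointer to the first token in string. The second argument, " ", says that the tokens in string are separated by spaces.

strtok searches string for the first character that is not a delimiter (here, a space); this is the start of the first token. It then looks for the next delimiter in the string and replaces it with a null character ('\0'), which marks the end of the current token. Pay attention to where a token starts and where it ends.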

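strtok saves a pointer to the character that follows the token in string and returns a pointer to the current token.

On later calls the first argument is NULL, which continues tokenizing string. A NULL argument tells strtok to resume from the position it saved during the previous call.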

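If no tokens remain when strtok is called, it returns NULL. Note that strtok modifies the input string, so if the program still needs the string after calling strtok, make a copy of it first.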

But p = strtok(test2, ",") causes a memory error. Why is that? Does it have something to do with the static variable inside strtok? Let's look at its source code:

/***
*strtok.c - tokenize a string with given delimiters
*
*       Copyright (c) Microsoft Corporation. All rights reserved.
*
*Purpose:
*       defines strtok() - breaks string into series of token
*       via repeated calls.
*
*******************************************************************************/

#include <cruntime.h>
#include <string.h>
#ifdef _MT
#include <mtdll.h>
#endif  /* _MT */

/***
*char *strtok(string, control) - tokenize string with delimiter in control
*
*Purpose:
*       strtok considers the string to consist of a sequence of zero or more
*       text tokens separated by spans of one or more control chars. The first
*       call, with string specified, returns a pointer to the first char of the
*       first token, and will write a null char into string immediately
*       following the returned token. Subsequent calls with zero for the first
*       argument (string) will work thru the string until no tokens remain. The
*       control string may be different from call to call. When no tokens remain
*       in string a NULL pointer is returned. Remember the control chars with a
*       bit map, one bit per ASCII char. The null char is always a control char.
*       // Blogger's note: this is already very detailed, even better than MSDN!
*
*Entry:
*       char *string - string to tokenize, or NULL to get next token
*       char *control - string of characters to use as delimiters
*
*Exit:
*       returns pointer to first token in string, or if string
*       was NULL, to next token
*       returns NULL when no more tokens remain.
*
*Uses:
*
*Exceptions:
*
*******************************************************************************/

char * __cdecl strtok (
        char * string,
        const char * control
        )
{
        unsigned char *str;
        const unsigned char *ctrl = control;

        unsigned char map[32];
        int count;

#ifdef _MT
        _ptiddata ptd = _getptd();
#else  /* _MT */
        static char *nextoken;                        // static variable that stores the remaining substring
#endif  /* _MT */

        /* Clear control map */
        for (count = 0; count < 32; count++)
                map[count] = 0;

        /* Set bits in delimiter table */
        do {
                map[*ctrl >> 3] |= (1 << (*ctrl & 7));
        } while (*ctrl++);

        /* Initialize str. If string is NULL, set str to the saved
         * pointer (i.e., continue breaking tokens out of the string
         * from the last strtok call) */

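        if (string)
                str = string;                         // the original string, used on the first call
        else
#ifdef _MT
                str = ptd->_token;
#else  /* _MT */
                str = nextoken;                       // the remaining string, used when the first argument is NULL
#endif  /* _MT */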

        /* Find beginning of token (skip over leading delimiters). Note that
         * there is no token iff this loop sets str to point to the terminal
         * null (*str == '\0') */
        while ( (map[*str >> 3] & (1 << (*str & 7))) && *str )
                str++;

        string = str;                                 // string now marks the result for the remaining string: the start of the token to return

        /* Find the end of the token. If it is not the end of the string,
         * put a null there. */
        // This is the core of the processing: find a delimiter and set it to '\0'. The '\0' of course also stays in the returned string.
        for ( ; *str ; str++ )
                if ( map[*str >> 3] & (1 << (*str & 7)) ) {
                        *str++ = '\0';                // this is where the contents of the string get modified ①
                        break;
                }

        /* Update nextoken (or the corresponding field in the per-thread data
         * structure */
#ifdef