天天看點

Leetcode 187. Repeated DNA Sequences

原題:

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
      

解決方法:

我采用了最簡單的一種方法,用一個哈希表來記錄基因串出現的次數,達到兩次即将其加入到結果集中。

還有很多優化的方法,但總體思路都差不多,這裡就不一一列舉了。

代碼:

vector<string> findRepeatedDnaSequences(string s) {
        vector<string> res;
        map<string,int> m;
        int n = s.size();
        if (n < 10)
            return res;
        string seq = s.substr(0, 10);
        m[seq]++;
        for(int i = 10; i < n; i++){
            seq.erase(0, 1);
            seq += s[i];
            ++m[seq];
            if (m[seq] == 2)
                res.push_back(seq);
        }
        return res;
    }