天天看點

HDU2328 Corporate Identity(字元串哈希)

題目描述

Beside other services, ACM helps companies to clearly state their “corporate identity”, which includes company logo but also other signs, like trademarks. One of such companies is Internet Building Masters (IBM), which has recently asked ACM for a help with their new identity. IBM do not want to change their existing logos and trademarks completely, because their customers are used to the old ones. Therefore, ACM will only change existing trademarks instead of creating new ones.

After several other proposals, it was decided to take all existing trademarks and find the longest common sequence of letters that is contained in all of them. This sequence will be graphically emphasized to form a new logo. Then, the old trademarks may still be used while showing the new identity.

Your task is to find such a sequence.

Input

The input contains several tasks. Each task begins with a line containing a positive integer N, the number of trademarks (2 ≤ N ≤ 4000). The number is followed by N lines, each containing one trademark. Trademarks will be composed only from lowercase letters, the length of each trademark will be at least 1 and at most 200 characters.

After the last trademark, the next task begins. The last task is followed by a line containing zero.

Output

For each task, output a single line containing the longest string contained as a substring in all trademarks. If there are several strings of the same length, print the one that is lexicographically smallest. If there is no such non-empty string, output the words “IDENTITY LOST” instead.

Sample Input

3

aabbaabb

abbababb

bbbbbabb

2

xyz

abc

Sample Output

abb

IDENTITY LOST

題目大意

給你n個字元串,求這n個字元串字典序最小的最長公共子串。

題目分析

這道題我們可以用雜湊演算法來解決。

首先因為枚舉的最長公共子串的長度具有單調性,是以我們可以通過二分的方式來找出答案串的長度。然後尋找是否有該長度的公共子串。

尋找方法為:

枚舉出所有s1中該長度的子串,然後與剩下的n-1個字元串中所有該長度的子串進行比較。如果s1中有某個子串能與剩下的n-1個字元串中的某個子串比對成功,那麼說明該長度是可以的。

然後記錄下該串,與答案串進行比較:如果兩串長度不等,保留長度更長的進入答案串。如果長度相等,保留字典序更小的進入答案串。

代碼如下
#include <iostream>
#include <cstdio>
#include <cmath>
#include <string>
#include <cstring>
#include <map>
#include <queue>
#include <vector>
#include <set>
#include <algorithm>
#include<assert.h>
#include <iomanip>
#define ULL unsigned long long
#define PSI pair<string,int>
using namespace std;
const int N=5e3+5,M=205,P=131;
int n;
char s[N][M];		//儲存n個字元串
ULL h[N][M];		//記錄n個字元串的哈希值
ULL p[M];
string ans;			//答案串
ULL get(int k,int l,int r)		//計算第k個字元串[l,r]區間的哈希值(哈希模闆)
{
    return h[k][r]-h[k][l-1]*p[r-l+1];
}
bool check(int mid)				//檢查該長度是否合法
{
    bool final=false;			//計算最終是否有解
    int s1=strlen(s[1]+1);
    for(int i=1;i+mid-1<=s1;i++)			//枚舉s[1]中長度為mid的所有子串
    {
        bool st=true;						//記錄該次比對是否成功
        ULL key=get(1,i,i+mid-1);
        for(int j=2;j<=n;j++)
        {
            bool flag=false;				//記錄第j個子串和目前s[1]的子串是否能比對成功
            int len=strlen(s[j]+1); 
            for(int k=1;k+mid-1<=len;k++)
                if(get(j,k,k+mid-1)==key)	//比對成功
                {
                    flag=true;
                    break;
                }
            
            if(!flag) {st=false; break;} 
        }
        if(st)								//s[1]的該子串能比對成功
        {
            final=true;
            string ss;						//記錄s[1]的該子串
            for(int j=i;j<=i+mid-1;j++) ss+=s[1][j];
            
            if(ss.size()>ans.size()) ans=ss;//與答案串進行比較
            else if(ss.size()==ans.size()&&ans>ss) ans=ss;
        }
    }
    return final;
}
int main()
{
    while(cin>>n,n)
    {
        ans.clear();
        for(int i=1;i<=n;i++) cin>>(s[i]+1);
        p[0]=1;
        for(int i=1;i<M;i++) p[i]=p[i-1]*P;		//計算哈希值
        for(int i=1;i<=n;i++)
        {
            int len=strlen(s[i]+1);
            for(int j=1;j<=len;j++)
                h[i][j]=h[i][j-1]*P+s[i][j];
        }
        int l=0,r=M-1;				//二分最長公共子串的長度
        while(r>l)
        {
            int mid=(l+r+1)>>1;
            if(check(mid)) l=mid;
            else r=mid-1;
        }
        if(l) cout<<ans<<endl;	
        else puts("IDENTITY LOST");	//l=0說明這n個串沒有公共子串
    }
    return 0;
}