题目:
Manuel thinks that Diego is his long lost brother. But Diego thinks Manuel is wrong, and to prove it, he got DNA samples from himself and Manuel. Now Diego has given you the DNA samples and it is your task to say whether they are brothers or not.
The DNA samples of Diego and Manuel are strings A and B, both have length n (1≤n≤10^5) and consist of only the characters 'A', 'T', 'G' and 'C'. If there is a common subsequence of A and B that has length greater than or equal to 0.99×n, then Diego and Manuel are brothers, in other case, they are not.
Input
The input consists of two lines with strings A and B, respectively.
Output
You should output a single line with "Long lost brothers D:" (without quotes) if Diego and Manuel are brothers, and "Not brothers :(" (without quotes) if they are not.
Examples
input
GAATTGCGTACAATGC
GAATTGCGTACAATGC
output
Long lost brothers D:
input
CCATAGAGAA
CGATAGAGAA
output
Not brothers :(
Note
A subsequence of a string X is any string that you can get by removing any number of characters from X.
A common subsequence of strings X and Y is a string that is a subsequence of both X and Y.
题目大意就是给出两个序列,找他们的最长公共子序列,然后判断这个子序列的长度是否大于原序列的0.99。
这是一个很典型的LCS问题,但是,当我尝试使用LCS的暴力dp(复杂度是N^2)去做的话,很显然是超时的。因为题目的数据规模为10^5。然后当时训练的时候想着是把它转换为LIS,那不就是NlogN了吗?但是事实证明这样子还是会TLE,我不知道是我代码有问题还是我的复杂度估算错了,反正就是过不了。
附:转换为LIS后仍然TLE的代码:
#include <bits/stdc++.h>
using namespace std;
#define MAXN 100002
#define INF (1 << 30)
string a, b;
int len;
vector<int> dp;
vector<int> dp_lis;
int vec[4][MAXN];
int js[4] = {0};
//A->0
//T->1
//C->2
//G->3
int ans = 0;
void lis()
{
dp_lis.resize(dp.size() + 1);
int sz = dp.size();
fill(dp_lis.begin(), dp_lis.end(), INF);
//cout<<"jsdp:"<<jsdp<<endl;
for (int i = 0; i < sz; ++i)
{
*lower_bound(dp_lis.begin(), dp_lis.end(), dp[i]) = dp[i];
}
ans = lower_bound(dp_lis.begin(), dp_lis.end(), INF) - dp_lis.begin();
}
void solve()
{
int idx;
for (int i = len - 1; i >= 0; --i)
{
if (b[i] == 'A')
idx = 0;
else if (b[i] == 'T')
idx = 1;
else if (b[i] == 'C')
idx = 2;
else if (b[i] == 'G')
idx = 3;
vec[idx][js[idx]++] = i;
}
for (int i = 0; i < len; ++i)
{
if (a[i] == 'A')
idx = 0;
else if (a[i] == 'T')
idx = 1;
else if (a[i] == 'C')
idx = 2;
else if (a[i] == 'G')
idx = 3;
int sz = js[idx];
for (int j = 0; j < sz; ++j)
{
dp.emplace_back(vec[idx][j]);
}
}
lis();
}
int main()
{
ios::sync_with_stdio(false);
cin.tie(0), cout.tie(0);
cin >> a >> b;
len = a.length();
solve();
//cout<<ans<<endl;
if (double(ans) >= 0.99 * len)
cout << "Long lost brothers D:" << endl;
else
cout << "Not brothers :(" << endl;
}
然后,在补题的时候,上网查了一下人家的方法,那真的是妙啊!既然这个问题正面求解这么复杂,而分析知,题目只是要求判断最多删除1%的的时候的LCS长度是否≥0.99*len,因此我们可以对删除的字母的数量进行dp。
设dp[i][j]表示dp[i][j]为当a删除i个字符, b删除j个字符时,LCS的最长长度
则可以进行
dp[i+1][j] = max(dp[i+1][j], dp[i][j]);
dp[i][j+1] = max(dp[i][j+1], dp[i][j]);
ans = max(ans, dp[i][j]);
最后再判断ans是否满足条件即可。
附:AC代码
#include <bits/stdc++.h>
using namespace std;
#define MAXN 100005
int dp[MAXN / 100 + 5][MAXN / 100 + 5]; //对删除的字符数进行dp
int main()
{
ios::sync_with_stdio(false);
cin.tie(0), cout.tie(0);
string a, b;
cin >> a >> b;
int len = a.length();
int mx = len / 100 + 1; //最多删除的字符的数量
int ans=0;
for (int i = 0; i <= mx; ++i)
{
for (int j = 0; j <= mx; ++j)
{
//i、j分别为a、b串删除的字符数量, dp[i][j]为当a删除i个字符, b删除j个字符时,LCS的最长长度
while (i + dp[i][j] < len && j + dp[i][j] < len && a[i + dp[i][j]] == b[j + dp[i][j]])
++dp[i][j];
dp[i+1][j] = max(dp[i+1][j], dp[i][j]);
dp[i][j+1] = max(dp[i][j+1], dp[i][j]);
ans = max(ans, dp[i][j]);
}
}
if(1.0*ans/(1.0*len)>=0.99)
cout<<"Long lost brothers D:"<<endl;
else cout<<"Not brothers :("<<endl;
}
转载请注明原文:https://www.longjin666.top/?p=1126
欢迎关注我的公众号“灯珑”,让我们一起了解更多的事物~