#P4059. [Code+#1] 找爸爸

[Code+#1] 找爸爸

Description

Xiao A has been looking for his own father recently. How? By DNA matching.

Xiao A has his own method for comparing DNA sequences. The goal is to maximize the similarity score between two DNA sequences. The steps are as follows:

  1. Given two DNA sequences, the first of length nn and the second of length mm.
  2. Insert any number of gaps at any positions in the two sequences so that the two strings have the same length.
  3. Match position by position. If at a position neither character is a gap, let the first be xx and the second be yy, then their similarity is defined as d(x,y)d(x, y). For any maximal contiguous block of gaps of length kk in either sequence, we define its similarity as g(k)=AB(k1)g(k) = -A - B(k-1).

The final similarity score of the two sequences is the sum of all d(x,y)d(x, y) values plus the similarity of all maximal gap blocks.

Now Xiao A has, by some mysterious means, obtained a segment of Xiao B’s DNA sequence. He asks you to compute the maximum similarity score between Xiao A’s DNA sequence and Xiao B’s DNA sequence.

Input Format

The first line contains a string representing Xiao A’s DNA sequence. The second line contains a string representing Xiao B’s DNA sequence. The next 44 lines each contain 44 integers separated by spaces, representing the dd array in the following order.

d(A,A) d(A,T) d(A,G) d(A,C)
d(T,A) d(T,T) d(T,G) d(T,C)
d(G,A) d(G,T) d(G,G) d(G,C)
d(C,A) d(C,T) d(C,G) d(C,C)

The last line contains two positive integers A,BA, B separated by spaces, as described above.

Output Format

Output a single line: the maximum similarity score of the two sequences.

ATGG
ATCC
5 -4 -4 -4 
-4 5 -4 -4 
-4 -4 5 -4 
-4 -4 -4 5 
2 1
4

Hint

Sample explanation.

First, pad the sequences as follows (“-” denotes a gap):

ATGG--
AT--CC

Then the sum of all d(x,y)d(x, y) is d(A,A)+d(T,T)=10d(A, A) + d(T, T) = 10. The sum of the similarity of all maximal contiguous gap blocks is g(2)+g(2)=6g(2) + g(2) = -6. The total is 44, which can be verified to be the maximum similarity.

For all test points, 0<B<A10000 < B < A \le 1000, 1000d(x,y)1000-1000 \le d(x, y) \le 1000, d(x,y)=d(y,x)d(x, y) = d(y, x), and each sequence contains only the four characters {A,T,G,C}\{A, T, G, C\}.

Test point ID n+mn+m range Special notes
11 n=m=1n = m = 1 No special requirements.
22 n+m15n+m \le 15 ^
33 n+m300n+m \le 300
44 ^
55 n+m3000n+m \le 3000 Sequences contain only one type of character.
66 ^ No special requirements.
77 ^
88
99
1010

From CodePlus 2017 November Contest, proudly presented by the Student Algorithm and Competition Association, Department of Computer Science and Technology, Tsinghua University.

Credit: Idea/Lu Zhengrong; Problem setter/Lu Zhengrong; Tester/He Haotian.

Git Repo: https://git.thusaac.org/publish/CodePlus201711

Thanks to Tencent for supporting this contest.

Translated by ChatGPT 5