Week 6

Click to Open TOC

Week 6
- Question
- Answer
  - Q1
  - Q2
  - Q3
  - Q4
  - Q5
  - Q6
  - Q7

Question

Answer

Q1

Mark

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Example

Length i	1	2	3	4
Price Pi	15	20	33	36
Limit Li	2	1	1	1

Example : length = 4
4 => 36
1,3 => 48
2,1,1 => 50 best

與上周的做法比較：
1,1,1,1 => 60
但此作法會超過

L i

的限制，需改良

Q2

xian

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

i	0	1	2	3	4	5	6	7
$p_{i}$		0.04	0.06	0.08	0.02	0.10	0.12	0.14
$q_{i}$	0.06	0.06	0.06	0.06	0.05	0.05	0.05	0.05

【解法】

Let e[i,j] = the value of optimal solution for

k_{i}, \dots, k_{j}

，而每一個最佳解的root會有左右子樹，左右子樹也分別有它的最佳解，可以利用DP二維矩陣找出最佳解，並利用例外一個二維矩陣r[i,j]來記錄root為多少時有最佳解。

\begin{aligned} e [i, j] & = {\begin{cases} m i n_{i \leq r \leq j} {e [i, r - 1] + e [r + 1, j] + w_{i, j}} \\ q_{i - 1} i f j = i - 1 \end{cases} \\ r [i, j] & = a value of r that gives the minimum \\ w_{i, j} & = Σ_{i \leq k \leq j} p_{k} + Σ_{i - 1 \leq k \leq j} q_{k} = {\begin{cases} w_{i, j - 1} + p_{j} + q_{j} \\ q_{i} if j = i - 1 \end{cases} \end{aligned}

【
$w_{i, j}$ 】

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

【
$e [i, j]$ 】

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

【
$r [i, j]$ 】

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

【ANS】

By table of e[i,j], the cost of an optimal binary search tree is 3.12.

By table of r[i,j], the structure of an optimal binary search tree is the image below.

Image Not Showing Possible Reasons

The image file may be corrupted
The server hosting the image is unavailable
The image path is incorrect
The image format is not supported

Learn More →

Q3

LXS

Edit distance

Given two sequences

x [1. . m]

and

y [1. . n]

and set of transformation operation costs, the edit distance from

x

y

is the cost of the least expensive operation sequence that transforms

x

y

. Describe a dynamic-programming algorithm that finds the edit distance from

x [1. . m]

y [1. . n]

and prints an optimal operation sequence. Analyze the running time and space requirements of your algorithm.

Operations:

dist[i][j] 表示

x [1. . i]

轉換至

y [1. . j]

的 Edit distance
題目目標為 dist[m][n] 即

x [1. . m]

轉換至

y [1. . n]

的 Edit distance

Insert a character into ith of x
dist[i][j] = dist[i][j-1]+cost[INSERT]
Delete ith character of x
dist[i][j] = dist[i-1][j]+cost[DELETE]
Replace a character from x by another character
dist[i][j]= min(dist[i][j-1], dist[i-1][j], dist[i-1][j-1])+cost[REPLACE]
Twiddle two adjacent characters from x
dist[i][j] = min(dist[i][j], dist[i-2][j-2]+cost[TWIDDLE]);
Twiddle 不一定是最小 Distance 因此兩者取 min

Generalize:

d i s t [i] [j] = min {\begin{cases} d i s t [i - 1] [j - 1] + cost(replace) \\ d i s t [i - 1] [j] + cost(delete) \\ d i s t [i] [j - 1] + cost(insert) \\ d i s t [i - 2] [j - 2] + cost(twiddle)) & if x [1. . i - 1] = y [1. . j] and x [1. . i] = y [1. . j - 1] \\ d i s t [i - 1, j - 1] & if x [1. . i] = y [1. . j] \end{cases}

Boundary:
當 i==0 或 j==0 即其中一者為空字串，最小 Edit distance 為 Insert/Delete 的次數。




for i in range(1, n+1):
    dist[0][i] = dist[0][i-1] + cost['INSERT']
for i in range(1, m+1):
    dist[i][0] = dist[i-1][0] + cost['DELETE']

Time complexity:

O (m + n)

Find Distance:





















dist = [[0 for i in range(n+1)] for j in range(m+1)]
setBoundary(dist)
for i in range(1, m+1):
    for j in range(1, n+1):
        minCost = cost['INFINITY']
        if x[i] == y[j]:
            dist[i][j] = dist[i-1][j-1]
            operations[i][j] = 'NONE'
        else:
            if i > 1 and j > 1 and (x[i-1], y[j]) == (x[i], y[j-1]) and dist[i-2][i-2] + cost['TWIDDLE'] < minCost:
                dist[i][j] = minCost = dist[i-2][i-2] + cost['TWIDDLE']
                operations[i][j] = 'TWIDDLE'
            if dist[i-1][j] + cost['DELETE'] < minCost:
                dist[i][j] = minCost = dist[i-1][j] + cost['DELETE']
                operations[i][j] = 'DELETE'
            if dist[i][j-1] + cost['INSERT'] < minCost:
                dist[i][j] = minCost = dist[i][j-1] + cost['INSERT']
                operations[i][j] = 'INSERT'
            if dist[i-1][j-1] + cost['REPLACE'] < minCost:
                dist[i][j] = minCost = dist[i-1][j-1] + cost['REPLACE']
                operations[i][j] = 'REPLACE'

Time complexity:

O (m n)

Space complexity:

O (m n)

Backtracking:


















i, j = m, n
while dist[i][j] > 0:
    match operations[i][j]:
        case 'NONE':
            i, j = i-1, j-1
        case 'INSERT':
            buf.append(f'Insert {y[j]} to x.')
            j -= 1
        case 'DELETE':
            buf.append(f'Delete {x[i]} from x.')
            i -= 1
        case 'REPLACE':
            buf.append(f'Replace {x[i]} with {y[j]}')
            i, j = i-1, j-1
        case 'TWIDDLE':
            buf.append(f'Twiddle {x[i]} and {x[i-1]}')
            i, j = i-2, j-2
print(i) for i in reversed(buf)

Time complexity:

O (m + n)

Space complexity:

O (m n)

Q4

songchiu

透過

c [i, j]

來存放目前為止，字串

x

、字串

y

共有幾個相同的數

【式子】

\begin{aligned} c [i, j] & = 0 if i=0, or j=0 \\ = c [i - 1, j - 1] + 1 if i, j>0 and x_{i} = y_{i} \\ = m a x (c [i, j - 1], c [i - 1, j]) if i, j>0 and x_{i} \neq y_{i} \end{aligned}

c[]	B	A	D	B	C	B	A
A	0	1	1 $\leftarrow$	1 $\leftarrow$	1 $\leftarrow$	1 $\leftarrow$	1
B	1 $_{#}$	1 $\leftarrow↑$	1 $\leftarrow↑$	2 $↖$	2 $\leftarrow$	2 $↖$	2 $\leftarrow$
A	1 $↑$	2 $↖$	2 $\leftarrow$	2 $\leftarrow↑$	2 $\leftarrow↑$	2 $\leftarrow↑$	3 $↖$
C	1 $↑$	2 $↑$	2 $\leftarrow↑$	2 $\leftarrow↑$	3 $↖$	3 $\leftarrow$	3 $\leftarrow↑$
D	1 $↑$	2 $↑$	3 $↖$	3 $\leftarrow$	3 $\leftarrow↑$	3 $\leftarrow↑$	3 $\leftarrow↑$
B	1	2 $↑$	3 $↑$	4 $↖$	4 $\leftarrow$	4 $↖$	4 $\leftarrow$
C	1 $↑$	2 $↑$	3 $↑$	4 $↑$	5 $↖$	5 $\leftarrow$	5 $\leftarrow$

由上表可知，LCS為《

BADBC

》，長度為

5

(順序請看綠色的字)

Q5

chung

【題目】
An algorithm to solve the LCS problem of two strings X and Y has space complexity O(|X||Y|) typically.

(a) Design an algorithm to find the length of LCS of two string X and Y just using only 2⋅min(|X|,|Y|) cells for working space.

【解題思路】

LCS在計算某一格的解

c [i] [j]

僅需要使用到其上方

c [i - 1] [j]

、左方

c [i] [j - 1]

或左上方

c [i - 1] [j - 1]

格子的值，因此僅需要利用兩行的陣列(上一行與目前此行)空間即可算出答案，並將較短的字串設為index j，即可讓空間複雜度為2⋅min(|X|,|Y|)

【蘇都扣】


















lcs-length(X, Y){
    m <- length[X]
    n <- length[Y]
    for j <- 0 to m
        row_0[j] <- 0
    row_1[0] <- 0
    for i <- 1 to n
        for j <- 1 to m
            if (x_i == y_j)
                row_1[j] = row_0[j-1]+1 
            else if (row_0[j] >= row_1[j-1])
               row_1[j] <- row_0[j]
            else
               row_1[j] <- row_1[j-1]
        for k <- 1 to m
            row_0[k] = row_1[k]
    return row_1[n]
}

(b) Design an algorithm to find the length of LCS of two string X and Y just using only 1+min(|X|,|Y|) cells for working space.

【解題思路】

計算

c [i] [j]

需要使用到

c [i - 1] [j]

(上)、

c [i] [j - 1]

(左)或

c [i - 1] [j - 1]

(左上)的值，LCS的計算順序是由左到右、由上到下，因此可將陣列精簡成一行，並使用額外常數空間暫存左上方

c [i - 1] [j - 1]

的值，因此可讓空間複雜度減少為1+min(|X|,|Y|)

【蘇都扣】
















lcs-length(X, Y){
    m <- length[X]
    n <- length[Y]
    for j <- 0 to m
        row[j] <- 0
    for i <- 1 to n
        s <- 0
        for j <- 1 to m
            t <- row[j]
            if (x_i == y_j)
                row[j] <- s + 1 
            else if (row[j] < row[j-1])
               row[j] <- row[j-1]
            s <- t 
    return row[n]
}

Q6

yee

【題目】

Let

σ

be an alphabet set,

β

denote the blank character in

σ

, and a measure function

F : σ \times σ \to R

. Where

F

is defined as followings, for any

x

and

y

σ

F (x, y) < 0

x \neq y

and

F (x, y) > 0

x = y

; whereas

F (β, β) = - \infty

. Given

X

and

Y

be two strings of

σ *

, let

X^{'}

and

Y^{'}

denote two new strings made by inserting some

β

into

X

and

Y

respectively. The similarity of

X

and

Y

is defined by measuring the maximal value of

\sum 𝐹 (𝑎_{𝑖} ̇, 𝑏_{𝑖})

𝑎_{𝑖} 𝜖 𝑋, 𝑏_{𝑖} 𝜖 𝑌

among all possible

X^{'}

and

Y^{'}

【解題思路】

與LCS很像，只是會插入空格來增加相似度，作法差不多，只是要考慮空格能插在最前面，因此建表時多加一列跟一行表示一開始空格再來建表即可

a. Design an algorithm to find the similarity of X and Y.

【遞迴式】

s c o r e [0, 0] = 0 s c o r e [i, j]_{1 \leq i \leq m, j = 0} = s c o r e [i - 1] [j] + F (Y [i], β) s c o r e [i, j]_{1 \leq j \leq n, i = 0} = s c o r e [i] [j - 1] + F (β, X [j]) s c o r e [i, j]_{1 \leq i \leq m, 1 \leq j \leq n} = m a x {\begin{cases} s c o r e [i - 1, j - 1] + 𝐹 (Y [i], X [j]) \\ s c o r e [i, j - 1] + 𝐹 (β, X [j]) \\ s c o r e [i - 1, j] + 𝐹 (Y [i], β) \end{cases}

【蘇都扣】

















int score[Y_length + 1][X_length + 1]; 
score[0][0] = 0;
for i = 1 to Y_length
    score[i][0] = score[i - 1][0] + F(Y[i],' ');
for j = 1 to X_length
    score[0][j] = score[0][j - 1] + F(' ',X[j]);

for i = 1 to Y_length
    for j = 1 to X_length
        v1 = score[i - 1][j - 1] + F(Y[i],X[j]);
        v2 = score[i - 1][j] + F(Y[i],' ');
        v3 = score[i][j - 1] + F(' ',X[j]);
        score[i][j] = max(v1,v2,v3);

//F為題目所給計分的function 
//F(x, y) < 0 if x != y, F(x, y) > 0 if x == y, F(β,β) = -∞

【複雜度】

Time :

Θ (m n)

Space :

Θ (m n)

b. Design an algorithm that describe where the blank characters are inserted to get the similarity.

【想法】

透過建好的表(score)由最右下角回推比對左上左邊及上方的分數即可知道要於何處插入空格

【蘇都扣】













//score 為上方填好的表
traceBack(i, j)
    if i or j == 0
        return
    if X[j] == Y[i]
        traceBack(i - 1, j - 1)
    else 
        if(score[i - 1][j] < score[i][j - 1])
            print("Insert the blank after the %d in Y.\n",j);
            traceBack(i, j - 1)
        else
            printf("Insert the blank after the %d in X.\n",i);    
            traceBack(i - 1, j)

【複雜度】

最壞可能就是全部都空格(m+n)，因此時間複雜度為

O (m + n)

而空間複雜度因為要建表仍然是

Θ (m n)

【C++實作】

























































































#include<iostream>
using namespace std;

int F(char a, char b);
int main(){

    string X,Y;
    cin >> X >> Y;
    X.insert(0,"1");
    Y.insert(0,"1");
    int X_length = X.length() ;
    int Y_length = Y.length() ;
    int score[Y_length][X_length];
    score[0][0] = 0;
    for( int i = 1;i < Y_length; i++)
        score[i][0] = score[i - 1][0] + F(Y[i],' ');
    
    for(int j = 1;j < X_length; j++)
        score[0][j] = score[0][j - 1] + F(' ',X[j]);

    for( int i = 1;i < Y_length; i++)
        for(int j = 1;j < X_length; j++){
            int v1 = score[i - 1][j - 1] + F(Y[i],X[j]);
            int v2 = score[i - 1][j] + F(Y[i],' ');
            int v3 = score[i][j - 1] + F(' ',X[j]);
            score[i][j] = max(max(v1,v2),v3);
        }
    int i = Y_length -1 , j = X_length - 1;
    string x , y;
    while(1){
        if(i == 0 && j == 0)
            break;
        else if (i == 0 && j == 1){
            x += X[j];
            y += '-';
            //printf("Insert the blank after the %d in Y.\n",i);
            break;
        }
        else if (i == 1 && j == 0){
            y += Y[i];
            x += '-';
            //printf("Insert the blank after the %d in X.\n",j);
            break;
        }
        if(Y[i] == X[j] || score[i - 1][j - 1] >= max(score[i - 1][j],score[i][j - 1])){
            x += X[j--];
            y += Y[i--];
        }
        else{
            if(score[i - 1][j] < score[i][j - 1]){
                x += X[j];
                y += '-';
                //printf("Insert the blank after the %d in Y.\n",j--);印出何處空白
                j--;
            }
            else{
                y += Y[i];
                x += '-';
                //printf("Insert the blank after the %d in X.\n",i--);印出何處空白  
                i--;      
            }
        }
    }
    for(int i = x.length(); i >= 0;i--)
        printf("%c ",x[i]);
    printf("\n");
    for(int i = y.length(); i >= 0;i--)
        printf("%c ",y[i]);
    
    //印出表的結果
    // for( int i = 0;i < Y_length; i++){
    //     for(int j = 0;j < X_length; j++)
    //         printf("%d ",score[i][j]);
    //     printf("\n");
    // }
}

int F(char a, char b){
    if (a == ' ' || b == ' ')
        return -1;
    else{
        if(a == b)
            return 1;
        else
            return -1;
    }
}
//TGTTACGG GGTTGACTA
//ATCC GACG

Q7

xian

【解題思路】

因為在計算LCS的c table時，如果

X [i] = Y [j]

，則

c [i] [j] = c [i - 1] [j - 1] + 1 (左 上 角)

，

c [i - 1] [j] \geq c [i] [j - 1]

，則

c [i] [j] = c [i - 1] [j]

(上)，

c [i] [j - 1] > c [i - 1] [j]

，則

c [i] [j] = c [i] [j - 1]

(左)，由上面規則可知( from課本p.394 b table的code )，我們只需要考慮的點有左上、上、左這三個點，因此我們重建LCS時只需考慮這三點，即可回推所需要的解。

【蘇都扣】













let i = m, j = n;
print_ans(c,X,Y,i,j){
    if (c[i][j] or i or j) is 0
        return;
        
    if X[i] == Y[i]    //往左上角前進
        print_ans(c,X,Y,i-1,j-1);
        print X[i];
    else c[i-1][j] >= c[i][j+1]
        print_ans(c,X,Y,i-1,j);
    else    //c[i-1][j] < c[i][j+1]
        print_ans(c,X,Y,i,j-1);
}

【時間複雜度】

因為演算法一次只會向上一格，或是向左一格，所以在考慮最差的情形下，我們會從最右下角找的最左上角(m+n步)，因此時間複雜度為

O (m + n)

。

Week 6

Question

Check

Answer

Q1

Example

Q2

【解法】

【wi,j】

【e[i,j]】

【r[i,j]】

【ANS】

Q3

Edit distance

Q4

Q5

(a) Design an algorithm to find the length of LCS of two string X and Y just using only 2⋅min(|X|,|Y|) cells for working space.

【解題思路】

【蘇都扣】

(b) Design an algorithm to find the length of LCS of two string X and Y just using only 1+min(|X|,|Y|) cells for working space.

【解題思路】

【蘇都扣】

Q6

【解題思路】

a. Design an algorithm to find the similarity of X and Y.

【遞迴式】

【蘇都扣】

【複雜度】

b. Design an algorithm that describe where the blank characters are inserted to get the similarity.

【想法】

【蘇都扣】

【複雜度】

Q7

【解題思路】

【蘇都扣】

【時間複雜度】

Read more

Week 17

補考 10

Week 7

Week 16

【
$w_{i, j}$ 】

【
$e [i, j]$ 】

【
$r [i, j]$ 】