Minimum edit distance (DP-11)

The levenshtein distance or Edit distance between two strings is the minimum number of single-character edits (Insertions, deletions and replacements) required to change on word to the other considering that each operation has one unit cost.

Given two strings, you need to find the minimum edit distance.

Sample Input: S1 = “sunday”, S2 = “saturday”
Expected Output: 3
Explanation: Delete ‘a’, ‘t’ in S2 and replace ‘r’ with ‘n’ in S2.

The problem has both optimal substructure and overlapping subproblem. To convert S1[1…m] to S2[1…n], by performing edit operations on S1, we need to convert S1[1…i] to S2[1…j] where 1<=i<=m and 1<=j<=n.

Algorithm:

  1. We consider 3 cases:
    Case 1. Last character of S1 and S2 are same. In this case nothing needs to be done. We can simple move to the next characters of the strings.

    Case 2. Last characters are different. In this case, we need to return the minimum of the 3 operations. Insert last character of S2 to S1. This reduces the size of S2 by 1. Thus the subproblem gets reduced to S1[1…i] and S2[1…(j-1)]. Delete last character of S1. This reduces the size of S1 by 1 and the subproblem becomes S1 [1…(i-1)] and S2 [1…j]. Replace current character of S1 with S2. This reduces the size of both S1 and S2 by 1 and the subproblem becomes S1 [1…(i-1)] and S2 [1…(j-1)].

    Case 3. We have reached the end of either of the strings. If substring S1 is empty, we insert all the remaining characters of S2 to S1. And if S2 is empty, we insert all the remaining characters of S1 to S2.
  2. Below is a Bottom up approach of the given problem.
  3. We create a 2D array DP to store the results of the subproblems. The size of the array is (m+1, n+1) where m and n are the size of the strings S1 and S2.
  4. If string S1 is empty, the number of operations required is equal to the length of the other string and vice versa.
  5. For each character in both the strings, we compare them and if they are equal we move to the next character else we look for the above mentioned cases.
#include <bits/stdc++.h>
using namespace std;

int helper(string x, string y){
	int m = x.size();
	int n = y.size();
	vector<vector<int>> dp(m+1, vector<int>(n+1));
	for(int i=0; i<=m; i++){
		dp[i][0]=i;
	}
	for(int j=0; j<=n; j++){
		dp[0][j]=j;
	}
	for(int i=1; i<=m; i++){
		for(int j=1; j<=n; j++){
			if(x[i-1]==y[j-1]){
				dp[i][j]=dp[i-1][j-1];
			}else{
				dp[i][j]=1+min(dp[i-1][j-1], min(dp[i-1][j], dp[i][j-1]));
			}
		}
	}
	return dp[m][n];
}

int main(){
	string str1 = "sunday"; 
    string str2 = "saturday"; 
    cout<<helper(str1, str2);
	return 0;
}

Also, check out another popular interview problem Coin change Problem.

2 comments

Leave a Reply

Your email address will not be published.