Tuesday, December 2, 2014

Longest Common Subsequence using dynamic programming in Java.

A subsequence is a sequence that is derived from a sequence such that all characters in subsequence are in same order as original sequence but may not be contiguous. For example, if we have a sequence ABDGTPWYZ, then BDTY is its subsequence, BDGT, ADTWZ are also its subsequences. So in longest common subsequence (LCS) there are 2 sequences given and we have to find length of longest subsequence that is present in both. For example, if we have 2 subsequences ABHKLMNYZ and XBMKLTNWZ then subsequence BKLNZ is longest subsequence that is common in both  and its length is 5.

If 2 sequences are  Xi and Yj, where i and j are their length repectively and LCS( Xi,Yj) is length of longest common subsequence, then below is recursive defination of LCS.

Explanation of above recursive function
1. If ith position character in seqeunce X is equal to character at jth position of sequence Y then length of subsequence increases by 1 and is equal to 1 + LCS of remaining i-1 characters of X and j-1 characters of Y
2. If ith position character in X is not equal to jth position character in Y then answer is maximum of LCS of i characters of X and j-1 characters of Y or LCS of i-1 charcters of X and j charcters of Y.
3. If length of either sequence becomes 0 comparaing then value of 0 is returned.

Java Code for bottoms up dynamic programming using tabulation is-
package com.anuj.dp;

import java.util.Scanner;

/*
 * Calculation of length of longest common 
 * subsequence of given two sequences using 
 * bottom up dynamic programming and tabulation.
 * @author Anuj
 * */

public class LCS {
    
    private static int findlcs(String seq1, String seq2){
        int x = seq1.length();
        int y = seq2.length();
        int [][] lcs = new int[x+1][y+1];
        
        for (int i = 0; i <= x; i++) {
            for (int j = 0; j <= y; j++) {
                if(i ==|| j == 0){
                    lcs[i][j0;
                }else if(seq1.charAt(i-1)==seq2.charAt(j-1)){
                    lcs[i][j+ lcs[i-1][j-1];
                }else if(seq1.charAt(i-1)!=seq2.charAt(j-1)){
                    lcs[i][j= max(lcs[i-1][j], lcs[i][j-1]);
                }
            }
        }
        return lcs[x][y];
    }
    
    private static int max(int a, int b){
        return (a > b)? a : b;
    }
    
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        String seq1 = sc.next();
        String seq2 = sc.next();
        System.out.println("sequence1="+seq1+" sequence2="+seq2);
        System.out.println("Length of longest common subsequence is:"+findlcs(seq1, seq2));
    }
}
Java2html
Input:
ASDGHJPL
WSFDHPLJ

Output:
sequence1=ASDGHJPL sequence2=WSFDHPLJ
Length of longest common subsequence is:5

Explanation: In above solution we starts comparing characters in sequences from beginning and apply above function rules for lcs at each comparison. The table that is constructed when program finish is:
If characters at i and j are equal then we add 1 to value present in i-1 and j-1 cell. If character at i and j position are not equal then we pick maximum of value at i-1, j or i, j-1. After table is filled completely we check last value in table for length of LCS.

We can also find longest common subsequence value using below code:
int a = 1, b = 1;
        System.out.print("Longest common sequence is:");
        while (a <= x && b <= y) {
            if(seq1.charAt(a-1)==seq2.charAt(b-1)){
                System.out.print(seq1.charAt(a-1));
                a++;
                b++;
            }else if(lcs[a+1][b> lcs[a][b+1]){
                a++;
            }else{
                b++;
            }
        }
Java2html
Add this code in findlcs() method before return statement. In this method we start from begining of sequences and check if character at that index is equal than it is part of LCS and print it and increment both sequences index by 1 and if character are not equal then value at one cell below is compared with one cell to right i.e. value at a+1, b is compared with a, b+1 and we go to right if right cell has greater value or cell below if below cell has greater value and then again characters of sequences are compared at new position and if characters at that position are equal it is printed and after loop finished we have value LCS printed on console.



No comments:

Post a Comment