Monday, August 29, 2011

Avoiding dirty reads in a concurrent environment

Recently, I was facing the issue of handling concurrent access and modification of a database using JPA. Most of the time, inserts were happening based on dirty reads, or unique key violations occurred, leaving the data inconsistent. To overcome this, I used 2 separate things. First, I delegated a few tasks to the central db using before-insert triggers (you need to refresh the entity after persist when a pre-insert trigger is involved). Second, I introduced version-based modifications. Here is a nice article which explains this : http://java.dzone.com/articles/jpa-20-concurrency-and-locking.

One can use application-level optimistic locks or db row-level pessimistic locks. While using locks, you need to handle errors like lock timeouts or optimistic lock exceptions and do multiple retries with short random sleeps. It is important to remember that for most exceptions, JPA marks the transaction rollback-only, so one needs to begin a fresh transaction after the exception to commit properly.
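The retry-with-random-sleep idea can be wrapped in a small helper. Here is a minimal sketch (class and method names are my own, and a stub exception stands in for JPA's OptimisticLockException so the snippet stays self-contained):

```java
import java.util.Random;
import java.util.concurrent.Callable;

public class RetryTemplate {

    // Stand-in for javax.persistence.OptimisticLockException,
    // so this sketch compiles without a JPA provider on the classpath.
    public static class OptimisticLockException extends RuntimeException {}

    /**
     * Runs the given unit of work, retrying on optimistic lock failures
     * with a short random sleep between attempts. The work is expected to
     * begin and commit its own transaction, since JPA marks the old
     * transaction rollback-only once such an exception is thrown.
     */
    public static <T> T withRetries(Callable<T> work, int maxRetries) throws Exception {
        Random random = new Random();
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (OptimisticLockException e) {
                if (attempt >= maxRetries) throw e;   // give up after maxRetries
                Thread.sleep(50 + random.nextInt(200)); // short random back-off
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate a unit of work that fails twice before succeeding.
        final int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new OptimisticLockException();
            return "committed";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In real code the Callable would begin a fresh EntityManager transaction, do the read-modify-write, and commit; the rollback-only rule is exactly why the whole unit of work, not just the persist call, has to be retried.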

Saturday, July 30, 2011

Integrating code coverage (integration-test phase) with maven

This article is about integrating code coverage with maven (test or integration-test phase). Somehow, whenever I try to mess with maven, it gives me a hard time, probably because its documentation is poor and scattered.

So, after a day's research I came across 3 plugins : EMMA, Cobertura and Clover.
I will not go into the details of Clover as it is paid and has licensing issues (you need a licensed war, although you can get a 30-day trial war). I also found it difficult to integrate with tomcat.

I was pretty new to how code coverage is computed. After doing some research I figured out that it involves 4 steps :
1) Instrumentation : all classes are augmented with extra code which helps analyze line hits later. The plugin generates an instrumented jar and an additional metadata file (.em/.ec for EMMA, or a .ser file for Cobertura).
2) Running test cases/deployment on a J2EE container : running in the test phase is straightforward.
To deploy on tomcat, as in my case, you first need to set a system variable telling the plugin where to dump instrumented runtime data during the test phase on tomcat.
3) After tomcat is deployed, the junits are run. When tomcat stops, the plugin dumps the runtime data file which is used to generate the code coverage report.
4) Merge the dump files and generate reports in different formats like html, xml or txt.

Now, each plugin has a different way of dumping runtime data and initial instrumentation data. Let's discuss the approaches of EMMA and Cobertura (Note : please choose the maven "phase" according to your needs).

Cobertura

Before we look into cobertura, a word of caution. If your project uses spring and jdk proxies, please use EMMA instead, as in my case; Cobertura has problems with org.springframework.aop.framework.ProxyFactoryBean and jdk proxies. If you are using AOP proxies and have annotations at the class level and not the interface level, then you can use cobertura.

1) To instrument classes, use cobertura-maven-plugin.
Here is a snippet to instrument jars :


 
<profile>
 <id>cobertura</id>
 <dependencies>
  <dependency>
   <groupId>net.sourceforge.cobertura</groupId>
   <artifactId>cobertura</artifactId>
   <optional>true</optional>
   <version>1.9.4.1</version>
  </dependency>
 </dependencies>
 <build>
  <plugins>
   <plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>cobertura-maven-plugin</artifactId>
    <executions>
     <execution>
      <id>cobertura-instrument</id>
      <phase>process-classes</phase>
      <goals>
       <goal>instrument</goal>
      </goals>
     </execution>
    </executions>
   </plugin>
  </plugins>
 </build>
</profile>


This will generate a .ser file which contains metadata about each class. It is important that before deploying, the .ser contains metadata about all the classes you are interested in. Say you have 2 different jars instrumented in 2 different projects; use cobertura-merge before deploying on tomcat (if you are interested in code coverage of jar 1 inside project 2, a common library in my case). This can be done using an ant task (only needed to merge 2 different metadata .ser files beforehand; in a single project it is not needed). You can choose an appropriate phase before deployment. In my project, I ran it at the jar's verify stage.


<plugin>
 <artifactId>maven-antrun-plugin</artifactId>
 <version>1.6</version>
 <executions>
  <execution>
   <id>merge-ser-pre</id>
   <phase>verify</phase>
   <configuration>
    <tasks>
     <taskdef classpathref="maven.runtime.classpath"
     resource="tasks.properties" />
     <taskdef classpathref="maven.runtime.classpath"
     resource="net/sf/antcontrib/antcontrib.properties" />
     <echo message="Executing cobertura merge" />
     <cobertura-merge datafile="...">
      <fileset dir="..."> <!-- first project's ser location -->
       <include name="cobertura.ser" />
      </fileset>
      <fileset dir="..."> <!-- second project's ser location -->
       <include name="cobertura.ser" />
      </fileset>
     </cobertura-merge>
    </tasks>
   </configuration>
   <goals>
    <goal>run</goal>
   </goals>
  </execution>
 </executions>
 <dependencies>
  <dependency>
   <groupId>ant-contrib</groupId>
   <artifactId>ant-contrib</artifactId>
   <version>20020829</version>
  </dependency>
 </dependencies>
</plugin>
 


Now, to deploy on tomcat you need the following dependency in the war packaging :

 
 <dependency>
  <groupId>net.sourceforge.cobertura</groupId>
  <artifactId>cobertura</artifactId>
  <version>1.9.4.1</version>
  <type>jar</type>
  <scope>compile</scope>
 </dependency>

I am using the cargo plugin to start tomcat, so here is how to set the dump file system variable :

 <configuration>
  ...
  <container>
   <systemProperties>
    <net.sourceforge.cobertura.datafile>somelocation/cobertura.ser</net.sourceforge.cobertura.datafile>
   </systemProperties>
  </container>
 </configuration>


It is extremely important to note that this runtime dump .ser must be the same file as the previously instrumented metadata .ser. At runtime, cobertura should dump information into the same .ser which was instrumented initially during jar creation (or the merged .ser in the multi-project case), or else code coverage will show 100%.

Finally, we need to create a report from this .ser. If you have different .ser files from different projects computed separately, you can again use cobertura-merge to merge them into a final .ser.

Cobertura's html reports look visually much better than EMMA's. Apart from lines covered, the cobertura report also tells how many times each line was hit, which EMMA does not.


<plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-antrun-plugin</artifactId>
 <executions>
  <execution>
   <phase>verify</phase>
   <configuration>
    <tasks>
     <taskdef classpathref="maven.runtime.classpath"
     resource="tasks.properties" />
     <taskdef classpathref="maven.runtime.classpath"
     resource="net/sf/antcontrib/antcontrib.properties" />
     <available file="./1.ser" property="ser.file.exists" />
     <if>
      <equals arg1="${ser.file.exists}" arg2="true" />
      <then>
       <echo message="Executing cobertura report" />
       <mkdir dir="${project.build.directory}/site/cobertura" />
       <cobertura-report format="xml" srcdir="../java/src/main/java"
       destdir="${project.build.directory}/site/cobertura"
       datafile="./final.ser">
        <fileset dir="../commons/src/main/java" />
        <fileset dir="../algo/java/src/main/java" />
        <fileset dir="../practise/java/src/main/java" />
       </cobertura-report>
      </then>
      <else>
       <echo message="No SER file found."/>
      </else>
     </if>
    </tasks>
   </configuration>
   <goals>
    <goal>run</goal>
   </goals>
  </execution>
 </executions>
 <dependencies>
  <dependency>
   <groupId>ant-contrib</groupId>
   <artifactId>ant-contrib</artifactId>
   <version>20020829</version>
  </dependency>
 </dependencies>
</plugin>


EMMA
EMMA is relatively easier to integrate. For the initial instrumentation use :


<profile>
 <id>emma</id>
 <build>
  <plugins>
   <plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>emma-maven-plugin</artifactId>
    <version>1.0-alpha-3</version>
    <inherited>true</inherited>
    <executions>
     <execution>
      <phase>process-classes</phase>
      <goals>
       <goal>instrument</goal>
      </goals>
     </execution>
    </executions>
   </plugin>
  </plugins>
 </build>
</profile>


This will create a .em file. Unlike cobertura, which reuses the same .ser file to dump runtime information, EMMA does not want you to put this .em file in the tomcat container. So, just deploy your war on tomcat with your chosen new dump file path.
You will need the following dependency :


 
 <dependency>
  <groupId>emma</groupId>
  <artifactId>emma</artifactId>
  <version>2.1.5320</version>
  <type>jar</type>
  <scope>compile</scope>
 </dependency>


Here is the code in cargo :


 
  <container>
   <systemProperties>
    <emma.rt.control>false</emma.rt.control>
    <emma.coverage.out.file>${basedir}/target/final.ec</emma.coverage.out.file>
    <!-- <emma.rt.control.port>...</emma.rt.control.port> -->
   </systemProperties>
  </container>
  
 


If you have multiple projects to be built, emma will sometimes give you a "java.net.BindException: Address in use" error and not allow tomcat to start. This can be solved by using the emma.rt.control.port system variable and setting different values for different project jars. But a better approach is to disable the remote control altogether by setting emma.rt.control to false (that's why the port property is commented out above).

After tomcat shuts down, it will generate this final.ec.

And finally we need to merge (in the multi-project case) and create the final report. Here is the code :


 
 <dependency>
  <groupId>emma</groupId>
  <artifactId>emma</artifactId>
  <version>2.1.5320</version>
  <type>jar</type>
  <scope>compile</scope>
 </dependency>
 <dependency>
  <groupId>emma</groupId>
  <artifactId>emma_ant</artifactId>
  <version>2.1.5320</version>
  <type>jar</type>
  <scope>compile</scope>
 </dependency>

<plugins>
 <plugin>
  <artifactId>maven-antrun-plugin</artifactId>
  <version>1.6</version>
  <executions>
   <execution>
    <id>report emma</id>
    <phase>verify</phase>
    <configuration>
     <tasks>
      <taskdef classpathref="maven.runtime.classpath"
      resource="emma_ant.properties" />
      <sleep seconds="15" />
      <emma>
       <merge outfile="target/final.emma">
        <fileset dir="${basedir}">
         <include name="algo/webapp/target/final.ec" />
         <include name="algo/java/target/coverage.em" />
         <include name="practise/webapp/target/final.ec" />
         <include name="practise/java/target/coverage.em" />
        </fileset>
       </merge>
       <report>
        <fileset dir="${basedir}">
         <include name="target/final.emma" />
        </fileset>
        <txt outfile="target/coverage.txt" />
        <html outfile="target/coverage.html" />
       </report>
      </emma>
     </tasks>
    </configuration>
    <goals>
     <goal>run</goal>
    </goals>
   </execution>
  </executions>
  <dependencies>
   <dependency>
    <groupId>ant-contrib</groupId>
    <artifactId>ant-contrib</artifactId>
    <version>20020829</version>
   </dependency>
  </dependencies>
 </plugin>
</plugins>
    
   
  
 


Congrats, check your report :)

I spent almost 4 night-outs on integrating maven with cargo and integration tests. Hope you are able to do it quicker :)

Sunday, July 10, 2011

Comprehensive comparison of recursion, memoization and iteration


In this post, I am going to talk about iteration, recursion and memoization. It is important to understand when one should use recursion. Recursion is sometimes said to be "code beautification", because it improves readability but often suffers on performance. Let's take the famous problem of the Tower of Hanoi. It sounds difficult at first glance, but can be solved very easily in a recursive fashion. Here goes the code :

import java.util.Stack;

public class Hanoi {
 static int ctr=0;
 
 public static void main(String[] args) {
  Stack<String> a,b,c;
  a=new Stack<String>();
  b=new Stack<String>();
  c=new Stack<String>();
  Hanoi h = new Hanoi();
  a.push("A");
  int n=5; //no of discs
  for(int i=n;i>0;i--) a.push(i+""); 
  
  b.push("B");
  c.push("C");

  h.doit(a.size()-1,a,c,b);
  System.out.println("Total moves " + ctr);
 }

 void doit(int n,Stack<String> a,Stack<String> c,Stack<String> b) {
  ctr++;
  if(n==0) return;
  else {
   doit(n-1,a,b,c);
   System.out.print("Move plate "+n+" from "+a+" to "+c);
   c.push(a.pop());
   System.out.println("--> Move plate "+n+" from "+a+" to "+c);
   doit(n-1,b,c,a);
  }
 }
}

Give n>=30 and you can see that the computation time rises steeply.

So, an important question arises: when should one avoid recursion? There is no general rule, but I would suggest a thumb rule based on my experience. If at any point while computing (say state/value C) you can make a (somewhat greedy) decision about the next value P to be computed (generally rule-based, with some temporary data), and there is no need to come back to state C or reuse its value, you should go for iteration. You should generally use recursion when you need a whole state-space search to find the global optimum.

Let's take a few examples to explain it (simultaneously I am going to compare the performance of iteration, recursion and memoization where possible).

1) Fibonacci Series :

In this case, you can easily store the last 2 values and compute iteratively. Values Fn and Fn-1 are used to compute Fn+1, while Fn-2 can safely be discarded; it will not be used at a later point of time.

Type                        n     Output                  Time taken (in seconds)  Comment
Recursion                   40    1.02334155E8            1.484                    n>=40 takes huge time
Iteration                   1476  1.3069892237633987E308  0.0                      n>=1476, double overflows
Recursion with memoization  1476  1.3069892237633987E308  0.015                    n>=1476, double overflows

Here goes the code :
import java.util.Date;
import java.util.HashMap;

public class Fib {

 HashMap<Double,Double> fib = new HashMap<Double, Double>();
 public static void main(String args[]) {
  Fib f = new Fib();
  f.analyzeFibRecursion(40);
  f.analyzeFibIteration(1476);
  f.analyzeFibMemoizedRecursion(1476);
 }
 
 public double fibRecursion(double l) {
  if(l <= 1) return l;
  else return fibRecursion(l-1)+fibRecursion(l-2);
 }
 
 public void analyzeFibRecursion(double l) {
  Date start = new Date();
  double value = fibRecursion(l);
  Date end = new Date();
  System.out.println("Final Output : "+value);
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 public double fibIteration(double l) {
  if(l <= 1) return l;
  else {
   double f1 = 0;
   double f2 = 1;
   double f3 = 0;
   double i = 2;
   while(i<=l) {
    f3 = f2 + f1;
    f1 = f2;
    f2 = f3;
    i++;
   }
   return f3;
  }
 }
 
 public void analyzeFibIteration(double l) {
  Date start = new Date();
  double value = fibIteration(l);
  Date end = new Date();
  System.out.println("Final Output : "+value);
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 public double fibMemoizedRecursion(double l) {
  if(fib.containsKey(l)) return fib.get(l);
  else {
   double v = fibMemoizedRecursion(l-1)+fibMemoizedRecursion(l-2);
   fib.put(l, v);
   return v;
  }
 }
 
 public void analyzeFibMemoizedRecursion(double l) {
  Date start = new Date();
  fib.clear();
  fib.put(0d,0d);
  fib.put(1d,1d);
  double value = fibMemoizedRecursion(l);
  Date end = new Date();
  System.out.println("Final Output : "+value);
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
}

2) Binary Search :

At each point of time, you can choose to search the lower or the upper partition. Hence, no need to recurse; you don't need to come back to the current state again.

Type       Array size  Lookups   Time taken (in seconds)
Recursion  631900      10000000  10.797
Iteration  631900      10000000  4.281

import java.util.Date;
import java.util.Random;
import java.util.TreeSet;

public class BinarySearch {
 
 static int arr[];
 
 public static int RANGE = 1000000;
 public static int ATTEMPTS = 10000000;

 public static void main(String args[]) {
  Random r = new Random();
  TreeSet<Integer> t = new TreeSet<Integer>();
  for(int i=0;i<RANGE;i++) t.add(r.nextInt(RANGE));
  arr=new int[t.size()];
  Integer[] iarr = t.toArray(new Integer[0]);
  for(int i=0;i<iarr.length;i++) {
   arr[i]=iarr[i];
  }  
  System.out.println(t.size());
  
  analyzeIterativeBinarySearch();
  analyzeRecursiveBinarySearch();
 }
 
 private static void analyzeIterativeBinarySearch() {
  Random r = new Random();
  Date start,end;
  int idx;
  int toFind;
  start = new Date();

  for(int i=0;i<ATTEMPTS;i++) {
   toFind = r.nextInt(RANGE);
   idx = iterativeBinarySearch(arr,toFind);
   //System.out.println(toFind+" "+idx);
  }
  end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 private static int iterativeBinarySearch(int arr[],int val) {
  int mid,low,high;
  low = 0 ;
  high = arr.length-1;
  while(low<=high) {
   mid=(low+high)/2;
   if(arr[mid]==val) return mid;
   else if(val<arr[mid]) high=mid-1;
   else low=mid+1;
  }
  return -1;
 }
 
 private static void analyzeRecursiveBinarySearch() {
  Random r = new Random();
  Date start,end;
  int idx;
  int toFind;
  start = new Date();

  for(int i=0;i<ATTEMPTS;i++) {
   toFind = r.nextInt(RANGE);
   idx = recursiveBinarySearch(arr,toFind,0,arr.length-1);
   //System.out.println(toFind+" "+idx);
  }
  end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 public static int recursiveBinarySearch(int[] inArray,int num,int start,int end) {
  int pivot=(int)Math.floor((end-start)/2)+start;
  if(num==inArray[pivot]) return pivot;
  if(start==end) return -1; 
  if(num<=inArray[pivot]) return recursiveBinarySearch(inArray,num,start,pivot);
  else return recursiveBinarySearch(inArray,num,pivot+1,end);
 }
}


3) Binary Tree Search/Tree traversal :

BST search is similar to binary search, just on a tree, so we can go for iteration. The same applies to tries. On the contrary, a tree traversal needs to go over all the nodes, and hence an iterative traversal is overkill.

Let's examine the performance of tree search:

Type       Array size  Lookups  Time taken (in seconds)
Recursion  631647      1000000  61.718
Iteration  631647      1000000  44.86

While in the case of tree traversal, you can clearly see that the use of explicit stacks is overkill:

Type       Array size  Time taken (in seconds)
Recursion  631647      0.015
Iteration  631647      0.063

Here goes the code :

import java.util.Date;
import java.util.HashSet;
import java.util.Random;
import java.util.Stack;

public class TreeSearch {
 public static int RANGE = 1000000;
 public static int ATTEMPTS = 1000000;

 public static void main(String[] args) {
  
  Random r = new Random();
  HashSet<Double> h = new HashSet<Double>();
  for(int i=0;i<RANGE;i++) h.add((double)r.nextInt(RANGE));
  System.out.println(h.size());
  Tree t = new TreeSearch().new Tree();

  for(Double d : h) {
   t.insert(d.doubleValue());
  }
  
  analyzeRecursiveSearch(t);
  analyzeIterativeSearch(t);
  
  analyzeRecursiveBrowse(t);
  analyzeIterativeBrowse(t);
 }
 
 private static void analyzeRecursiveSearch(Tree t) {
  Random r = new Random();
  Date start,end;
  boolean found;
  double toFind;
  start = new Date();

  for(int i=0;i<ATTEMPTS;i++) {
   toFind = (double)r.nextInt(RANGE);
   found = t.recursiveSearch(t.root, toFind);
  }
  end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 private static void analyzeIterativeSearch(Tree t) {
  Random r = new Random();
  Date start,end;
  boolean found;
  double toFind;
  start = new Date();

  for(int i=0;i<ATTEMPTS;i++) {
   toFind = (double)r.nextInt(RANGE);
   found = t.iterativeSearch(t.root, toFind);
  }
  end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 private static void analyzeRecursiveBrowse(Tree t) {
  Date start = new Date();
  t.inorderRecursive(t.root);
  Date end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }
 
 private static void analyzeIterativeBrowse(Tree t) {
  Date start = new Date();
  t.inOrderIterative(t.root);
  Date end = new Date();
  System.out.println((end.getTime()-start.getTime())/1000.0+ " seconds");
 }

 class Node {
  Node left,right;
  double val;
  
  Node(double val) {
   this.val = val;
  }   
 }

 class Tree {
  Node root;
  Tree() {}

  void insert(double val) {
   root=insert(root,val);
  }
  
  Node insert(Node n,double val) {
   if(n==null) n = new Node(val);
   else if(n.val>val) n.left=insert(n.left,val);
   else n.right=insert(n.right,val);
   return n;
  }
  
  boolean recursiveSearch(Node n , double val) {
   if(n==null) return false;
   else {
    if(n.val==val) return true;
    else if(n.val>val) return recursiveSearch(n.left, val);
    else return recursiveSearch(n.right, val);
   }
  }
  
  boolean iterativeSearch(Node n , double val) {
   while(n!=null) {
    if(n.val==val) return true;
    else if(n.val>val) n=n.left;
    else n=n.right;
   }
   return false;
  }
  
  void inorderRecursive(Node n) {
   if(n==null) return;
   inorderRecursive(n.left);
   //System.out.print(n.val+" ");
   inorderRecursive(n.right);
  }
  
  public void inOrderIterative(Node n) {
   Stack<Node> s = new Stack<Node>(); 
   while (n !=null) {
     s.push(n);
     n = n.left;
   }
   while (!s.isEmpty()) {
    n = s.pop();
    //System.out.print(n.val+" ");
    n = n.right;
    while(n !=null) {
     s.push(n);
     n = n.left;
    } 
   } 
  }
 }
}

4) Matrix Chain Multiplication :

This is a classic example of dynamic programming. Iteration works only with a DP formulation; otherwise the brute force approach is plain recursion, which is very bad. But you can drastically improve performance by using recursion + memoization.

Type                     Length of p array  No. of p's solved  Time taken (in seconds)
Recursion                25                 500                64.999
Recursion + Memoization  25                 500                0.016
Iteration                25                 500                0.0

Code :

import java.util.Date;
import java.util.HashSet;
import java.util.Random;

public class MatrixMultiplication {
 public static int ATTEMPTS = 500;
 public static int RANGE = 25;

 public static void main(String args[]) {
  analyze();
 }

 private static void analyze() {
  Random r = new Random();
  Date start, end;
  int val;
  long totalTimeDP = 0, totalTimeRecursive = 0, totalTimeMemoized = 0;
  for (int i = 0; i < ATTEMPTS; i++) {
   HashSet<Integer> h = new HashSet<Integer>();
   for (int j = 0; j < RANGE; j++)
    h.add(r.nextInt(RANGE));
   h.remove(new Integer(0));
   int[] p = new int[h.size()];
   Integer[] iarr = h.toArray(new Integer[0]);
   for (int j = 0; j < iarr.length; j++) {
    p[j] = iarr[j];
   }
   start = new Date();
   val = dp(p);
   // System.out.println(val);
   end = new Date();
   totalTimeDP += (end.getTime() - start.getTime());

   start = new Date();
   int[][] m = new int[p.length][p.length];
   val = recursive(p, 1, p.length - 1, m);
   // System.out.println(val);
   end = new Date();
   totalTimeRecursive += (end.getTime() - start.getTime());

   start = new Date();
   m = new int[p.length][p.length];
   val = memoized(p, m);
   // System.out.println(val);
   end = new Date();
   totalTimeMemoized += (end.getTime() - start.getTime());
  }
  System.out.println(totalTimeDP / 1000.0 + " seconds");
  System.out.println(totalTimeRecursive / 1000.0 + " seconds");
  System.out.println(totalTimeMemoized / 1000.0 + " seconds");

 }

 public static int dp(int p[]) {
  int n = p.length - 1;

  int[][] m = new int[n + 1][n + 1];
  int[][] s = new int[n + 1][n + 1];

  for (int i = 1; i <= n; i++)
   m[i][i] = 0;

  for (int L = 2; L <= n; L++) {
   for (int i = 1; i <= n - L + 1; i++) {
    int j = i + L - 1;
    m[i][j] = Integer.MAX_VALUE;
    for (int k = i; k <= j - 1; k++) {
     // q = cost/scalar multiplications
     int q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j];
     if (q < m[i][j]) {
      m[i][j] = q;
      s[i][j] = k;
     }
    }
   }
  }
  return m[1][n];
 }

 public static int recursive(int p[], int i, int j, int[][] m) {
  if (i == j)
   return 0;
  m[i][j] = Integer.MAX_VALUE;

  for (int k = i; k <= j - 1; k++) {
   int q = recursive(p, i, k, m) + recursive(p, k + 1, j, m)
     + p[i - 1] * p[k] * p[j];
   if (q < m[i][j])
    m[i][j] = q;
  }

  return m[i][j];
 }

 public static int memoized(int p[], int[][] m) {
  for (int i = 1; i < m.length; i++) {
   for (int j = 1; j < m.length; j++) {
    m[i][j] = Integer.MAX_VALUE;
   }
  }
  return memoized(p, 1, m.length - 1, m);
 }

 public static int memoized(int p[], int i, int j, int[][] m) {
  if (m[i][j] < Integer.MAX_VALUE)
   return m[i][j];

  if (i == j)
   m[i][j] = 0;
  else {
   for (int k = i; k <= j - 1; k++) {
    int q = memoized(p, i, k, m) + memoized(p, k + 1, j, m)
      + p[i - 1] * p[k] * p[j];
    if (q < m[i][j])
     m[i][j] = q;
   }
  }
  return m[i][j];
 }
}

Tuesday, July 5, 2011

Finding all combinations/permutations using bit operators

This is the simple problem of finding all combinations/permutations of a given set of characters/words/numbers. What I want to illustrate is the beauty of bit operators; we usually tend to skip bit operations in code. The following code illustrates it :

public class CombinationPermutation {

 public String str = "abc";
 public int i = 0; //work as binary array
 public static void main(String[] args) {
  CombinationPermutation cp = new CombinationPermutation();
  cp.combinations();
  cp.permutations();
 }
 
 /**
  * Loop through numbers from 1 to 2^length
  * For each number if bit is set, append them
  */
 private void combinations() {
  int n = str.length();
  for (int i = 1; i < (1 << n); i++) {
   String s = "";
   for (int j = 0; j < n; j++) {
    if ((i & 1<<j) != 0) {
     // append chars for set bit
     s += str.charAt(j);
    }
   }
   System.out.println(s);
  }
 }
 
 /**
  * @param res
  * current permutation string formed
  * @param set
  * current set of characters from which permutation needs to be formed
  */
 private void permutations(String res,String set) {
  if(res.length()==set.length()) System.out.println(res);
  for(int j=0;j<set.length();j++) {
   if((i&1<<j) == 0) {
    i|=1<<j; //set available free 0 bit
    permutations(res+set.charAt(j),set);
    i&=~(1<< j); //unset that bit to 0
   }
  }
 }
 /**
  * For each combination found , find all its permutation
  * same code as combination
  */
 private void permutations() {
  int n = str.length();
  for (int i = 1; i < (1 << n); i++) {
   String s = "";
   for (int j = 0; j < n; j++) {
    if ((i & 1<<j) != 0) {
     // append chars for set bit
     s += str.charAt(j);
    }
   }
   permutations("",s);
  }
 }
}

Modeling Tree and DAG in MySQL vs in-memory implementation

A few days back I was stuck with the problem of modeling a directed acyclic graph (DAG) in a database.

I started by looking into various approaches for storing trees. If it is a tree with multiple children but only a single parent, then we can choose the nested set model. A comprehensive description for mysql is discussed here. I have personally implemented it and it works well.
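To make the nested set idea concrete, here is a minimal in-memory sketch (class and method names are mine, not from the linked article) of how the left/right numbers are assigned by a depth-first walk, and how the resulting intervals answer ancestor queries without any recursion — exactly the property the SQL queries rely on:

```java
import java.util.ArrayList;
import java.util.List;

public class NestedSet {

    static class Node {
        String name;
        int lft, rgt;                         // nested set boundaries
        List<Node> children = new ArrayList<Node>();

        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Depth-first numbering: a node's (lft, rgt) interval encloses
    // the intervals of all its descendants.
    static int number(Node n, int counter) {
        n.lft = counter++;
        for (Node c : n.children) counter = number(c, counter);
        n.rgt = counter++;
        return counter;
    }

    // Descendant check via interval containment -- no tree walk needed.
    static boolean isDescendant(Node a, Node ancestor) {
        return a.lft > ancestor.lft && a.rgt < ancestor.rgt;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a"), b = new Node("b"), a1 = new Node("a1");
        root.add(a).add(b);
        a.add(a1);
        number(root, 1);
        System.out.println(root.lft + " " + root.rgt); // root's interval spans everything
        System.out.println(isDescendant(a1, root));    // true
        System.out.println(isDescendant(b, a));        // false
    }
}
```

In the database version, lft and rgt become columns, and the descendant check becomes a simple BETWEEN predicate on them.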

But if you want to model a DAG kind of structure with multiple parents, then it gets difficult. A good article on this can be found here. It discusses various operations on nodes and links; operations like deleting a link get pretty complicated.

So, I thought an in-memory implementation might be a better alternative. I went ahead and wrote an in-memory tree implementation using DFS, which gives all the parents on the path to a given node (maybe through multiple paths), and all the children in the subtree of a given node.

Here is a brief outline of it :

import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;

//datastructure to store children and parents
public class Node {
 HashSet<Long> childrens = new HashSet<Long>(); 
 HashSet<Long> parents = new HashSet<Long>(); //parents in path
 public HashSet<Long> getChildrens() {
  return childrens;
 }
 public void setChildrens(HashSet<Long> childrens) {
  this.childrens = childrens;
 }
 public HashSet<Long> getParents() {
  return parents;
 }
 public void setParents(HashSet<Long> parents) {
  this.parents = parents;
 }
}
//adjacency list (Node id, neighbour id list)
private HashMap<Long,List<Long>> adjList= new HashMap<Long, List<Long>>();
//all the nodes hashed on node id
private HashMap<Long,Node> data = new HashMap<Long, Node>();

public void init() {
   //find nodes which have no inlink from adjacency list 
   HashSet<Long> parents = new HashSet<Long>(); //who are roots
   for(Long k : data.keySet()) {
      //add dummy nodes in adjacency list for nodes having no links
      if(!adjList.containsKey(k)) {
       adjList.put(k,new ArrayList<Long>()); 
      }
      boolean found = false;
      for(Long p : adjList.keySet()) {
         if(adjList.get(p).contains(k)) {
            found=true; break;
         }
      }
      if(!found) parents.add(k);
   }

   //start dfs
   HashSet<Long> parentNull = new HashSet<Long>();
   for(Long u : parents) {
      dfs(u,parentNull);
   }
}

/**
* @param node 
* current node
* @param parents
* set of all parents of given node
* @return all subtree nodes as children
*/
HashSet<Long> dfs(Long node, HashSet<Long> parents) {
   //cycle exists (unchecked, since dfs declares no throws clause)
   if(parents.contains(node)) {
      throw new RuntimeException("Contains cycle");
   }
   //add parents to current node
   data.get(node).getParents().addAll(parents);
   //construct new set of parent for passing to children
   HashSet<Long> newParent = new HashSet<Long>(data.get(node).getParents());
   newParent.add(node);
   //do a dfs to all its children
   for(Long neighbours : adjList.get(node)) {
      HashSet<Long> childrens = dfs(neighbours,newParent);
      data.get(node).getChildrens().addAll(childrens); 
   }
   //it has subtree..so return all children
   if(adjList.get(node).size()>0) {
      return data.get(node).getChildrens();
   }
   else {
      //add only current node
      HashSet<Long> ret = new HashSet<Long>();
      ret.add(node);
      return ret;
   }
}

Saturday, July 2, 2011

Finding parameterized type at runtime

Say I have a base class ParametrizeTest and a subclass of it, SubClass. You can easily find the runtime class of T for SubClass as follows :

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ParametrizeTest<T> {
 public void getType() {
  ParameterizedType type = (ParameterizedType) this.getClass().getGenericSuperclass();
  Type ts[] = type.getActualTypeArguments();
  Class c = (Class) ts[0];
  System.out.println(c.getName());  
 }
}

public class SubClass extends ParametrizeTest<String> {
 public static void main(String args[]) {
  new SubClass().getType();
 }
}


This typecast is valid only because you have written a class which explicitly extends the generic class. Otherwise, have you wondered whether this will work :

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;


public class ParametrizeTest<T> {
 public void getType() {
  ParameterizedType type = (ParameterizedType) this.getClass().getGenericSuperclass();
  Type ts[] = type.getActualTypeArguments();
  Class c = (Class) ts[0];
  System.out.println(c.getName());  
 }
 
 public static void main(String args[]) {
  ParametrizeTest<String> p = new ParametrizeTest<String>();
  p.getType();
 }
}


You will notice that it throws a "java.lang.ClassCastException: java.lang.Class cannot be cast to java.lang.reflect.ParameterizedType"

So, how can you find it ?

A simple hack which I came across was to add curly braces at the end of the instantiation. This essentially creates an anonymous subclass of ParametrizeTest.

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;


public class ParametrizeTest<T> {
 public void getType() {
  ParameterizedType type = (ParameterizedType) this.getClass().getGenericSuperclass();
  Type ts[] = type.getActualTypeArguments();
  Class c = (Class) ts[0];
  System.out.println(c.getName());  
 }
 
 public static void main(String args[]) {
  ParametrizeTest<String> p = new ParametrizeTest<String>(){};
  p.getType();
 }
}


But overall it's not a good idea. Due to erasure, type information is removed after compilation, so in general there is no way to get the class of a parametrized type at runtime. Reified generics are not even mentioned in the roadmap of JDK 1.7. Hope they are available in JDK 1.8 :)

Easy way of stopping Java thread gracefully

One of the easy ways of stopping a thread is by use of a boolean flag. We can use an outer while loop to do our periodic task, and use the combination of setStop and interrupt to end it.

Here goes the code

public class MyThread implements Runnable {
 //volatile so that the flag set by the main thread is visible to this thread
 private volatile boolean stop;
 @Override
 public void run() {
  System.out.println("Thread starts");
  while(!stop) {
   try {
    //do something fooBar()
    System.out.println("Sleeping...");
    Thread.sleep(5000);
   } catch(InterruptedException e) {
    System.out.println("Thread was interrupted");
   } catch(Exception e) {
    //handle error
    e.printStackTrace();
   }
   
   
  }
  System.out.println("Thread ends");
 }
 
 public void setStop(boolean stop) {
  this.stop = stop;
 }

 public static void main(String args[]) throws InterruptedException {
  MyThread m = new MyThread();
  Thread t = new Thread(m);
  t.start();
  System.out.println("Main thread Sleeping...");
  Thread.sleep(2000);
  //stop in this manner
  m.setStop(true);
  t.interrupt();
 }
}

Wednesday, June 29, 2011

Few handy functions provided by Enum class

Suppose we have the following enum class :

import java.util.HashMap;

public enum Weekdays {
MONDAY("Monday",1),TUESDAY("Tuesday",2),WEDNESDAY("Wednesday", 3),THURSDAY("Thursday",4),FRIDAY("Friday",5),SATURDAY("Saturday",6),SUNDAY("Sunday",7);

 private static HashMap<String,Weekdays> hm = new HashMap<String, Weekdays>();


static {
hm.put(MONDAY.getName(),MONDAY);
hm.put(TUESDAY.getName(),TUESDAY);
hm.put(WEDNESDAY.getName(),WEDNESDAY);
hm.put(THURSDAY.getName(),THURSDAY);
hm.put(FRIDAY.getName(),FRIDAY);
hm.put(SATURDAY.getName(),SATURDAY);
hm.put(SUNDAY.getName(),SUNDAY);
}

int val;
String name;

private Weekdays(String name,int val) {
this.val = val;
this.name = name;
}

public int getVal() {
return val;
}

public String getName() {
return name;
}

public static Weekdays getByName(String s) {
return hm.get(s);
}
}

//Test Class
public class TestEnum {
public static void main(String args[]) {
TestEnum t = new TestEnum();
t.badWay();
}

public void badWay() {
Weekdays w = Weekdays.getByName("Friday");
System.out.println(w.getName());
}
}

And we are interested in finding the required enum by name or val. What usually comes to mind is creating a static hashmap and using a function like getByName above. This might be fine if you need to do fast lookups quite often.

Enum provides some handy functions like values and valueOf instead. findByVal below illustrates an alternative approach using these functions :


import java.util.HashMap;
public enum Weekdays {
MONDAY("Monday",1),TUESDAY("Tuesday",2),WEDNESDAY("Wednesday", 3),THURSDAY("Thursday",4),FRIDAY("Friday",5),SATURDAY("Saturday",6),SUNDAY("Sunday",7);

int val;
String name;

private Weekdays(String name,int val) {
this.val = val;
this.name = name;
}

public int getVal() {
return val;
}

public String getName() {
return name;
}

public static Weekdays findByVal(int i) {
for(Weekdays w : Weekdays.values()) {
if(w.getVal()==i) return w;
}
return null;
}

}

//Test Class
public class TestEnum {
public static void main(String args[]) {
TestEnum t = new TestEnum();
t.goodWay1();
t.goodWay2();
t.goodWay3();
}

public void goodWay1() {
Weekdays w = Enum.valueOf(Weekdays.class, "FRIDAY");
System.out.println(w.getName());
}

public void goodWay2() {
Weekdays w = Weekdays.valueOf("FRIDAY");
System.out.println(w.getName());
}

public void goodWay3() {
Weekdays w = Weekdays.findByVal(3);
System.out.println(w.getName());
}
}

Sunday, June 26, 2011

Funny thing about Java thread

The other day, I was trying to store a thread reference which could be started whenever needed. To my surprise, it threw IllegalThreadStateException. After searching I figured out that threads can be started only once, just as the javadoc says: "It is never legal to start a thread more than once. In particular, a thread may not be restarted once it has completed execution."

So you have 3 options :
- Use thread suspend/resume (both deprecated, so use with care)
- Do not allow it to end; use Thread.sleep(time) in a loop
- Create a new thread as needed
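The restriction is easy to reproduce; this throwaway demo (class and method names are mine) starts the same Thread object twice and catches the resulting exception:

```java
public class RestartDemo {

    // Returns whether the given thread could be started a second time
    // after running to completion -- which, per the javadoc, it never can.
    public static boolean canRestart(Thread t) throws InterruptedException {
        t.start();
        t.join();          // let it finish
        try {
            t.start();     // second start on the same Thread object
            return true;
        } catch (IllegalThreadStateException e) {
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new Runnable() {
            public void run() { System.out.println("running once"); }
        });
        System.out.println("restartable: " + canRestart(t)); // restartable: false
    }
}
```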