Showing posts with label tree. Show all posts
Showing posts with label tree. Show all posts

Tuesday, July 5, 2011

Modeling Tree and DAG in MySQL Vs Inmemory implementation

Few days back I was stuck with a problem of modeling a directed acyclic graph(DAG) in database.

I started looking into various approaches of storing trees first. If it is a case of tree with multiple children, but only single parent, then we can choose to use nested set model. A comprehensive description in mysql is discussed here. I have personally implemented it and works well.

But, if you want to model a DAG kind of structure with multiple parents, then it gets difficult. A good article on this can be found here. It discusses about various operations on nodes and links. Operations like delete link gets pretty complicated.

So, I thought may be a in memory implementation would be a better alternative. I went ahead and wrote a inmemory thread implementation using DFS, which gives all the parents in the path till given node(maybe through multiple paths), and all the children in subtree on a given node.

Here is a brief outline of it :

//datastructure to store childrens and parents
public class Node {
 HashSet<Long> childrens = new HashSet<Long>(); 
 HashSet<Long> parents = new HashSet<Long>(); //parents in path
 public HashSet<Long> getChildrens() {
  return childrens;
 }
 public void setChildrens(HashSet<Long> childrens) {
  this.childrens = childrens;
 }
 public HashSet<Long> getParents() {
  return parents;
 }
 public void setParents(HashSet<Long> parents) {
  this.parents = parents;
 }
}
//adjacency list (Node id, neighbour id list)
private HashMap<Long,List<Long>> adjList= new HashMap<Long, List<Long>>();
//all the nodes hashed on node id
private HashMap<Long,Node> data = new HashMap<Long, Node>();

public void init() {
   //find nodes which have no inlink from adjacency list 
   HashSet<Long> parents = new HashSet<Long>(); //who are roots
   for(Long k : data.keySet()) {
      //add dummy nodes in adjacency list for nodes having no links
      if(!adjList.containsKey(k)) {
       adjList.put(k,new ArrayList<Long>()); 
      }
      boolean found = false;
      for(Long p : adjList.keySet()) {
         if(adjList.get(p).contains(k)) {
            found=true; break;
         }
      }
      if(!found) parents.add(k);
   }

   //start dfs
   HashSet<Long> parentNull = new HashSet<Long>();
   for(Long u : parents) {
      dfs(u,parentNull);
   }
}

/**
* @param node 
* current node
* @param parents
* set of all parents of given node
* @return all subtree nodes as children
*/
HashSet<Long> dfs(Long node, HashSet<Long> parents) {
   //cycle exists
   if(parents.contains(node)) {
      throw new Exception("Contains cycle");
   }
   //add parents to current node
   data.get(node).getParents().addAll(parents);
   //construct new set of parent for passing to children
   HashSet<Long> newParent = new HashSet<Long>(data.get(node).getParents());
   newParent.add(node);
   //do a dfs to all its children
   for(Long neighbours : adjList.get(node)) {
      HashSet<Long> childrens = dfs(neighbours,newParent);
      data.get(node).getChildrens().addAll(childrens); 
   }
   //it has subtree..so return all children
   if(adjList.get(node).size()>0) {
      return data.get(node).getChildrens();
   }
   else {
      //add only current node
      HashSet<Long> ret = new HashSet<Long>();
      ret.add(node);
      return ret;
   }
}