Tuesday, December 16, 2014

Asynchronous Java RMI

I have started to write my own opensource project.


It is a fast modern asynchronous version of Java RMI.

It has many goodies that does not exists in Oracle versions, some of them:


  • NIO.
  • Asynchronous calls.
  • call pipline.
  • one way calls.
  • Easy TLS configuration.
  • IP based connection filters with rules similar to firewalls, and more.

check the project page for more.




Tuesday, February 14, 2012

Fast Flexible Efficient In Memory Java LRU cache

In this post I will share the ideas and code for my Java LRU cache implementation.
My requirements were:
  • Bounded cache -- the number of maximum entries is bounded.
  • LRU cache -- eviction is done to the Least Recently Used item first.
  • Low Memory footprint -- since each entry can consume unbounded amounts of memory the cache should holds its values using WeakReference, this makes it possible for the cache to evict elements even when the cache is not full when memory is short.
  • Minimized locking -- the cache should not be locked when an entry is searched, only the searched entry can be locked.
  • Minimized computation -- in case of multiple threads searching for the same key, the computation of the value should be done only once.
We will not insert the values to the cache explicitly, instead we will define the Compute Interface that will be called to compute missing values and fill the cache when needed.





The Compute interface is very similar to Java Callable interface. The only difference is that each call to compute receives the key and returns its value.
The Cache constructor receives 3 parameters:
  • A Compute instance -- computes the values for objects that are not in the cache.
  • An ExecutorService -- executes the computation of missing values (the call to compute)
  • The size of the LRU cache -- the maximum number of entries that can be kept in this cache.





The cache itself built using 3 building blocks provided by Java:
  • LinkedHashMap -- Hash table and linked list implementation of the Map interface, with predictable iteration order.
  • Future -- an interface that represents the result of an asynchronous computation.
  • And SoftReference --  which are cleared at the discretion of the garbage collector in response to memory demand.
We use SoftValue that is a subclass of Java SoftReference to hold the values. The reason is that we wish to have the value removed on memory shortage and we need the keys of the values that were removed because we need to clear the keys from the map.
Here is the code of SoftValue:





One more Interface the cache is using is the ExceptionStrategy, it is used to decide what to do when the call to compute(key) throws an Exception, the 2 options is to keep the mapping in the map to discard this entry (when the exception may be temporary,  for example  network  exception ).
Here is the source of ExceptionStrategy:





When a call to compute results in Exception the cache invoke it's instance of ExceptionStrategy removeEntry method with the key and the Exception, if removeEnty returns false the pair is kept in the cache, otherwise it does not and the next call to cache.get(key) will try to compute the value for key again.
For example the function alwaysRetain, a factory method that returns an ExceptionStrategy that always keep the mapping of an Exception in the cache can be define as:






We are now ready to take a closer look at the Cache constructor



The important part is the definition of the internal map on line 5, the map is defined as mapping from the key to a SoftValue that holds the key and a Future of the value, possibly an un-computed value.
We used the LinkedHashMap constructor that specify the accessOrder of the map as true so that get(key) will mark key as recently used as well as put(key).
The removeEldestEntry is called on each put. its result determines if the eldest entry should be removed from the map or not.


Neutrally the most used method of the cache is the get method, we'll now go over it's code:







We can see in line 5 the usage of exceptionStrategy to decide how to deal with exceptions.
The method getTask(key) in line 3 returns a Future to the computation of this entry and the computation itself is done only at the call to get() (last call in line 3).
Here is the code for getTask:





Since getTask only returns Future of the value (it does not compute the value) it can be synchronized because it's execution time is not related to the compute execution time.
On line 2 we see a call to processQueue this method removes from the map all the entries that their value was collected by the GC. we will have a look at it in short time.
Next the Future for the key is searched in the map and in case it is there it is returned (line 8),  in case there is no Future for this key a new one is created by submitting a computation to the ExecutorService (line 11), the new Future is then wrapped with SoftValue to enable the gc to reclaim this value in case of memory shortage, inserted into the map and returned as the result.
It is important to notice that in this case the compute was not called yet, it will be called only when the Future's get method will be first invoked.
The usage of  ExecutorService in this case give us the flexibility to control the number of threads that will  be used to compute values at any given time.
For example by passing  Executors.newFixThreadPoolExecutor(2) to the Cache we limit the number of concurrent calls to compute to 2, note that if we wish to call the compute on the user thread we need to write our own DirectExcutorService.
Next we will look at the processQueue method:

This method is looping over the entries that were reclaim by the GC and remove them from the map.
Here is the binary version of the Cache and here is the zip file that includes all the sources with the tests and the ant build file, or you can get all the sources from github.

Thursday, August 18, 2011

Continuous integration notification, the other way.

My team use Hudson for continuous integration it used to send mail to whoever commit.
Recently our handsome sysadmin decided that this is an abuse of the system so this option was blocked.

As a result i noticed from time to time that the build was broken for some time and i didn't love it very much, I assume no one love it.
I decided to improvise, Hudson  have plugin to gtalk so I created a gmail account for the builder and try this plugin.

Luck was not with me, the builder located in the lab on a private network so the plugin didn't worked.
However, Hudson have an option to send UDP package to a configured address as a result of a job state changed (job start, success or failed). so I configure Hudson to send my Linux machine notification about the build state.

Using Hudson REST API I was able to get all the missing information to send the needed data to the users who committed.
Since my machine connected to the web it was possible to start gtalk conversation quickly I wrote a Java program that send the users gtalk message using smack.
The result was beautiful the builder gtalk user start a conversation with the users that committed and report the build status.

Now for the last part, after I add an image to the gtalk user I was kind of attached to him, I was thinking it would be nice to talk with him from time to time, and then I remembered of my old friends from the university Eliza and Emacs Doctor.

The result is here: builder.webspace@gmail.com 
Here is a sample of our conversation

Just give it a try, open gtalk and chat with builder.webspace@gmail.com .
Barak.





Thursday, May 26, 2011

Calling Java applet from Javascript

Hello.

Recently we were busy adding upload media to our webapp, it was required to upload large media files (more then 4G) from client browser to the main site while monitoring the progress. It was required that the user will be able to cancel downloads and have many possibly download running in parallel.

At first we tried to implement this using plupload which is flash client that use multi-part post, the sever side was implemented as Java servlet , this seems to works for small files but fail on large files, in addition memory was not free on some OS.

We next tried native html5 file upload that turn out to suffer with the same sickness, at the same time we noticed that jetty conent-length attibute is of type int and not long, that rules out both html5 upload and the multi part post approach.

Our last try was an FTP applet (based on our in house FTP client) that upload files to a Mina based FTP Server.
That turn out to be really successful, we let the applet handle the file choosing and the FTP to the server while keeping the UI in the Javascript side.
There were 2 things that gave us hard time while developing this applet.


  • The first is the security, it turns out that even while the applet is fully signed and allow all security, it can not do that from a thread that was called from Javascript -- Oracle doc has only one short sentence about this property and it took us a while finding it.What we did is use the command pattern, having one thread created from the applet constructor, this threads allow to do all since it does not a thread created for Javascript command execution. A request from Javascript create a command that returns a Future, those commands than inserted into a queue to be processed by the thread created at the applet constructor.This seems to work well. Pay attention that you should not use Java Executors to implement this queue since Java Executors are lazy in its thread creation.
  • The second issue related to the fact that embedding Java Applet in a page make the page load un responsive while the Java Plugin is loaded -- so we preferred to load the Applet dynamically the first time that the user try upload, this turn out to be tricky on Firefox on Mac and on IE you need to create a new IFrame for the applet, while in Firefox on Linux and Windows you can load the applet dynamically directly to your page.

The Applet Code code.

Sunday, January 23, 2011

What is 'Level out the workload' in software development


Recently I been reading Taiichi Ohno book
Toyota Production System: Beyond Large-Scale Production
A key principles in his book was the principle of leveling out the workload

I have been reading lately lots of books that show how to apply Toyota Production principles to a software development process most of them describe how to map Toyota 7 waste to the software development world.

For example

  • Inventory  ~ Partially Done work.
  • Extra Processing ~ Extra Process.
  • Overproduction ~ Extra Features.
  • Transportation ~ Task Switching.
  • Waiting   ~ Waiting
  • Motion ~ Motion
  • Defects ~ Defects
The meaning of leveling out the production in Toyota is to try to make every day the same amount of
each type of product.
For example:

If you have an order of 300 cars of types A and 600 Cars of type B each Cell should assembly
one car of type A than 2 cars of type B and so on.
According to the book leveling out the production help to maintain both machines and employee in good condition, where burst of actions and afterward long periods of doing nothing exhausted both humans and machines.

Another great benefit of leveling out the production is that it allow JIT, when working in JIT
you get every part from the earlier processes exactly when and where you need it, this enable single flow, reduce inventory and find defects early.

Now the question I was thinking of is, how do I level out the workload for software development ?

Software development is creative process unlike car production so maybe there is no good equivalence to leveling out the workload in software development.

My faith is that in software production the your best assets are your employs so leveling the workout can be mapped to:


  • Invest and educate your employs.
  • Give them feeling that their work is important.
  • Provide them with ways to measuring their progress.
  • Plan ahead to reduce burst of actions or periods of doing noting.

If you think there are other way of leveling out the production in software development, please let me know.





Wednesday, January 19, 2011

Testing YUI DND with Selenium

If you happen to write some selenium tests for YUI based page you may notice that sometime selenium DND does not work.
This is because selenium sometime has error when computing the right X,Y of a component when using YUI layout manager.
This problem can be solved easily by overriding selenium JavaScript code that compute X,Y given a locator.

For example getElementPositionTop can be rewrite to something like that:

Selenium.prototype.getElementPositionTop = function(locator) {
    /**
     * Retrieves the vertical position of an element
     *
     * @param locator an element locator pointing to an element OR an element itself
     * @return number of pixels from the edge of the frame.
     */
    var element;
    if ("string" == typeof locator) {
        element = selenium.browserbot.findElement(locator);
    }
    else {
        element = locator;
    }
    var xy = selenium.browserbot.getCurrentWindow().YAHOO.util.Dom.getXY(element);
    return xy[1];
};

All the change should written to a file for example 'user-extensions.js' and selenium should be run with the cmd:


java -jar selenium-server.jar  -userExtensions user-extensions.js

That should solve the issue.



Sunday, February 7, 2010

JavaScript applicative Y combinator

Recently I have been working with JavaScript.
Although JavaScript interpreter does not have to be tail call optimized, and I believe most of them are not, JavaScript have closures and functions are first class objects. Hence it is possible to write in JavaScript applicative Y combinator.

In this short text I will start with a simple recursive factorial function written in JavaScript and by modifying it again and again I will transform the code into two methods Y and F such that Y handles the recursion and F handle the factorial logic.
It is worth to mention that once one have those methods, they can be used without naming (anonymously).

Starting with:


var factorial = function(n) {
        if (n === 0) {
            return 1;
        } else {
            return n * factorial(n - 1);
        }
    };

The first change is to create a factorial factory and use it instead the direct call.


var f = function() {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n *f( )(n - 1);
            }
        };
    };

Now instead of factorial(6) we call f()(6).

Next we change the factory method r to receive one argument, this argument will be always a factorial factory itself.


var f = function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    };


Now the call has changed to something like f(f)(6).
When substituting r’s body with r in the call above we have:



    function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    }
    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    })(6)

The next transformation is reverse eta conversion (η-conversion)
 f(f)(n) => function(a){ return  f(f)a}(n), this delays the computation of f(f(n)  and prevent infinite  loop because of the evaluation order in applicative order.


    function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * ((function(a) {
                    return f(f)(a);
                })(n - 1));
            }
        };
    }
    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * ((function(a) {
                    return f(f)(a);
                })(n - 1));
            }
        };
    })(6);

Now (function(a) {

                    return f(f)(a);
                })

can be extracted out we will do that by wrapping everything in a new function call with param r and passing it the expression  (function(a) {

                    return f(f)(a);
                })

To the function

    function(f) {
        return function(r) {
            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        }(function(a) {
            return f(f)(a);
        });
    }
    (function(f) {
        return (function(r) {
            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        })(function(a) {
            return f(f)(a);
        });
    })

Now we do the same with the pink expression and have:

function(m) {
        return function(f) {
            return m(function(a) {
                return f(f)(a);
            });
        }(function(f) {
            return m(function(a) {
                return f(f)(a);
            });
        })(function(r) {
            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        });

Hence we have 2 functions

var y = function(m) {

        return function(f) {
            return m(function(a) {
                return f(f)(a);
            });
        }(function(f) {
            return m(function(a) {
                return f(f)(a);
            });
        });

and:

var fact = function(r) {

            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        }

and 

y(fact)(6) => 720