Sunday, February 7, 2010

JavaScript applicative Y combinator

Recently I have been working with JavaScript.
Although a JavaScript interpreter is not required to optimize tail calls, and I believe most of them do not, JavaScript has closures, and functions are first-class objects. Hence it is possible to write an applicative-order Y combinator in JavaScript.

In this short text I will start with a simple recursive factorial function written in JavaScript, and by modifying it step by step I will transform the code into two functions, Y and F, such that Y handles the recursion and F handles the factorial logic.
It is worth mentioning that once one has those functions, they can be used without naming them (anonymously).

Starting with:


var factorial = function(n) {
        if (n === 0) {
            return 1;
        } else {
            return n * factorial(n - 1);
        }
    };

The first change is to create a factorial factory and use it instead of the direct call.


var f = function() {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f()(n - 1);
            }
        };
    };

Now instead of factorial(6) we call f()(6).

Next we change the factory function f to receive one argument; this argument will always be the factorial factory itself.


var f = function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    };


Now the call has changed to f(f)(6).
Substituting f's body for f in the call above, we have:



    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    })
    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * f(f)(n - 1);
            }
        };
    })(6)

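The next transformation is a reverse eta conversion (η-expansion):
 f(f)(n) => (function(a) { return f(f)(a); })(n). The wrapper delays the computation of f(f) until it is actually called. This matters in the next step, where the expression is passed as an argument: under applicative-order evaluation arguments are evaluated before the call, so an unwrapped f(f) would loop forever.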


    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * ((function(a) {
                    return f(f)(a);
                })(n - 1));
            }
        };
    })
    (function(f) {
        return function(n) {
            if (n === 0) {
                return 1;
            } else {
                return n * ((function(a) {
                    return f(f)(a);
                })(n - 1));
            }
        };
    })(6);

Now the expression function(a) { return f(f)(a); } can be extracted. We do that by wrapping the inner function in a new function with the parameter r and calling it immediately with that expression as the argument:

    (function(f) {
        return (function(r) {
            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        })(function(a) {
            return f(f)(a);
        });
    })
    (function(f) {
        return (function(r) {
            return function(n) {
                if (n === 0) {
                    return 1;
                } else {
                    return n * r(n - 1);
                }
            };
        })(function(a) {
            return f(f)(a);
        });
    })

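Now we do the same with the factorial logic, the expression function(r) { ... }: we wrap everything in a new function with the parameter m and pass that expression as the argument. We have: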

(function(m) {
    return (function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    })(function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    });
})(function(r) {
    return function(n) {
        if (n === 0) {
            return 1;
        } else {
            return n * r(n - 1);
        }
    };
});

Hence we have two functions:

var y = function(m) {
    return (function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    })(function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    });
};

and:

var fact = function(r) {
    return function(n) {
        if (n === 0) {
            return 1;
        } else {
            return n * r(n - 1);
        }
    };
};

and 

y(fact)(6) => 720
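
As mentioned at the start, neither function needs a name. The following is a small sketch of that: it repeats the definition of y from above so that it runs on its own, and then passes anonymous functions to it. The Fibonacci example and the variable names fact6 and fib10 are additions for illustration only.

```javascript
// Same combinator as above, repeated so the snippet is self-contained.
var y = function(m) {
    return (function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    })(function(f) {
        return m(function(a) {
            return f(f)(a);
        });
    });
};

// Factorial without naming the logic function.
var fact6 = y(function(r) {
    return function(n) {
        return n === 0 ? 1 : n * r(n - 1);
    };
})(6);

// Fibonacci (illustrative addition): r stands for the recursive call.
var fib10 = y(function(r) {
    return function(n) {
        return n < 2 ? n : r(n - 1) + r(n - 2);
    };
})(10);

console.log(fact6, fib10);
```

Note that the wrapper function(a) { ... } forwards a single argument, so this version of y only works for one-argument functions.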







Thursday, January 28, 2010

How to convert user stories to tasks

Every Agile team needs to convert, as part of its work, user stories to lists of tasks.

  • User stories describe how the user wants to act on the system and what happens as a result. Implementing a user story can take the whole team many days. However, it is better to narrow a user story if its estimate exceeds the length of the iteration.
  • Tasks, on the other hand, are coding + testing missions for one programmer (or two, if the team practices pair programming). They normally do not exceed one day of work.

During iteration planning it is the team’s responsibility to convert the list of user stories given by the product owner into lists of tasks. This has the following benefits:

  • By splitting user stories into smaller tasks the team is forced to discuss implementation issues, to understand better what is needed, and to agree on how it should be implemented.
  • It is much easier to estimate a small task than the whole user story.
  • It is possible to track the progress of the team when its members work on tasks that take a day or less; this helps the team coach spot and handle problems early.

Now the question is: how does one split user stories into tasks?

In my team we are working on software that spans several modules. We have a database, business logic servers and a client application.
Most of the user stories involve coding in all modules. Thus, at first, we used to split the user stories by module: each user story became three tasks -- database, business logic and client.

It seems natural, doesn't it?
Well, apparently it is not. Although the team shares the same room, we spent a lot of time on integration. And most of the time, even after all of a user story's tasks were done, the user story itself was not done yet.

Recently we started to create tasks by functionality instead of by module. With this method each task implements part of the functionality required by the user story (maybe even a simplified version of it), but it spans all modules.
At first it seemed like we were doing more work (instead of working on the database until all of its work is completed, we now implement it in iterations), but this is not true; working like that has a lot of benefits.

Among these benefits are:

  • Fewer integration issues.
  • System tests can be a part of the task (definition of done).
  • Testing can start earlier.
  • If the user story is not finished by the end of the iteration, sometimes it is still possible to ship the part that is done.

In my opinion, what keeps teams from working like that is the fear that newer tasks will create regressions in earlier tasks.

I see this fear as another benefit, since it forces you to write good system tests for each task and to insist on the definition of done.

Wednesday, January 20, 2010

Bug-free software development

This is the holy grail of the software industry: how can we write bug-free software?
Surely there is no such thing; what we can do is improve the development process toward this goal.
My assumption is that programmers will always create bugs as they develop, so I will focus on the safety harness: testing.

We can divide the tests into two categories:
  • Automatic tests,
  • Human tests.
First let's look at automatic tests; there are two kinds of them:

System tests -- tests that check the whole system by performing user stories as defined in the spec.
  • Pros
    • System tests are important because they use the system like real users do. For example, system tests run on top of a real installation, using the real database and other 3rd party components.
    • It is very easy to design a good set of system tests; you just have to follow the user stories that the system should support.
    • One big advantage of system tests is that they are not tied to the implementation of a specific module. For example, you can change the underlying algorithm, and since the user story stays the same, the test remains correct.
  • Cons
    • System tests take time to run, and when one fails it is hard to understand where things went wrong, so in most cases the test infrastructure should be accompanied by a reporting infrastructure.
    • It can be technically very hard to prepare the infrastructure required for system tests because of the dependencies on 3rd party components.
And unit tests -- tests that are written by a programmer to test a piece of code, before or after writing that piece of code.


  • Pros

    • Unit tests run fast and can be part of the compilation process on the developer's machine, so bugs are caught and fixed on the spot.
    • Since they run fast, it is possible in some cases to cover all the inputs and outputs of a given piece of code, and this is very nice.
    • A failing unit test is very easy to fix.
    • Unit tests in general create better code because they force the programmer to think about the interface and to use the piece of code on the spot.
  • Cons
    • Unit tests are highly dependent on the code they test; you cannot change the algorithm without rewriting the tests.
    • A good set of unit tests is very hard to write; it is an art.
A good software development process should rely on both unit tests and system tests.
The first line of defense should be the unit tests, while the system tests should be used to verify that the old user stories still work.

What about human tests?
First, it is clear that there should be a phase of human tests during or after the development process,
because automatic tests will never give you 100% coverage in nontrivial software.

I myself think that it is better to push the human tests as early as possible in the development process because:
  • That way you always know where you stand.
  • The earlier you find a bug, the easier it is to fix.
The real difficulty when the team contains both programmers and testers is how to make the testers citizens with equal rights.

Wednesday, January 13, 2010

Continuous integration and long running system tests



Say you are working in a software company that uses system tests as part of the production system.
Ideally you would treat system tests like unit tests, i.e. have all of them run on each build and have the report emailed to you.

The problem with that is that system tests can take a lot of time to run. For example, we have only 50 tests and they take about 15 minutes; when you have hundreds of tests it can take hours to run them all.

Even if you are using a grid to run your tests, it only eases the pain: the first machine you add to the grid reduces the time by 50%, but the effect of each new machine drops dramatically the more machines you already have.
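
To put numbers on that, here is a rough sketch. It assumes perfect parallelism (a suite that takes totalMinutes on one machine takes totalMinutes divided by the number of machines on a grid), which is an idealization; the function name and the 120 minute figure are made up for illustration.

```javascript
// Minutes saved by adding one more machine to a grid that already has
// `machines` machines, assuming the work splits perfectly evenly.
var minutesSaved = function(totalMinutes, machines) {
    return totalMinutes / machines - totalMinutes / (machines + 1);
};

console.log(minutesSaved(120, 1)); // second machine saves 60 minutes
console.log(minutesSaved(120, 2)); // third machine saves 20 minutes
console.log(minutesSaved(120, 3)); // fourth machine saves 10 minutes
```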

So we conclude that it is not practical to expect all the tests to run for each build.

This is how we at Dalet solved this problem:

First, the artifacts of each build are 2 zip files:
  • product.zip and
  • product-system-tests.zip
At the end of the build these 2 files are saved in a directory named after the svn revision of the build.

Second, the automatic test process is triggered after each build, but only one test process can run at a time. The first thing the test runner does is find the latest revision that was built and not yet tested (this is a simple scan of the files generated by the first stage).
Once it finds such a revision, it extracts the product, extracts the system tests and runs them; the output of this run is a test report.
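
A minimal sketch of that selection step, assuming the revision numbers have already been read from the directory names. The function and parameter names are hypothetical and this is not the actual runner code.

```javascript
// Hypothetical helper: pick the newest revision that was built but not yet tested.
// builtRevisions: svn revision numbers that have a build directory.
// testedRevisions: svn revision numbers that already have a test report.
var findLatestUntested = function(builtRevisions, testedRevisions) {
    var tested = {};
    testedRevisions.forEach(function(rev) {
        tested[rev] = true;
    });
    var latest = null;
    builtRevisions.forEach(function(rev) {
        if (!tested[rev] && (latest === null || rev > latest)) {
            latest = rev;
        }
    });
    return latest; // null means nothing is waiting to be tested
};

var next = findLatestUntested([101, 102, 105, 107], [101, 102]);
console.log(next); // 107; revision 105 is skipped
```

Always taking the latest revision is what lets the runner keep up with the builds, at the price of skipping the revisions in between.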

Now you can say that this is not good enough, and I totally agree with you.
The problem is that every test run covers multiple commits, and you do not really know which one of them broke the test.

So our next step was to write a small web service that displays the report nicely and compares 2 test runs against each other.
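
The comparison itself can be sketched in a few lines. The report shape used here (an object mapping a test name to true for passed and false for failed) and all the names are assumptions for the example, not the format of our real reports.

```javascript
// Hypothetical report shape: { testName: true if passed, false if failed }.
var compareReports = function(previous, current) {
    var broken = []; // passed in the previous run, fails now
    var fixed = [];  // failed in the previous run, passes now
    Object.keys(current).forEach(function(name) {
        if (previous[name] === true && current[name] === false) {
            broken.push(name);
        } else if (previous[name] === false && current[name] === true) {
            fixed.push(name);
        }
    });
    return { broken: broken, fixed: fixed };
};

var diff = compareReports(
    { login: true, search: true, exportFile: false },
    { login: true, search: false, exportFile: true }
);
console.log(diff.broken, diff.fixed);
```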


Once you see those results the problem becomes much simpler, since you realize that only a small part of the tests fail in each run and that you only have to find the revision that broke each failed test.
This can be done by running only the failed tests on the revisions that were skipped (something like a binary search over the revisions). Once you do that you have the change set that broke the test, and the fix becomes very simple.
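
A sketch of such a search, assuming that a single commit broke the test and that the test stays broken in every later revision. runTest is a hypothetical callback that runs the failed test on one revision; the names are illustrative only.

```javascript
// revisions: sorted list of the skipped revisions followed by the tested one,
//            where the last revision is known to fail.
// runTest(rev): hypothetical callback, returns true when the test passes on rev.
var findBreakingRevision = function(revisions, runTest) {
    var low = 0;
    var high = revisions.length - 1; // known to fail
    while (low < high) {
        var mid = Math.floor((low + high) / 2);
        if (runTest(revisions[mid])) {
            low = mid + 1; // still passing, the break came later
        } else {
            high = mid;    // already failing, the break is here or earlier
        }
    }
    return revisions[low];
};

var culprit = findBreakingRevision([103, 104, 105, 106, 107], function(rev) {
    return rev < 106; // pretend revision 106 introduced the failure
});
console.log(culprit);
```

With this approach the failed test runs on about log2(n) of the n skipped revisions instead of all of them.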

Another step you can take is to collect statistics about which files caused which tests to fail. This can help you understand the dependencies in your code and lets you automatically choose
a small subset of the tests to run after a given change set.
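
One possible way to keep such statistics, as a sketch only. The record shape (a failed test together with the files changed in the change set that broke it), the file names and the function names are all assumptions made for the example.

```javascript
// failures: hypothetical records, one per broken test, each with the files
//           changed in the change set that broke it.
var buildFailureStats = function(failures) {
    var stats = {}; // stats[file][test] = times a change to file broke test
    failures.forEach(function(failure) {
        failure.files.forEach(function(file) {
            stats[file] = stats[file] || {};
            stats[file][failure.test] = (stats[file][failure.test] || 0) + 1;
        });
    });
    return stats;
};

// Tests that broke at least once after a change to any of the given files.
var selectTests = function(stats, changedFiles) {
    var selected = {};
    changedFiles.forEach(function(file) {
        Object.keys(stats[file] || {}).forEach(function(test) {
            selected[test] = true;
        });
    });
    return Object.keys(selected).sort();
};

var stats = buildFailureStats([
    { test: "search", files: ["Index.java", "Query.java"] },
    { test: "login", files: ["Auth.java"] },
    { test: "search", files: ["Query.java"] }
]);
console.log(selectTests(stats, ["Query.java"]));
```

A file that never broke anything selects no tests here, so in practice you would still want the full run as a safety net.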

Nice, huh?

Barak.