Are you using an opensource library? There’s a good chance you are vulnerable…

This is the talk I presented yesterday at Codemotion Rome 2018! Awesome conference and people, cannot wait for the next one!

You can find a detailed technical explanation in my previous blog post, and you can also have access to the code on GitHub to reproduce the exploit yourself.

Do not underestimate your problem, and put the correct procedure in place: you do not want to be the next Equifax.

 

Advertisements

Dynamic Programming explained (hopefully)

Okay, so some of you certainly already heard about Dynamic Programming (DP), but this is what I understood so far and I am happy to share with you.

Theory

In short, DP is all about ordering your computations in a way that avoids recalculating duplicate work: you have a main problem and a lot of subproblems.

There are two key attributes that a problem must have in order for DP to be applicable: optimal structure and overlapping subproblems:

  • when you have an optimal structure, then the optimal solution of a given problem can be obtained by the combination of optimal solutions of its subproblems
  • when you have overlapping subproblems then a solution of a problem should require the same subproblem again and again

Hey, please note that if a problem can be solved by combining optimal solution of non overlapping subproblems then we are in the “divide and conquer” area, where for example merge sort and quick sort lies.

Dynamic Programming is typically implemented using two common techniques, tabulation and memoization:

  • when you solve a DP problem using tabulation you solve the problem using a bottom-up approach, by solving all subproblems first, and creating a n-dimensional table: based on such table the solution to the original problem is computed. Because of that, tabulation solves all the subproblems.
  • when you solve a DP problem using memoization you do it by maintaining a map of already solved subproblem: you solve the problem top-down, basically solving the top problem first and then recursing in solving the subproblems. Memoization may pay an overhead due to the recursion, but it does not need to solve all the subproblems

Please note that in DP you will often use backtracking, that incrementally builds candidates for the solution and then abandons them when it determines that they cannot contribute to the solution.

Some code, please!

Ok, all good. Now where do we go from here? Some code will help 🙂 A typical DP problem is the fibonacci sequence:

fib(n) = fib(n-1) + fib(n-2)

I guess you can already see the overlapping subproblems and the optimal structure: let’s try to solve this with the most natural solution (I guess), which is a recursion.

    private static int fib(int val) {
        if (val == 0 || val == 1)
            return 1;
        else
            return fib(val - 1) + fib(val - 2);
    }

Ok, cool. It works: result! Pretty inefficient tough,  as it uses a large amount of stack memory and computes the solution to the same problem again and again! In fact, for example, to compute fib(5) it will compute three times fib(2). How can we improve this? Well, memoization comes in handy:

    private static Map<Integer, Integer> cache = new HashMap<Integer, Integer>();

    private static int fib(int val) {
        if (val == 0 || val == 1)
            return 1;
        else {
            Integer res = cache.get(val);
            if (res == null) {
                res = fib(val - 1) + fib(val - 2);
                cache.put(val, res);
            }
            
            return res;
        }        
    }

Ok, this is better. At least we do not recompute a lot of times the same solution, but we still use a lot of stack memory, to handle the recursion. And, at the end of the day, we need to compute all the solution to solve this problem, don’t we? Why don’t we use tabulation then? if we do so, we can revert to a nice iterative solution!

    private static int fib(int val) {
        if (val == 0 || val == 1)
            return 1;
        
        int fibs[] = new int[val+1];
        fibs[0] = 1;
        fibs[1] = 1;
        
        for (int i=2; i<=val; i++)
            fibs[i] = fibs[i-1] + fibs[i-2];
        
        return fibs[val];
    }

Ah, that’s better! No more recursion, a plain iterative process going on, just a bit of memory used for our table. But wait… can we do any better? Do we really need the whole table? Can we do better than Dynamic Programming?

    private static int fib(int val) {
        int prev = 0;
        int curr = 1;
        
        for (int i=2; i<=val; i++) {
            int next = curr + prev;
            prev = curr;
            curr = next;
        }
        
        return curr;
    }

Oh yeah 🙂 We just need to keep the last two values, n-1 and n-2: job done!

Conclusions (?)

DP was useful to think out the best algorithm, it was instrumental to discover it but, then, well, we needed that plain old spark of genius that not all of us have (certainly not me!) and some help has been very welcome. But without DP (and without a bigger spark) we would never easily found out an O(n) elegant and efficient solution: so it helps knowing about it. And sometimes some problems are really not solvable without DP, so please do not underestimate it!

Let me know if you are interested in this stuff, I can post more 🙂

 

 

Each comment must start with the words “sorry, …”

I mean, if you really need to write one. When you put a comment, you implicitly admit your inability to communicate your ideas trough your code, you are basically saying “sorry, I do not know enough of this language to express myself decently, so I would put some content in another language to make things clear”.

How dare you? I will have to use your code at a point in time! I will need to go and change it, and I will need to trough your rotten series of comments, I will have to git blame here and there, use aggressively grep, and then spend precious hours of my life that I will never get back in order to understand what you so poorly communicated in your code. How disrespectful of you. At least apologise.

So, new rule for my teams since today.