Published by Karol Majta on 1st Apr 2012

Ok, so you already have some decorator knowledge, don't you? If not take a quick
look at Python decorators
introductory article I wrote a while ago. Armed with this knowledge you can do
some pretty useful stuff. Today we will look at a technique called *memoization*.

So what basically is memoization? It's just storing previously calculated values
for further use durging program execution. Let's again refer to *Fibonacci and
Factorials Explained Fictional Project*. This time we will investigate a function
calculating values of the Fibonacci sequence:

```
# functions.py
def fibonacci(n):
"""
Calculates n-th element of Fibonacci sequence
"""
if n == 0:
return 0
if n == 1:
return 1
else:
return fibonacci(n-2)+fibonacci(n-1)
```

We can call it from other file like this:

```
# project.py
import sys
from functions import fibonacci
print "fibonacci({0}): {1}".format(sys.argv[1], fibonacci(int(sys.argv[1])))
```

We get what we expect. But let's look what happens if we decorate the function with
the previously written *log* decorator and call it:

```
# functions.py
from decorators import Log
@Log(name="fibonacci")
def fibonacci(n):
"""
Calculates n-th element of Fibonacci sequence
"""
if n == 0:
return 0
if n == 1:
return 1
else:
return fibonacci(n-2)+fibonacci(n-1)
```

The console output:

```
$ python snippet.py 4
calling fibonacci(4,)
.calling fibonacci(3,)
..calling fibonacci(2,)
...calling fibonacci(1,)
...fibonacci(1,) returned 1
...calling fibonacci(0,)
...fibonacci(0,) returned 0
..fibonacci(2,) returned 1
..calling fibonacci(1,)
..fibonacci(1,) returned 1
.fibonacci(3,) returned 2
.calling fibonacci(2,)
..calling fibonacci(1,)
..fibonacci(1,) returned 1
..calling fibonacci(0,)
..fibonacci(0,) returned 0
.fibonacci(2,) returned 1
fibonacci(4,) returned 3
fibonacci(4): 3
```

Investigating the console output we can figure out some pretty redundant operations take place:

*fibonacci(n)*gets called with arguments 4, 3, 2, 1 and 0 - that's perfectly fine, as we only go "deeper" in the recursion tree.*fibonacci(1)*and*fibonacci*(0) return (into*fibonacci(2)*)*fibonacci(2)*returns (into*fibonacci(3)*) and then*fibonacci(1)*gets called again, because as you're probably aware*fibonacci(3)=fibonacci(2)+fibonacci(1)*.

Of course we have calcualed the value of *fibonacci(1)* before and there is no
need to do it again. The problem can be found in a few other places in the console
log and tends to become even more severe when argument given to the script is bigger.
For the sake of brevity I won't include console dumps for that case, but you can try
it yourself.

One noticable thing is the fact that this problem is not obvious, as humans calculate Fibonacci numbers starting from 0 - there is no repetition. The recursive algorithm solves it the other way round.

You probably already figured out that the solution is to store previously
calculated values. This is done in an *extremely* easy and elegant way with
a class based decorator.

The code is pretty short:

```
# decorators.py
class Cache(object):
def __init__(self):
self._storage = {}
def __call__(self, callable):
def _callable(*args):
if self._storage.has_key(args):
return self._storage[args]
else:
result = callable(*args)
self._storage[args] = result
return result
return _callable
def clear_cache(self):
self._storage = {}
```

In the constructor we create the cache. The ****call**** method is actually
responsible for decorating the function, and **_callable** will replace it.
**_callable** works in a pretty straightforward way. If the arguments' tuple
is in the cache dict return it. Else, add it to the cache dict and then return
it. There is also an utility method for clearing the cache between function
calls.

Decorated *fibonacci* function:

```
# functions.py
from decorators import Log, Cache
fib_cache = Cache()
@fib_cache
@Log(name="fibonacci")
def fibonacci(n):
"""
Calculates n-th element of Fibonacci sequence
"""
if n == 0:
return 0
if n == 1:
return 1
else:
return fibonacci(n-2)+fibonacci(n-1)
```

Let's look at the new console output:

```
$ python snippet.py 4
calling fibonacci(4,)
.calling fibonacci(3,)
..calling fibonacci(2,)
...calling fibonacci(1,)
...fibonacci(1,) returned 1
...calling fibonacci(0,)
...fibonacci(0,) returned 0
..fibonacci(2,) returned 1
.fibonacci(3,) returned 2
fibonacci(4,) returned 3
fibonacci(4): 3
```

As you can see, with memoization we never do any unnecessary traversal of the recursion tree. It's a simple journey - "down" and then "up".

Is it any good? Let's write a short benchmark:

```
# project.py
import sys
from functions import fibonacci, fib_cache
start = time.time()
for _ in range(0, 1000):
fibonacci(int(sys.argv[1]))
fib_cache.clear_cache()
stop = time.time()
print stop-start
```

Test it with different values, and decorated and undecorated version of
*fibonacci* function. My results are:

- for
*n=5*the undecorated function calls take about*0.01s*and decorated*0.03s*. The decorated vesion is acually*slower*if n is small. - for
*n=10*the undecorated function calls take about*0.1s*and decorated*0.06s*. Benefits of memoization start to become visible. - for
*n=20*the undecorated function calls take about*15.0s*and decorated*0.1s*. That's circa 100 times faster!

I've titled this article *poor man's Memcached*. Well... If you feel you need
Memcached, you probably need Memcached. Use Memcached for "persistent" cache.

Apart from scientific computations and recursive algorithms memoization as
described in this article can be helpful during optimisation. Our *Cache()*
decorator allows us to "turn off" a costly function, or in other words to
limit it's execution time almost to zero. Switching different functions on
and off in some sort of testcase can show which parts of code are CPU hungry
and should be optimized.

Happy (memoized) coding!