Recently I've been playing quite a lot with Behave, a lovely and fresh BDD framework for Python. If you're looking for a really hype tool to mess around with end-to-end tests and user specs you should give it a try.
Writing stories and acceptance tests would be super fun if there were no database, no external APIs, and no "environment" that has to be properly mocked out. Database trip times can be kept low by using in-memory SQLite; it's the HTTP layer that becomes a real pain.
Fortunately I've found a great library, HTTPretty, that monkey-patches Python's socket module, keeping all requests made with popular HTTP libraries in strict isolation from the outside world.
So far so good... You can monkey patch stuff, mess around with your configuration and keep your tests quite reliable, but why the hell is it so hard?
It's broken until it's tested, isn't it? True, true, so let's look at a simple example:
    ::python
    import uuid


    class Entity(object):
        """
        Super dull class that only has an id.
        And by accident it can be compared...
        """

        def __init__(self):
            self.entity_id = int(uuid.uuid4())

        # Note: __cmp__ is only consulted by Python 2; Python 3 ignores it.
        def __cmp__(self, other):
            if self.entity_id < other.entity_id:
                return -1
            elif self.entity_id == other.entity_id:
                return 0
            else:
                return 1

        def __hash__(self):
            return self.entity_id
The `__hash__` method is a simple getter so it doesn't need a test. Let's have a look at the test for `__cmp__`:
    ::python
    def test_cmp():
        e1 = Entity()
        e2 = Entity()
        expected = e1.entity_id < e2.entity_id
        actual = e1 < e2
        assert expected == actual
Whoa! Tested code, it works :) Yup, it does, but you probably already see the smell in it. We had to duplicate the logic of the `__cmp__` method inside the test to do some introspection about the objects. That's bad. Firstly, it's against the DRY principle; secondly, unit tests are about assumptions that look trivial in test code. This one is not trivial. So, how do we refactor our code?
    ::python
    class Entity(object):
        """
        Super dull class that only has an id.
        And by accident it can be compared...
        """

        def __init__(self, entity_id):
            self.entity_id = entity_id

        def __cmp__(self, other):
            if self.entity_id < other.entity_id:
                return -1
            elif self.entity_id == other.entity_id:
                return 0
            else:
                return 1

        def __hash__(self):
            return self.entity_id
Test code is really simple now:
    ::python
    def test_cmp():
        e1 = Entity(1)
        e2 = Entity(2)
        assert e1 < e2
That's good, and it's a really good idea to never instantiate or by other means "obtain" anything in your constructors. Everything that's necessary to create an instance should be provided as arguments.
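A side note for anyone running this on Python 3: `__cmp__` is no longer consulted there, so `e1 < e2` would raise a `TypeError`. Here is a minimal sketch of the same injected-id class using rich comparisons and `functools.total_ordering`. The `isinstance` guards and the `hash()` call are my additions, not part of the original example.

```python
import functools


@functools.total_ordering
class Entity:
    """Same dull class, with the id passed in and Python 3 comparisons."""

    def __init__(self, entity_id):
        self.entity_id = entity_id

    def __eq__(self, other):
        if not isinstance(other, Entity):
            return NotImplemented
        return self.entity_id == other.entity_id

    def __lt__(self, other):
        if not isinstance(other, Entity):
            return NotImplemented
        return self.entity_id < other.entity_id

    def __hash__(self):
        # Defining __eq__ disables the inherited hash, so provide one explicitly.
        return hash(self.entity_id)


def test_ordering():
    e1, e2 = Entity(1), Entity(2)
    assert e1 < e2
    assert e2 > e1  # derived from __lt__ and __eq__ by total_ordering
    assert e1 == Entity(1)


test_ordering()
```

The test stays just as trivial as before, because the ids are still handed to the constructor.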
That seems great when you're reading smart books filled with bright insights, but we all know that in real-world code it gets painfully cumbersome to always provide every small piece of information.
We need defaults so badly! The power of great Python libraries such as Requests comes from the fact that they're dead simple to use without a single line of configuration.
Providing defaults is really simple in Python and because the language is so robust there is more than one way to do it.
Factory methods and default arguments:
    ::python
    import uuid


    class Entity(object):
        """
        Super dull class that only has an id.
        And by accident it can be compared...
        """

        @classmethod
        def create(cls, uid_factory=uuid.uuid4):
            # cls is supplied by @classmethod; int() keeps the id an integer as before
            return cls(int(uid_factory()))

        # here comes the rest of implementation
Class properties:

    ::python
    import uuid


    class Entity(object):
        """
        Super dull class that only has an id.
        And by accident it can be compared...
        """

        # staticmethod stops Python from binding the factory as a method
        uid_factory = staticmethod(uuid.uuid4)

        def __init__(self):
            self.entity_id = int(self.__class__.uid_factory())

        # here comes the rest of implementation
Both ways seem equally good to me, and it's mostly a matter of taste which one you prefer.
Default arguments may lead to really long constructor argument lists; if other classes use `Entity`, their methods may carry this burden further into your code.
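To make the default-argument variant concrete, here is a small sketch of how production code and a test might each call the factory method. The `Entity` below is a stripped-down stand-in, and the fixed-value lambda is purely illustrative.

```python
import uuid


class Entity:
    """Minimal stand-in for the Entity class discussed above."""

    def __init__(self, entity_id):
        self.entity_id = entity_id

    @classmethod
    def create(cls, uid_factory=uuid.uuid4):
        # int() turns the UUID from the default factory into a plain integer
        return cls(int(uid_factory()))


# Production code relies on the default and gets a random id.
random_entity = Entity.create()

# A test passes a deterministic factory instead.
fixed_entity = Entity.create(uid_factory=lambda: 7)
assert fixed_entity.entity_id == 7
```

The price is exactly the one described above: every caller that wants control has to thread `uid_factory` through its own signature.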
Class properties are powerful, but if you look at `__init__` there is no obvious way to pass `entity_id` explicitly, so in your tests you would need to configure the class with a custom factory.
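Configuring the class in a test could look roughly like this. It is a sketch that uses `unittest.mock.patch.object` from the standard library so the original factory is restored afterwards. The `staticmethod` wrapper and the value `42` are my own choices for the illustration.

```python
import uuid
from unittest import mock


class Entity:
    # staticmethod keeps Python from binding the factory as a method
    uid_factory = staticmethod(uuid.uuid4)

    def __init__(self):
        self.entity_id = int(self.uid_factory())


def test_entity_uses_configured_factory():
    # patch.object swaps the class attribute and puts the original back on exit
    with mock.patch.object(Entity, "uid_factory", staticmethod(lambda: 42)):
        assert Entity().entity_id == 42
    # outside the block the real uuid4 factory is in place again
    assert Entity().entity_id != 42


test_entity_uses_configured_factory()
```

Restoring the attribute matters: a class-level override that leaks out of one test silently changes every test that runs after it.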
The previous examples were somewhat made up to prove a point, but they're not far removed from real solutions. Let's look at a Django view that fetches google.com and returns the result:
    ::python
    from django.views.generic.base import View
    from django.http import HttpResponse
    import requests


    class GoogleView(View):
        def get(self, request):
            text = requests.get("http://google.com").text
            return HttpResponse(text)
Now imagine an end-to-end test run using Selenium or another framework, or better, a whole suite. How can you prevent it from hitting the network hundreds of times? How can you prevent it from failing unexpectedly on network errors, or on Google errors (this probably won't happen, but who knows)? You can monkeypatch.
What if you provide this code to your client and he runs his acceptance suite? He can monkeypatch. But he will monkeypatch code that he does not know, he will dig into the internals he would never think to care about. He shouldn't have to. He can also mock the whole network layer with HTTPretty. Whatever he does, the person responsible for end-to-end tests will probably hate you.
One simple refactoring in this code can change this:
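    ::python
    from django.views.generic.base import View
    from django.http import HttpResponse
    import requests


    class GoogleView(View):
        # class-level seam: a test suite can replace this with a stub
        http_provider = requests

        def get(self, request):
            text = self.http_provider.get("http://google.com").text
            return HttpResponse(text)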
Now in any BDD framework, such as Behave, one can modify the behavior of this class for the whole suite:
    ::python
    # hypothetical module path; import GoogleView from wherever it lives in your project
    from views import GoogleView


    class StubResponse(object):
        def __init__(self, text):
            self.text = text


    class StubRequests(object):
        def get(self, url, **kwargs):
            return StubResponse("Hello from google!")


    def before_all(context):
        GoogleView.http_provider = StubRequests()
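If you want to be tidy, the hook can remember the original provider and put it back in Behave's `after_all`. The sketch below runs without Django or Behave installed: `GoogleView` is a plain stand-in class, its `None` default is a placeholder for `requests`, the `calls` list is an extra I added for inspection, and `SimpleNamespace` plays the role of Behave's context object.

```python
from types import SimpleNamespace


class StubResponse:
    def __init__(self, text):
        self.text = text


class StubRequests:
    """Stands in for the requests module; only get() is implemented."""

    def __init__(self):
        self.calls = []  # record requested URLs so steps can inspect them

    def get(self, url, **kwargs):
        self.calls.append(url)
        return StubResponse("Hello from google!")


class GoogleView:
    """Stand-in for the Django view; the real one defaults to requests."""

    http_provider = None  # placeholder default for this sketch

    def get(self, request):
        return self.http_provider.get("http://google.com").text


def before_all(context):
    # remember the real provider so it can be restored later
    context.original_provider = GoogleView.http_provider
    GoogleView.http_provider = StubRequests()


def after_all(context):
    GoogleView.http_provider = context.original_provider


# Simulate what Behave does around a test run.
context = SimpleNamespace()
before_all(context)
print(GoogleView().get(request=None))
after_all(context)
```

Recording the calls is optional, but it lets a step assert that the view really went through the provider instead of sneaking around it.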
Let me state it again: tested code is code that works, but testable code is quality code. If your library/framework/product is not painfully difficult to mock/silence/replace/configure in end-to-end tests, you're doing it right.
If there is one bright idea in that big, steamy, obese pile of junk called Java, it's the interface. Interfaces cannot be instantiated, and that's great for tests: if your code depends on interfaces, you can always inject a stub instead of the real thing.
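Python can express the same idea if you want it spelled out. The sketch below uses `typing.Protocol` (Python 3.8+) to describe the only thing the code needs from an HTTP provider. The names `HttpProvider`, `Response`, `StubProvider` and `fetch_text` are all made up for this illustration.

```python
from typing import Protocol


class Response(Protocol):
    text: str


class HttpProvider(Protocol):
    """The 'interface': anything with a matching get() qualifies."""

    def get(self, url: str) -> Response: ...


class StubResponse:
    def __init__(self, text: str) -> None:
        self.text = text


class StubProvider:
    # No inheritance needed; structural typing is enough for a Protocol.
    def get(self, url: str) -> StubResponse:
        return StubResponse("stubbed body for " + url)


def fetch_text(provider: HttpProvider, url: str) -> str:
    # Depends only on the interface, so a stub and the real thing are interchangeable.
    return provider.get(url).text


print(fetch_text(StubProvider(), "http://google.com"))
```

Nothing is enforced at runtime here; the protocol mostly documents the seam and lets a type checker complain when a stub drifts away from it.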
Dynamic languages make it easy to monkeypatch, change behavior at runtime, and so on, but they also breed lazy programmers and libraries that are well unit-tested and fun to work with, yet fail utterly when you try to make them transparent in end-to-end tests. Get things right. Leave a back door for others to configure your classes. Provide reasonable defaults. Others will love your code.