Searching on a suite of failing tests

Aug 13, 2015 - 5 minute read - Comments - tip search naive-search binary-search dichotomous-search dichotomic-search comparison spring context end-to-end-test test-dependency antipattern

Motivation

Today, at a client, in the green phase, we had a test suite which was failing: the whole suite fails but the test cases, individually, succeed.

So my first impression was that something was being shared between tests. Also confirmed because the failing class was an EndToEnd test, in which we load the whole spring context

A quick glance was not revealing anything interesting, so I decided to find which is the minimum suite (as opposed as to the whole suite) that makes the new testcase fail, expecting to narrow the search for possible causes. This is very similar to what QuickCheck does, to generate the minimum testcase that breaks your property. In this way, you can focus in a possibly simpler¹ feature.

This post covers how to perform this search.

Mathematical formulation

Let T be a sequence of tests: i1, i2, ..., in, F, j1, j2, ..., jm

where:

i are green tests
F is first failing test:
- when executed in isolation, is green
- when executed in the sequence, is red
j are tests after the failed test.

In this case, the order is important, as the failed test suite (presumably) comes from a shared state set by a previous test.

The first search-space pruning is to remove the j, as they supposedly don’t have any effect (as they are after the failed test)

let T1 be a subsequence of T that includes i and F: i1, i2, ..., in, F

Then it is assumed that one (or more) of the i generates an inconsistent state that makes F fail.

In the sequence T1, F fails, but in the sequence TF composed by F only, the test succeeds. This can be thought as the equivalent of the Intermediate value theorem, also called Bolzano’s theorem, where the T1 is at one side ot the axis and the TF at another. The theorem proves that there must be at least one value where the domain of the function crosses the axis

I say “thought of” because that theorem is only for continous functions and sets are not (are discrete) but the analogy is good enough: if one sequence is OK but the other isn’t, there must be a minimum sequence where the result is OK and another one where the results are not. They two sequences must not be the same case as one result cannot be OK and not OK at the same time.

Naïve search

The first way of searching would be find the minimum set that fails:

include F, then another from i1,...,in that makes the F fail. F the last one, because it needs to be affected by the side effects from the preceding test.
if the first strategy does not work, for each of the above cases, add another from the i1,...,in (except the one that was added)

For the first step, it takes O(n * 2), assuming executing a single test costs O(1). Reduces to O(n)

For the second step, O(n * n-1 * 3). Reduces to O(n^2)

For the third step, O(n * n-1 * n-2 * 4). Reduces to O(n^3)

For the nth step, it costs O(n * n-1 * n-2 * ... * n-(n-1) * (n+1)). Reduces to O(n^n). Which is polynomical but not feasible for medium-sized n (in an automatic fashion) or small n (in a manual fashion)

Dichotomic search

(This is also known as binary search)

Inspired by git-bisect, I decided to treat the sequence i1, ..., in as the source for the dicotomical search, applying the subsequences to F.

The first step, it takes O(n/2 * n/2) = O(n^2) to execute half the tests

The second step, is to execute half the number of the tests previously executed, either from the sequence before (if the F fails) or from the other half (if F does not fail). The cost is O(n/4 * n/4) = O(n^2)

For the nth step, the cost is O(n/2^n * n/2^n) = O(n^2/2^n) = O(0)

This n in the nth step is smaller than the other n, as each step divides by two the amount of tests to be included.

The amount of tests to be executed is n + n/2 + n/4 + n/8 + ... + 1 which is roughly 2n. Executing each test costs O(1) (by the assumption before), so the total cost is O(2n) = O(n)

If we take it by the amount of steps we need to manually execute is 1 for the whole, 1 for the half, 1 for the quarter, …. = O(log2 n)

Procedure

To keep the executed tests, I created a support branch where I deleted the tests that were selected to be excluded. Always executed “all tests in the suite” as this makes it faster to select in the IDE.

When the half taken was wrong, I reverted the last commit and selected the other half.

After finding the minimum sequence and solving the issue, this support branch was discarded

Conclusion

In the real scenario, with around 100 tests, searching manually in the naïve way would not have been possible. It would have cost 100 steps, as the minimum set that produces F had size 2 (so only one step was necessary).

Applying the dichotomic search, in 8-10 steps I had finished, with the guarantee that no matter how many tests produced the F I would have found it in a reasonable amount of time.

Comments

Finally, the root cause for the failing test F was the OrientDB InMemory implementation with Spring context, as the former does not allow two instances at the same time in the same JVM.

It was solved using @DirtiesContext in both cases of the minimum sequence that forms F, so no matter which order the executor decides, the context will always be clean for the next execution.

We found this thanks to a teammate’s intuition.

Because maybe the minimum testcase is more difficult than another. ↩︎

Brown-bag session: Spring Boot

Aug 12, 2015 - 1 minute read - Comments - brown-bag-session client spring-boot

At a client, today we’ve done a brown-bag session on Spring Boot: we’ve gone through the basics with a sample maven project and its pom file.

We’ve seen the Spring Initializr and a live demo, using curl for REST client.

The Animal Laborans and the Homo Faber

Aug 10, 2015 - 2 minute read - Comments - the-craftsman richard-sennett philosophy hannah-arendt quote animal-laborans homo-faber

I’ve found this quote very interesting from the book “The Craftsman”, by Richard Sennett:

Animal laborans is, as the name implies, the human being akin to a beast of burden, a drudge condemned to routine. Arendt enriched this image by imagining him or her absorbed in a task that shuts out the world, a state well exemplified by Oppenheimer’s feeling that the atomic bomb was a “sweet” problem, or Eichmann’s obsession with making the gas chambers efficient. In the act of making it work, nothing else matters; Animal laborans takes the work as an end it itself.

By contrast, Homo faber is her image of men and women doing another kind of work, making a life in common. Again Arendt enriched an inherited idea. The Lating tag Homo faber means simple “man as maker”. […] Homo faber is the judge of material labor and practice, not Animal laborans’s colleague but his superior. Thus, in her view, we human beings live in two dimensions, In one we make things; in this condition we are amoral, absorbed in a task. We also harbor another, higher way of life in which we stop producing and start discussion and judging together. Whereas Animal laborans is fixated in the question “How?” Homo faber asks “Why?”

Richard Sennett, The Craftsman, Prologue, page 7

The language was prepared for that

Aug 10, 2015 - 2 minute read - Comments - clojure haskell java language-comparison prefix-notation operator overloading

Many times I’ve written this function:

public boolean between(int lowerBound, int n, int upperBound){
	return lowerBound <= n &&
		n <= upperBound;
}

It may depend on the case, whether it is [], [), (] or (), to use mathematical terms.

When the two comparisons are the same ([] and ()), there is duplication in the comparisons.

Investigating a little bit on this in clojure, I’ve found this function:

<=

And its clojuredocs: Returns non-nil if nums are in monotonically non-decreasing order, otherwise false.

A sample usage:

(<= 1 2)
; true

(<= 1 2 1)
; false

The last part is the most interesting one. As this function is prepared to receive more than two parameters, it is very easy for the programmer to use it. We could say that the language was prepared for that.

The implementation:

(defn <=
  ([x] true)
  ([x y] (. clojure.lang.Numbers (lte x y)))
  ([x y & more]
   (if (<= x y)
     (if (next more)
       (recur y (first more) (next more))
       (<= y (first more)))
     false)))

Inspired by this, I’ve implemented the same function in haskell (for the repl):

let isBigger acc ele = (snd acc) && (fst acc) < ele in
   foldl (\acc ele -> (ele, isBigger acc ele)) (1, True) [1,2,1,3]

and a simpler solution I’ve found on Stack Overflow:

isSorted :: (Ord a) => [a] -> Bool
isSorted xs = all (\(x, y) -> x <= y) $ zip xs (tail xs)

isSorted :: (Ord a) => [a] -> Bool
isSorted xs = and $ zipWith (<=) xs (tail xs)

Conclusion

Unless a more elegant, language-provided solution exists in haskell, the clojure one is way simpler. This is one of the benefits of prefix notation, that operators (e.g., +, -, *, <=) are overloaded to take more arguments than before.

Recognizing dependencies

Aug 8, 2015 - 1 minute read - Comments - chapter poodr sandi-metz ruby dependency object quote

From the Chapter 3, Managing Dependencies, from the book Practical Object-Oriented Design in Ruby, by Sandi Metz:

An object has a dependency when it knows:

The name of another class. […]

The name of a message that it intends to send to someone other than self. […]

The arguments that a message requires. […]

The order of those arguments. […]

If an object knows any of these facts about another object, it has dependencies to the other.

This is not to say that having dependencies to others is bad, as

A single object cannot know everything, so inevitably it will have to talk to another object. Chapter 3, Managing Dependencies, Introduction

For this latter purpose, there is the section “Writing loosely coupled code”

Multiple return values in a Mockito Stub

Aug 7, 2015 - 1 minute read - Comments - mockito tip stub java

I’ve been asked today how to return multiple return values from a Mockito Spy, effectively using the spy as a Stub, as well.

package com.example.spike;

import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;
import static org.mockito.Mockito.spy;
import static org.mockito.Mockito.when;

import org.junit.Test;
import org.mockito.Spy;

public class DifferentReturnValues {

	@Spy
	private Spike1 spike1 = new Spike1();


	@Test
	public void spike1() {
		spike1 = spy(spike1);
		when(spike1.getBool()).thenReturn(false, true);

		assertThat(spike1.getBool(), is(false));
		assertThat(spike1.getBool(), is(true));

		assertThat(spike1.getBool(), is(true));
		assertThat(spike1.getBool(), is(true));
	}


	private class Spike1 {
		public boolean getBool() {
			return true;
		}
	}
}

The key line is:

when(spike1.getBool()).thenReturn(false, true);

this makes the stubbed function to return multiple values:

assertThat(spike1.getBool(), is(false));
assertThat(spike1.getBool(), is(true));

The last value is repeated after the last defined value:

@Test
public void spike1() {
	spike1 = spy(spike1);
	when(spike1.getBool()).thenReturn(false, true);

	assertThat(spike1.getBool(), is(false));
	assertThat(spike1.getBool(), is(true));

	assertThat(spike1.getBool(), is(true));
	assertThat(spike1.getBool(), is(true));
}

If you want to loop over the values, you can implement it with the doAnswer method:

@Test
public void spike1() {
	spike1 = spy(spike1);
	when(spike1.getBool()).thenReturn(false, true);

	final boolean[] value = {true};

	doAnswer(invocation -> {
		value[0] = !value[0];
		return value[0];
	}).when(spike1).getBool();

	assertThat(spike1.getBool(), is(false));
	assertThat(spike1.getBool(), is(true));

	assertThat(spike1.getBool(), is(false));
	assertThat(spike1.getBool(), is(true));
}

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39