Posted by & filed under Java.

We’re used to type erasure ruining our day in Java. Want to create a new T() in a generic method? Normally that’s not possible. We can pass a Class<T> around, but that doesn’t work for generic types: we can’t write List<String>.class.

One exception to this was using super type tokens. We can get the generic type of a super class at runtime, so if we can force the developer to subclass our type, then we’re able to find out our own generic types.

It let us do things like

List<String> l1 = new TypeReference<ArrayList<String>>() {}.newInstance();

This was really convenient prior to Java 8, because without lambdas we had to use anonymous inner classes to represent functions, and they could double up as type references.

Unfortunately, the super type token does not work with Lambdas, because Lambdas are not just sugar for anonymous inner classes. They are implemented differently.

However, there’s another trick we can use to get the generic type. It’s far more hacky implementation-wise, so probably not useful in a real scenario, but I think it’s neat nonetheless.

Here’s an example of what we can do: a method that takes a TypeReference<T> and creates an instance of that type.

public static <T> T create(TypeReference<T> type) {
    return type.newInstance();
}
So far just the same as the supertype tokens approach. However, to use it we just need to pass an identity lambda.

This prints hello world

ArrayList<String> list = create(i->i);
list.add("hello");
list.add("world");
list.forEach(System.out::println);

This prints hello=1 world=2

LinkedHashMap<String, Integer> map = create(i->i);
map.put("hello", 1);
map.put("world", 2);
map.forEach((k, v) -> System.out.println(k + "=" + v));

We could also assign it to a variable. This prints String

TypeReference<String> ref = i->i;

Unfortunately, it won’t work for the main motivation for supertype tokens – we can’t use this TypeReference as a key in a map because it will be a different instance each time.

First Attempt – Casting

For my first attempt I noticed that if we try casting something, and the cast is invalid we’ll get a ClassCastException at runtime that will tell us exactly what the type actually is. This does work with lambdas, since they’re translated into normal methods.

Under the hood, the above Java snippet with a TypeReference<String> has been translated into something like

private static java.lang.String lambda$main$0(java.lang.String input) {
    return input;
}
As you can see, calling this with a type other than String is going to generate a ClassCastException.

So we could invoke the lambda with a type other than it is expecting and pull the type name from the resulting exception.

This worked, but it suggested that a better way is possible: since we’re invoking a real method, we should be able to interrogate it with reflection.

Better Attempt – Reflection

If we force our lambda to be Serializable, by having our TypeReference interface extend Serializable, we can get access to a SerializedLambda instance. This gives us access to the containing class and the lambda method’s name, and then we’re able to use the normal reflection API to interrogate the types.

Here’s a MethodFinder interface that we can extend our TypeReference from, which gives us access to the parameter types.
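The MethodFinder code itself isn’t shown above, but the mechanics can be sketched. The following is a hedged reconstruction (the interface and method names are mine, not necessarily the post’s): a Serializable lambda carries a synthetic writeReplace method that returns a java.lang.invoke.SerializedLambda, which names the containing class and the synthetic method implementing the lambda; ordinary reflection on that method then reveals the compiled-in parameter types.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class MethodFinderSketch {
    // Hypothetical stand-in for the post's TypeReference: an identity
    // function that is also Serializable, so the lambda gets writeReplace.
    interface TypeReference<T> extends Serializable {
        T apply(T value);
    }

    // Serializable lambdas have a synthetic writeReplace method that
    // returns a SerializedLambda describing the implementation method.
    static SerializedLambda serialized(Object lambda) throws Exception {
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        return (SerializedLambda) writeReplace.invoke(lambda);
    }

    // Look the implementation method up in its containing class, where the
    // compiler has recorded the lambda's concrete parameter types.
    static Method lambdaMethod(SerializedLambda lambda) throws Exception {
        Class<?> containing = Class.forName(lambda.getImplClass().replace('/', '.'));
        for (Method m : containing.getDeclaredMethods()) {
            if (m.getName().equals(lambda.getImplMethodName())) {
                return m;
            }
        }
        throw new NoSuchMethodException(lambda.getImplMethodName());
    }

    static String parameterTypeName() {
        try {
            TypeReference<String> ref = value -> value;
            Method method = lambdaMethod(serialized(ref));
            return method.getParameterTypes()[0].getSimpleName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parameterTypeName()); // prints String
    }
}
```

Because the compiler emits the lambda body as a method taking the instantiated type (String here, as shown in the snippet above), the parameter type recovered via reflection is the generic type we were after.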

Parameter Objects

Let’s consider a more concrete example of why lambdas that are aware of their types can be useful. One application is parameter objects. Extract parameter-object is a common refactoring.

It lets us go from a method with too many parameters, to a method that takes a parameter object.


List<Customer> customers = listCustomers(dateFrom, includeHidden, companyName, haveOrders);


List<Customer> customers = listCustomers(customerQuerySpec);

Unfortunately, the way this is commonly implemented simply moves the problem one level up, to the constructor of the parameter object.


CustomerQuerySpecification customerQuerySpec = new CustomerQuerySpecification(dateFrom, includeHidden, companyName, haveOrders);
List<Customer> customers = listCustomers(customerQuerySpec);

We still have just as many parameters, and the call is still just as hard to follow. Furthermore, we now have to import the CustomerQuerySpecification type. The CustomerQuerySpecification type that your IDE might generate for you is also quite big. So this isn’t ideal.

At this point we might reach for the builder pattern, or a variation thereof, to help us name our parameters. However, there are alternatives.

If we were using JavaScript we might pass an object literal in this scenario, to give ourselves default parameter values and named parameters.

listCustomers({
    includeHidden: true,
    companyName: "A Company"
});

We can achieve something similar in Java using lambdas. (Or we could pass an anonymous inner class and use double-brace initialisation to override values)

First of all we’ll create a Parameter Object to store our parameters. It can also have default values. Instead of using getters/setters and a constructor I’m going to deliberately use public fields.

public static class CustomerQueryOptions {
    public Date from = null;
    public boolean includeHidden = true;
    public String companyName = null;
    public boolean haveOrders = true;
}

Now we want a way of overriding these default values in a given call to our method. One way of doing this is, instead of accepting the CustomerQueryOptions directly, accepting a function that mutates the CustomerQueryOptions. If we do this then we can easily specify our overrides at the callsite.

listCustomers(config -> {
    config.includeHidden = true;
    config.companyName = "A Company";
});

You might notice that this lambda looks a lot like a Consumer<CustomerQueryOptions> – it accepts a config and returns nothing.

We could just use a Consumer as is, but we can make life easier for ourselves with a little utility method that just gives us the config back and applies the function to it.

Let’s make a Parameters interface that extends consumer. We’ll add a default method to it that returns our config. It instantiates the config for us, applies the consumer function to it in order to override any default values, and then returns the instantiated config.

First we’ll need a way of creating an instance of CustomerQueryOptions ourselves; this is where our Newable<T> interface comes in. We define a NewableConsumer<T>

interface NewableConsumer<T> extends Consumer<T>, Newable<T> {
    default Consumer<T> consumer() {
        return this;
    }
}

And now we define our Parameters interface extending NewableConsumer

public interface Parameters<T> extends NewableConsumer<T> {
    default T get() {
        T t = newInstance(); // provided by Newable<T>
        accept(t); // apply our config
        return t; // return the config ready to use
    }
}

Our listCustomers method can now accept a Parameters<CustomerQueryOptions> and call get() to obtain the configured options.

public List<Customer> listCustomers(Parameters<CustomerQueryOptions> spec) {
    CustomerQueryOptions config = spec.get();
    // ...
}

This would even work with generic types.

The following will print hello world.

foo(list -> {
    list.add("hello");
    list.add("world");
    // list.add(5); this would be a compile failure
});

public static void foo(Parameters<ArrayList<String>> params) {
    params.get().forEach(System.out::println);
}


We can use a hack to make lambdas aware of their generic type. It’s a shame that it’s rather too terrible to use for real, because it would be really useful for some of the reasons outlined above.

Unlike the super type tokens, we also cannot use it as a key in a map, because we’ll get a different lambda instance each time.

Does anyone have an alternative approach?

This post was inspired by Duncan’s use of this pattern in Expec8ions.

The code from this post is available on github.

Posted by & filed under ContinuousDelivery, XP.

There was a recent discussion on the Extreme Programming mailing list kicked off by Ron Jeffries saying he wants his XP back.

The implication being that Extreme Programming is no longer practised, and that most “Agile” organisations are actually practising Flaccid Scrum – some agile process but little of the technical practices from Extreme Programming.

Update: Ron clarifies in the comments that we agree that extreme programming is still practised, but it would be good if it were practised by more teams.

I disagree with this premise. Extreme Programming is alive and well, at least here in London. We have XProlo, eXtreme Tuesday Club, XPDay, and many other communities dedicated to XP practices under other names like Continuous Delivery and Software Craftsmanship. There are enough organisations practising Extreme Programming for us to organise regular developer exchanges to cross-pollinate ideas. Extreme Programming skills such as test-driven development and continuous integration are highly in demand in job descriptions, even if there is much confusion about what these things actually entail.

When I say that Extreme Programming is alive and well, I do not mean we are working in exactly the same way as described in Kent Beck’s Extreme Programming Explained book. Rather, we still have the same values, and have continued to evolve our technical and team practices. Kent Beck says

“my goal in laying out the project style was to take everything I know to be valuable about software engineering and turn the dials to 10.”

Well now we have turned the dials up to eleven. What does modern Extreme Programming look like?

Turning the dials up to 11

Here are some of the ways we are now more extreme than outlined in Extreme Programming Explained.

Pair Programming becomes Mob Programming

Update: Apparently XP Teams are so aligned that Rachel has written a similar blog post, covering this in more detail.

XP Explained says “Write all production programs with two people sitting at one machine”. We’ve turned this up to eleven by choosing how many people are appropriate for a task. We treat a pair as a minimum for production code, but often choose to work with the whole team around a single workstation.

Mobbing is great when the whole team needs to know how something will work, or when you need to brainstorm and clarify ideas and refinements as you build. It also reduces the impact of interruptions: team members can peel in and out of the mob as they like with minimal disruption, while a pair might be completely derailed by an interruption.

When pair programming it’s encouraged to rotate partners regularly to ensure knowledge gets shared around the team and to keep things fresh. Mobbing obviates the need to rotate for knowledge sharing, and takes away the problem of fragmented knowledge that sometimes results from pair rotation.

Continuous Integration becomes Continuous Deployment

In Extreme Programming explained Kent Beck explains that “XP shortens the release cycle”, but still talks about planning “releases once a Quarter”. It suggests we should have a ten-minute build, integrate with the rest of the team after no more than a couple of hours, and do Daily Deployments.

We can take these practices to more of an extreme.

  • Deploy to production after no more than a couple of hours
  • Not just build, but deploy to production, in under ten minutes
  • Allow the business to choose to release whenever they wish, because we decouple deployment from release

Each of these vertical blue lines is our team deploying a service to production during an afternoon.

I think of Continuous Deployment as full Continuous Integration. Not only are we integrating our changes with the rest of the team, but with the rest of the world, in our production environment. We reduce release and deployment pain to zero because we’re doing it all the time, and we get feedback from our production monitoring, our customers, and our users.

David Farley recently said “Reduce cycle time and the rest falls out” at Pipeline 2015. When turning up the dial on release frequency we find we need to turn other dials too.

Collective Code-Ownership becomes Collective Product-Ownership

XP Explained suggests that “Anyone on the team can improve any part of the system at any time” but mostly discusses the idea of shared code – anyone is free to modify any part of the codebase that team owns. However, it also suggests that we need real customer involvement in the team.

We wish to move fast for two reasons.

  1. To generate value as rapidly as possible
  2. To get feedback as frequently as possible to ensure the above

Continuous Deployment helps us move fast and get valuable feedback frequently, but the freedom to deploy continually is only practicable if our teams also have a collective responsibility for maintaining our systems in production. To make sensible decisions about what to build and release we also need the responsibility of being woken up at 2am when it goes wrong.

Continuous deployment and collective ownership of infrastructure operations mean we can move fast, but then the bottleneck becomes the customer. We can learn rapidly whether our changes are having the desired effect, but the value of the feedback is not fully realised if we cannot act upon it to change direction until a scheduled customer interaction such as a prioritisation meeting.

Hence we extend collective ownership to all aspects of product development.

  1. Product planning
  2. Coding
  3. Keeping it running in production

Everyone on the team should not only be free, but encouraged, to

  • Change any part of the codebase
  • Change any part of our production infrastructure
  • Discuss product direction with business decision makers.

Products not Projects

Collective product ownership implies some more things. Instead of giving teams a project to build a product or some major set of features over a time period, the team is given ownership of one or more Products and tasked with adding value to that product. This approach allows for full collective product ownership, close collaboration with customers, and fully embracing change as we learn which features turn out to be valuable and which do not.

This approach is similar to the military idea of Commander’s Intent. The team is aware of the high-level business goals, but it is up to the team, with an embedded customer, to develop their own plans to transform that intent into action.

Hypotheses as well as Stories

User stories are placeholders for conversation and delivery of small units of customer-visible functionality. These are useful, but often we make decisions about which features to prioritise based on many assumptions. If we can identify these assumptions we can construct tests to confirm whether they are accurate and reduce the risk of spending time implementing the feature.

When working in close collaboration with customers to test assumptions and collectively decide what to do there’s also less need for estimation of the cost of stories, and there’s certainly less need to maintain long backlogs of stories that we plan to do in the future. We can focus on what we need to learn and build in the immediate future.

Test-First Programming becomes Monitoring-First Programming

Or TDD becomes MDD. When we collectively own the whole product lifecycle we can write our automated production monitoring checks first, and see them fail before we start implementing our features.

This forces us to make sure that features can be easily monitored, and helps us avoid scope creep, just like TDD, while ensuring we have good production monitoring coverage.

It also helps us have trust in our production systems, and the courage to make frequent changes to them.

Just like with TDD, MDD gives us feedback about the design of our systems, which we can use to improve our designs. It also gives us a rhythm for doing so.

Continuous Learning

Nearly all of the above are designed to maximise learning, whether it is from our peers during development, our production environment when we integrate our changes, our users when we release changes, or our customers when we test hypotheses.

But it’s also important to have space for individual learning; it helps retain people and benefits the team.

Extreme Programming practices have changed in the last 15 years because we are continually learning. Some of the ways to provide space for learning include

  • Gold cards/20% time – provide dedicated and regular time for individuals in the team to do work of their own choosing. Providing opportunity for bottom-up innovation.
  • Dev-exchanges – regularly swap developers with other organisations to allow for sharing of ideas between organisations
  • Meetups and Conferences – Encouraging each other to attend and speak at conferences and local meetups helps accelerate learning from other organisations.
  • Team-Rotations – Regularly swapping people between teams inside the organisation helps spread internal ideas around.

Posted by & filed under Conferences, ContinuousDelivery, XP.

One of the more interesting questions that came up at Pipeline Conference was:

“How can we mitigate the risk of releasing a change that damages our data?”

When we have a database holding data that may be updated and deleted, as well as inserted/queried, then there’s a risk of releasing a change that causes the data to be mutated destructively. We could lose valuable data, or worse – have incorrect data upon which we make invalid business decisions.

Point-in-time backups are insufficient. For many organisations, simply being unable to access the missing or damaged data for an extended period of time while a backup is restored would have an enormous cost. Invalid data used to make business decisions could also result in a large loss.

Worse, with most kinds of database it’s much harder to roll back the database to a point in time than it is to roll back our code. It’s also hard to isolate and roll back the bad data while retaining the good data inserted since a change.

How can we avoid release-paralysis when there’s risk of catastrophic data damage if we release bad changes?

Practices like having good automated tests and pair programming may reduce the risk of releasing a broken change – but in the worst-case scenario where they don’t catch a destructive bug, how can we mitigate its impact?

Here are some techniques I think can help.

Release more Frequently

This may sound counter-intuitive. If every release we make has a risk of damaging our data, surely releasing more frequently increases that risk?

A lot has been written about this. The reality seems to be that the more frequent our releases, the smaller they are, which means the chances of any one release causing a problem are reduced.

We are able to reason about the impact of a tiny change more easily than a huge change. This helps us to think through potential problems when reviewing before deployment.

We’re also more easily able to confirm that a small change is behaving as expected in production. This means we should notice any undesirable behaviour more quickly, especially if we are practising monitoring-driven development.

Attempting to release more frequently will likely force you to think about the risks involved in releasing your system, and consider other ways to mitigate them. Such as…

Group Data by Importance

Not all data is equally important. You probably care a lot that financial transactions are not lost or damaged, but you may not care quite so much whether you know when a user last logged into your system.

If every change you release is theoretically able to both update the user’s last logged in date, and modify financial transactions, then there’s some level of risk that it does the latter when you intended it to do the former.

Simply using different credentials and permissions to control which parts of your system can modify what data can increase your confidence in changing less critical parts of your system. Most databases support quite fine-grained permissions to restrict what applications are able to do, and you can also physically separate different categories of data.
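As a sketch, with purely illustrative table and role names (none of these come from the post), the permissions might look like this in SQL:

```sql
-- The web app may read and update low-value data...
GRANT SELECT, INSERT, UPDATE ON user_profiles TO webapp;

-- ...but may only read high-value data; it cannot UPDATE or DELETE
-- financial transactions even if a bad release tries to.
GRANT SELECT ON financial_transactions TO webapp;

-- Only the payments service may append new transactions.
GRANT INSERT ON financial_transactions TO payments_service;
```

A destructive bug in the web app's code now fails with a permissions error rather than silently damaging the valuable data.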

Separate Reads from Writes

If you separate the responsibilities of reading/writing data into separate applications (or even clearly separated parts of the same application), you can make changes to code that can only read data with more peace of mind, knowing there’s limits to how badly it can go wrong.

Command/Query Responsibility Segregation can also help simplify conceptual models, and can simplify certain performance problems.

Make Important data Immutable

If your data is very important, why allow it to be updated at all? If it can’t be changed, then there’s no risk of something you release damaging it.

There’s no reason you should ever need to alter a financial transaction or the fact that an event has occurred.

There are often performance reasons to have mutable data, but there’s rarely a reason that your canonical datastore needs to be mutable.

I like using append-only logs as the canonical source of data.

If changes to a data model are made by playing through an immutable log, then we can always recover data by replaying the log with an older snapshot.
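As a minimal sketch in Java (the event shape and names are mine, not from the post), an append-only log as the canonical store might look like this, with the mutable view derived by replay:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AccountLedger {
    // A hypothetical immutable event.
    static final class Deposit {
        final String accountId;
        final long amountInPence;
        Deposit(String accountId, long amountInPence) {
            this.accountId = accountId;
            this.amountInPence = amountInPence;
        }
    }

    // The canonical store: events are only ever appended,
    // never updated or deleted.
    private final List<Deposit> log = new ArrayList<>();

    public void append(Deposit event) {
        log.add(event);
    }

    // The mutable view is derived, not canonical: replaying the log (or an
    // older snapshot plus the events after it) rebuilds it from scratch.
    public Map<String, Long> replayBalances() {
        Map<String, Long> balances = new HashMap<>();
        for (Deposit d : log) {
            balances.merge(d.accountId, d.amountInPence, Long::sum);
        }
        return balances;
    }
}
```

Because no release can mutate the log, a bug in the balance calculation damages only the derived view, which can always be rebuilt.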

If you have an existing system with a lot of mutable database state, and you can’t easily change it, you may be able to get some of the benefits by using your database well. Postgres allows you to archive its Write-Ahead Logs. If you archive a copy of these you can use them to restore the state of the database at any arbitrary point in time, and hence recover data even if it was not captured in a snapshot backup.
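In PostgreSQL, for example, WAL archiving and point-in-time recovery use standard settings; the paths and target timestamp here are illustrative:

```ini
# postgresql.conf on the primary: keep a copy of every WAL segment
archive_mode = on
archive_command = 'cp %p /var/lib/pg_archive/%f'

# To recover, restore a base backup, then set (PostgreSQL 12+ style;
# older versions used recovery.conf):
restore_command = 'cp /var/lib/pg_archive/%f %p'
recovery_target_time = '2015-03-01 09:00:00'
```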

Delayed Replicas

Let’s say we mess up really badly and destroy/damage data. Having good snapshot backups probably isn’t going to save us, especially if we have a lot of data. Restoring a big database from a snapshot can take a significant amount of time. You might even have to do it with your system offline or degraded, to avoid making the problem worse.

Some databases have a neat feature called delayed replication. This allows you to have a primary database and replicate changes to copies, some of which you delay by a specified amount of time.

This gives you a window of opportunity to spot a problem. If you do spot a problem you have the option to fail over to a slightly old version, or to recover data without having to wait for a backup to restore. For example, you could have a standby server that is 10 minutes delayed, another at 30 minutes, and another at an hour.

When you notice the problem you can either failover to the replica or stop replication and merge selected data back from the replica to the master.
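In PostgreSQL, for example, a delayed standby is a single setting (available since 9.4; shown here as configured in postgresql.conf on PostgreSQL 12+):

```ini
# On the standby: hold replicated changes back by 30 minutes before applying
recovery_min_apply_delay = '30min'
```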

Even if your database can’t do this, you could also build it in at the application level. Particularly if your source of truth is a log or event-stream.

Verify Parallel Producers

There will always be some changes that are riskier than others. It’s easy to shy away from updating code that deals with valuable data. This in turn makes the problem worse: it can lead to the valuable code being the oldest and least clean code in your system. Old and smelly code tends to be riskier to change and release.

Steve Smith described an application pattern called verify branch by abstraction, which I have used successfully to incrementally replace consumers of data such as reporting tools or complex calculations.

A variation of this technique can also be used to incrementally and safely make large, otherwise risky changes, to producers of data. i.e. things that might need to write to stores of important data and can potentially damage them.

In this case we fork our incoming data just before the point at which we want to make a change. This could be achieved by sending HTTP requests to multiple implementations of a service, by having two implementations of a service ingesting different event logs, or simply having multiple implementations of an interface within an application, one of which delegates to both the old and new implementation.

At whatever stage we fork our input, we write the output to a separate datastore that is not yet used by our application.

We can then compare the old and new datastore after leaving it running in parallel for a suitable amount of time.

If we are expecting it to take some time to make our changes, it may be worth automating this comparison of old and new. We can have the application read from both the old and the new datastores, and trigger an alert if the results differ. If this is too difficult we could simply automate periodic comparisons of the datastores themselves and trigger alerts if they differ.
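The forked-write-and-compare idea above can be sketched in Java (all names here are illustrative, not from Steve Smith's write-up): an implementation that delegates writes to both the old and new implementations, serves reads from the trusted one, and raises an alert when the candidate diverges.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiConsumer;

public class VerifyingStore {
    interface CustomerStore {
        void save(String id, String record);
        String load(String id);
    }

    // A toy in-memory store for the sketch.
    static class MapStore implements CustomerStore {
        private final Map<String, String> data = new HashMap<>();
        public void save(String id, String record) { data.put(id, record); }
        public String load(String id) { return data.get(id); }
    }

    private final CustomerStore current;   // existing, trusted producer
    private final CustomerStore candidate; // new producer under verification
    private final BiConsumer<String, String> alert; // e.g. wired to monitoring

    VerifyingStore(CustomerStore current, CustomerStore candidate,
                   BiConsumer<String, String> alert) {
        this.current = current;
        this.candidate = candidate;
        this.alert = alert;
    }

    void save(String id, String record) {
        current.save(id, record);   // the write path we trust
        candidate.save(id, record); // forked write to the separate datastore
    }

    String load(String id) {
        String trusted = current.load(id);
        String unverified = candidate.load(id);
        if (!Objects.equals(trusted, unverified)) {
            alert.accept(trusted, unverified); // results differ: raise an alert
        }
        return trusted; // users only ever see the trusted result
    }
}
```

Once the candidate has run in parallel long enough without triggering alerts, it can be promoted to the trusted position and the old implementation deleted.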


If you’re worried about the risk to your data of changing your application, look at how you can change your design to reduce the risk. Releasing less frequently will just increase the risk.

Posted by & filed under Conferences, ContinuousDelivery, XP.

Alex and I recently gave a talk at Pipeline Conference about our approach of testing in production.

With our limited time we focused on things we check in production: running our acceptance/integration tests, performance tests, and data fuzzing against our production systems. We also prefer doing user acceptance testing and exploratory testing in production.

In a Continuous-Deployment environment with several releases a day there’s little need or time for staging/testing environments. They just delay the rate at which we can make changes to production. We can always hide incomplete/untested features from users with Feature Toggles.

“How do you cope with junk data in production?”

The best question we were asked about the approach of both checking and testing in production, was “How do you cope with the junk test data that it produces?”. Whether it is an automated system injecting bad data into our application, or a human looking for new ways to break the system, we don’t want to see this test data polluting real users’ views or reports. How do we handle this?

Stateless or Read Only Applications

Sometimes we cheat, because it’s possible to make a separately releasable part of a system entirely stateless. The application is effectively a pure function: data goes in, data comes out with some deterministic transformation.

These are straightforward to both check and test in production, as no action we take can result in unexpected side-effects. It’s also very hard to alter the behaviour of a stateless system, but not impossible – for example if you overload the system, its performance will be altered.

Similarly, we can test and check read-only applications to our heart’s content without worrying about data we generate. If we can keep things that read and write data separate, we don’t have to worry about any testing of the read-only parts.

Side-Effect Toggles

When we do have side-effects, if we make the side-effects controllable we can avoid triggering them except when explicitly checking that they exist.

For example, an ad unit on a web page is generally read-only in that no action you can perform with it can change it. However, it does trigger side effects in that the advertising company can track that you are viewing or clicking on the ad.

If we had a way of loading the ad, but could disable its ability to send out tracking events, then we can check any other behaviour of the ad without worrying about the side effects. This technique is useful for running selenium webdriver tests against production systems to check the user interactions, without triggering side effects.

In a more complex application we could have the ability to only grant certain users read-only access. That way we can be sure that bots or humans using those accounts can’t generate invalid data.

Data Toggles

Ultimately, if we are going to check or test that our production systems are behaving as we expect, we do need to be able to write data. One way to deal with this is data toggles: a simple boolean flag against a record which indicates that it is test data. You can then use a similar flag at a user or query level to hide or show the test data.
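A hedged sketch of the idea in Java (the record shape and names are mine): the flag is set at write time, and queries filter on it unless the caller has opted in to seeing test data.

```java
import java.util.ArrayList;
import java.util.List;

public class DataToggles {
    // A hypothetical record carrying the data-toggle flag.
    static final class Order {
        final String id;
        final boolean testData;
        Order(String id, boolean testData) {
            this.id = id;
            this.testData = testData;
        }
    }

    // Real users never see test data; test users can opt in to seeing it.
    static List<Order> visibleTo(boolean showTestData, List<Order> orders) {
        List<Order> visible = new ArrayList<>();
        for (Order order : orders) {
            if (showTestData || !order.testData) {
                visible.add(order);
            }
        }
        return visible;
    }
}
```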

This might sound like a bit of a pain, but often this functionality is needed in any case to fulfil business requirements:

Reporting systems often need a way to filter out data which is invalid, anomalous, or outdated. Test data is just one type of data that is invalid in normal reports.

Many systems need a security system to control which users have access to what data. This is exactly what we want to achieve – hiding data generated by test users from real users.

We can often re-use the permissions and filtering systems that we needed to build anyway to hide our test data.

Posted by & filed under c#, Java.

A feature often missed in Java by C# developers is yield return.

It can be used to create Iterators/Generators easily. For example, we can print the infinite series of positive numbers like so:

public static void Main()
{
    foreach (int i in positiveIntegers())
        Console.WriteLine(i);
}

public static IEnumerable<int> positiveIntegers()
{
    int i = 0;
    while (true) yield return ++i;
}

Annoyingly, I don’t think there’s a good way to implement this in Java, because it relies on compiler transformations.

If we want to use it in Java there are three main approaches I am aware of, which have various drawbacks.

The compile-time approach means your code can’t be compiled with javac alone, which is a significant disadvantage.

The bytecode transformation approach means magic going on that you can’t easily understand by following the code. I’ve been burnt by obscure problems with aspect-oriented-programming frameworks using bytecode manipulation enough times to avoid it.

The threads approach has a runtime performance cost of extra threads. We also need to dispose of the created threads or we will leak memory.

I don’t personally want the feature enough to put up with the drawbacks of any of these approaches.

That being said, if you were willing to put up with one of these approaches, can we make them look cleaner in our code?

I’m going to ignore the lombok/compile-time transformation approach as it allows pretty much anything.

Both of the other approaches require writing valid Java. The threads approach is particularly verbose, but there is a wrapper which simplifies it down to returning an anonymous implementation of an abstract class that provides yield / yieldBreak methods. e.g.

public Iterable<Integer> oneToFive() {
    return new Yielder<Integer>() {
        protected void yieldNextCore() {
            for (int i = 1; i < 10; i++) {
                if (i == 6) yieldBreak();
                yield(i);
            }
        }
    };
}

This is quite ugly compared to the C# equivalent. We can make it cleaner now that we have lambdas, but we can’t use the same approach as above.

I’m going to use the threading approach for this example as it’s easier to see what’s going on.
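Here’s a hedged sketch of how the threading approach can work underneath (the class and method names are mine): the generator body runs on its own daemon thread, handing each value to the consuming iterator through a SynchronousQueue. It also shows the drawback mentioned above: if the consumer abandons iteration early, the producer thread is left blocked.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;
import java.util.function.Consumer;

public class ThreadedGenerator<T> implements Iterable<T> {
    private static final Object END = new Object(); // end-of-stream sentinel
    private final Consumer<Consumer<T>> body; // the generator body

    public ThreadedGenerator(Consumer<Consumer<T>> body) {
        this.body = body;
    }

    @Override
    public Iterator<T> iterator() {
        SynchronousQueue<Object> queue = new SynchronousQueue<>();
        Thread producer = new Thread(() -> {
            try {
                body.accept(value -> {
                    try {
                        queue.put(value); // hand each value to the consumer
                    } catch (InterruptedException e) {
                        throw new IllegalStateException(e);
                    }
                });
                queue.put(END);
            } catch (InterruptedException | IllegalStateException e) {
                // consumer stopped iterating; let the thread die
            }
        });
        producer.setDaemon(true); // don't keep the JVM alive
        producer.start();
        return new Iterator<T>() {
            private Object next; // assumes yielded values are non-null

            @Override
            public boolean hasNext() {
                if (next == null) {
                    try {
                        next = queue.take();
                    } catch (InterruptedException e) {
                        throw new IllegalStateException(e);
                    }
                }
                return next != END;
            }

            @SuppressWarnings("unchecked")
            @Override
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                T value = (T) next;
                next = null;
                return value;
            }
        };
    }
}
```

Usage looks like `Iterable<Integer> oneToFive = new ThreadedGenerator<>(y -> { for (int i = 1; i <= 5; i++) y.accept(i); });`, which is already close to the lambda style we build towards below.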

Let’s say we have an interface Foo which extends Runnable, and provides an additional default method.

interface Foo extends Runnable {
    default void bar() {}
}

If we create an instance of this as an anonymous inner class, we can call the bar() method from our implementation of run()

Foo foo = new Foo() {
    public void run() {
        bar(); // compiles fine
    }
};

However, if we create our implementation with a lambda, this no longer compiles

Foo foo = () -> {
    bar(); // java: cannot find symbol. symbol: method bar()
};

This means we’ll have to take a different approach. Here’s something we can do, that is significantly cleaner thanks to lambdas.

public Yielderable<Integer> oneToFive() {
    return yield -> {
        for (int i = 1; i < 10; i++) {
            if (i == 6) yield.breaking();
            yield.returning(i);
        }
    };
}

How can this work? Note the change of the method return type from Iterable to a new interface, Yielderable. This means we can return any structurally equivalent lambda. Since we want to create an Iterable, we’ll make Yielderable extend Iterable so it can behave as one.

public interface Yielderable<T> extends Iterable<T> { /* ... */ }

The Iterable interface already has an abstract method iterator(), which means we’ll need to implement that if we want to add a new method to build our yield statement, while still remaining a lambda-compatible single method interface.

We can add a default implementation of iterator() that executes a yield definition defined by a new abstract method.

public interface Yielderable<T> extends Iterable<T> {
    void execute(YieldDefinition<T> builder);
    default Iterator<T> iterator() { /* ... */ }
}

We now have a single abstract method, still compatible with a lambda. It accepts one parameter – a YieldDefinition. This means we can call methods defined on YieldDefinition in our lambda implementation. The default iterator() method can create an instance of a YieldDefinition and invoke the execute method for us.

public interface Yielderable<T> extends Iterable<T> {
    void execute(YieldDefinition<T> builder);
    default Iterator<T> iterator() {
        YieldDefinition<T> definition = new YieldDefinition<>();
        //... more implementation needed.
    }
}

Our YieldDefinition class provides the returning(value) and breaking() methods to use in our yield definition.

class YieldDefinition<T> {
    public void returning(T value) { /* ... */ }
    public void breaking() { /* ... */ }
}

Now Java can infer the type of the lambda parameter, allowing us to call its methods in the lambda body. You should even get code completion in your IDE of choice.
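To make the mechanics concrete, here is a minimal, deliberately eager sketch of these interfaces. It is a simplification, not the real implementation: the threaded version suspends the definition between elements so it stays lazy, whereas this one runs the definition to completion and collects the values, modelling breaking() with an exception. The lambda parameter is also renamed to y, since yield became a restricted identifier in newer versions of Java.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified, eager sketch of the Yielderable machinery.
interface Yielderable<T> extends Iterable<T> {
    void execute(YieldDefinition<T> builder);

    default Iterator<T> iterator() {
        YieldDefinition<T> definition = new YieldDefinition<>();
        try {
            execute(definition);
        } catch (YieldDefinition.BreakException ignored) {
            // breaking() ends the definition early
        }
        return definition.values.iterator();
    }
}

class YieldDefinition<T> {
    final List<T> values = new ArrayList<>();

    public void returning(T value) { values.add(value); }

    public void breaking() { throw new BreakException(); }

    static class BreakException extends RuntimeException {}
}

public class YieldExample {
    public static Yielderable<Integer> oneToFive() {
        return y -> {
            for (int i = 1; i < 10; i++) {
                if (i == 6) y.breaking();
                y.returning(i);
            }
        };
    }

    public static void main(String... args) {
        for (int i : oneToFive()) System.out.print(i); // prints 12345
    }
}
```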

I have made a full implementation of the threaded approach using the above lambda definition style, if you are interested.

There are some more examples of what is possible, and the full implementation, for reference.

Posted by & filed under Java.

The Java Streams API is lovely, but there are a few operations that one repeats over and over again which could be easier.

One example of this is Collecting to a List.

List<String> input = asList("foo", "bar");
List<String> filtered = input
  .stream()
  .filter(s -> s.startsWith("f"))
  .collect(Collectors.toList());

There are a few other annoyances too. For example I find myself using flatMap almost exclusively with methods that return a Collection not a Stream. Which means we have to do

List<String> result = input
  .stream()
  .flatMap(s -> Example.threeOf(s).stream())
  .collect(Collectors.toList());
public static List<String> threeOf(String input) {
  return asList(input, input, input);
}

It would be nice to have a convenience method to allow us to use a method reference in this very common scenario

List<String> result = input
  .stream()
  .flatMap(Example::threeOf) // Won't compile, threeOf returns a collection, not a stream
  .collect(Collectors.toList());

Benjamin Winterberg has written a good guide to getting IntelliJ to generate these for us, but we could extend the API ourselves.

Extending the Streams API is a little tricky as it is chainable, but it is possible. The Forwarding Interface Pattern lets us add methods to List; here we also want to add a new method to Stream.

What we can end up with is

EnhancedList<String> input = () -> asList("foo","bar");
List<String> filtered = input
  .stream()
  .filter(s -> s.startsWith("f"))
  .toList();

To do this we can first use the Forwarding Interface Pattern to create an enhanced Stream with our toList method, starting with a ForwardingStream.

interface ForwardingStream<T> extends Stream<T> {
  Stream<T> delegate();
  default Stream<T> filter(Predicate<? super T> predicate) {
    return delegate().filter(predicate);
  }
  // Other methods omitted for brevity
}

Now, since Stream provides a chainable API with methods that return Streams, when we implement our EnhancedStream we need to change the subset of methods that return a Stream to return our EnhancedStream.

interface EnhancedStream<T> extends ForwardingStream<T> {
  default EnhancedStream<T> filter(Predicate<? super T> predicate) {
    return () -> ForwardingStream.super.filter(predicate);
  }
  // Other methods omitted for brevity
}

Note that the filter method already exists on Stream and ForwardingStream, but as Java supports covariant return types we can override it and change the return type to a more specific type (in this case from Stream to EnhancedStream).

You may also notice the lambda returned from the filter() method. Since ForwardingStream is a single method interface with just a Stream delegate() method it is compatible with a lambda that supplies a delegate Stream. Equivalent to a Supplier. EnhancedStream extends ForwardingStream and doesn’t declare any additional abstract methods, so we can return lambdas from each method that needs to return a Stream, delegating to the ForwardingStream. The ForwardingStream.super.filter() syntax allows us to call the implementation already defined in ForwardingStream explicitly.

Now we can add our additional behaviour to the EnhancedStream. Let’s add the toList() method, and a new flatMapCollection.

interface EnhancedStream<T> extends ForwardingStream<T> {
  default List<T> toList() {
    return collect(Collectors.toList());
  }
  default <R> EnhancedStream<R> flatMapCollection(Function<? super T, ? extends Collection<? extends R>> mapper) {
    return flatMap(mapper.andThen(Collection::stream));
  }
  // Other methods omitted for brevity
}

Finally, we’ll need to hook up the Stream to our List. We can use the same technique of a forwarding interface which overrides a method and specifies a more specific return type.

interface EnhancedList<T> extends ForwardingList<T> {
  default EnhancedStream<T> stream() {
    return () -> ForwardingList.super.stream();
  }
}

interface ForwardingList<T> extends List<T> {
  List<T> delegate();
  default int size() {
    return delegate().size();
  }
  // All the other methods omitted for brevity
}

So now we can put it all together. EnhancedList is a single method interface, so it’s compatible with any existing List in our code; we just wrap it in a lambda, and then we can flatMapCollection() and toList(). The following prints “foo” three times.

EnhancedList<String> input = () -> asList("foo", "bar");
List<String> result = input
  .stream()
  .filter(s -> s.startsWith("f"))
  .flatMapCollection(Example::threeOf)
  .toList();
result.forEach(System.out::println);
public static List<String> threeOf(String input) {
  return asList(input, input, input);
}

Here’s the unabbreviated code.
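In the same spirit, here is a compact self-contained sketch. It deviates from the approach above in one way to keep it short: rather than extending java.util.stream.Stream (which would mean forwarding its several dozen methods), this hypothetical EnhancedStream just wraps a delegate Stream and exposes only the operations used in this post.

```java
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import static java.util.Arrays.asList;

public class EnhancedExample {
    // Single abstract method: supply the delegate Stream.
    public interface EnhancedStream<T> {
        Stream<T> delegate();

        default EnhancedStream<T> filter(Predicate<? super T> predicate) {
            return () -> delegate().filter(predicate);
        }

        default <R> EnhancedStream<R> flatMapCollection(
                Function<? super T, ? extends Collection<? extends R>> mapper) {
            return () -> delegate().flatMap(t -> mapper.apply(t).stream());
        }

        default List<T> toList() {
            return delegate().collect(Collectors.toList());
        }
    }

    // Single abstract method: supply the delegate List.
    public interface EnhancedList<T> {
        List<T> delegate();

        default EnhancedStream<T> stream() {
            return () -> delegate().stream();
        }
    }

    public static List<String> threeOf(String input) {
        return asList(input, input, input);
    }

    public static void main(String... args) {
        EnhancedList<String> input = () -> asList("foo", "bar");
        List<String> result = input.stream()
                .filter(s -> s.startsWith("f"))
                .flatMapCollection(EnhancedExample::threeOf)
                .toList();
        System.out.println(result); // prints [foo, foo, foo]
    }
}
```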

Posted by & filed under ContinuousDelivery, XP.

I have become increasingly convinced that there is little difference between monitoring and testing. Often we can run our automated tests against a production system with only a little effort.

We are used to listening to our automated tests for feedback about our software in the form of test-smells. If our tests are complicated, it’s often a strong indicator that our code is poorly designed and thus hard to test. Perhaps code is too tightly coupled, making it hard to test parts in isolation. Perhaps a class has too many responsibilities so we find ourselves wanting to observe and alter private state.

Doesn’t the same apply to our production monitoring? If it’s hard to write checks to ensure our system is behaving as desired in production, it’s easy to blame the monitoring tool, but isn’t it a reflection on the applications we are monitoring? Why isn’t it easy to monitor them?

Just as we test-drive our code, I like to do monitoring check-driven development for our applications. Add checks that the desired business features are working in production even before we implement them, then implement them and deploy to production in order to make the monitoring checks pass. (This does require that the same team building the software is responsible for monitoring it in production.)

As I do this more I notice that in the same way that TDD gives me feedback on code design (helping me write better code), Check-driven-development gives feedback on application design.

Why are our apps too hard to monitor? Perhaps they are too monolithic, so the monitoring tools can’t check behaviour or data buried in the middle. Perhaps they do not publish the information we need to ascertain whether they are working.

Here are 4 smells of production monitoring checks that I think give design feedback, just like their test counterparts. There are lots more.

  • Delving into Private State
  • Complexity, Verbosity, Repetition
  • Non-Determinism
  • Heisentests

Delving into Private State

Testing private methods in a class often indicates there’s another class screaming to be extracted. If something is complex enough to require its own test, but not part of the public interface of the object under test, it likely violates the single responsibility principle and should be pulled out.

Similarly, monitoring checks that peek into the internal state of an app may indicate there’s another app that could be extracted with a clearly separate responsibility that we could monitor independently.

For example, we had an application that ingested some logs, parsed them and generated some sequenced events, which then triggered various state changes.

We had thoroughly tested all of this, but we still occasionally had production issues due to unexpected input and problems with the sequence generation triggered by infrastructure-environment problems.

In response, we added monitoring that spied on the intermediate state of the events flowing through the system, and spied on the database state to keep track of the sequencing.

Monitoring that pokes into the internals of an application like this is similar to tests that use reflection to spy on private state – and similar problems arise. In this case schema changes and refactorings of the application would break the monitoring checks.

In the end we split out some of these responsibilities. We ended up with an entirely stateless app that created events from logs, a separate event sequencer, and the original app consumed the resultant events.

The result is much easier to monitor as the responsibilities are more clear. We can monitor the inputs and outputs of the event generator, both passively and actively by passing through test data in production.

Our monitoring checks are relying on the public interface that is used by the other applications so we are less likely to inadvertently break it. It’s similarly easy to monitor what the sequencer is doing, and we can ingest the events from our production system in a copy to spot problems early.

Complexity, Verbosity, Repetition

I’ve also been guilty of writing long monitoring checks that poll various systems, perform some calculations on the results, and then determine whether there is a problem based on some business logic embedded in the check.

These checks tend to be quite long, and are at risk of copy-paste duplication when someone wants to check something similar. Just as long and duplicated unit tests are a smell, so are monitoring checks. Sometimes we even want to TDD our checks because we realise they are complex.

When our tests get long and repetitive, we sometimes need to invest in improving our test infrastructure to help us write more concise tests in the business domain language. The same applies to monitoring checks – perhaps we need to invest in improving our monitoring tools if the checks are long and repetitive.

Sometimes verbose and unclear tests can be a sign that the implementation they are testing is also unclear and at the wrong abstraction level. If we have modelled concepts poorly in our code then we’ll struggle to write the tests (which are often closer to the requirements). It’s a sign we need to improve our model to simplify how it fulfils the requirements.

For example, if we had a payment amount in our code which we were representing as an integer or a currency value, but could only be positive in our context, we might end up with tests in several places that check that the number is positive and check what happens if we ended up with a negative number due to misuse.

public void deposit(int paymentAmount) { ... }
public void withdraw(int paymentAmount) { ... }
public void deposit_should_not_accept_negative_paymentAmount() { ... }
public void withdraw_should_not_accept_negative_paymentAmount() { ... }

We might spot this repetition and complexity in our tests and realise we need to improve our design, we could introduce a PaymentAmount concept that could only be instantiated with positive numbers and pass that around instead, removing the need to test the same thing everywhere.

class PaymentAmount { ... }
public void should_not_be_able_to_create_negative_paymentamount() { ... }
public void deposit(PaymentAmount amount) { ... }
public void withdraw(PaymentAmount amount) { ... }

In the same way monitoring checks that are repetitive can often be replaced with enforcement of invariants within applications themselves. Have the applications notify the monitoring system if an assumption or constraint is violated, rather than having the monitoring systems check themselves. This encapsulates the business logic in the applications and keeps checks simple.


# Instead of a check like this in the monitoring system:
COUNT=$(runSql "select count(*) from payment_amounts where value < 0")
# Assert value of count


class PaymentAmount {
  public PaymentAmount(int value) {
    monitoredAssert(value > 0, "Negative payment amount observed");
  }
}

static void monitoredAssert(boolean condition, String message) {
  if (!condition) notifyMonitoringSystemOfFailure(message);
}

Now your monitoring system just alerts you to the published failures.

Too often I’ve written checks for things that could be database constraints or assertions in an application.

Non-Determinism

Non-deterministic tests are dangerous. They rapidly erode our confidence in a test suite. If we think a test failure might be a false alarm we may ignore errors until some time after we introduce them. If it takes multiple runs of a test suite to confirm everything is OK then our feedback loops get longer and we’re likely to run our tests less frequently.

Non-deterministic monitoring checks are just as dangerous, but they’re so common that most monitoring tools have built in support for only triggering an incident/alert if the check fails n times in a row. When people get paged erroneously on a regular basis it increases the risk they’ll ignore a real emergency.

Non-deterministic tests or monitoring checks are often a sign that the system under test is also unreliable. We had a problem with a sporadically failing check that seemed to always fix itself. It turned out to be due to our application not handling errors from the S3 API (which happen quite frequently). We fixed our app to re-try under this scenario and the check became reliable.
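The retry itself can be as simple as a small helper. This is an illustrative sketch (the name withRetries and the surrounding code are hypothetical, not from our actual system):

```java
import java.util.concurrent.Callable;

// Illustrative retry helper: attempt a flaky call a few times
// (e.g. an S3 request that occasionally fails) before
// propagating the final failure.
public class Retry {
    public static <T> T withRetries(int attempts, Callable<T> action) throws Exception {
        Exception last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // transient failure; try again
            }
        }
        throw last;
    }
}
```

A call like withRetries(3, () -> fetchFromS3()) would then absorb transient failures, making both the application and the monitoring check deterministic.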

Heisentests

A test or check that influences the behaviour it is trying to measure is another smell. We’ve had integration tests that ran against the same system as our performance tests and influenced the results by priming caches. We’ve also had similar issues with our production monitoring.

In one particularly fun example, we had reports that one of our webapps was slow during the night. We were puzzled and decided to start by adding monitoring of the response times of the pages in question at regular intervals so we could confirm the user’s reports.

The data suggested there was no problem, and the users actually reported the problem fixed. It turned out that by loading these pages regularly we were keeping data in caches that normally expired during low usage periods. By monitoring the application we had changed the behaviour.

Both of these scenarios were resolved by having the application publish performance metrics that we could check in our tests. We could then query from a central metrics database for production monitoring. This way we were checking performance against our acceptance criteria and real-world user behaviour, and not influencing that behaviour.


Writing your production monitoring before you implement features can help you towards a better design. Look out for smelly monitoring: what is it telling you about your application design?

What other smells have you noticed from monitoring your systems in production?

Posted by & filed under Java.

A common frustration with Java is the inability to overload methods when the method signatures differ only by type parameters.

Here’s an example, we’d like to overload a method to take either a List of Strings or a List of Integers. This will not compile, because both methods have the same erasure.

class ErasureExample {
    public void doSomething(List<String> strings) {
        System.out.println("Doing something with a List of Strings");
    }

    public void doSomething(List<Integer> ints) {
        System.out.println("Doing something with a List of Integers");
    }
}

If you delete everything in the angle brackets the two methods will be identical, which is prohibited by the spec

public void doSomething(List<> strings) 
public void doSomething(List<> strings)

As with most Java things – if it’s not working, you’re probably not using enough lambdas. We can make it work with just one extra line of code per method.

class ErasureExample {
    public interface ListStringRef extends Supplier<List<String>> {}
    public interface ListIntegerRef extends Supplier<List<Integer>> {}

    public void doSomething(ListStringRef strings) {
        System.out.println("Doing something with a List of Strings");
    }

    public void doSomething(ListIntegerRef ints) {
        System.out.println("Doing something with a List of Integers");
    }
}

Now we can call the above as simply as the following, which will print “Doing something with a List of Strings” followed by “Doing something with a List of Integers”.

public class Example {
    public static void main(String... args) {
        ErasureExample ee = new ErasureExample();
        ee.doSomething(() -> asList("aa","b"));
        ee.doSomething(() -> asList(1,2));
    }
}

Using the wrapped lists inside the method is straightforward. Here we print out the length of each string and print out each integer, doubled. It will cause the above main method to print “2124”.

class ErasureExample {
    public interface ListStringRef extends Supplier<List<String>> {}
    public interface ListIntegerRef extends Supplier<List<Integer>> {}

    public void doSomething(ListStringRef strings) {
        strings.get().forEach(str -> System.out.print(str.length()));
    }

    public void doSomething(ListIntegerRef ints) {
        ints.get().forEach(i -> System.out.print(i * 2));
    }
}

This works because the methods now have different erasure; in fact the method signatures have no generics in them at all. The only additional requirement is prefixing each argument with “() ->” at the callsite, creating a lambda that is equivalent to a Supplier of whatever type your argument is.
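For completeness, the snippets above consolidate into a single runnable class (named ErasureDemo here purely for illustration, with static methods so it is self-contained):

```java
import java.util.List;
import java.util.function.Supplier;
import static java.util.Arrays.asList;

public class ErasureDemo {
    // Each wrapper interface gives the overloads distinct erasure.
    public interface ListStringRef extends Supplier<List<String>> {}
    public interface ListIntegerRef extends Supplier<List<Integer>> {}

    public static void doSomething(ListStringRef strings) {
        strings.get().forEach(str -> System.out.print(str.length()));
    }

    public static void doSomething(ListIntegerRef ints) {
        ints.get().forEach(i -> System.out.print(i * 2));
    }

    public static void main(String... args) {
        doSomething(() -> asList("aa", "b")); // prints 21
        doSomething(() -> asList(1, 2));      // prints 24
    }
}
```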

Posted by & filed under Java.

What use is a method that just returns its input? Surprisingly, quite a lot. One use is as a way to convert between types.

There’s a well known trick, often used to work around Java’s terrible array literals, that you may have come across. If you have a method that takes an array as an argument:

public static void foo(String[] anArray) { }

Invoking foo requires the ugly

foo(new String[]{"hello", "world"});

For some reason Java requires the redundant “new String[]” here, even though it can be trivially inferred. Fortunately we can work around this with the following method which at first glance might seem pointless.

public static <T> T[] array(T... input) {
    return input;
}

It just returns the input. However, it is accepting an array of type T in the form of varargs, and returning that array. It becomes useful because now Java will infer the types and create the array cleanly. We can now call foo like so.

foo(array("hello", "world"));

That is a neat trick, but it really becomes useful in Java 8 thanks to structural typing of lambdas/method references and implicit conversions between types. Here’s another example of a method that’s far more useful than it appears at first glance. It just accepts a function and returns the same function.

public static <T,R> Function<T,R> f(Function<T,R> f) {
    return f;
}

The reason it’s useful is we can pass it any structurally equivalent method reference and have it converted to a java.util.function.Function, which provides us some useful utility methods for function composition.

Here’s an example. Let’s say we have a list of Libraries (Collection of Collection of Books)

interface Library {
    List<Book> books();
    static Library library(Book... books) {
        return () -> asList(books);
    }
}

interface Book {
    String name();
    static Book book(String name) { return () -> name; }
}

List<Library> libraries = asList(
    library(book("The Hobbit"), book("LoTR")),
    library(book("Build Quality In"), book("Lean Enterprise"))
);

We can now print out the book titles thusly

libraries.stream()
    .flatMap(library -> library.books().stream()) // Stream of libraries to stream of books.
    .map(Book::name) // Stream of names
    .forEach(System.out::println);

But that flatMap call is upsetting; everything else is using a method reference, not a lambda expression. I’d really like to write the following, but it won’t compile because flatMap requires a function that returns a Stream, rather than a function that returns a List.

libraries.stream()
    .flatMap(Library::books) // Compile Error, wrong return type.
    .map(Book::name)
    .forEach(System.out::println);

Here’s where our method that returns its input comes in again. This compiles fine.

libraries.stream()
    .flatMap(f(Library::books).andThen(Collection::stream))
    .map(Book::name)
    .forEach(System.out::println);

This works because Library::books is equivalent to a Function<Library, List<Book>>, so passing it to the f() method implicitly converts it to that type. java.util.function.Function provides an andThen method which returns a new function which composes the two functions.

Now in this trivial example it’s actually longer to write this than the equivalent lambda, but it can be useful when combining more complex examples.
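For instance (a hypothetical example, not from the post), we can compose two existing method references without writing any lambda bodies:

```java
import java.util.function.Function;

public class Compose {
    // Identity method that converts a method reference to a Function,
    // giving us access to andThen/compose.
    public static <T, R> Function<T, R> f(Function<T, R> f) {
        return f;
    }

    public static void main(String... args) {
        Function<String, Integer> trimmedLength =
                f(String::trim).andThen(String::length);
        System.out.println(trimmedLength.apply("  hello  ")); // prints 5
    }
}
```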

We can do the same thing with other functional interfaces. For example to allow Predicate composition or negation.

Here we have a handy isChild() method implemented for us on Person, but we really want the inverse – isAdult() check to pass to the serveAlcohol method. This sort of thing comes up all the time.

interface Person {
    boolean isChild();
    static Person child() { return () -> true; }
    static Person adult() { return () -> false; }
}

public static void serveAlcohol(Person person, Predicate<Person> isAdult) {
    if (isAdult.test(person)) System.out.println("Serving alcohol");
}

If we want to reuse Person::isChild we can do the same trick. The p() method converts the method reference to a Predicate for us, and we can then easily negate it.

serveAlcohol(adult(), p(Person::isChild).negate());

public static <T> Predicate<T> p(Predicate<T> p) {
    return p;
}

Have you got any other good examples?

Posted by & filed under XP.

At work, we’ve always pair-programmed all our production code, so we’re already pretty bought into it being a good idea to have multiple people working on a single problem. I previously wrote about some of the reasons for pairing.

Mob Programming

Recently, having been inspired by a talk by Woody Zuill, we decided to give mob programming a go, and our experiences so far have been very positive.

Mob programming is having the whole team working on the same problem, at the same time, using a single workstation.

We’re not using it all the time right now. We’ve started doing Mob-Fridays as a way of regularly working together as a group instead of pairing. We’re still pretty new to it – only having done it for a few weeks, but I thought I’d post some of my observations thus far.


Here’s our setup. We all (4-6 of us) sit round in a semicircle, as we would when having a group discussion. We have a big 124cm HD TV for everyone to see the code on, and a 76cm monitor for the person at the keyboard, positioned perpendicularly to the TV. This allows the driver to see the rest of the team. We also have a large whiteboard behind the team (just out of the photo) which we can scribble design ideas on.

We have been using strict 5 minute rotations for driving. Every 5 minutes the person with the keyboard relinquishes it and another team member takes over. This gives us a rhythm for continuous deployment (we try to deploy to production after 5-10 rotations, i.e. at least once an hour). 5 minute rotations keep it very fast paced, and keep everyone engaged.

We’ve also tried including team members with specialities in our mobbing sessions, including having them drive. We’ve had mob sessions with our product manager and UX specialist. I think it could be interesting to include our internal team customers in the future.



You may be thinking that this can’t possibly be efficient. Surely 5 or 6 people working individually can get more done than working together, constrained by the speed that one person can type and they can communicate. I think you might well be right, but the amount of stuff a team can get done (throughput) is not necessarily what you want to optimise. Often the speed at which we can get from where we are now to achieving a business goal is more important (latency). Anything we can do to get there faster is a good thing, even if it’s less efficient in terms of throughput.

Regardless of the efficiency of cranking out code, mobbing provides several efficiencies.

Mobbing eliminates a whole class of meetings – removing synchronisation points that slow down developers working independently. There’s no need for detailed design discussions in advance of starting on implementation, because everyone can contribute to the design while working on it. There’s also no need for traditional standup meetings to catch up on what is going on. When working together as a team everyone knows what everyone has been doing and is going to do.

There is also less time loss due to interruptions. People seem more reticent to interrupt a group session than an individual or pair. We pause to answer any questions from outside the team and have a break after each person in the team has had a driving session. It’s also less disruptive when someone’s phone rings or someone needs a toilet break; they can just nip out of the mob and let the mob continue. When pairing, work often stops when these small interruptions occur, and a lot of the context is lost.

A combination of the 5 minute cadence and having more people involved also seems to help avoid wasting time doing things that we don’t really need, which helps us move faster.

We’re also able to more rapidly adapt to what we discover in the course of implementation. Our preconceived ideas of how we might build a feature don’t always survive implementation. We often learn along the way that our original plans won’t work. When pairing we often convened team huddles to discuss these issues before continuing. When working as a mob we just press through them unfazed, without any delay waiting for input from others.


Mobbing seems great for making significant architectural changes to the system. Things that you need everyone on the team to be bought into, and ideally want as many pairs of eyes on to avoid problems. For instance, we have been mobbing on a new design for a system that processes money. It’s a core technology for us, that it’s important for the whole team to understand, and since it deals with processing money mistakes could be costly.

Mobbing also completely eliminates one of the problems I’ve observed with pair programming – that purity of design can be lost when you rotate out one of the developers from the pair and swap someone in. When mobbing everyone on the team gets to see designs through to the end.

Another reason for mobbing is that it’s great fun. Doing something together as a team makes us a better team. Mobbing is a teambuilding activity that also achieves the real work we would otherwise be doing individually.


Do try mob programming yourself. It’s great fun, should help you be a better team, and an effective way to build software.