By benji. Posted under ContinuousDelivery, Java, XP.

Pain is something we generally try to avoid; pain is unpleasant, but it also serves an important purpose.

Acute pain can be feedback that we need to avoid doing something harmful to our body, or protect something while it heals. Pain helps us remember the cause of injuries and adapt our behaviour to avoid a repeat.

As a cyclist I occasionally get joint pain that indicates I need to adjust my riding position. If I just took painkillers and ignored the pain I’d permanently injure myself over time.

I’m currently recovering from a fracture after an abrupt encounter with a pothole. The pain is helping me rest and allow time for the healing process. The memory of the pain will also encourage me to consider the risk of potholes when riding with poor visibility in the future.

We have similar feedback mechanisms when planning, building, and running software; we often find things painful.

Alas, rather than learn from pain and let it guide us, we all too often stock up on painkillers in the form of tooling or practices that let us press on obstinately doing the same thing that caused the pain in the first place.

Here are some examples…

Painful Tests

Automated tests can be a fantastic source of feedback that helps us improve our software and learn to write better software in the future. Tests that are hard to write are a sign something could be better.

The tests only help us if we listen to the pain we feel when tests are hard to write and read. If we reach for increasingly sophisticated tooling to allow us to continue doing the painful things, then we won’t realise the benefits. Or worse, if we avoid unit testing in favour of higher level tests, we’ll miss out on this valuable feedback altogether.

Here’s an example of a test that was painful to write and read, testing the sending of a booking confirmation email.

@Test // Full code in link above
public void sendsBookingConfirmationEmail() {
    var emailSender = new EmailSender() {
        String message;
        String to;
 
        public void sendEmail(String to, String message) {
            this.to = to;
            this.message = message;
        }
 
        public void sendHtmlEmail(String to, String message) {
 
        }
 
        public int queueSize() {
            return 0;
        }
    };
 
    var support = new Support() {
        @Override
        public AccountManager accountManagerFor(Customer customer) {
            return new AccountManager("Bob Smith");
        }
 
        @Override
        public void calculateSupportRota() {
 
        }
 
        @Override
        public AccountManager superviserFor(AccountManager accountManager) {
            return null;
        }
    };
 
 
    BookingNotifier bookingNotifier = new BookingNotifier(emailSender, support);
 
    Customer customer = new Customer("jane@example.com", "Jane", "Jones");
    bookingNotifier.sendBookingConfirmation(customer, new Service("Best Service Ever"));
 
    assertEquals("Should send email to customer", customer.email, emailSender.to);
    assertEquals(
        "Should compose correct email",
        emailSender.message,
        "Dear Jane Jones, you have successfully booked Best Service Ever on " + LocalDate.now() + ". Your account manager is Bob Smith"
    );
 
}
  • The test method is very long at around 50 lines of code
  • We have boilerplate setting up stubbing for things irrelevant to the test such as queue sizes and supervisors
  • We’ve got flakiness from assuming the current date will be the same in two places—the test might fail if run around midnight or when the clocks change
  • There are multiple assertions covering multiple responsibilities
  • We’ve had to work hard to capture side effects

Feeling this pain, one response would be to reach for painkillers in the form of more powerful mocking tools. If we do so we end up with something like this. Note that we haven’t improved the implementation at all (it’s unchanged), but now we’re feeling a lot less pain from the test.

@Test // Full code in link above
public void sendsBookingConfirmationEmail() throws Exception {
    var emailSender = mock(EmailSender.class);
    var support = mock(Support.class);
 
    BookingNotifier bookingNotifier = new BookingNotifier(emailSender, support);
 
    LocalDate expectedDate = LocalDate.parse("2000-01-01");
    Customer customer = new Customer("jane@example.com", "Jane", "Jones");
    when(support.accountManagerFor(customer)).thenReturn(new AccountManager("Bob Smith"));
    mockStatic(LocalDate.class, args -> expectedDate);
 
    bookingNotifier.sendBookingConfirmation(customer, new Service("Best Service Ever"));
 
    verify(emailSender).sendEmail(
        customer.email,
        "Dear Jane Jones, you have successfully booked Best Service Ever on 2000-01-01. Your account manager is Bob Smith"
    );
 
}
  • The test method is a quarter of the length—but the implementation is just as complex
  • The flakiness is gone as the date is mocked to a constant value—but the implementation still has a hard dependency on the system time.
  • We’re no longer forced to stub irrelevant detail—but the implementation still has dependencies on collaborators with too many responsibilities.
  • We only have a single assertion—but there are still as many responsibilities in the implementation
  • It’s easier to capture the side effects—but they’re still there

A better response would be to reflect on the underlying causes of the pain. Here’s one direction we could go that removes much of the pain and doesn’t need complex frameworks:

@Test // Full code in link above
public void composesBookingConfirmationEmail() {
 
    AccountManagers dummyAllocation = customer -> new AccountManager("Bob Smith");
    Clock stoppedClock = () -> LocalDate.parse("2000-01-01");
 
    BookingNotificationTemplate bookingNotifier = new BookingNotificationTemplate(dummyAllocation, stoppedClock);
 
    Customer customer = new Customer("jane@example.com", "Jane", "Jones");
 
    assertEquals(
        "Should compose correct email",
        bookingNotifier.composeBookingEmail(customer, new Service("Best Service Ever")),
        "Dear Jane Jones, you have successfully booked Best Service Ever on 2000-01-01. Your account manager is Bob Smith"
    );
 
}
  • The test method is shorter, and the implementation does less
  • The flakiness is gone as the implementation no longer has a hard dependency on the system time
  • We’re no longer forced to stub irrelevant detail because the implementation only depends on what it needs
  • We only have a single assertion, because we’ve reduced the scope of the implementation to merely composing the email. We’ve factored out the responsibility of sending the email.
  • We’ve factored out the side effects so we can test them separately
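
To complete the picture, here’s a rough sketch (my own, not from the post’s linked code) of how the factored-out sending side effect might now be tested on its own. It assumes a hypothetical narrow EmailSending interface with a single send(to, message) method, and a thin BookingNotifier that glues the template and the sender together:

@Test // a hypothetical sketch, not from the post's linked code
public void sendsComposedEmailToCustomer() {
    // Assumed narrow interface: send(to, message) is all the notifier needs,
    // so a lambda can stand in for the real sender and record the call
    var sentTo = new String[1];
    var sentMessage = new String[1];
    EmailSending recordingSender = (to, message) -> {
        sentTo[0] = to;
        sentMessage[0] = message;
    };

    var template = new BookingNotificationTemplate(
        customer -> new AccountManager("Bob Smith"),
        () -> LocalDate.parse("2000-01-01"));
    var notifier = new BookingNotifier(recordingSender, template);

    Customer customer = new Customer("jane@example.com", "Jane", "Jones");
    notifier.sendBookingConfirmation(customer, new Service("Best Service Ever"));

    assertEquals("Should send email to customer", customer.email, sentTo[0]);
    assertEquals(
        "Should send the composed email",
        template.composeBookingEmail(customer, new Service("Best Service Ever")),
        sentMessage[0]
    );
}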

My point is not that the third example is perfect (it’s quickly thrown together), nor am I arguing that mocking frameworks are bad. My point is that by learning from the pain (rather than rushing to hide it with tooling before we’ve learnt anything) we can end up with something better.

The pain we feel when writing tests can also be a prompt to reflect on our development process—do we spend enough time refactoring when writing the tests, or do we move onto the next thing as soon as they go green? Are we working in excessively large steps that let us get into messes like the above that are painful to clean up?

n.b. there are lots of better examples of learning from test feedback in chapter 20 of the GOOS book.

Painful Dependency Injection

Dependency injection seems to have become synonymous with frameworks like Spring, Guice, and Dagger, as opposed to the relatively simple idea of “passing stuff in”. Often people reach for dependency injection frameworks out of habit, but sometimes they’re used as a way of avoiding design feedback.

If you start building a trivial application from scratch you’ll likely not feel the need for a dependency injection framework at the outset. You can wire up your few dependencies yourself, passing them to constructors or function calls.

As complexity increases this can become unwieldy, tedious, even painful. It’s easy to reach for a dependency injection framework to magically wire all your dependencies together to remove that boilerplate.

However, doing so prematurely can deprive you of the opportunity to listen to the design feedback that this pain is communicating.

Could you reduce the wiring pain through increased modularity—adding, removing, or finding better abstractions?

Does the wiring code have more detail than you’d include in a document explaining how it works? How can you align the code with how you’d naturally explain it? Is the wiring code understandable to a domain expert? How can you make it more so?

Here’s a little example of some manual wiring of dependencies. While short, it’s quite painful:

// Full code in link above
public static void main(String... args) {
    var credentialStore = new CredentialStore();
 
    var eventStore = new InfluxDbEventStore(credentialStore);
 
    var probeStatusReporter = new ProbeStatusReporter(eventStore);
 
    var probeExecutor = new ProbeExecutor(new ScheduledThreadPoolExecutor(2), probeStatusReporter, credentialStore, new ProbeConfiguration(new File("/etc/probes.conf")));
 
    var alertingRules = new AlertingRules(new OnCallRota(new PostgresRotaPersistence(), LocalDateTime::now), eventStore, probeStatusReporter);
 
    var pager = new Pager(new SMSGateway(), new EmailGateway(), alertingRules, probeStatusReporter);
 
    var dashboard = new Dashboard(alertingRules, probeExecutor, new HttpsServer());
}
  • There are a lot of components to wire together
  • There’s a mixture of domain concepts and details like database choices
  • The ordering is difficult to get right to resolve dependencies, and it obscures intent

At this point we could reach for a DI framework and @Autowired or @Inject these dependencies, and the wiring pain would disappear almost completely.
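
For illustration, here’s a rough sketch (assuming Spring, and not code from the post) of what that painkiller looks like: annotate each class, let the container resolve the graph, and the manual wiring all but vanishes, while the dependency structure underneath stays exactly as tangled as before.

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

// A hypothetical Spring-flavoured version of the Pager from the example above;
// every other component would get the same treatment.
@Component
class Pager {
    private final SMSGateway smsGateway;
    private final EmailGateway emailGateway;
    private final AlertingRules alertingRules;
    private final ProbeStatusReporter probeStatus;

    @Autowired // the container finds and injects the collaborators for us
    Pager(SMSGateway smsGateway, EmailGateway emailGateway,
          AlertingRules alertingRules, ProbeStatusReporter probeStatus) {
        this.smsGateway = smsGateway;
        this.emailGateway = emailGateway;
        this.alertingRules = alertingRules;
        this.probeStatus = probeStatus;
    }
}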

However, if instead we listen to the pain, we can spot some opportunities to improve the design. Here’s an example of one direction we could go:

// Click to Expand, Full code in link above
public static void main(String... args) {
 
    var probeStatus = probeExecutor();
    var probeVisibility = visibilityOf(probeStatus);
    var dashboard = dashboardFor(probeVisibility);
    var pager = pagerFor(probeVisibility);
 
}
 
private static ProbeVisibility visibilityOf(ProbeStatusReporter probeStatus) {
    var credentialStore = new CredentialStore();
    var eventStore = new InfluxDbEventStore(credentialStore);
    AlertingRules alertingRules = new AlertingRules(new OnCallRota(new PostgresRotaPersistence(), LocalDateTime::now), eventStore, probeStatus);
    return new ProbeVisibility(alertingRules, probeStatus);
}
 
static class ProbeVisibility {
    AlertingRules alertingRules;
    ProbeStatusReporter probeStatus;
 
    public ProbeVisibility(AlertingRules alertingRules, ProbeStatusReporter probeStatus) {
        this.alertingRules = alertingRules;
        this.probeStatus = probeStatus;
    }
}
 
private static Pager pagerFor(ProbeVisibility probeVisibility) {
    return new Pager(new SMSGateway(), new EmailGateway(), probeVisibility.alertingRules, probeVisibility.probeStatus);
}
 
private static Dashboard dashboardFor(ProbeVisibility probeVisibility) {
    return new Dashboard(probeVisibility.alertingRules, probeVisibility.probeStatus, new HttpsServer());
}
 
private static ProbeStatusReporter probeExecutor() {
    var credentialStore = new CredentialStore();
    var eventStore = new InfluxDbEventStore(credentialStore);
 
    var probeStatusReporter = new ProbeStatusReporter(eventStore);
    var executor = new ProbeExecutor(new ScheduledThreadPoolExecutor(2), probeStatusReporter, credentialStore, new ProbeConfiguration(new File("/etc/probes.conf")));
    executor.start();
    return probeStatusReporter;
}
  • We’ve spotted and fixed the dashboard’s direct dependency on the probe executor; it now uses the status reporter, as the pager does.
  • The dashboard and pager shared a lot of wiring as they had a common purpose in providing visibility on the status of probes. There was a missing concept here, adding it has simplified the wiring considerably.
  • We’ve separated the wiring of the probe executor from the rest.

After applying these refactorings the top level wiring reads more like a description of our intent.

Clearly this is just a toy example, and the refactoring is far from complete, but I hope it illustrates the point: dependency injection frameworks are useful, but be aware of the valuable design feedback they may be hiding from you.

Painful Integration

It’s common to experience “merge pain” when trying to integrate long lived branches of code and big changesets to create a releasable build. Sometimes the large changesets don’t even pass tests, sometimes your changes conflict with changes others on the team have made.

One response to this pain is to reach for increasingly sophisticated build infrastructure to hide some of the pain. Infrastructure that continually runs tests against branched code, or continually checks merges between branches can alert you to problems early. Sadly, by making the pain more bearable, we risk depriving ourselves of valuable feedback.

Ironically, continuous-integration tooling often seems to be used to reduce the pain felt when working on large, long-lived changesets, a practice I like to call “continuous isolation”.

You can’t automate away the human feedback available when integrating your changes with the rest of the team—without continuous integration you miss out on others noticing that they’re working in the same area, or spotting problems with your approach early.

You also can’t replace the production feedback possible from integrating small changes all the way to production (or a canary deployment) frequently.

Sophisticated build infrastructure can give you the illusion of safety by hiding the pain from your un-integrated code. By continuing to work in isolation you risk more substantial pain later when you integrate and deploy your larger, riskier changeset. You’ll have a higher risk of breaking production, a higher risk of merge conflicts, as well as a higher risk of feedback from colleagues being late, and thus requiring substantial re-work.

Painful Alerting

Over-alerting is a serious problem; paging people spuriously for non-existent problems or issues that do not require immediate attention undermines confidence, just like flaky test suites.

It’s easy to respond to over-alerting by paying less and less attention to production alerts until they are all but ignored, learning to ignore the pain rather than listening to its feedback.

Another popular reaction is to desire increasingly sophisticated tooling to handle the flakiness—from flap detection algorithms, to machine learning, to people doing triage. These often work for a while—tools can assuage some of the pain, but they don’t address the underlying causes.

The situation won’t significantly improve without a feedback mechanism in place, where you improve both your production infrastructure and approach to alerting based on reality.

The only effective strategy for reducing alerting noise that I’ve seen is: every alert results in somebody taking action to remediate it and stop it happening again—even if that action is to delete the offending alerting rule or amend it. Analyse the factors that resulted in the alert firing, and make a change to improve the reliability of the system.

Yes, this sometimes does mean more sophisticated tooling when it’s not possible to prevent the alert firing in similar spurious circumstances with the tooling available.

However it also means considering the alerts themselves. Did the alert go off because there was an impact to users, the business, or a threat to our error budget that we consider unacceptable? If not, how can we make it more reliable or relevant?

Are we alerting on symptoms and causes rather than things that people actually care about?
Who cares about a server dying if no users are affected? Who cares about a traffic spike if our systems handle it with ease?

We can also consider the reliability of the production system itself. Was the alert legitimate? Maybe our production system isn’t reliable enough to run (without constant human supervision) at the level of service we desire? If improving the sophistication of our monitoring is challenging, maybe we can make the system being monitored simpler instead?

Getting alerted or paged is painful, particularly if it’s in the middle of the night. It’ll only get less painful long-term if you address the factors causing the pain rather than trying hard to ignore it.

Painful Deployments

If you’ve been developing software for a while you can probably regale us with tales of breaking production. These anecdotes are usually entertaining, and people enjoy telling them once enough time has passed that it’s not painful to re-live the situation. It’s fantastic to learn from other people’s painful experiences without having to live through them ourselves.

It’s often painful when you personally make a change and it results in a production problem, at least at the time—not something you want to repeat.

Making a change to a production system is a risky activity. It’s easy to associate the pain felt when something goes wrong with the activity of deploying to production, and to seek to avoid the risk by deploying less frequently.

It’s also common to indulge in risk-management theatre: adding rules, processes, signoff and other bureaucracy—either because we mistakenly believe it reduces the risk, or because it helps us look better to stakeholders or customers. If there’s someone else to blame when things go wrong, the pain feels less acute.

Unfortunately, deploying less frequently results in bigger changes that we understand less well; inadvertently increasing risk in the long run.

Risk-management theatre can even threaten the ability of the organisation to respond quickly to the kind of unavoidable incidents it seeks to protect against.

Yes, most production issues are caused by an intentional change made to the system, but not all are. Production issues get caused by leap second bugs, changes in user behaviour, spikes in traffic, hardware failures and more. Being able to rapidly respond to these issues and make changes to production systems at short notice reduces the impact of such incidents.

Responding to the pain of deployments that break production by changing production less often is pain avoidance rather than addressing the cause.

Deploying to production is like bike maintenance. If you do it infrequently it’s a difficult job each time and you’re liable to break something. Components seize together, the procedures are unfamiliar, and if you don’t test-ride it when you’re done then it’s unlikely to work when you want to ride. If this pain leads you to postpone maintenance, then you increase the risk of an accident from a worn chain or ineffective brakes.

A better response with both bikes and production systems is to keep them in good working order through regular, small, safe changes.

With production software changes we should think about how we can make deployment a safe and boring activity—how can we reduce the risk of deploying changes to production, and how can we reduce the impact of deploying bad changes to production?

Could the production failure have been prevented through better tests?

Would the problem have been less severe if our production monitoring had caught it sooner?

Might we have spotted the problem ourselves if we had a culture of testing in production and were actually checking that our stuff worked once in production?

Perhaps canary deploys would reduce the risk of a business-impacting breakage?

Would blue-green deployments reduce the risk by enabling swift recovery?

Can we improve our architecture to reduce the risk of data damage from bad deployments?

There are many, many ways to reduce the risk of deployments; we can channel the pain of bad deployments into improvements to our working practices, tooling, and architecture.
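
As one concrete illustration, here’s a minimal sketch (my own, not from the post) of the kind of thing a canary-style rollout boils down to: send only a small share of requests down the new code path, watch the monitoring, and ramp up the percentage as confidence grows.

import java.util.concurrent.ThreadLocalRandom;

// A minimal canary-style toggle: only a small share of requests take the new
// code path, so a bad change affects few users and can be backed out quickly.
public class CanaryToggle {
    private final int percentage; // 0-100: share of requests routed to the new path

    public CanaryToggle(int percentage) {
        this.percentage = percentage;
    }

    public boolean useNewCodePath() {
        return ThreadLocalRandom.current().nextInt(100) < percentage;
    }
}

// Hypothetical usage: start at 1%, watch production monitoring, then ramp up.
// var toggle = new CanaryToggle(1);
// var confirmation = toggle.useNewCodePath()
//         ? newBookingFlow(request)   // hypothetical new implementation
//         : oldBookingFlow(request);  // existing behaviour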

Painful Change

After spending days or weeks building a new product or feature, it’s quite painful to finally demo it to the person who asked for it and discover that it’s no longer what they want. It’s also painful to release a change into production and discover it doesn’t achieve the desired result, maybe no-one uses it, or it’s not resulting in an uptick to your KPI.

It’s tempting to react to this by trying to nail down requirements first before we build. If we agree exactly what we’re building up front and nail down the acceptance criteria then we’ll eliminate the pain, won’t we?

Doing so may reduce our own personal pain—we can feel satisfied that we’ve consistently delivered what was asked of us. Unfortunately, reducing our own pain has not reduced the damage to our organisation. We’re still wasting time and money by building valueless things. Moreover, we’re liable to waste even more of our time now that we’re not feeling the pain.

Again, we need to listen to what the pain’s telling us; what are the underlying factors that are leading to us building the wrong things?

Fundamentally, we’re never going to have perfect knowledge about what to build, unless we’re building low value things that have been built many times before. So instead let’s try to create an environment where it’s safe to be wrong in small ways. Let’s listen to the feedback from small pain signals that encourage us to adapt, and act on it, rather than building up a big risky bet that could result in a serious injury to the organisation if we’re wrong.

If we’re frequently finding we’re building the wrong things, maybe there are things we can change about how we work, to see if it reduces the pain.

Do we need to understand the domain better? We could spend time with domain experts and explore the domain using a cheaper mechanism than software development, such as EventStorming.

Perhaps we’re not having frequent and quality discussions with our stakeholders? Sometimes minutes of conversation can save weeks of coding.

Are we not close enough to our customers or users? Could we increase empathy using personas, or attending sales meetings, or getting out of the building and doing some user testing?

Perhaps having a mechanism to experiment and test our hypotheses in production cheaply would help?

Are there lighter-weight ways we can learn that don’t involve building software? Could we try selling the capabilities optimistically, or get feedback from paper prototypes, or could we hack together a UI facade and put it in front of some real users?

We can listen to the pain we feel when we’ve built something that doesn’t deliver value, and feed it into improving not just the product, but also our working practices and habits. Let’s make it more likely that we’ll build things of value in the future.

Acute Pain

Many people do not have the privilege of living pain-free most of the time; sadly, we have imperfect bodies and many live with chronic pain. Acute pain, however, can be a useful feedback mechanism.

When we find experiences and day-to-day work painful, it’s often helpful to think about what’s causing that pain, and what we can do to eliminate the underlying causes, before we reach for tools and processes to work around or hide the pain.

Listening to small amounts of acute pain, looking for the cause and taking action sets up feedback loops that help us improve over time; ignoring the pain leads to escalating risks that build until something far more painful happens.

What examples do you have of people treating the pain rather than the underlying causes?
