This week I attended #pipelineconf, a new one-day continuous delivery conference in London.
There were great discussions in the open space sessions, and ad hoc in the hallways. Here are a few points that came up in discussion that I thought were particularly interesting.
De-segregate different categories of tests
There are frameworks for writing automated acceptance tests, tools for automated security testing, frameworks and tools for performance testing, and tools for monitoring production systems.
We don’t really want to test these things in isolation. If there is a feature requested by a customer that we’re implementing, it probably has some non-functional requirements that also apply to it. We really want to test those along with the feature acceptance test.
For example, for each acceptance test we could define speed and capacity requirements. We could also run the HTTP traffic from each test through tools such as ZAP, which try common attack vectors.
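A minimal sketch of what combining these could look like. The `checkout` feature, the basket contents, and the 200ms budget are all invented for illustration:

```python
import time

# Stub standing in for the feature under test, so the sketch is
# self-contained. In a real suite this would exercise the application,
# and its HTTP traffic could additionally be proxied through ZAP.
def checkout(basket):
    return {"status": "confirmed", "items": len(basket)}

def test_checkout_confirms_order_within_200ms():
    start = time.perf_counter()
    result = checkout(["book", "pen"])
    elapsed = time.perf_counter() - start
    assert result["status"] == "confirmed"  # functional requirement
    assert elapsed < 0.2                    # non-functional: speed

test_checkout_confirms_order_within_200ms()
print("passed")
```

The point is that the feature test and its non-functional requirements live in one place, so neither can silently drift out of date.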
Testing and monitoring shouldn’t be so distinct
We often think separately about what things we need to monitor in our production environment and what things we want to test as part of our delivery pipeline before releasing to production.
This often leads us to greatly under-monitor things in production, which makes us over-reliant on the checks in our pipeline to prevent broken things reaching production. We also often completely fail to spot behaviour degradation in production.
Monitoring tools for production systems often focus on servers/nodes first, and services second.
We’d really like to just run our acceptance tests for both functional and non-functional requirements in production against our production systems in the same way that we do as part of our deployment.
This isn’t even particularly hard. An automated test from your application test suite can probably succeed, fail, or generate an unknown failure. These are exactly the states that tools like Nagios expect. You can simply get Nagios to execute your tests.
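As a sketch, a wrapper that translates a test suite's result into a Nagios-style check might look like this. The command and messages are invented; only the exit-code convention (0 = OK, 2 = CRITICAL, 3 = UNKNOWN) is Nagios's:

```python
import subprocess

# Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 2, 3

def run_suite_as_check(command):
    """Run a test suite command and translate its outcome into a
    Nagios-style (exit_code, message) pair."""
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        return NAGIOS_UNKNOWN, f"UNKNOWN - could not run tests: {exc}"
    if result.returncode == 0:
        return NAGIOS_OK, "OK - acceptance tests passed"
    return NAGIOS_CRITICAL, "CRITICAL - acceptance tests failed"

# "true" stands in for a real test command, e.g. ["./run-acceptance-tests"].
code, message = run_suite_as_check(["true"])
print(code, message)
```

Pointed at a real suite, a script like this can be registered as an ordinary Nagios check, so the same tests that gate your pipeline also monitor production.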
Monitoring your application behaviour in production also gives you the opportunity to remove tests from your deployment pipeline if it’s acceptable to the business for a feature to be broken/degraded in production for a certain amount of time. This can be a useful trade-off for tests that are inherently slow and not critically important.
Non-functional requirements aren’t
People often call requirements about resilience/robustness/security/performance “non-functional requirements” because they’re requirements that are not for features per se. However, they are still things our customers will want, and they are still things that our stakeholders can prioritise against features – as long as we have done a good enough job of explaining the cost and risk of doing or not doing the work.
Technical people typically help with coming up with these requirements, but they should be prioritised along with our features.
There’s no point building the fastest, most secure system if no-one ever uses it because we haven’t tested our assumptions that there’s a market for our product. Similarly, there may be a high risk if a particular piece of security work is not completed.
Technical people often don’t like trusting non-technical people to make these decisions – partly because we’re often bad at articulating and providing evidence for the risks associated with delaying this work, but also because we sometimes understand the risks/benefits of “non-functional” requirements better than the feature requirements so don’t agree with their decisions.
Both are communication problems. Business people are used to weighing up risks and costs of delay with normal features. There is always an opportunity cost to working on something that is not needed, and there is a risk in delaying features because a competitor might get there first.
Green-first performance tests
This was an interesting insight from Dave Farley. When writing tests we usually write a failing test first, then make it pass, then refactor. With performance tests this approach can leave you unsure whether your test is failing because the implementation is too slow, or because your test code is too slow.
If you make your test pass first using a stub implementation and then see it fail by switching to the real implementation then you will have confidence that the test is not the bottleneck.
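A sketch of the green-first idea, with an invented lookup function and an assumed performance budget. The stub run proves the harness is fast enough before the real implementation is timed:

```python
import time

def measure(fn, calls=100):
    """Time repeated calls to fn. The harness is identical for the stub
    and real runs, so any difference comes from the implementation."""
    start = time.perf_counter()
    for _ in range(calls):
        fn()
    return time.perf_counter() - start

# Step 1: run the test against a stub first. If this run is within
# budget, the test harness itself is not the bottleneck.
def stub_lookup():
    return "value"

# Step 2: switch to the real implementation (invented here: a small
# computation standing in for a database or service call) and let the
# test pass or fail on the implementation's own merits.
def real_lookup():
    return str(sum(range(10_000)))

BUDGET = 0.5  # seconds per 100 calls; an assumed requirement

assert measure(stub_lookup) < BUDGET, "harness too slow - fix the test first"
assert measure(real_lookup) < BUDGET, "implementation too slow"
print("within budget")
```

If the stub run blows the budget, the fix belongs in the test, not the code under test.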
Customer as the bottleneck
Let us suppose we have development practices that allow us to iterate quickly, automated deployments that let us deploy rapidly, and operations are following the same practices – allowing us to promote to production rapidly.
In this scenario the customer becomes the bottleneck: how quickly can we come up with new things to build? No matter how much we automate the rest of the process, talking to customers remains manual. It’s also one of the hardest things to improve because it requires skills that are difficult to teach and learn.
It’s all about the people
There seemed to be strong agreement in all discussions that for continuous delivery to work you need people to think in the right way and to foster the right culture. Tools can help, but ultimately the technical challenges aren’t actually very big.
You can do continuous delivery with very simple tools and scripts. To do continuous delivery well you need people to buy into the idea, and you need to break down barriers that introduce asynchronicity into your delivery pipeline.
However, tools can help with persuasion. If you can demonstrate how much quicker something can be with automation then it goes a long way to selling the idea.
It’s always worth sharing your experience
I was struck by how unusual some of our working practices were, and how useful some people found our ideas.
Similarly, I learnt a lot from others at the conference. It was especially interesting to hear people’s experience introducing continuous delivery into large organisations that were extremely slow in delivery and risk averse.
It’s easy to become blind to things you do every day which may actually be useful and unknown to those outside your niche community.