Tuesday 5 July 2011

Slow and Dirty – A Rant by Jason Gorman at SPA2011




Introduction:

This is an extract from Jason Gorman’s rant at the SPA2011 conference. He had 20-25 minutes to get through his rant, so he asked everyone to suspend their beliefs until the end. He said that he chose the title as a result of conversations he had with some people during the conference.

Acid test: In his first slide (Slide 2) he showed a graph illustrating that improving the quality of code results in fewer bugs, but with the cost of delivery going up exponentially. He asked how many people had actually seen this famous graph before. There were more yeses than noes. He declared that this graph does not exist and that we are all liars. He reiterated that there is no industry data from the last three decades to prove that this graph actually exists.

Slide 2 - Relative Development effort
Reality: On the contrary, the reality looks more like McConnell’s curve (slide 3) below, which shows the relationship between total development time and the percentage of defects removed before release. Various studies show that, up to a point, the higher the quality of the software, the faster it gets delivered. The software delivered is not only relatively bug free, it also costs less and takes less time to deliver. There is an optimum value, indicated by the dotted line at around 95%, after which the cost goes up exponentially as one works to make the software 100% bug free. A quick poll showed that there was one person in the room whose company is to the right of the dotted line. It is clear from the evidence we have that if we want to go faster, we should take care of quality by pulling the quality lever. The majority of software development teams are to the left of the optimum point.

Slide 3 - Percentage of Bugs removed before Release


What is the cost of fixing a bug during the software lifecycle?

It is common knowledge in software development that the cost of fixing bugs rises fast late in the lifecycle, which makes it important to understand the customer’s requirements before you even leave the room and start building anything. This is illustrated in the graph below, which shows that you can fix a defect early for 1% of the cost, compared with 150% after it is released, or spend 20% extra effort and break even at the testing stage; from there on the cost goes up exponentially. There is more data from Scott Ambler to support the claim that the cost of fixing bugs depends on when they are found in the software lifecycle. The graph shows the feedback cycle lengthening as one progresses through the development cycle. That’s why it is important to stay below the optimum to reduce the cost of delivering better software.


  
Slides 4 & 5 - Cost of Bug Fixed through Development Lifecycle and Feedback loop


Does delivering maintainable code help?

It is also important to deliver maintainable code as shown by the example (slide 6) below. 

Slide 6 - Maintainability Factors
The graph (slide 7) below shows how the cost of a line of code (LOC) depends on the software release in which a bug is found. This emphasises that the more effort you put into making your software maintainable, the less effort you will need to change it later. Jason gave an example from a successful company that he worked for; the graph (slide 8) represents data taken from its code base and human resources records. It depicts the cost of a line of working code increasing exponentially over the lifetime of the software as the engineering staff grew from 20 to 1200 over a period of 10 years, the vast majority of whom did not write code. In one instance, it took Jason three months to find someone who wrote or changed code, with the majority not knowing who was responsible for writing or changing it.

 
Slides 7 & 8 - Cost of LOC during release and staff expansion

Another interesting feature of Jason’s research, shown in the graph on slide 9, was that the pace of software change slowed down dramatically around release 4. The product had grown very rapidly and then hit a plateau. This was not because the company had run out of innovations or because no more value could be derived from the product, but because the code had become too expensive to maintain. The end result was disastrous for both the company and its partners.

Slide 9 - Product Release cycle data

What do you learn from a thought experiment?

Next Jason tried a thought experiment with the audience called ‘guess four digit numbers’. He split the room into two teams, Waterfall and Agile. Team ‘Waterfall’ was asked to guess the four digit number; the answer he got was 1234, which he dismissed as incorrect. Team ‘Agile’ was asked to guess only the first digit of the four digit sequence; the first response was dismissed as incorrect but the second guess was accepted. Team ‘Waterfall’ was asked to guess the four digit number again and came up with an arbitrary 5000, which was dismissed as well. At this point Jason changed tack and asked which team the audience would bet on, given that team ‘Waterfall’ would need up to 10,000 guesses to find the right combination whereas team ‘Agile’ would require 14 guesses. The reply was team ‘Agile’; he therefore suggested that this is a perfectly good statistical proof that incremental development provides better quality solutions for complex systems more cheaply.


Slide 10 - Guess 4 numbers
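
The arithmetic behind the guessing game can be sketched in a few lines of Python. This is only an illustration under assumptions that are mine, not Jason's: the function names are hypothetical, team ‘Waterfall’ is told only whether the whole number is right, team ‘Agile’ is told after each guess whether the current digit is right, and both teams simply try candidates in ascending order. Under this simple model the worst case for ‘Agile’ is 40 guesses rather than the 14 quoted in the talk, which is still tiny next to 10,000.

```python
DIGITS = 4   # length of the secret number
BASE = 10    # each position holds a digit 0-9


def waterfall_guesses(secret: str) -> int:
    """Count guesses when feedback is given only on the whole number.

    Candidates are tried in order: 0000, 0001, ..., 9999.
    """
    for count, candidate in enumerate(range(BASE ** DIGITS), start=1):
        if f"{candidate:0{DIGITS}d}" == secret:
            return count
    raise ValueError("secret must be a 4-digit string")


def agile_guesses(secret: str) -> int:
    """Count guesses when feedback is given after every single digit.

    Each position is solved on its own by trying 0, 1, ..., 9.
    """
    count = 0
    for position in range(DIGITS):
        for digit in range(BASE):
            count += 1
            if str(digit) == secret[position]:
                break  # this digit is confirmed, move to the next position
    return count


# Worst case for both strategies is the last candidate tried: 9999.
worst_waterfall = waterfall_guesses("9999")  # 10 ** 4 = 10000
worst_agile = agile_guesses("9999")          # 4 * 10 = 40
print(worst_waterfall, worst_agile)
```

The gap comes purely from the feedback loop: confirming each digit separately turns a search over 10 × 10 × 10 × 10 combinations into a search over 10 + 10 + 10 + 10.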

The point Jason was trying to make was that what matters is not how fast you deliver but how fast you learn from what you deliver, and this is where agile development works better. If a team is learning and getting better feedback from each other then it will be ahead of the pack.

It is not how fast we deliver that defines the winners and losers; it is how fast we learn from what we deliver

Is software development like a race?

Let’s suppose for a moment that software development is like a sprint or a marathon. You would not approach a marathon the same way as a sprint, would you, or vice versa? In software development, if you start a marathon at the pace of a sprint you are likely to be carried off on a stretcher, because at some point it will be too painful to go any further. This form of development is called anaerobic software development: the team writes code at a stable pace, but lactic acid, in the form of code smells, builds up faster than their ability to deal with the problems.


  
Slides 11 & 12 - Anaerobic Software Development

Providers of web applications such as Google (slide 13) are running a race against each other; some of them, such as Vimeo, were early leaders. You can tie their positions in the race to their engineering practices. What makes YouTube such a powerful platform at the moment is Google’s engineering practices. Innovation is happening so rapidly and relentlessly that sometimes features are built to target a market and, if they do not work, they quietly disappear. All these companies are closely tied to their relentless pace.


Slide 13 - Successful Web Stories

TDD experiment: Jason described an experiment (slide 14) that he ran 3 times with TDD CAT and 3 times with non TDD CAT. He had a suite of acceptance tests that he would run only when he thought he was done. He discovered that, over a half-hour exercise, he was slower without TDD and faster with TDD. The reason was that without TDD he had to make more passes, i.e. he had to do more debugging each time he ran the acceptance tests.

Slide 14 - TDD VS NO TDD
He warned against being fooled by the illusion of done: if you make the goalposts wide enough, anything can get through. That’s why most start-ups are very happy to live with such a scenario, because they want to hit their deadlines. If we don’t take into account the subsequent cost of dealing with all the problems they let through, then everything is swept under the carpet. That’s one of the reasons why fixing things after the release is 150 times more expensive than dealing with them in the requirements meeting. You are doing yourself and your business no favours if you let this stuff through and bury your head in the sand. It is no good pretending that the problems are not there; they will come back to haunt you.
       
      
Slides 15 & 16 - Illusion of Done


How do we know we are building the right system?

Some people might ask what the consequences are of building a high quality version of the wrong thing at the same cost as building a low quality version of the wrong thing. The implication is that if it is the wrong thing then there is no point bothering with the requirements. First, if we know it is the wrong thing then we should not build it in the first place. Secondly, if we don’t know it is the wrong thing, how can we decide not to bother with quality, given that the industry data establishes that building it right costs no more than doing a bad job of it? Who in their right mind would choose to do a bad job, knowing it would not save them any money and would probably cost them more in the long run? In fact, doing a bad job would reduce your choices afterwards if you have built the wrong thing. Like a golfer, you are certainly not going to get the ball in the hole first time around. It is important to take care with the first drive so that you end up somewhere from which you can get the ball in the hole. Software development is no different, and it is no good saying that we should know better.


Slides 17 & 18 - Nobody gets it right first time


What can we learn from the likes of Facebook and Google?

We know Facebook, Twitter and Bebo did not pay much attention to the quality of their software, yet they are very successful. Their engineering base is much larger than when they were start-ups, and we really can’t learn much that is useful from them. These companies threw money at their early engineering mistakes because the internet bubble was creating artificial money for them to do that. In other kinds of businesses the money is just not there. Your new release is probably written by the same guys who wrote release 1. Our experience shows that it costs to do a better job; therefore, like a poker player, you have to play hand after hand, and you may have to lose a few before you start winning. We know from the market data that we should not cut corners right from the start, because corners turn into sharp bends that result in a crash. The problem lies in people overestimating and then lowering the bar to hit the deadline.
 
   
Slides 19 & 20 - You can't learn from the likes of Google

Jason’s message was that Facebook and Bebo are just statistical aberrations, like ghost stories or UFO sightings, that are not repeatable and are not very scientific. You cannot learn from their engineering practices and you would be wise to avoid them.

Summary:

  • Business models learned from aberrations are fool’s gold and we should stop talking about them. They are just exceptions and have nothing to teach us.
  • For the overwhelming majority, quicker and better are the same thing.
  • If you want to get your product to market sooner and do not have enough time, then the best thing to do is to lower your sights about what you can deliver: reduce your scope or reduce your ambitions.
  • If you make some mistakes then learn from them.
  • We have cost-to-quality data and our own personal experiences to back us up; therefore, we should stand our ground and stay to the left of the optimum on McConnell’s curve.
  • Throwing more people or money at a job will not help.



1 comment:

Lisa said...

Mohinder, thanks so much for this post! There's an amazing amount of information compressed into this. It presents a lot of good software practices/principles in a nutshell. I'm going to remember these for when I'm trying to explain them to people. I've never had the opportunity to hear Jason speak, this is the next best thing.