Thursday, 16 August 2012

What does "software design" actually mean?


Have you ever thought about software design and what it really means? I have spent some time investigating that question. At some point I became curious enough to conduct my own informal enquiry: I asked almost every engineer I met two simple questions, namely what design is and what it means. The respondents fell into quite a few groups: civil, electrical, mechanical, electronics and even aviation engineers, with varying levels of expertise.
All their answers focused on inventing a solution to a given problem. They then pointed out that the problem itself should be divided into subproblems small enough to be tackled separately; that principle is called 'divide and conquer' and is widely known in every branch of science. What's more, they highlighted that a series of prototypes or proofs of concept, together with their validation, had to be produced. If they were not happy with the test results, they had to redesign the prototype. The second important observation was the blueprint produced as the result of the whole design process. There is also one more, extremely crucial concept hidden between the lines: the price of implementing a solution. Everywhere apart from IT that price is very high; implementation almost always takes time, manpower and money. In IT, however, the whole so-called build process is relatively fast. Compared to other branches of engineering, I would be tempted to say we get it almost for free. I realized that this fundamental observation, and more importantly its implications, are something we seem to forget and perhaps do not fully understand. I came across two people who dealt with this problem before me:
  1. Jack W. Reeves - C++ Journal (1992) article titled: "What is software design?"
  2. Robert C. Martin (aka Uncle Bob) in Clean Code Videos.

False Parallels


It is worth mentioning the existence of false parallels in the software industry. We all know that software design is an abstract concept, definitely not something tangible the way mechanical parts, buildings or aircraft are. Our community often looks for similarities in other disciplines, just to demonstrate usage or explain some hidden concept. The ideas of a network bridge (bridge), a JMS queue (pipe) or a database (storage) are excellent examples. Although there are many true and valid parallels, the harmful part is the association built in our minds that every single IT concept can be explained by similarities drawn from other branches of engineering. I would say it is not even entirely our fault; it is simply the way our brains work. Generalizing is a feature of the human brain, not an obstacle or impediment we have to fight. It is a process we must be aware of and use to understand our potential strengths and limitations. In fact, because software design is abstract, the majority of our concepts should rather be treated as mathematical or physical problems.


Why is software engineering so different from other engineering disciplines?


To be honest, there are a couple of reasons, but the most important is the price we pay for building the code base. As the whole build process is automated and needs only a compiler and/or linker, the overall price is extremely low; we may as well assume it is free. If you calculate the return on investment (ROI), which is basically a rate showing what you obtain for a certain outlay, you will get a result I call an infinite ROI. This is something every decent businessman is aiming for: an infinite return on investment. Let's get down to brass tacks and show the formula:
ROI = ( number of builds performed ) / ( time/money spent to build or buy the compiler )
As you can see, the number of builds performed keeps increasing over time, whereas the time/money spent on the compiler stays the same, so the ratio grows without bound. Hence, in the long run, it is worth sacrificing almost any means to get an automated build process. Incidentally, the same argument applies to continuous integration and continuous delivery as well. Once you have your compiler in place, you can build forever, for free.
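The formula above can be sketched in a few lines of code. This is only an illustration of the argument, with hypothetical numbers for the one-time compiler cost; the function name and figures are mine, not from any real project.

```python
# Illustrative sketch of the "infinite ROI" argument: the compiler is a
# one-time investment, so each additional build raises the ratio, and
# the ROI grows without bound as builds accumulate over time.

def build_roi(builds_performed: int, compiler_cost: float) -> float:
    """ROI as defined in the article: builds done per unit of compiler cost."""
    return builds_performed / compiler_cost

# Hypothetical one-off toolchain cost of 1000 units of time/money.
COMPILER_COST = 1000.0

for builds in (10, 1_000, 1_000_000):
    print(f"{builds:>9} builds -> ROI {build_roi(builds, COMPILER_COST):,.2f}")
```

The point of the sketch is only that the denominator is fixed while the numerator keeps growing, which is why automating the build pays for itself no matter how expensive the toolchain was.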
The next pretty significant thing is the blueprint. All other engineering disciplines consider the blueprint a document describing the design. That document is handed over to the manufacturing team so that they can implement the concept prepared by the engineers. The key point is that the blueprint acts as the input for the manufacturing team; the output is the finished product. Based on that observation, we may define our code base as the blueprint. That means no other specification (Confluence, JIRA, a wiki page or even a Word document) is a valid design document. The only thing that matters is our compiler's input: the source code. We can therefore conclude that software has a very expensive design phase (like any other engineering discipline) but is cheap to implement, because we have an automated manufacturing team: the compiler. By contrast, building a dam, a ship or a spacecraft has both a very expensive design phase and a very expensive build phase.


What everybody knows, but few understand the implications of


One of the major consequences of a very fast build phase is an increased rate of change in software requirements; I would even call it a constant evolution of requirements. Only by looking at the 'changing requirements' problem from this angle are we in a position to understand that we are no longer facing the problem of building a bridge stretched between points A and B. Software requirements have become an amorphous, organic, living system rather than a strict document in which every single design change costs time and money and forces the whole design process to start from scratch.
Another well-known and established process in engineering is validation, or testing. Typically, we build a model of, say, an aeroplane, put it in a wind tunnel with a bunch of probes, and start measuring its performance and other parameters. This process is called validating, or proving, the concept. It is exactly what we are trying to do with so-called spikes, proofs of concept (POCs), prototypes and, of course, tests. Usually, when a spike does not fulfil the requirements, we either scrap it or redesign it a bit. Here comes the last aspect of the whole validation process: redesigning. We call it refactoring, but in a nutshell it means refining our design so that it becomes more powerful and resilient without harming the client's needs.
Kent Beck's TDD cycle (red-green-refactor) is our response to constant change in business requirements. To be fair, Agile as such is the answer, but let's focus on TDD. As everywhere else, we have a validation phase (red), an implementation phase (green) and a redesign phase (refactor). Without an established TDD process we are extremely vulnerable to errors, which cost our clients time and money. The rule of thumb is that even a small piece of code is highly likely to be revised or fully rewritten during the TDD process.
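The red-green-refactor cycle can be sketched with Python's built-in unittest module. The leap-year function and the test names below are purely illustrative; the point is the order of work: the tests are written first and fail (red), the simplest implementation makes them pass (green), and only then is the design refined (refactor) with the tests as a safety net.

```python
# A minimal red-green-refactor sketch using the standard unittest module.
import unittest

# GREEN: the simplest implementation that satisfies the tests below.
# (In a real TDD session this function would not exist yet when the
# tests were first run, which is what makes them "red".)
def leap_year(year: int) -> bool:
    """Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

class LeapYearTest(unittest.TestCase):
    # RED: these tests are written before the implementation.
    def test_divisible_by_four_is_leap(self):
        self.assertTrue(leap_year(2012))

    def test_century_is_not_leap(self):
        self.assertFalse(leap_year(1900))

    def test_four_hundred_is_leap(self):
        self.assertTrue(leap_year(2000))

if __name__ == "__main__":
    # exit=False keeps the interpreter alive after the run;
    # the explicit argv stops unittest from parsing script arguments.
    unittest.main(exit=False, argv=["tdd-sketch"])
```

The REFACTOR step is the one this article is really about: once the suite is green, the implementation can be reshaped freely, because a cheap build plus the tests tells us within seconds whether the redesign broke anything.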
If the application we were writing were an aircraft, everybody would know that it had to be built in a thorough and decent manner, tested a hundred times, with all scenarios checked (weird edge cases on top) before going to production. It is no surprise that this kind of process takes time. What's more, it is expensive and it must be, because building is very expensive and can almost always be done only once.
In IT we have a cheap build phase, but that does not excuse us from building a proper solution. We still need to test what we built and redesign it from time to time. This is mainly what TDD, ATDD and BDD provide. That is how we should design our applications. That is our process! So what if compiling is quick? Let's leverage that fact to our advantage, not against us.
I believe everybody knows at least one person who says that we don't need tests and that refactoring is a waste of time. By saying that, we undermine our process, our credibility and our understanding of what design is all about.


Agile is our process


Software engineering is so similar to, and at the same time so different from, other branches of engineering that we deserve a distinct design process. That is why we have to, and want to, be Agile, not Waterfall. As Winston Royce says in the very first sentence of his paper "Managing the development of large software systems", where he introduced the Waterfall process:
"I am going to describe my personal views about managing large software developments."
Waterfall is his view of how large software development processes should be led; it does not mean it is the only one. Moreover, W. Royce worked for Lockheed, an aviation company, which is definitely the sort of business where safety is the crucial factor. What he did was move the whole manufacturing process from aircraft to IT. Now we know why it was not so effective. Of course, Waterfall is very useful in projects like building a stadium, an airplane, computer hardware, a bridge or a dam. That kind of process fits such undertakings perfectly well, but not IT, where the build phase is cheap. I will keep repeating that fact, as it is crucial to understanding where we are and how it affects our job.


It's all about process, not coding.


Engineering is about setting up the proper process, not about discussing whether the final blueprint should be written in Word, CAD or something else. Software documentation (other than source code) should only capture:
  • important information from the business/problem space that is not directly included or visible in our design (which is the code base)
  • aspects that are difficult to extract from the code
Everything else lives in the most useful place: the code base. Our source code is what is passed to the compiler and is the matrix for building the real (binary) code that runs on various machines.
That is why I am appealing to you: please stop using parallels based on civil engineering or architecture in a thoughtless way that establishes wrong associations in people's minds (especially non-software engineers'). As I said earlier, the human brain tends to generalize, so if we select bad examples, they sprout, embed themselves and become a foundation for further associations. This is a very tricky process: the initial comparison may seem a very good idea, yet its implications can be tragic. We subconsciously compare the Agile and Waterfall production processes, tend to equate them, and start to think of them as the same sort of process applied to two different domains. This is wrong! The fundamental difference is the time it takes to build a solution. That difference is so important that it changes the whole approach to the software production process. It is no longer economical to spend a year designing and then implementing, whatever the project is about; it is better to build something and ask the client for feedback. Of course, I am not saying we should drop TDD and refactoring. I am just saying that the design phase as defined in Winston Royce's paper is no longer valid in software development. We should use the TDD cycle instead.



1 comment:

  1. Fully agree with the point of view presented. Waiting for more impatiently:)
