Developers with an eye toward maintainability often expend a lot of effort commenting their code and keeping comments up-to-date. For instance, comments may address why methods are called in a specific order, how code deals with a difficult bug, or how a piece of code interacts with other pieces of a system further in the process.
Out of necessity, code comments are usually focused on how and why code works at that particular moment in a code base. In other words, when developers see a comment, they expect it to describe how the code currently works.
As an inline form of documentation, it makes sense that comments address what's currently happening. Well-considered comments can, after all, help a developer navigate a complex system of code.
Describing the current structure, organization or reasoning of code only tells half the story, however.
What comments typically leave out is all the paths not taken: methods tried and discarded, performance issues or bottlenecks encountered, libraries or third-party services used and removed, and code written but ultimately not used.
These paths not taken often contain valuable information that can help future developers understand why certain decisions were made and avoid heading down dead-ends or ending up with a lackluster solution.
At this point, some developers might ask whether that's the whole idea behind version control: to keep a record of what transpired. Version control systems don't usually tell the whole story either, though.
For developers who use version control as a repository for all code, whether published or not, version control certainly can move the needle toward maintaining a transparent record of developer activity.
But many developers don't use version control in this manner, choosing instead to keep a record only of code that's made it beyond the "just trying it" stage.
Additionally, at some point, even when developers use version control to keep a complete record, the history of the project will likely become overwhelming, reducing the chances that developers glean useful information from it.
With most projects, there's a gap between the story that both comments and version control tell and the effort that developers actually expend. So, for many projects, it's worth considering whether there's enough value in that difference to capture historical information that may provide insight into a project.
What would such a system document? While there are countless reasons that code, system designs or ideas might not make it into production, here are some of the findings that historical, retrospective documentation might record:
Developers often spend an enormous amount of time investigating how they would implement certain features before actually implementing them, but those findings are often discarded once a choice is made to move forward with a particular implementation.
For example, developers may compare the ease with which various programming languages can tackle the problems at hand, the availability of open source implementations to solve particular problems in that programming language's ecosystem, or cloud services that are available from various providers.
Why developers choose certain languages, libraries or services over others is important information that can benefit future developers working on a project.
Without it, those developers may not know that certain decisions were made, for instance, because a certain language had better libraries dedicated to solving particular problems, or that a database was designed a certain way because that implementation would produce cost savings based on how a particular cloud provider billed for certain database services.
Well-known programming problems such as sorts and searches offer known academic measurements of efficiency, but the reality of programming is that real-world efficiency of an application isn't nearly so predictable.
Utilizing appropriate data structures to good effect can have a large impact on efficiency, but overall performance typically depends on a wide range of factors, often including external ones, that may not be known in advance.
As a result, there are often numerous ways to implement a given software feature. And, if that feature is critical to the overall performance of an application, it may be necessary to test various methods of implementing it.
Developers often tend toward selecting code that's encapsulated, extensible and "future proof" because it helps keep the level of maintainability of an application high. However, sometimes this type of code isn't the most performant — for instance, because it relies on web browser features that are slow or requires too much processing power for mobile devices.
Any concrete implementation of a particular feature usually has tradeoffs in comparison to other possible implementations, and the reasoning behind choosing one implementation over another is often valuable information that's excluded from comments and source history.
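As a rough illustration of keeping that reasoning alongside the measurements, the following Python sketch times two interchangeable implementations and bundles the results with the choice that was made. The function names, the sample task (removing duplicates while preserving order) and the shape of the record are all assumptions made for this example; they are not drawn from any particular project.

```python
import timeit


def dedupe_with_list(items):
    """Keep first occurrences using a list membership check (quadratic time)."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def dedupe_with_dict(items):
    """Keep first occurrences using dict key ordering (linear time on average)."""
    return list(dict.fromkeys(items))


def compare(candidates, data, repeat=3, number=5):
    """Return {name: best observed seconds per call} for each candidate."""
    results = {}
    for name, func in candidates.items():
        timings = timeit.repeat(lambda: func(data), repeat=repeat, number=number)
        results[name] = min(timings) / number
    return results


def record_decision(results, chosen, reason):
    """Bundle the measurements with the choice and the reasoning behind it."""
    return {"chosen": chosen, "reason": reason, "timings": dict(results)}


if __name__ == "__main__":
    data = [i % 50 for i in range(2000)]  # hypothetical workload
    results = compare({"list": dedupe_with_list, "dict": dedupe_with_dict}, data)
    decision = record_decision(
        results,
        chosen="dict",
        reason="Illustrative only: fill in what the measurements showed.",
    )
    print(decision)
```

The timings will vary from machine to machine, which is part of the point: the record preserves what was measured at the time, along with the alternative that was rejected.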
Information about how and why code is restructured is often lost in the low fidelity of comments and the information overload of source control history. The motivations behind changing a system often get lost in the sheer number of details that change due to the restructuring. For example, infrastructure may be removed or added or new services introduced.
Restructuring of code usually happens for a reason: a current system may not be extensible enough to accommodate emerging business requirements, or the goal may be to improve security, performance or efficiency. The reasons behind this restructuring can be lost in the activity of the restructuring itself, but that information may be valuable to future developers who work on a project.
At the architectural level of a system, it can be helpful to carefully track the relationship between various components that are deployed, along with how information is transmitted through them and where it's stored. Projects will often use network diagrams or similar mechanisms to keep track of this type of information, but a differential analysis of the various diagrams and why they've changed over time can provide useful information for understanding the various design decisions of a project.
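One way to picture such a differential analysis is to treat each diagram as a set of connections between components and compare two snapshots. The Python sketch below does this under that simplifying assumption; the component names, the two snapshots and the attached reasons are hypothetical.

```python
def diff_architecture(old, new):
    """Return the connections added and removed between two snapshots.

    Each snapshot is a set of (source, target) pairs, a simplified
    stand-in for a network diagram.
    """
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def annotate(diff, reasons):
    """Pair each changed connection with a recorded reason, if one exists."""
    return {
        kind: [(edge, reasons.get(edge, "no reason recorded")) for edge in edges]
        for kind, edges in diff.items()
    }


# Hypothetical snapshots of the same system at two points in time.
SNAPSHOT_V1 = {("web", "api"), ("api", "postgres")}
SNAPSHOT_V2 = {("web", "api"), ("api", "cache"), ("cache", "postgres")}

# Hypothetical reasons captured when the change was made.
REASONS = {
    ("api", "cache"): "read-heavy endpoints were overloading the database",
}

if __name__ == "__main__":
    changes = annotate(diff_architecture(SNAPSHOT_V1, SNAPSHOT_V2), REASONS)
    for kind, entries in changes.items():
        for edge, reason in entries:
            print(kind, edge, reason)
```

Connections without a recorded reason are reported as such, which makes gaps in the historical record visible rather than silent.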
It can be difficult to track the addition and removal of third-party integrations, especially when code is decoupled from the actual implementation. For example, if code implements an "email" interface that decouples sending emails from the actual implementation used to send those emails, and then that implementation is later removed, it can be difficult to keep track of the fact that a particular provider's services were utilized at one point but ultimately removed.
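The email scenario can be sketched in Python as follows. This is a minimal illustration, assuming a hypothetical `EmailSender` interface, an in-memory stand-in implementation, and an invented provider history; none of these names refer to a real library or service.

```python
from dataclasses import dataclass, field
from typing import Protocol


class EmailSender(Protocol):
    """Interface the rest of the code depends on (hypothetical)."""

    def send(self, to: str, subject: str, body: str) -> None: ...


@dataclass
class InMemorySender:
    """Stand-in implementation that records messages instead of sending them."""

    outbox: list = field(default_factory=list)

    def send(self, to: str, subject: str, body: str) -> None:
        self.outbox.append((to, subject, body))


@dataclass(frozen=True)
class IntegrationRecord:
    """One entry in the history of implementations behind an interface."""

    interface: str
    provider: str
    status: str  # "active" or "removed"
    reason: str


# Hypothetical history; provider names and reasons are invented.
EMAIL_HISTORY = [
    IntegrationRecord("EmailSender", "ProviderA", "removed",
                      "per-message pricing became too costly"),
    IntegrationRecord("EmailSender", "InMemorySender", "active",
                      "placeholder used in this sketch"),
]


def removed_providers(history):
    """List providers that were once used behind an interface but are gone."""
    return [record.provider for record in history if record.status == "removed"]


def notify(sender: EmailSender, to: str) -> None:
    """Calling code sees only the interface, never the provider."""
    sender.send(to, "Welcome", "Thanks for signing up.")
```

Because `notify` depends only on the interface, nothing in the calling code reveals that `ProviderA` was ever involved. The separate history list is what preserves that fact.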
Entirely changing a project's technology has been known to happen, but it's usually a good idea to have a sound reason or reasons for switching technology stacks. Enumerating these reasons can help future developers understand not only what prompted a change, but why the previously used technology became insufficient to meet project requirements or what the new technology provided that upstaged the old technology.
The problem is that these types of issues and their ultimate resolution can be hard to track. It's great to have a streamlined view of how a system currently operates, but understanding how that system used to operate and why it no longer does can provide a significant advantage for future developers working on a system that they didn't develop themselves.
The downside to keeping track of this type of historical information is that capturing such information can be expensive — in terms of developer cost, time and effort. Additionally, as with any documentation process, there's always the possibility that documentation will fall out of sync with implementation, leading to a situation that could potentially create more confusion.
As with most things, developers should weigh the pros and cons of trying to capture this type of historical information before deciding whether it makes sense for their particular scenario.