The Problem with Documentation
I hate software documentation. Or at least how I often see it done.
Maybe you have excellent documentation. But then lots of code changes happen – maybe even an entirely new architecture – and the documentation goes out of date.
Have you ever heard the phrase “No comment is better than a wrong comment”? If a comment, example, or high-level document is wrong, it’s worse than having no documentation in the first place, because it leads you down the wrong path. It would have been faster to learn how things work by reading through the code yourself. Instead, you are led down a rabbit trail in the wrong direction.
It is very easy to forget to update documentation – or to ignore it due to lack of time. Code gets tested (well… at least I hope your code does) through unit tests, manual tests, and automated system tests. Documentation gets a lot less care. It generally doesn’t get read or tested in the developer environment (government work might be an exception, where, in my opinion, too much emphasis is put on documentation). There’s also nothing to tell you that the documentation is out of date, short of reading through the document every time you make a code change. Who wants to do that?
Introducing Documentation Continuous Integration
Take a look at the Python Flask GitHub page. Do you see how it has an examples folder? How awesome is that? When you clone that code, you already have several projects to view for examples of using Flask. I did not look at all of them, but at least the Flaskr example includes the Flask package in its installation script (setup.py). In addition, the .travis.yml file (the configuration file for the Travis Continuous Integration server) installs and runs the unit tests for all that example code.
Let me repeat… HOW AWESOME IS THAT!?
Not only can you look at example code for how to use the Flask Python library… but you can also verify those examples are up-to-date. They get checked by continuous integration. A CI server verifies that your code passes unit tests each time a commit is made. Well, part of those tests is checking that the example code ALSO passes all its tests. So if the documentation fails… your build fails… which hopefully sends you an email with a bug to fix.
This is documentation that tells you when it needs to be updated. This is so vital to being able to trust that your documentation is up-to-date!
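To make the idea concrete, here is a minimal sketch of a check a CI job could run. It is not how Flask does it; it simply assumes your examples are standalone scripts in one folder, and the function name `run_examples` is made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def run_examples(examples_dir):
    """Run every *.py file in examples_dir and return the names of those that fail."""
    failures = []
    for script in sorted(Path(examples_dir).glob("*.py")):
        # Run each example in a fresh interpreter so one example cannot affect another.
        result = subprocess.run([sys.executable, str(script)], capture_output=True)
        # A non-zero exit code (for instance from an uncaught exception) marks it as stale.
        if result.returncode != 0:
            failures.append(script.name)
    return failures
```

A build step could then call `run_examples("examples")` and exit with a non-zero status whenever the returned list is not empty, which is what makes the build fail when an example goes out of date.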
Automated Class Diagrams or Dependency Diagrams
Let’s go a step further. Suppose you use UML diagrams or some other tool to show classes and their interactions or dependencies in your code. Taking the time to automate this could make your documentation so much more useful, as it would always be accurate.
- The process to generate the documentation should be automated (creating a dependency graph or UML diagram should not be manual)
- Each commit can have the build server create the updated diagram
- The build server can post that updated documentation to wherever it is hosted
This should also be the process you use if you use things such as Doxygen or some other comment formatting to automatically generate HTML documentation.
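As a rough sketch of the first step in that list, the snippet below builds a dependency graph from import statements and writes it out as Graphviz DOT text, which a build server could then render and publish. It assumes a Python code base, and the helper names `imported_modules` and `to_dot` are invented for this example.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            found.add(node.module.split(".")[0])
    return sorted(found)


def to_dot(sources):
    """Build Graphviz DOT text from a {module_name: source_code} mapping."""
    lines = ["digraph dependencies {"]
    for name in sorted(sources):
        for dep in imported_modules(sources[name]):
            lines.append(f'    "{name}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)
```

Because the graph is derived from the code on every commit, nobody has to remember to redraw it.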
What About High-Level Documentation?
I suppose at some point, you want some high-level documentation that doesn’t use some tool to generate it. The 10,000-foot view that tells how everything works together.
While this is less likely to need to be updated (only big architectural changes would make it change), it’s still possible it could go out of date.
What if instead of only writing the document, we have a system test that ties all the pieces of software together? That system test would use lots of code in order to make it work. For each ‘package’ that it uses, it could reference that project’s documentation (and tell how the various parts of the system interact by showing code examples, which are checked by your CI server). This way, if any project is no longer there, the documentation will be updated automatically.
- If the project is no longer used, that reference will not be there.
- The interactions between the systems – those code examples – are checked by CI, which will flag them if they are no longer valid
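One possible way to get the second point, assuming the high-level document writes its code examples as Python interactive sessions (lines starting with `>>>`), is to let the standard `doctest` module run them. The wrapper `check_document` below is a hypothetical helper, not part of any library.

```python
import doctest


def check_document(text, name="overview"):
    """Run the >>> examples embedded in a document and return (failed, attempted)."""
    # Parse the prose and collect every interactive example it contains.
    test = doctest.DocTestParser().get_doctest(text, {}, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Each example's real output is compared with the output written in the document.
    results = runner.run(test)
    return results.failed, results.attempted
```

A CI step could read the document from disk, pass its text to `check_document`, and fail the build if the first number is greater than zero.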
Each project’s documentation is kept current by its own set of code examples. More documentation can be written at the top of source files, which can then be imported into the document (or the document can be generated from those comments in source files). While the comments at the top of source code might not be validated, they are more likely to be valid since they live with the code. Those comments should also be higher-level so they hopefully don’t have to change as much.
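Generating the document from those top-of-file comments could look something like the sketch below. It assumes Python sources whose file-level comment is a module docstring; `module_overview` is a name made up for illustration.

```python
import ast


def module_overview(sources):
    """Build an overview from the top-of-file docstrings in a {filename: source} mapping."""
    sections = []
    for filename in sorted(sources):
        # ast.get_docstring returns None when the file has no module docstring.
        doc = ast.get_docstring(ast.parse(sources[filename]))
        sections.append(f"{filename}\n{doc or '(no module docstring)'}")
    return "\n\n".join(sections)
```

The output is plain text with one section per file, so it can be pasted into or included by the high-level document on every build.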
Documentation can be important in showing others how your code works. However, keeping it accurate and up-to-date is generally not given much thought. Given some thought and time, automation can be applied to documents to keep them updated without a lot of effort (after the initial setup time). This can keep your team better prepared to teach what is being done to others within the team and outside the team. It can also make bringing new people up to speed quicker. Give your documents the same love you give your code quality! And if your code quality sucks, fix that too. 🙂