Indexing the Invisible

I've been invited to address the Pacific Northwest Software Quality Conference in November and have agreed to trial my talk as a lightning talk in January. This is a work in progress.

The proper title is Indexing the Unknown, making clear that this is a cognitive limitation, not a mechanical omission.

experience

Test-driven development was born at the dawn of desktop computing, when it counted as a success if one user found their computer empowering. Datacenter-based software as a service has shown that no one person can explain what software does or why it might stop doing it. This is the invisible we seek to know.

When things go wrong, our most talented are mustered to make things right as soon as possible. In the quiet calm of the following day, the expectations of many are aligned by the cold reality of what happened. If the cost of incident response is considered an investment, the return on this investment is knowing the software a little better.

A few architects at my previous employer anticipated problems that might occur with the upcoming conversion of a monolith product to micro-services. This led to the creation of a property graph database, which I maintained for a number of years and have spoken about at the Domain Driven Design conference in Denver.

approach

Although the value of this work has been recognized, it has not been duplicated because such a graph has to be largely complete before it delivers any value. Now I suggest an incremental approach.

Tip: Make many small graphs with tens of relations, then compose them together to make the diagram you need. A "recommender" makes this feel like browsing rather than composing graph queries against a large database.
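As a rough illustration of composing small graphs, here is a minimal sketch. It assumes each graph is a dict with "nodes" and "rels" arrays, that relations point at nodes through hypothetical "from" and "to" indexes, and that two nodes are the same thing when they share a type and a "name" property. None of these choices are confirmed here; they are assumptions made for the example.

```python
def compose(graphs):
    """Merge small graphs, treating nodes with the same type and name as one.

    Assumption: every node carries a "name" property that identifies it.
    """
    merged = {"nodes": [], "rels": []}
    index_of = {}  # (type, name) -> index in merged["nodes"]
    for graph in graphs:
        remap = {}  # index in this graph -> index in the merged graph
        for i, node in enumerate(graph["nodes"]):
            key = (node["type"], node["props"].get("name"))
            if key not in index_of:
                index_of[key] = len(merged["nodes"])
                merged["nodes"].append(
                    {"type": node["type"], "props": dict(node["props"])})
            else:
                # A later graph may contribute extra properties to a shared node.
                merged["nodes"][index_of[key]]["props"].update(node["props"])
            remap[i] = index_of[key]
        for rel in graph["rels"]:
            # Copy the relation, rewriting its endpoints to merged indexes.
            merged["rels"].append(
                {**rel, "from": remap[rel["from"]], "to": remap[rel["to"]]})
    return merged


deploy = {
    "nodes": [{"type": "Service", "props": {"name": "checkout"}},
              {"type": "Cluster", "props": {"name": "us-west"}}],
    "rels": [{"type": "RunsOn", "from": 0, "to": 1, "props": {}}],
}
data = {
    "nodes": [{"type": "Database", "props": {"name": "orders"}},
              {"type": "Service", "props": {"name": "checkout"}}],
    "rels": [{"type": "Writes", "from": 1, "to": 0, "props": {}}],
}
merged = compose([deploy, data])
```

Here the "checkout" service appears in both small graphs, so the merged graph has three nodes and two relations, both leaving the one shared node.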

Tip: Build and share small graphs as JSON files in the source code repositories already in use, as just another kind of "Read Me" file. Use build automation to update these with every check-in.

Tip: Have quality experts attend incident reviews to be sure that the invisible knowledge thus exposed is captured in aspect-specific graphs. Pull requests for updates become an expected product of each review.

We've tested and improved this approach working in open source for a few years now. We're in the process of separating the technology from the specific projects that begot it and wish to harden the elements as required in commercial practice.

technology

Property Graph: We represent a graph by two arrays of objects, each of which includes a type and a hash of additional properties. One array holds nodes, the other relations. These are linked together with automatically assigned ids, the indexes provided by the arrays. Both arrays are serialized together as JSON when stored or shared.
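The two-array representation might look something like the following sketch. The field names "props", "from" and "to", and the class and method names, are assumptions chosen for illustration rather than the format actually in use.

```python
import json


class Graph:
    """A property graph held as two lists: nodes and relations."""

    def __init__(self, nodes=None, rels=None):
        self.nodes = nodes if nodes is not None else []
        self.rels = rels if rels is not None else []

    def add_node(self, kind, props=None):
        # The node's id is simply its index in the nodes list.
        self.nodes.append({"type": kind, "props": props or {}})
        return len(self.nodes) - 1

    def add_rel(self, kind, from_id, to_id, props=None):
        # "from" and "to" (assumed names) hold node indexes;
        # the relation's id is its own index in the rels list.
        self.rels.append(
            {"type": kind, "from": from_id, "to": to_id, "props": props or {}})
        return len(self.rels) - 1

    def to_json(self):
        # Both arrays travel together in one JSON document.
        return json.dumps({"nodes": self.nodes, "rels": self.rels}, indent=2)

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        return cls(data["nodes"], data["rels"])


g = Graph()
checkout = g.add_node("Service", {"name": "checkout"})
orders = g.add_node("Database", {"name": "orders"})
g.add_rel("Writes", checkout, orders, {"since": "2021"})
restored = Graph.from_json(g.to_json())
```

Because ids are just array positions, a graph round-trips through JSON with no separate id bookkeeping, which suits small files kept beside the source code.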

SVG Diagrams: Although the graphs are independent of any particular rendering, we find it convenient to convert them to SVG using Graphviz at the command line, or interactively in the browser using WASM Graphviz. Our small graphs are easily within the sweet spot for automatic layout in Graphviz. GitHub has excellent support for displaying and even diffing SVG files.
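One plausible route to SVG is to emit Graphviz DOT text and let Graphviz do the layout. The sketch below assumes the graph is a dict with "nodes" and "rels" arrays, hypothetical "from" and "to" indexes on each relation, and a "name" property used as the node label. The function name to_dot is likewise made up for this example.

```python
def to_dot(graph):
    """Render a {"nodes": [...], "rels": [...]} graph as Graphviz DOT text."""

    def quote(text):
        # DOT strings are double-quoted; escape backslashes and quotes.
        return '"' + str(text).replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = ["digraph {", "  node [shape=box]"]
    for i, node in enumerate(graph["nodes"]):
        # Label with the assumed "name" property, falling back to the type.
        label = node["props"].get("name", node["type"])
        lines.append(f"  n{i} [label={quote(label)}]")
    for rel in graph["rels"]:
        lines.append(
            f"  n{rel['from']} -> n{rel['to']} [label={quote(rel['type'])}]")
    lines.append("}")
    return "\n".join(lines)


graph = {
    "nodes": [{"type": "Service", "props": {"name": "checkout"}},
              {"type": "Database", "props": {"name": "orders"}}],
    "rels": [{"type": "Writes", "from": 0, "to": 1, "props": {}}],
}
dot = to_dot(graph)
print(dot)
```

Saved to a file such as graph.dot, the output can be rendered at the command line with `dot -Tsvg graph.dot -o graph.svg`.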

Aspect-Oriented Annotations: Our most promising long-term approach to maintenance has each subsystem separated into meaningful aspects, and has repository resources (source code and configuration files) annotated with aspects, nodes and relations exactly where they are realized.

Click to Fix: We attach the annotation's source file and line number as properties of each relation. If something is found wrong or incomplete, a click on the relation label will open the web editor, ready to make corrections and file a pull request.

Browse to Discover: Interactive browsing will recommend additional sources or aspects that might straddle whatever issue is of concern in the moment. Touch points between graphs are highlighted with linked resources a click away.
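A very simple recommender could rank other graphs by how many nodes they share with the one on screen. The sketch below is only one way to do this and is not the recommender described here: it assumes graphs are dicts with a "nodes" array, that a node is identified by its type plus a "name" property, and that the available graphs are held in a dict keyed by title. The function names are hypothetical.

```python
def node_keys(graph):
    """Identify each node by (type, name); "name" is an assumed property."""
    return {(n["type"], n["props"].get("name", "")) for n in graph["nodes"]}


def recommend(current, library):
    """Rank graphs in library by how many nodes they share with current.

    Returns (title, touch_points) pairs, most shared nodes first.
    """
    here = node_keys(current)
    scored = []
    for title, graph in library.items():
        shared = here & node_keys(graph)
        if shared:
            scored.append((title, sorted(shared)))
    # Most touch points first; ties broken alphabetically by title.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored


def nodes(*pairs):
    # Small helper to build a node-only graph for the example.
    return {"nodes": [{"type": t, "props": {"name": n}} for t, n in pairs],
            "rels": []}


current = nodes(("Service", "checkout"), ("Database", "orders"))
library = {
    "deploy": nodes(("Service", "checkout"), ("Cluster", "us-west")),
    "billing": nodes(("Service", "checkout"), ("Database", "orders"),
                     ("Service", "invoices")),
    "search": nodes(("Service", "search")),
}
suggestions = recommend(current, library)
```

In this example "billing" is suggested ahead of "deploy" because it touches the current graph at two nodes rather than one, and "search" is not suggested at all. The shared nodes returned with each suggestion are the touch points a browser could highlight.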

future

We are promoting a variation of user-generated content as opposed to outsourcing to tool or database vendors. We have seen this work with wiki and will continue enhancing federated wiki with these mechanisms. We are willing to distance ourselves from wiki in this case because the most challenging issues emerge in high-volume, revenue-critical applications. We hope to learn together.