Ward Cunningham found a way of watching large software developments that recalls Domain-Driven Design's Context Map. Here we collect insights about exploring complex systems and tools that lend themselves to continually re-calibrating the model as the business grows. He re-examines the original c2 wiki, his recent (as of 2017) work on El Dorado at New Relic, and federated wiki.
YOUTUBE oPJIXPC_vn8 Ward Cunningham, Observation of Emergent Schema in Organized Development, Explore DDD 2017 youtube
Abstract from Explore DDD 2017
The original wiki was founded to understand engineering at a social level, first within one company and then more broadly around the world. We asked not for advice but for personal experience, what parts worked and why.
The agile methods incubated in wiki have enabled a pace and scale of development unimaginable at its founding. Now, situated within rapid development, we have found a new way to observe and understand software and the people who make it.
In this keynote we describe our methods, similar to data warehousing, but with the same interests as the original wiki. Our system captures from existing data a broad picture of what has been built and puts this in service of its continuing evolution.
Eric Evans reflects on the impact Ward had on the Domain-Driven Design book. "He was sort of a proto-DDDer." Ward had suggested putting the tactical patterns into an appendix to bring clearer focus on the essential patterns. Eric defends the presence of tactical patterns in the main text: they ensured the book staked a claim to concrete and practical concerns.
We're going to watch connections change.
Sometimes you don't know what you're going to get until you look.
Three examples of computing, a gesture that presents completely different paradigms of computation: a university lab full of keypunch consoles; kids gathered around a Xerox Alto running Smalltalk; an air-conditioned datacenter full of 19" cabinets.
How are we gonna live in this new world of computing?
This tool, El Dorado, enables exploration of changing relationships in a complex microservices architecture. This resembles how the original c2 wiki enabled a community to explore each other's experience learning how to program with objects and pattern languages. In particular, the tool exposes the markup used to query the database and to draw the diagrams.
Context. New Relic R&D went through a transformation into 40 or 50 autonomous teams, each with at least a developer, a UI expert, a big data expert, and so on. Internally, teams would work any way they liked. One rule, though: if one team was going to depend on another team, the dependency had to be self-service. No one could put work on another team's to-do list.
Ward asked himself, "What could possibly go wrong?" Teams are going to create lots of different services. How are we going to be able to manage the deprecation over time? Can we even count the number of services? That turned out to be hard.
It got easier with a prototype that used neo4j to graph the relationships.
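A hedged sketch of what such a prototype might look like (not Ward's actual code): turning dependency pairs into cypher MERGE statements that upsert nodes and a relationship between them. The `Service` label and `DEPENDS_ON` relationship name are made up for illustration.

```python
# Sketch: generate cypher MERGE statements for service dependencies.
# With the official neo4j Python driver, each statement could be run
# via session.run(statement); here we only build the strings.

def cypher_for_dependency(consumer, producer):
    """Upsert two Service nodes and a DEPENDS_ON edge between them."""
    return (
        f"MERGE (a:Service {{name: '{consumer}'}}) "
        f"MERGE (b:Service {{name: '{producer}'}}) "
        f"MERGE (a)-[:DEPENDS_ON]->(b)"
    )

# Hypothetical (consumer, producer) pairs gathered from some source:
deps = [("billing", "accounts"), ("alerts", "metrics")]
statements = [cypher_for_dependency(c, p) for c, p in deps]
```

MERGE (rather than CREATE) keeps the load idempotent, so the ETL can be re-run as sources change.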
Sometimes you care more about how the things are connected than what the things are. 19m20s
You can click the arrows to discover those connections. Here's an example where two systems are connected to each other, and that's the kafka topic that they're talking over in a stream of data.
What's neat about it is that I made a small diagram that meant something that could have been written on any whiteboard by any team member and they'd recognize it.
It is a kind of structure warehouse. It uses an ETL process to gather metadata from 20 different sources (as of 2017) and loads it into the graph database.
The web-based UI allows editing of the cypher query that is sent to the database and editing of the graphviz instructions for drawing the returned table of data. This enables a closed loop where people viewing these things can tamper with them to discover better ways to look at the structure.
When I showed it to Eric Evans last summer he said that's a context map, and those are bounded contexts. Here I was discovering what Eric had forecast 14 years ago. 21m44s
The nodes are the domain objects. The capitalized words are relationships between them. The faint dotted lines, labeled in lowercase, show which file or database I read to learn each relationship. 23m01s
I didn't say how to organize the graph. In trying to place these nodes and route the edges, graphviz surfaced a nice diagram of our company. Here's the product planning. This is the management structure with the teams. Connecting to teams we have how we develop software. And down here, with a little overlap, is how we deploy software.
I also have a drill-down view because the full graph is too big to look at.
I should have called this "employee" instead of "person" because here I have a former employee, not a former person.
All these dotted lines show how many of our sources talk about teams. Oh, and you know, they don't actually punctuate or capitalize the names the same way in any of them, so I apply half a dozen heuristics to get those keys to match up. 26m56s
One thing that turned out to be important was to tabulate how many heuristics had to be applied to make any connection. I can write queries to show me what heuristics got applied today and show me examples of what came out. That's how I check my work. Just throw all of that data into the graph database and write some queries. Service is another major connector, and there are 8 sources that, combined, help us understand how our services are connected. 27m21s
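The talk doesn't show the heuristics themselves, but the idea can be sketched like this: normalize keys from different sources step by step, and tabulate which heuristics actually had to fire to make a match. The specific heuristics below are assumptions.

```python
# Sketch: key-matching heuristics plus a tally of which ones did work.
from collections import Counter

HEURISTICS = [
    ("strip_whitespace", lambda s: s.strip()),
    ("lowercase",        lambda s: s.lower()),
    # Replace punctuation with spaces so "browser-agent" ~ "browser agent":
    ("drop_punctuation", lambda s: "".join(c if c.isalnum() else " " for c in s)),
    ("collapse_spaces",  lambda s: " ".join(s.split())),
]

applied = Counter()  # how often each heuristic changed a key

def normalize(key):
    for name, fn in HEURISTICS:
        new = fn(key)
        if new != key:
            applied[name] += 1  # this heuristic was needed for this key
        key = new
    return key

# Two spellings of the same team, from two different sources:
assert normalize("Browser-Agent ") == normalize("browser agent")
```

Querying the `applied` tally after a load is the "check my work" step: it shows which sources needed which repairs today.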
Architects query the system and develop meaningful queries and diagrams. I can save those diagrams into our collection of canned queries and include them in the test suite to ensure we keep those queries working as new sources are folded into the system and change the structure.
The content-editable part is YAML with two fields: one for the cypher query, the other for dot to send to graphviz. The labels in the cypher query can be used as placeholders in the dot.
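A minimal sketch of that substitution, with the two fields shown as a Python dict rather than YAML (field names and placeholder syntax are assumptions): labels returned by the cypher query become placeholders in a per-row dot fragment.

```python
# Sketch: a "panel" pairs a cypher query with a dot template whose
# placeholders ({svc}, {topic}) match the query's RETURN labels.
panel = {
    "query": "MATCH (a:Service)-[:PRODUCES]->(t:Topic) "
             "RETURN a.name AS svc, t.name AS topic",
    "dot": '"{svc}" -> "{topic}"',   # one edge per returned row
}

def render(panel, rows):
    """Substitute each row's labels into the template and wrap in a digraph."""
    edges = "\n".join("  " + panel["dot"].format(**row) for row in rows)
    return "digraph {\n" + edges + "\n}"

# Pretend this row came back from the database for the query above:
rows = [{"svc": "billing", "topic": "invoices"}]
dot_source = render(panel, rows)  # ready to hand to graphviz
```

Because both fields are plain editable text, a viewer can change the query or the drawing rules and immediately see a different picture, which is the closed loop described above.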
This is probably our most advanced query because it is the hardest one to make a very simple diagram that anybody could look at and say "yes, that's what we do." 34m53s
In terms of modeling, we discovered from all the information in all of these sources that we really have processes running in production or staging; those come from the same source code, which we call programs; and above that we group them into clusters of services—Eric said we should have called those contexts. 35m15s
This is a conceptual thing that has more to do with how the engineers think about what they're building and that's very dynamic. People will say, "Look, you ought to own this service, not us. Why don't you get current on it and we'll forget about it." There's a lot of evolution. This is changing all the time. 36m23s
We don't describe how the services are connected. Instead, we pull in data about the producers and consumers, and the graph database infers the connectivity between services through the kafka topic.
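The inference can be sketched in plain Python (in cypher it would be a pattern along the lines of `(p)-[:PRODUCES]->(t)<-[:CONSUMES]-(c)`; all names below are made up):

```python
# Sketch: infer service-to-service links through shared kafka topics.
def infer_links(produces, consumes):
    """produces, consumes: lists of (service, topic) pairs.
    Returns (producer, consumer) pairs joined by a shared topic."""
    by_topic = {}
    for svc, topic in produces:
        by_topic.setdefault(topic, []).append(svc)
    links = []
    for svc, topic in consumes:
        for producer in by_topic.get(topic, []):
            links.append((producer, svc))
    return links

links = infer_links(
    produces=[("metrics", "harvest")],
    consumes=[("alerts", "harvest"), ("billing", "harvest")],
)
# links: [("metrics", "alerts"), ("metrics", "billing")]
```

No one ever declares "alerts depends on metrics"; the link falls out of two facts recorded independently by two teams.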
We have this goal of making human looking diagrams.
Demo of exploring maps and map data in Federated Wiki. "I want to mention how I came to be devoted to this idea. ... I want to show you just a little bit of federated wiki to show this same kind of ... usability model that is exploring and observing as opposed to getting a quick answer." 38m35s
Ward demonstrates using maps in federated wiki to coordinate with a friend visiting from out of town. They were looking for good places to stay in Portland and wanted to enjoy Bike Town. Ward was able to mark the hip neighborhoods on a map and then locate bike share stations to help his friend choose a place to stay. To broaden the point he showed the same activity of finding a coffee shop close to the conference venue for meeting some federated wiki community members in Denver. See Federated Wiki Meetup Lunch.
Two principles to take away. We're not making data warehouses, we're making structure warehouses—I think the DDD community is very tuned in to structure. The other thing is building interfaces that you discover and explore in. I used to think of that as scientific. Now I think of it as a belief system. You want to hold onto those beliefs, but as you look a little deeper and find evidence that contradicts your beliefs, you want to be able to modify your beliefs and have that reflected in what you've built. 47m32s
How hard was it to build? Hard, because ETL involves a lot of clerical work to clean dirty data. It generally takes about a day to bring in a new source of metadata. It's better if I have somebody who knows the API or schema, has the necessary credentials, and cares about their data. It is difficult to understand someone else's representations; I have to enter their world and learn their [domain] language. A lot of the work is adding checking and validation. Yes, it's JSON, and it's valid JSON, but is it formatted with the keys we expect? We prefer to push those checks closer to the extraction rather than discover problems in the transform. Do more logging and better diagnostics. When somebody changes something that breaks your ETL and your query starts returning empty rows, you want to know how you're going to spend your morning and really just want something to tell you what to fix. 51m54s
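A sketch of pushing those checks toward the extraction step: validate each record's keys as soon as the JSON arrives, and log exactly what is missing instead of letting empty query results surface later. The expected keys and source name are hypothetical.

```python
# Sketch: validate keys at extract time, with diagnostics for the rest.
import json
import logging

EXPECTED_KEYS = {"name", "team", "language"}  # hypothetical contract

def extract(raw, source):
    """Parse a source's JSON, keep well-formed records, log the rest."""
    records, bad = [], 0
    for obj in json.loads(raw):
        missing = EXPECTED_KEYS - obj.keys()
        if missing:
            bad += 1
            logging.warning("%s: record missing keys %s: %r",
                            source, sorted(missing), obj)
            continue
        records.append(obj)
    return records, bad

raw = ('[{"name": "el-dorado", "team": "tools", "language": "ruby"},'
       ' {"name": "mystery"}]')
records, bad = extract(raw, "service-catalog")
# records keeps the one complete record; bad == 1
```

When the downstream query goes empty, the warning log already names the source, the missing keys, and the offending record—the "something to tell you what to fix."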
For a method of developing a data extraction, see Exploratory Parsing in Frames.