I will present at New Relic Reliability Unconference 2018. Here I refine the story I will tell.
# Story Arc
_10min_ Unevenly Distributed Future
Complexity in One Computer
Complexity Distributed
_10min_ Line of Representation
_?min_ Incident Lifecycle
_?min_ Where to Dig
Alvaro suggests analyzing distributed traces to identify targets for chaos engineering. Those targets become adversarial gamedays. With El Dorado, we have a graph of our System which includes the larger picture of people and teams that SNAFUcatchers recommend. Speaking of SNAFUcatchers, they recommend using incidents themselves to identify targets that have already failed.
We start by using our incidents as a roadmap. As we build towards more proactive steps, we can focus chaos engineering efforts with a graph of The System built by El Dorado and distributed tracing.
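To make the roadmap idea concrete, here is a minimal sketch of how incidents and traces might combine to rank chaos engineering targets. Everything in it is an assumption for illustration: the span tuples, the incident records, and the `rank_targets` helper are hypothetical, and none of it reflects El Dorado's actual data model or any tracing vendor's API. Services are ranked first by how often they appear in past incidents, then by how many upstream callers the traces reveal.

```python
from collections import Counter, defaultdict

# Hypothetical trace spans, reduced to (trace_id, caller, callee) edges.
SPANS = [
    ("t1", "web", "checkout"),
    ("t1", "checkout", "payments"),
    ("t2", "web", "search"),
    ("t3", "web", "checkout"),
    ("t3", "checkout", "inventory"),
]

# Hypothetical incident records naming the services that failed.
INCIDENTS = [
    {"id": "inc-1", "services": ["payments"]},
    {"id": "inc-2", "services": ["checkout", "payments"]},
]


def build_callers(spans):
    """Map each service to the set of services that call it directly."""
    callers = defaultdict(set)
    for _trace_id, caller, callee in spans:
        callers[callee].add(caller)
    return callers


def upstream(callers, service):
    """Return every service that calls `service`, directly or transitively."""
    seen = set()
    stack = [service]
    while stack:
        node = stack.pop()
        for caller in callers.get(node, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


def rank_targets(spans, incidents):
    """Rank services: most past incidents first, then most upstream callers."""
    callers = build_callers(spans)
    incident_count = Counter(s for inc in incidents for s in inc["services"])
    services = {s for _, caller, callee in spans for s in (caller, callee)}
    rows = [(s, incident_count[s], len(upstream(callers, s))) for s in services]
    # Sort by incident count and upstream reach, descending; name breaks ties.
    return sorted(rows, key=lambda row: (-row[1], -row[2], row[0]))


if __name__ == "__main__":
    for service, past_incidents, reach in rank_targets(SPANS, INCIDENTS):
        print(f"{service}: incidents={past_incidents} upstream={reach}")
```

Putting incidents ahead of graph reach mirrors the order above: start with what has already failed, then let the graph suggest where to dig next.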
See a brief demo of El Dorado by Ward Cunningham at the 2017 Explore DDD Conference (YouTube).