We recognize important work during a week by enumerating the links present in pages last edited that week. We wish to isolate clusters of linked pages as distinct aspects of that work so that each thread can be traced from week to week. github
Consider this graph for the week beginning 1/27/2024.
digraph {
  overlap = false; splines=true
  layout = dot;
  node [shape=box style=filled fillcolor=gold penwidth=2]
  node [fillcolor=palegreen penwidth=1]
  0 [label="Page\nMixing Background Colors" tooltip="name: Mixing Background Colors"]
  1 [label="Page\nTransmission Color Model" tooltip="name: Transmission Color Model"]
  2 [label="Page\nWatch the Weave" tooltip="name: Watch the Weave"]
  3 [label="Page\nSearch Index Logs" tooltip="name: Search Index Logs"]
  4 [label="Page\nSolo Super Collaborator" tooltip="name: Solo Super Collaborator"]
  5 [label="Page\nSites to be Indexed" tooltip="name: Sites to be Indexed"]
  6 [label="Page\nIndexing the Unknown" tooltip="name: Indexing the Unknown"]
  0->1 [label="" labeltooltip="source: 1/27/2024"]
  0->2 [label="" labeltooltip="source: 1/27/2024"]
  3->4 [label="" labeltooltip="source: 1/27/2024"]
  3->5 [label="" labeltooltip="source: 1/27/2024"]
  1->0 [label="" labeltooltip="source: 1/27/2024"]
}
We would like to separate this into two graphs, each with three nodes, and to discard the one node (Indexing the Unknown) that has no links within the neighborhood under consideration.
Our approach finds nodes that have at least one connection and copies them into one or more new graphs. Each time we find such a node that has not yet been copied, we start a new graph and copy every node reachable from it at once.
As copy recurses through both inbound and outbound links, each new node is copied only once, in no particular order. As each node is copied, its outbound links are copied too, after recursing if necessary to instantiate each link's target.
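The copy described above might be sketched in Python as follows. This is an illustration only, not the code behind this page: the function name `split_into_clusters`, the inner `copy` helper, and the representation of a graph as a dict of `nodes` and `links` are all assumptions made for the example. Links are index pairs, as in the graph for 1/27/2024.

```python
from collections import defaultdict


def split_into_clusters(nodes, links):
    """Copy each connected group of nodes into its own new graph.

    nodes: list of page names (hypothetical representation).
    links: list of (source, target) pairs of indexes into nodes.
    Nodes that take part in no link are left behind.
    """
    # Index links by both endpoints so copy can follow them either way.
    outbound = defaultdict(list)
    inbound = defaultdict(list)
    for source, target in links:
        outbound[source].append(target)
        inbound[target].append(source)

    copied = {}   # old index -> new index within the graph that holds it
    graphs = []

    def copy(old, graph):
        # A node is copied only once; later visits reuse its new index.
        if old in copied:
            return copied[old]
        new = len(graph["nodes"])
        graph["nodes"].append(nodes[old])
        copied[old] = new
        # Each outbound link is added after its target has been instantiated.
        for target in outbound[old]:
            graph["links"].append((new, copy(target, graph)))
        # Inbound neighbors are copied too; they add their own outbound links.
        for source in inbound[old]:
            copy(source, graph)
        return new

    for old in range(len(nodes)):
        if old in copied:
            continue
        if not outbound[old] and not inbound[old]:
            continue  # no connection at all, so discard
        graph = {"nodes": [], "links": []}
        graphs.append(graph)
        copy(old, graph)  # copies everything reachable from this node
    return graphs


# The week beginning 1/27/2024, using the node numbers from the graph above.
nodes = [
    "Mixing Background Colors", "Transmission Color Model", "Watch the Weave",
    "Search Index Logs", "Solo Super Collaborator", "Sites to be Indexed",
    "Indexing the Unknown",
]
links = [(0, 1), (0, 2), (3, 4), (3, 5), (1, 0)]
clusters = split_into_clusters(nodes, links)
for cluster in clusters:
    print(cluster["nodes"], cluster["links"])
```

Because every link is added by its source node at the moment that node is first copied, no link is duplicated even when two pages link to each other. The recursion depth grows with the size of a cluster, which should be fine for a week of wiki pages but would want an explicit stack for much larger graphs.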