Search Index Dataflow

We once tried to document search indexing with the Graph plugin. It wasn't up to the task. We'll try again with our emerging graph support.

The scrape runs every six hours on a schedule that shifts with daylight saving time. The scrape is built from scripts that manipulate files in directories. Some files are rolled up from similarly named files in subdirectories. github
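A rollup of similarly named files can be sketched as follows. This is a minimal illustration, not the code in rollup.rb: the function name `rollup` and the merge rule (a sorted union of non-empty lines) are assumptions made for the example, and the real script may combine files differently.

```python
from pathlib import Path


def rollup(root, name):
    """Merge every <root>/<subdir>/<name> into <root>/<name>.

    Assumption: "rolled up" means the sorted union of non-empty lines.
    The actual rollup.rb may use a different rule.
    """
    root = Path(root)
    lines = set()
    # Look only one level down, at files that share the given name.
    for child in sorted(root.glob(f"*/{name}")):
        for line in child.read_text().splitlines():
            if line.strip():
                lines.add(line)
    merged = sorted(lines)
    # Write the combined file next to the subdirectories it was built from.
    (root / name).write_text("".join(line + "\n" for line in merged))
    return merged
```

Under these assumptions, `rollup("pages", "words.txt")` would read each `pages/*/words.txt` and write a combined `pages/words.txt`; the directory name here is only a placeholder.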

* [ ] ignore.rb
* [x] rollup.rb
* [ ] site-web.rb
* [ ] slug-web.rb
* [ ] neo-batch.rb
* [x] found.rb
* [ ] roster.rb
* [x] activity.rb
* [ ] counts.rb
* [x] scrape.rb
* [ ] server.rb
* [ ] roster.sh
* [ ] neo-build.sh
* [ ] sites-present.sh
* [x] cron.sh
* [ ] sites-absent.sh
* [ ] online.pl

We mimic the data entry from SigMod Example Unbound.

Shell:cron
  run Ruby:scrape
  run Ruby:rollup
  run Ruby:found
  run Ruby:activity
  write Logs:Now-0000
  write Activity:Now-0000
  write Public:sites.tgz
Ruby:scrape
  write Pages:words.txt
  write Pages:sites.txt
Ruby:activity
  write Pages:sites.txt
Ruby:rollup
  write Sites:words.txt
  write Sites:sites.txt
  write Search:sites.txt
Pages:sites.txt
  read Ruby:rollup
Pages:words.txt
  read Ruby:rollup
Ruby:found
  append Activity:Now-0000
Search:sites.txt
  read Ruby:found
Ruby:activity
  run Ruby:roster.rb
  write Pages:recent-activity
  write Activity:sitemap.json
Activity:Now-0000
  read Ruby:activity
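The node and relation listing above can be read as a series of (source, relation, target) triples. The sketch below shows one way to recover them. It assumes that every node is written as `Type:name` (so it contains a colon) and that every relation is a bare word; the function name `parse_triples` is hypothetical and not part of any plugin.

```python
def parse_triples(text):
    """Split whitespace-separated graph data into (source, relation, target).

    Assumption: nodes contain ':' (Type:name) and relations are bare words.
    A node that does not follow a relation starts a new source.
    """
    triples = []
    source = None
    relation = None
    for token in text.split():
        if ":" in token:
            if relation is None:
                # No pending relation, so this node opens a new group.
                source = token
            else:
                triples.append((source, relation, token))
                relation = None
        else:
            relation = token
    return triples
```

With the triples in hand, a question such as "which scripts does Shell:cron run?" becomes a simple filter on the first two fields.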

//wiki.ralfbarkow.ch/assets/pages/mock-graph-data/freeform.html HEIGHT 300