The Journal stores a new copy of a story item each time it is edited. This presumes that items are small and that edits touch all parts of them. Neither holds when a caption edit duplicates a large image. Still, in the few cases where this is an issue, let's consider the Journal an asset library for story items.
Nick Niemeir has suggested that the JSON Schema should include a top-level collection of assets for big things where duplication is wasteful.
assets = { key: asset, key: asset, ... }
But wait, the Journal already has that asset stored in a top-level structure, though admittedly not as simply indexed.
Principle
A plugin is allowed to apply the omission principle for any field where duplication is costly.
The omission principle says that should a field of an item be omitted, the plugin is allowed to search the Journal for a revision for which the field is known.
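A minimal sketch of such a search follows. It assumes a page shaped as { story: [item, ...], journal: [action, ...] } in which each revision-carrying action looks like { type, id, item }. The helper name resolveField and the sample data are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: resolve a field under the omission principle.
// Assumes each Journal action that carries a revision has the shape
// { type, id, item }, and that the Journal is ordered oldest first.
function resolveField(page, id, field) {
  const current = page.story.find((item) => item.id === id);
  if (current && field in current) return current[field];
  // Field omitted from the story item: walk the Journal from newest
  // to oldest looking for a revision in which the field is known.
  for (let i = page.journal.length - 1; i >= 0; i--) {
    const action = page.journal[i];
    if (action.id === id && action.item && field in action.item) {
      return action.item[field];
    }
  }
  return undefined; // the field was never recorded for this item
}

// Illustrative data: the caption edit omits the costly url field.
const page = {
  story: [{ type: 'image', id: 'a1', caption: 'Sunset, revised' }],
  journal: [
    { type: 'add', id: 'a1', item: { type: 'image', id: 'a1', url: 'data:image/png;base64,AAAA', caption: 'Sunset' } },
    { type: 'edit', id: 'a1', item: { type: 'image', id: 'a1', caption: 'Sunset, revised' } },
  ],
};

const url = resolveField(page, 'a1', 'url');
const caption = resolveField(page, 'a1', 'caption');
```

Here the url is recovered from the original add action, while the caption comes straight from the story.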
Application
For images this could be implemented by special-case logic in every place the url field is handled:
The Factory creates an image upon drop.
The TextEditor revises the caption of an image.
The Image is rendered into the DOM.
An Image is moved within a page.
An Image is moved between pages.
A Journal merge scrambles the order of the Journal.
The server handles edit Actions for Images.
A better approach might be to allow Image plugins to participate as encapsulated objects in each one of these interactions. This would be an expansion of the plugin API.
See Image Assets, which, if implemented, would make this less important but still useful.
Could there be some incremental process that safely revised existing pages to exploit the omission principle? Could the principle be applied automatically to compress pages in transit, or to collapse redundancy in memory?
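One way such automatic compression might look is sketched below. It assumes the same { type, id, item } action shape as the Journal, ordered oldest first. The function name omitRepeats and its fields parameter are hypothetical. It only drops a field whose value repeats the last known value for that item, and it does not address how a deliberately removed field should be marked.

```javascript
// Hypothetical sketch: drop costly fields from later revisions when
// they merely repeat the last value already recorded for that item.
// Returns a new journal; the input is not mutated.
function omitRepeats(journal, fields) {
  const known = new Map(); // item id -> { field: last value seen }
  return journal.map((action) => {
    if (!action.item) return action;
    const seen = known.get(action.id) || {};
    const item = { ...action.item };
    for (const field of fields) {
      if (!(field in item)) continue;
      if (seen[field] === item[field]) {
        delete item[field]; // recoverable from an earlier revision
      } else {
        seen[field] = item[field]; // first or changed value: keep it
      }
    }
    known.set(action.id, seen);
    return { ...action, item };
  });
}

// Illustrative data: the edit repeats the url unchanged.
const journal = [
  { type: 'add', id: 'a1', item: { id: 'a1', url: 'data:image/png;base64,AAAA', caption: 'Sunset' } },
  { type: 'edit', id: 'a1', item: { id: 'a1', url: 'data:image/png;base64,AAAA', caption: 'Sunset, revised' } },
];

const compressed = omitRepeats(journal, ['url']);
```

After this pass the edit action carries only the id and the new caption, and the url survives once, in the add action.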
Inheritance
Every field lookup must start with the item in the story, then continue through the Journal from the most recent entry backwards until an occurrence of that field is found. To remove a field from a paragraph, an explicit null must be stored in that field.
This could be modeled in JavaScript using prototypal inheritance, with the chain: paragraph in the story -> most recent instance of the paragraph in the Journal -> ... -> first ever instance of the paragraph in the Journal. A property lookup starting at any point in this chain would then give the proper state of the object as it existed at that point in history.
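The chain can be sketched with Object.create. This is an illustration under assumptions, not an existing implementation: the function name chainRevisions and the sample revisions are hypothetical, and each revision is assumed to hold only the fields it sets, listed oldest first.

```javascript
// Hypothetical sketch: link revisions into a prototype chain so that
// an omitted field is looked up in the next older revision.
// `revisions` is ordered oldest first.
function chainRevisions(revisions) {
  const chain = [];
  let proto = null; // the first ever revision has nothing to inherit from
  for (const fields of revisions) {
    const revision = Object.assign(Object.create(proto), fields);
    chain.push(revision);
    proto = revision; // the next, newer revision inherits from this one
  }
  return chain;
}

// Illustrative data: later revisions omit the url; the last one
// removes the caption with an explicit null.
const history = chainRevisions([
  { type: 'image', url: 'data:image/png;base64,AAAA', caption: 'Sunset' },
  { caption: 'Sunset, revised' },
  { caption: null },
]);

const first = history[0];
const middle = history[1];
const latest = history[2]; // plays the role of the paragraph in the story
```

Reading middle.url falls through to the first revision, while latest.caption is null because the explicit null shadows the older captions. An omitted field is inherited; only a stored null removes it.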