Remote Database Schema Migration

Ward was sure that they would not get the schema for WyCash Plus right on the first try. He was familiar with the object migration mechanisms of Smalltalk-80 and designed a version of them that could serve in a commercial software distribution environment.

He chose to version each class independently and to record that version as a sequential integer in the serialized data. Objects would be mutated to the current version on read. They supported every version they had ever shipped, forever.
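The scheme can be sketched roughly as follows. This is a minimal illustration, not WyCash's implementation; the class name, version numbers and field names are invented for the example:

```python
# Hypothetical sketch: each class carries its own schema version, written
# into every serialized record; old records are migrated forward on read.

CURRENT_VERSION = {"Bond": 3}  # per-class version counters (assumed names)

def serialize(class_name, fields):
    """Write the class's current version alongside the field data."""
    return {"class": class_name,
            "version": CURRENT_VERSION[class_name],
            "fields": fields}

def deserialize(record, migrations):
    """Apply each stored migration in sequence until the record is current."""
    version, fields = record["version"], dict(record["fields"])
    while version < CURRENT_VERSION[record["class"]]:
        fields = migrations[record["class"]][version](fields)
        version += 1
    return fields

# A v1 Bond gains a 'rating' field in v2, then renames 'rate' to 'coupon' in v3.
migrations = {"Bond": {
    1: lambda f: {**f, "rating": None},
    2: lambda f: {**{k: v for k, v in f.items() if k != "rate"},
                  "coupon": f["rate"]},
}}

old = {"class": "Bond", "version": 1, "fields": {"rate": 0.05}}
print(deserialize(old, migrations))   # a v1 record read back in v3 shape
```

Because every step from every past version is kept, a record written years earlier still walks the same chain of migrations as one written yesterday.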

He recorded mutation vectors carrying each version forward to the present. These could add, remove and reorder fields within an object. One-off mutation methods handled the rare case where a vector could not describe the change.
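One way to picture a mutation vector is as a list that maps each slot of the new layout to its source in the old one. This sketch, including its field names and defaults, is an assumption about the shape of the idea, not the actual format:

```python
# Hypothetical sketch: a mutation vector describes one version step as data.
# Each entry names a slot in the new layout by its source: an old field
# index, or None for a newly added field (removed fields have no entry).

def apply_vector(old_fields, vector, defaults=()):
    """Build the new field list by copying, dropping and reordering slots."""
    new_fields = []
    fresh = iter(defaults)           # default values for added fields
    for source in vector:
        if source is None:
            new_fields.append(next(fresh, None))   # field added this version
        else:
            new_fields.append(old_fields[source])  # copied (maybe reordered)
    return new_fields

# Old layout: [price, quantity, notes]. The new layout drops 'notes',
# swaps the first two fields, and appends a new 'currency' slot.
vector = [1, 0, None]
print(apply_vector([9.5, 100, "call desk"], vector, defaults=["USD"]))
# -> [100, 9.5, 'USD']
```

Because each vector is plain data, a whole chain of them can be stored per class and applied one after another, exactly as the versioning scheme requires.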

With each new release, migrations were shipped to customers, who could run them on their own computers as needed without further intervention.

Occasionally, they would send certain customers field patches that contained migrations specific to their needs. These then inserted themselves into the ongoing development without further attention. Different users could thus migrate in different orders; as long as each abstraction was migrated in sequence, there was no problem.

Management

Ward wrote a program that would manage a database of mutation vectors independent of their source code. This turned out to be hard for him to operate correctly. He was convinced fatal mistakes would be made, so he discarded it in favor of hand-crafted vectors stored as an array per object in the running program.

Alan Darlington proved to be the most adept in the group at managing these resources. The others changed what they wanted in development and then designed the migrations to be delivered, consulting Alan as needed.

Alan Darlington eventually discovered a way to keep the primary data in source code comments and use a "do it" to generate correct mutation-vector arrays when needed. This was genius. It kept version history in source code management, where it belonged.

Persistence

They were replacing spreadsheets and chose the same open/save approach to persistence.

They supplemented this with a transaction log that could be read on startup to recover unsaved data.
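Such a recovery log can be sketched as an append-only file replayed over the last-saved state. The file name, entry format and function names below are assumptions made for the illustration:

```python
# Hypothetical sketch: every change is appended to a log file as it
# happens; on startup, entries newer than the last save are replayed.
import json
import os

LOG = "recovery.log"  # assumed filename

def record(entry, log=LOG):
    """Append one (key, value) change to the log as it happens."""
    with open(log, "a") as f:
        f.write(json.dumps(entry) + "\n")   # one JSON entry per line

def recover(state, log=LOG):
    """Replay the log over the last-saved state to rebuild unsaved work."""
    if not os.path.exists(log):
        return state
    with open(log) as f:
        for line in f:
            key, value = json.loads(line)
            state[key] = value              # re-apply the logged change
    return state

def save(state, log=LOG):
    """A real save would write `state` to disk; here it just clears the log."""
    if os.path.exists(log):
        os.remove(log)

record(("price", 9.5))            # logged as the work happens
print(recover({"price": 9.0}))    # replayed on the next startup
save({})                          # an explicit save makes the log unnecessary
```

Note that the log only ever grows between saves, which is why a customer who never saved would see startup slow down as the whole history was replayed.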

One customer didn't know that saving was expected and only ever turned off their PC when their work was done. When startup started getting slow, they suggested a save. All was better.

They developed a small-scale sharing mechanism that worked by reading each other's recovery logs. This delayed the "high volume" implementation based on a shared database.

They rewrote their serialization to use binary data rather than parsing text strings. They stored these records in a database with four tables: instruments, transactions, portfolios and other. Their mutation mechanisms survived this conversion.

They found that interning symbols was slowing reading. Rewriting it as a custom primitive brought no significant improvement. Success came from inserting their own symbol table in front of the system symbol table. With only their own symbols in it, they achieved near-perfect hashing.
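The two-level lookup might look like this sketch, with a dictionary standing in for Smalltalk's system symbol table; the class and function names are invented for the example:

```python
# Hypothetical sketch: a small application-owned intern table consulted
# before the (simulated) system-wide one, so lookups for the app's own
# symbols never contend with the thousands of system symbols.

system_symbols = {}   # stand-in for the large shared system table

def system_intern(name):
    """Simulate the slow, crowded system-wide intern operation."""
    return system_symbols.setdefault(name, name)

class LocalSymbolTable:
    """Front table holding only the application's own symbols."""

    def __init__(self):
        self.table = {}

    def intern(self, name):
        sym = self.table.get(name)
        if sym is None:                      # miss: fall back, then cache
            sym = system_intern(name)
            self.table[name] = sym
        return sym

local = LocalSymbolTable()
a = local.intern("portfolio")
b = local.intern("portfolio")
print(a is b)   # the same interned symbol object both times
```

The point of the front table is that it stays small and dense, so almost every lookup hits on the first probe; the system table is consulted only once per distinct symbol.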

Agility

They let powerful database structures emerge. For example, some instruments contained the transaction that provided collateral to the investment. This bundle would be saved in the instrument table without touching the transaction table until the collateral was redeemed.

Ward described to others how they let their implementation evolve. Others would claim that such flexibility would be impossible and cited database migration as an overwhelming cost. When he described their solution he was told that he cheated by changing the rules. Imagine that.