Sync: Operational Transformation vs. Conflict-Free Replicated Data Types (CRDTs)

I need a solution for data sync/replication of offline data that doesn’t require my team to read whitepapers and understand theoretical mathematics.

There is an argument going on right now as to whether Operational Transformation (OT) or Conflict-Free Replicated Data Types (CRDTs) are the way to go here. Both technologies are intended to solve the thorny problem of handling conflicts (or removing the potential for them) when multiple parties are working on the same data without direct awareness of each other’s efforts (perhaps because they are working at different times or in different places).

Maturity

I really like the idea of CRDTs, but there isn’t really a practical (or at least popular) implementation of full-document CRDTs (think JSON) that I know of right now. There also seems to be the (old) problem where the CRDTs are replicating correctly, but we are asking them to do the wrong thing… To be a little more clear, we are having difficulty expressing the users’ intent in a data structure that prevents conflicts. We can get eventual consistency between the two bodies of data, but is the result what the two (or more) parties would have created if they had worked together side by side?
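To make the "replicating correctly, but doing the wrong thing" point concrete, here is a minimal sketch of a last-writer-wins register, one of the simplest CRDT-style merges. The names (mergeLww, timestamp, replicaId) are made up for illustration and do not come from any particular library.

```javascript
// Hypothetical last-writer-wins (LWW) register.
// Each replica holds { value, timestamp, replicaId }.
function mergeLww(a, b) {
  // The newer write wins outright.
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  // Deterministic tie-break so every replica picks the same winner.
  return a.replicaId > b.replicaId ? a : b;
}

// Two people edit the same note while offline.
const alice = { value: 'Meeting moved to 3pm', timestamp: 10, replicaId: 'alice' };
const bob = { value: 'Room changed to B2', timestamp: 11, replicaId: 'bob' };

// Merge order does not matter: both replicas end up identical...
const onAlice = mergeLww(alice, bob);
const onBob = mergeLww(bob, alice);
console.log(onAlice.value === onBob.value);

// ...but Alice's edit is silently gone.
console.log(onAlice.value);
```

The two replicas converge, which is all the data type promises. Whether the converged value is what Alice and Bob would have written sitting next to each other is a separate question.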

This problem has been explored a little more in the world of Operational Transformation, so the solutions (that I am aware of) are a little more mature.

Sync via peer to peer?

The primary downside of OT (that I can see) is that it really needs a single source of truth (think server), whereas CRDTs allow full mesh or peer-to-peer (P2P) sync.
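For a feel of what the server is doing, here is a minimal sketch of OT for plain-text inserts only. It is not sharedb's API; the operation shape ({ pos, text, site }) and the function names are assumptions made up for this example.

```javascript
// Apply an insert operation to a string document.
function applyInsert(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied AFTER `applied` has already been applied.
// If `applied` landed before `op` (or at the same spot with a lower site id),
// `op` has to shift right by the length of the inserted text.
function transformInsert(op, applied) {
  const shift =
    applied.pos < op.pos ||
    (applied.pos === op.pos && applied.site < op.site);
  return shift ? { ...op, pos: op.pos + applied.text.length } : op;
}

const base = 'sync';
const a = { pos: 0, text: 'data ', site: 1 }; // client A types at the start
const b = { pos: 4, text: ' now', site: 2 };  // client B types at the end

// The server picks an order: a first, then b transformed against a.
const serverDoc = applyInsert(applyInsert(base, a), transformInsert(b, a));

// Client B already applied b locally, then receives a transformed against b.
const clientB = applyInsert(applyInsert(base, b), transformInsert(a, b));

console.log(serverDoc === clientB);
```

Someone has to decide the order in which concurrent operations get applied, and in this sketch that someone is the server. That is the single source of truth the paragraph above is talking about.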

Because P2P communication is almost as difficult right now as sync itself, it may just be practical to work with OT.

Some libraries to look at

I have been playing a bit with sharedb. This seems to be the best OT work going on in JavaScript right now. That said, there isn’t a huge community around the library, and the owners (though amazing and brilliant people with other real jobs to hold down) do not appear to be super responsive to pull requests and issues.

If you want to go the P2P route, Scuttlebutt is a replication protocol that seems to be getting a bit of traction. I believe it is inherently duplex though… so YMMV. Here is a JavaScript implementation that might interest you.

DerbyJs or Racer on Windows

You could argue this just isn’t meant to be … and you might be right.  Unfortunately for me, flailing around in my Ubuntu VM is just a slow way for me to develop.  I know that makes me a terrible person, but sometimes the tools you use every day belong to the dark side.

In any case, I set out to get DerbyJs and Racer working on my Windows machine.

DerbyJs and Racer are created by pretty much the same group of people. They are JavaScript frameworks that run on Node.js and use Operational Transformation to synchronize data in real time across clients.

The tweaks

The first smack in the face is Redis. You’ll need to install the Windows version of that here, but that’s not remotely the hard part. The difficulty comes when you try to run npm install for the example repositories of DerbyJs or Racer. Then all hell breaks loose, and you run into the “we don’t support Windows” contingent with npm install redis.

Turns out the lovely people at hiredis do support Windows though. So here’s the trick(s). Install hiredis globally FIRST (npm install -g hiredis). Then go into your global npm modules folder and copy everything from %appdata%\npm\node_modules\hiredis\build\Release up one directory to %appdata%\npm\node_modules\hiredis\build (because REASONS).
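If you would rather not do the copy by hand, something like the following Node sketch should do it. The helper name copyReleaseUp is made up for this example, and it only copies plain files (it skips any subfolders inside Release), so treat it as a starting point rather than a tested installer step.

```javascript
const fs = require('fs');
const path = require('path');

// Copy every file from <buildDir>/Release up into <buildDir>.
// Returns the names of the files that were copied.
function copyReleaseUp(buildDir) {
  const releaseDir = path.join(buildDir, 'Release');
  const copied = [];
  for (const name of fs.readdirSync(releaseDir)) {
    const from = path.join(releaseDir, name);
    if (!fs.statSync(from).isFile()) continue; // this sketch ignores subfolders
    fs.copyFileSync(from, path.join(buildDir, name));
    copied.push(name);
  }
  return copied;
}

// Usage on Windows, assuming the global install location described above:
// const buildDir = path.join(process.env.APPDATA, 'npm', 'node_modules', 'hiredis', 'build');
// console.log(copyReleaseUp(buildDir));
```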

Now npm install type things will magically start working, provided you’ve put %appdata%\npm\node_modules into your system environment variables as NODE_PATH. NPM doesn’t do that on installation because… it’s fun to google?
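A quick way to check whether NODE_PATH actually contains the folder is a few lines of Node. The helper name nodePathIncludes is made up for this example, and the comparison is a plain string match after path.resolve, so it will not catch differences in letter case on Windows.

```javascript
const path = require('path');

// Returns true if `dir` is one of the entries in a NODE_PATH-style string.
// NODE_PATH entries are separated by path.delimiter (';' on Windows, ':' elsewhere).
function nodePathIncludes(dir, nodePath = process.env.NODE_PATH || '') {
  return nodePath
    .split(path.delimiter)
    .filter(Boolean)                      // drop empty entries
    .map((entry) => path.resolve(entry))  // normalize before comparing
    .includes(path.resolve(dir));
}

// Usage on Windows, assuming the folder named above:
// const globalModules = path.join(process.env.APPDATA, 'npm', 'node_modules');
// console.log(nodePathIncludes(globalModules));
```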

[Screenshot: System Node Environment Variable. How to set your npm cache path.]

Now after all this effort (at least at the time of this writing) nothing will work. You’ve got a couple more tweaks to make.

For Racer

You need to go in and remove all the “release kind of like this one” stuff from your package.json. (See the JSON below; the things to remove are the leading ~ and ^ range markers, which were shown in red.) This is because the current release of Racer will not work with the example code. No idea why, because the problem seems to be somewhere in the Operational Transformation code, which is fiddly stuff.

"dependencies": {
  "coffeeify": "~0.6.0",
  "express": "~3.4.8",
  "handlebars": "^1.3.0",
  "livedb-mongo": "~0.4.0",
  "racer": "^0.6.0-alpha32",
  "racer-browserchannel": "^0.3.0",
  "racer-bundle": "~0.1.1",
  "redis": "^2.4.2"
}
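If there are a lot of these to clean up, a small helper can strip the range markers for you. This is only a sketch: pinVersions is a made-up name, and it only handles a single leading ~ or ^, not the other range syntaxes npm accepts.

```javascript
// Return a copy of a dependencies object with any leading ~ or ^ removed,
// so every entry names one exact version.
function pinVersions(dependencies) {
  const pinned = {};
  for (const [name, range] of Object.entries(dependencies)) {
    pinned[name] = range.replace(/^[\^~]/, '');
  }
  return pinned;
}

const pinned = pinVersions({
  express: '~3.4.8',
  racer: '^0.6.0-alpha32',
  redis: '^2.4.2',
});
console.log(JSON.stringify(pinned, null, 2));
```

Paste the result back into package.json and run npm install again.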

BTW, below is, I think, my favorite bit of code ever. When running the “Pad” example of Racer, it is fired over and over by a dependency of browserify (called syntax-error):

module.exports = function (src, file) {
  if (typeof src !== 'string') src = String(src);
  try {
    eval('throw "STOP"; (function () { ' + src + '})()');
    return;
  }
  catch (err) {
    if (err === 'STOP') return undefined;
    if (err.constructor.name !== 'SyntaxError') throw err;
    return errorInfo(src, file);
  }
};

I’m sure there is a good reason for it, but I can’t fathom it myself.  I had to comment it out.
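For what it’s worth, one plausible reading of the snippet is that it is a parse-only syntax check: eval has to parse the whole string before running any of it, so a syntax error is thrown up front, and if parsing succeeds the first statement throws "STOP" before any of src can execute. The sketch below is my own reduction of that idea (parsesCleanly is a made-up name), not the library’s code.

```javascript
// Returns true if `src` parses as a function body, false on a syntax error.
// The source is never executed: the leading throw fires first.
function parsesCleanly(src) {
  try {
    eval('throw "STOP"; (function () { ' + src + '})()');
  } catch (err) {
    if (err === 'STOP') return true;              // parsed fine, bailed out before running
    if (err instanceof SyntaxError) return false; // failed at parse time
    throw err;
  }
}

console.log(parsesCleanly('globalThis.__sideEffect = true;'));
console.log(globalThis.__sideEffect); // still unset: the code was parsed, not run
console.log(parsesCleanly('var x = ;'));
```

That would also explain why it fires over and over: it would run once for every file going through the bundle.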

Now for DerbyJs

There is some sort of issue with how it creates the paths for your views. To get it to work, there is a patch you’ll need to add to your package.json.

After the patch is installed, you’ll need to go into the index.js of each of the examples and require it before the require('derby'):

require('derby-patch');
var app = module.exports = require('derby').createApp('hello', __filename);

I think that should do it.  I was able to get things running fairly well after that.

One last thing: if you happen to be using Visual Studio, I can recommend the Node.js Tools for Visual Studio with some confidence now. During the beta stage they were pretty bad about crashing my IDE, but (except when running unit tests) I actually prefer them to WebStorm now. I know… sacrilege.