Manifesto for Data
Also published on codeburst.io.
We believe data should be live and sharable by default:
- The 'truth' should be the data that is being used, not the data in distant
storage.
- Distribute the data automatically, with the guarantee that all of it will
converge on the same 'truth'.
- Use a published open standard for encoding data with its meaning, and
communicating changes to it.
Hi, I'm George. This year I left my day job as a software engineering leader,
and plunged into lockdown under a mountain of work, uncertainty and risk. Last
week, I pushed the button to launch the m-ld Developer Preview. In between
has been a mad journey of creativity, anxiety, frustration, imposter syndrome,
fight and flight, elation and time-dilation, and so! much! coffee!
But why?
As a data management app developer, I've used many ways to encode and store
data. Frequently, they are combined in the same architecture, with one of the
locations being blessed as the central 'truth':
[diagram: the centralised data pattern]
The specific technologies vary, but the overall pattern is very common.
Motivations include security, integrity, consistency, operational
efficiency and cost. However, some other, more peculiar properties
stand out:
- The 'truth' is on the far right-hand side, but the data is being used
throughout, with particular value being realised on the left.
- The software application is responsible for both distributing the data and
for operating on it.
- Every encoding syntax is specific to a technology, and does not expose the
data's meaning enough to be independently understood.
The main consequence of these properties is application code complexity. We have
to be incredibly careful to maintain an understanding, in the code, of how
current our copy of the data is (how close it is to the 'truth'), to operate on
the data accordingly, and to share that understanding with other components.
This is hard, and it frequently goes awry, resulting in software bugs which are
very hard to reproduce, let alone fix.
In this blog, I'll argue that recent advances in computer science let us improve
on this for many applications. Applying our manifesto, we want our architecture
to look more like this:
[diagram: the proposed architecture]
But how?
One thing to notice in the centralised data pattern is that we're taking each
encoding of the data and translating it into a new one, to make it suitable for
computation, or storage, or to add security, or for whatever reason. At each
translation, the complexity of keeping the new encoding up to date with the
previous ones ramps up.
What if we did away with the idea of re-encoding the current data, and instead
transacted in changes? Humans do this naturally. When having a conversation
about some information, we don't re-state it every time we want to adjust it. We
refine information by discussing the delta between the old and the new. And we
naturally switch between re-statement and deltas as required.
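To make the distinction concrete, here's the same edit expressed both ways, as a
sketch (the shapes are illustrative only, not any particular product's wire
format):

```typescript
// Re-statement: the whole current state. A consumer must replace
// everything it already holds, even though only one field changed.
const restated = { id: 'fred', name: 'Fred', spouse: 'wilma' };

// Delta: just the difference between old and new. A consumer applies
// it to whatever encoding of the state it already has.
const delta = {
  delete: { id: 'fred', name: 'Fred Flintstone' },
  insert: { id: 'fred', name: 'Fred' }
};
```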
This concept is nothing new in software either – Event-Driven Architectures have
been a common paradigm since at least the mid-2000s. But consumers of 'events'
have a new problem: applying each change to their own encoding of the current data.
This distributes logically duplicate program code to every consumer – and lines
of code are at least linearly proportional to bugs. Even worse, the event
ordering is critical, so the coordination of the totally ordered log of events
becomes the new centralised 'truth' (and a literally bigger one).
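To see why ordering matters, here's a sketch of two events that both set the
same field, arriving at two consumers in different orders:

```typescript
// Each consumer applies events to its own copy of the state.
type SetName = { set: { name: string } };

const apply = (state: { name?: string }, e: SetName) =>
  ({ ...state, ...e.set });

const e1: SetName = { set: { name: 'Fred' } };
const e2: SetName = { set: { name: 'Freddy' } };

const atA = apply(apply({}, e1), e2); // { name: 'Freddy' }
const atB = apply(apply({}, e2), e1); // { name: 'Fred' }
// atA and atB disagree: without an agreed total order, copies diverge.
```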
Let's deal with the code duplication issue first. Being good engineers, we take
care not to repeat ourselves, but this becomes hard to do when re-stating
something in different languages. So, what if we had a common language for data?
One that could express both state and changes to state? Since we're here,
let's have one in which we can encode the meaning of the data, per our
manifesto, including a natural way to identify data universally. And further,
can we have one for which native, widely-available, battle-hardened database
engines exist, so sometimes we don't have to translate anything at all?
Sounds like a big ask. Luckily, academia and industry have been working on it
for some time; the W3C's RDF data model, with JSON-LD as one of its syntaxes, is
a prime example, and mature, battle-hardened triplestore engines speak it
natively.
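As a taste, the snippet below is valid JSON-LD (shown as a TypeScript object;
the flintstones.example identifiers are invented for illustration). The @context
maps plain keys onto published, universal meanings, and @id gives the data a
universal identity:

```typescript
// Valid JSON-LD: data that carries its own meaning and identity.
const fred = {
  '@context': {
    name: 'http://schema.org/name',
    spouse: { '@id': 'http://schema.org/spouse', '@type': '@id' }
  },
  '@id': 'http://flintstones.example/fred',
  name: 'Fred Flintstone',
  spouse: 'http://flintstones.example/wilma'
};
```

But let's look at the other problem: change ordering.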
Imagine if you shared some information with a friend, and then every thought you
had about it couldn't start until your friend had finished whatever thought they
were having about it. This is the strictest way that centralised data management
systems maintain consistency.
To mitigate the impact of this on the fluency of data manipulation, various
strategies are available, like fine-grained locking, optimistic locking and a
choice of transaction isolation levels. These have their merits, but each of
them re-introduces some of the very distributed-application complexity we were
trying to remove, and they still require the central ordered log.
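As a sketch of one of these strategies, optimistic locking attaches a version to
each record and rejects any write based on a stale read; the retry loop it
implies lives in the application (all names here are illustrative):

```typescript
// Optimistic locking: a write succeeds only if the record's version is
// unchanged since the caller read it; otherwise the caller must
// re-read, re-apply its change, and try again.
interface Versioned<T> { version: number; value: T }

class Store<T> {
  private record: Versioned<T>;
  constructor(value: T) { this.record = { version: 0, value }; }

  read(): Versioned<T> { return { ...this.record }; }

  write(expectedVersion: number, value: T): boolean {
    if (this.record.version !== expectedVersion)
      return false; // conflict: someone else wrote first
    this.record = { version: expectedVersion + 1, value };
    return true;
  }
}

const store = new Store({ name: 'Fred' });
const { version } = store.read();
store.write(version, { name: 'Freddy' });  // true: version was current
store.write(version, { name: 'Fredrik' }); // false: stale version, retry
```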
What if we went the other way, and just removed the ordered log entirely?
There are two approaches to concurrency control that don't need a total ordering
of changes. One is called Conflict-free Replicated Data Types (CRDTs), and the
other Operational Transformation (OT). These do provide the required guarantee
that copies of the data will converge to the same 'truth'. But they don't remove
the possibility that concurrent changes will disagree with each other and lead
to a 'truth' that doesn't make sense.
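As a minimal sketch of the CRDT idea (not how any particular engine works),
here's a last-writer-wins register. Its merge is deterministic and
order-independent, so replicas always converge; but notice that one of the two
concurrent writes is simply discarded:

```typescript
// Last-writer-wins (LWW) register, one of the simplest CRDTs.
type LWW<T> = { timestamp: number; replicaId: string; value: T };

function merge<T>(a: LWW<T>, b: LWW<T>): LWW<T> {
  if (a.timestamp !== b.timestamp)
    return a.timestamp > b.timestamp ? a : b;
  // Tie-break on replica id, so every replica picks the same winner
  return a.replicaId > b.replicaId ? a : b;
}

// Two replicas write concurrently (same logical time):
const atA: LWW<string> = { timestamp: 1, replicaId: 'A', value: 'Fred' };
const atB: LWW<string> = { timestamp: 1, replicaId: 'B', value: 'Freddy' };

const x = merge(atA, atB).value; // 'Freddy'
const y = merge(atB, atA).value; // 'Freddy': converged either way,
                                 // but replica A's edit is silently lost
```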
But wait, you and your friend had no trouble refining your shared information,
with no deterministic coordination whatsoever. How?
Humans employ myriad strategies for coordination. You withhold thoughts while
someone else is talking. You undo and redo thoughts against new information,
both before and after expressing them. You notice conflicts that corrupt the
information or render it illogical, apply obvious resolutions, and negotiate
others. You actively seek consensus, or delegate decisions.
In the case of document editing, we can go further and notice that, given a
foundational level of concurrency control in the software – Google Docs uses OT
– editing by multiple humans works fine, and doesn't require much explicit
coordination at all. Research groups have
found that this
applies just as well to CRDTs.
There are many finer details to explore in practice. But we have established
that our manifesto can be met, in principle, with application of current
computer science.
The approach we've taken with m-ld is to provide a protocol, with implementing
engines, for distributing data in a distributed application:
- The 'truth' is the data exposed to the app by the engine.
- The data is automatically distributed by the engine with the guarantee that
all engines will converge on the same 'truth'.
- We use an open standard for encoding data with its meaning, and communicating
changes to it.
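In application code, that might look something like the sketch below. To be
clear, this is a hypothetical API for illustration, not m-ld's actual interface;
the Developer Preview documentation has the real thing.

```typescript
// Hypothetical engine API, for illustration only. The app reads and
// writes its local copy -- the 'truth' -- while the engine shares
// changes with all other clones and guarantees convergence.
interface Engine {
  read(query: object): Promise<object[]>;
  write(update: { delete?: object; insert?: object }): Promise<void>;
  follow(handler: (update: object) => void): void;
}

async function renameFred(engine: Engine) {
  // Operate directly on the local data: it is the 'truth'.
  await engine.write({
    delete: { '@id': 'fred', name: 'Fred Flintstone' },
    insert: { '@id': 'fred', name: 'Fred' }
  });
  // Changes made by other clones arrive here, already merged:
  engine.follow(update => console.log('changed elsewhere:', update));
}
```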
For now, we're proving out the tech, and filling out the corners that we think
are essential for collaboration and autonomy use-cases. But we think we're onto
something important to data architectures in general.
We'd love to hear what you think.
If you're ready to try m-ld out, you can work with the
Developer Preview right now. Let us know what you're building!