Now Reading
Introduction to Loro’s Wealthy Textual content CRDT – Loro

Introduction to Loro’s Wealthy Textual content CRDT – Loro

2024-01-23 06:40:16

Introduction to Loro’s Wealthy Textual content CRDT

2024-01-22 by Zixuan Chen

This text presents the wealthy textual content CRDT algorithm applied in Loro, complying with Peritext (opens in a new tab)‘s standards for seamless wealthy textual content collaboration. Moreover, it may be constructed on high of any Record CRDT algorithms and switch them into wealthy textual content CRDTs.

Above is an internet demo of Loro’s wealthy textual content CRDT, constructed with Quill. After the replay, you may simulate real-time collaboration and concurrent modifying whereas offline. You may as well drag on the historical past view to replay the modifying historical past.

If CRDTs are new to you, our article What are CRDTs gives a short introduction.

Background

Loro is based on the Replayable Event Graph (REG) algorithm proposed by Seph Light, however this algorithm can not combine the unique model of Peritext. This motivates us to create a brand new wealthy textual content algorithm. It’s unbiased of the particular Record CRDTs, thus working properly with REG, and is developed on high of them to ascertain a wealthy textual content CRDT.

Earlier than diving into the algorithm of Loro’s wealthy textual content CRDT, I might wish to briefly introduce REG and Peritext, and why Peritext can’t be used on REG.

Recap on Record CRDTs

Brief Introduction to Replayable Event Graph

Seph Gentle (opens in a new tab), the writer of Replayable Occasion Graph, is presently engaged on the paper, and it is extremely anticipated.

REG is a novel CRDT algorithm that mixes the strengths of each OT and CRDTs. It has the distributed nature of CRDT that allows P2P collaboration and information possession. Furthermore, it achieves minimal overhead in eventualities devoid of concurrent edits, just like OT.


Whether or not in real-time collaboration or multi-end synchronization, a directed acyclic graph (DAG) kinds over the historical past of those parallel edits, just like Git’s historical past. The REG algorithm data the historical past of person edits on the DAG.

In contrast to typical CRDTs, REG can file simply the unique description of operations, not the metadata of CRDTs. As an illustration, in textual content modifying eventualities, the RGA algorithm (opens in a new tab) wants the op ID and Lamport timestamp (opens in a new tab) of the character to the left to find out the insertion level. Yjs (opens in a new tab)/Fugue, nevertheless, requires the op ID of each the left and proper characters at insertion. In distinction, REG simplifies this by solely recording the index on the time of insertion. Loro, which makes use of Fugue (opens in a new tab) upon REG, inherits these benefits.

An index shouldn’t be a secure place descriptor, because the index of an operation will be affected by different operations. For instance, if you happen to spotlight content material from index=x to index=y, and concurrently somebody inserts n characters at index=n the place n<x, then your highlighted vary ought to shift to cowl from x+n to y+n. However REG can decide the precise place of this index by replaying historical past. Thus, it might probably reconstruct the corresponding CRDT construction by replaying historical past.

Reconstructing historical past might sound time-consuming, however REG can backtrack just some. When merging updates from distant sources, it solely must replay operations parallel to the distant replace, reconstructing the native CRDTs to calculate the diff after making use of distant operations to the present doc.

The REG algorithm excels with its quick native replace speeds and eradicate issues about tombstone assortment in CRDTs.As an illustration, if an operation has been synchronized throughout all endpoints, no new operations will happen concurrently with it, permitting it to be safely faraway from the historical past.

What’s Fugue

Fugue is a brand new CRDT textual content algorithm, introduced in The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing (opens in a new tab) by Matthew Weidner (opens in a new tab) et al., properly solves the interleaving drawback.

The interleaving drawback was proposed within the paper Interleaving anomalies in collaborative text editors (opens in a new tab) by Martin Kleppmann et al.

An instance of interleaving:

  • A sort “Hi there ” from left to proper/proper to left
  • B sort “Hello ” from left to proper/proper to left
  • The anticipated outcome: “Hi there Hello ” or “Hello Hi there “
  • The interleaving outcome could seem like: “HHeil lo”
    • This occurs when typing from proper to left in RGA.

An example of an interleaving anomaly when using fractional indexing CRDT on text content. Source: **Martin Kleppmann, Victor B. F. Gomes, Dominic P. Mulligan, and Alastair R. Beresford. 2019. Interleaving anomalies in collaborative text editors. https://doi.org/10.1145/3301419.3323972

An instance of an interleaving anomaly when utilizing fractional indexing (opens in a new tab) CRDT on textual content content material.
Supply: **Martin Kleppmann, Victor B. F. Gomes, Dominic P. Mulligan, and Alastair R. Beresford. 2019. Interleaving anomalies in collaborative textual content editors. https://doi.org/10.1145/3301419.3323972 (opens in a new tab)

The Fugue paper (opens in a new tab) summarizes the present state of the interleaving issues within the desk.

Souece: Weidner, M., Gentle, J., & Kleppmann, M. (2023). The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing. ArXiv. /abs/2305.00583

Souece: Weidner, M., Light, J., & Kleppmann, M. (2023). The Artwork of the Fugue: Minimizing Interleaving in Collaborative Textual content Enhancing. ArXiv. /abs/2305.00583

The interleaving drawback typically are unsolvable when there are greater than 2 websites. See Fugue (opens in a new tab) paper Appendix B, Proof of Theorem 5 for detailed clarification.

The case where the interleaving problem is unsolvable Source: Weidner, M., Gentle, J., & Kleppmann, M. (2023). The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing. ArXiv. /abs/2305.00583

The case the place the interleaving drawback is unsolvable
Supply: Weidner, M., Light, J., & Kleppmann, M. (2023). The Artwork of the Fugue: Minimizing Interleaving in Collaborative Textual content Enhancing. ArXiv. /abs/2305.00583

Nevertheless, we are able to nonetheless decrease the prospect of interleaving. Fugue introduces the idea of maximal non-interleaving and solves it with a chic algorithm that’s simple to optimize. The definition of maximal non-interleaving makes a variety of sense to me and leaves little room for ambiguity. I will not reiterate the definition right here. However the fundamental concept is first to unravel ahead interleaving by leftOrigin. If there’s nonetheless ambiguity, then clear up the backward interleaving by rightOrigin. (The leftOrigin and rightOrigin consult with the ids of the unique neighbors when the character is inserted, identical to Yjs)

Transient Introduction to Peritext

Peritext (opens in a new tab) was proposed by Geoffrey Litt et al. It is the primary paper to debate wealthy textual content CRDTs. It may well merge concurrent edits in wealthy textual content format whereas preserving users’ intent as much as possible (opens in a new tab). Its main focus is merging the codecs and annotations of wealthy textual content content material, comparable to daring, italic, and feedback. It was applied in Automerge (opens in a new tab) and crdt-richtext (opens in a new tab).

💡 The precise definition of person intent within the context of concurrent wealthy textual content modifying cannot be clearly defined in just a few phrases. It is best understood by explicit examples.

Peritext is designed to unravel a few important challenges:

Firstly, it addresses the anticipated issues arising from conflicting model edits. As an illustration, contemplate a textual content instance, “The fast fox jumped.” If Consumer A highlights “The fast” in daring and Consumer B highlights “fast fox jumped,” the best merge ought to lead to the complete sentence, “The fast fox jumped,” being daring. Nevertheless, current algorithms may not meet this expectation, leading to both “The fast fox” or “The” and “jumped” being daring as a substitute.

Moreover, Peritext manages conflicts between model and textual content edits. In the identical instance, if Consumer A highlights “The fast” in daring, however Consumer B adjustments the textual content to “The quick fox jumped,” the best merge ought to lead to “The quick” being daring.

What’s extra, Peritext takes into consideration completely different expectations for increasing kinds. For instance, if you happen to sort after a daring textual content, you’ll sometimes need the brand new textual content to proceed being daring. Nevertheless, if you happen to’re typing after a hyperlink or a remark, you possible would not need the brand new enter to change into a part of the hyperlink or remark.

Link style should not expand

Illustration of Peritext’s inside state. It makes use of the IDs of the character’s ops to file the model ranges. Within the instance, the daring mark has the vary of { begin: { sort: "earlier than", opId: "9@B" }, finish: { sort: "earlier than", opId: "10@B" }}

Why Unique Peritext Cannot Be Instantly Used with REG

On the one hand, Peritext’s algorithm expresses style ranges through character OpIDs (opens in a new tab). With out replaying historical past, CRDTs primarily based on REG can not decide the particular positions corresponding to those OpIDs.

Alternatively, it is not possible to mannequin Peritext on REG by replaying. It’s because REG’s “native backtracking suffices” depends on the algorithm satisfying “the identical operation will produce the identical impact, whatever the present state,” which Peritext doesn’t adhere to. For instance, when inserting the character “x” at place p, whether or not “x” is daring depends upon “whether or not p is surrounded by daring” and “whether the tombstones at p contain boundaries of bold and other styles (opens in a new tab).”

Loro’s Wealthy Textual content CRDT

Algorithm

Loro implements rich text using special control characters called ‘style anchors’. Each matching pair of start anchor and end anchor contains the following information:

The strategy to find out a personality’s model is as follows:

  • Discover all model anchor pairs that embrace the character, the place every pair is created by the identical model operation
  • Mixture pairs based on the important thing. There could also be a number of model pairs with the identical key however completely different values. In such instances, the worth with the best Lamport timestamp is chosen (if Lamport timestamps are equal, then use the peer ID to interrupt the tie)

Opposite to Yjs’s method of using control characters (opens in a new tab) for wealthy textual content, our algorithm pairs begin and finish anchors once they originate from the identical model operation. This strategy precisely handles the next eventualities:

overlap_bold

These particular management characters should not uncovered to the person; every management character is successfully of zero size from the person’s perspective. Our information construction helps varied strategies of measuring textual content size for indexing textual content content material. Apart from Unicode, UTF-16, and UTF-8, we additionally measure our wealthy textual content size in Entity size. It treats every model anchor as an entity with a size of 1 and measures plain textual content in Unicode size.

len

Native Habits

Multiple valid insertion points can exist when users insert text at a specific Unicode index. It occurs due to style anchors, which are zero-length elements from the user’s perspective.

Two Different Kinds of Indexes

For example, in the case of <b>Hello</b> world, when a user inserts content at Unicode index=5, they face the choice of inserting to the left or right of </b>. If the user sets the expand behavior of bold to expand forward, the new character will be inserted to the left of </b>, making the inserted text bold as well.

When users delete text, Loro uses an additional mapping layer to avoid deleting style anchors within the text range.

To model the deletion of a style, a new style anchor pair with a null value is added.

We can implement the following optimizations to remove redundant style anchors:

  • The style anchors that include no text can be removed.
  • When styles completely negate each other, like a span of bold is canceled by a span of unbold, we can remove their style anchors.

All these behaviors happen locally, and the algorithm is independent of the specific List CRDT.

Behavior When Inserting Text at Style Boundaries

Most modern rich text editors (Google Doc, Microsoft Word, Notion) behave as follows: when new text is entered right after bold text, the new text should inherit the bold style; when entered after a hyperlink, the new content should not inherit the hyperlink style. Different styles have varying preferences for text insertion positions, leading to potential conflicts. This is reflected in the degree of freedom we have when inserting new text.

Users interact with rich text based on text-based indexes, like the Unicode index. Since style anchors have a Unicode Length of 0, a Unicode index with n style anchors presents n + 1 potential insertion positions.

We select the insertion position based on the following rules:

  1. Insertions occur before a start anchor of a style that should not expand backward.
  2. Insertions occur before style anchors that signify the end of bold-like marks (expand = “after” or expand = “both”).
  3. Insertions occur after style anchors that signify the end of link-like marks (expand = “none” or expand = “before”).

Rule 1 should be prioritized over rules 2 and 3 to prevent the accidental creation of a new style (opens in a new tab).

The present technique first scans ahead to seek out the final place satisfying guidelines 1 and a couple of.

Then, it scans backward to seek out the primary place satisfying rule 3.

Merging Distant Updates

Loro treats style anchors as a special element and handles them using the same List CRDT for resolving concurrent conflicts. The logic related to rich text is independent of the particular List CRDT. Therefore, this algorithm can rely on any List CRDT algorithm for merging remote operations. Loro utilizes the Fugue (opens in a new tab) Record CRDT algorithm.

When new model anchors are inserted by distant updates, new kinds are added; if outdated model anchors are deleted, the corresponding outdated kinds are eliminated.

Robust Eventual Consistency

The internal state of this algorithm consists of a list of elements, each either a text segment or a style anchor. The rich text document is derived from this internal state.

The internal state achieves strong eventual consistency through the upstream List CRDT.

Identical internal states result in identical rich text documents. Hence, the same set of updates will produce the same rich text documents, evidencing the strong eventual consistency of this algorithm.

Criteria in Peritext

The Peritext paper (opens in a new tab) specifies the intent-preserving merge habits for wealthy textual content inline format. Loro’s wealthy textual content algorithm efficiently passes all take a look at instances outlined therein.

1. Concurrent Formatting and Insertion

Loro easily supports this case by treating style anchors as special elements alongside text.

2. Overlapping Formatting

This case has been analyzed earlier. Since our style anchors contain style op ID information, we know there are two bold segments: one from 0 to 5 and another from 3 to 11, allowing us to merge them.

Multiple style types are easily supported.

Like Peritext, we model unbolding by adding a new style with the key bold and the value null.
The final value of each style key on each character is determined by the style with the greatest
Lamport (opens in a new tab) timestamp that features the character. Thus, it simply helps this case.

See Also

3. Textual content Insertion at Span Boundaries

Insertion right after a bold style should result in the newly inserted text also being bold.

bold_expand

However, insertion right after a link style should result in the newly inserted text not having the hyperlink style.

Link style should not expand

4. Styles that Support Overlapping

The problem of overlapping styles is related to how we represent them.

We represent the rich text using Quill’s Delta (opens in a new tab) format.

[
	{ insert: 'Gandalf', attributes: { bold: true } },
	{ insert: ' the ' },
	{ insert: 'Grey', attributes: { color: '#cccccc' } }
]

An instance of Quill’s Delta format

Nevertheless, it can not deal with instances with a number of values assigned to the identical key. So, it is a headache to deal with the kinds that help overlapping.

For instance, within the above case, the textual content “fox” is commented on by each Alice and Bob. We will not characterize it with Quill’s Delta format immediately.
So the doable workaround contains:

Flip the attribute worth into an inventory

[
  { insert: "The ", attributes: { comment: [{ ...commentA }] } },
  {
    insert: "fox",
    attributes: { remark: [{ ...commentA }, { ...commentB }] },
  },
  { insert: " jumped", attributes: { remark: [{ ...commentB }] } },
]

Use op ID that creates the op as the important thing of the attribute

[
  { insert: "The ", attributes: { "id:0@A": { key: "comment", ...commentA } } },
  {
    insert: "fox",
    attributes: {
      "id:0@A": { key: "comment", ...commentA },
      "id:0@B": { key: "comment", ...commentB },
    },
  },
  {
    insert: " jumped",
    attributes: { "id:0@B": { key: "comment", ...commentA } },
  },
]

However each require particular behaviors for each CRDT lib and for software code, that are painful to work with.

Lastly, we discovered that the optimum strategy to characterize an overlappable model was to make use of <key>: as a prefix and permit customers to assign a singular suffix to create a definite model key.
This technique simplifies the CRDTs library code, because it does not require dealing with particular instances.
It successfully addresses eventualities the place a number of feedback overlap and can be user-friendly for software coding.

[
	{ insert: "The ", attributes: { "comment:alice": "Hi" } },
	{
		insert: "fox",
		attributes: { "comment:alice": "Hi", "comment:bob": "Jump" },
	},
	{ insert: " jumped", attributes: { "comment:bob": "Jump" } },
]

Following is the instance code in Loro:

const doc = new Loro();
doc.configTextStyle({
	remark: { broaden: "none", }
})
const textual content = doc.getText("textual content");
textual content.insert(0, "The fox jumped.");
textual content.mark({ begin: 0, finish: 7 }, "remark:alice", "Hello");
textual content.mark({ begin: 4, finish: 14 }, "remark:bob", "Leap");
anticipate(textual content.toDelta()).toStrictEqual([
	{
		insert: "The ", attributes: { "comment:alice": "Hi" },
	},
	{
		insert: "fox", 
		attributes: { 
			"comment:alice": "Hi", 
			"comment:bob": "Jump" 
		},
	},
	{
		insert: " jumped", attributes: { "comment:bob": "Jump" },
	},
	{
		insert: ".",
	}
])

Implementation of Loro’s Wealthy Textual content Algorithm

The following is an overview of Loro’s implementation as of January, 2024.

Architecture of Loro

In line with the properties of Replayable Event Graph, Loro uses OpLog and DocState as the internal state.

OpLog is dedicated to recording history, while DocState only records the current document state and does not include historical operation information. When applying updates from remote sources, Loro uses the relevant operations from OpLog and computes the diff through a DiffCalculator. This diff is then applied to DocState. This architecture also makes time travel easier to implement.

For more details, see the documentation on DocState and OpLog (opens in a new tab).

Implementation of Loro’s Wealthy Textual content CRDT

For rich text, Loro reuses the same DiffCalculator as Loro List, based on the Fugue (opens in a new tab) algorithm. Consequently, the first logic associated to wealthy textual content is concentrated in DocState. This contains expressing kinds, inserting new characters, and representing a number of index codecs.

Within the illustration of wealthy textual content state, we distinguish between the info construction ContentTree, which expresses the textual content (together with model anchors), and StyleRangeMap, which expresses kinds.

Each constructions are constructed on B+Bushes.

ContentTree is accountable for environment friendly textual content discovering, insertion, and deletion. It may well index particular insertion/deletion positions utilizing Unicode/UTF-8/UTF-16/Entity index. It doesn’t retailer what particular model every textual content phase ought to have.

We constructed the next B+Tree construction primarily based on our generic-btree library (opens in a new tab) to specific textual content in reminiscence:

  • Every inside node within the B+Tree shops the Unicode char size, UTF-16 size, UTF-8 size, and Entity size of its subtree. The Entity size considers the size of fashion anchors as 1, in any other case 0.
  • The leaf nodes of the B+Tree are textual content or model anchors.

StyleRangeMap is accountable for environment friendly updating/querying of fashion ranges.

Within the StyleRangeMap B+Tree expressing kinds:

  • Every inside node shops the entity size of its subtree.
  • Every leaf node shops the gathering of fashion info for the corresponding vary and its `entity size.

Separating the textual content ContentTree and magnificence StyleRangeMap into two constructions goals for higher efficiency optimization. On wealthy textual content, model info is usually not plentiful and tends to have good continuity, comparable to a number of paragraphs having the identical format, which will be expressed with a single leaf node. Nevertheless, our construction for storing textual content is unsuitable for leaf nodes with giant content material, as conversion time between completely different encoding codecs would change into excessively lengthy.

When a person inserts a brand new character at Unicode index = i, the next happens:

  • Discover the place at Unicode index = i in ContentTree.
  • Examine if there are any adjoining model anchors at this place. If not, immediately insert.
  • If there are, resolve whether or not to insert to the left or proper of the corresponding model anchor primarily based on its sort and properties. If there are a number of such model anchors, insert them based on the earlier part on “Behavior When Inserting Text at Style Boundaries”.

Testing

We have written tests for the criteria proposed by Peritext and passed all of them.

To ensure the correctness of our CRDTs, we have added numerous fuzzing tests to simulate different collaborative behaviors, synchronization behaviors, and time-travel behaviors. These tests check for the strong eventual consistency and the correctness of internal invariants. We run these fuzzing tests continuously for several days after every critical modification to avoid oversights.

How to Use

Before using the Loro’s rich text module, it is necessary to define the configuration for rich text styles, specifying the expand behavior for different keys and whether overlap is allowed.

Here is an example of using Loro’s rich text in JavaScript:

const doc = new Loro();
doc.configTextStyle({
	bold: {
		expand: "after",
	},
	comment: {
		expand: "none",
		overlap: true
	},
	link: {
		expand: "none"
	}
});
 
const text = doc.getText("text");
text.insert(0, "Hello world!");
text.mark({ start: 0, end: 5 }, "bold", true);
expect(text.toDelta()).toStrictEqual([
	{
		insert: "Hello",
		attributes: { bold: true },
	}, 
	{
		insert: " world!",
	}
] as Delta<string>[]);
 
text.insert(5, "!");
expect(text.toDelta()).toStrictEqual([
	{
	    insert: "Hello!",
	    attributes: { bold: true },
	}, {
	    insert: " world!",
	}
] as Delta<string>[]);

Summary

This article presents Loro’s rich text algorithm design and implementation. Its correctness is readily demonstrable. It can be built upon any existing List CRDT algorithm. It allows Loro to support rich text collaboration using REG and Fugue (opens in a new tab), combining the strengths of a number of CRDT algorithms.

We’re repeatedly refining its design and actively looking for design companions. We’re open to all types of suggestions and constructive criticism. Ought to you’ve gotten any proposals for collaboration, please attain out to zx@loro.dev



Source Link

What's Your Reaction?
Excited
0
Happy
0
In Love
0
Not Sure
0
Silly
0
View Comments (0)

Leave a Reply

Your email address will not be published.

2022 Blinking Robots.
WordPress by Doejo

Scroll To Top