In CouchDB, `mem3_sync` is the mechanism that keeps shards of the same range synchronised across multiple cluster nodes. It does so by implementing the CouchDB replication protocol over Erlang distributed messages. In slightly more detail, this means it reads the by-seq index of a shard and copies out all live content into the new shard.
During cluster management operations, such as increasing the number of nodes, it can become necessary to move an existing shard from one node (A) to another (B) that has no such shard yet, because B has only just joined the cluster and no database shard map points to it.
The easiest way to move a shard to node B is to edit the shard map of the respective database and change the entries for a particular shard range or ranges to point to node B instead of node A. `mem3` will recognise the change in the shard map, notice the missing shard on node B, and then initiate `mem3_sync` to fill up the shard to match the copies on the other cluster nodes.
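As an illustration of the shard map edit described above, here is a minimal Python sketch. It assumes the shard map document carries a `by_range` object (range to list of nodes) and a `by_node` object (node to list of ranges), which is how CouchDB's shard map documents are commonly laid out; the function name `move_shard_range` and the node names in the usage note are hypothetical, and the sketch ignores any `changelog` bookkeeping the document may also contain.

```python
import copy


def move_shard_range(shard_map, shard_range, source_node, target_node):
    """Return a copy of shard_map with shard_range reassigned to target_node.

    Assumed layout (not taken from the text above):
      shard_map["by_range"][range] -> list of node names
      shard_map["by_node"][node]   -> list of ranges
    """
    updated = copy.deepcopy(shard_map)

    nodes = updated["by_range"][shard_range]
    if source_node not in nodes:
        raise ValueError(f"{source_node} does not hold {shard_range}")
    if target_node in nodes:
        raise ValueError(f"{target_node} already holds {shard_range}")

    # Point the range at the new node, keeping the position in the list.
    nodes[nodes.index(source_node)] = target_node

    # Keep the per-node view consistent with the per-range view.
    updated["by_node"][source_node].remove(shard_range)
    if not updated["by_node"][source_node]:
        del updated["by_node"][source_node]
    updated["by_node"].setdefault(target_node, []).append(shard_range)

    return updated
```

Calling `move_shard_range(doc, "00000000-7fffffff", "couchdb@node-a", "couchdb@node-b")` on a parsed shard map returns the edited copy; writing it back to the cluster is left out of the sketch.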
Unfortunately, implementing CouchDB replication means that data transfer does not happen at network speed, because following the protocol correctly requires multiple request/response cycles for each batch of operations.
However, when replicating into an empty shard, none of these extra roundtrips are strictly necessary. They are necessary later, when a shard might be slightly behind other nodes, so that the shard can be caught up correctly, but for the initial sync, `mem3_sync` does more work than strictly needed.
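To make the cost of the extra roundtrips concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch: the function name, the parameters, and the assumption that roundtrips do not overlap with data transfer are all ours, not measurements of `mem3_sync`.

```python
def estimate_sync_seconds(size_bytes, bandwidth_bytes_per_s, batches=0,
                          round_trips_per_batch=0, rtt_seconds=0.0):
    """Rough transfer time: raw streaming time plus time spent on roundtrips.

    Assumes roundtrips are strictly sequential and never overlap with
    streaming, which is a simplification.
    """
    streaming = size_bytes / bandwidth_bytes_per_s
    waiting = batches * round_trips_per_batch * rtt_seconds
    return streaming + waiting


# Hypothetical inputs: the same shard sent once with per-batch roundtrips
# and once as a plain binary copy with none.
with_protocol = estimate_sync_seconds(10**9, 10**8, batches=1000,
                                      round_trips_per_batch=3,
                                      rtt_seconds=0.001)
binary_copy = estimate_sync_seconds(10**9, 10**8)
```

In this model the gap between the two estimates grows with the number of batches and with network latency, while the size of the shard only affects the streaming term.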
A common recommendation for speeding up this process, especially for large databases, is to first `scp` or `rsync` the `.couch` shard file to the new host and then let `mem3_sync` top it up.
It would be a lot nicer if CouchDB chose a faster binary-copy on its own when it knows it is replicating into an empty shard file.
We propose the following algorithm to speed up the process:
- `mem3_sync` is invoked: check if the target shard exists on node B.
- […] `mem3_sync` […] nothing changes.
- […] `dbname.timestamp.couch.hardlink.file-length`, e.g. `db.1234567890.couch.hardlink.7865432`
- […] `ln`. It should be in the `couch_file` `gen_server` loop therefore.
- […] `mem3_sync` functionality, log a warning about using an odd file system.
- […] `.initial` suffix.
- […] `length(target)` to `length(source)` (as recorded at the beginning of this), do not read until `EOF`.
- […] `mem3_sync` functionality to top up the shard file.
- […] `.initial` suffix

TODO:

- […] `.hardlink` and `.initial` files.
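The core of the steps above, hard-linking the source shard file and then copying only up to the length recorded at that moment, can be sketched in Python as follows. This is an illustration under assumptions rather than CouchDB code: two local paths stand in for nodes A and B, the helper names `snapshot_shard` and `copy_prefix` are hypothetical, and returning `None` on a failed hard link is our stand-in for falling back to regular `mem3_sync`.

```python
import os

CHUNK_SIZE = 1024 * 1024  # copy in 1 MiB pieces


def snapshot_shard(source_path):
    """Hard-link the shard file and record its length at snapshot time.

    Returns (link_path, length), or None if the file system refuses the
    hard link, in which case a caller would fall back to regular syncing.
    """
    length = os.path.getsize(source_path)
    # Naming follows dbname.timestamp.couch.hardlink.file-length
    link_path = f"{source_path}.hardlink.{length}"
    try:
        os.link(source_path, link_path)
    except OSError:
        return None
    return link_path, length


def copy_prefix(link_path, length, target_path):
    """Copy exactly `length` bytes into target_path + '.initial'.

    The source may keep growing while we copy, so we stop at the recorded
    length instead of reading until EOF.
    """
    initial_path = target_path + ".initial"
    remaining = length
    with open(link_path, "rb") as src, open(initial_path, "wb") as dst:
        while remaining > 0:
            chunk = src.read(min(CHUNK_SIZE, remaining))
            if not chunk:
                raise IOError("source is shorter than the recorded length")
            dst.write(chunk)
            remaining -= len(chunk)
    return initial_path
```

Bounding the copy by the recorded length is what makes the snapshot consistent here, on the assumption that the shard file only grows by appending. Renaming the `.initial` file, the later top-up, and cleanup of the `.hardlink` file are deliberately left out of the sketch.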