1
0
mirror of https://github.com/laurent22/joplin.git synced 2024-12-30 10:36:35 +02:00
joplin/readme/news/20200906-172325.md
2021-12-17 18:37:01 +01:00

6.2 KiB

created source_url
2020-09-06T17:23:25.000+00:00 https://www.patreon.com/posts/improving-sync-41310131

Improving the sync process in Joplin

The latest version of Joplin includes a mechanism to upgrade the  structure of a sync target. When you startup the app you will be asked  to upgrade before being able to sync. Once you start the process, the  app will briefly display an information screen, upgrade the sync target,  and then restart the app. You’ll then be able to sync with the new sync  target format. That first upgrade is quite simple as the goal for now  is to put the mechanism in place and verify that it works well.

From a user perspective this feature doesn’t do anything visible, although it caused some issues, so one might wonder why it’s even there. This post is meant to clarify this.

The structure of the sync target hasn’t really changed since the day  Joplin was released. It works well however it has some shortcomings that  should be fixed eventually for sync to remain performant.

There are also various improvements that could be made but were not  previously possible due to the lack of an upgrade mechanism. I have  listed below the 5 main limitations or issues with the current sync  process and how they could be fixed:

No upper limit on the number of items

Joplin’s UI works well even with millions of notes, however the sync  target will keep getting slower and slower as more files are added to  it. File systems often have a limit to the number of files they can  support in a directory. One user also has reached the limit of 150,000 items on OneDrive.

For now, this is not a big issue because most users don’t have  millions of items, but as more web pages are being clipped (clipped  pages often contain many small resources and images) and more note  revisions are created (one note can have hundreds of revisions), this  issue might start affecting more users.

One way to solve this issue would be to split the sync items into  multiple directories. For example if we split the main directory into  100 sub-directories, it will be possible to have 15,000,000 OneDrive  items instead of 150,000. Another way would be to support note  archiving, as described below. How exactly we’ll handle this problem is  still to be defined, but there are certainly ways.

Not possible to prioritise downloads

Currently, when syncing, the items are downloaded in a random way. So  it might download some notes, then some tags and notebooks, then back  to downloading notes, etc. For small sync operations it doesn’t matter,  but large ones, like when setting up a new device, it is very  inefficient. For example, the app might download hundreds of note  revisions or tags, but won’t display anything for a while because it  won’t have downloaded notebooks or notes.

A simple improvement would be to group the items by type on the sync  target. So all notebook items together, all tags together, etc. Doing so  means when syncing we can first download the notebooks, then the notes,  which means something will be displayed almost immediately in the app,  allowing the user to start using it. Then later less important items  like tags or note revisions will be downloaded.

End-to-end encryption is hard to setup

Currently, the encryption settings is a property of the clients. What  it means it that when you setup a new client, it doesn’t know whether  the other clients use encryption or not. It’s going to guess more or  less based on the data on the sync target. You can also force it to use  encryption but this has drawbacks and often mean a new master key is  going to be created, even though there might already be one on the sync  target.

E2EE works well once it’s setup, but doing so can be tricky and possibly confusing - if you didn’t follow this guide to the letter, you might end up with multiple master keys, or sending decrypted notes to an encrypted target.

A way to solve this would be to make the E2EE settings a property of  the sync target. Concretely there would be a file that tells if E2EE is  enabled or not, and maybe some way to quickly get the master key. It  would simplify setting up encryption a lot and make it more secure  (because you won’t be able to send non-encrypted notes to an encrypted  sync target). When you setup a new client, the client will know  immediately if it’s an encrypted target or not and set the client  accordingly.

Old notes that never change should be handled differently

It would be more efficient to treat old notes differently by allowing  the user to “archive” them. An archived note would be read-only. Then  one idea could be to group all these archived notes into a ZIP file on  the sync target. Doing so means that the initial sync would be much  faster (instead of downloading hundred of small files, which is slow, it  will download one large file, which is fast). It would also make the  structure more scalable - you could keep several years of archived notes  on the sync target while keeping sync fast and efficient.

The resource directory should be renamed

The folder that contains file attachments is named “.resources” on  the sync target. This causes troubles because certain platforms will  hide directories that start with dot “.”, and perhaps they will be  excluded from backup or skipped when moved somewhere else. Being able to  upgrade the sync target means we can rename this folder to just  “resources” instead.

Conclusion

That’s obviously a lot of possible improvements and it won’t be done  overnight, but having the sync upgrade mechanism in place means we can  start considering these options. Some of these, such as renaming the  “resources” dir are simpler and could be done relatively soon. Perhaps  other more complex ones will be group within one sync target upgrade to  minimise disruption. In any case, I hope this clarifies the reason for  this recent sync upgrade and that it gives some ideas of what to expect  in the future.