Rsync taking ages…how to speed up?

https://www.reddit.com/r/linuxquestions/comments/1fjxzvx/rsync_taking_ageshow_to_speed_up/

I am transferring files from a remote location to a local machine.

rsync -avP --rsync-path=/tmp/packages/dnf/usr/bin/rsync -e "ssh -p 22xxx -i /var/services/homes/xxx/.ssh/authkeys/id_xxx" xxxxxx@185.xxx.xxx.xx:/xxx/xxxxx/xxxxx /volume1/xxxxx/xxxxxx

This is a transfer from a company network to our Synology NAS. They claim to have gigabit or better, and our NAS has three 1000 Mbps links connected to it. The data to transfer totals 15.5 TB and I am only getting 4~10 MB/s, so this is going to take a few months to finish. How can I use the full bandwidth with rsync?
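A quick back-of-the-envelope check of the figures in the post may help frame the problem. The sketch below is not part of the original thread: `eta_days` is a hypothetical helper, and it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant transfer rate. The 110 MB/s figure is only an assumed ballpark for a single well-used gigabit link.

```shell
# eta_days SIZE_TB RATE_MBS
# Prints the estimated transfer time in days.
# Assumes decimal units and a steady rate; real transfers fluctuate.
eta_days() {
    awk -v tb="$1" -v mbs="$2" \
        'BEGIN { printf "%.1f\n", (tb * 1e12) / (mbs * 1e6) / 86400 }'
}

eta_days 15.5 4     # slow end of the rate reported in the post
eta_days 15.5 10    # fast end of the rate reported in the post
eta_days 15.5 110   # assumed rate for one saturated gigabit link
```

Under these assumptions the three calls print 44.8, 17.9 and 1.6 days, so the gap between the observed rate and a saturated link is roughly a factor of ten or more.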

33 Comments

Ath-ropos • 1y ago

Are there a lot of files? If so, you could speed things up by starting multiple rsync processes in parallel. For instance, if there are 8 directories in your root directory, you could start 8 processes in parallel, one for each of these directories (assuming the files are more or less evenly distributed between these directories).

At work I need to synchronise hundreds of thousands of files with MS Azure, and using this method I could lower the sync time from 25mn to around 3mn (not talking about the initial sync which has to transfer the complete set of data).

lucasrizzini • 1y ago

OP is limited by the bandwidth, so how exactly does using multiple rsync instances work around that?

“I could lower the sync time from 25mn to around 3mn”

In a single rsync instance, right? How about when you take all of them into consideration together?

Ath-ropos • 1y ago

Because one rsync process generally doesn’t use the full bandwidth when syncing lots of files. For each file it has to check whether to synchronise or not, and during this time no data is transferred.

lucasrizzini • 1y ago

Wow. Really? That’s odd. Anyway, using multiple instances might indeed help then.

Ath-ropos • 1y ago

rsync is an amazing tool and rock solid, but unfortunately not multi-threaded.

AverageMedical5811 OP • 1y ago

When you say in parallel, you mean open multiple SSH sessions and run subfolders separately at the same time?

Ath-ropos • 1y ago

Yes, each call to rsync opens its own SSH session according to the command line in your post.

Instead of something like this:

rsync -a rootdir/ remote:rootdir/

You do this:

rsync -a rootdir/sub1/ remote:rootdir/sub1/

rsync -a rootdir/sub2/ remote:rootdir/sub2/

Beware that you may run into a limit on the number of parallel SSH sessions on the remote side, depending on the SSH server configuration. For MS Azure I run 20 instances in parallel without any issue.
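The per-subdirectory approach described above could be scripted roughly as follows. This is a sketch, not Ath-ropos's actual script: `parallel_rsync` and the `RSYNC` override are hypothetical names, it assumes `find` supports `-mindepth`/`-maxdepth` and `xargs` supports `-0` and `-P`, and it follows the push direction of the example above (local source, remote destination). Files sitting directly in the root directory are not covered and would need one extra rsync run.

```shell
# parallel_rsync SRC_ROOT DEST_ROOT [JOBS]
# Runs one "rsync -a" per top-level subdirectory of SRC_ROOT, at most JOBS
# (default 4) at a time. Setting RSYNC=echo previews the commands instead
# of transferring anything.
parallel_rsync() {
    local src_root="$1"
    local dest_root="$2"
    local jobs="${3:-4}"
    find "$src_root" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} sh -c '
            name=$(basename "$1")
            "$2" -a "$1/" "$3/$name/"
        ' _ {} "${RSYNC:-rsync}" "$dest_root"
}
```

A preview run such as `RSYNC=echo parallel_rsync rootdir remote:rootdir 8` prints the argument list each rsync would receive. Keeping the job count below the SSH server's session limit avoids the refusal mentioned above.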

wizard10000 • 1y ago

Part of my server’s backup script. I start multiple rsync sessions in the background and need all of them to complete before continuing with the script.

declare -i PID1=0 PID2=0 PID3=0 PID4=0

/usr/bin/rsync -qa --chown=wizard:wizard --del /etc/ /media/internal/server/etc & PID1=$!

/usr/bin/rsync -qa --chown=wizard:wizard --del /usr/local/ /media/internal/server/usr-local & PID2=$!

/usr/bin/rsync -qa --chown=wizard:wizard --exclude-from=/usr/local/etc/rsync/exclude --del /root/ /media/internal/server/root & PID3=$!

/usr/bin/rsync -qa --chown=wizard:wizard --exclude-from=/usr/local/etc/rsync/exclude --del /home/wizard/ /media/internal/server/home & PID4=$!

wait $PID1 $PID2 $PID3 $PID4 2>/dev/null

pmodin • 1y ago

Try rclone with --multi-thread-streams=N if you don’t want to fiddle too much. If you want to fiddle, you could use xargs or parallel.
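One caveat on the `wait $PID1 $PID2 $PID3 $PID4` line in wizard10000's script: when `wait` is given several PIDs, its exit status is that of the last one only, so an earlier failed rsync can go unnoticed. A minimal sketch of waiting on each job separately follows; `wait_all` is a hypothetical helper, not part of the original script.

```shell
# wait_all PID...
# Waits for each background PID in turn and returns the number of jobs
# that exited with a non-zero status (0 means every job succeeded).
wait_all() {
    failed=0
    for pid in "$@"; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}
```

In the script above, the last line could then become something like `wait_all $PID1 $PID2 $PID3 $PID4 || echo "at least one rsync failed" >&2`.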

Individual-Cup-7458 • 1y ago

What is the speed of your connection? It doesn’t matter how big the pipe is at the other end if you’re trying to suck through a straw.

AverageMedical5811 OP • 1y ago

Three gigabit links, so 3000 Mbps is the bandwidth.

caa_admin • 1y ago

Look into iperf; it might reveal why you’re seeing slowdowns. Just because it’s claimed to be gigabit doesn’t mean it’s operating that way. https://www.baeldung.com/directory11/iperf-measure-network-performance
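A rough sketch of what such a measurement could look like, assuming iperf3 is installed on both ends and `iperf3 -s` is already running on the remote host. `measure_link` and `mbyte_to_mbit` are hypothetical helper names. iperf3 reports Mbits/sec while rsync reports MB/s, so the second helper converts between the two (treating 1 MB as 8 Mbit, an approximation).

```shell
# mbyte_to_mbit RATE_MBS
# Converts a rate in MB/s (as rsync reports) to Mbit/s (as iperf3 reports).
mbyte_to_mbit() {
    awk -v v="$1" 'BEGIN { printf "%.0f\n", v * 8 }'
}

# measure_link HOST [PORT]
# Runs a 30-second iperf3 test against HOST (default iperf3 port 5201).
# -R reverses the direction so HOST sends and the local machine receives,
# which matches pulling data from the remote side.
measure_link() {
    iperf3 -c "$1" -p "${2:-5201}" -t 30 -R
}

mbyte_to_mbit 10   # the post's best rsync rate, expressed in Mbit/s
```

The last line prints 80, so the best rate in the post corresponds to well under a tenth of a nominal gigabit link. If iperf3 shows a similar figure, the network path rather than rsync is the likely limit.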

NotPrepared2 • 1y ago

“Never underestimate the bandwidth of a station wagon full of (NAS) hurtling down the highway.”

Just drive there and do the transfer. Or ship the NAS there, and then ship it back.

Individual-Cup-7458 • 1y ago

Was going to post exactly this. This is the answer.

scristopher7 • 1y ago

Enable arcfour if this is a private network; it gives a massive rsync boost. It has helped me greatly in the past when migrating around 4k mailboxes.

https://gist.github.com/KartikTalwar/4393116 outlines an example. Make sure to read what this is doing and don’t just do it.

lucasrizzini • 1y ago

Did you try using the -z flag?

--compress, -z: compress file data during the transfer

scristopher7 • 1y ago

If the difference between compression on and off is minimal, it’s usually better not to use compression because it will often slow the transfer down. But if the difference is significant, then yes, compression may be useful. However, it also depends on the I/O on both sides, which can cause a slowdown if one side is slow, since the data must be compressed and then uncompressed.

Gold-Program-3509 • 1y ago

Now think about how much time is going to be wasted by the NAS compressing 15 TB of data.

br-rand • 1y ago

You might wanna add a few more parameters to your rsync call:

--numeric-ids

--delay-updates

--compress

But they won’t solve your network problem. Consider creating a single 100 MB file and rsyncing just that file to test your network speed. That will isolate the I/O to a single file being read sequentially.
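The single-file test could be sketched like this. The helper names and the placeholder remote path are assumptions, not from the thread. Because the post pulls from the remote side, `make_test_file` would be run on the remote host and `pull_test_file` on the NAS. Any `-e "ssh ..."` and `--rsync-path` options from the original command would need to be added to the rsync call.

```shell
# make_test_file PATH [SIZE_MB]
# Writes SIZE_MB (default 100) MiB of random bytes to PATH. Random data
# barely compresses, so --compress cannot flatter the measured rate.
make_test_file() {
    head -c "$(( ${2:-100} * 1024 * 1024 ))" /dev/urandom > "$1"
}

# pull_test_file REMOTE_SPEC LOCAL_DIR
# Copies one file and shows rsync's own rate display. REMOTE_SPEC is a
# placeholder such as user@host:/tmp/rsync-test.bin.
pull_test_file() {
    rsync --progress "$1" "$2"
}
```

If this single large file also crawls at 4~10 MB/s, the number of files is probably not the main bottleneck.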

Have you considered the possibility that your remote host might have throttled upload speeds? Look into rsync pull vs push. Sometimes it’s faster to run rsync on the source host, pushing to the destination, rather than pulling from the remote source to local.

NoRecognition84 • 1y ago

You are transferring these files from a remote NAS to you over what kind of connection? Internet? Are you using a VPN?

michaelpaoli • 1y ago

“Rsync taking ages…how to speed up? I am transferring files from a remote”

Try various compression levels, from quite extreme to none at all, with a reasonable sample of the data, and adjust accordingly to optimise. If you bottleneck on CPU, your compression level is too high. If you bottleneck on network, you can probably do with a (bit) more compression, at least until you start to bottleneck on CPU. If you bottleneck on drive(s), well, you optimise that if you can; otherwise there’s not much else you can do to make that stuff go faster.
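One cheap way to judge whether compression is worth trying at all is to check how compressible a sample file is. The sketch below uses gzip purely as a rough stand-in, on the assumption that it gives a similar picture to rsync's own compressor. `compress_pct` is a hypothetical helper. For the real comparison, rsync's `--compress-level=NUM` option can be varied on a sample transfer.

```shell
# compress_pct FILE [LEVEL]
# Prints the gzip-compressed size of FILE as a percentage of the original
# (LEVEL 1-9, default 6). Values near 100 mean compression will not help.
compress_pct() {
    orig=$(wc -c < "$1" | tr -d ' ')
    [ "$orig" -gt 0 ] || { echo "empty file" >&2; return 1; }
    comp=$(gzip -c "-${2:-6}" "$1" | wc -c | tr -d ' ')
    awk -v o="$orig" -v c="$comp" 'BEGIN { printf "%.0f\n", 100 * c / o }'
}
```

Running it with `time` at levels 1, 6 and 9 on a representative file gives a feel for both the size saving and the CPU cost. Already-compressed data such as video or archives will typically stay close to 100.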

If you do use --checksum you’ll get better integrity; if you don’t, it’ll be (at least a tiny bit) faster, at the cost of integrity.

If the performance is still quite unexpectedly slow, dig further, to determine where you’re bottlenecking.

timonix • 1y ago

We had a similar issue where it would use almost the full speed when running from Windows, and just crawl when running the same command from Linux.

I don’t actually know what fixed it. Our storage provider made an update which fixed it randomly one day.

TabsBelow • 1y ago

In addition to the top replies, what about package size? If chunks are too small, there is too much overhead; if they are too big, you get “dead periods” where transmission waits until the other side has acknowledged write completion for that part.

Dolapevich • 1y ago

Before asking how to speed up rsync, try to measure the speed of those links and make sure there are no dropped packets.

I can suggest smokeping to test the link quality, and doing a simple sftp copy to measure it.

NL_Gray-Fox • 1y ago

There’s a big (massive) difference between syncing a million small files and syncing one big file; maybe try to tar it beforehand.
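A variant of the tar idea that avoids needing space for a 15 TB archive is to stream tar through a pipe. The sketch below shows the pattern locally. `tar_copy`, the host and the paths are hypothetical. For the remote case the first tar would run on the source host over SSH, as in the comment. Unlike rsync, a tar stream cannot resume or skip unchanged files, so it suits an initial bulk copy that rsync then finishes.

```shell
# tar_copy SRC_DIR DEST_DIR
# Streams the contents of SRC_DIR into DEST_DIR as one continuous tar
# stream instead of file by file. Remote equivalent (hypothetical names):
#   ssh user@host 'tar -C /data -cf - .' | tar -C /volume1/data -xf -
tar_copy() {
    mkdir -p "$2" && tar -C "$1" -cf - . | tar -C "$2" -xf -
}
```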

Gold-Program-3509 • 1y ago

You can have a 3 gazillion Gbps link, but if you’re reading from a small NAS using HDDs, you’re not going to saturate the link no matter what. If you’re syncing a lot of very small files, the speed you’re getting is very realistic (for HDD storage).
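Whether the data set really consists of many small files can be checked on the source side before tuning anything else. A minimal sketch, with hypothetical helper names, follows. The average is only approximate because `du` counts allocated blocks rather than exact byte sizes.

```shell
# count_files DIR
# Prints the number of regular files under DIR.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# avg_file_kb DIR
# Prints the approximate average on-disk size per file in KiB
# (0 if DIR contains no regular files).
avg_file_kb() {
    n=$(count_files "$1")
    [ "$n" -gt 0 ] || { echo 0; return; }
    kb=$(du -sk "$1" | awk '{ print $1 }')
    echo $(( kb / n ))
}
```

Millions of files averaging a few KiB point toward per-file overhead and disk seeks. A few thousand files averaging gigabytes point back at the network.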
