Automated Backup with rsync and cron: Stop Losing Data the Hard Way

Linux tutorial - IT technology blog

The 2 AM Wake-Up Call Nobody Wants

A colleague of mine lost three weeks of database exports last year. Not because of a hardware failure — because he forgot to run the manual backup script before a migration went sideways. The files existed on the source server until they didn’t, and there was no automated copy anywhere.

Sound familiar? Manual backup routines collapse under real workload pressure. You skip one Friday, then two, and then something breaks on a Tuesday morning and you’re reconstructing data from memory.

The fix isn’t discipline — it’s automation. Specifically, rsync paired with cron handles this problem cleanly, without extra software or cloud subscriptions.

Why Manual Backups Always Fail Eventually

The root cause isn’t laziness. Manual backup processes fail for structural reasons:

  • Context switching — You’re focused on a deployment and the backup step gets skipped mentally.
  • Inconsistent timing — “I’ll do it tonight” becomes “I’ll do it tomorrow.”
  • Tooling mismatch — Copying with cp -r doesn’t handle symlinks, permissions, or partial transfers correctly.
  • No verification — You ran the backup, but did it complete? Was the destination actually written?

Once a backup depends on someone remembering to run it, you’ve already accepted data loss risk. Cron removes the memory requirement. rsync fixes the tooling problems.

rsync vs. Other Backup Approaches

Three tools typically come up when people talk Linux backup. Here’s what each one actually does — and where it falls short.

Option 1: cp -r

Simple, available everywhere. But it copies everything every time. For large directories, that means full re-transfers even if only one file changed. It also doesn’t preserve extended attributes or handle interrupted transfers gracefully.

Option 2: tar archives

Creates compressed archives, which is useful for long-term archival or moving backups off-site. The downside: restoring a single file requires extracting the whole archive. And again, every run is a full copy.

Option 3: rsync (incremental sync)

rsync only transfers what changed — comparing file sizes and modification timestamps (or checksums if you ask it to). On my production Ubuntu 22.04 server, nightly backups with tar were taking roughly 12 minutes. Switching to rsync dropped that to under 90 seconds. Only the changed files got transferred, so the delta was tiny.

rsync also recovers gracefully from interrupted transfers: re-running the command picks up where the last run left off, skipping files that already made it across (add --partial to keep partially transferred files too). And with the right flags it preserves permissions, symlinks, timestamps, and ownership; the -a archive flag covers all of these.

For server backups — local or remote — rsync is the clear winner.

Setting Up rsync for Automated Backup

Install rsync

Most Linux distributions ship rsync by default. If not:

# Debian/Ubuntu
sudo apt install rsync

# RHEL/AlmaLinux/Rocky
sudo dnf install rsync

Basic rsync Command Structure

rsync -avz --delete /source/directory/ /destination/directory/

Breaking down the flags:

  • -a (archive mode) — preserves permissions, timestamps, symlinks, owner, group
  • -v (verbose) — shows what’s being transferred (useful in logs)
  • -z (compress) — compresses data during transfer (mainly useful for remote; skip it for local backups)
  • --delete — removes files from destination that no longer exist in source (keeps them in sync)

Note the trailing slash on the source: /source/directory/ means “sync the contents of this directory.” Without it, rsync creates a subdirectory named directory inside the destination. This is a common gotcha.
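The difference is easy to see with a throwaway test (temp directories, not real paths):

```shell
# With the trailing slash, the *contents* of site/ are synced;
# without it, the site/ directory itself is recreated at the target
src=$(mktemp -d); a=$(mktemp -d); b=$(mktemp -d)
mkdir "$src/site"
echo "hello" > "$src/site/index.html"

rsync -a "$src/site/" "$a/"   # contents land directly: $a/index.html
rsync -a "$src/site"  "$b/"   # directory recreated: $b/site/index.html
```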

Local Backup Example

# Backup /var/www to an external drive mounted at /mnt/backup
rsync -av --delete /var/www/ /mnt/backup/www/

Remote Backup Over SSH

# Backup to a remote server
rsync -avz --delete -e ssh /var/www/ user@backup-server:/backups/www/

# With a specific SSH key and non-standard port
rsync -avz --delete -e "ssh -i /root/.ssh/backup_key -p 2222" \
  /var/www/ user@backup-server:/backups/www/

For automated (unattended) remote backups, you need passwordless SSH authentication. Generate a dedicated key pair:

# Generate key on source server
ssh-keygen -t ed25519 -f /root/.ssh/backup_key -N ""

# Copy public key to destination
ssh-copy-id -i /root/.ssh/backup_key.pub user@backup-server
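Since this key grants unattended root-level access to the backup path, consider restricting what it can do on the destination. One common approach, assuming your rsync package ships the rrsync helper (recent Debian/Ubuntu install it as a command; older releases keep it under the rsync documentation scripts directory):

```
# ~/.ssh/authorized_keys on the backup server: pin the key to rsync
# access under /backups/www/ only, and disable interactive use.
# Use "rrsync -ro DIR" instead for read-only, pull-style backups.
command="rrsync /backups/www/",restrict ssh-ed25519 AAAA... backup_key
```

Now even if the key leaks, it can only run rsync against that one directory.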

Excluding Files and Directories

# Exclude cache, temp files, and logs
rsync -av --delete \
  --exclude='cache/' \
  --exclude='*.log' \
  --exclude='tmp/' \
  /var/www/ /mnt/backup/www/

For many exclusions, put them in a file:

# /etc/rsync-excludes.txt
cache/
*.log
tmp/
*.swp
.git/

# Reference the file with --exclude-from
rsync -av --delete --exclude-from=/etc/rsync-excludes.txt \
  /var/www/ /mnt/backup/www/

Writing a Production-Ready Backup Script

Running rsync directly from cron works, but a wrapper script gives you logging, error handling, and somewhere to add logic later without touching the crontab.

#!/bin/bash
# /usr/local/bin/backup-www.sh

SOURCE="/var/www/"
DEST="/mnt/backup/www/"
LOG="/var/log/rsync-backup.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')

echo "[$DATE] Starting backup" >> "$LOG"

rsync -av --delete \
  --exclude='cache/' \
  --exclude='*.log' \
  --log-file="$LOG" \
  "$SOURCE" "$DEST"

EXIT_CODE=$?

if [ $EXIT_CODE -eq 0 ]; then
  echo "[$DATE] Backup completed successfully" >> "$LOG"
else
  echo "[$DATE] Backup FAILED with exit code $EXIT_CODE" >> "$LOG"
fi

exit $EXIT_CODE

# Make it executable
chmod +x /usr/local/bin/backup-www.sh

# Test it manually first
/usr/local/bin/backup-www.sh

Scheduling with cron

Once the script works manually, hand it to cron.

# Edit the root crontab
crontab -e

Add your schedule. The cron fields are: minute, hour, day of month, month, day of week, followed by the command.

# Run daily at 2:30 AM
30 2 * * * /usr/local/bin/backup-www.sh

# Run every 6 hours
0 */6 * * * /usr/local/bin/backup-www.sh

# Run every Sunday at 1:00 AM
0 1 * * 0 /usr/local/bin/backup-www.sh
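If a run can outlast the gap between runs (large initial syncs, slow links), overlapping jobs can fight over the destination. Wrapping the command in flock, which ships with util-linux on most distributions, makes cron skip a run while the previous one still holds the lock:

```
# Skip this run if the previous backup is still going
0 */6 * * * flock -n /var/lock/backup-www.lock /usr/local/bin/backup-www.sh
```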

If you’re unsure about cron syntax, crontab.guru lets you paste an expression and see plain-English output.
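cron can also email a job's output to you, which turns silent failures into visible ones, assuming the server has working mail delivery (not a given on minimal installs). The address below is a placeholder:

```
# At the top of the crontab: mail any job output to this address
MAILTO=admin@example.com
30 2 * * * /usr/local/bin/backup-www.sh
```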

Verify cron Is Running Your Job

# Check system cron log (Debian/Ubuntu)
grep CRON /var/log/syslog | tail -20

# Check cron log (RHEL/AlmaLinux)
grep CRON /var/log/cron | tail -20

Keeping Multiple Backup Generations

Plain rsync gives you one copy — delete a file on the source and the next run removes it from the backup too. For versioned backups, use --backup with --backup-dir:

#!/bin/bash
# Versioned backup: keeps dated copies of changed/deleted files

SOURCE="/var/www/"
BACKUP_ROOT="/mnt/backup"
CURRENT="$BACKUP_ROOT/current"
DATED="$BACKUP_ROOT/versions/$(date +%Y-%m-%d)"

rsync -av --delete \
  --backup \
  --backup-dir="$DATED" \
  "$SOURCE" "$CURRENT/"

/mnt/backup/current/ stays as a live mirror. Any files that changed or were deleted land in a dated directory like /mnt/backup/versions/2026-03-08/. Point-in-time recovery without storing full copies every run.

Add a cleanup step to avoid accumulating years of version directories:

# Remove version directories older than 30 days
find "$BACKUP_ROOT/versions/" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf {} \;

Testing Restore — Don’t Skip This

Untested backups are just disk space you feel good about until something breaks. After setting this up, actually restore something:

# Restore a single file
rsync -av /mnt/backup/www/html/index.php /tmp/restore-test/

# Restore entire directory to a test location
rsync -av /mnt/backup/www/ /tmp/full-restore-test/

Do this quarterly at minimum. Backup processes drift — cron jobs get moved to different servers, mount points change, destination drives fill up and nobody notices. A restore test catches all of these before they matter.
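A cheap check to pair with restore tests: a dry run with --itemize-changes lists exactly what a real run would transfer, so an up-to-date backup should report next to nothing. Demonstrated with temp directories here; substitute your real source and destination paths:

```shell
# --dry-run (-n) plus --itemize-changes: see pending transfers
# without touching the destination
src=$(mktemp -d); dst=$(mktemp -d)
echo "data" > "$src/report.txt"
rsync -a "$src/" "$dst/"

echo "new" > "$src/extra.txt"   # simulate drift: the source moved on
rsync -an --delete --itemize-changes "$src/" "$dst/"   # lists extra.txt
[ ! -e "$dst/extra.txt" ] && echo "dry run changed nothing"
```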

Quick Reference

  • Always test your rsync command manually before adding to cron
  • Log output to a file — silent cron jobs hide failures
  • Use --dry-run to preview what rsync would transfer without actually doing it
  • For remote backups, set up a dedicated SSH key with restricted permissions on the destination
  • Check that your backup destination has enough free space. If it fills up, rsync fails mid-transfer, and you’ll only catch that if something is watching the logs.

# Dry run — see what would be transferred
rsync -av --dry-run --delete /var/www/ /mnt/backup/www/

# Check destination disk usage
df -h /mnt/backup
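The disk-space bullet can be automated as a pre-flight check near the top of the backup script. A sketch assuming GNU df (for the --output option); the 5 GB threshold is an arbitrary example, and the demo call below uses / so the snippet runs anywhere:

```shell
# check_free_space MOUNT MIN_KB: succeed if MOUNT has at least
# MIN_KB kilobytes available (GNU df assumed for --output=avail)
check_free_space() {
  local mount_point=$1 min_free_kb=$2
  local avail
  avail=$(df --output=avail -k "$mount_point" | tail -1 | tr -d ' ')
  [ "$avail" -ge "$min_free_kb" ]
}

# In backup-www.sh you would check /mnt/backup against a real
# threshold, e.g. $((5 * 1024 * 1024)) KB; demo call against /
if check_free_space / 1; then
  echo "destination has free space, proceeding"
fi
```

Aborting early beats discovering a half-written backup in the logs the next morning.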

Initial setup takes about 30 minutes. After that, it runs every night without you thinking about it. That’s the whole point.
