Claude 1207ee2838 Fix high-priority bugs and add comprehensive test coverage
This commit addresses 8 high-priority issues identified in code analysis.

Fixes #3, #4, #5, #7, #10, #19, #20, #21

## Critical Bug Fixes

1. **Race condition in lock file creation (#3)**
   - Changed to atomic file creation using os.O_CREAT | os.O_EXCL
   - Prevents two processes from both acquiring the lock
   - Location: curateipsum/backup.py:110-115

2. **Invalid lock file error handling (#4)**
   - Added try/except for corrupted/empty lock files
   - Gracefully removes corrupted locks and retries
   - Location: curateipsum/backup.py:121-133

3. **SIGKILL vs SIGTERM issue (#5)**
   - Now sends SIGTERM first for graceful shutdown
   - Waits 5 seconds before escalating to SIGKILL
   - Allows previous process to clean up resources
   - Location: curateipsum/backup.py:146-156

4. **Wrong stat object for permissions (#7)**
   - Fixed bug where dst_stat was used instead of src_stat
   - Permissions are now correctly updated during rsync
   - Location: curateipsum/fs.py:371

5. **os.chown() fails for non-root users (#10)**
   - Wrapped all os.chown() calls in try/except blocks
   - Logs debug message instead of crashing
   - Allows backups to succeed for non-root users
   - Locations: curateipsum/fs.py:217-221, 228-231, 383-387, 469-472
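
A minimal sketch of the locking approach described in fixes 1 and 2, assuming the lock file stores the owner's PID as plain text. The function names are hypothetical and are not taken from curateipsum/backup.py.

```python
import os


def try_acquire_lock(lock_path):
    """Atomically create the lock file; return True if this process won."""
    try:
        # O_EXCL makes the call fail if the file already exists, so two
        # processes can never both create the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))
    return True


def read_lock_pid(lock_path):
    """Return the PID stored in the lock, or None if missing or corrupted."""
    try:
        with open(lock_path) as lock_file:
            return int(lock_file.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def acquire_lock(lock_path):
    """Acquire the lock, removing an empty or corrupted lock file once."""
    if try_acquire_lock(lock_path):
        return True
    if read_lock_pid(lock_path) is None:
        # Assumption: an unreadable lock is stale debris, not a lock that
        # another process is still in the middle of writing.
        try:
            os.remove(lock_path)
        except FileNotFoundError:
            pass
        return try_acquire_lock(lock_path)
    return False
```

A caller would treat a False result as "another backup owns the lock" and then decide whether to exit or, with --force, to terminate the previous process.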

## Comprehensive Test Coverage

6. **Lock file tests (#19)**
   - Added TestBackupLock class with 7 test cases
   - Tests: creation, concurrent prevention, stale locks, corruption
   - Location: tests/test_backups.py:228-330

7. **Filesystem operation tests (#20)**
   - Added tests/test_fs_extended.py with 6 test classes
   - Tests: copy_file, copy_direntry, rsync, hardlink_dir, scantree, rm_direntry
   - 20 test cases covering normal and edge cases
   - Location: tests/test_fs_extended.py

8. **Integration tests (#21)**
   - Added tests/test_integration.py with 2 test classes
   - Tests full backup workflow end-to-end
   - Tests: incremental backups, hardlinks, delta dirs, cleanup, recovery
   - 14 test cases covering complete backup lifecycle
   - Location: tests/test_integration.py

## Test Results
All 68 tests pass successfully:
- 11 original backup cleanup tests
- 7 new lock file tests
- 16 original fs tests
- 20 new fs extended tests
- 14 new integration tests

## Impact
These fixes address critical bugs that could cause:
- Data corruption from concurrent backups
- Incomplete cleanup from forced process termination
- Permission sync failures
- Backups failing outright for non-root users
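
The forced-termination problem from fix 3 can be avoided with an escalation along these lines. This is a sketch only: the function names, the polling interval, and the string return values are assumptions. The 5-second wait is the one figure taken from the fix description.

```python
import os
import signal
import time


def pid_exists(pid):
    """Return True if a process with this PID is still present."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_gracefully(pid, timeout=5.0, poll_interval=0.1):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return "already_gone"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not pid_exists(pid):
            return "terminated"  # the process exited after SIGTERM
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        return "terminated"  # it exited just before the escalation
    return "killed"
```

SIGTERM can be caught, so the previous backup gets a chance to remove its lock file and temporary data. SIGKILL cannot be caught, which is why it is kept as the last resort.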

The comprehensive test coverage ensures these bugs are caught early
and provides confidence for future refactoring.
2026-02-03 22:06:35 -08:00
2021-11-12 20:50:32 +03:00

cura-te-ipsum

cura-te-ipsum is a space-efficient incremental backup utility for Linux and macOS that uses hardlinks to minimize storage usage while maintaining complete directory snapshots.

Similar to Time Machine or rsnapshot, cura-te-ipsum creates backups that appear as complete directory trees but share unchanged files between snapshots through hardlinks, so each additional snapshot only costs the space of the files that changed.

Features

  • Space-Efficient Incremental Backups: Creates full directory snapshots using hardlinks; unchanged files share inodes with previous backups
  • Intelligent Retention Policies: Automatic cleanup with configurable grandfather-father-son rotation (daily/weekly/monthly/yearly)
  • Pure Python Implementation: No external dependencies required for basic operation (optional rsync support available)
  • Delta Tracking: Automatically identifies and tracks changed files between backups
  • Backup Integrity: Lock files and completion markers prevent concurrent runs and identify incomplete backups
  • Safe Operations: Dry-run mode to preview changes before execution
  • Cross-Platform: Supports both Linux and macOS

Installation

From Source

git clone https://github.com/snegov/cura-te-ipsum.git
cd cura-te-ipsum
pip install .

Requirements

  • Python 3.6 or higher
  • Linux or macOS operating system
  • Optional: rsync and GNU cp for alternative implementation modes

Usage

Basic Backup

cura-te-ipsum -b /path/to/backups /path/to/source

This creates a timestamped backup in /path/to/backups/YYYY-MM-DD_HH-MM-SS/.

Multiple Sources

cura-te-ipsum -b /backups /home/user/documents /home/user/photos

Command-Line Options

cura-te-ipsum -b BACKUPS_DIR SOURCE [SOURCE ...]

Required Arguments:
  -b BACKUPS_DIR        Directory where backups will be stored
  SOURCE                One or more directories to back up

Optional Arguments:
  -n, --dry-run         Preview changes without creating backup
  -f, --force           Force run even if previous backup is in progress
  -v, --verbose         Enable debug logging
  --external-rsync      Use external rsync instead of Python implementation
  --external-hardlink   Use cp/gcp command for hardlinking

Examples

Dry run to preview changes:

cura-te-ipsum -b /backups /home/user/data --dry-run

Verbose output for debugging:

cura-te-ipsum -b /backups /home/user/data --verbose

Using external rsync:

cura-te-ipsum -b /backups /home/user/data --external-rsync

How It Works

cura-te-ipsum creates complete directory snapshots, but files that haven't changed between backups are hardlinked, so they share a single inode across snapshots. This means:

  • Each backup appears as a complete, browseable directory tree
  • Only changed or new files consume additional disk space
  • Deleting an old backup never affects other snapshots: a file's data is freed only when the last hardlink to it is removed
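
The inode sharing described above can be observed directly with the standard library. This is a small illustration, not project code; the function and file names are made up.

```python
import os


def demo_shared_inode(workdir):
    """Hardlink a file, delete the original name, and report what happened."""
    first = os.path.join(workdir, "snapshot1.txt")
    second = os.path.join(workdir, "snapshot2.txt")
    with open(first, "w") as handle:
        handle.write("unchanged data")

    os.link(first, second)  # second name for the same inode, no data copied

    stat_first = os.stat(first)
    stat_second = os.stat(second)
    same_inode = stat_first.st_ino == stat_second.st_ino
    links_before = stat_first.st_nlink

    os.remove(first)  # drops one name; the data stays reachable via `second`
    links_after = os.stat(second).st_nlink
    with open(second) as handle:
        surviving_content = handle.read()

    return same_inode, links_before, links_after, surviving_content
```

Called on an empty directory on a filesystem that supports hardlinks, this returns (True, 2, 1, "unchanged data"): both names pointed at one inode, and removing one name left the content intact.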

Backup Process

  1. Lock Acquisition: Creates .backups_lock to prevent concurrent operations
  2. Hardlink Creation: Hardlinks all files from the most recent backup
  3. Rsync Sync: Syncs source directories to the new backup, updating changed files
  4. Delta Tracking: Copies changed/new files to .backup_delta directory
  5. Completion Marker: Creates .backup_finished marker file
  6. Cleanup: Removes old backups based on retention policy
  7. Lock Release: Removes lock file
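
Steps 2 and 3 can be sketched as follows. This is an illustration under simplifying assumptions, not the project's implementation: the helper names are hypothetical, symlinks and special files are ignored, and the marker and delta entries of the previous snapshot are not filtered out.

```python
import os
import shutil


def hardlink_tree(prev_snapshot, new_snapshot):
    """Recreate the directory layout and hardlink every file into it."""
    for root, _dirs, files in os.walk(prev_snapshot):
        relative = os.path.relpath(root, prev_snapshot)
        target_dir = os.path.join(new_snapshot, relative)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target_dir, name))


def replace_changed_file(src_file, dst_file):
    """Update a file in the new snapshot without touching older snapshots.

    Writing into dst_file in place would modify the inode that older
    snapshots share. Copying to a temporary name and renaming it over
    dst_file gives the new snapshot a fresh inode instead.
    """
    temporary = dst_file + ".tmp"
    shutil.copy2(src_file, temporary)
    os.replace(temporary, dst_file)
```

The design point is the rename in replace_changed_file: the older snapshot keeps its original inode and content, while the new snapshot gets its own copy of the changed file.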

Retention Policy

Default retention (configurable in code):

  • 7 days: Keep all backups
  • 30 days: Keep one backup per day
  • 52 weeks: Keep one backup per week
  • 12 months: Keep one backup per month
  • 5+ years: Keep one backup per year

The cleanup process never deletes the only remaining backup.
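
One way to read the tiers above is sketched below. It rests on several assumptions that the list does not settle: the tier windows are stacked one after another, "12 months" is approximated as 365 days, everything older falls into the yearly tier, and the newest backup in each bucket is the one kept. The function name is hypothetical.

```python
from datetime import datetime, timedelta

# (window length, bucket key) per tier, in order of increasing age.
TIERS = [
    (timedelta(days=7), lambda ts: ts),                             # keep all
    (timedelta(days=30), lambda ts: ts.date()),                     # one per day
    (timedelta(weeks=52), lambda ts: tuple(ts.isocalendar())[:2]),  # one per ISO week
    (timedelta(days=365), lambda ts: (ts.year, ts.month)),          # one per month
]


def select_backups_to_keep(timestamps, now):
    """Return the set of backup timestamps that the policy retains."""
    newest_in_bucket = {}
    # Ascending order means a later backup overwrites an earlier one
    # that landed in the same bucket.
    for ts in sorted(timestamps):
        age = now - ts
        boundary = timedelta(0)
        bucket = ("yearly", ts.year)  # used when no tier window matches
        for index, (span, key) in enumerate(TIERS):
            boundary += span
            if age <= boundary:
                bucket = (index, key(ts))
                break
        newest_in_bucket[bucket] = ts
    # Every backup lands in some bucket, so a non-empty input always
    # yields at least one kept backup.
    return set(newest_in_bucket.values())
```

Backups not in the returned set are the candidates for deletion.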

Backup Structure

backups/
  2025-01-15_10-30-00/          # backup snapshot
    .backup_finished            # completion marker
    .backup_delta/              # changed files in this backup
    [your backed up files]      # complete directory tree
  2025-01-16_10-30-00/
    .backup_finished
    .backup_delta/
    [your backed up files]
  .backups_lock                 # lock file (only during backup)
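
Given this layout, the most recent complete backup can be located by parsing directory names and checking for the completion marker. The sketch below assumes the YYYY-MM-DD_HH-MM-SS naming shown above; the function and constant names are hypothetical.

```python
import os
from datetime import datetime

SNAPSHOT_FORMAT = "%Y-%m-%d_%H-%M-%S"
FINISHED_MARKER = ".backup_finished"


def latest_finished_backup(backups_dir):
    """Return the path of the newest snapshot that has a completion marker."""
    candidates = []
    for entry in os.scandir(backups_dir):
        if not entry.is_dir():
            continue  # skips the lock file and any other stray files
        try:
            stamp = datetime.strptime(entry.name, SNAPSHOT_FORMAT)
        except ValueError:
            continue  # not a snapshot directory
        if os.path.exists(os.path.join(entry.path, FINISHED_MARKER)):
            candidates.append((stamp, entry.path))
    return max(candidates)[1] if candidates else None
```

A snapshot directory without the marker is treated as incomplete and is never chosen as the base for the next backup.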

Development

Running Tests

pip install -r requirements-dev.txt
pytest

CI/CD

GitHub Actions automatically runs tests on Python 3.6 through 3.11 for every push and pull request.

Author

Maks Snegov (snegov@spqr.link)

Project Status

Development Status: Pre-Alpha

This project is actively maintained and used in production for personal backups, but the API and configuration options may change in future releases.
