Add comprehensive README documentation

- Project overview and features
- Installation and usage instructions
- Detailed explanation of how hardlink-based backups work
- Command-line options and examples
- Backup structure and retention policy details
- Development and testing information
Claude 2025-11-08 07:20:34 +00:00 committed by Maks Snegov
parent 6d6038b027
commit 6839d21a77
2 changed files with 153 additions and 2 deletions


@@ -9,9 +9,9 @@ jobs:
 matrix:
 python-version: ["3.6", "3.7", "3.8", "3.9", "3.10", "3.11"]
 steps:
-- uses: actions/checkout@v2
+- uses: actions/checkout@v4
 - name: Set up Python ${{ matrix.python-version }}
-uses: actions/setup-python@v2
+uses: actions/setup-python@v4
 with:
 python-version: ${{ matrix.python-version }}
 - name: Install dependencies

README.md

@@ -0,0 +1,151 @@
# cura-te-ipsum
**cura-te-ipsum** is a space-efficient incremental backup utility for Linux and macOS that uses hardlinks to minimize storage usage while maintaining complete directory snapshots.
Similar to Time Machine or rsnapshot, cura-te-ipsum creates backups that appear as complete directory trees but share unchanged files between snapshots, so each new snapshot uses additional space only for new or changed files.
## Features
- **Space-Efficient Incremental Backups**: Creates full directory snapshots using hardlinks; unchanged files share inodes with previous backups
- **Intelligent Retention Policies**: Automatic cleanup with configurable grandfather-father-son rotation (daily/weekly/monthly/yearly)
- **Pure Python Implementation**: No external dependencies required for basic operation (optional rsync support available)
- **Delta Tracking**: Automatically identifies and tracks changed files between backups
- **Backup Integrity**: Lock files and completion markers prevent concurrent runs and identify incomplete backups
- **Safe Operations**: Dry-run mode to preview changes before execution
- **Cross-Platform**: Supports both Linux and macOS
## Installation
### From Source
```bash
git clone https://github.com/snegov/cura-te-ipsum.git
cd cura-te-ipsum
pip install .
```
### Requirements
- Python 3.6 or higher
- Linux or macOS operating system
- Optional: `rsync` and GNU `cp` for alternative implementation modes
## Usage
### Basic Backup
```bash
cura-te-ipsum -b /path/to/backups /path/to/source
```
This creates a timestamped backup in `/path/to/backups/YYYY-MM-DD_HH-MM-SS/`.
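The directory name follows the `YYYY-MM-DD_HH-MM-SS` pattern. As a minimal sketch, assuming the name is derived from the time the run starts, such a name can be produced and parsed back as follows. The helpers `snapshot_dir_name` and `parse_snapshot_dir_name` are hypothetical and not part of the tool.

```python
from datetime import datetime

# Format string assumed from the documented pattern YYYY-MM-DD_HH-MM-SS.
SNAPSHOT_FORMAT = "%Y-%m-%d_%H-%M-%S"


def snapshot_dir_name(when):
    """Return the directory name for a backup started at `when` (hypothetical helper)."""
    return when.strftime(SNAPSHOT_FORMAT)


def parse_snapshot_dir_name(name):
    """Inverse of snapshot_dir_name; raises ValueError for names that do not match."""
    return datetime.strptime(name, SNAPSHOT_FORMAT)


print(snapshot_dir_name(datetime(2025, 1, 15, 10, 30, 0)))  # 2025-01-15_10-30-00
```

Because every field is zero-padded, names in this format sort chronologically when compared as plain strings.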
### Multiple Sources
```bash
cura-te-ipsum -b /backups /home/user/documents /home/user/photos
```
### Command-Line Options
```
cura-te-ipsum -b BACKUPS_DIR SOURCE [SOURCE ...]

Required Arguments:
  -b BACKUPS_DIR        Directory where backups will be stored
  SOURCE                One or more directories to back up

Optional Arguments:
  -n, --dry-run         Preview changes without creating a backup
  -f, --force           Force run even if a previous backup is in progress
  -v, --verbose         Enable debug logging
  --external-rsync      Use external rsync instead of the Python implementation
  --external-hardlink   Use cp/gcp command for hardlinking
```
### Examples
**Dry run to preview changes:**
```bash
cura-te-ipsum -b /backups /home/user/data --dry-run
```
**Verbose output for debugging:**
```bash
cura-te-ipsum -b /backups /home/user/data --verbose
```
**Using external rsync:**
```bash
cura-te-ipsum -b /backups /home/user/data --external-rsync
```
## How It Works
### Hardlink-Based Snapshots
cura-te-ipsum creates complete directory snapshots, but files that haven't changed between backups are hardlinked, so every snapshot entry for such a file points to the same inode. This means:
- Each backup appears as a complete, browseable directory tree
- Only changed or new files consume additional disk space
- Deleting an old backup never affects other snapshots; a file's data is freed only when its last hardlink is removed
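The inode sharing described above can be observed with the standard library alone. The following is a small illustration, not code from the project: the `same_inode` helper and the `snapshot-1`/`snapshot-2` directory names are made up for the demo, and it assumes a filesystem that supports hardlinks.

```python
import os
import tempfile


def same_inode(path_a, path_b):
    """True when both paths are hardlinks to the same file on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "snapshot-1", "notes.txt")
    new = os.path.join(root, "snapshot-2", "notes.txt")
    os.makedirs(os.path.dirname(old))
    os.makedirs(os.path.dirname(new))

    with open(old, "w") as fh:
        fh.write("unchanged content\n")

    # An unchanged file is linked, not copied: no extra data blocks are used.
    os.link(old, new)
    print(same_inode(old, new))    # True
    print(os.stat(new).st_nlink)   # 2

    # Removing the older snapshot's entry leaves the newer one fully readable;
    # the data is freed only when the last link disappears.
    os.remove(old)
    with open(new) as fh:
        print(fh.read().strip())   # unchanged content
    print(os.stat(new).st_nlink)   # 1
```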
### Backup Process
1. **Lock Acquisition**: Creates `.backups_lock` to prevent concurrent operations
2. **Hardlink Creation**: Hardlinks all files from the most recent backup into the new backup directory
3. **Rsync Sync**: Syncs source directories to the new backup, updating changed files
4. **Delta Tracking**: Copies changed/new files to the `.backup_delta` directory
5. **Completion Marker**: Creates `.backup_finished` marker file
6. **Cleanup**: Removes old backups based on retention policy
7. **Lock Release**: Removes lock file
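The ordering of steps 1, 2, 5, and 7 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual code: `backups_lock`, `hardlink_tree`, `mark_finished`, and `run_backup` are hypothetical names, the lock is assumed to be a plain file created with exclusive-create semantics, and the previous snapshot's marker and delta directory are assumed to be skipped when linking. Symlinks, special files, syncing, delta tracking, and cleanup are left out.

```python
import os
from contextlib import contextmanager

LOCK_NAME = ".backups_lock"          # file names taken from the steps above
FINISHED_NAME = ".backup_finished"
DELTA_NAME = ".backup_delta"


@contextmanager
def backups_lock(backups_dir):
    """Hold an exclusive lock file for the duration of a backup run."""
    lock_path = os.path.join(backups_dir, LOCK_NAME)
    # O_EXCL makes creation fail with FileExistsError if the lock already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        yield lock_path
    finally:
        os.remove(lock_path)


def hardlink_tree(src, dst, skip=(FINISHED_NAME, DELTA_NAME)):
    """Recreate the tree of `src` under `dst`, hardlinking every file.

    Top-level entries named in `skip` are not carried over (an assumption).
    """
    for dirpath, dirnames, filenames in os.walk(src):
        if dirpath == src:
            dirnames[:] = [d for d in dirnames if d not in skip]
            filenames = [f for f in filenames if f not in skip]
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))


def mark_finished(backup_dir):
    """Create the completion marker once every earlier step has succeeded."""
    with open(os.path.join(backup_dir, FINISHED_NAME), "w"):
        pass


def run_backup(backups_dir, previous, new):
    """Steps 1, 2, 5 and 7; steps 3, 4 and 6 are omitted from this sketch."""
    with backups_lock(backups_dir):
        hardlink_tree(previous, new)
        # ... sync sources and record the delta here ...
        mark_finished(new)
```

In this sketch, an exception during the run still releases the lock but skips the marker, so the interrupted snapshot can be recognized as incomplete later.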
### Retention Policy
Default retention (configurable in code):
- **7 days**: Keep all backups
- **30 days**: Keep one backup per day
- **52 weeks**: Keep one backup per week
- **12 months**: Keep one backup per month
- **5+ years**: Keep one backup per year
The cleanup process never deletes the only remaining backup.
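The "one backup per day/week/month/year" rules amount to grouping backups into calendar buckets and keeping a single backup from each. The sketch below shows only that thinning step for one granularity. It does not model how the age windows listed above are combined. The function names are hypothetical, and keeping the newest backup in each bucket is an assumption, since the policy does not say which one survives.

```python
from datetime import datetime


def bucket_key(ts, period):
    """Map a timestamp to the day, ISO week, month, or year it falls in."""
    if period == "day":
        return ts.date()
    if period == "week":
        return tuple(ts.isocalendar())[:2]   # (ISO year, ISO week number)
    if period == "month":
        return (ts.year, ts.month)
    if period == "year":
        return ts.year
    raise ValueError("unknown period: %r" % (period,))


def thin(timestamps, period):
    """Keep one backup per period; the newest in each bucket wins (an assumption)."""
    kept = {}
    for ts in sorted(timestamps):
        kept[bucket_key(ts, period)] = ts    # later timestamps overwrite earlier ones
    return sorted(kept.values())


backups = [
    datetime(2025, 1, 15, 10, 30),
    datetime(2025, 1, 15, 18, 0),
    datetime(2025, 1, 16, 10, 30),
]
print([ts.isoformat() for ts in thin(backups, "day")])
# ['2025-01-15T18:00:00', '2025-01-16T10:30:00']
```

Thinning a non-empty list always leaves at least one entry, which is consistent with the rule that the only remaining backup is never deleted.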
## Backup Structure
```
backups/
2025-01-15_10-30-00/ # backup snapshot
.backup_finished # completion marker
.backup_delta/ # changed files in this backup
[your backed up files] # complete directory tree
2025-01-16_10-30-00/
.backup_finished
.backup_delta/
[your backed up files]
.backups_lock # lock file (only during backup)
```
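Given this layout, a script can tell complete snapshots from interrupted ones by checking for the marker file. The following is a minimal sketch that assumes only the directory layout shown above. The `completed_backups` helper is hypothetical and not part of the tool.

```python
import os


def completed_backups(backups_dir):
    """Return names of snapshot directories that contain `.backup_finished`, oldest first."""
    names = []
    with os.scandir(backups_dir) as entries:
        for entry in entries:
            marker = os.path.join(entry.path, ".backup_finished")
            if entry.is_dir() and os.path.exists(marker):
                names.append(entry.name)
    # Zero-padded timestamp names sort chronologically as plain strings.
    return sorted(names)
```

For the tree above, `completed_backups("backups")` would return both snapshot names. The `.backups_lock` file is ignored because it is not a directory.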
## Development
### Running Tests
```bash
pip install -r requirements-dev.txt
pytest
```
### CI/CD
GitHub Actions automatically runs tests on Python 3.6 through 3.11 for every push and pull request.
## Author
Maks Snegov (<snegov@spqr.link>)
## Project Status
Development Status: Pre-Alpha
This project is actively maintained and used in production for personal backups, but the API and configuration options may change in future releases.