Yearly retention overrides monthly retention, violating keep_monthly guarantee #32

Open
opened 2026-02-05 04:42:03 +00:00 by snegov · 0 comments
Owner

Problem

When retention policies are combined, coarser tiers can remove backups already protected by finer tiers, violating documented guarantees. Affects multiple tier interactions, not just yearly/monthly.

Expected Behavior

Per docstrings: retention policies should honor their thresholds

  • keep_daily=7 should keep 7 days
  • keep_weekly=4 should keep 4 weeks
  • keep_monthly=6 should keep 6 months
  • keep_yearly=3 should keep 3 years

Actual Behavior - THREE BUGS FOUND

Bug 1: Weekly overrides daily

# Scenario: keep_all=2, keep_daily=7, keep_weekly=4
# Daily threshold: Nov 4 (should keep Nov 4-8)
# Weekly threshold: Oct 11

# Nov 4 backup:
#   ✓ Protected by daily threshold (Nov 4 >= threshold)
#   ✗ Removed by weekly tier (same week as Nov 3)
# Result: Only 4 daily backups instead of 5

Bug 2: Monthly overrides weekly

# Weekly threshold: Oct 11 (should keep weeks 42-44)
# Monthly threshold: May 1

# Oct 11 backup:
#   ✓ Protected by weekly threshold (Oct 11 >= threshold)
#   ✗ Removed by monthly tier (same month as Oct 1)
# Result: Only 3 weekly backups instead of 4

Bug 3: Yearly overrides monthly

# Monthly threshold: May 1 (should keep May-Oct = 6 months)
# Yearly threshold: Jan 1 2018

# May 15 backup:
#   ✓ Protected by monthly threshold (May 15 >= threshold)
#   ✗ Removed by yearly tier (same year as Jan 15)
# Result: Only 5 monthly backups instead of 6

Root Cause

Algorithm in backup.py:258-296 processes backups newest→oldest:

  1. Backup in finer tier → mark as keep, set as prev, continue
  2. Later backup in coarser tier → can mark prev for removal
  3. Coarser tier check doesn't verify if prev is protected by finer threshold
# At daily backup (Nov 4):
if backup.name > thresholds["daily"]:  # YES
    # Keep it, continue
    prev_backup = backup  # nov_04
    
# At weekly backup (Nov 3):
if backup.name > thresholds["weekly"]:  # YES
    # Same week as prev (Nov 4)? YES
    to_remove[prev_backup] = True  # Removes Nov 4!

Reproduction

Run test added in commit (updated):

pytest tests/test_backups.py::TestBackupCleanup::test_combined_retention_policies -v

Test now validates expected behavior and catches all 3 bug patterns.

Impact

  • Users get fewer backups than configured across ALL retention tiers
  • Violates documented retention guarantees for daily/weekly/monthly
  • Affects any combination of retention policies
  • Expected: 24 backups (5+5+4+6+4), Actual: 21 backups (5+4+3+5+4)

Proposed Fix

Prevent coarser retention tiers from removing backups within finer thresholds:

# When in coarser tier (e.g., weekly), check if prev is protected:
if backup.name > thresholds["weekly"]:
    # Check all finer thresholds
    if prev_backup.name > thresholds["daily"]:
        # Don't remove - protected by daily retention
        prev_backup = backup
        continue
    
    # Only remove if not protected by finer tier
    if same_week(prev_backup, backup):
        to_remove[prev_backup] = True

Context

Discovered while adding comprehensive combined retention policy test.
Initial investigation found monthly/yearly bug, deeper analysis revealed
systematic issue affecting all tier interactions.

## Problem When retention policies are combined, coarser tiers can remove backups already protected by finer tiers, violating documented guarantees. **Affects multiple tier interactions**, not just yearly/monthly. ## Expected Behavior Per docstrings: retention policies should honor their thresholds - `keep_daily=7` should keep 7 days - `keep_weekly=4` should keep 4 weeks - `keep_monthly=6` should keep 6 months - `keep_yearly=3` should keep 3 years ## Actual Behavior - THREE BUGS FOUND ### Bug 1: Weekly overrides daily ```python # Scenario: keep_all=2, keep_daily=7, keep_weekly=4 # Daily threshold: Nov 4 (should keep Nov 4-8) # Weekly threshold: Oct 11 # Nov 4 backup: # ✓ Protected by daily threshold (Nov 4 >= threshold) # ✗ Removed by weekly tier (same week as Nov 3) # Result: Only 4 daily backups instead of 5 ``` ### Bug 2: Monthly overrides weekly ```python # Weekly threshold: Oct 11 (should keep weeks 42-44) # Monthly threshold: May 1 # Oct 11 backup: # ✓ Protected by weekly threshold (Oct 11 >= threshold) # ✗ Removed by monthly tier (same month as Oct 1) # Result: Only 3 weekly backups instead of 4 ``` ### Bug 3: Yearly overrides monthly ```python # Monthly threshold: May 1 (should keep May-Oct = 6 months) # Yearly threshold: Jan 1 2018 # May 15 backup: # ✓ Protected by monthly threshold (May 15 >= threshold) # ✗ Removed by yearly tier (same year as Jan 15) # Result: Only 5 monthly backups instead of 6 ``` ## Root Cause Algorithm in `backup.py:258-296` processes backups newest→oldest: 1. Backup in finer tier → mark as keep, set as prev, continue 2. Later backup in coarser tier → can mark prev for removal 3. Coarser tier check doesn't verify if prev is protected by finer threshold ```python # At daily backup (Nov 4): if backup.name > thresholds["daily"]: # YES # Keep it, continue prev_backup = backup # nov_04 # At weekly backup (Nov 3): if backup.name > thresholds["weekly"]: # YES # Same week as prev (Nov 4)? YES to_remove[prev_backup] = True # Removes Nov 4! ``` ## Reproduction Run test added in commit (updated): ```bash pytest tests/test_backups.py::TestBackupCleanup::test_combined_retention_policies -v ``` Test now validates expected behavior and catches all 3 bug patterns. ## Impact - Users get fewer backups than configured across ALL retention tiers - Violates documented retention guarantees for daily/weekly/monthly - Affects any combination of retention policies - Expected: 24 backups (5+5+4+6+4), Actual: 21 backups (5+4+3+5+4) ## Proposed Fix Prevent coarser retention tiers from removing backups within finer thresholds: ```python # When in coarser tier (e.g., weekly), check if prev is protected: if backup.name > thresholds["weekly"]: # Check all finer thresholds if prev_backup.name > thresholds["daily"]: # Don't remove - protected by daily retention prev_backup = backup continue # Only remove if not protected by finer tier if same_week(prev_backup, backup): to_remove[prev_backup] = True ``` ## Context Discovered while adding comprehensive combined retention policy test. Initial investigation found monthly/yearly bug, deeper analysis revealed systematic issue affecting all tier interactions.
Sign in to join this conversation.
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: snegov/cura-te-ipsum#32