cleanup_old_backups() loads all backups into memory #12

Open
opened 2025-11-15 03:44:14 +00:00 by snegov · 0 comments
Owner

cleanup_old_backups() loads all backups into memory

Priority: Low
Component: backup.py
Type: Performance

Description

The cleanup function loads all backup entries into memory as a list, even when only iterating through them once. For backup directories with thousands of backups, this is inefficient.

Location

curateipsum/backup.py:179-180

Current Code

all_backups = sorted(_iterate_backups(backups_dir),
                     key=lambda e: e.name, reverse=True)

Problem

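sorted() must consume the entire iterable and build a list before it returns, so every backup entry is held in memory at the same time. Each entry is a full DirEntry object, which can also carry cached stat information, so the per-entry cost is higher than that of a bare name.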
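
A small standalone illustration of that behaviour, not taken from backup.py. The generator `fake_backup_names` is a made-up stand-in for `_iterate_backups()` and yields plain strings rather than DirEntry objects:

```python
def fake_backup_names():
    """Hypothetical stand-in for _iterate_backups(); yields names lazily."""
    yield from ("20250103", "20250101", "20250102")


gen = fake_backup_names()
# sorted() drains the generator and returns a fully built list.
all_backups = sorted(gen, reverse=True)

print(type(all_backups).__name__)  # list
print(next(gen, None))             # None: the generator is exhausted
```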

Proposed Solution

If the backup count is reasonable (<1000), the current approach is fine. If needed, it could be optimized in one of these ways:

  1. Use a generator-based approach for very large backup sets
  2. Only load backup names and dates, not full DirEntry objects
  3. Process in chunks
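
Option 2 might look roughly like the sketch below. The names `iterate_backup_names` and `sorted_backup_names` are hypothetical and do not exist in curateipsum. The sketch also assumes that backups are plain subdirectories whose names sort chronologically, which the name-based sort key in the current code suggests but this issue does not confirm.

```python
import os
from typing import Iterator, List


def iterate_backup_names(backups_dir: str) -> Iterator[str]:
    """Yield only the names of subdirectories in backups_dir.

    Hypothetical helper: each DirEntry is dropped as soon as its name
    has been read, so no entry objects are retained.
    """
    with os.scandir(backups_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield entry.name


def sorted_backup_names(backups_dir: str) -> List[str]:
    """Return backup names, newest first (assuming names sort by date)."""
    # sorted() still builds a list, but of short strings only.
    return sorted(iterate_backup_names(backups_dir), reverse=True)
```

This does not remove the list itself, since any full sort needs all keys at once. It only reduces what each element of that list holds.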

Impact

Low - Only becomes a problem with thousands of backups, which is unlikely in normal use.

Reference: snegov/cura-te-ipsum#12