
June 15 — the 462 GB move that took 0.09 seconds, and two review tools

This was a four-arc day — a storage trick that saved hours, a backup that nearly tripped over itself, and two interactive review tools. The throughline: do the irreversible things slowly and the reversible things fast.

The move that moved nothing

I needed to relocate ~150 large videos — about 462 GB — from one folder to another. The obvious copy-and-verify approach was a 12–18 hour job at the share's throughput. Then I noticed both folders live on the same underlying volume, which means a move is just a rename — zero bytes travel.

The risk is that a "move" silently degrades into a slow full copy across a device boundary and you don't notice until hours in. So I proved it first. os.rename raises an error on a cross-device move (an OSError, with errno EXDEV on POSIX systems) rather than quietly copying, which makes it a perfect instant-or-fail probe: I ran it on the single largest file (~22 GB) and timed it at 0.092 seconds. A real copy would've taken over an hour, so that sub-second result is definitive proof it was a true server-side rename, not a copy. (shutil.move would have hidden the fallback, because it copies and then deletes when a plain rename fails; os.rename is the one to reach for when you specifically want to prove no bytes moved.)
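For reference, a minimal sketch of that instant-or-fail probe. The function name and the paths you pass in are placeholders of mine, not the script actually used that day; the only real API involved is os.rename, which either renames in place or raises.

```python
import os
import time


def probe_rename(src: str, dst: str) -> float:
    """Rename src to dst and return the elapsed time in seconds.

    os.rename never falls back to copying: if the two paths are on
    different devices it raises OSError (errno.EXDEV on POSIX), so a
    normal return means no file data was transferred.
    """
    start = time.perf_counter()
    os.rename(src, dst)  # raises instead of copying
    return time.perf_counter() - start
```

Run it on the largest file first. If it returns in well under a second, the rest of the set can go the same way; if it raises, nothing has been touched and you fall back to a planned copy.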

Then the whole set renamed in about three seconds, and I repointed the database paths in a single guarded transaction — set to roll back unless exactly the expected number of rows changed, with a verified snapshot taken first. This was the one genuinely irreversible step of the day (the originals get destroyed), so it got the proof, the snapshot, and the row-count guard. Hours to seconds, with a rollback in my pocket.
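The row-count guard is easy to sketch with the stdlib sqlite3 module. Everything specific here is an assumption for illustration: the table name media, the column path, and the choice of SQLite itself. The shape is the point: update, compare the row count to what you expected, and roll back on any mismatch.

```python
import sqlite3


def repoint_paths(conn, old_prefix, new_prefix, expected_rows):
    """Swap a path prefix; commit only if exactly expected_rows changed.

    Assumes a hypothetical table media(path TEXT). Any mismatch or error
    rolls the transaction back and re-raises.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "UPDATE media SET path = ? || substr(path, ?) "
            "WHERE substr(path, 1, ?) = ?",
            (new_prefix, len(old_prefix) + 1, len(old_prefix), old_prefix),
        )
        if cur.rowcount != expected_rows:
            raise RuntimeError(
                f"expected {expected_rows} rows, changed {cur.rowcount}"
            )
        conn.commit()
    except Exception:
        conn.rollback()  # leave the database exactly as it was
        raise
    return cur.rowcount
```

The expected count comes from the rename step itself (how many files actually moved), so a stray extra match or a missed row stops the commit instead of quietly corrupting paths.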

A backup that almost trampled itself

I set up an off-site backup of the large work-product video folders to cloud storage. Babysitting the transfer, I caught a real hazard: two copy processes were running against the same destination at once — the intended one plus an accidental second launch. That's dangerous for two reasons: the destination allows multiple files with the same name in one folder (so you can get silent duplicates), and it double-spends the daily upload quota. Worse, the obvious verification — a one-way check — is structurally blind to duplicates, because it only flags files that are missing, never extras.
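To make the blind spot concrete, here is a small sketch that works on plain lists of file names. How you obtain those listings depends on your copy tool, so that part is left out, and both function names are mine.

```python
from collections import Counter


def missing_at_destination(source_names, dest_names):
    """One-way check: names present in the source but absent remotely."""
    return sorted(set(source_names) - set(dest_names))


def duplicated_at_destination(dest_names):
    """Names that appear more than once in the destination listing."""
    counts = Counter(dest_names)
    return {name: n for name, n in counts.items() if n > 1}
```

With a source of a.mp4 and b.mp4 and a destination holding a.mp4 twice plus b.mp4, the one-way check reports nothing missing, while the duplicate check flags a.mp4. You need both to call the backup clean.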

So I killed the duplicate and replaced hand-babysitting with an orchestrator: wait for the live copy to finish, idempotently re-run to guarantee completion (the copy tool skips already-uploaded files), run the second folder strictly afterward, and verify at the end. Retries use cause-aware backoff: a long pause on a quota hit, a short pause on a transient error. The headless OAuth setup had the usual sharp edges, the kind where the consent screen and actually enabling the API are two separate steps, both of which you have to get right.
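A rough sketch of the retry core of such an orchestrator. It is not the real script: the pause lengths are guesses, and job and classify are hypothetical callables standing in for "run the copy tool" and "decide from its output why it failed".

```python
import time

QUOTA_PAUSE_S = 6 * 60 * 60  # assumed: wait hours for the daily quota
TRANSIENT_PAUSE_S = 60       # assumed: short wait for a flaky connection


def run_until_done(job, classify, sleep=time.sleep, max_attempts=20):
    """Call job() until it succeeds; return the number of attempts used.

    job() returns (exit_code, output_text). classify(output_text)
    returns "quota" or "transient" and picks the pause length.
    """
    for attempt in range(1, max_attempts + 1):
        code, output = job()
        if code == 0:
            return attempt
        is_quota = classify(output) == "quota"
        sleep(QUOTA_PAUSE_S if is_quota else TRANSIENT_PAUSE_S)
    raise RuntimeError(f"gave up after {max_attempts} attempts")


def back_up(jobs, verify, classify, sleep=time.sleep):
    """Run each folder's job strictly in order, re-run once, then verify."""
    for job in jobs:
        run_until_done(job, classify, sleep)
        # Safe only because the copy tool skips files already uploaded.
        run_until_done(job, classify, sleep)
    return verify()
```

Passing sleep in as a parameter keeps the loop testable without waiting hours. Running the jobs from one sequential loop is also what prevents two copies from ever hitting the same destination at once.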

Two review tools

With candidates and protected sets sorted, I needed to actually decide. So I built an interactive browser tool: stdlib-only, decisions written to a separate database, three states (keep / cut / undecided), and — the part I care most about — nothing is ever deleted. A "cut" only adds an ID to an exported list. I reorganized it from a rule-bucket grid into a time-based view (month → week → day) with a swipe-to-decide focus mode and a warm "darkroom" skin, leading each item with when and where it was taken, because for personal photos that's the real keep/cut anchor. Full-resolution originals stay on my home network; only thumbnails go over the public path.
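The "nothing is ever deleted" rule amounts to a very small data model. Below is a stdlib-only sketch of what a separate decisions database could look like; the table name, column names, and function names are illustrative, not the tool's actual schema.

```python
import sqlite3

STATES = ("keep", "cut", "undecided")


def open_decisions(path=":memory:"):
    """Open (or create) a decisions database kept apart from the library."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS decisions ("
        "item_id TEXT PRIMARY KEY, "
        "state TEXT NOT NULL "
        "CHECK (state IN ('keep', 'cut', 'undecided')))"
    )
    return conn


def decide(conn, item_id, state):
    """Record or overwrite the decision for one item. Touches no files."""
    if state not in STATES:
        raise ValueError(f"unknown state: {state!r}")
    conn.execute(
        "INSERT OR REPLACE INTO decisions (item_id, state) VALUES (?, ?)",
        (item_id, state),
    )
    conn.commit()


def export_cut_list(conn):
    """Return the IDs currently marked 'cut'; acting on them is separate."""
    query = "SELECT item_id FROM decisions WHERE state = 'cut' ORDER BY item_id"
    return [row[0] for row in conn.execute(query)]
```

Because a cut is only a row in this table, changing your mind is one more call to decide, and the exported list is the sole thing any later cleanup step would ever read.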

Then a sibling tool for reviewing the entire library day by day. The key adaptation for ~91,000 items is a "mark the rest of this day kept" sweep: you swipe-cut the few you want gone, then one tap closes the day. Thumbnails generate on demand as you open each day (no pre-rendering 91k of them) into a shared cache, so overlapping items are free.
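Both ideas fit in a few lines. This sketch uses plain dicts in place of the real decisions store and thumbnail cache, and render stands in for whatever actually produces a thumbnail; all names are hypothetical.

```python
def close_day(day_item_ids, decisions):
    """Mark every still-undecided item of one day as kept.

    decisions maps item_id -> "keep" | "cut" | "undecided"; items with no
    entry count as undecided. Returns how many items were swept to keep.
    """
    swept = 0
    for item_id in day_item_ids:
        if decisions.get(item_id, "undecided") == "undecided":
            decisions[item_id] = "keep"
            swept += 1
    return swept


def thumbnail_for(item_id, cache, render):
    """Return a cached thumbnail, rendering it only on first request."""
    if item_id not in cache:
        cache[item_id] = render(item_id)
    return cache[item_id]
```

The sweep never overrides an explicit cut, which is what makes the one-tap close safe. Sharing one cache dict (or one cache directory) between the two tools is what makes overlapping items free.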

Build lessons

Learned