Why Apple Review can't be your QA (and what to do about it)

A user pinged me with a screenshot. He had tapped 从 iCloud 恢复 (“Restore from iCloud”) in one of my apps, 鱼缸管家 (AquaLog), an aquarium log, nothing exotic, and the screen showed:

Error fetching record <CKRecordID: 0x799aa4600;
recordName=aqualog-doc-_a599d8e4bae73ebf8e6bea10228037c6,
zoneID=_defaultZone:__defaultOwner__> from server: Record not found

Reasonable user reaction: that’s a bug. Reasonable engineer reaction: that’s not a bug, that’s the correct behavior. pullData() caught the error and displayed error.localizedDescription. The system told him the truth in the most honest way it had.

Both reactions are right. That’s what makes it a bug.

What actually happened

He installed the app. Tapped Restore. Had never synced anything to iCloud, so there was nothing in his private database to restore. So db.record(for: docId) threw a CKError with code .unknownItem. The catch block, written some months ago when I cared about a different problem, did this:

do {
    let rec = try await db.record(for: docId)
    apply(rec)
} catch {
    errorMessage = error.localizedDescription
}

That treats every failure mode the same. Auth failure, network drop, schema mismatch, throttling, and “you have no backup yet” — all rendered as the raw CKError description. The first four are errors. The fifth one isn’t. It’s the empty state of the restore feature, dressed up in the costume of a runtime exception because that’s how the CloudKit API decided to encode it.

The fix is one extra catch clause:

} catch let e as CKError where e.code == .unknownItem {
    statusMessage = "iCloud has no backup yet — sync first to create one."
} catch {
    errorMessage = error.localizedDescription
}

Trivial to type once you see it. The interesting question is why nobody saw it across three minor releases before the user did. I have 53 iOS apps in production and a fairly aggressive QA setup. Something in the system was supposed to catch this. Nothing did.

The three things I was relying on, and why each one couldn’t see it

I want to be careful here. I was not, in some literal sense, expecting Apple Review to find this bug. Apple Review is a compliance gate, not a QA team, and I have known this since my second rejection. What I was relying on was a stack of overlapping checks that each had a bounded job, and I had never sat down and asked: which class of bug falls outside all of them at once? This was that class.

Apple Review. They approved the build in under a day. App Review staff spend somewhere on the order of 10 minutes per submission and they are hunting two things: guideline violations (private API use, missing IAP disclosure, paid functionality behind a button that says “Sign in with Google”, the usual list) and crashes. A wall of cryptic Chinese-English hex showing up after a button tap is neither. It’s bad UX, and bad UX is not a rejection criterion — if it were, half the App Store would be down. Apple is not your QA team. They are your customs officer. They check that you’ve declared the right things on the form. If your suitcase contains ugly clothing, that’s not their problem.

My static linter. I run a 70-rule pre-submit auditor on every release (apple-presubmit-audit, MIT, Python, no dependencies you don’t already have). It catches the mechanical mistakes: plist mismatches, auto-renewing missing from the subscription CTA, declared-but-unused permission strings, SwiftData fields without default values that will crash on migration, hardcoded prices that will be wrong in most markets. It is the single highest-ROI piece of code in the factory and I would not ship without it. It also could not have caught this. Grep doesn’t know that error.localizedDescription is the wrong string to surface here, because in roughly 80% of the catch blocks in any given codebase it is the right string. The pattern is correct on average and wrong in this specific neighborhood. That’s the worst kind of pattern to lint for, because the naive rule has too many false positives to live with.

My happy-path QA. Before each release a fleet of ten Claude Code subagents walks every screen of every app. They tap things. They check nothing crashes. They run on simulators with prepopulated demo data so the UI has something to render. None of them performed the specific journey of “fresh install on a device that has never synced to iCloud, then immediately tap Restore.” That is the empty state of the restore feature, and “test every empty state” was nowhere on their checklist because nobody had written that checklist down.

Three layers. The first one was never going to catch it. The second one couldn’t, by construction. The third one could have but didn’t, because the test matrix was wrong. The post-mortem isn’t really “Apple Review failed.” Apple Review did its job. The post-mortem is: my QA matrix had a hole shaped exactly like the empty state of every CloudKit / HealthKit / Photos call in 53 apps, and I had to find out from a stranger.

The new rule

I added one to apple-presubmit-audit (v0.5.2). It is opinionated and narrow, by design.

The rule says: if a Swift file imports CloudKit, HealthKit, EventKit, or Photos, or uses URLSession / FileManager / NSFetchRequest, scan it for a generic catch that ends with errorMessage = error.localizedDescription. If the same file does not contain at least one branch handling the empty case for that API (CKError.unknownItem, HKError.noData, a 404 statusCode check, fileDoesNotExist, isEmpty on a fetch result, etc.), flag it as high severity. The exact regex table lives in audit.py under EMPTY_PRONE_APIS; the seven APIs covered are the ones I have personally seen surface raw error strings to users.

$ apple-presubmit-audit --no-asc --project AquaLog
...
⚠️  CUSTOM empty-state-vs-error-state
    Files fetching from external sources surface raw
    error.localizedDescription to UI without a 'no-data / not-found'
    branch.
    Files: [('CloudKit', 'AquaLogApp.swift')]

It is a heuristic, not a proof. It will produce false positives when the empty state is handled in a different file from the catch. It will produce false negatives when the empty state is handled with if let instead of catch. Neither bothers me. The bar is “would this rule have caught the AquaLog bug,” and the answer is yes.
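
To make the heuristic concrete, here is a minimal Python sketch of the file-level check described above. It is not the code in audit.py: the regexes, the table contents, and the function name find_raw_error_surfaces are assumptions for illustration, and only two of the seven APIs are shown.

```python
import re

# Hypothetical stand-in for the real EMPTY_PRONE_APIS table in audit.py.
# Each entry maps an API name to (pattern showing the file uses the API,
# pattern showing the file handles that API's empty / not-found case).
EMPTY_PRONE_APIS = {
    "CloudKit": (r"^\s*import\s+CloudKit\b", r"\.unknownItem\b"),
    "URLSession": (r"\bURLSession\b", r"statusCode\s*==\s*404"),
}

# A generic `catch { ... }` whose body assigns the raw description to the UI.
GENERIC_CATCH = re.compile(
    r"catch\s*\{[^{}]*errorMessage\s*=\s*error\.localizedDescription"
)


def find_raw_error_surfaces(filename, source):
    """Return [(api, filename)] for each API used without an empty-case branch."""
    if not GENERIC_CATCH.search(source):
        return []  # no raw description reaches the UI, nothing to flag
    findings = []
    for api, (uses, handles_empty) in EMPTY_PRONE_APIS.items():
        if re.search(uses, source, re.M) and not re.search(handles_empty, source):
            findings.append((api, filename))
    return findings


BUGGY = """
import CloudKit

func pullData() async {
    do {
        let rec = try await db.record(for: docId)
        apply(rec)
    } catch {
        errorMessage = error.localizedDescription
    }
}
"""

# Same file with the dedicated .unknownItem branch added before the generic catch.
FIXED = BUGGY.replace(
    "} catch {",
    "} catch let e as CKError where e.code == .unknownItem {\n"
    '        statusMessage = "iCloud has no backup yet."\n'
    "    } catch {",
)

if __name__ == "__main__":
    print(find_raw_error_surfaces("AquaLogApp.swift", BUGGY))  # [('CloudKit', 'AquaLogApp.swift')]
    print(find_raw_error_surfaces("AquaLogApp.swift", FIXED))  # []
```

Because the check works per file and on text alone, it cannot see an empty-case branch that lives in another file, which is the false-positive case mentioned above.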

A brief side-quest, because it’s the part I find genuinely interesting: the reason errorMessage = error.localizedDescription is so common is that it’s the path of minimum apparent ambition. Every other choice requires you to commit to a taxonomy. Is this an auth failure or a network failure? A retryable error or a fatal one? Should the user retry, sign in again, or give up? The localized description sidesteps all of that. It says: here is the literal truth as the framework sees it, you deal with it. Frameworks reward this with strings like “Record not found” and “The operation couldn’t be completed. (HKError error 11.)”. The localized description is the API contract’s revenge on developers who won’t write a switch statement. Every catch block I have ever written that delegates to localizedDescription is, in retrospect, a TODO I disguised as code.

The fix that actually matters is the QA matrix, not the rule

The static rule is the easy part — a couple of hours including the test cases. The harder change is the test matrix. Going forward, every user-tappable button gets four states tested before release:

  1. Happy. Data + network + permissions, demo data installed.
  2. Empty. Fresh install. No data. No backup. No prior runs.
  3. Denied. Permission rejected. Signed out of iCloud. Airplane mode.
  4. Error. A forced exception in the catch path.

State 2 is the one nobody tests by default and it is also where the harshest support messages come from — “I opened the app and it doesn’t work” almost always means the user hit an empty state the designer never rendered for. State 3 is similar: the user revoked a permission and the app silently no-ops on the feature that depended on it, and now the user thinks the button is broken. Neither is in the default QA checklist that ships with “test every feature.” The default checklist assumes data and permissions; the bugs live exactly where the assumption doesn’t hold.
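
One way to keep the four states from slipping off the checklist again is to generate the matrix rather than write it by hand. The sketch below is an assumption about how that could look, not a description of my actual QA tooling: the button names, build_matrix, and missing_cells are all hypothetical.

```python
from itertools import product

# The four states from the list above, with their setup paraphrased.
STATES = {
    "happy": "data + network + permissions, demo data installed",
    "empty": "fresh install, no data, no backup, no prior runs",
    "denied": "permission rejected, signed out of iCloud, airplane mode",
    "error": "forced exception in the catch path",
}


def build_matrix(buttons):
    """Return one checklist row per (button, state) pair."""
    return [
        {"button": button, "state": state, "setup": STATES[state]}
        for button, state in product(buttons, STATES)
    ]


def missing_cells(buttons, tested):
    """Return the (button, state) pairs that `tested` does not cover yet."""
    return [
        (button, state)
        for button, state in product(buttons, STATES)
        if (button, state) not in tested
    ]


if __name__ == "__main__":
    buttons = ["Restore from iCloud", "Sync now"]  # hypothetical button names
    tested = {("Restore from iCloud", "happy"), ("Sync now", "happy")}
    print(len(build_matrix(buttons)))  # 8
    for cell in missing_cells(buttons, tested):
        print("untested:", cell)
```

A happy-path-only run shows up here as six untested cells out of eight, which is roughly the situation the Restore button was in.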

The static rule will catch the next AquaLog. The matrix change will catch the next class of AquaLog: the ones where the empty state isn’t even a catch block, just a list view rendering ForEach(items) { ... } over an items array that’s empty and producing a 600-pixel-tall void with no explanation.

If you ship iOS apps: apple-presubmit-audit is one Python file, MIT, no telemetry, takes a project path and an optional ASC API key. Run it on whatever you’re about to submit. If the empty-state-vs-error-state rule fires, the warning is almost always pointing at a real catch block worth a second look. If it doesn’t fire, your code is either clean or you’re handling the empty case with if, and it’s probably worth opening the file anyway.

The fix is in the next build. v1.0.10 — “iCloud section visible to all users with a paywall on tap for non-subscribers, friendly empty-state message replacing the raw CKError” — goes to App Review today. If a user tells you a button shows a CloudKit error code, they’ve done you a favor. The next one might just uninstall.
