Spooky Action at a Distance: When Fixing Your Proxy Breaks Your Application

Or: How I Learned That “Working” and “Correct” Are Not the Same Thing

“Spooky action at a distance” (spukhafte Fernwirkung) is what Einstein called quantum entanglement in 1947—the phenomenon where measuring one particle instantly affects another, no matter how far apart they are. He meant it dismissively; the idea seemed absurd.

I first heard the term applied to software almost ten years ago, in Marco Pivetta’s (@ocramius) talk Extremely Defensive PHP. The concept stuck with me: code in one place mysteriously affecting code somewhere else, with no visible connection.

This is a story about that kind of bug.


“The ranking calculations aren’t running.”

The System: A Callback-Based Pipeline

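Our ranking tool queries a third-party API for search engine data. The pipeline looks simple enough: we create a task, submit it to the external API, wait for the API to tell us it has finished, and then fetch and process the results.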

Each task moves through a state machine tracked by three flags:

class AnalysisTask
{
    private bool $submitted = false;  // Sent to external API?
    private bool $completed = false;  // API finished processing?
    private bool $processed = false;  // We fetched the results?
}

The critical handoff happens via a webhook callback: when the external API finishes processing, it sends an HTTP request to our callback endpoint, which sets completed = true.

Simple, right?

The Bug: Tasks Stuck Forever

Tasks were being created and submitted successfully. The external API was processing them (we verified via their dashboard). But our process command found nothing to do. Processing had stopped in mid-November 2025.

I pulled up the processing logic:

protected function execute(InputInterface $input, OutputInterface $output): int
{
    $tasksToProcess = $this->repository->findBy([
        'submitted' => true,
        'completed' => false,  // <-- Hmm...
        'processed' => false,
    ]);

    foreach ($tasksToProcess as $task) {
        $this->processTask($task);
    }

    return Command::SUCCESS;
}

Wait. We’re looking for tasks where completed = false?

If the callback sets completed = true when the API finishes… why would we query for tasks that haven’t completed?

The Assumption Trap

I checked the callback controller:

#[Route('/api/callback', methods: ['POST'])]
public function handleCallback(Request $request): Response
{
    $taskId = $request->get('task_id');

    $task = $this->repository->find($taskId);
    if ($task === null) {
        // Unknown task ID: there is nothing to mark as completed
        return new Response('Unknown task', Response::HTTP_NOT_FOUND);
    }

    $task->setCompleted(true);  // <-- Sets completed = TRUE
    $this->entityManager->flush();

    return new Response('OK');
}

The callback clearly sets completed = true. The process command queries for completed = false. These are mutually exclusive.

This code could never have worked… could it? At that point we had found the bug, fixed it, and could have called it done. But I wanted to know why it had happened and what had changed.

The Mystery: Code That Hadn’t Changed

Here’s where it gets weird. I checked the git history of the processing command:

git log --oneline -- src/Command/ProcessRankingTasksCommand.php

Last change: September 2025, well before the processing stopped.

I checked every file involved in the ranking pipeline:

git log --oneline --since="2025-10-01" -- \
    src/Command/ProcessRankingTasksCommand.php \
    src/Controller/RankingCallbackController.php \
    src/Entity/RankingTask.php \
    src/Repository/RankingTaskRepository.php \
    src/Service/RankingTaskProcessor.php

Nothing. Not a single commit. The ranking code hadn’t been touched since September 2025.

But the rankings stopped working in November 2025. If none of the code changed… what did?

The Temporal Investigation

We knew when it broke. November 2025. So instead of tracing code paths, I asked a different question: What else changed that month?

git log --oneline --after="2025-11-01 00:00" --before="2025-12-01 00:00"

Most of the commits changed unrelated things. Then one caught my eye:

November 16, 2025 - “Consolidate shortlink into main app with host-based routing”

I opened the diff:

# BEFORE: Host-specific routing
http://app.example.com {
    import common_config
}

http://staging.example.com, http://localhost {
    import common_config
}

http://php {  # Internal Docker network
    import common_config
}

# AFTER: Accept all incoming requests
:80 {
    root /app/public
    php_server
}

The commit was about consolidating a shortlink service. The developer had no idea they were changing anything related to our ranking pipeline. But by switching from host-specific routing to :80, they inadvertently started accepting requests from any hostname—including callbacks from our external API.

Reconstructing the Timeline

The pieces fell into place, chronologically:

The Hidden Problem (from the start)

Our Caddy configuration only accepted requests for specific hostnames. But callbacks from the external API arrived with a Host header that matched none of them, so Caddy stopped them at the proxy level. The callback controller was never invoked. completed stayed false forever.

Nobody knew. The system worked because of wrong assumptions.

September 2025: The Refactoring

  • Developer refactors the ranking pipeline
  • Looks at actual data in the database
  • Sees tasks with submitted = true, completed = false
  • Writes query to match observed reality
  • Tests pass (mocking the database)
  • Code ships

The code wasn’t wrong given what the developer observed. The infrastructure was silently dropping callbacks, and the code reflected that broken state.

November 2025: The Proxy Change

  • Developer consolidates shortlink service into main app
  • Changes Caddy from host-specific routing to :80 (accept all)
  • Unintended side effect: callbacks from external API now reach the application
  • completed is now correctly set to true
  • Nobody realizes the process command has the inverted query
  • System appears to work (old tasks eventually age out or get manually handled)

January 2026: The Bug Surfaces

  • New batch of tasks submitted
  • Callbacks received correctly, completed = true
  • Process command queries for completed = false
  • Zero results
  • “The ranking calculations aren’t running”

The Fix

One line:

// BEFORE
'completed' => false,

// AFTER
'completed' => true,

The correct state machine:

// 1. Created:     submitted=false, completed=false, processed=false
// 2. Submitted:   submitted=true,  completed=false, processed=false
// 3. Callback:    submitted=true,  completed=true,  processed=false  ← PROCESS HERE
// 4. Done:        submitted=true,  completed=true,  processed=true

We want tasks in state 3: callback received, not yet processed.
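
One way to keep the intent of that query visible is to name the states instead of juggling three booleans. What follows is a minimal sketch, assuming PHP 8.1 or later. The TaskState enum and its fromFlags() method are hypothetical names for illustration, not part of our codebase:

```php
<?php
declare(strict_types=1);

// Hypothetical enum: names the four valid flag combinations from the table above.
enum TaskState: string
{
    case Created = 'created';
    case Submitted = 'submitted';
    case ReadyToProcess = 'ready_to_process';
    case Done = 'done';
    case Invalid = 'invalid';

    // Map the three boolean flags to a named state.
    public static function fromFlags(bool $submitted, bool $completed, bool $processed): self
    {
        // match compares with ===, so each arm must equal the flag triple exactly.
        return match ([$submitted, $completed, $processed]) {
            [false, false, false] => self::Created,
            [true, false, false]  => self::Submitted,
            [true, true, false]   => self::ReadyToProcess,
            [true, true, true]    => self::Done,
            default               => self::Invalid,
        };
    }
}

$state = TaskState::fromFlags(submitted: true, completed: true, processed: false);
echo $state->value, PHP_EOL; // ready_to_process
```

Any combination outside the four listed rows, such as completed=true on a task that was never submitted, maps to Invalid, which is itself worth logging.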

Why This Was Hard to Find

The Code Didn’t Change

The first instinct in debugging is to check recent changes. But the ranking code hadn’t been touched since September. The usual git blame approach led nowhere.

The Bug Wasn’t in the Buggy Code

The process command was “correct” given the observed data. The actual bug was in the Caddy configuration, a completely different layer. You’d never find it by tracing code paths.

No Test Could Catch This

Unit tests mock the database. Integration tests don’t involve external callbacks. You’d need a full end-to-end test with actual third-party webhooks.

The Key Insight: Temporal Analysis

When code archaeology fails, try temporal analysis. We knew when it broke. That timestamp was our anchor. Instead of asking “what changed in this file?” we asked “what changed at all during that time?”
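
In practice that question is a repository-wide git log with no path filter, bounded by a window around the failure date. Here is a small sketch that builds such a command. The function name gitLogAround(), its default window sizes, and the example date are assumptions made for illustration. The explicit times are there to avoid any ambiguity in how git interprets date-only values:

```php
<?php
declare(strict_types=1);

/**
 * Build a repository-wide `git log` command for a window around a failure date.
 * Hypothetical helper: the default window sizes are arbitrary.
 */
function gitLogAround(string $failureDate, int $daysBefore = 14, int $daysAfter = 1): string
{
    $failure = new DateTimeImmutable($failureDate);

    // Start of the first day in the window, and start of the day after the last one.
    $since = $failure->sub(new DateInterval('P' . $daysBefore . 'D'))->format('Y-m-d 00:00:00');
    $until = $failure->add(new DateInterval('P' . $daysAfter . 'D'))->format('Y-m-d 00:00:00');

    // No path arguments on purpose: the culprit may live outside the suspected files.
    return sprintf(
        'git log --oneline --since=%s --until=%s',
        escapeshellarg($since),
        escapeshellarg($until)
    );
}

// Example date only; use the first day on which the symptom was observed.
echo gitLogAround('2025-11-16'), PHP_EOL;
```

A wide window before the failure and a narrow one after it reflects the fact that the cause has to come before the symptom, though not necessarily by much.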

Lessons Learned

1. Document Your Assumptions

/**
 * Find tasks ready for result processing.
 *
 * We query for completed=true because:
 * - External API callback sets completed=true when done
 * - We want tasks finished by API but not yet processed by us
 *
 * If this returns empty but tasks exist, check:
 * - Are callbacks reaching the server? (check proxy logs)
 * - Is the callback endpoint accessible externally?
 */
$tasks = $this->repository->findTasksReadyForProcessing();

2. Infrastructure Changes Need Application Review

When changing proxy, routing, or networking config, ask:

  • What external services call our endpoints?
  • What hostnames or IPs do they use?
  • Will our new configuration accept those requests?
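
The last question can be turned into a cheap check if you keep an inventory of the Host headers your inbound callers use. The sketch below is a deliberately simplified model of host-based routing, not Caddy’s actual matching algorithm: it ignores ports, wildcards and TLS. acceptsHost(), unreachableHosts() and callbacks.example.com are hypothetical names for illustration:

```php
<?php
declare(strict_types=1);

// Simplified model: does any configured site address accept this Host header?
function acceptsHost(array $siteAddresses, string $hostHeader): bool
{
    foreach ($siteAddresses as $address) {
        // A bare port such as ":80" names no hostname, so it accepts every Host.
        if (str_starts_with($address, ':')) {
            return true;
        }
        // Strip the scheme and compare hostnames case-insensitively.
        $host = preg_replace('#^https?://#', '', $address);
        if (strcasecmp($host, $hostHeader) === 0) {
            return true;
        }
    }
    return false;
}

// Return the inbound hosts that no site address would accept.
function unreachableHosts(array $siteAddresses, array $inboundHosts): array
{
    return array_values(array_filter(
        $inboundHosts,
        fn (string $host): bool => !acceptsHost($siteAddresses, $host)
    ));
}

$before = ['http://app.example.com', 'http://staging.example.com', 'http://localhost', 'http://php'];
$after  = [':80'];

// Hypothetical inventory; the second entry stands in for the webhook's Host header.
$inboundHosts = ['app.example.com', 'callbacks.example.com'];

foreach (['before' => $before, 'after' => $after] as $label => $config) {
    $missing = unreachableHosts($config, $inboundHosts);
    echo $label, ': ', $missing === [] ? 'all inbound hosts accepted' : 'not accepted: ' . implode(', ', $missing), PHP_EOL;
}
```

Run against both versions of a config during review, a check like this would flag a callback host that the proxy silently ignores, in either direction of the change.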

3. Add Observability for State Transitions

if (count($tasks) === 0) {
    // Check for tasks stuck in unexpected states
    $stuckCount = $this->repository->count([
        'submitted' => true,
        'completed' => false,
        'processed' => false,
    ]);

    if ($stuckCount > 0) {
        $this->logger->warning(
            'Found {count} tasks with completed=false. Callback may not be working.',
            ['count' => $stuckCount]
        );
    }
}

4. Query the Actual Data

When debugging state machine issues, look at the distribution:

SELECT
    submitted, completed, processed, COUNT(*) as count
FROM analysis_tasks
GROUP BY submitted, completed, processed;

This immediately shows tasks stuck in unexpected states.

The Taxonomy of Spooky Bugs

These bugs share common traits:

Dimension    Spooky Bug Characteristic
Space        Cause and symptom in different systems
Time         Changes separated by weeks or months
Causality    The change that triggers it is completely unrelated
Intent       A “shortlink consolidation” breaks a “ranking pipeline”
Visibility   Works in dev, fails in prod (or vice versa)

The solution pattern:

  1. Question assumptions about how the system should work
  2. Verify with data how it actually works
  3. Use temporal analysis — if you know when it broke, check everything that changed then
  4. Look beyond the code to infrastructure, config, external dependencies

Conclusion

The most insidious bugs aren’t in the code you’re looking at. They’re in the assumptions you don’t know you’ve made.

Our one-line fix was hiding behind a Caddy configuration change made in the same month the processing stopped. The process command wasn’t wrong: it was written correctly for a broken system. When we fixed the system, the “correct” code became incorrect.

The breakthrough came from asking a different question. Instead of “what changed in this code?” we asked “what changed when it broke?” Months of unchanged code meant the answer had to be somewhere else. Temporal analysis, looking at all commits from around the failure date, led us straight to the infrastructure change.

In quantum physics, observing a particle changes its state. In distributed systems, fixing one component can change the assumptions that other components depend on.

Spooky action at a distance, indeed.


“The most exciting phrase to hear in science, the one that heralds new discoveries, is not ‘Eureka!’ but ‘That’s funny…’” — Isaac Asimov

In software engineering, it’s often: “Wait, why did we write it this way?”