Auto-Tuner Anatomy ④: Threshold and k Optimization — The Geometry of Decision-Making
How the optimal T and k are determined. Youden J statistic for threshold optimization, k Grid Search, sigmoid behavior analysis, and per-stage independent tuning explained at the code level.
In the previous part: ③ Grid Search Engine Anatomy, we dissected Impact, Dampening, and Silence optimization. This part dissects the two parameters that define the shape of decision-making — T and k.
1. What Are T and k?
1.1 The Impedance Function
The Impedance function is a sigmoid centered at the threshold T with steepness k — in the standard logistic form, Impedance = 1 / (1 + e^(−k·(P(Win) − T))). It transforms the raw P(Win) into a decision signal:
- P(Win) > T → Impedance rises → "Go" signal
- P(Win) < T → Impedance drops → "No-Go" signal
1.2 Roles of T and k
| Parameter | Role | Analogy |
|---|---|---|
| T | Where to judge | Exam passing score (cutoff) |
| k | How sharply to judge | How strictly the cutoff is enforced |
1.3 Visual Understanding
```
k = 3 (gentle):            k = 12 (sharp):

            ____                       ____
          /                           |
        /                   vs.       |
      /                               |
  __/                         ________|
      ^                               ^
      T                               T
```
With small k, even a P(Win) slightly below T still gives a moderately high Impedance — "let's wait and see." With large k, a P(Win) just below T immediately yields near-zero Impedance — "red flag."
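The gentle-vs-sharp contrast can be checked numerically. A minimal sketch, assuming the standard logistic form Impedance = 1 / (1 + e^(−k·(P(Win) − T))) implied by the text — the article does not show the production implementation here:

```ruby
# Impedance transform: logistic sigmoid centered at T with steepness k.
# (Assumed form -- the article's exact code may differ.)
def impedance(p_win, t:, k:)
  1.0 / (1.0 + Math.exp(-k * (p_win - t)))
end

# Same threshold T = 0.45:
impedance(0.40, t: 0.45, k: 3)   # ~0.46: slightly below T, "wait and see"
impedance(0.40, t: 0.45, k: 12)  # ~0.35: same P(Win), already penalized
impedance(0.20, t: 0.45, k: 12)  # ~0.05: near-zero "red flag"
```

With k = 3 a borderline deal keeps a middling signal; with k = 12 the same gap below T is punished much harder.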
2. T Optimization: Youden J Statistic
2.1 Core Idea
"Find the cutoff that best distinguishes Won from Lost."
This is the exact question a doctor answers when setting a diagnostic test threshold: "At what blood sugar level should we diagnose diabetes — balancing missed diagnoses against false alarms?"
2.2 Youden J Definition
J = Sensitivity + Specificity − 1
Where:
- Sensitivity: "What fraction of Won projects have P(Win) ≥ T?" — The hit rate
- Specificity: "What fraction of Lost projects have P(Win) < T?" — The correct rejection rate
J is maximized when both are simultaneously high.
2.3 Exhaustive Search
```ruby
def optimal_threshold(stage_projects)
  won_p  = stage_projects[:won].map(&:last_p_win)   # final P(Win) of Won projects
  lost_p = stage_projects[:lost].map(&:last_p_win)  # final P(Win) of Lost projects
  best_t, best_j = nil, -1
  (1..99).each do |i|            # candidate cutoffs 0.01 .. 0.99
    t = i / 100.0
    sens = won_p.count { |p| p >= t } / won_p.size.to_f   # hit rate
    spec = lost_p.count { |p| p < t } / lost_p.size.to_f  # correct rejection rate
    j = sens + spec - 1
    if j > best_j
      best_j = j
      best_t = t
    end
  end
  { threshold: best_t, youden_j: best_j }
end
```
99 candidate thresholds from 0.01 to 0.99 are tested. The total computation is 99 × (number of Won + number of Lost) comparisons — completed in an instant.
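For readers who want to run the search standalone, here is a self-contained variant that takes plain P(Win) arrays instead of the project objects used above (an illustrative simplification):

```ruby
# Self-contained Youden search over plain P(Win) arrays,
# simplified from optimal_threshold above for illustration.
def youden_search(won_p, lost_p)
  best = { threshold: nil, youden_j: -1.0 }
  (1..99).each do |i|
    t = i / 100.0
    sens = won_p.count { |p| p >= t } / won_p.size.to_f
    spec = lost_p.count { |p| p < t } / lost_p.size.to_f
    j = sens + spec - 1
    best = { threshold: t, youden_j: j } if j > best[:youden_j]
  end
  best
end

# Scenario 1 data from Section 5:
youden_search([0.55, 0.60, 0.70, 0.80], [0.15, 0.20, 0.30, 0.35])
# => { threshold: 0.36, youden_j: 1.0 }
```

Note that any cutoff in (0.35, 0.55] reaches J = 1.0 here; this sketch keeps the first such candidate (0.36), whereas a midpoint tie-break would land on the 0.45 used in Section 5's scenarios. The article does not specify a tie-break rule.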
2.4 Minimum J Threshold
If the best J found is below 0.20, the recommendation is withheld.
Why? J = 0.20 means, for example, Sensitivity = 0.70 and Specificity = 0.50. A specificity of 0.50 means half the Lost projects are misclassified as Pass — practically a coin flip. No threshold can reliably differentiate the two groups.
```ruby
if best_j < MIN_J_THRESHOLD  # 0.20
  { action: 'keep', reason: 'Insufficient separation' }
else
  { action: 'adjust', threshold: best_t, youden_j: best_j }
end
```
2.5 Real-World Example
| Threshold (t) | Sensitivity | Specificity | J |
|---|---|---|---|
| 0.20 | 1.00 | 0.20 | 0.20 |
| 0.30 | 0.90 | 0.47 | 0.37 |
| 0.38 | 0.80 | 0.67 | 0.47 |
| 0.50 | 0.50 | 0.87 | 0.37 |
| 0.60 | 0.20 | 0.93 | 0.13 |
At t = 0.38: 80% of Won projects pass the threshold, and 67% of Lost projects are correctly rejected. This is the point with the best balance between the two goals.
3. k Optimization: Grid Search
3.1 Impact of k
Once T is fixed, k determines how sharply the transition occurs at T:
| k | Behavior | Suitable Stage |
|---|---|---|
| 3 | Gentle transition | Discovery — exploration phase, ambiguity is OK |
| 7 | Moderate | Qualification, Solution-Fit — baseline discrimination |
| 12 | Sharp | Proposal, Negotiation — no room for ambiguity |
3.2 Search Range
```ruby
K_CANDIDATES = (1..12).to_a  # [1, 2, 3, ..., 12]
```
Why is the upper bound 12? Beyond this, the sigmoid becomes virtually a step function. A 0.01 difference in P(Win) around T would cause an extreme flip in Impedance — this is no longer "discrimination" but "binary chopping."
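The "step function" intuition can be quantified. Assuming the logistic form from Section 1, the band over which Impedance climbs from 0.1 to 0.9 has width 2·ln(9)/k ≈ 4.39/k:

```ruby
# Width of the 0.1-to-0.9 Impedance transition band for a logistic
# sigmoid with steepness k (assumed form; see Section 1).
def transition_width(k)
  2 * Math.log(9) / k
end

transition_width(3)   # ~1.46 -- wider than the whole P(Win) range [0, 1]
transition_width(12)  # ~0.37 -- the climb happens within about +/-0.18 of T
```

At k = 12 nearly the entire decision swing is concentrated in a narrow band around T; pushing k higher only narrows that band further.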
3.3 Process
```ruby
def optimize_k(stage, optimal_t, projects)
  best_k = stage.k
  best_metric = current_impedance_separation(stage, projects)  # baseline with current k
  K_CANDIDATES.each do |k|
    # Won-vs-Lost Impedance gap at the already-chosen optimal T
    metric = simulate_impedance_separation(stage, optimal_t, k, projects)
    if metric > best_metric
      best_metric = metric
      best_k = k
    end
  end
  { current: stage.k, recommended: best_k }
end
```
For each candidate k, the impedance separation (Impedance difference between Won and Lost) at the optimal T is calculated. The k with the highest separation is recommended.
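`simulate_impedance_separation` is not shown in the article; a plausible sketch — mean Impedance of Won minus mean Impedance of Lost, under the assumed logistic form — looks like this (both the form and the use of means are assumptions, not the article's exact code):

```ruby
# Logistic Impedance (assumed form; see Section 1).
def impedance(p, t, k)
  1.0 / (1.0 + Math.exp(-k * (p - t)))
end

# Hypothetical separation metric: mean Won Impedance minus mean Lost
# Impedance at a given (T, k).
def impedance_separation(won_p, lost_p, t, k)
  mean = ->(xs) { xs.sum / xs.size.to_f }
  mean.call(won_p.map { |p| impedance(p, t, k) }) -
    mean.call(lost_p.map { |p| impedance(p, t, k) })
end

won  = [0.55, 0.60, 0.70, 0.80]
lost = [0.15, 0.20, 0.30, 0.35]
(1..12).max_by { |k| impedance_separation(won, lost, 0.45, k) }  # => 12
```

With cleanly separated data like this, every Won point sits above T and every Lost point below it, so the mean-gap metric grows monotonically with k and the search hits the cap — one reason the candidate range stops at 12.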
3.4 Current Limitation
k's impact on the raw P(Win) is zero — k only affects the Impedance transformation. Since Impedance is a derived metric displayed alongside P(Win), k optimization currently influences visualization and interpretation more than the underlying probability calculation.
Future evolution: If k is incorporated into the stage-gate automation logic (e.g., automatically flagging deals when Impedance drops below 0.3), k optimization would gain direct operational impact.
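That stage-gate hook could be sketched as follows — everything here is hypothetical, with the 0.3 floor taken from the parenthetical example above:

```ruby
IMPEDANCE_FLOOR = 0.3  # floor from the example above (hypothetical)

# Flag a deal for review when its Impedance falls below the floor.
# Assumes the logistic Impedance form from Section 1.
def flag_for_review?(p_win, t:, k:)
  impedance = 1.0 / (1.0 + Math.exp(-k * (p_win - t)))
  impedance < IMPEDANCE_FLOOR
end

flag_for_review?(0.30, t: 0.45, k: 12)  # => true  (Impedance ~0.14)
flag_for_review?(0.60, t: 0.45, k: 12)  # => false (Impedance ~0.86)
```

Under such a hook, a recommended k change would directly alter which deals get flagged, giving k optimization the operational impact the text anticipates.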
4. Per-Stage Independence
4.1 Why Optimize Independently?
Each sales stage has fundamentally different characteristics:
| Stage | Context | Optimal T | Optimal k |
|---|---|---|---|
| Discovery | Little data, exploratory | Low T (lenient) | Low k (gentle) |
| Qualification | Verifying basics | Moderate T | Moderate k |
| Solution-Fit | Confirming fit | Moderate T | Moderate k |
| Proposal | Cost commitment begins | High T (strict) | High k (sharp) |
| Negotiation | Final gate | Highest T | High k |
"Allowing P(Win) = 30% in Discovery is fine — exploration. But at Proposal with P(Win) = 40%, serious reconsideration is needed."
4.2 Code Structure
```
for each stage:
  won_p  = stage Won projects' P(Win) list
  lost_p = stage Lost projects' P(Win) list
  T* = optimal_threshold(won_p, lost_p)
  k* = optimize_k(stage, T*, projects)
  → recommend: { stage, T: T*, k: k* }
```
Each stage's T and k are optimized completely independently. Discovery's optimal T has zero influence on Proposal's optimal T.
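The loop can be sketched end-to-end in Ruby. Stage names match Section 4.1's table; the data, hash shape, and inlined search are illustrative assumptions:

```ruby
# Youden search over plain arrays (as in Section 2.3).
def youden_search(won_p, lost_p)
  (1..99).map { |i| i / 100.0 }.map do |t|
    sens = won_p.count { |p| p >= t } / won_p.size.to_f
    spec = lost_p.count { |p| p < t } / lost_p.size.to_f
    { threshold: t, youden_j: sens + spec - 1 }
  end.max_by { |c| c[:youden_j] }
end

# Hypothetical per-stage P(Win) histories:
STAGES = {
  'Discovery' => { won: [0.35, 0.50, 0.65], lost: [0.10, 0.20, 0.30] },
  'Proposal'  => { won: [0.60, 0.75, 0.85], lost: [0.25, 0.40, 0.50] },
}

recommendations = STAGES.map do |name, d|
  { stage: name, **youden_search(d[:won], d[:lost]) }
end
# Discovery's optimal T comes out lower than Proposal's, and neither
# result affects the other -- each stage is tuned in isolation.
```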
5. The Four Scenarios
Scenario 1: Both T and k Appropriate
```
Won:  [0.55, 0.60, 0.70, 0.80]   avg = 0.66
Lost: [0.15, 0.20, 0.30, 0.35]   avg = 0.25
T = 0.45, k = 7
```
→ Clean separation. No adjustment needed.
Scenario 2: T Too Low
```
Won:  [0.55, 0.60, 0.70, 0.80]   avg = 0.66
Lost: [0.15, 0.20, 0.30, 0.35]   avg = 0.25
T = 0.10
```
→ All projects pass → No discrimination.
→ Recommendation: Raise T to 0.45.
Scenario 3: T Too High
```
Won:  [0.55, 0.60, 0.70, 0.80]   avg = 0.66
Lost: [0.15, 0.20, 0.30, 0.35]   avg = 0.25
T = 0.75
```
→ Most Won projects also fail → Good deals get filtered out.
→ Recommendation: Lower T to 0.45.
Scenario 4: No Discrimination Possible
```
Won:  [0.30, 0.40, 0.50, 0.60]   avg = 0.45
Lost: [0.25, 0.35, 0.45, 0.55]   avg = 0.40
```
→ The two distributions heavily overlap: every cutoff misclassifies nearly as many projects as it separates, so no threshold achieves a J meaningfully above the 0.20 floor.
→ "At this stage, Won and Lost cannot be distinguished."
→ No recommendation. Data accumulation needed.
Next: ⑤ Statistical Validation Anatomy — ROC AUC, K-fold Cross-Validation, Prior α/β Recommendation.