Auto-Tuner Anatomy ②: Signal Lift — Is This Signal Really Meaningful?
Dissecting how Signal Lift measures the real-world discriminative power of each signal. Lift formula, Laplace smoothing, dynamic minimum appearance thresholds, and mismatch detection explained at the code level.
In the previous part: ① Engine Overview, we explored the Auto-Tuner's overall structure. This part answers the first question: "Is this signal really more common in Won projects?"
1. What is Lift?
1.1 Core Question
Signal Master classifies signals as Positive (Game Changer, Strong Affirmation, etc.) or Negative (Strong Negation, Weak Negation, etc.) based on domain expertise. But is this trust justified?
If "technical fit confirmed" appears equally in Won and Lost projects, it provides no discriminative power — even if it's classified as a Strong Affirmation.
Signal Lift quantifies this by measuring actual discriminative power from historical data.
1.2 Formula
Lift(s) = P(s|Won) / P(s|Lost)

- P(s|Won) = how frequently signal s appeared in Won projects
- P(s|Lost) = how frequently signal s appeared in Lost projects
This is mathematically equivalent to a Bayes Factor:
"Given that this signal was observed, how much stronger is the evidence for Won compared to Lost?" — this is the exact question a Bayes Factor answers.
1.3 Interpretation
| Lift | Jeffreys' Scale | Interpretation |
|---|---|---|
| > 10 | Decisive | Overwhelmingly associated with Won |
| 3 ~ 10 | Strong | Strongly associated with Won |
| 1 ~ 3 | Moderate | Slight association with Won |
| ≈ 1 | None | No discriminative power |
| < 1 | Reverse | Actually appears more in Lost |
2. Concrete Example
2.1 Data
A company has 10 Won and 15 Lost completed projects. For each signal, we count the number of projects where it appeared:
| Signal | Won (10) | Lost (15) | P(s|Won) | P(s|Lost) | Lift |
|---|---|---|---|---|---|
| Technical Fit Confirmed | 8 | 3 | 0.80 | 0.20 | 4.00 |
| Budget Secured | 6 | 4 | 0.60 | 0.27 | 2.25 |
| Competitor Presence | 7 | 10 | 0.70 | 0.67 | 1.05 |
| Decision Maker Absent | 2 | 9 | 0.20 | 0.60 | 0.33 |
2.2 Interpretation
- Technical Fit Confirmed (4.00): Projects with this signal won 4× more often. This signal is genuinely meaningful.
- Budget Secured (2.25): Meaningful, but not decisive.
- Competitor Presence (1.05): Lift ≈ 1 → No discriminative power. It cannot differentiate Won from Lost.
- Decision Maker Absent (0.33): Appears 3× more in Lost. If currently classified as Positive, this is a classification error.
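The table's Lift column can be recomputed from the raw counts (unsmoothed, purely for illustration; the variable names are mine):

```ruby
# Recompute the example table's Lift values from raw counts (no smoothing).
WON_TOTAL  = 10
LOST_TOTAL = 15

signals = {
  'Technical Fit Confirmed' => [8, 3],
  'Budget Secured'          => [6, 4],
  'Competitor Presence'     => [7, 10],
  'Decision Maker Absent'   => [2, 9]
}

signals.each do |name, (won, lost)|
  lift = (won.to_f / WON_TOTAL) / (lost.to_f / LOST_TOTAL)
  puts format('%-25s lift = %.2f', name, lift)
end
```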
3. Laplace Smoothing — Preventing Division by Zero
3.1 The Problem
If "Game Changer" appeared in 5 Won projects but 0 Lost projects:
3.2 The Solution: Laplace Smoothing
# Laplace (add-one) smoothing: one virtual success and one virtual failure.
def smoothed_rate(count, total)
  (count + 1.0) / (total + 2.0)
end
Adding 1 to the numerator and 2 to the denominator:
| Value | Before Smoothing | After Smoothing |
|---|---|---|
| Won rate | 5/10 = 0.500 | 6/12 = 0.500 |
| Lost rate | 0/15 = 0.000 | 1/17 = 0.059 |
| Lift | ∞ | 0.500/0.059 = 8.50 |
Smoothing prevents infinite values while having minimal impact when sufficient data exists. With n=1000, the +1 and +2 cause only a 0.1% difference.
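Plugging the 5-Won / 0-Lost example into `smoothed_rate` reproduces the table above:

```ruby
# Laplace smoothing: one virtual success and one virtual failure.
def smoothed_rate(count, total)
  (count + 1.0) / (total + 2.0)
end

won_rate  = smoothed_rate(5, 10)  # 6/12 = 0.5
lost_rate = smoothed_rate(0, 15)  # 1/17 ≈ 0.059
lift = won_rate / lost_rate       # ≈ 8.5 instead of ∞
```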
3.3 Why "+1" and "+2"?
This comes from the uniform prior (Beta(1,1)) of Bayesian inference. Viewing it as "adding one virtual success and one virtual failure" naturally connects to the Bayesian framework. It is the most widely used smoothing method, also known as "add-one smoothing."
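The Beta(1,1) connection can be checked numerically: the posterior mean after k successes in n trials under a uniform prior is (k+1)/(n+2), which is exactly `smoothed_rate`. The function name below is mine, for illustration:

```ruby
# Posterior mean of Beta(1 + k, 1 + (n - k)): a uniform Beta(1,1) prior
# updated with k successes out of n trials. Equals (k + 1) / (n + 2).
def beta_posterior_mean(k, n)
  a = 1 + k        # prior alpha + observed successes
  b = 1 + (n - k)  # prior beta + observed failures
  a.to_f / (a + b) # mean of Beta(a, b)
end

beta_posterior_mean(5, 10)  # 6/12 = 0.5, identical to smoothed_rate(5, 10)
```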
4. Dynamic Minimum Appearance
4.1 The Problem
If signal "X" appeared in only 1 Won project and 0 Lost projects, Lift = ∞ (after smoothing, approximately 8.5). But the sample size of 1 cannot guarantee statistical validity. This might be entirely coincidental.
4.2 Phase-Based Thresholds
Auto-Tuner requires a minimum number of appearances:
# phase => minimum number of appearances required
SIGNAL_MIN_APPEARANCES = {
  2 => 3,
  3 => 5,
  4 => 8,
  5 => 10
}.freeze
| Phase | Min Appearances | Rationale |
|---|---|---|
| 2 | 3 | Even with sparse data, at least 3 observations |
| 3 | 5 | Basic statistical test possible |
| 4 | 8 | Sufficient for pattern convergence |
| 5 | 10 | Stringent standard |
Signals that fall below the minimum threshold receive a Lift of nil and are excluded from GridSearch targeting.
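Putting the threshold and the smoothed Lift together, the exclusion logic could be sketched as follows. Only `SIGNAL_MIN_APPEARANCES` comes from the article; `lift_for` and its signature are illustrative assumptions:

```ruby
# phase => minimum number of appearances required (from the article)
SIGNAL_MIN_APPEARANCES = { 2 => 3, 3 => 5, 4 => 8, 5 => 10 }.freeze

# Laplace smoothing, as in Section 3.
def smoothed_rate(count, total)
  (count + 1.0) / (total + 2.0)
end

# Returns the smoothed Lift, or nil when the signal appeared too rarely
# for the current phase (nil signals are skipped by GridSearch).
def lift_for(phase, won_count, won_total, lost_count, lost_total)
  return nil if won_count + lost_count < SIGNAL_MIN_APPEARANCES.fetch(phase)

  smoothed_rate(won_count, won_total) / smoothed_rate(lost_count, lost_total)
end

lift_for(4, 1, 10, 0, 15)  # nil — only 1 appearance, phase 4 requires 8
lift_for(2, 8, 10, 3, 15)  # ≈ 3.19 (smoothed)
```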
4.3 Why Phase-Dependent?
With scarce data, you must reference whatever information is available (even Lift from 3 appearances is better than nothing). With abundant data, stricter criteria can be demanded. This represents the adaptive balance between Data Humility and Ambition.
5. Classification Mismatch Detection
5.1 What is a Mismatch?
If signal "Positive Meeting Atmosphere" is classified as Moderate Affirmation (α increases), but in reality it appeared in 3 Won and 9 Lost projects:
Lift < 1 yet classified as Positive → This is a mismatch.
5.2 Why Does This Happen?
- Signal name vs. actual usage gap: what the sales team records as "meeting atmosphere was positive" may not match the real-world context the classification assumed
- Environmental change: A signal that was once meaningful becomes meaningless due to changes in market conditions
- Insufficient sample: May resolve naturally as more data accumulates
5.3 Mismatch Report
def detect_mismatches(lift_results)
  mismatches = []

  lift_results.each do |signal_name, data|
    # Skip signals excluded by the minimum-appearance threshold (Lift is nil),
    # otherwise the comparison below would raise on nil.
    next if data[:lift].nil? || data[:total_appearances] < min_appearances

    expected_positive = data[:impact_type].positive?
    actual_positive   = data[:lift] > 1.0
    next if expected_positive == actual_positive

    mismatches << {
      signal: signal_name,
      classified_as: expected_positive ? 'Positive' : 'Negative',
      actual_lift: data[:lift],
      interpretation: expected_positive ?
        'Classified as positive but Lift < 1 — appears more in Lost' :
        'Classified as negative but Lift > 1 — appears more in Won'
    }
  end

  mismatches
end
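A self-contained toy run of the same idea. This is a deliberately simplified variant: the impact type is reduced to a symbol, the threshold is a fixed constant, and all names and sample numbers are illustrative:

```ruby
# Simplified, self-contained sketch of mismatch detection.
# :positive / :negative stand in for the full impact-type objects.
MIN_APPEARANCES = 5

def detect_mismatches(lift_results)
  lift_results.filter_map do |signal, data|
    next if data[:lift].nil? || data[:total_appearances] < MIN_APPEARANCES

    expected_positive = data[:impact] == :positive
    actual_positive   = data[:lift] > 1.0
    next if expected_positive == actual_positive  # classification matches reality

    { signal: signal, classified_as: data[:impact], actual_lift: data[:lift] }
  end
end

results = {
  'Positive Meeting Atmosphere' => { impact: :positive, lift: 0.50, total_appearances: 12 },
  'Technical Fit Confirmed'     => { impact: :positive, lift: 4.00, total_appearances: 11 }
}

detect_mismatches(results)
# → [{ signal: 'Positive Meeting Atmosphere', classified_as: :positive, actual_lift: 0.5 }]
```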
5.4 What Does the User Do?
Mismatches appear as warning alerts in the Auto-Tuner report. The administrator decides whether to:
- Reclassify the signal — e.g., change from Moderate Affirmation to Moderate Negative
- Maintain current classification — if the mismatch is believed to be temporary (due to data insufficiency)
- Remove the signal — retire it from the system if not meaningful
6. Signal Lift Summary
Lift > 1 Lift ≈ 1 Lift < 1
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Appears │ │ Equally │ │ Appears │
│ more in │ │ in both │ │ more in │
│ Won │ │ │ │ Lost │
└─────┬─────┘ └─────┬─────┘ └─────┬─────┘
│ │ │
✅ As Expected ⚠️ No Power ❌ Mismatch?
Signal Lift is the Auto-Tuner's first task. Before optimizing parameters, you must first confirm whether each signal is truly meaningful. Only then can the subsequent Grid Search, T optimization, and MCMC analysis deliver reliable results.
Next: ③ Grid Search Engine Anatomy — Phase-based dynamic ranges, simulations, and the mathematics of Compound Score.