DOCUMENTATION

Auto-Tuner Anatomy ①: Engine Overview and Design Philosophy

A comprehensive anatomy of the EXAWin Auto-Tuner architecture. Six learning targets, five-stage data maturity, and the design principle of "not fitting, but making accurate."

This document series dissects the internals of EXAWin's Auto-Tuner engine. We explain the meaning behind every line of code, the rationale for each statistical technique, and why each parameter must remain within its specific range, all in a lecture-style narrative.

By the time you finish this series, you will be able to explain why every Auto-Tuner recommendation is that specific value.



1. What is the Auto-Tuner?

1.1 One-Line Definition

Auto-Tuner = A system that "makes accurate" the Bayesian engine's parameters based on historical project outcomes (Won/Lost)

The key word here is "accurate." It does not raise P(Win); rather, it adjusts parameters so that Won deals have high P(Win) and Lost deals have low P(Win), aligning predictions with reality.

1.2 The Car Analogy

The engine (Bayesian formula) itself doesn't change. What the Auto-Tuner does is adjust the fuel mixture:

| Engine Component | Car Analogy | EXAWin Equivalent |
| --- | --- | --- |
| Ignition threshold | Ignition timing | T — Stage threshold |
| Fuel injection | Injector open time | Impact — Signal weights |
| Acceleration response | Throttle sensitivity | k — Slope (Velocity) |
| Exhaust treatment | Catalytic converter efficiency | Dampening — Duplicate signal attenuation |
| Fuel leak penalty | Leak alarm | Silence Penalty — Activity gap penalty |

1.3 Five Design Principles

① Not fitting, but making accurate
② Preserve the impedance dual-structure
③ Provide recommendation + rationale together
④ Human approval mandatory — no automatic application
⑤ Stored data immutable — simulations are pure computation

Principle ⑤ is particularly important. The Auto-Tuner never modifies the database. When the analysis button is pressed, simulations run in memory, and only when the administrator clicks "Apply" does the database get updated.



2. Six Learning Targets

The Auto-Tuner analyzes and recommends exactly six parameters.

① Signal Lift — Discriminative Power Analysis

"When this signal appears, does the probability of winning actually increase?"

Calculates the Lift = (appearance rate in Won) / (appearance rate in Lost) for each signal. Lift > 1 indicates a positive indicator; Lift < 1 indicates a negative indicator. Validates whether the current classification (Positive/Negative) matches actual discriminative power.

📌 Details: ② Signal Lift Anatomy
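The Lift calculation with Laplace smoothing (covered in depth in ②) can be sketched as follows; the smoothing constant and the deal counts here are illustrative assumptions, not the engine's actual values:

```python
def signal_lift(won_with, won_total, lost_with, lost_total, alpha=1.0):
    """Lift = (appearance rate in Won) / (appearance rate in Lost).

    Laplace smoothing (alpha) keeps the ratio finite when a signal never
    appears in one class. alpha=1.0 is an illustrative assumption.
    """
    rate_won = (won_with + alpha) / (won_total + 2 * alpha)
    rate_lost = (lost_with + alpha) / (lost_total + 2 * alpha)
    return rate_won / rate_lost

# A signal seen in 12 of 20 Won deals but only 3 of 15 Lost deals:
print(round(signal_lift(12, 20, 3, 15), 2))  # 2.51 -> positive indicator
```
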

② Impact Score — Optimal Weights

"Is 5.0 really the optimal value for Game Changer?"

Varies each ImpactType's score within a ± range to find the value that maximizes Separation (Won avg P(Win) − Lost avg P(Win)). Search range expands by Phase.

📌 Details: ③ Grid Search Engine Anatomy
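A minimal sketch of the Phase-scoped search, assuming a toy objective function in place of the engine's in-memory simulation (the function names and step count are hypothetical):

```python
def grid_search(base, pct_range, steps, evaluate):
    """Sweep base*(1 - pct_range)..base*(1 + pct_range) and return the
    (value, score) pair that maximizes the objective (Separation)."""
    lo, hi = base * (1 - pct_range), base * (1 + pct_range)
    best = (base, evaluate(base))
    for i in range(steps + 1):
        candidate = lo + i * (hi - lo) / steps
        score = evaluate(candidate)
        if score > best[1]:
            best = (candidate, score)
    return best

# A toy objective peaking near 4.2 stands in for the real Separation measure;
# Phase 4 allows a +/-40% range around a current Game Changer score of 5.0:
value, score = grid_search(5.0, 0.40, 40, lambda v: -(v - 4.2) ** 2)
print(round(value, 2))  # 4.2
```
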

③ T — Threshold Optimization

"Where should each stage's threshold be placed to best distinguish Won from Lost?"

Finds the T that maximizes the Youden J statistic, J = Sensitivity + Specificity − 1. If J < 0.20, the data cannot distinguish Won/Lost at this stage, so no recommendation is made.

📌 Details: ④ Threshold · k Anatomy
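The T search can be sketched as below; the candidate grid and the sample P(Win) values are illustrative assumptions:

```python
def youden_j(pwin_won, pwin_lost, t):
    """J = Sensitivity + Specificity - 1 at threshold t."""
    sensitivity = sum(p >= t for p in pwin_won) / len(pwin_won)
    specificity = sum(p < t for p in pwin_lost) / len(pwin_lost)
    return sensitivity + specificity - 1

def recommend_t(pwin_won, pwin_lost, candidates, min_j=0.20):
    """Best threshold by Youden J, or None when J < 0.20 (no recommendation)."""
    t, j = max(((t, youden_j(pwin_won, pwin_lost, t)) for t in candidates),
               key=lambda pair: pair[1])
    return (t, j) if j >= min_j else None

won, lost = [0.80, 0.70, 0.65, 0.55], [0.60, 0.45, 0.35, 0.30]
print(recommend_t(won, lost, [i / 20 for i in range(1, 20)]))  # (0.5, 0.75)
```

When every candidate leaves J below 0.20, the function returns None, mirroring the "no recommendation" rule above.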

④ k — Slope (Velocity)

"How sharply should P(Win) react when crossing T?"

Previously, k came from an empirical formula, 1 + ln(ratio), based on the evidence ratio (α+β); it has since switched to Grid Search-based optimization that directly maximizes Separation. The upper bound is 12, per the theoretical reference.

📌 Details: ④ Threshold · k Anatomy
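For reference, the retired empirical formula and its cap can be sketched as below; exactly how the evidence ratio is derived from (α+β) is an assumption here, and the current engine grid-searches k directly instead:

```python
import math

def legacy_k(evidence_ratio, k_max=12.0):
    """Former empirical slope k = 1 + ln(ratio), clamped to the
    theoretical upper bound of 12. How 'ratio' is computed from the
    accumulated evidence (alpha + beta) is an assumption."""
    return min(1.0 + math.log(evidence_ratio), k_max)

print(legacy_k(20))   # about 4.0 (1 + ln 20)
print(legacy_k(1e6))  # clamped at 12.0
```
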

⑤ Dampening — Duplicate Signal Attenuation

"When three signals appear simultaneously in the same meeting, should they all receive equal weight?"

Compound Score = MAX(signals) + remaining × dampening. If dampening is 0, only the strongest signal counts; if 1, all signals are weighted equally. The current default of 0.25 is optimized via Grid Search.
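A minimal sketch of the compound formula, assuming the remaining signals are summed before attenuation (the text leaves that detail implicit):

```python
def compound_score(scores, dampening=0.25):
    """MAX(signals) + remaining x dampening. dampening=0 keeps only the
    strongest signal; dampening=1 weights all signals equally.
    Summing the remainder before damping is an assumption."""
    if not scores:
        return 0.0
    strongest = max(scores)
    return strongest + (sum(scores) - strongest) * dampening

# Three signals in one meeting, default dampening 0.25:
print(compound_score([5.0, 3.0, 2.0]))       # 5.0 + 5.0 * 0.25 = 6.25
print(compound_score([5.0, 3.0, 2.0], 1.0))  # 10.0 -- all weighted equally
```
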

⑥ Silence Penalty — Activity Gap Penalty

"How much penalty should accumulate when the customer hasn't been contacted for an extended period?"

Optimizes the penalty ratio added to β via Grid Search.



3. Five-Stage Data Maturity (Phase)

The Auto-Tuner prevents overfitting when data is scarce by assigning a five-stage confidence level (Phase) based on min(Won, Lost), the smaller of the two outcome counts.

| Phase | Condition | Emoji | Adjustment Scope | Confidence |
| --- | --- | --- | --- | --- |
| 1 | min < 5 | ❌ | Analysis impossible | none |
| 2 | min 5–9 | 🟠 | Direction reference only, apply locked | low |
| 3 | min 10–19 | 🟡 | Impact, T, k | moderate |
| 4 | min 20–49 | 🟢 | Impact, T, k, Dampening, Silence | high |
| 5 | min ≥ 50 | 🔵 | All + MCMC posterior | stable |

Why min?

If there are 100 Won projects but only 3 Lost, you cannot claim "this parameter distinguishes Lost well" based on just 3 cases. Statistical significance is always limited by the smaller sample.
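The Phase assignment above reduces to a simple mapping on min(Won, Lost); a sketch:

```python
def phase(won_count, lost_count):
    """Map min(Won, Lost) onto the five data-maturity Phases."""
    m = min(won_count, lost_count)
    if m < 5:
        return 1   # analysis impossible
    if m < 10:
        return 2   # direction reference only, apply locked
    if m < 20:
        return 3
    if m < 50:
        return 4
    return 5

print(phase(100, 3))  # 1 -- capped by the 3 Lost cases, not the 100 Won
```
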

What Changes by Phase

As the Phase increases, the Auto-Tuner's behavior progressively expands:

| Behavior | Phase 2 | Phase 3 | Phase 4 | Phase 5 |
| --- | --- | --- | --- | --- |
| Signal Lift min appearances | 3 | 5 | 8 | 10 |
| Grid Search range | ±20% | ±30% | ±40% | ±50% |
| T/k adjustment | ❌ | ✅ | ✅ | ✅ |
| Dampening/Silence adjustment | ❌ | ❌ | ✅ | ✅ |
| MCMC posterior | ❌ | ✅ | ✅ | ✅ |
| Prior α/β recommendation | Manual | MoM | MLE | MLE |


4. Core Metric: Separation

The Auto-Tuner's objective function is Separation.

\text{Separation} = \overline{P(Win)}_{\text{Won}} - \overline{P(Win)}_{\text{Lost}}
  • Separation > 0.40: Excellent (A) — Parameters closely reflect reality
  • 0.25–0.40: Good (B) — Room for improvement
  • 0.10–0.25: Needs Improvement (C)
  • < 0.10: Urgent (D) — Parameter adjustment required
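Computing Separation and its letter grade is straightforward; how the exact boundary values (0.40, 0.25, 0.10) are assigned is an assumption in this sketch:

```python
def separation(pwin_won, pwin_lost):
    """Difference of mean P(Win) between Won and Lost projects."""
    return sum(pwin_won) / len(pwin_won) - sum(pwin_lost) / len(pwin_lost)

def grade(sep):
    # Boundary values fall to the lower grade -- an assumption.
    if sep > 0.40:
        return "A"  # excellent
    if sep > 0.25:
        return "B"  # good
    if sep > 0.10:
        return "C"  # needs improvement
    return "D"      # urgent

sep = separation([0.80, 0.70], [0.30, 0.20])
print(grade(sep))  # A
```
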

Limitations of Separation and AUC

Separation only measures the difference in means. It does not account for distribution overlap.

Example:

  • Scenario A: Won avg 0.70, Lost avg 0.30 → Separation 0.40 → Excellent!
  • Scenario B: Won range [0.20, 0.90], Lost range [0.10, 0.80] → Same average difference but heavy overlap

To compensate, ROC AUC is introduced. AUC represents "the probability that a randomly selected Won project has a higher P(Win) than a randomly selected Lost project." Overlap reduces AUC.

📌 Details: ⑤ Statistical Validation Anatomy
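The pairwise definition of AUC quoted above can be computed directly, which is fine at the small project counts involved (counting ties as 0.5 is an assumption):

```python
def auc(pwin_won, pwin_lost):
    """Probability that a random Won project outscores a random Lost one."""
    pairs = [(w, l) for w in pwin_won for l in pwin_lost]
    score = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w, l in pairs)
    return score / len(pairs)

print(auc([0.70, 0.80], [0.20, 0.30]))  # 1.0 -- no overlap at all
print(auc([0.60, 0.30], [0.50, 0.40]))  # 0.5 -- same means, full overlap
```

The second call shows why Separation alone is insufficient: these two distributions overlap completely, and AUC drops to the coin-flip value of 0.5.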



5. Simulation Engine

The core of the Auto-Tuner is memory-based simulation. Instead of using actual BayesianUpdate records stored in the database, it recalculates from scratch using raw data (activities, signals, Prior).

Why Recalculate?

To try different parameters, you need to calculate "what would P(Win) have been if Impact were 3.0?" This cannot be determined from stored historical results. Only by simulating from scratch with hypothetical parameters can you answer this.

One simulation cycle:
  α, β ← Prior initial values
  for each activity (chronological):
    → Calculate Compound Score from activity's signals
    → α += SWV × positive Compound
    → β += SWV × negative Compound
    → β += silence penalty (for activity gaps)
  P(Win) = α / (α + β)

Repeating this simulation for all Won/Lost projects reveals the separation for those parameters.
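The cycle above can be sketched as runnable code; the record layout, the SWV field, and the per-activity gap handling are assumptions for illustration, not the engine's actual data model:

```python
def simulate_pwin(activities, prior_alpha, prior_beta,
                  dampening=0.25, silence_ratio=0.0):
    """Replay one project's activities in memory and return final P(Win).

    `activities` is assumed pre-sorted chronologically; each is a dict
    with 'swv', 'signals' (score/positive pairs), and optional 'gap_days'.
    """
    alpha, beta = prior_alpha, prior_beta
    for act in activities:
        pos = [s["score"] for s in act["signals"] if s["positive"]]
        neg = [s["score"] for s in act["signals"] if not s["positive"]]
        for scores, positive in ((pos, True), (neg, False)):
            if not scores:
                continue
            # Compound Score = MAX + remaining * dampening
            compound = max(scores) + (sum(scores) - max(scores)) * dampening
            if positive:
                alpha += act["swv"] * compound
            else:
                beta += act["swv"] * compound
        # Silence penalty accrues on beta for activity gaps
        beta += silence_ratio * act.get("gap_days", 0)
    return alpha / (alpha + beta)

acts = [{"swv": 1.0, "signals": [{"score": 5.0, "positive": True}]}]
print(round(simulate_pwin(acts, 1.0, 1.0), 3))  # alpha 6, beta 1 -> 0.857
```

Because the function touches only the in-memory structures passed in, sweeping hypothetical parameters (Impact, dampening, silence ratio) is pure computation, exactly as Principle ⑤ requires.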

DB Queries = 0

During simulation, not a single DB query is executed. All data is preloaded into memory during initialization, and only pure computation follows. This is the implementation of Principle ⑤.



6. Document Series Guide

| Part | Title | Content |
| --- | --- | --- |
| ① [Current] | Engine Overview and Design Philosophy | Overall structure, 6 learning targets, Phase, Separation |
| ② | Signal Lift Anatomy | Lift calculation, Laplace smoothing, classification validation |
| ③ | Grid Search Engine Anatomy | Impact optimization, Phase-based ranges, Dampening, Silence |
| ④ | Threshold · k Anatomy | Youden J, T optimization, k Grid Search |
| ⑤ | Statistical Validation Anatomy | AUC, K-fold CV, Prior recommendation |
| ⑥ | MCMC Posterior Anatomy | Emcee Ensemble MCMC, model definition, HDI, convergence diagnostics |