Fraud Auto-Research Dashboard

Last updated: 2026-04-08 00:58:59  |  Experiments are selected on the validation split; OOT is a held-out split used for reporting only
Datasets: 5
Experiments: 91
Kept: 32
Discarded: 59

FDH

Current SOTA: fdh per-scenario campaign step 3/3 v5: cyclic time (sin/cos hour/dow, is_night/weekend) on SOTA — removed unstable interactions, keeping stable signals
Experiments: 12
Kept: 5
Discarded: 2
Best AUPRC (val): 0.3798
Best AUPRC (OOT): 0.2933
Baseline (OOT): 0.1300
Improvement: +125.7%
fdh results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.1325 0.1300 0.6287 0.6037 -0.0341 0.0000 6
exp_001 KEEP fdh per-scenario campaign step 1/3: velocity stack (1h/6h/24h/7d per CUSTOMER_ID+TERMINAL_ID) + customer behavioral deviation (amt z-score, amt_vs_p90) — targets scenarios 1+3 0.2798 0.2527 0.7008 0.6610 0.0208 0.0000 33
exp_002 KEEP fdh per-scenario campaign step 1/3 v2: velocity stack with training-tail continuity fix + behavioral deviation (scenarios 1+3) 0.2663 0.2513 0.7164 0.6743 0.0255 0.0001 32
exp_003 KEEP fdh per-scenario campaign step 2/3: add rolling terminal fraud rate 28d window (Recipe 17) — targets scenario 2 terminal compromise 0.3204 0.2343 0.7718 0.6603 0.0951 0.0201 34
exp_004 PSI reject fdh per-scenario campaign step 3/3: cyclic time (sin/cos hour/dow, is_night) + customer-terminal novelty + key interactions (amt_vs_p90*term_fraud_rate) 0.3295 0.2225 0.8071 0.6549 — 0.7914 38
exp_005 PSI reject fdh per-scenario campaign step 3/3 v2: cyclic time (sin/cos hour/dow, is_night/weekend) + amt_vs_p90*term_fraud_rate interaction — removed split-leaky novelty features 0.3295 0.2225 0.8071 0.6549 — 0.7914 36
exp_006 crash fdh per-scenario campaign step 3/3 v3: cyclic time (sin/cos hour/dow, is_night/weekend) on top of SOTA — fixed TX_DATETIME drop order
exp_007 PSI reject fdh per-scenario campaign step 3/3 v4: cyclic time (sin/cos hour/dow, is_night/weekend) + interactions — fixed feature count 0.4052 0.2787 0.8147 0.6594 — 0.9421 38
exp_008 KEEP fdh per-scenario campaign step 3/3 v5: cyclic time (sin/cos hour/dow, is_night/weekend) on SOTA — removed unstable interactions, keeping stable signals 0.3798 0.2933 0.7931 0.6755 0.1197 0.0173 36
exp_009 discard fdh model tuning: max_depth=4 + subsample=0.8 + colsample=0.8 + min_child_weight=5 + L2=2 to reduce AUROC gap (0.194) 0.3596 0.2801 0.7910 0.6688 0.1123 0.0280 36
exp_010 crash fdh campaign follow-up: static terminal fraud rate (stable OOT) + customer terminal diversity + small-amount flag (scenario 3 CNP signals)
exp_011 discard fdh campaign follow-up: static terminal fraud rate (stable OOT) + customer terminal diversity + small-amount flag (scenario 3 CNP signals) 0.2161 0.1070 0.7072 0.6189 0.0459 0.0797 39
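The cyclic-time step that finally held up here (exp_008: sin/cos hour and day-of-week plus is_night/is_weekend flags, raw timestamp dropped) can be sketched as below. A minimal pandas illustration; the helper name and the exact night/weekend cutoffs are our assumptions — only the TX_DATETIME column name comes from the log.

```python
import numpy as np
import pandas as pd

def add_cyclic_time(df, dt_col="TX_DATETIME"):
    """Encode hour-of-day and day-of-week as sin/cos pairs plus
    is_night / is_weekend flags, then drop the raw timestamp so it
    cannot leak temporal ordering across a temporal split."""
    out = df.copy()
    dt = pd.to_datetime(out[dt_col])
    hour, dow = dt.dt.hour, dt.dt.dayofweek
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)
    out["dow_sin"] = np.sin(2 * np.pi * dow / 7)
    out["dow_cos"] = np.cos(2 * np.pi * dow / 7)
    out["is_night"] = hour.between(0, 5).astype(int)   # assumed cutoff
    out["is_weekend"] = (dow >= 5).astype(int)
    return out.drop(columns=[dt_col])
```

Encoding hour as a sin/cos pair keeps 23:00 and 00:00 adjacent, which a raw integer hour does not, and dropping the raw timestamp avoids the PSI blow-ups seen when monotonically increasing time columns cross a temporal split.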

IEEE-CIS — Track B: Fresh Start

Current SOTA: final-model campaign step 1: deeper XGBoost max_depth=6, lr=0.01, n_est=5000, stronger regularization (lambda=3, alpha=0.3, gamma=0.3) on SOTA 88-feature set
Experiments: 21
Kept: 11
Discarded: 10
Best AUPRC (val): 0.3885
Best AUPRC (OOT): 0.3077
Baseline (OOT): 0.2297
Improvement: +33.9%
ieee-cis-fresh results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline step 1/5: minimal freq encoding + id_cluster_present, stock XGBoost max_depth=6 0.3084 0.2297 0.8297 0.7800 0.0705 0.0011 55
exp_001 KEEP regularization campaign step 2/5: max_depth=4, subsample=0.8, min_child_weight=10, drop TransactionDT to kill PSI=12.43 0.3213 0.2455 0.8329 0.8055 0.1287 0.0076 54
exp_002 discard TE campaign step 3/5: OOF target encoding for DeviceInfo/id_33/id_30/id_31/R_emaildomain/card1/P_emaildomain + null flags for all id_* cols 0.2946 0.2110 0.8175 0.7753 0.1080 0.0033 88
exp_003 discard TE campaign step 3b/5: OOF TE for R_emaildomain/ProductCD/card4/card6/P_emaildomain/card1 only (no null flags, focused set) 0.3088 0.2052 0.8309 0.7948 0.1251 0.0088 54
exp_004 discard TE campaign step 3b/5: OOF TE for R_emaildomain/ProductCD/card4/card6/P_emaildomain/card1 only (no null flags, focused set) 0.3088 0.2052 0.8309 0.7948 0.1251 0.0088 54
exp_005 discard uid-aggregation campaign step 4/5: card1+addr1 UID with 8 aggs (mean/std/max/median/count/p75/p25/iqr) + amount zscore/ratio + card1-level aggs 0.3099 0.2798 0.8408 0.8233 0.1247 0.0022 68
exp_006 discard velocity campaign step 5/5: per-card1 D1 velocity stats (median gap, burst count, txn count) + amount behavioral deviation (zscore, ratio) 0.3137 0.2724 0.8324 0.8089 0.1260 0.0063 58
exp_007 discard model-switch campaign step 1: LightGBM num_leaves=63, subsample=0.8 vs XGBoost — better handling of high-cardinality freq-encoded features 0.1453 0.1539 0.7481 0.7432 0.0833 0.0024 54
exp_008 KEEP cyclic-time campaign step 1: extract hour-of-day/day-of-week as cyclic sin/cos features from TransactionDT (mod 86400, stable vs temporal split) 0.3195 0.2530 0.8294 0.7965 0.1330 0.0032 60
exp_009 KEEP identity-consistency campaign step 1: per-card1 distinct email/device/addr counts + new card-email pair flag + is_modal_device 0.3189 0.2525 0.8382 0.8142 0.1353 0.0044 66
exp_010 discard identity-consistency campaign step 2: add card_device_is_new + card_addr_is_new flags + card1_n_card6 count (building on exp_009 card_email_is_new) 0.3146 0.2483 0.8376 0.8117 0.1313 0.0043 69
exp_011 discard encoding-fix campaign step 1: binary id_* cols (id_29/35/36/37/38) → 0/1 encoding + id_30/31/33/DeviceInfo/id_23/id_34 → smoothed TE instead of freq 0.2825 0.2182 0.8171 0.7856 0.1202 0.0013 63
exp_012 KEEP model-depth campaign step 2: XGBoost max_depth=5, lr=0.02, n_est=3000, gamma=0.1, reg_lambda=2.0 — deeper trees with stronger regularization on SOTA features 0.3444 0.2808 0.8456 0.8175 0.1393 0.0060 66
exp_013 KEEP amount-features campaign step 1: per-card1 amt baseline (mean/std/count) + amt zscore/ratio + log_amt + is_round + card6_amt_zscore 0.3589 0.2752 0.8568 0.8224 0.1481 0.0040 75
exp_014 KEEP amount-features campaign step 2: addr1 amount baseline (mean/std/count/zscore) + r_email_amt_ratio — building on amt_is_round/cents from exp_013 0.3603 0.2714 0.8591 0.8251 0.1495 0.0050 80
exp_015 discard card1-TE campaign step 1: add OOF smoothed TE for card1 (primary card identifier, min_samples=10) alongside existing features 0.3397 0.2281 0.8351 0.7892 0.1247 0.0056 81
exp_016 KEEP interaction campaign step 1: ProductCD×card6 freq + card4×card6 freq + dist1_log transform 0.3635 0.2851 0.8596 0.8272 0.1522 0.0049 83
exp_017 KEEP interaction campaign step 2: add R_emaildomain×card6 + R_emaildomain×ProductCD + addr1×card6 interaction freqs (building on productcd_card6_freq from exp_016) 0.3696 0.2796 0.8616 0.8270 0.1566 0.0055 86
exp_018 KEEP interaction campaign step 3: P_emaildomain×card6 + id12×card6 + card3_quantile×card6 interactions (building on r_email×card6 from exp_017) 0.3737 0.2911 0.8630 0.8256 0.1576 0.0047 88
exp_019 discard interaction campaign step 4: id29×card6 + id35×ProductCD + addr1_quantile×R_emaildomain interactions (building on 5-exp keep streak) 0.3701 0.2902 0.8623 0.8244 0.1561 0.0049 90
exp_020 KEEP final-model campaign step 1: deeper XGBoost max_depth=6, lr=0.01, n_est=5000, stronger regularization (lambda=3, alpha=0.3, gamma=0.3) on SOTA 88-feature set 0.3885 0.3077 0.8680 0.8331 0.1670 0.0048 88
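The interaction features behind the exp_016–exp_018 keep streak are frequency encodings of joint categories (e.g. ProductCD×card6), fitted on the training split and mapped onto val/OOT. A minimal sketch; the function names are ours, and mapping unseen pairs to 0.0 is one reasonable convention, not necessarily the dashboard's.

```python
import pandas as pd

def fit_interaction_freq(train, col_a, col_b):
    """Learn frequencies of the joint (col_a, col_b) category on train only."""
    key = train[col_a].astype(str) + "_" + train[col_b].astype(str)
    return key.value_counts(normalize=True).to_dict()

def apply_interaction_freq(df, col_a, col_b, freq):
    """Map train-fitted joint frequencies onto any split; unseen pairs get 0.0."""
    out = df.copy()
    key = out[col_a].astype(str) + "_" + out[col_b].astype(str)
    out[f"{col_a}_{col_b}_freq"] = key.map(freq).fillna(0.0)
    return out
```

Because the mapping is fit on train and frozen, the same pair gets the same value in val and OOT, which keeps the feature PSI-stable as long as the joint category mix itself does not drift.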

ieee-cis_v3_archive

Current SOTA: ieee-cis UID aggregation campaign step 1/4 v2: uid=card1+addr1, 7 label-free UID aggs (no uid_fraud_rate) + card1 amt deviation + P_emaildomain/ProductCD TE
Experiments: 11
Kept: 2
Discarded: 9
Best AUPRC (val): 0.2074
Best AUPRC (OOT): 0.1808
Baseline (OOT): 0.1728
Improvement: +4.6%
ieee-cis_v3_archive results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.2017 0.1728 0.8065 0.7781 0.0839 0.0042 13
exp_001 discard ieee-cis UID aggregation campaign step 1/4: uid=card1+addr1, 8 UID aggs (mean/std/count/min/max amt, fraud_rate TE, n_emails, n_products) + card1 amt deviation + P_emaildomain/ProductCD/card4/card6 TE 0.2400 0.1845 0.8132 0.7717 0.0800 0.0014 24
exp_002 KEEP ieee-cis UID aggregation campaign step 1/4 v2: uid=card1+addr1, 7 label-free UID aggs (no uid_fraud_rate) + card1 amt deviation + P_emaildomain/ProductCD TE 0.2074 0.1808 0.8127 0.7987 0.0920 0.0051 23
exp_003 discard ieee-cis UID aggregation campaign step 2/4: add amt percentiles per UID + email domain match flag + R_emaildomain/DeviceType TEs 0.2411 0.1856 0.8229 0.7977 0.0322 0.0013 27
exp_004 discard ieee-cis UID aggregation campaign step 2/4 v2: add uid percentiles (p25/p75/p90/range) + DeviceType TE — removed R_emaildomain and email_domain_match (caused PSI=0.275) 0.2411 0.1856 0.8229 0.7977 0.0322 0.0013 27
exp_005 discard ieee-cis UID aggregation campaign step 3/4: card1 velocity (1h/24h/7d count+sum via TransactionDT with training-tail) + burst ratio — builds on UID SOTA without leaky percentiles 0.2046 0.1767 0.8141 0.8064 0.0892 0.0019 30
exp_006 discard ieee-cis UID aggregation campaign step 4/4: DeviceType TE + cyclic time (sin/cos hour/dow, is_night/weekend) from TransactionDT + drop raw TransactionDT (PSI=12 temporal drift) 0.2221 0.1746 0.8119 0.7837 0.0847 0.0038 28
exp_007 discard ieee-cis model regularization: max_depth=5, subsample=0.8, colsample_bytree=0.8, min_child_weight=5 — target auroc_gap (0.090) penalty reduction on SOTA features 0.2379 0.1845 0.8255 0.8017 0.0377 0.0016 23
exp_008 discard ieee-cis identity consistency campaign step 1/2 (Recipe 4): modal identity per card1 (P/R_email, DeviceType, DeviceInfo match flags) + entity sharing counts + diversity per card + identity_stability composite — pure label-free consistency signals 0.2244 0.1869 0.8199 0.7971 0.0298 0.0029 24
exp_009 discard ieee-cis amount patterns (stateless, no per-card history): log_amt + round-number flags + amt_cents_bucket + has_cents + log_amt*ProductCD_te interaction — avoids default-fill PSI trap by using only population-level or stateless features 0.2296 0.1824 0.8333 0.8143 0.0912 0.0051 29
exp_010 discard ieee-cis stateless amount patterns only (no interaction): log_amt + round-10 + round-100 + cents_bucket + has_cents — removes log*ProductCD_te interaction that caused auroc_gap increase in exp_009 0.2299 0.1783 0.8285 0.7934 0.0332 0.0024 28
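The one feature step kept in this archive (exp_002) replaces the label-derived uid_fraud_rate with purely label-free UID statistics. A sketch of that pattern, assuming uid = card1+addr1 and TransactionAmt as the amount column; function and output names are ours.

```python
import pandas as pd

def fit_uid_aggs(train, amt_col="TransactionAmt"):
    """Label-free per-UID amount stats, computed on the training split
    only (no uid_fraud_rate, so nothing target-derived can leak)."""
    uid = train["card1"].astype(str) + "_" + train["addr1"].astype(str)
    return train.groupby(uid)[amt_col].agg(
        uid_amt_mean="mean", uid_amt_std="std",
        uid_amt_min="min", uid_amt_max="max", uid_count="count")

def apply_uid_aggs(df, aggs, amt_col="TransactionAmt"):
    """Merge train-fitted stats onto any split and add an amount z-score
    against the UID's own history; unseen UIDs stay NaN."""
    out = df.copy()
    out["uid"] = out["card1"].astype(str) + "_" + out["addr1"].astype(str)
    out = out.merge(aggs, left_on="uid", right_index=True, how="left")
    std = out["uid_amt_std"].where(out["uid_amt_std"] > 0)  # avoid /0
    out["uid_amt_zscore"] = (out[amt_col] - out["uid_amt_mean"]) / std
    return out.drop(columns=["uid"])
```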

Fraud-Sim

Current SOTA: fraud-sim per-card diversity: distinct merchant count + distinct category count in training (fraud = narrower merchant range)
Experiments: 11
Kept: 5
Discarded: 6
Best AUPRC (val): 0.9489
Best AUPRC (OOT): 0.9496
Baseline (OOT): 0.6981
Improvement: +36.0%
fraud-sim results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.6517 0.6981 0.9931 0.9936 0.3898 0.0001 16
exp_001 KEEP fraud-sim velocity stack campaign step 1/3: haversine distance + velocity (1h/1d/7d per card_id with training-tail) + behavioral deviation + cyclic time 0.9453 0.9414 0.9988 0.9992 0.7138 0.0001 32
exp_002 discard fraud-sim velocity stack campaign step 2/3: add smoothed merchant TE + merchant amount deviation on top of step 1 0.9447 0.9453 0.9986 0.9988 0.7136 0.0002 34
exp_003 KEEP fraud-sim velocity stack campaign step 3/3: replace merchant_freq_enc with merchant_te + add category_te (smoothed, min_samples=20) 0.9466 0.9458 0.9986 0.9987 0.7162 0.0004 33
exp_004 discard fraud-sim behavioral fingerprint: per-card typical hour deviation + per-card haversine distance zscore and vs-p90 (unusual distance for this card) 0.9466 0.9458 0.9986 0.9987 0.7162 0.0004 36
exp_005 discard fraud-sim add 10-minute velocity window to catch rapid card-testing sequences + 10min/1h burst ratio 0.9476 0.9485 0.9986 0.9987 0.7168 0.0004 39
exp_006 discard fraud-sim key interactions: log_amt*log_dist (big amount far from home) + vel_1h*log_dist (rapid card-testing far from home) 0.9468 0.9473 0.9984 0.9986 0.7155 0.0014 41
exp_007 KEEP fraud-sim demographic features: log_city_pop + amt_per_city_pop (relative txn size for city) 0.9482 0.9496 0.9986 0.9986 0.7191 0.0003 43
exp_008 discard fraud-sim expand TE: add state_te + job_te on top of merchant_te + category_te (replace freq enc for all these) 0.6490 0.6999 0.9768 0.9838 0.3173 0.0017 42
exp_009 discard fraud-sim add state_te (smoothed geographic risk) as additive feature on top of SOTA 0.9422 0.9445 0.9985 0.9983 0.7124 0.0005 44
exp_010 KEEP fraud-sim per-card diversity: distinct merchant count + distinct category count in training (fraud = narrower merchant range) 0.9489 0.9468 0.9987 0.9984 0.7202 0.0004 45
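The haversine distance that anchors the kept velocity stack (exp_001) is the standard great-circle formula between cardholder home and merchant coordinates; a NumPy sketch (function name is ours).

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in
    degrees; vectorizes over NumPy arrays or pandas columns."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2.0 * 6371.0 * np.arcsin(np.sqrt(a))  # Earth radius ~6371 km
```

Per-card statistics of this distance (z-score, vs-p90) are what exp_004 layered on top; the raw distance plus velocity windows already carried most of the lift here.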

IEEE-CIS — Track A: Continued

Current SOTA: model-tuning campaign step 3/3: max_depth=5 with stronger regularization (gamma=1.5, reg_lambda=7.0) to compensate; 77 features may benefit from more expressive trees with stronger L2
Experiments: 36
Kept: 9
Discarded: 27
Best AUPRC (val): 0.3689
Best AUPRC (OOT): 0.2761
Baseline (OOT): 0.2297
Improvement: +20.2%
ieee-cis results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP uid-aggregation campaign step 1/5: baseline with id_cluster_present and freq encoding for all categorical columns 0.3084 0.2297 0.8297 0.7800 0.0705 0.0011 55
exp_001 discard uid-aggregation campaign step 2/5: smoothed TE for ProductCD, P_emaildomain, R_emaildomain, card4, card6; drop TransactionDT (PSI=12.43) 0.3104 0.2260 0.8240 0.7745 0.0586 0.0021 59
exp_002 discard uid-aggregation campaign step 2+3/5: TE replaces freq for ProductCD/emaildomains/card4/card6; add UID=card1+addr1 7-agg; drop TransactionDT 0.3290 0.2248 0.8169 0.7736 0.0621 0.0020 61
exp_003 discard uid-aggregation campaign step 3/5 variation: drop 12 dead features + TransactionDT; add individual null flags for R_emaildomain/id_31/id_17/DeviceInfo/DeviceType 0.3198 0.2463 0.8209 0.7641 0.0637 0.0026 47
exp_004 KEEP uid-aggregation campaign step 4/5: model regularization max_depth=4, subsample=0.8, colsample=0.7, min_child_weight=10, gamma=1, reg_lambda=5 to reduce train_val_psi from 0.294 0.3215 0.2548 0.8360 0.8111 0.1111 0.0053 55
exp_005 KEEP uid-aggregation campaign step 5/5: smoothed TE replaces freq for R_emaildomain/ProductCD/card6/card4/P_emaildomain on regularized model 0.3260 0.2332 0.8325 0.8113 0.1240 0.0060 55
exp_006 KEEP velocity-stack campaign step 1/5: UID=card1+addr1 7 amount aggs (mean/std/min/max/count/distinct_email/distinct_product) on regularized model + TE 0.2974 0.2509 0.8321 0.8138 0.1296 0.0070 62
exp_007 discard velocity-stack campaign step 2/5: UID amount deviation (zscore, IQR ratio) + card1 zscore fallback + uid_q25/q75 percentiles 0.3278 0.2801 0.8389 0.8200 0.1205 0.0036 68
exp_008 KEEP velocity-stack campaign step 3/5: per-card1 velocity features (median_gap, std_gap, min_gap, burst_count, daily_rate) via Recipe 2 0.3227 0.2464 0.8452 0.8200 0.1357 0.0060 67
exp_009 discard velocity-stack campaign step 4/5: card1 smoothed TE (smoothing=50) for high-cardinality entity encoding; on top of UID aggs + velocity 0.3143 0.2021 0.8291 0.7933 0.1193 0.0043 68
exp_010 discard velocity-stack campaign step 5/5: TE for id_35+id_15 (strong IV flags); addr1 TE; log(TransactionAmt) on top of UID+velocity SOTA 0.3281 0.2339 0.8417 0.8168 0.1348 0.0054 69
exp_011 discard velocity-stack campaign step 1/3: addr1_te + email×card6 interaction TE + prod×card6 TE + id_null_count on top of UID+velocity SOTA 0.3250 0.2546 0.8431 0.8199 0.1321 0.0056 70
exp_012 discard velocity-stack campaign step 2/3: model max_depth=3 + addr1_te + id_null_count + id_present_count; reduce AUROC gap from 0.097 to allow stable features 0.2827 0.2308 0.8222 0.8085 0.1209 0.0070 70
exp_013 discard velocity-stack campaign step 3/3: addr1_te alone (smoothing=30, medium cardinality) on SOTA UID+velocity+TE model 0.3239 0.2448 0.8417 0.8167 0.1327 0.0061 68
exp_014 KEEP cyclic_time campaign step 1/2: sin/cos hour-of-day + day-of-week from TransactionDT; drop raw TransactionDT (PSI=12.43); log(TransactionAmt) 0.3361 0.2643 0.8476 0.8206 0.1424 0.0086 71
exp_015 discard cyclic_time campaign step 2/2: addr1_te + id_null_count + id_present_count on top of stable cyclic-time SOTA (tv_psi=0.013 gives room for additional features) 0.3272 0.2641 0.8431 0.8193 0.1322 0.0093 74
exp_016 discard deviceinfo-te campaign step 1/4: DeviceInfo OOF TE (IV=1.778, highest signal in dataset) replacing freq encoding; 1546 unique values, smoothing min_samples=100 to handle long tail 0.3203 0.2399 0.8339 0.8052 0.1275 0.0089 74
exp_017 discard deviceinfo-te campaign step 1/4: DeviceInfo OOF TE (IV=1.778, highest signal in dataset) replacing freq encoding; 1546 unique values, smoothing min_samples=100 to handle long tail 0.3203 0.2399 0.8339 0.8052 0.1275 0.0089 74
exp_018 discard deviceinfo-te campaign step 2/4: id_30 smoothed TE (71 unique, IV=0.619) + id_31 TE (108 unique, IV=0.538) + id_33 TE (183 unique, IV=0.879) replacing dead freq encodings; all have high raw IV but zero IV after freq encoding 0.3364 0.2580 0.8446 0.8193 0.1415 0.0027 71
exp_019 discard deviceinfo-te campaign step 2/4: id_30 smoothed TE (71 unique, IV=0.619) + id_31 TE (108 unique, IV=0.538) + id_33 TE (183 unique, IV=0.879) replacing dead freq encodings; all have high raw IV but zero IV after freq encoding 0.3364 0.2580 0.8446 0.8193 0.1415 0.0027 71
exp_020 discard deviceinfo-te campaign step 3/4: prune 8 dead features (IV<0.005 in SOTA): id_10, id_11, id_18, id_21, id_22, dist2 (numeric pass-throughs), id_30/id_33 (freq-enc goes dead), skip txn_dow_cos; reduce noise in 71-feature model 0.2855 0.2460 0.8300 0.8118 0.1270 0.0083 62
exp_021 discard psi-fix campaign step 1/3: drop TransactionID (auto-increment identifier causing PSI=12.43 OOT drift) — it leaks temporal ordering and is not a fraud signal 0.3269 0.2771 0.8455 0.8204 0.1363 0.0062 70
exp_022 discard psi-fix campaign step 2/3: drop TransactionID + add UID2=card1+card5 aggregations (5 stats: mean/std/min/max/count amt); card5 has 110 unique values, AUC=0.554, should improve entity resolution 0.3225 0.2722 0.8440 0.8187 0.1333 0.0061 75
exp_023 discard id-te campaign step 1/3: id_31 smoothed TE alone (IV=0.538, 108 unique, showed importance=0.0422 in exp_018); isolating id_31 TE to test if it adds signal without displacing id_17 0.3352 0.2574 0.8446 0.8168 0.1400 0.0020 71
exp_024 KEEP amount-patterns campaign step 1/3: amount pattern features (stateless, PSI-safe): round number flags, cents indicator, threshold-testing detection (5-100, 90-1000); no groupby, pure deterministic transforms 0.3404 0.2579 0.8554 0.8312 0.1479 0.0093 77
exp_025 KEEP amount-patterns campaign step 2/3: add anomaly score (Mahalanobis distance from training centroid, Recipe 6) on top of exp_024 SOTA; 20 PSI-safe numeric features, captures unusual transaction profiles 0.3453 0.2621 0.8557 0.8328 0.1500 0.0095 79
exp_026 discard amount-patterns campaign step 3/3: per-category amount deviation (amt vs ProductCD/card6 median/IQR corridor from fit()); amount deviation normalized by IQR captures out-of-corridor fraud 0.3453 0.2621 0.8557 0.8328 0.1500 0.0095 79
exp_027 discard amount-patterns campaign step 3/3 (fixed): per-category amount deviation (amt vs ProductCD/card6 median/IQR) computed before TE drop; fraud often uses amounts outside normal product range 0.3396 0.2294 0.8544 0.8276 0.1462 0.0090 81
exp_028 discard email-agg campaign step 1/3: R_emaildomain aggregations (mean/std amt per domain, card1 distinct per domain); ~60 unique email domains, PSI-stable; email_card_distinct captures compromised email domains 0.3420 0.2450 0.8546 0.8289 0.1459 0.0094 83
exp_029 discard identity-consistency campaign step 1/3: per-card1 modal identity profile (P_email/R_email/DeviceType match flags) + entity sharing counts (n_cards per email domain); Recipe 4 - 'new device + new email = suspicious' 0.3456 0.2578 0.8568 0.8316 0.1495 0.0089 86
exp_030 discard identity-consistency campaign step 2/3: entity sharing only — n_cards sharing same R_emaildomain/P_emaildomain/DeviceType; fraud rings share infrastructure; stateless about card1 so less temporal drift vs profile matching 0.3432 0.2606 0.8560 0.8324 0.1472 0.0094 82
exp_031 discard model-tuning campaign step 1/3: n_estimators=3000, lr=0.02 (from 2000/0.03); lower LR + more trees for better convergence; train_val_psi was 0.014 suggesting slight overfit headroom 0.3380 0.2568 0.8557 0.8293 0.1450 0.0087 79
exp_032 discard model-tuning campaign step 2/3: stronger regularization — subsample=0.75, colsample=0.65, min_child_weight=15, gamma=1.5, reg_lambda=7.0; reduce overfit given train_auroc_gap=0.097 0.3375 0.2605 0.8576 0.8319 0.1462 0.0089 79
exp_033 discard addr1-agg campaign step 1/3: addr1 aggregations (mean/std/count amt + distinct R_emaildomain + distinct card1 per addr1); 318 unique addr1 values, stable entity; geographic fraud pattern capture 0.3434 0.2563 0.8552 0.8285 0.1489 0.0091 84
exp_034 KEEP model-tuning campaign step 3/3: max_depth=5 with stronger regularization (gamma=1.5, reg_lambda=7.0) to compensate; 77 features may benefit from more expressive trees with stronger L2 0.3689 0.2761 0.8633 0.8315 0.1519 0.0070 79
exp_035 discard model-tuning campaign + psi-fix: max_depth=5 (gamma=1.5, reg_lambda=7.0) + drop TransactionID (PSI=12.43 leakage); combining exp_034 model gains with exp_021 leakage fix for better OOT generalization 0.3612 0.2873 0.8613 0.8282 0.1495 0.0072 78
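PSI is the drift gate used throughout (raw TransactionDT showed PSI=12.43; high-PSI runs are rejected outright). A common quantile-bin formulation, sketched below; the bin count, eps, and clipping choices are our assumptions, since the dashboard's exact recipe isn't shown.

```python
import numpy as np

def psi(expected, actual, n_bins=10):
    """Population Stability Index between a train-split distribution
    (expected) and an OOT one (actual). Quantile bins are fit on the
    expected side; OOT values are clipped into range; eps avoids log(0).
    Assumes roughly continuous values (distinct quantile edges)."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    actual = np.clip(actual, edges[0], edges[-1])
    eps = 1e-6
    e = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    a = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((e - a) * np.log(e / a)))
```

Common rules of thumb read PSI below 0.1 as stable, 0.1–0.25 as moderate drift, and above 0.25 as significant drift, consistent with the rejections above at PSI around 0.79–0.94.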