Fraud Auto-Research Dashboard

Last updated: 2026-04-08 00:58:59  |  Experiments are selected on the validation split; OOT is a held-out split used for reporting only
Datasets: 5
Experiments: 91
Kept: 32
Discarded: 59

FDH

Current SOTA: fdh per-scenario campaign step 3/3 v5: cyclic time (sin/cos hour/dow, is_night/weekend) on SOTA — removed unstable interactions, keeping stable signals
Experiments: 12
Kept: 5
Discarded: 2
Best AUPRC (val): 0.3798
Best AUPRC (OOT): 0.2933
Baseline (OOT): 0.1300
Improvement: +125.7%
fdh results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.1325 0.1300 0.6287 0.6037 -0.0341 0.0000 6
exp_001 KEEP fdh per-scenario campaign step 1/3: velocity stack (1h/6h/24h/7d per CUSTOMER_ID+TERMINAL_ID) + customer behavioral deviation (amt z-score, amt_vs_p90) — targets scenarios 1+3 0.2798 0.2527 0.7008 0.6610 0.0208 0.0000 33
exp_002 KEEP fdh per-scenario campaign step 1/3 v2: velocity stack with training-tail continuity fix + behavioral deviation (scenarios 1+3) 0.2663 0.2513 0.7164 0.6743 0.0255 0.0001 32
exp_003 KEEP fdh per-scenario campaign step 2/3: add rolling terminal fraud rate 28d window (Recipe 17) — targets scenario 2 terminal compromise 0.3204 0.2343 0.7718 0.6603 0.0951 0.0201 34
exp_004 PSI reject fdh per-scenario campaign step 3/3: cyclic time (sin/cos hour/dow, is_night) + customer-terminal novelty + key interactions (amt_vs_p90*term_fraud_rate) 0.3295 0.2225 0.8071 0.6549 — 0.7914 38
exp_005 PSI reject fdh per-scenario campaign step 3/3 v2: cyclic time (sin/cos hour/dow, is_night/weekend) + amt_vs_p90*term_fraud_rate interaction — removed split-leaky novelty features 0.3295 0.2225 0.8071 0.6549 — 0.7914 36
exp_006 crash fdh per-scenario campaign step 3/3 v3: cyclic time (sin/cos hour/dow, is_night/weekend) on top of SOTA — fixed TX_DATETIME drop order
exp_007 PSI reject fdh per-scenario campaign step 3/3 v4: cyclic time (sin/cos hour/dow, is_night/weekend) + interactions — fixed feature count 0.4052 0.2787 0.8147 0.6594 — 0.9421 38
exp_008 KEEP fdh per-scenario campaign step 3/3 v5: cyclic time (sin/cos hour/dow, is_night/weekend) on SOTA — removed unstable interactions, keeping stable signals 0.3798 0.2933 0.7931 0.6755 0.1197 0.0173 36
exp_009 discard fdh model tuning: max_depth=4 + subsample=0.8 + colsample=0.8 + min_child_weight=5 + L2=2 to reduce AUROC gap (0.194) 0.3596 0.2801 0.7910 0.6688 0.1123 0.0280 36
exp_010 crash fdh campaign follow-up: static terminal fraud rate (stable OOT) + customer terminal diversity + small-amount flag (scenario 3 CNP signals)
exp_011 discard fdh campaign follow-up: static terminal fraud rate (stable OOT) + customer terminal diversity + small-amount flag (scenario 3 CNP signals) 0.2161 0.1070 0.7072 0.6189 0.0459 0.0797 39
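The cyclic-time step that finally held up here (exp_008: sin/cos hour and day-of-week plus is_night/is_weekend flags, raw timestamp dropped) can be sketched as below. A minimal pandas illustration; the helper name and the exact night/weekend cutoffs are our assumptions — only the TX_DATETIME column name comes from the log.

```python
import numpy as np
import pandas as pd

def add_cyclic_time(df, dt_col="TX_DATETIME"):
    """Encode hour-of-day and day-of-week as sin/cos pairs plus
    is_night / is_weekend flags, then drop the raw timestamp so it
    cannot leak temporal ordering across a temporal split."""
    out = df.copy()
    dt = pd.to_datetime(out[dt_col])
    hour, dow = dt.dt.hour, dt.dt.dayofweek
    out["hour_sin"] = np.sin(2 * np.pi * hour / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hour / 24)
    out["dow_sin"] = np.sin(2 * np.pi * dow / 7)
    out["dow_cos"] = np.cos(2 * np.pi * dow / 7)
    out["is_night"] = hour.between(0, 5).astype(int)   # assumed cutoff
    out["is_weekend"] = (dow >= 5).astype(int)
    return out.drop(columns=[dt_col])
```

Encoding hour as a sin/cos pair keeps 23:00 and 00:00 adjacent, which a raw integer hour does not, and dropping the raw timestamp avoids the PSI blow-ups seen when monotonically increasing time columns cross a temporal split.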

IEEE-CIS — Track B: Fresh Start

Current SOTA: final-model campaign step 1: deeper XGBoost max_depth=6, lr=0.01, n_est=5000, stronger regularization (lambda=3, alpha=0.3, gamma=0.3) on SOTA 88-feature set
Experiments: 21
Kept: 11
Discarded: 10
Best AUPRC (val): 0.3885
Best AUPRC (OOT): 0.3077
Baseline (OOT): 0.2297
Improvement: +33.9%
ieee-cis-fresh results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline step 1/5: minimal freq encoding + id_cluster_present, stock XGBoost max_depth=6 0.3084 0.2297 0.8297 0.7800 0.0705 0.0011 55
exp_001 KEEP regularization campaign step 2/5: max_depth=4, subsample=0.8, min_child_weight=10, drop TransactionDT to kill PSI=12.43 0.3213 0.2455 0.8329 0.8055 0.1287 0.0076 54
exp_002 discard TE campaign step 3/5: OOF target encoding for DeviceInfo/id_33/id_30/id_31/R_emaildomain/card1/P_emaildomain + null flags for all id_* cols 0.2946 0.2110 0.8175 0.7753 0.1080 0.0033 88
exp_003 discard TE campaign step 3b/5: OOF TE for R_emaildomain/ProductCD/card4/card6/P_emaildomain/card1 only (no null flags, focused set) 0.3088 0.2052 0.8309 0.7948 0.1251 0.0088 54
exp_004 discard TE campaign step 3b/5: OOF TE for R_emaildomain/ProductCD/card4/card6/P_emaildomain/card1 only (no null flags, focused set) 0.3088 0.2052 0.8309 0.7948 0.1251 0.0088 54
exp_005 discard uid-aggregation campaign step 4/5: card1+addr1 UID with 8 aggs (mean/std/max/median/count/p75/p25/iqr) + amount zscore/ratio + card1-level aggs 0.3099 0.2798 0.8408 0.8233 0.1247 0.0022 68
exp_006 discard velocity campaign step 5/5: per-card1 D1 velocity stats (median gap, burst count, txn count) + amount behavioral deviation (zscore, ratio) 0.3137 0.2724 0.8324 0.8089 0.1260 0.0063 58
exp_007 discard model-switch campaign step 1: LightGBM num_leaves=63, subsample=0.8 vs XGBoost — better handling of high-cardinality freq-encoded features 0.1453 0.1539 0.7481 0.7432 0.0833 0.0024 54
exp_008 KEEP cyclic-time campaign step 1: extract hour-of-day/day-of-week as cyclic sin/cos features from TransactionDT (mod 86400, stable vs temporal split) 0.3195 0.2530 0.8294 0.7965 0.1330 0.0032 60
exp_009 KEEP identity-consistency campaign step 1: per-card1 distinct email/device/addr counts + new card-email pair flag + is_modal_device 0.3189 0.2525 0.8382 0.8142 0.1353 0.0044 66
exp_010 discard identity-consistency campaign step 2: add card_device_is_new + card_addr_is_new flags + card1_n_card6 count (building on exp_009 card_email_is_new) 0.3146 0.2483 0.8376 0.8117 0.1313 0.0043 69
exp_011 discard encoding-fix campaign step 1: binary id_* cols (id_29/35/36/37/38) → 0/1 encoding + id_30/31/33/DeviceInfo/id_23/id_34 → smoothed TE instead of freq 0.2825 0.2182 0.8171 0.7856 0.1202 0.0013 63
exp_012 KEEP model-depth campaign step 2: XGBoost max_depth=5, lr=0.02, n_est=3000, gamma=0.1, reg_lambda=2.0 — deeper trees with stronger regularization on SOTA features 0.3444 0.2808 0.8456 0.8175 0.1393 0.0060 66
exp_013 KEEP amount-features campaign step 1: per-card1 amt baseline (mean/std/count) + amt zscore/ratio + log_amt + is_round + card6_amt_zscore 0.3589 0.2752 0.8568 0.8224 0.1481 0.0040 75
exp_014 KEEP amount-features campaign step 2: addr1 amount baseline (mean/std/count/zscore) + r_email_amt_ratio — building on amt_is_round/cents from exp_013 0.3603 0.2714 0.8591 0.8251 0.1495 0.0050 80
exp_015 discard card1-TE campaign step 1: add OOF smoothed TE for card1 (primary card identifier, min_samples=10) alongside existing features 0.3397 0.2281 0.8351 0.7892 0.1247 0.0056 81
exp_016 KEEP interaction campaign step 1: ProductCD×card6 freq + card4×card6 freq + dist1_log transform 0.3635 0.2851 0.8596 0.8272 0.1522 0.0049 83
exp_017 KEEP interaction campaign step 2: add R_emaildomain×card6 + R_emaildomain×ProductCD + addr1×card6 interaction freqs (building on productcd_card6_freq from exp_016) 0.3696 0.2796 0.8616 0.8270 0.1566 0.0055 86
exp_018 KEEP interaction campaign step 3: P_emaildomain×card6 + id12×card6 + card3_quantile×card6 interactions (building on r_email×card6 from exp_017) 0.3737 0.2911 0.8630 0.8256 0.1576 0.0047 88
exp_019 discard interaction campaign step 4: id29×card6 + id35×ProductCD + addr1_quantile×R_emaildomain interactions (building on 5-exp keep streak) 0.3701 0.2902 0.8623 0.8244 0.1561 0.0049 90
exp_020 KEEP final-model campaign step 1: deeper XGBoost max_depth=6, lr=0.01, n_est=5000, stronger regularization (lambda=3, alpha=0.3, gamma=0.3) on SOTA 88-feature set 0.3885 0.3077 0.8680 0.8331 0.1670 0.0048 88
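The interaction features behind the exp_016–exp_018 keep streak are frequency encodings of joint categories (e.g. ProductCD×card6), fitted on the training split and mapped onto val/OOT. A minimal sketch; the function names are ours, and mapping unseen pairs to 0.0 is one reasonable convention, not necessarily the dashboard's.

```python
import pandas as pd

def fit_interaction_freq(train, col_a, col_b):
    """Learn frequencies of the joint (col_a, col_b) category on train only."""
    key = train[col_a].astype(str) + "_" + train[col_b].astype(str)
    return key.value_counts(normalize=True).to_dict()

def apply_interaction_freq(df, col_a, col_b, freq):
    """Map train-fitted joint frequencies onto any split; unseen pairs get 0.0."""
    out = df.copy()
    key = out[col_a].astype(str) + "_" + out[col_b].astype(str)
    out[f"{col_a}_{col_b}_freq"] = key.map(freq).fillna(0.0)
    return out
```

Because the mapping is fit on train and frozen, the same pair gets the same value in val and OOT, which keeps the feature PSI-stable as long as the joint category mix itself does not drift.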

ieee-cis_v3_archive

Current SOTA: ieee-cis UID aggregation campaign step 1/4 v2: uid=card1+addr1, 7 label-free UID aggs (no uid_fraud_rate) + card1 amt deviation + P_emaildomain/ProductCD TE
Experiments: 11
Kept: 2
Discarded: 9
Best AUPRC (val): 0.2074
Best AUPRC (OOT): 0.1808
Baseline (OOT): 0.1728
Improvement: +4.6%
ieee-cis_v3_archive results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.2017 0.1728 0.8065 0.7781 0.0839 0.0042 13
exp_001 discard ieee-cis UID aggregation campaign step 1/4: uid=card1+addr1, 8 UID aggs (mean/std/count/min/max amt, fraud_rate TE, n_emails, n_products) + card1 amt deviation + P_emaildomain/ProductCD/card4/card6 TE 0.2400 0.1845 0.8132 0.7717 0.0800 0.0014 24
exp_002 KEEP ieee-cis UID aggregation campaign step 1/4 v2: uid=card1+addr1, 7 label-free UID aggs (no uid_fraud_rate) + card1 amt deviation + P_emaildomain/ProductCD TE 0.2074 0.1808 0.8127 0.7987 0.0920 0.0051 23
exp_003 discard ieee-cis UID aggregation campaign step 2/4: add amt percentiles per UID + email domain match flag + R_emaildomain/DeviceType TEs 0.2411 0.1856 0.8229 0.7977 0.0322 0.0013 27
exp_004 discard ieee-cis UID aggregation campaign step 2/4 v2: add uid percentiles (p25/p75/p90/range) + DeviceType TE — removed R_emaildomain and email_domain_match (caused PSI=0.275) 0.2411 0.1856 0.8229 0.7977 0.0322 0.0013 27
exp_005 discard ieee-cis UID aggregation campaign step 3/4: card1 velocity (1h/24h/7d count+sum via TransactionDT with training-tail) + burst ratio — builds on UID SOTA without leaky percentiles 0.2046 0.1767 0.8141 0.8064 0.0892 0.0019 30
exp_006 discard ieee-cis UID aggregation campaign step 4/4: DeviceType TE + cyclic time (sin/cos hour/dow, is_night/weekend) from TransactionDT + drop raw TransactionDT (PSI=12 temporal drift) 0.2221 0.1746 0.8119 0.7837 0.0847 0.0038 28
exp_007 discard ieee-cis model regularization: max_depth=5, subsample=0.8, colsample_bytree=0.8, min_child_weight=5 — target auroc_gap (0.090) penalty reduction on SOTA features 0.2379 0.1845 0.8255 0.8017 0.0377 0.0016 23
exp_008 discard ieee-cis identity consistency campaign step 1/2 (Recipe 4): modal identity per card1 (P/R_email, DeviceType, DeviceInfo match flags) + entity sharing counts + diversity per card + identity_stability composite — pure label-free consistency signals 0.2244 0.1869 0.8199 0.7971 0.0298 0.0029 24
exp_009 discard ieee-cis amount patterns (stateless, no per-card history): log_amt + round-number flags + amt_cents_bucket + has_cents + log_amt*ProductCD_te interaction — avoids default-fill PSI trap by using only population-level or stateless features 0.2296 0.1824 0.8333 0.8143 0.0912 0.0051 29
exp_010 discard ieee-cis stateless amount patterns only (no interaction): log_amt + round-10 + round-100 + cents_bucket + has_cents — removes log*ProductCD_te interaction that caused auroc_gap increase in exp_009 0.2299 0.1783 0.8285 0.7934 0.0332 0.0024 28
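The one feature step kept in this archive (exp_002) replaces the label-derived uid_fraud_rate with purely label-free UID statistics. A sketch of that pattern, assuming uid = card1+addr1 and TransactionAmt as the amount column; function and output names are ours.

```python
import pandas as pd

def fit_uid_aggs(train, amt_col="TransactionAmt"):
    """Label-free per-UID amount stats, computed on the training split
    only (no uid_fraud_rate, so nothing target-derived can leak)."""
    uid = train["card1"].astype(str) + "_" + train["addr1"].astype(str)
    return train.groupby(uid)[amt_col].agg(
        uid_amt_mean="mean", uid_amt_std="std",
        uid_amt_min="min", uid_amt_max="max", uid_count="count")

def apply_uid_aggs(df, aggs, amt_col="TransactionAmt"):
    """Merge train-fitted stats onto any split and add an amount z-score
    against the UID's own history; unseen UIDs stay NaN."""
    out = df.copy()
    out["uid"] = out["card1"].astype(str) + "_" + out["addr1"].astype(str)
    out = out.merge(aggs, left_on="uid", right_index=True, how="left")
    std = out["uid_amt_std"].where(out["uid_amt_std"] > 0)  # avoid /0
    out["uid_amt_zscore"] = (out[amt_col] - out["uid_amt_mean"]) / std
    return out.drop(columns=["uid"])
```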

Fraud-Sim

Current SOTA: fraud-sim per-card diversity: distinct merchant count + distinct category count in training (fraud = narrower merchant range)
Experiments: 11
Kept: 5
Discarded: 6
Best AUPRC (val): 0.9489
Best AUPRC (OOT): 0.9496
Baseline (OOT): 0.6981
Improvement: +36.0%
fraud-sim results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP baseline: frequency encoding + raw features, XGBoost 0.6517 0.6981 0.9931 0.9936 0.3898 0.0001 16
exp_001 KEEP fraud-sim velocity stack campaign step 1/3: haversine distance + velocity (1h/1d/7d per card_id with training-tail) + behavioral deviation + cyclic time 0.9453 0.9414 0.9988 0.9992 0.7138 0.0001 32
exp_002 discard fraud-sim velocity stack campaign step 2/3: add smoothed merchant TE + merchant amount deviation on top of step 1 0.9447 0.9453 0.9986 0.9988 0.7136 0.0002 34
exp_003 KEEP fraud-sim velocity stack campaign step 3/3: replace merchant_freq_enc with merchant_te + add category_te (smoothed, min_samples=20) 0.9466 0.9458 0.9986 0.9987 0.7162 0.0004 33
exp_004 discard fraud-sim behavioral fingerprint: per-card typical hour deviation + per-card haversine distance zscore and vs-p90 (unusual distance for this card) 0.9466 0.9458 0.9986 0.9987 0.7162 0.0004 36
exp_005 discard fraud-sim add 10-minute velocity window to catch rapid card-testing sequences + 10min/1h burst ratio 0.9476 0.9485 0.9986 0.9987 0.7168 0.0004 39
exp_006 discard fraud-sim key interactions: log_amt*log_dist (big amount far from home) + vel_1h*log_dist (rapid card-testing far from home) 0.9468 0.9473 0.9984 0.9986 0.7155 0.0014 41
exp_007 KEEP fraud-sim demographic features: log_city_pop + amt_per_city_pop (relative txn size for city) 0.9482 0.9496 0.9986 0.9986 0.7191 0.0003 43
exp_008 discard fraud-sim expand TE: add state_te + job_te on top of merchant_te + category_te (replace freq enc for all these) 0.6490 0.6999 0.9768 0.9838 0.3173 0.0017 42
exp_009 discard fraud-sim add state_te (smoothed geographic risk) as additive feature on top of SOTA 0.9422 0.9445 0.9985 0.9983 0.7124 0.0005 44
exp_010 KEEP fraud-sim per-card diversity: distinct merchant count + distinct category count in training (fraud = narrower merchant range) 0.9489 0.9468 0.9987 0.9984 0.7202 0.0004 45
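The haversine distance that anchors the kept velocity stack (exp_001) is the standard great-circle formula between cardholder home and merchant coordinates; a NumPy sketch (function name is ours).

```python
import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in
    degrees; vectorizes over NumPy arrays or pandas columns."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2.0 * 6371.0 * np.arcsin(np.sqrt(a))  # Earth radius ~6371 km
```

Per-card statistics of this distance (z-score, vs-p90) are what exp_004 layered on top; the raw distance plus velocity windows already carried most of the lift here.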

IEEE-CIS — Track A: Continued

Current SOTA: model-tuning campaign step 3/3: max_depth=5 with stronger regularization (gamma=1.5, reg_lambda=7.0) to compensate; 77 features may benefit from more expressive trees with stronger L2
Experiments: 36
Kept: 9
Discarded: 27
Best AUPRC (val): 0.3689
Best AUPRC (OOT): 0.2761
Baseline (OOT): 0.2297
Improvement: +20.2%
ieee-cis results
#  Status  Hypothesis  AUPRC (val)  AUPRC (OOT)  AUROC (val)  AUROC (OOT)  Composite (val)  PSI  Feats
exp_000 KEEP uid-aggregation campaign step 1/5: baseline with id_cluster_present and freq encoding for all categorical columns 0.3084 0.2297 0.8297 0.7800 0.0705 0.0011 55
exp_001 discard uid-aggregation campaign step 2/5: smoothed TE for ProductCD, P_emaildomain, R_emaildomain, card4, card6; drop TransactionDT (PSI=12.43) 0.3104 0.2260 0.8240 0.7745 0.0586 0.0021 59
exp_002 discard uid-aggregation campaign step 2+3/5: TE replaces freq for ProductCD/emaildomains/card4/card6; add UID=card1+addr1 7-agg; drop TransactionDT 0.3290 0.2248 0.8169 0.7736 0.0621 0.0020 61
exp_003 discard uid-aggregation campaign step 3/5 variation: drop 12 dead features + TransactionDT; add individual null flags for R_emaildomain/id_31/id_17/DeviceInfo/DeviceType 0.3198 0.2463 0.8209 0.7641 0.0637 0.0026 47
exp_004 KEEP uid-aggregation campaign step 4/5: model regularization max_depth=4, subsample=0.8, colsample=0.7, min_child_weight=10, gamma=1, reg_lambda=5 to reduce train_val_psi from 0.294 0.3215 0.2548 0.8360 0.8111 0.1111 0.0053 55
exp_005 KEEP uid-aggregation campaign step 5/5: smoothed TE replaces freq for R_emaildomain/ProductCD/card6/card4/P_emaildomain on regularized model 0.3260 0.2332 0.8325 0.8113 0.1240 0.0060 55
exp_006 KEEP velocity-stack campaign step 1/5: UID=card1+addr1 7 amount aggs (mean/std/min/max/count/distinct_email/distinct_product) on regularized model + TE 0.2974 0.2509 0.8321 0.8138 0.1296 0.0070 62
exp_007 discard velocity-stack campaign step 2/5: UID amount deviation (zscore, IQR ratio) + card1 zscore fallback + uid_q25/q75 percentiles 0.3278 0.2801 0.8389 0.8200 0.1205 0.0036 68
exp_008 KEEP velocity-stack campaign step 3/5: per-card1 velocity features (median_gap, std_gap, min_gap, burst_count, daily_rate) via Recipe 2 0.3227 0.2464 0.8452 0.8200 0.1357 0.0060 67
exp_009 discard velocity-stack campaign step 4/5: card1 smoothed TE (smoothing=50) for high-cardinality entity encoding; on top of UID aggs + velocity 0.3143 0.2021 0.8291 0.7933 0.1193 0.0043 68
exp_010 discard velocity-stack campaign step 5/5: TE for id_35+id_15 (strong IV flags); addr1 TE; log(TransactionAmt) on top of UID+velocity SOTA 0.3281 0.2339 0.8417 0.8168 0.1348 0.0054 69
exp_011 discard velocity-stack campaign step 1/3: addr1_te + email×card6 interaction TE + prod×card6 TE + id_null_count on top of UID+velocity SOTA 0.3250 0.2546 0.8431 0.8199 0.1321 0.0056 70
exp_012 discard velocity-stack campaign step 2/3: model max_depth=3 + addr1_te + id_null_count + id_present_count; reduce AUROC gap from 0.097 to allow stable features 0.2827 0.2308 0.8222 0.8085 0.1209 0.0070 70
exp_013 discard velocity-stack campaign step 3/3: addr1_te alone (smoothing=30, medium cardinality) on SOTA UID+velocity+TE model 0.3239 0.2448 0.8417 0.8167 0.1327 0.0061 68
exp_014 KEEP cyclic_time campaign step 1/2: sin/cos hour-of-day + day-of-week from TransactionDT; drop raw TransactionDT (PSI=12.43); log(TransactionAmt) 0.3361 0.2643 0.8476 0.8206 0.1424 0.0086 71
exp_015 discard cyclic_time campaign step 2/2: addr1_te + id_null_count + id_present_count on top of stable cyclic-time SOTA (tv_psi=0.013 gives room for additional features) 0.3272 0.2641 0.8431 0.8193 0.1322 0.0093 74
exp_016 discard deviceinfo-te campaign step 1/4: DeviceInfo OOF TE (IV=1.778, highest signal in dataset) replacing freq encoding; 1546 unique values, smoothing min_samples=100 to handle long tail 0.3203 0.2399 0.8339 0.8052 0.1275 0.0089 74
exp_017 discard deviceinfo-te campaign step 1/4: DeviceInfo OOF TE (IV=1.778, highest signal in dataset) replacing freq encoding; 1546 unique values, smoothing min_samples=100 to handle long tail 0.3203 0.2399 0.8339 0.8052 0.1275 0.0089 74
exp_018 discard deviceinfo-te campaign step 2/4: id_30 smoothed TE (71 unique, IV=0.619) + id_31 TE (108 unique, IV=0.538) + id_33 TE (183 unique, IV=0.879) replacing dead freq encodings; all have high raw IV but zero IV after freq encoding 0.3364 0.2580 0.8446 0.8193 0.1415 0.0027 71
exp_019 discard deviceinfo-te campaign step 2/4: id_30 smoothed TE (71 unique, IV=0.619) + id_31 TE (108 unique, IV=0.538) + id_33 TE (183 unique, IV=0.879) replacing dead freq encodings; all have high raw IV but zero IV after freq encoding 0.3364 0.2580 0.8446 0.8193 0.1415 0.0027 71
exp_020 discard deviceinfo-te campaign step 3/4: prune 8 dead features (IV<0.005 in SOTA): id_10, id_11, id_18, id_21, id_22, dist2 (numeric pass-throughs), id_30/id_33 (freq-enc goes dead), skip txn_dow_cos; reduce noise in 71-feature model 0.2855 0.2460 0.8300 0.8118 0.1270 0.0083 62
exp_021 discard psi-fix campaign step 1/3: drop TransactionID (auto-increment identifier causing PSI=12.43 OOT drift) — it leaks temporal ordering and is not a fraud signal 0.3269 0.2771 0.8455 0.8204 0.1363 0.0062 70
exp_022 discard psi-fix campaign step 2/3: drop TransactionID + add UID2=card1+card5 aggregations (5 stats: mean/std/min/max/count amt); card5 has 110 unique values, AUC=0.554, should improve entity resolution 0.3225 0.2722 0.8440 0.8187 0.1333 0.0061 75
exp_023 discard id-te campaign step 1/3: id_31 smoothed TE alone (IV=0.538, 108 unique, showed importance=0.0422 in exp_018); isolating id_31 TE to test if it adds signal without displacing id_17 0.3352 0.2574 0.8446 0.8168 0.1400 0.0020 71
exp_024 KEEP amount-patterns campaign step 1/3: amount pattern features (stateless, PSI-safe): round number flags, cents indicator, threshold-testing detection (5-100, 90-1000); no groupby, pure deterministic transforms 0.3404 0.2579 0.8554 0.8312 0.1479 0.0093 77
exp_025 KEEP amount-patterns campaign step 2/3: add anomaly score (Mahalanobis distance from training centroid, Recipe 6) on top of exp_024 SOTA; 20 PSI-safe numeric features, captures unusual transaction profiles 0.3453 0.2621 0.8557 0.8328 0.1500 0.0095 79
exp_026 discard amount-patterns campaign step 3/3: per-category amount deviation (amt vs ProductCD/card6 median/IQR corridor from fit()); amount deviation normalized by IQR captures out-of-corridor fraud 0.3453 0.2621 0.8557 0.8328 0.1500 0.0095 79
exp_027 discard amount-patterns campaign step 3/3 (fixed): per-category amount deviation (amt vs ProductCD/card6 median/IQR) computed before TE drop; fraud often uses amounts outside normal product range 0.3396 0.2294 0.8544 0.8276 0.1462 0.0090 81
exp_028 discard email-agg campaign step 1/3: R_emaildomain aggregations (mean/std amt per domain, card1 distinct per domain); ~60 unique email domains, PSI-stable; email_card_distinct captures compromised email domains 0.3420 0.2450 0.8546 0.8289 0.1459 0.0094 83
exp_029 discard identity-consistency campaign step 1/3: per-card1 modal identity profile (P_email/R_email/DeviceType match flags) + entity sharing counts (n_cards per email domain); Recipe 4 - 'new device + new email = suspicious' 0.3456 0.2578 0.8568 0.8316 0.1495 0.0089 86
exp_030 discard identity-consistency campaign step 2/3: entity sharing only — n_cards sharing same R_emaildomain/P_emaildomain/DeviceType; fraud rings share infrastructure; stateless about card1 so less temporal drift vs profile matching 0.3432 0.2606 0.8560 0.8324 0.1472 0.0094 82
exp_031 discard model-tuning campaign step 1/3: n_estimators=3000, lr=0.02 (from 2000/0.03); lower LR + more trees for better convergence; train_val_psi was 0.014 suggesting slight overfit headroom 0.3380 0.2568 0.8557 0.8293 0.1450 0.0087 79
exp_032 discard model-tuning campaign step 2/3: stronger regularization — subsample=0.75, colsample=0.65, min_child_weight=15, gamma=1.5, reg_lambda=7.0; reduce overfit given train_auroc_gap=0.097 0.3375 0.2605 0.8576 0.8319 0.1462 0.0089 79
exp_033 discard addr1-agg campaign step 1/3: addr1 aggregations (mean/std/count amt + distinct R_emaildomain + distinct card1 per addr1); 318 unique addr1 values, stable entity; geographic fraud pattern capture 0.3434 0.2563 0.8552 0.8285 0.1489 0.0091 84
exp_034 KEEP model-tuning campaign step 3/3: max_depth=5 with stronger regularization (gamma=1.5, reg_lambda=7.0) to compensate; 77 features may benefit from more expressive trees with stronger L2 0.3689 0.2761 0.8633 0.8315 0.1519 0.0070 79
exp_035 discard model-tuning campaign + psi-fix: max_depth=5 (gamma=1.5, reg_lambda=7.0) + drop TransactionID (PSI=12.43 leakage); combining exp_034 model gains with exp_021 leakage fix for better OOT generalization 0.3612 0.2873 0.8613 0.8282 0.1495 0.0072 78
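PSI is the drift gate used throughout (raw TransactionDT showed PSI=12.43; high-PSI runs are rejected outright). A common quantile-bin formulation, sketched below; the bin count, eps, and clipping choices are our assumptions, since the dashboard's exact recipe isn't shown.

```python
import numpy as np

def psi(expected, actual, n_bins=10):
    """Population Stability Index between a train-split distribution
    (expected) and an OOT one (actual). Quantile bins are fit on the
    expected side; OOT values are clipped into range; eps avoids log(0).
    Assumes roughly continuous values (distinct quantile edges)."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    actual = np.clip(actual, edges[0], edges[-1])
    eps = 1e-6
    e = np.histogram(expected, bins=edges)[0] / len(expected) + eps
    a = np.histogram(actual, bins=edges)[0] / len(actual) + eps
    return float(np.sum((e - a) * np.log(e / a)))
```

Common rules of thumb read PSI below 0.1 as stable, 0.1–0.25 as moderate drift, and above 0.25 as significant drift, consistent with the rejections above at PSI around 0.79–0.94.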