Prediction Probability Distributions
Comparison of predicted fraud probabilities for actual fraud cases (y=1) across different undersampling ratios. Top row: histograms; bottom row: CDFs. Methods with subscript G indicate graph-enhanced versions using GCN embeddings.
Model types shown: NeuralNet, RandomForest, ExtraTrees, LightGBM, CatBoost, XGBoost.
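The two views in the figure (histogram and CDF of predicted probabilities, restricted to true fraud cases) can be computed along the following lines. This is a minimal sketch, not the authors' plotting code: the function name `fraud_probability_summary`, the bin count, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def fraud_probability_summary(y_true, y_prob, bins=20):
    """Histogram and empirical CDF of predicted probabilities for y=1 cases.

    Hypothetical helper: the name, signature, and bin count are illustrative.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)

    # Keep only the predictions made for actual fraud cases, sorted for the CDF.
    fraud_probs = np.sort(y_prob[y_true == 1])

    # Histogram over the full probability range [0, 1].
    counts, edges = np.histogram(fraud_probs, bins=bins, range=(0.0, 1.0))

    # Empirical CDF: fraction of fraud cases scored at or below each sorted value.
    cdf = np.arange(1, fraud_probs.size + 1) / fraud_probs.size
    return counts, edges, fraud_probs, cdf

# Toy data (made up): four fraud cases and two legitimate ones.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_prob = np.array([0.9, 0.2, 0.4, 0.95, 0.1, 0.7])
counts, edges, xs, cdf = fraud_probability_summary(y_true, y_prob, bins=4)
```

For a model that separates fraud well, most of the histogram mass sits near 1 and the CDF stays low until the right edge of the plot.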
Undersampling Experiments
We evaluated the robustness of the proposed method across undersampling ratios, reported below as fraud ratios from 5% to 50%, while keeping the test set unchanged. Methods with subscript G are graph-enhanced versions using GCN embeddings; each appears directly below its baseline.
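The page does not spell out the resampling procedure, so the following is only a sketch of one common way to do it: randomly discard non-fraud training rows until fraud makes up the target share, and leave the test set untouched. The function name `undersample_to_ratio`, the reading of "fraud ratio" as the fraud share of the resampled training set, and the toy data are assumptions.

```python
import numpy as np

def undersample_to_ratio(X, y, fraud_ratio, seed=0):
    """Randomly drop non-fraud rows so fraud is `fraud_ratio` of the result.

    Hypothetical helper; random majority-class undersampling is an assumption,
    not a procedure confirmed by the source.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    fraud_idx = np.flatnonzero(y == 1)
    legit_idx = np.flatnonzero(y == 0)

    # Solve n_fraud / (n_fraud + n_legit) = r for n_legit.
    n_legit = int(round(fraud_idx.size * (1.0 - fraud_ratio) / fraud_ratio))
    # Never ask for more non-fraud rows than exist.
    n_legit = min(n_legit, legit_idx.size)

    keep_legit = rng.choice(legit_idx, size=n_legit, replace=False)
    keep = np.sort(np.concatenate([fraud_idx, keep_legit]))
    return X[keep], y[keep]

# Toy training set (made up): 1000 rows, 10 of them fraud.
y_train = np.zeros(1000, dtype=int)
y_train[:10] = 1
X_train = np.arange(1000).reshape(-1, 1)

# Resample the training data only; the test set would stay as it is.
X_res, y_res = undersample_to_ratio(X_train, y_train, fraud_ratio=0.10)
```

All fraud rows are kept, so a higher fraud ratio means a smaller training set with fewer legitimate examples.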
Table 1: Performance Metrics (Fraud Ratio: 5%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9952 | 0.5531 | 0.8679 | 0.6756 | 0.9954 |
| NeuralNet_G | 0.9968 | 0.6528 | 0.9489 | 0.7735 | 0.9993 |
| RandomForest | 0.9954 | 0.5639 | 0.8406 | 0.6750 | 0.9903 |
| RandomForest_G | 0.9971 | 0.6769 | 0.9284 | 0.7830 | 0.9989 |
| ExtraTrees | 0.9959 | 0.6054 | 0.8356 | 0.7021 | 0.9907 |
| ExtraTrees_G | 0.9947 | 0.5193 | 0.9167 | 0.6631 | 0.9979 |
| LightGBM | 0.9946 | 0.5188 | 0.8207 | 0.6357 | 0.9933 |
| LightGBM_G | 0.9972 | 0.6840 | 0.9434 | 0.7930 | 0.9992 |
| CatBoost | 0.9957 | 0.5966 | 0.7885 | 0.6793 | 0.9772 |
| CatBoost_G | 0.9966 | 0.6464 | 0.9095 | 0.7557 | 0.9978 |
| XGBoost | 0.9951 | 0.5477 | 0.8168 | 0.6557 | 0.9899 |
| XGBoost_G | 0.9942 | 0.4975 | 0.9306 | 0.6484 | 0.9885 |
| KNeighbors | 0.9841 | 0.0873 | 0.1877 | 0.1192 | 0.7354 |
| KNeighbors_G | 0.9868 | 0.1126 | 0.1899 | 0.1414 | 0.7426 |
Table 2: Performance Metrics (Fraud Ratio: 10%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9902 | 0.3625 | 0.9328 | 0.5221 | 0.9949 |
| NeuralNet_G | 0.9952 | 0.5455 | 0.9783 | 0.7005 | 0.9994 |
| RandomForest | 0.9930 | 0.4453 | 0.8745 | 0.5901 | 0.9931 |
| RandomForest_G | 0.9956 | 0.5707 | 0.9617 | 0.7163 | 0.9992 |
| ExtraTrees | 0.9945 | 0.5097 | 0.8629 | 0.6408 | 0.9917 |
| ExtraTrees_G | 0.9924 | 0.4262 | 0.9589 | 0.5901 | 0.9988 |
| LightGBM | 0.9934 | 0.4609 | 0.9095 | 0.6118 | 0.9959 |
| LightGBM_G | 0.9960 | 0.5914 | 0.9772 | 0.7369 | 0.9994 |
| CatBoost | 0.9938 | 0.4776 | 0.8684 | 0.6162 | 0.9940 |
| CatBoost_G | 0.9959 | 0.5882 | 0.9706 | 0.7325 | 0.9991 |
| XGBoost | 0.9940 | 0.4839 | 0.8318 | 0.6118 | 0.9731 |
| XGBoost_G | 0.9949 | 0.5298 | 0.9622 | 0.6834 | 0.9978 |
| KNeighbors | 0.9603 | 0.0470 | 0.3082 | 0.0816 | 0.7838 |
| KNeighbors_G | 0.9613 | 0.0485 | 0.3087 | 0.0838 | 0.7855 |
Table 3: Performance Metrics (Fraud Ratio: 20%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9811 | 0.2235 | 0.9289 | 0.3603 | 0.9924 |
| NeuralNet_G | 0.9916 | 0.4055 | 0.9872 | 0.5748 | 0.9991 |
| RandomForest | 0.9861 | 0.2782 | 0.8962 | 0.4246 | 0.9933 |
| RandomForest_G | 0.9910 | 0.3869 | 0.9778 | 0.5545 | 0.9989 |
| ExtraTrees | 0.9910 | 0.3780 | 0.8873 | 0.5302 | 0.9918 |
| ExtraTrees_G | 0.9871 | 0.3052 | 0.9783 | 0.4653 | 0.9986 |
| LightGBM | 0.9869 | 0.2950 | 0.9317 | 0.4481 | 0.9954 |
| LightGBM_G | 0.9935 | 0.4690 | 0.9906 | 0.6366 | 0.9995 |
| CatBoost | 0.9901 | 0.3620 | 0.9528 | 0.5247 | 0.9982 |
| CatBoost_G | 0.9943 | 0.5027 | 0.9822 | 0.6650 | 0.9992 |
| XGBoost | 0.9865 | 0.2851 | 0.9062 | 0.4338 | 0.9950 |
| XGBoost_G | 0.9947 | 0.5217 | 0.9822 | 0.6814 | 0.9994 |
| KNeighbors | 0.9016 | 0.0295 | 0.5069 | 0.0557 | 0.8111 |
| KNeighbors_G | 0.9017 | 0.0295 | 0.5069 | 0.0558 | 0.8114 |
Table 4: Performance Metrics (Fraud Ratio: 30%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9707 | 0.1576 | 0.9478 | 0.2703 | 0.9926 |
| NeuralNet_G | 0.9853 | 0.2780 | 0.9839 | 0.4335 | 0.9980 |
| RandomForest | 0.9780 | 0.1976 | 0.9267 | 0.3258 | 0.9935 |
| RandomForest_G | 0.9853 | 0.2782 | 0.9833 | 0.4337 | 0.9987 |
| ExtraTrees | 0.9860 | 0.2792 | 0.9089 | 0.4272 | 0.9925 |
| ExtraTrees_G | 0.9773 | 0.1993 | 0.9839 | 0.3314 | 0.9981 |
| LightGBM | 0.9748 | 0.1783 | 0.9428 | 0.2999 | 0.9950 |
| LightGBM_G | 0.9892 | 0.3455 | 0.9911 | 0.5123 | 0.9991 |
| CatBoost | 0.9849 | 0.2719 | 0.9739 | 0.4251 | 0.9983 |
| CatBoost_G | 0.9917 | 0.4086 | 0.9883 | 0.5782 | 0.9993 |
| XGBoost | 0.9768 | 0.1900 | 0.9334 | 0.3158 | 0.9947 |
| XGBoost_G | 0.9924 | 0.4276 | 0.9895 | 0.5972 | 0.9994 |
| KNeighbors | 0.8380 | 0.0221 | 0.6319 | 0.0427 | 0.8115 |
| KNeighbors_G | 0.8404 | 0.0227 | 0.6385 | 0.0438 | 0.8163 |
Table 5: Performance Metrics (Fraud Ratio: 40%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9467 | 0.0924 | 0.9417 | 0.1682 | 0.9883 |
| NeuralNet_G | 0.9806 | 0.2266 | 0.9928 | 0.3690 | 0.9975 |
| RandomForest | 0.9692 | 0.1506 | 0.9434 | 0.2598 | 0.9937 |
| RandomForest_G | 0.9795 | 0.2169 | 0.9883 | 0.3557 | 0.9986 |
| ExtraTrees | 0.9797 | 0.2101 | 0.9206 | 0.3422 | 0.9926 |
| ExtraTrees_G | 0.9715 | 0.1658 | 0.9878 | 0.2839 | 0.9979 |
| LightGBM | 0.9711 | 0.1600 | 0.9539 | 0.2741 | 0.9942 |
| LightGBM_G | 0.9870 | 0.3045 | 0.9956 | 0.4664 | 0.9991 |
| CatBoost | 0.9778 | 0.2023 | 0.9772 | 0.3351 | 0.9977 |
| CatBoost_G | 0.9890 | 0.3410 | 0.9911 | 0.5074 | 0.9992 |
| XGBoost | 0.9677 | 0.1459 | 0.9567 | 0.2532 | 0.9946 |
| XGBoost_G | 0.9888 | 0.3372 | 0.9906 | 0.5032 | 0.9992 |
| KNeighbors | 0.7703 | 0.0176 | 0.7152 | 0.0344 | 0.8047 |
| KNeighbors_G | 0.7709 | 0.0177 | 0.7174 | 0.0346 | 0.8069 |
Table 6: Performance Metrics (Fraud Ratio: 50%)
| Method | Accuracy | Precision | Recall | F1-score | AUC |
|---|---:|---:|---:|---:|---:|
| NeuralNet | 0.9360 | 0.0788 | 0.9528 | 0.1456 | 0.9890 |
| NeuralNet_G | 0.9769 | 0.1976 | 0.9933 | 0.3297 | 0.9969 |
| RandomForest | 0.9587 | 0.1179 | 0.9572 | 0.2099 | 0.9931 |
| RandomForest_G | 0.9735 | 0.1766 | 0.9895 | 0.2997 | 0.9982 |
| ExtraTrees | 0.9678 | 0.1449 | 0.9445 | 0.2513 | 0.9928 |
| ExtraTrees_G | 0.9649 | 0.1390 | 0.9895 | 0.2438 | 0.9977 |
| LightGBM | 0.9581 | 0.1167 | 0.9622 | 0.2081 | 0.9936 |
| LightGBM_G | 0.9804 | 0.2255 | 0.9972 | 0.3678 | 0.9987 |
| CatBoost | 0.9691 | 0.1544 | 0.9828 | 0.2668 | 0.9972 |
| CatBoost_G | 0.9875 | 0.3144 | 0.9956 | 0.4779 | 0.9994 |
| XGBoost | 0.9575 | 0.1161 | 0.9706 | 0.2075 | 0.9943 |
| XGBoost_G | 0.9840 | 0.2618 | 0.9895 | 0.4141 | 0.9989 |
| KNeighbors | 0.6895 | 0.0143 | 0.7846 | 0.0281 | 0.7971 |
| KNeighbors_G | 0.6899 | 0.0144 | 0.7857 | 0.0282 | 0.7985 |
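
The F1-score column in each table is the harmonic mean of the Precision and Recall columns, F1 = 2PR / (P + R). As a quick consistency check, the sketch below recomputes it for the two NeuralNet rows of Table 1. The helper name `f1_from` is an assumption; the numbers are copied from the table, and small differences are expected because the table values are rounded to four decimals.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper name)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) copied from Table 1 (fraud ratio 5%).
table1_rows = {
    "NeuralNet": (0.5531, 0.8679, 0.6756),
    "NeuralNet with GCN embeddings": (0.6528, 0.9489, 0.7735),
}

for name, (p, r, reported) in table1_rows.items():
    recomputed = f1_from(p, r)
    # Agreement to about three decimals is what rounding allows.
    print(f"{name}: recomputed {recomputed:.4f}, reported {reported:.4f}")
```

The same formula explains why F1 falls at higher fraud ratios even though recall rises: precision drops much faster than recall grows, and the harmonic mean is dominated by the smaller of the two.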