Beta-boosted ensemble for big credit scoring data
Faculty of Economics (Wirtschaftswissenschaftliche Fakultät)
In this work we present a novel ensemble model for credit scoring.
The main idea of the approach is to use a separate beta-binomial distribution
for each class to generate balanced datasets, which are then used
to train the base learners that constitute the final ensemble. Sampling
is performed on two separate ranking lists, one per class, where
the ranking is based on the propensity of observing the positive class. Two strategies are
considered: the first mines easy examples, and the second forces good
classification of hard cases. The proposed solutions are tested on two large
credit scoring datasets.
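
The sampling step described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the shape parameters `a` and `b`, the use of sampling with replacement, and the source of the propensity `scores` (here random numbers standing in for the output of some preliminary model) are all hypothetical. A beta-binomial draw is simulated by drawing a probability from a beta distribution and then a position in the ranking list from a binomial distribution.

```python
import numpy as np


def beta_binomial_positions(n, size, a, b, rng):
    """Draw `size` positions in 0..n-1 from a beta-binomial distribution."""
    p = rng.beta(a, b, size=size)      # one success probability per draw
    return rng.binomial(n - 1, p)      # position in the ranking list


def balanced_sample(scores, y, size, a, b, strategy, rng):
    """Return row indices of a balanced sample: `size` draws per class.

    Assumption: `scores` estimates the propensity of the positive class.
    Each class gets its own ranking list, ordered easiest first:
    positives by descending score, negatives by ascending score.
    With a < b the draws concentrate at the front of the list.
    """
    chosen = []
    for label in (0, 1):
        idx = np.flatnonzero(y == label)
        order = np.argsort(scores[idx])
        if label == 1:
            order = order[::-1]        # high propensity = easy positive
        ranked = idx[order]
        if strategy == "hard":
            ranked = ranked[::-1]      # put the hardest cases first
        pos = beta_binomial_positions(len(ranked), size, a, b, rng)
        chosen.append(ranked[pos])     # sampling with replacement
    return np.concatenate(chosen)


# Toy data: hypothetical scores and imbalanced labels (about 10% positives).
rng = np.random.default_rng(0)
scores = rng.random(1000)
y = (rng.random(1000) < 0.1).astype(int)
size = 50

easy = balanced_sample(scores, y, size, a=1.0, b=5.0, strategy="easy", rng=rng)
hard = balanced_sample(scores, y, size, a=1.0, b=5.0, strategy="hard", rng=rng)
```

Each call yields the row indices of one balanced training set; repeating the call and fitting one base learner per sample would give the members of an ensemble. How the base learners are combined is not specified in the abstract and is left out of this sketch.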