Performance of Combined Models in Discrete Binary Classification
Abstract
Diverse Discrete Discriminant Analysis (DDA) models perform differently on different samples. This fact has encouraged research into combined models, an approach that seems particularly promising when the a priori classes are not well separated or when small or moderate-sized samples are considered, as often occurs in practice. In this study, we evaluate the performance of a convex combination of two DDA models: the First-Order Independence Model (FOIM) and the Dependence Trees Model (DTM). We use simulated data sets with two classes and consider several data complexity factors that may influence the performance of the combined model: the separation of classes, class balance, and the number of missing states, as well as the sample size and the number of parameters to be estimated in DDA. We resort to cross-validation to evaluate the precision of classification. The results illustrate the advantage of the proposed combination over FOIM and DTM alone: it yields the best results, especially when very small samples are considered. The experimental study also provides a ranking of the data complexity factors, according to their relative impact on classification performance, by means of a regression model. It leads to the conclusion that the separation of classes is the most influential factor in classification performance. The ratio between the number of degrees of freedom and the sample size, along with the proportion of missing states in the minority class, also has a significant impact on classification performance. An additional gain of this study, also derived from the estimated regression model, is the ability to successfully predict the precision of classification in a real data set from its data complexity factors.
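The combination evaluated in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: FOIM is fitted as independent per-feature Bernoulli rates (with Laplace smoothing, an assumed choice), and the dependence tree is replaced by a fixed chain over the features for brevity; a real DTM would select the tree structure that maximizes pairwise mutual information (Chow-Liu). The combination weight `beta` and the toy data are likewise hypothetical.

```python
def fit_foim(X):
    """First-Order Independence Model: per-feature Bernoulli rates,
    with Laplace smoothing (an assumed choice for this sketch)."""
    n, d = len(X), len(X[0])
    return [(sum(x[j] for x in X) + 1) / (n + 2) for j in range(d)]

def foim_prob(x, theta):
    """P(x | class) under first-order independence: product of marginals."""
    p = 1.0
    for xj, t in zip(x, theta):
        p *= t if xj else 1 - t
    return p

def fit_chain_dtm(X):
    """Dependence-tree likelihood with a FIXED chain x0 -> x1 -> ...
    (illustrative stand-in: a real DTM picks the tree by mutual information)."""
    n, d = len(X), len(X[0])
    p0 = (sum(x[0] for x in X) + 1) / (n + 2)
    cond = []  # cond[j][v] = P(x_{j+1} = 1 | x_j = v), Laplace-smoothed
    for j in range(d - 1):
        tbl = []
        for v in (0, 1):
            num = sum(1 for x in X if x[j] == v and x[j + 1] == 1) + 1
            den = sum(1 for x in X if x[j] == v) + 2
            tbl.append(num / den)
        cond.append(tbl)
    return p0, cond

def dtm_prob(x, model):
    """P(x | class) under the chain: root marginal times edge conditionals."""
    p0, cond = model
    p = p0 if x[0] else 1 - p0
    for j, tbl in enumerate(cond):
        q = tbl[x[j]]
        p *= q if x[j + 1] else 1 - q
    return p

def combined_classify(x, models, priors, beta=0.5):
    """Convex combination P(x|c) = beta*P_FOIM + (1-beta)*P_DTM;
    assign x to the class maximizing prior times the mixture."""
    scores = {c: priors[c] * (beta * foim_prob(x, theta)
                              + (1 - beta) * dtm_prob(x, dtm))
              for c, (theta, dtm) in models.items()}
    return max(scores, key=scores.get)

# Toy two-class sample (hypothetical data, binary features).
X0 = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 0, 0)]
X1 = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
models = {0: (fit_foim(X0), fit_chain_dtm(X0)),
          1: (fit_foim(X1), fit_chain_dtm(X1))}
priors = {0: 0.5, 1: 0.5}
```

With equal priors and `beta = 0.5`, `combined_classify((0, 0, 0), models, priors)` assigns the all-zeros pattern to class 0 and the all-ones pattern to class 1. In the study, `beta` would be tuned (e.g., by cross-validation) rather than fixed.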