A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company

Document Type: Original Manuscript


School of Industrial Engineering, I.U.S.T Tehran, Iran



According to findings from the Massachusetts Institute of Technology, organizations’ data double every five years; however, the rate at which these data are used is 0.3. Data mining tools now greatly facilitate the extraction of knowledge from such a welter of data. This paper presents a hybrid model built on data gathered from an ATM manufacturing company. The research steps follow CRISP-DM. In the first step, business understanding, the company and its units were studied. Next, the data collected from the sales unit were prepared for preprocessing. During preprocessing, the data in some columns of the dataset were either categorized or coded, depending on their type and the purpose of the research. The data were then imported into Clementine software for modeling and pattern discovery. The results show that machines with the same Machine Code, belonging to the same customers but located in different provinces, experience significantly different Problem Codes, which could be due to weather conditions, the local culture of ATM use, and similar factors. Moreover, for the same Machine Code and the same Problem Code, Repair Time differs significantly, and differences in technicians' expertise appear to be one cause; this could reflect technicians' training background, level of expertise, and the like. Finally, the company can use the outputs of this model to support its strategic decision-making.

