K-Means and K-Medoids Algorithm Comparison for Clustering Forest Fire Location in Indonesia

Ichwanul Muslim Karo Karo; Sri Dewi; Mardiana Mardiana; Fanny Ramadhani; Putri Harliana

doi:10.33019/jurnalecotipe.v10i1.3896

Authors

Ichwanul Muslim Karo Karo, Computer Science, Medan State of University
Sri Dewi, Computer Science, Medan State of University
Mardiana Mardiana, Electrical Engineering, Medan State Polytechnic
Fanny Ramadhani, Computer Science, Medan State of University
Putri Harliana, Computer Science, Medan State of University

Keywords:

Clustering, K-Means, K-Medoids, Feature Importance

Abstract

Forest fires are the most common cause of deforestation in Indonesia. They threaten the survival of living things and have therefore drawn attention from many parties. One preventive measure is to group fire hotspots into areas with fire potential using clustering. This study compares the K-Means and K-Medoids clustering algorithms on hotspot location data obtained from Global Forest Watch (GFW). It also uses feature importance to analyze which variables affect the clustering process. Nine variables are used in clustering, of which Acq_time is the most important. The cluster quality of both algorithms is evaluated using the silhouette coefficient (SC). Both algorithms produce strong clusters, and the best number of clusters is six. K-Medoids groups the data better than K-Means.
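
As a rough illustration of the comparison described in the abstract, the sketch below clusters synthetic 2-D points (not the GFW hotspot data) with a basic K-Means and a simple alternating K-Medoids, then scores each result with the mean silhouette coefficient. The function names, the random initialization, and the blob data are assumptions made for this example; the study's own implementation and parameters are not given here.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.dist(a, b)


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def k_means(points, k, iters=50, seed=0):
    """Basic Lloyd-style K-Means: each center is the mean of its cluster."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers)


def k_medoids(points, k, iters=50, seed=0):
    """Simple alternating K-Medoids: each center is an actual data point."""
    medoids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # The medoid minimizes the total distance to its cluster members.
            new_medoids.append(
                min(members, key=lambda m: sum(dist(m, q) for q in members))
                if members else medoids[j])
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(points, medoids)


def silhouette(points, labels):
    """Mean silhouette coefficient s(i) = (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # s(i) is taken as 0 for singleton clusters
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic stand-in for hotspot coordinates: three separated blobs.
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(20)]

scores = {}
for k in (2, 3, 4):
    scores[k] = (silhouette(points, k_means(points, k)),
                 silhouette(points, k_medoids(points, k)))
    print(k, round(scores[k][0], 3), round(scores[k][1], 3))
```

Values closer to 1 indicate tighter, better separated clusters. Scanning the number of clusters in this way is one plausible route to a preferred value such as the six clusters reported above.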
