Presenting a method to reduce the sensitivity of incremental clustering algorithms of XML documents based on collective intelligence algorithms

Nazari Farokhi, Mohammad; Nazari Farokhi, Ebrahim; Norouzbakhsh, Ali

doi:10.22091/jemsc.2024.9990.1176

Engineering Management and Soft Computing
Article 11, Volume 9, Issue 2 - Serial Number 17, Mehr 1402 (September-October 2023), Pages 177-187
Article type: Research article
Digital Object Identifier (DOI): 10.22091/jemsc.2024.9990.1176
Authors
Mohammad Nazari Farokhi*; Ebrahim Nazari Farokhi; Ali Norouzbakhsh
Department of Management, Faculty of Management, University of Science and Research, Tehran, Iran
Abstract
Various methods have been proposed for storing and retrieving information from semi-structured documents, and most of them follow either a batch or an incremental approach. The batch approach assumes that the entire document collection is available for clustering and that documents may be processed several times, which increases the execution time of such algorithms. In the incremental approach, the documents are not all available at once; they are supplied to the clustering method over time, so these algorithms take less time to run than batch methods. In this research, the proposed method was compared with the XCLS and XCLS+ methods on three evaluation criteria: Entropy, Purity, and Fscore. The results showed that the proposed method outperforms XCLS on all three criteria and outperforms XCLS+ on Entropy and Purity; only on Fscore does it perform slightly worse than XCLS+.
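The incremental approach described in the abstract can be illustrated with a generic single-pass sketch. This is not XCLS, XCLS+, or the authors' proposed method: it is a plain leader-style clustering, and the tag-set document representation, the Jaccard similarity, the `threshold` value, and all function names are assumptions made only for illustration. Each document is seen once, in arrival order, and is either added to the most similar existing cluster or used to start a new one.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of XML tag names."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def incremental_cluster(documents, threshold=0.5):
    """Assign each document to a cluster in a single pass.

    `documents` is an iterable of tag-name sets (an assumed, simplified
    representation of XML structure). `threshold` is a hypothetical
    similarity cut-off, not a value taken from the paper.
    """
    clusters = []  # each cluster: {"tags": set, "members": [doc indices]}
    for idx, tags in enumerate(documents):
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(tags, cluster["tags"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            # Similar enough: join the cluster and extend its tag summary.
            best["members"].append(idx)
            best["tags"] |= tags
        else:
            # No sufficiently similar cluster yet: start a new one.
            clusters.append({"tags": set(tags), "members": [idx]})
    return clusters


docs = [
    {"book", "title", "author"},
    {"book", "title", "author", "year"},
    {"order", "item", "price"},
    {"order", "item"},
]
for cluster in incremental_cluster(docs):
    print(cluster["members"])
```

Because every document is processed exactly once, the cost grows with the number of documents times the number of clusters, whereas a batch method may revisit the whole collection several times.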
Keywords
particle optimization algorithm; semi-structured documents; incremental clustering; collective intelligence
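The three evaluation criteria named in the abstract can be sketched as follows. The abstract does not give the formulas, so this block assumes the definitions commonly used for clustering evaluation: purity as the fraction of documents that fall in the majority class of their cluster, entropy as the size-weighted class entropy of each cluster, and the clustering F-score as the class-size-weighted best F1 over clusters. Function names are hypothetical.

```python
from collections import Counter
from math import log2


def contingency(labels_true, labels_pred):
    """Map each cluster id to a Counter of the true classes it contains."""
    table = {}
    for cls, clu in zip(labels_true, labels_pred):
        table.setdefault(clu, Counter())[cls] += 1
    return table


def purity(labels_true, labels_pred):
    """Fraction of documents in the majority class of their cluster."""
    table = contingency(labels_true, labels_pred)
    return sum(max(c.values()) for c in table.values()) / len(labels_true)


def entropy(labels_true, labels_pred):
    """Size-weighted average of the class entropy inside each cluster."""
    table = contingency(labels_true, labels_pred)
    n = len(labels_true)
    total = 0.0
    for counts in table.values():
        size = sum(counts.values())
        h = sum((v / size) * log2(size / v) for v in counts.values())
        total += (size / n) * h
    return total


def f_score(labels_true, labels_pred):
    """Class-size-weighted best F1 between each class and any cluster."""
    table = contingency(labels_true, labels_pred)
    n = len(labels_true)
    total = 0.0
    for cls, cls_size in Counter(labels_true).items():
        best = 0.0
        for counts in table.values():
            hit = counts.get(cls, 0)
            if hit == 0:
                continue
            precision = hit / sum(counts.values())
            recall = hit / cls_size
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (cls_size / n) * best
    return total


truth = ["a", "a", "b", "b"]
predicted = [0, 0, 0, 1]
print(purity(truth, predicted), entropy(truth, predicted), f_score(truth, predicted))
```

Under these definitions, higher purity and F-score indicate better clusterings, while lower entropy is better.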


University of Qom Scientific Journals Management System
