A Hybrid Approach Based on Deep Learning and Machine Learning for Fake News Detection: A Case Study of Persian News on the Coronavirus
Sciences and Techniques of Information Management (علوم و فنون مدیریت اطلاعات)
Article 11, Volume 8, Issue 3, Mehr 1401 (September-October 2022), pages 283-316
Article type: Research article
Digital Object Identifier (DOI): 10.22091/stim.2021.7311.1640
Authors
Vahid Mottaghi 1; Mahdi Esmaeili* 2; Ghasem Ali Bazaee 3; Mohammad Ali Afshar Kazemi 4
1 PhD candidate, Department of Information Technology Management, Qeshm Branch, Islamic Azad University, Qeshm, Iran
2 Assistant Professor, Department of Computer Science, Kashan Branch, Islamic Azad University, Kashan, Iran
3 Assistant Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran
4 Associate Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran
Abstract
Objective: False or unverified information spreads on the web exactly as accurate information does. It can therefore go viral and influence public opinion and decisions. Fake news and rumors are, respectively, the most common forms of false and unverified information, and they must be detected as early as possible to prevent their significant effects. Interest in effective detection techniques has grown very rapidly in recent years. Fake news detection is treated as a classification problem in natural language processing and text mining; its goal is to separate fake news from real news in extracted texts and to improve detection accuracy. Convolutional neural networks, among the most important deep learning models, have achieved high accuracy on these problems. These networks have shortcomings, such as ignoring the position of words, which is addressed here by using a capsule network. To deal with the heavy computation of fully connected layers and with the size of the parameter space, the XGBoost algorithm and particle swarm optimization (PSO) are proposed in order to reach optimal accuracy and precision. Method: The present study is applied research in which about 42,000 Persian news items from different cities of Iran were collected from Twitter. Cleaning and preprocessing methods were used to remove extraneous information, and after labeling, the news items were used in the proposed approach, which was implemented in Python and its related libraries with machine learning and deep learning algorithms. Findings: In the experiments and tests, some machine learning algorithms proved stronger on classification problems. However, with the proposed changes applied to the structure of the convolutional network and the capsule network, better results were obtained than with the machine learning algorithms, the baseline algorithms, and the other algorithms evaluated. Conclusion: Compared with baseline algorithms and with existing solutions to the problems mentioned, the proposed approach adds no overhead in the number of features or in network depth. By changing the input, it achieves better and acceptable results than other approaches in the literature, reaching accuracy and precision of about 96 percent.
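The abstract names particle swarm optimization (PSO) as the tool for searching the parameter space, but gives no implementation details. The following is a minimal sketch of a basic global-best PSO in plain Python, not the authors' code. The objective `toy_validation_error`, the parameter names `learning_rate` and `max_depth`, the bounds, and the swarm settings are all hypothetical choices made for illustration; in practice the objective would train a classifier and return its validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                  # personal best positions
    best_val = [objective(p) for p in pos]          # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]      # global best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp back into the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_validation_error(params):
    # Hypothetical stand-in for "1 - validation accuracy" of a classifier.
    # Its minimum is at learning_rate = 0.1, max_depth = 6.
    learning_rate, max_depth = params
    return (learning_rate - 0.1) ** 2 + ((max_depth - 6.0) / 10.0) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(0.01, 0.5), (2.0, 12.0)])
```

Because PSO only needs objective values, not gradients, the same loop can in principle wrap any classifier whose hyperparameters are searched; an integer parameter such as tree depth would need rounding inside the objective.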
Keywords
Natural language processing; text classification; capsule neural networks; fake news detection; coronavirus; deep learning; machine learning; Persian news
Article title [English]
Providing a Hybrid Approach Based on Deep Learning and Machine Learning to Detect Fake News - A Case Study of Persian News in the Field of COVID-19
Authors [English]
Vahid Mottaghi 1; Mahdi Esmaeili 2; Ghasem Ali Bazaee 3; Mohammad Ali Afshar Kazemi 4
1 PhD Candidate in IT Management, Department of IT Management, Qeshm Branch, Islamic Azad University, Qeshm, Iran
2 Assistant Professor, Department of Computer Science, Kashan Branch, Islamic Azad University, Kashan, Iran
3 Assistant Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran
4 Associate Professor, Department of Management, Central Tehran Branch, Islamic Azad University, Tehran, Iran
Abstract [English]
Objectives: False or unconfirmed information spreads on the web just as accurate information does, so it can go viral and influence public opinion and decisions. Fake news and rumors are, respectively, the most common forms of false and unverified information, and they should be detected as early as possible to limit their effects. Interest in effective detection techniques has been increasing in recent years. Fake news detection is treated as a classification problem in natural language processing and text mining: the aim is to distinguish fake news from real news in extracted texts, and improving detection accuracy is the main concern of this research. Convolutional neural networks, among the most important deep learning models, have achieved high accuracy on such problems. However, these networks do not take the position of words into account, a limitation addressed here by using a capsule network. To reach optimal accuracy, two further problems, the heavy computation of fully connected layers and the size of the parameter space, are addressed with the XGBoost algorithm and particle swarm optimization (PSO). Methods: This is an applied study in which about 42,000 Persian news items from different cities of Iran were collected from Twitter. Cleaning and preprocessing removed extraneous information, and after labeling, the news items were used in the proposed approach, implemented in Python and related libraries with machine learning and deep learning algorithms. Results: In testing, some machine learning algorithms performed strongly on classification problems, but with the changes made to the structure of the convolutional network and the capsule network, better results were obtained than with the machine learning algorithms and other similar algorithms.
Conclusions: Compared with baseline algorithms and with existing solutions to the problems mentioned, the proposed approach replaces the classifier with an optimized one, reduces the parameter space, and changes the input. It achieves better and more acceptable results than other approaches, with an accuracy of about 96%.
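The abstract does not say which capsule formulation is used. As one illustration of how a capsule layer differs from a plain convolutional feature, the sketch below implements the squashing nonlinearity from the original capsule network paper (Sabour, Frosst and Hinton, 2017), which rescales a capsule's output vector so that its length lies between 0 and 1 while its direction is preserved. The function name `squash`, the `eps` guard, and the example vectors are assumptions made here, not taken from the article.

```python
import math


def squash(s, eps=1e-9):
    """Scale vector s to length |s|^2 / (1 + |s|^2), keeping its direction."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # eps avoids division by zero for an all-zero capsule input.
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    """Euclidean length of a capsule output, read as a presence score."""
    return math.sqrt(sum(x * x for x in v))


# Hypothetical capsule inputs: a weak activation and a strong one.
weak = squash([0.1, 0.0])
strong = squash([3.0, 4.0])   # input length 5, output length close to 25/26
```

In a capsule classifier, the length of each class capsule's squashed output is typically read as the score for that class, so the strong vector above would signal a confident detection and the weak one a near absence.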
Keywords [English]
Natural Language Processing, Text Classification, Capsule Neural Networks, Fake News Detection, Corona Virus Fake News
