Assessing semantic similarity of texts

Rozeva, A. G.; Zerkova, S. I.

Authors: Rozeva, A. G., Zerkova, S. I.
Title: Assessing semantic similarity of texts
Keywords: Text mining, semantic similarity, knowledge discovery, laten

Abstract: Assessing the semantic similarity of texts is an important part of text-related applications such as educational systems, information retrieval, and text summarization. The task is performed by sophisticated analysis that implements text-mining techniques. Text mining involves several pre-processing steps that produce a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally, the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and captures the text semantics better. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurr
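
The vector-based document model mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and it is not LSA: it is a plain bag-of-words representation compared with cosine similarity, using only the Python standard library. The function names and the toy texts are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(text):
    """Build a sparse term-frequency vector (word -> count) for one text."""
    return Counter(tokenize(text))


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy texts, not taken from the paper.
doc_a = term_vector("text mining extracts features from documents")
doc_b = term_vector("features are extracted from documents by text mining")
doc_c = term_vector("the weather is sunny today")

print(cosine_similarity(doc_a, doc_b))  # related texts share most terms
print(cosine_similarity(doc_a, doc_c))  # unrelated texts share none
```

A purely term-based comparison like this treats "extracts" and "extracted" as unrelated words. LSA addresses this kind of limitation by additionally applying a truncated singular value decomposition to the term-document matrix, which this sketch does not do.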

Issue

AIP Conference Proceedings, vol. 1910, issue 1, pp. 060012, 2017, United States, AIP Publishing LLC

Citations:
1. V. Priya; K. Umamaheswari, A document similarity approach using grammatical linkages with graph databases, International Journal of Enterprise Network Management (IJENM), Vol. 10, No. 3/4, 2019, https://doi.org/10.1504/IJENM.2019.103143 - 2019 - in publications indexed in Scopus and/or Web of Science
2. Kim, Hyun-Jin and Baek, Ji-Won and Chung, Kyungyong, Optimization of Associative Knowledge Graph using TF-IDF based Ranking Score, Applied Sciences, vol.10, No.13, 2020, https://www.mdpi.com/2076-3417/10/13/4590 - 2020 - in publications indexed in Scopus and/or Web of Science
3. George S.K., Jagathy Raj V.P., Gopalan S.K. (2020) Personalized News Media Extraction and Archival Framework with News Ordering and Localization. In: Tuba M., Akashe S., Joshi A. (eds) Information and Communication Technology for Sustainable Development. Advances in Intelligent Systems and Computing, vol 933. Springer, Singapore. https://doi.org/10.1007/978-981-13-7166-0_46 - 2020 - in publications indexed in Scopus and/or Web of Science
4. Du, X., Kowalski, M., Varde, A.S., de Melo, G. and Taylor, R.W., 2020. Public opinion matters: Mining social media text for environmental management. ACM SIGWEB Newsletter, (Autumn), pp. 1-15, https://doi.org/10.1145/3352683.3352688 - 2020 - in publications indexed in Scopus and/or Web of Science
5. Ibrishimova M.D., Li K.F. (2020) Introducing Connotation Similarity. In: Barolli L., Hellinckx P., Natwichai J. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2019. Lecture Notes in Networks and Systems, vol 96. Springer, Cham. https://doi.org/10.1007/978-3-030-33509-0_13 - 2020 - in publications indexed in Scopus and/or Web of Science
6. Almgerbi, M., De Mauro, A., Kahlawi, A., Poggioni, V., A Systematic Review of Data Analytics Job Requirements and Online-Courses, Journal of Computer Information Systems, Taylor & Francis, doi:10.1080/08874417.2021.1971579 - 2021 - in publications indexed in Scopus and/or Web of Science
7. Abdeljaber, H.A., Automatic Arabic Short Answers Scoring Using Longest Common Subsequence and Arabic WordNet, IEEE Access 9,9437188, pp. 76433-76445 - 2021 - in publications indexed in Scopus and/or Web of Science
8. Sintia, S., Defit, S., & Nurcahyo, G. W. (2021). Product Codefication Accuracy With Cosine Similarity And Weighted Term Frequency And Inverse Document Frequency (TF-IDF). Journal of Applied Engineering and Technological Science (JAETS), 2(2), 62–69. https://doi.org/10.37385/jaets.v2i2.210 - 2021 - by foreign authors in foreign publications not indexed in Scopus or Web of Science
9. Almgerbi, M., De Mauro, A., Kahlawi, A., Poggioni, V. (2022). A Systematic Review of Data Analytics Job Requirements and Online-Courses, Journal of Computer Information Systems, Vol. 62, No 2, pp.422-434, https://doi.org/10.1080/08874417.2021.1971579 - 2022 - in publications indexed in Scopus and/or Web of Science
10. Albayrakoğlu, M. M., Aydın, M. N. (2022). Influence Of Different Theories Of Ethics On Organizational Codes Of Conduct Or Ethics: A Comparative Semantic Analysis, Journal of Research in Business, Vol. 7, No IMISC2021 Special Issue, pp. 33-47, https://doi.org/10.54452/jrb.1026523 - 2022 - by foreign authors in foreign publications not indexed in Scopus or Web of Science
11. Lin, T.-H., Huang, Y.-H., Putranto, A. (2022) Intelligent question and answer system for building information modeling and artificial intelligence of things based on the bidirectional encoder representations from transformers model, Automation in Construction 142,104483, https://doi.org/10.1016/j.autcon.2022.104483 - 2022 - in publications indexed in Scopus and/or Web of Science
12. Devarajan, V., Subramanian, R. (2022) Analyzing semantic similarity amongst textual documents to suggest near duplicates, Indonesian Journal of Electrical Engineering and Computer Science 25(3), pp. 1703-1711, DOI: http://doi.org/10.11591/ijeecs.v25.i3.pp1703-1711 - 2022 - in publications indexed in Scopus and/or Web of Science
13. Viji, D., Revathy, S. (2022) Semantic Similarity Detection from text document using XLNet with a DKM-Clustered Bi-LSTM Model, 2022 13th International Conference on Computing Communication and Networking Technologies, ICCCNT 2022, DOI: 10.1109/ICCCNT54827.2022.9984333 - 2022 - in publications indexed in Scopus and/or Web of Science
14. Al-Mahmoud, R.H., Sharieh, A. (2022) NGram Approach for Semantic Similarity on Arabic Short Text, International Journal of Advanced Computer Science and Applications 13(11), pp. 857-866, DOI: 10.14569/IJACSA.2022.0131199 - 2022 - in publications indexed in Scopus and/or Web of Science
15. Basel AlHaj, Iyad AlAgha (2022) Exploiting Wikipedia to Measure the Semantic Relatedness between Arabic Terms, Journal of Engineering Research and Technology, Vol.9. No2 - 2022 - by foreign authors in foreign publications not indexed in Scopus or Web of Science
16. Phumelele P. Kubheka, Pius A. Owolawi, Gbolahan Aiyetoro (2022), Topic Modeling Using Latent Dirichlet Allocation and Latent Semantic Indexing on South African Telco Twitter Data, International Journal of Computer and Information Engineering Vol:16, No:8, pp.329-333 - 2022 - by foreign authors in foreign publications not indexed in Scopus or Web of Science
17. Fekadu Wayissa, Mesn Leranso, Girma Asefa, Abduljebar Kedir, Ayodeji Olalekan Salau (2022) Pattern-Based Hybrid Book Recommendation System using Semantic Relationships, Research Square, https://doi.org/10.21203/rs.3.rs-1873957/v1 - 2022 - by foreign authors in foreign publications not indexed in Scopus or Web of Science
18. Wilianto, D., Girsang, A.S. Automatic Short Answer Grading on High School’s E-Learning Using Semantic Similarity Methods, TEM Journal 12(1), pp. 297-302 - 2023 - in publications indexed in Scopus and/or Web of Science
19. Wayesa, F., Leranso, M., Asefa, G. et al. Pattern-based hybrid book recommendation system using semantic relationships. Sci Rep 13, 3693 (2023). https://doi.org/10.1038/s41598-023-30987-0 - 2023 - in publications indexed in Scopus and/or Web of Science
20. Birthriya S.K., Ahlawat D.P., Jain D.A.K. Phishing URLs Detection Method Using Hybrid Feature and Convolutional Neural Networks with Attention Mechanisms (2024) Communications in Computer and Information Science, 2090 CCIS, pp. 290 - 303 DOI: 10.1007/978-3-031-64076-6_19 - 2024 - in publications indexed in Scopus and/or Web of Science
21. Birthriya S.K., Ahlawat P., Jain A.K. Phishing URL Detection using Deep Q-Networks with Convolutional Neural Networks (2024) 2024 International Conference on Intelligent Systems for Cybersecurity, ISCS 2024 DOI: 10.1109/ISCS61804.2024.1058120 - 2024 - in publications indexed in Scopus and/or Web of Science
22. Akhilesh P., Amal Krishna K., Karthick Bharadwaj S., Venugopalan M. Semantic Similarity Analysis for Resume Filtering using PySpark (2024) 2024 IEEE 9th International Conference for Convergence in Technology, I2CT 2024 DOI: 10.1109/I2CT61223.2024.1054343 - 2024 - in publications indexed in Scopus and/or Web of Science

Type: journal article, publication in an outlet with an impact factor, publication in a refereed outlet, indexed in Scopus and Web of Science

E-Publications
Technical University of Sofia

Publication details from the TU-Sofia database