Authors: Rozeva, A. G., Zerkova, S. I.
Title: Assessing semantic similarity of texts
Keywords: Text mining, semantic similarity, knowledge discovery, latent semantic analysis

Abstract: Assessing the semantic similarity of texts is an important part of text-related applications such as educational systems, information retrieval, and text summarization. The task is performed by sophisticated analysis that implements text-mining techniques. Text mining involves several pre-processing steps, which produce a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally, the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and captures the text semantics better. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurr
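
The vector-based document model mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method and it is not LSA: it assumes plain term-frequency vectors and cosine similarity, so it captures only surface term overlap, which is the limitation that LSA's dimensionality reduction is meant to address. The helper names (`tokenize`, `term_vector`, `cosine_similarity`) and the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_vector(text):
    """Return a sparse term-frequency vector (term -> count)."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample corpus, used only for illustration.
docs = [
    "Text mining extracts features from documents.",
    "Features are extracted from documents by text mining.",
    "The weather is sunny today.",
]
vectors = [term_vector(d) for d in docs]

sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
```

Here the first two sentences share most of their terms and score higher than the unrelated pair, which shares none. Note that "extracts" and "extracted" count as different terms in this sketch; stemming or a latent-space method such as LSA would be needed to relate them.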



    AIP Conference Proceedings, vol. 1910, issue 1, pp. 060012, 2017, United States, AIP Publishing LLC

    Citations:
    1. V. Priya; K. Umamaheswari, A document similarity approach using grammatical linkages with graph databases, International Journal of Enterprise Network Management (IJENM), Vol. 10, No. 3/4, 2019 - 2019 - in publications indexed in Scopus or Web of Science
    2. Kim, Hyun-Jin and Baek, Ji-Won and Chung, Kyungyong, Optimization of Associative Knowledge Graph using TF-IDF based Ranking Score, Applied Sciences, vol. 10, No. 3, 2020 - 2020 - in publications indexed in Scopus or Web of Science
    3. George S.K., Jagathy Raj V.P., Gopalan S.K. (2020) Personalized News Media Extraction and Archival Framework with News Ordering and Localization. In: Tuba M., Akashe S., Joshi A. (eds) Information and Communication Technology for Sustainable Development. Advances in Intelligent Systems and Computing, vol 933. Springer, Singapore. - 2020 - in publications indexed in Scopus or Web of Science
    4. Du, X., Kowalski, M., Varde, A.S., de Melo, G. and Taylor, R.W., 2020. Public opinion matters: Mining social media text for environmental management. ACM SIGWEB Newsletter, (Autumn), pp.1-15. - 2020 - in publications indexed in Scopus or Web of Science
    5. Ibrishimova M.D., Li K.F. (2020) Introducing Connotation Similarity. In: Barolli L., Hellinckx P., Natwichai J. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2019. Lecture Notes in Networks and Systems, vol 96. Springer, Cham. - 2020 - in publications indexed in Scopus or Web of Science
    6. Almgerbi, M., De Mauro, A., Kahlawi, A., Poggioni, V., A Systematic Review of Data Analytics Job Requirements and Online-Courses, Journal of Computer Information Systems, Taylor & Francis, doi:10.1080/08874417.2021.1971579 - 2021 - in publications indexed in Scopus or Web of Science
    7. Abdeljaber, H.A., Automatic Arabic Short Answers Scoring Using Longest Common Subsequence and Arabic WordNet, IEEE Access 9,9437188, pp. 76433-76445 - 2021 - in publications indexed in Scopus or Web of Science
    8. Sintia, S., Defit, S., & Nurcahyo, G. W. (2021). Product Codefication Accuracy With Cosine Similarity And Weighted Term Frequency And Inverse Document Frequency (TF-IDF). Journal of Applied Engineering and Technological Science (JAETS), 2(2), 62–69. - 2021 - by foreign authors in foreign publications not indexed in Scopus or Web of Science

    Type: journal article, publication in an outlet with an impact factor, publication in a refereed outlet, indexed in Scopus and Web of Science