Authors: Khadem, S. S., Trifonov, R. I., Pavlova, G. V.
Title: Natural Language Processing Challenges: Problems and Solutions
Keywords: challenges, evolution of NLP, natural language processing, transformers
Abstract: This paper examines the challenges facing Natural Language Processing and their corresponding solutions. It traces the field's evolution from rule-based systems through statistical approaches to modern transformer architectures, systematically categorizes contemporary challenges across linguistic, computational, and societal dimensions, and surveys recent methodological approaches addressing these limitations.
References
- Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261-266. https://cs224d.stanford.edu/papers/advances.pdf
- Lee, L. (2000). Review of Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze. http://www.cs.cornell.edu/home/llee/papers/topost.pdf
- Jelinek, F. (2004). Some of my best friends are linguists. Presentation at LREC 2004. http://www.lrec-conf.org/lrec2004/doc/jelinek.pdf
- Chen, S. F., & Goodman, J. (1996). An empirical study of smoothing techniques for language modeling. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, 310-318. https://aclanthology.org/P96-1041/
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. https://arxiv.org/pdf/1301.3781
- Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166. https://ieeexplore.ieee.org/iel4/72/6922/00279181.pdf
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. arXiv preprint arXiv:1706.03762. https://arxiv.org/pdf/1706.03762
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901. https://arxiv.org/abs/2005.14165
- Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842-866. https://aclanthology.org/2020.tacl-1.54/
- Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610-623. https://dl.acm.org/doi/10.1145/3442188.3445922
- Tenney, I., Xia, P., Chen, B., Wang, A., Poliak, A., McCoy, R. T., ... & Pavlick, E. (2019). What do you learn from context? Probing for sentence structure in contextualized word representations. In 7th International Conference on Learning Representations (ICLR 2019). https://arxiv.org/abs/1905.06316
- Wang, W., Dai, J., Chen, Z., Huang, Z., Li, Z., Zhu, X., ... & Lu, L. (2022). Towards data-efficient detection transformers. arXiv preprint arXiv:2203.09507. https://arxiv.org/abs/2203.09507
- Abeysiriwardana, M., & Sumanathilaka, D. (2024). A survey on lexical ambiguity detection and word sense disambiguation. arXiv preprint arXiv:2403.16129. https://arxiv.org/abs/2403.16129
- Lewis, M., Nayak, N. V., Yu, P., Yu, Q., Merullo, J., Bach, S. H., & Pavlick, E. (2024). Grounded learning for compositional vector semantics. arXiv preprint arXiv:2401.06808. https://arxiv.org/html/2401.06808v1
- Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (3rd ed. draft), Chapter 27: Discourse coherence. Stanford University. https://web.stanford.edu/~jurafsky/slp3/27.pdf
- Koehn, P., & Knowles, R. (2017). Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, 28-39. https://aclanthology.org/W17-3204/
- Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems 29 (NIPS), 4349-4357. https://arxiv.org/abs/1607.06520
- Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://aclanthology.org/P19-1355/
- Ang, P., Dhingra, B., & Wills, L. W. (2022). Characterizing the efficiency vs. accuracy trade-off for long-context NLP models. In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, 113-121. https://aclanthology.org/2022.nlppower-1.12/
- Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond accuracy: Behavioral testing of NLP models with CheckList. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4902-4912. https://aclanthology.org/2020.acl-main.442/
- Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. https://arxiv.org/abs/1702.08608
- Baltrušaitis, T., Ahuja, C., & Morency, L.-P. (2019). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423-443. https://arxiv.org/abs/1705.09406
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramèr, F., & Zhang, C. (2023). Quantifying memorization across neural language models. In International Conference on Learning Representations (ICLR). https://arxiv.org/abs/2202.07646
- Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F., & Choi, Y. (2019). Defending against neural fake news. In Advances in Neural Information Processing Systems 32 (NeurIPS), 9051-9062. https://arxiv.org/abs/1905.12616
- Ye, T., Dong, L., Xia, Y., Sun, Y., Zhu, Y., Huang, G., & Wei, F. (2024). Differential Transformer. In Proceedings of the International Conference on Learning Representations (ICLR). https://arxiv.org/pdf/2410.05258
- Nangia, N., Vania, C., Bhalerao, R., & Bowman, S. R. (2020). CrowS-Pairs: A challenge dataset for measuring social biases in masked language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 1953-1967. https://aclanthology.org/2020.emnlp-main.154/
- Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. https://arxiv.org/abs/1910.01108
- Dao, T., Fu, D., Ermon, S., Rudra, A., & Ré, C. (2022). FlashAttention: Fast and memory-efficient exact attention with IO-awareness. In International Conference on Machine Learning, 4169-4183. https://arxiv.org/abs/2205.14135
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, 8748-8763. https://arxiv.org/abs/2103.00020
- Klymenko, O., Meisenbacher, S., & Matthes, F. (2022). Differential privacy in natural language processing: The story so far. In Proceedings of the Fourth Workshop on Privacy in Natural Language Processing, 1-11. https://aclanthology.org/2022.privatenlp-1.1/
- Morris, J. X., Lifland, E., Yoo, J. Y., Grigsby, J., Jin, D., & Qi, Y. (2020). TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 119-126. https://aclanthology.org/2020.emnlp-demos.16/
Issue: 2025 13th International Scientific Conference on Computer Science (COMSCI 2025) - Proceedings, Albania, 2025. https://doi.org/10.1109/COMSCI67172.2025.11225255