Authors: Khadem, S. S., Trifonov, R. I., Pavlova, G. V.
Title: Natural Language Processing Challenges: Problems and Solutions
Keywords: challenges, evolution of NLP, natural language processing, transformers
Abstract: This paper examines the challenges facing Natural Language Processing and their corresponding solutions. It traces the field's evolution from rule-based systems through statistical approaches to modern transformer architectures, systematically categorizes contemporary challenges across linguistic, computational, and societal dimensions, and surveys recent methodological approaches addressing these limitations.
References
- Hirschberg, J., & Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245), 261-266. https://cs224d.stanford.edu/papers/advances.pdf
- Lee, L. (2000). Review of Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze. http://www.cs.cornell.edu/home/llee/papers/topost.pdf
- Jelinek, F. (2004). Some of my best friends are linguists. Presentation at LREC 2004. http://www.lrec-conf.org/lrec2004/doc/jelinek.pdf
- Chen, S. F., & Goodman, J. (1996). An empirical study of smoothing techniques for language modeling. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, 310-318. https://aclanthology.org/P96-1041/
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781. https://arxiv.org/pdf/1301.3781
- Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157-166. https://ieeexplore.ieee.org/iel4/72/6922/00279181.pdf
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. arXiv preprint arXiv:1706.03762. https://arxiv.org/pdf/1706.03762
- Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877-1901. https://arxiv.org/abs/2005.14165
- Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842-866. https://aclanthology.org/2020.tacl-1.54/
- Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610-623. https://dl.acm.org/doi/10.1145/3442188.3445922
- Tenney, I., Xia, P., Chen, B., Wang, A., Poliak, A., McCoy, R. T., ... & Pavlick, E. (2019). What do you learn from context? Probing for sentence structure in contextualized word representations. In 7th International Conference on Learning Representations (ICLR 2019). https://arxiv.org/abs/1905.06316
- Wang, W., Dai, J., Chen, Z., Huang, Z., Li, Z., Zhu, X., ... & Lu, L. (2022). Towards data-efficient detection transformers. arXiv preprint arXiv:2203.09507. https://arxiv.org/abs/2203.09507
- Abeysiriwardana, M., & Sumanathilaka, D. (2024). A survey on lexical ambiguity detection and word sense disambiguation. arXiv preprint arXiv:2403.16129. https://arxiv.org/abs/2403.16129
- Lewis, M., Nayak, N. V., Yu, P., Yu, Q., Merullo, J., Bach, S. H., & Pavlick, E. (2024). Grounded learning for compositional vector semantics. arXiv preprint arXiv:2401.06808. https://arxiv.org/html/2401.06808v1
- Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (3rd ed. draft), Chapter 27: Discourse coherence. Stanford University. https://web.stanford.edu/~jurafsky/slp3/27.pdf
- Koehn, P., & Knowles, R. (2017). Six challenges for neural machine translation. In Proceedings of the First Workshop on Neural Machine Translation, 28-39. https://aclanthology.org/W17-3204/
- Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems 29 (NIPS), 4349-4357. https://arxiv.org/abs/1607.06520
- Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://aclanthology.org/P19-1355/
- Ang, P., Dhingra, B., & Wills, L. W. (2022). Characterizing the efficiency vs. accuracy trade-off for long-context NLP models. In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, 113-121. https://aclanthology.org/2022.nlppower-1.12/
- Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond accuracy: Behavioral testing of NLP models with CheckList. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 4902-4912. https://aclanthology.org/2020.acl-main.442/
- Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. https://arxiv.org/abs/1702.08608
- Baltrušaitis, T., Ahuja, C., & Morency, L.-P. (2019). Multimodal machine learning: A survey and taxonomy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(2), 423-443. https://arxiv.org/abs/1705.09406
- Carlini, N., Ippolito, D., Jagielski, M., Lee, K., Tramèr, F., & Zhang, C. (2023). Quantifying memorization across neural language models. In International Conference on Learning Representations (ICLR). https://arxiv.org/abs/2202.07646
- Zellers, R., Holtzman, A., Rashkin, H., Bisk, Y., Farhadi, A., Roesner, F., & Choi, Y. (2019). Defending against neural fake news. In Advances in Neural Information Processing Systems 32 (NeurIPS), 9051-9062. https://arxiv.org/abs/1905.12616
- Ye, T., Dong, L., Xia, Y., Sun, Y., Zhu, Y., Huang, G., & Wei, F. (2024). Differential Transformer. In Proceedings of the International Conference on Learning Representations (ICLR). https://arxiv.org/pdf/2410.05258
- Nangia, N., Vania, C., Bhalerao, R., & Bowman, S. R. (2020). CrowS-Pairs: A challenge dataset for measuring social biases in masked language models. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 1953-1967. https://aclanthology.org/2020.emnlp-main.154/
- Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108. https://arxiv.org/abs/1910.01108
- Dao, T., Fu, D., Ermon, S., Rudra, A., & Ré, C. (2022). FlashAttention: Fast and memory-efficient exact attention with IO-awareness. In International Conference on Machine Learning, 4169-4183. https://arxiv.org/abs/2205.14135
- Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, 8748-8763. https://arxiv.org/abs/2103.00020
- Klymenko, O., Meisenbacher, S., & Matthes, F. (2022). Differential privacy in natural language processing: The story so far. In Proceedings of the Fourth Workshop on Privacy in Natural Language Processing, 1-11. https://aclanthology.org/2022.privatenlp-1.1/
- Morris, J. X., Lifland, E., Yoo, J. Y., Grigsby, J., Jin, D., & Qi, Y. (2020). TextAttack: A framework for adversarial attacks, data augmentation, and adversarial training in NLP. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 119-126. https://aclanthology.org/2020.emnlp-demos.16/
Issue: 2025 13th International Scientific Conference on Computer Science (COMSCI 2025) - Proceedings, Albania, 2025. https://doi.org/10.1109/COMSCI67172.2025.11225255