Authors: Zhelev, S. M., Rozeva, A. G.
Title: Data Analytics and Machine Learning with Java
Keywords: Data Analytics, Machine Learning, Big Data, Data Lakes, Java Ecosystem, DeepLearning4j, Spark, Hadoop

Abstract: Data analysis is important for companies because it provides deep business insights that help improve their performance. Enterprise applications and platforms for Big Data processing usually live in the Java ecosystem. It is therefore important to investigate how Java-based platforms for data analytics and machine learning work and what functionality they provide. The paper analyzes the most widely used machine learning libraries in the Java world: Deeplearning4j and Spark MLlib. Data lakes are commonly used to store large amounts of data for data analytics tasks, and Hadoop has proved to be the common choice for building them. The paper reviews the Hadoop ecosystem components, their advantages, and potential problems concerning machine learning tasks.



    AIP Conference Proceedings, vol. 2048, issue 1, pp. 060020, 2018, Bulgaria, AIP Publishing LLC

    Copyright Publisher

    Type: journal article, publication in a journal with an impact factor, publication in a refereed journal, indexed in Scopus and Web of Science