Machine learning (ML) has shown great promise in fields such as smart health, smart surveillance, smart homes, self-driving vehicles, and the smart grid, and it is fundamentally altering the way individuals and organizations live, work, and interact. Big data has been one of the key drivers of this development, underpinning the significant progress of machine learning models (especially deep learning models) in many domains in recent years.
It is urgent to shift model training and inference from the cloud to the edge. Traditionally, to develop these intelligent services and applications, big data has been stored and processed in a cloud data center in a centralized manner. However, with the growing workloads related to 5G, the Internet of Things (IoT), and real-time analytics, traditional centralized learning frameworks, which require all training data from different sources to be uploaded to a remote data server, incur significant communication overhead and service latency as well as security and privacy issues. According to a Cisco report, nearly 847 ZB of data will be generated at the edge, while the storage capacity of data centers will reach only 19.5 ZB by 2021. Moreover, many emerging applications require strict guarantees on response latency; for example, self-driving and Industry 4.0 applications usually require millisecond-level or even microsecond-level latency.
The integration of edge computing and machine learning has given rise to a new interdisciplinary field, named edge AI or edge intelligence, which is beginning to receive a tremendous amount of interest. The wide deployment of edge devices is significantly increasing the computing capacity of edge environments, at a rate far exceeding the growth of network bandwidth. From this perspective, edge devices can be viewed as an extension of the cloud because of their huge aggregate computing capacity. By taking advantage of both cloud and edge, big data analytics can become more efficient. The edge learning paradigm, i.e., distributed machine learning over edge devices, enables distributed edge nodes to cooperatively train models and conduct inference with their locally cached data. Edge learning can thus be seen as a revolutionary learning paradigm that enables pervasive intelligence.
This book, Edge Learning for Distributed Big Data Analytics: Theory, Algorithms, and System Design, explores the new characteristics and potential prospects of edge learning. It provides a comprehensive and systematic introduction to recent research efforts on edge learning. While edge learning has great potential for many intelligent applications (e.g., smart cities and self-driving cars), it is quite challenging to realize in an efficient and secure manner due to the inherent characteristics of the cloud-edge environment. In this book, we first discuss the challenges facing edge learning. Then, we introduce the optimization algorithms, fundamental theory, communication-efficient technologies, computation acceleration, heterogeneous data distribution, privacy protection and security guarantee mechanisms, learning architectures, and incentive mechanisms for edge learning. We also discuss the popular programming frameworks for edge learning and offer insights into how to implement edge learning in realistic scenarios.
This book is aimed at graduate students and researchers who study and work in related fields. Edge learning is an inherently interdisciplinary field that integrates machine learning, edge computing, and distributed data processing. This book can be used as a supplementary textbook for courses on distributed data processing and machine learning. We believe that it will stimulate fruitful discussions and inspire further research ideas in this field.