IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, VOL. 9, NO. 1, FEBRUARY 2022
Dynamic Games for Social Model Training Service
Market via Federated Learning Approach
Wenqing Cheng, Yuze Zou, Jing Xu, and Wei Liu, Member, IEEE
Abstract— In recent years, an increasing number of new social applications have been emerging and developing with the profound success of deep learning technologies, which have been significantly reshaping our daily life, e.g., interactive games and virtual reality. Deep learning applications are generally driven by a huge amount of training samples collected from the users' devices, e.g., smartphones and watches. However, users' data privacy and security issues have been one of the main restrictions to a broader distribution of these applications. In order
to preserve privacy while utilizing deep learning applications,
federated learning becomes one of the most promising solutions,
which gains growing attention from both academia and industry.
It can provide high-quality model training by distributing the
training tasks to individual users, relying on on-device local
data. To this end, we model the users’ participation in social
model training as a training service market. The market consists
of model owners (MOs) as consumers (e.g., social applications)
who purchase the training service and a large number of mobile
device groups (MDGs) as service providers who contribute local
data in federated learning. A two-layer hierarchical dynamic
game is formulated to analyze the dynamics of this market. The
service selection processes of MOs are modeled as a lower level
evolutionary game, while the pricing strategies of MDGs are
modeled as a higher level differential game. The uniqueness and
stability of the equilibrium are analyzed theoretically and verified
via extensive numerical evaluations.
Index Terms— Differential game, evolutionary game, federated
learning, machine learning as a service (MLaaS).
NOMENCLATURE
$\mathcal{K}$ and $K$: Set of MDGs and number of MDGs, respectively.
$\mathcal{N}$ and $N$: Set of MOs and number of MOs, respectively.
$\omega_m^t$: Weights of mobile device m before the t-th iteration.
$d_{i,k}$: Dataset size that MDG k owns for MO i ($i \in \mathcal{N}$ and $k \in \mathcal{K}$).
$p_{i,k}$: Unit price that MDG k offers for MO i.
$\mathbf{p}_k$: $\triangleq [p_{1,k}, \ldots, p_{N,k}]^T$, list of prices of MDG k for all MOs.
$x_{i,k}$: Probability that MO i requests training service from MDG k.
$\mathbf{x}_i$: $\triangleq [x_{i,1}, \ldots, x_{i,K}]^T$, list of probabilities of MO i to request training services from MDGs.
$c_{i,k}$: Unit cost at which MDG k provides training service for MO i.
$u_{i,k}$: Utility of MO i to request training service from MDG k.
$\Pi_k$: Expected profit of MDG k.
$\Phi_{i,k}$: Dynamics of MO i's selection on MDG k.
$|\cdot|$: Euclidean norm.
Manuscript received December 4, 2020; revised April 10, 2021; accepted
May 20, 2021. Date of publication June 23, 2021; date of current version January 31, 2022. This work was supported by the National Science Foundation
of Hubei Province under Grant 2020CFB794. This article was presented in
part at the IEEE Pacific Rim Conference on Communications Computers and
Signal Processing (PACRIM), 2019. (Corresponding author: Jing Xu.)
The authors are with the Hubei Key Laboratory of Smart Internet Technology, School of Electronic Information and Communications, Huazhong
University of Science and Technology, Wuhan 430074, China (e-mail:
xujing@hust.edu.cn).
Digital Object Identifier 10.1109/TCSS.2021.3086100
I. INTRODUCTION

WITH the great success of machine learning and deep
learning technologies, enormous machine learning
applications, such as image recognition, natural language
processing, automatic driving, and medical diagnosis, are
emerging and have profoundly improved our daily life [2].
According to the report by Stratistics MRC, the global
machine learning as a service (MLaaS) market is expected to
grow from U.S. $2.96 billion in 2019 to U.S. $49.13 billion
by 2027 with a compound annual growth rate (CAGR) of
42.1% [3]. Machine learning and deep learning applications
are generally driven by a huge amount of training datasets
collected from personal mobile devices, e.g., smartphones and
watches, which arouses thorny issues related to data privacy,
such as data abuse and leakage. These issues are getting more and more attention from the public and administrations across the globe and have become barriers for the MLaaS market to access large amounts of data from personal mobile devices.
In particular, regulations and laws protecting data privacy
and security have been promulgated recently by states. For
example, general data protection regulation (GDPR) has been
enforced by the European Union in May 2018 to protect users’
privacy and data security [4].
To tackle these issues, federated learning, proposed by Google, has become one of the most promising solutions and has gained great attention from both academia and industry. In a
nutshell, federated learning is a distributed and collaborative
machine learning framework that fully utilizes the powerful
mobile devices and takes advantage of the intelligence at the
end users with their on-device data [5], [6]. Different from
the conventional machine learning frameworks, federated
learning keeps users’ private data on their devices instead
of uploading them to a central data center. In this way,
federated learning preserves users’ data privacy efficiently
and prevents data abuse and leakage. To make federated
learning more applicable in practice, however, there still
exist several open challenges to be tackled. The current
literature mainly focuses on solving related problems that
typically include low communication efficiency caused by
cumbersome weights of the machine learning model [7], [8],
possible privacy leakage [9], and security issue such as
model poisoning [10]–[12]. Nevertheless, a wide range of
applications based on federated learning framework have been
envisioned in the literature, including medical and health,
such as disease diagnosis [13], natural language processing,
e.g., next-word prediction [14], [15], traffic monitoring [16],
and V2X communications [17]. Moreover, the emergence of 5G and the maturing of mobile edge computing (MEC) techniques [18]–[20] are powering up the applications of federated learning, as it is intrinsically suitable for the MEC scenario. MLaaS providers, such as Google Cloud AI (https://cloud.google.com/ai-platform), Microsoft Azure (https://azure.microsoft.com/services/machine-learning), and Amazon ML (https://aws.amazon.com/ai), can embrace
federated learning framework to provide privacy-preserving
model training services to model owners (MOs), such as small companies whose applications rely on machine learning models but for which massive datasets from end users are not available, or institutions possessing sensitive data, such as hospitals with medical images. As the pioneer of
this concept, Google has announced a scalable production
system for federated learning in the domain of mobile
devices [21], which contributes to the formation of federated
learning-based market. In this kind of market, we refer to
MLaaS providers as mobile device groups (MDGs) since
they have access to a federation of enormous mobile devices,
e.g., Apple and Samsung have hundreds of millions of iOS
and Android devices, respectively. On the other hand, we refer
to the consumers in the market as MOs as they have machine
learning models to purchase training services from the MDGs.
In this article, we study the incentive mechanism for the
participants in this federated learning market. First, we aim to
design the incentive of an MDG to train a machine learning
model for the MOs. Typically, the MDG can be rewarded by
providing its on-device training services to the MOs. Second,
to maximize the MOs’ profits, it is also a critical design
problem for the MO to select appropriate MDGs in an open
market, as the quality of service varies at different MDGs.
We address these problems by building a price-based market
framework to model the interactions between MOs and MDGs.
In particular, MDGs can set different prices to their training
services for MOs, according to the users’ preferences or
willingness of participation. The MDG’s target is to maximize
the cumulative profits of all users. Each MO can select serving
MDGs from the set of available MDGs according to their
prices and quality of services aiming to maximize their benefits
(i.e., utilities or payoffs). The matching problem becomes
more challenging when the MOs’ selections of MDGs change
dynamically according to the time-varying performance satisfaction and cost. As such, the MDGs’ pricing strategies
need to be adjusted accordingly to meet the MOs’ dynamic
demands. To study this problem, we propose a two-layer
dynamic game framework to model the dynamic behaviors
of both MOs and MDGs in the model training service market.
The game framework is fully distributed and is practically
applicable for federated learning involving a large number
of participants. The contributions of this article are threefold.
1) We propose a price-based training service market model
for federated learning to study the MDGs’ selections
of training tasks, and the MOs’ selection of service
providers with different quality of services. The optimal strategies of MOs and MDGs are obtained by
maximizing individuals’ payoffs. The proposed market
model allows participating users to set different prices
of their training services, according to individuals’ risk
preferences of privacy breach, providing a more flexible
privacy-preserving mechanism for the emerging MLaaS
applications.
2) A two-layer hierarchical dynamic game is proposed to
model the interactions between MOs and MDGs in the
above model training service market. The MOs’ service
selections are studied in a lower level evolutionary
game, while the MDGs’ pricing strategies are optimized in a higher level differential game. The solutions
of the proposed game, i.e., dynamic equilibrium, are
given theoretically and verified via extensive numerical
evaluations.
3) We characterize the quality of training service in terms
of dataset size and non-independent identically distributed (i.i.d.) property of the participants in federated
learning. Extensive experiments reveal that the relationships between the quality of service and these properties
can be fitted to exponential functions. Similar results
also apply to other federated learning scenarios.
The remainder of this article is organized as follows.
Section II summarizes the related work and preliminary of
federated learning is given in Section III. We describe the
system model and propose the training service market in
Section IV. Later on, we formulate the two-layer dynamic
game and then analyze the uniqueness and stability of the
equilibrium theoretically in Section V. Finally, numerical evaluations and conclusions are presented in Sections VI and VII,
respectively. The major notations used in this article are given
in the Nomenclature.
II. RELATED WORK
Federated learning originated from Google back in 2016 [5], [6]. It aims to train a machine learning model in a highly distributed manner while preserving users' privacy. McMahan et al. [5] proposed the
FederatedAveraging algorithm to drive the federated
learning system, which allows a server to collect local stochastic gradient descent (SGD) on each client and then perform
model averaging. Konečnỳ et al. [6] introduced the concept
of Federated optimization, which is a new and practical setting for distributed optimization in machine learning.
Several algorithms, including stochastic variance reduced gradient (SVRG) [22], distributed approximate Newton (DANE)
[23], and federated SVRG are analyzed theoretically for the
federated setting.
However, several technical challenges have to be
solved before its practical deployment. These include
low communication efficiency, potential privacy leakage [24],
and security issues [9], [25]. The weights of deep neural
networks in a machine learning model are typically of a
large set. This implies a significant cost in the information
exchange between servers and clients. Apart from that,
the data privacy and security issues may prevent the users’
participation in federated learning. Melis et al. [24] showed
that an adversarial participant can infer the presence of
exact data points. Furthermore, the attackers may poison the
shared model, i.e., Fung et al. [9] demonstrated that federated
learning is vulnerable to the sybil-based label-flipping
poisoning.
At present, many works are emerging to improve the robustness and enhance the practicality of federated learning.
To reduce the communication overhead, the most straightforward method is to design weights compression algorithms
for the machine learning models [7], [8]. Konečnỳ et al. [7]
designed structured updates and sketched updates to compress
the weights, which can reduce the communication cost by two
orders of magnitude. Furthermore, Lin et al. [8] proposed
deep gradient compression (DGC) to highly compress the
weights even further. In particular, the DGC algorithm can
reduce the size of ResNet-50 from 97 to 0.35 MB and reduce
DeepSpeech from 488 to 0.74 MB. This makes the communication cost negligible in a federated learning system. The
privacy and security issues have also been studied extensively
in the literature [10], [11], [26], [27]. Bonawitz et al. [10]
proposed a secure aggregation scheme to protect the privacy
of each user’s gradient estimation. A randomized mechanism
is proposed in [11] to hide a single client’s contribution
in weights aggregation and thus ensure data privacy in the
learning process. Fung et al. [9] proposed the FoolGold
algorithm to identify poisoning sybils based on the diversity
of clients’ updates in the distributed learning process.
There are several works concentrating on the incentive
mechanisms in federated learning system. Kang et al. [28]
adopted a contract theory to design an effective incentive
mechanism for mobile devices with high-quality data to participate in federated learning. Jiao et al. [29] proposed an
auction-based market model to incentivize data owners to
participate in federated learning. They design two auction
mechanisms for the federated learning system to maximize
the social welfare of the federated learning services market.
III. PRELIMINARY OF FEDERATED LEARNING
The federated learning system generally works in four
phases, as shown in Fig. 1. First, the central coordinator or
model aggregator distributes its machine learning model to a
group of mobile devices selected from the federation. More
specifically, the coordinator can deliberately choose a subset
of mobile devices with preferred quality of services. Second,
the selected mobile devices train the machine learning model
based on individuals’ local datasets. Third, after training for
epochs, each mobile device uploads the updated model weights
back to the model aggregator. To motivate the user’s participation, the mobile device typically receives a payoff from
its model training service. Fourth, the coordinator aggregates
Fig. 1. Interactions between mobile devices and model aggregator in federated learning.
(e.g., by averaging [6]) all the weights of machine learning
model uploaded by the mobile devices. This four-phase procedure repeats periodically and is expected to improve model
accuracy.
A. Local Update and Weights Aggregation
Two core operations of federated learning are the local update performed on mobile devices and the weights aggregation performed on the coordinator. Specifically, the local update of
the model weights on the mobile device m at the tth iteration
is given as follows:
$$\omega_m^{t+1} = \omega_m^t - \eta \nabla \ell\left(\omega_m^t\right) \qquad (1)$$
where $\omega_m^t$ denotes the old weights of mobile device m before the t-th iteration. The learning rate and the loss function of the machine learning model are represented by $\eta$ and $\ell(\cdot)$, respectively. The new weights $\omega_m^{t+1}$ aim to reduce the loss function $\ell(\cdot)$. Hence, they are updated by a gradient descent rule based on the locally stored dataset on mobile device m. The weights aggregation performed on the coordinator is typically implemented via the FederatedAveraging algorithm proposed in [6], which is defined as follows:
$$\omega_0^{t+1} = \sum_{m \in \mathcal{M}^t} \frac{n_m}{n}\, \omega_m^t \qquad (2)$$
where $\mathcal{M}^t$ denotes the set of selected mobile devices and $\omega_0^{t+1}$ denotes the averaged weights of the learning model after the t-th iteration. Let $n_m$ denote the size of the training dataset on mobile device m and $n = \sum_{m \in \mathcal{M}^t} n_m$ be the complete size of the training datasets of all participating mobile devices. Hence, the weighting parameter $n_m/n$ represents the significance of the individual mobile device. It is clear that a mobile device can contribute more if it provides more training data to the selected training task.
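The two operations map directly onto a few lines of code. The following is a minimal sketch, assuming flat NumPy weight vectors and a caller-supplied gradient function (both hypothetical stand-ins for a real model), that mirrors (1) and (2):

```python
import numpy as np

def local_update(w, grad_fn, eta):
    """One local gradient-descent step on device m, cf. (1):
    w <- w - eta * grad(loss(w))."""
    return w - eta * grad_fn(w)

def federated_averaging(local_weights, local_sizes):
    """Coordinator-side aggregation, cf. (2):
    average of device weights, weighted by dataset sizes n_m / n."""
    n = float(sum(local_sizes))
    return sum((n_m / n) * w for w, n_m in zip(local_weights, local_sizes))

# One federated round over two devices with a toy quadratic loss 0.5*||w||^2,
# whose gradient is simply w itself.
w0 = np.ones(4)
updated = [local_update(w0, grad_fn=lambda w: w, eta=0.1) for _ in range(2)]
w1 = federated_averaging(updated, local_sizes=[100, 300])
```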
B. Quality of Federated Learning Training Service
Similar to the conventional centralized machine learning
tasks, the model accuracy under federated learning fashion is
intuitively higher with larger training dataset size. This means
that more mobile devices are preferred in the training service.
Empirically, the accuracy of a machine learning model is a
nondecreasing concave function of the training dataset size.
Fig. 3. Social model training service market via federated learning that consists of MOs as service consumers and MDGs as service providers.

Fig. 2. Demonstration of the impacts of dataset size and EMD on the training accuracy in federated learning system.
However, different from the centralized fashion, the training dataset under federated learning is distributed sparsely across devices, generally with non-i.i.d. distributions that vary from device to device. This intrinsic non-i.i.d. property of the
dataset is also critical to the quality of the model training.
Zhao et al. [30] showed that the accuracy of a machine learning model trained on highly skewed non-i.i.d. data drops dramatically, by up to 55%, compared to that trained with i.i.d.
data. To quantify this impact on the quality of the federated
learning training service, we adopt the metric defined in [30]
to measure the non-i.i.d. property of training data, which is
Earth mover distance (EMD). Concretely, the EMD of mobile
device m, denoted by σm , is given as follows:
$$\sigma_m \triangleq \sum_{l \in \mathcal{L}} |P_m(l) - P(l)| \qquad (3)$$
where P(l) and Pm (l) denote the probabilities of l for the
whole MDG and device m, respectively, and L is the set of
all possible categories in the dataset. Take the MNIST [31]
handwritten digits database as an example, and its training set
is divided into ten categories uniformly. Therefore, P(l) = 0.1
for $l \in \{0, \ldots, 9\}$. Suppose that we have device 1 with all samples of digit 1 and device 2 with 50% of digit 1 and 50% of digit 2; then, $\sigma_1$ and $\sigma_2$ are calculated as $\sigma_1 = 0.1 \times 9 + 0.9 = 1.8$ and $\sigma_2 = 0.1 \times 8 + 0.4 \times 2 = 1.6$, respectively.
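The EMD in (3) is a one-line computation over label distributions. The snippet below reproduces the two-device MNIST example above (a minimal sketch assuming the distributions are given as plain lists):

```python
def emd(device_dist, global_dist):
    # sigma_m = sum over labels l of |P_m(l) - P(l)|, cf. (3)
    return sum(abs(pm - p) for pm, p in zip(device_dist, global_dist))

P  = [0.1] * 10                    # uniform label distribution of the whole MDG
P1 = [0.0, 1.0] + [0.0] * 8        # device 1: all samples are digit 1
P2 = [0.0, 0.5, 0.5] + [0.0] * 7   # device 2: half digit 1, half digit 2

print(emd(P1, P))  # -> 1.8 (up to floating-point rounding)
print(emd(P2, P))  # -> 1.6
```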
To illustrate the impact of dataset size and non-i.i.d. property
of training participants on the overall training performance,
i.e., model accuracy, we conduct a group of experiments and
the results are shown in Fig. 2. The machine learning model
is trained on the MNIST database, with a federation of mobile
devices involved. Each mobile device owns 100 samples that
were randomly selected from the training dataset of MNIST
with predefined skewness. We vary the number of participating
devices in federated learning from 1 to 60 (accordingly,
the total size of dataset involved in the training ranges from
100 to 6000). Each mobile device’s EMD is identical and
chosen from the set {0.0, 0.2, 0.4}. Fig. 2 shows the accuracy
of the aggregated model based on the test dataset after five
rounds of on-device training. As we can see, the model’s
accuracy increases with the dataset size with the same EMD
value. However, for a fixed dataset size, the accuracy drops
about 5% as the EMD increases by 0.2.
IV. TRAINING SERVICE MARKET MODEL
We consider a model training service market for federated
learning that consists of N MOs and K MDGs, as shown
in Fig. 3. The sets of MOs and MDGs are denoted by N =
{1, 2, . . . , N} and K = {1, 2, . . . , K }, respectively. In this
market, each MO has a specific machine learning model to
train, which can have different model structures based on its
application, such as convolutional neural networks (CNNs), long short-term memory (LSTM), and recurrent neural networks (RNNs). Generally, MOs do not have sufficiently large datasets to train their models. As such, MDGs can act as
the service providers who can help train the MOs’ machine
learning models in a federated learning manner. Specifically,
let $d_{i,k}$ denote the total data size that MDG k owns for MO i.
Each MDG is governed by an operator who is responsible
for the training service aggregation. For example, the operator
could be a smartphone company such as Apple or Samsung, which have enormous numbers of mobile devices. MDGs
can set different prices for their training services in order to
maximize their accumulative profits, while the MOs aim to select proper MDGs for their model training tasks so as to gain high model accuracy at a relatively low cost.
A. Provider: MDGs
In the model training service market, each MDG is governed
by an operator who coordinates a group of mobile devices.
For example, the training service happens on a federation
of end devices, such as mobile phones, tablets, and watches,
and the updated weights are uploaded to the operator of the
federation. These weights are then aggregated by some algorithms, e.g., FederatedAveraging. The detailed interplay
between individual mobile device and the operator within
one MDG is beyond the scope of this article. Nevertheless,
as shown in Fig. 2, the quality of the federated learning training service mainly relates to the dataset size and the non-i.i.d. property. Let $x_{i,k}$ denote the probability that MO i selects training data from MDG k; thus, the average data size that MDG k provides for MO i is given by $x_{i,k} d_{i,k}$. On the
other hand, to characterize the non-i.i.d. property in model
training, we define the average EMD of all mobile devices
in the same group (e.g., MDG k) for the same training task
of MO i as $\sigma_{i,k}$. Let $\boldsymbol{\sigma}_i \triangleq [\sigma_{i,1}, \ldots, \sigma_{i,K}]$ denote the EMD
vector of different MDGs for MO i . The MDGs may adjust
their pricing strategies over time in order to maximize their
accumulative profits. Specifically, the profit of MDG k consists
of two parts.
1) Payment From the MOs: The MOs have to pay MDGs
for providing the model training services. Let $p_{i,k}$ denote the unit price that MDG k charges MO i for training per unit data size.
Then, the expected payment to MDG k received from
MO i is given by pi,k x i,k di,k .
2) Cost of Model Training: The MDGs’ training services
also incur a cost that accounts for the energy consumption and the users’ preferences. It is clear that a longer
training time or more participating users in the training
will incur higher energy consumption. We assume that
the cost function is linearly increasing with the size of the dataset involved in the training process and verify this assumption via extensive experiments in Section VI.
Let ci,k denote the unit cost of MDG k when it provides
the training service for MO i . The expected dataset size
of MDG k used for training the machine learning model
of MO i can be denoted by x i,k di,k , where di,k is the size
of the MDG k’s dataset allocated to MO i . Then, the cost
of training is given by ci,k x i,k di,k .
Therefore, the expected profit of MDG k can be simply
denoted as follows:
$$\Pi_k(\mathbf{p}_k) = \sum_{i \in \mathcal{N}} \left( p_{i,k} x_{i,k} d_{i,k} - c_{i,k} x_{i,k} d_{i,k} \right) \qquad (4)$$
where $\mathbf{p}_k = [p_{1,k}, \ldots, p_{N,k}]^T$ denotes the MDG k's pricing strategy for training different MOs' machine learning models.
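Equation (4) translates directly into a vectorized computation over the MOs. A minimal sketch, assuming NumPy arrays indexed by MO i and illustrative numbers only:

```python
import numpy as np

def mdg_profit(p_k, x_k, d_k, c_k):
    """Expected profit of MDG k, cf. (4):
    Pi_k = sum_i (p_{i,k} - c_{i,k}) * x_{i,k} * d_{i,k}."""
    return float(np.sum((p_k - c_k) * x_k * d_k))

# Two MOs with illustrative prices, selection probabilities, data sizes, costs.
print(mdg_profit(p_k=np.array([0.4, 0.5]),
                 x_k=np.array([0.3, 0.6]),
                 d_k=np.array([4000.0, 4000.0]),
                 c_k=np.array([0.2, 0.2])))
```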
B. Consumer: MOs
According to the MDGs’ pricing strategies, the MOs can
select different service providers, i.e., MDGs, to maximize
their own utilities in terms of the model accuracy and cost.
We assume that each MO selects one MDG at a time since model aggregation among different MDGs is generally not available due to extra coordination costs. We require $x_{i,k} \in [0, 1]$ and $\sum_{k \in \mathcal{K}} x_{i,k} = 1$. In practice,
the probability x i,k can be explained as the portion of training
time allocated to MDG k. For example, x i,k = 0.5 means that
MO i will select the training service of MDG k for half of the
overall training time. The selection of MDGs may dynamically
change over time in order to improve the MOs’ utilities, which
are determined by the following metrics.
1) Model Accuracy: It characterizes the quality of services
provided by the selected MDGs. Let f k (d, σ ) denote the
expected accuracy of the MOs’ machine learning models
contributed by the model training service of MDG k,
which relates to overall size d and the average EMD σ
of the datasets involved in the training process.
2) Payment to MDGs: When MO i selects the training
service from MDG k, the expected payment to MDG k
is simply given by pi,k x i,k di,k , which is linear to the size
of dataset contributed by MDG k.
3) Penalty for Congestion: MOs’ utilities are also affected
by the potential congestion on the selection of MDGs.
The MDG’s scheduling of training services becomes
more complicated as more MOs simultaneously select
the same MDGs. Such congestion in return incurs performance degradation (e.g., an increasing delay) at the
MOs. To capture this effect, we define the penalty for congestion at MDG k as $\left( \sum_{i \in \mathcal{N}} x_{i,k} d_{i,k} \right)^2$.
Combining the above three terms, the utility of MO i , denoted
by u i,k , can be given as follows:
$$u_{i,k} = \zeta_{i,k} f_k(x_{i,k} d_{i,k}, \sigma_{i,k}) - p_{i,k} x_{i,k} d_{i,k} - \frac{\alpha_k}{2} \left( \sum_{i \in \mathcal{N}} x_{i,k} d_{i,k} \right)^2 \qquad (5)$$
where the coefficient ζi,k controls the MO i ’s preference on
the model accuracy. The constant αk denotes the MDG k’s
sensitivity to congestion, which represents the MDG’s capability to deal with concurrent training requests. We require that
the payoff function f k (d, σ ) related to the model’s accuracy
should have the following properties.
1) Nondecreasing in d with a fixed σ and decreasing in σ
with a fixed size d.1
2) First- and second-order differentiable in terms of d. The
first property indicates that the MDG can provide a
higher accuracy with more data2 that are involved in
the model training task.
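The three terms of (5) compose as follows. This is a sketch assuming NumPy arrays x, d, p, zeta, sigma of shape (N, K) and an accuracy callback f with the two properties required above (e.g., the exponential fit of Section VI):

```python
import numpy as np

def mo_utility(i, k, x, d, p, zeta, alpha, sigma, f):
    """Utility of MO i at MDG k, cf. (5): accuracy reward, minus payment,
    minus a congestion penalty growing with MDG k's total requested load.
    alpha has one entry per MDG; f(d, sigma) is any accuracy function
    with the properties listed above."""
    load = np.sum(x[:, k] * d[:, k])   # total data requested from MDG k
    return (zeta[i, k] * f(x[i, k] * d[i, k], sigma[i, k])
            - p[i, k] * x[i, k] * d[i, k]
            - 0.5 * alpha[k] * load ** 2)
```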
V. DYNAMIC GAME AND EQUILIBRIUM ANALYSIS
To depict the dynamics among MDGs and MOs in this training market under a federated learning framework, we devise
a two-layer hierarchical game whose structure is summarized
in Fig. 4. In particular, the dynamics of MOs’ selections are
formulated by an evolutionary game due to their bounded
rationality, while the pricing strategies of MDGs are modeled
as a differential game. Specifically, MOs’ selection over time
can be described by a group of ordinary differential equations
(ODEs), which constitutes the dynamic states of MDGs’
differential game. A similar game structure is also adopted
in [32] and [33].
A. Lower Level Evolutionary Game
The MOs with bounded rationality select an MDG from K candidate providers. Initially, each MO chooses an MDG randomly.
To obtain better utility, each MO adjusts its selection according
to the price and time-varying observed model accuracy. With
incomplete information, each MO could gradually learn by
imitating the selection with higher payoff during the selection
adaptation process. As such, the selection process of MO i
can be formulated via replicator dynamics as follows:
$$\dot{x}_{i,k}(t) \triangleq \Phi_{i,k}(t) = \delta x_{i,k}(t)\left(u_{i,k}(t) - \bar{u}_i(t)\right), \quad k \in \mathcal{K} \qquad (6)$$
where $\delta$ is the learning rate that controls the selection adaptation frequency and $\bar{u}_i = \sum_{k \in \mathcal{K}} x_{i,k} u_{i,k}$ is the average utility
1 The model’s accuracy may also decrease with the size of dataset when
the dataset is highly skewed, i.e., EMD σ is relatively large. In this case,
the operator can alleviate the performance degradation, e.g., by sharing a small
portion of common dataset among mobile devices [30]. Besides, such MDGs
can also be excluded by MOs due to their unsatisfactory model accuracy.
2 Here, “more data” means a larger size of dataset and more training time.
of MO i. The initial selection probability $\mathbf{x}_i(0)$ is randomly generated and denoted by $\mathbf{x}_i^{(0)}$. According to the replicator dynamics, the probability of MO i's selection for MDG k will increase if its corresponding utility is higher than MO i's average utility [i.e., $u_{i,k}(t) > \bar{u}_i(t)$] and vice versa. The growth rate $\Phi_{i,k}(t)$ is proportional to the difference between the utility of the selection and MO i's average utility, as well as to the current probability of the selection, $x_{i,k}(t)$.
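A forward-Euler discretization of (6) makes the adaptation concrete. This is a minimal sketch; the constant utilities in the usage example are illustrative only, whereas in the full game $u_{i,k}$ depends on all MOs' selections through (5):

```python
import numpy as np

def replicator_step(x_i, u_i, delta, dt):
    """Forward-Euler step of the replicator dynamics (6):
    dx_{i,k}/dt = delta * x_{i,k} * (u_{i,k} - u_bar_i)."""
    u_bar = np.dot(x_i, u_i)                       # average utility of MO i
    x_new = x_i + dt * delta * x_i * (u_i - u_bar)
    x_new = np.clip(x_new, 0.0, None)              # guard against Euler overshoot
    return x_new / x_new.sum()                     # renormalize onto the simplex

# MO facing three MDGs; constant utilities used purely for illustration.
x = np.array([0.2, 0.3, 0.5])
for _ in range(1000):
    x = replicator_step(x, u_i=np.array([1.0, 0.8, 0.6]), delta=1.0, dt=0.01)
print(x)  # probability mass concentrates on the highest-utility MDG
```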
Next, we prove the uniqueness and stability of the equilibrium of the lower level evolutionary game. To proceed,
we first give the definition of evolutionary equilibrium and
then provide the solution to the lower level game.
Definition 1: Evolutionary Equilibrium: The solution of the
game defined in (6) is defined as the evolutionary equilibrium.
Proposition 1: Let $\Phi_{i,k}(\mathbf{x}_i(t), \mathbf{p}_i(t)) \triangleq \delta x_{i,k}(t)(u_{i,k}(t) - \bar{u}_i(t))$; then, the first-order derivative of $\Phi_{i,k}$ with respect to $x_{j,l}(t)$ is bounded for all $(j, l) \in \mathcal{N} \times \mathcal{K}$.
Proof: The proof is given in Appendix A.
Based on the definition, the uniqueness of the evolutionary
equilibrium is guaranteed by Theorem 1.
Theorem 1: The evolutionary game defined in (6) is
uniquely solvable and hence admits a unique evolutionary
equilibrium.
Proof: Proposition 1 guarantees that $\Phi_{i,k}$ satisfies the Lipschitz condition with respect to $x_{j,l}$ for all $(j, l) \in \mathcal{N} \times \mathcal{K}$.
Hence, the evolutionary game defined in (6) is uniquely
solvable according to the Cauchy–Lipschitz theorem [34]. Second, according to Lyapunov’s second method for stability [35], we justify the stability of the evolutionary equilibrium
to the evolutionary game defined in (6) as presented in
Theorem 2.
Theorem 2: The evolutionary game defined in (6) admits a
stable evolutionary equilibrium.
Proof: From Lyapunov’s second method for stability,
we design a Lyapunov function as follows:
2
x i,k (t)
(7)
V (x i,k (t)) =
i∈N k∈K
As for the MDGs, they need to decide on the pricing
for MOs and take the dynamics of MOs’ selections into
consideration. For example, an MDG may set a higher price, which can increase its instant revenue. However, on the other hand, it may also drive MOs to select other MDGs that offer cheaper services. As such, we formulate a K-player (each
represents an MDG) differential game to analyze this kind of
dynamic decision-making problem. Different from MOs with
bounded rationality, MDGs are supposed to be rational in that
they are able to make decisions as best response to others’
decisions. Specifically, the MDGs offer their prices for all MOs
initially. After that, MOs’ selections adapt over time based on
replicator dynamics. Then, each MDG updates their prices as
best response to the dynamics of MOs’ selections as well as
other MDGs’ pricing strategies.
All the MDGs aim to maximize their accumulative profits over a time horizon [0, T]. The profit of MDG k, $\Pi_k$, is defined in (4). In the sequel, the problem of maximizing the accumulative profit of MDG k, given the other MDGs' strategies and the dynamics of MOs' selections, can be transformed into an optimal control problem (OCP), which is given by
$$\max_{\mathbf{p}_k(t)} \int_0^T \sum_{i \in \mathcal{N}} \left[ p_{i,k}(t) x_{i,k}(t) d_{i,k} - c_{i,k} x_{i,k}(t) d_{i,k} \right] dt \qquad \text{(9a)}$$
$$\text{s.t. } \dot{x}_{i,k}(t) = \delta x_{i,k}(t)\left(u_{i,k}(t) - \bar{u}_i(t)\right), \quad (k, i) \in \mathcal{K} \times \mathcal{N} \qquad \text{(9b)}$$
$$\mathbf{x}_i(0) = \mathbf{x}_i^{(0)}, \quad i \in \mathcal{N}. \qquad \text{(9c)}$$
Next, we give the equilibrium analysis for the above upper level differential game, which can be reformulated into K OCPs. In the sequel, solving these OCPs is equivalent to maximizing their corresponding Hamiltonians [36], which can be done efficiently via an iterative algorithm [37]. Specifically, the equilibrium existence of the upper level game is proven in Theorem 3, which is based on Lemma 1. For notational convenience, we denote the integrand of the objective in (9) by
$$\Pi_k(\mathbf{x}_k(t), \mathbf{p}_k(t)) \triangleq \sum_{i \in \mathcal{N}} \left[ p_{i,k}(t) x_{i,k}(t) d_{i,k} - c_{i,k} x_{i,k}(t) d_{i,k} \right]$$
so that a maximizing sequence satisfies
$$\int_0^T \Pi_k\left(\mathbf{x}_k^{\vartheta}(t), \mathbf{p}_k^{\vartheta}(t)\right) dt \to \iota_k = \sup_{(\mathbf{x}_k(\cdot), \mathbf{p}_k(\cdot)) \in \mathcal{D}_{\mathbf{x}_k} \times \mathcal{D}_{\mathbf{p}_k}} \int_0^T \Pi_k(\mathbf{x}_k(t), \mathbf{p}_k(t)) dt. \qquad (10)$$
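As a concrete illustration of the Hamiltonian being maximized, the sketch below evaluates $H_k = \Pi_k + \sum_i \lambda_{i,k} \Phi_{i,k}$. The notation and the utility callback are our assumptions, and for simplicity only the costates attached to MDG k's own selection states are kept; this is a sketch of the structure, not the exact iterative solver of [37]:

```python
import numpy as np

def hamiltonian_k(k, x, p, lam_k, d, c, delta, utility):
    """Hamiltonian of MDG k's OCP (9): the instantaneous profit (9a) plus
    costates lam_k[i] weighting the replicator-dynamics constraints (9b).
    Simplification: only costates attached to MDG k's own states x[:, k]
    are kept; utility(x, p) must return the matrix u[i, j] as in (5)."""
    u = utility(x, p)                                 # (N, K) utility matrix
    u_bar = (x * u).sum(axis=1)                       # average utility per MO
    phi_k = delta * x[:, k] * (u[:, k] - u_bar)       # state dynamics (6)
    profit = np.sum((p[:, k] - c[:, k]) * x[:, k] * d[:, k])  # integrand of (9a)
    return profit + np.dot(lam_k, phi_k)
```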
Lemma 1: For all $i \in \mathcal{N}$, assume that there exists a sequence $\mathbf{p}_i^{\vartheta}(\cdot) \to_w \mathbf{p}_i(\cdot)$, $\forall \vartheta \geq 1$, where $\mathbf{x}_i^{\vartheta}(\cdot)$ is the solution to the lower level evolutionary game corresponding to $\mathbf{p}_i^{\vartheta}(\cdot)$. Then, there exists a subsequence $\{\mathbf{x}_i^{\vartheta_h}(\cdot)\}_{h \geq 1}$ of $\{\mathbf{x}_i^{\vartheta}(\cdot)\}_{\vartheta \geq 1}$ such that $\{\mathbf{x}_i^{\vartheta_h}(\cdot)\}$ pointwisely converges to $\mathbf{x}_i(\cdot)$ for all $i \in \mathcal{N}$, denoted by $\mathbf{x}_k^{\vartheta}(\cdot) \to_s \mathbf{x}_k^*(\cdot)$ for all $\vartheta \geq 1$, if $\Phi_{i,k}(\cdot, \cdot)$ is Lipschitz continuous with respect to $\mathbf{x}_i(\cdot)$ and $\mathbf{p}_i(\cdot)$, e.g.,
$$\left\| \Phi_{i,k}(\mathbf{x}_i(\cdot), \mathbf{p}_i(\cdot)) - \Phi_{i,k}(\hat{\mathbf{x}}_i(\cdot), \mathbf{p}_i(\cdot)) \right\| \leq \kappa_{i,k}^x \left\| \mathbf{x}_i(\cdot) - \hat{\mathbf{x}}_i(\cdot) \right\|$$
$$\left\| \Phi_{i,k}(\mathbf{x}_i(\cdot), \mathbf{p}_i(\cdot)) - \Phi_{i,k}(\mathbf{x}_i(\cdot), \hat{\mathbf{p}}_i(\cdot)) \right\| \leq \kappa_{i,k}^p \left\| \mathbf{p}_i(\cdot) - \hat{\mathbf{p}}_i(\cdot) \right\|$$
where $\mathbf{x}_i(\cdot)$ is the solution to the lower level evolutionary game corresponding to $\mathbf{p}_i(\cdot)$.
Proof: Please refer to Appendix C for the proof.
Based on Lemma 1 that guarantees the existence of pointwisely convergence of MO’s strategies under any given pricing
sequences of MDGs, we can further prove the existence of the
equilibrium of the upper level differential game in Theorem 3.
Theorem 3: Suppose that $\Pi_k : \mathcal{T} \times \mathcal{D}_{\mathbf{x}_k} \times \mathcal{D}_{\mathbf{p}_k} \to \mathbb{R}$ is a mapping such that the following conditions hold.
1) $\Pi_k : \mathcal{T} \times \mathcal{D}_{\mathbf{x}_k} \times \mathcal{D}_{\mathbf{p}_k} \to \mathbb{R}^+$ is approximately lower semicontinuous.
2) $\Pi_k(\mathbf{x}_k(t), \mathbf{p}_k(t)) : \mathcal{D}_{\mathbf{x}_k} \times \mathcal{D}_{\mathbf{p}_k} \to \mathbb{R}^+$ is lower semicontinuous for all $t \in \mathcal{T}$.
3) $\Pi_k(\mathbf{x}_k(t), \mathbf{p}_k(t))$ is convex with respect to $\mathbf{p}_k(t)$ for all $(t, \mathbf{x}_k(t)) \in \mathcal{T} \times \mathcal{D}_{\mathbf{x}_k}$.
Here, $\mathcal{D}_{\mathbf{x}_k} = \times_{i \in \mathcal{N}} \mathcal{D}_{x_{i,k}}$ and $\mathcal{D}_{\mathbf{p}_k} = \times_{i \in \mathcal{N}} \mathcal{D}_{p_{i,k}}$. Then, the upper level differential game admits an optimal solution pair $(x_{i,k}^*(\cdot), p_{i,k}^*(\cdot))_{i \in \mathcal{N}, k \in \mathcal{K}}$.
Proof: Let $\{(\mathbf{x}_k^{\vartheta}(\cdot), \mathbf{p}_k^{\vartheta}(\cdot))\}_{\vartheta \geq 1}$ be a maximizing sequence that follows (10). Here, passing to a subsequence if necessary, we assume that $\mathbf{p}_k^{\vartheta}(\cdot) \to_w \mathbf{p}_k^*(\cdot)$ for all $k \in \mathcal{K}$. In this case, we have $\mathbf{x}_k^{\vartheta}(\cdot) \to_s \mathbf{x}_k^*(\cdot)$ for all $\vartheta \geq 1$ and $k \in \mathcal{K}$ based on Lemma 1, where $\mathbf{x}_k^*(\cdot)$ is the solution to the lower level evolutionary game corresponding to $\mathbf{p}_k^*(\cdot)$. Following this, we accordingly have
$$\lim_{\vartheta \to +\infty} \int_0^T \Pi_k\left(\mathbf{x}_k^{\vartheta}(t), \mathbf{p}_k^{\vartheta}(t)\right) dt \leq \int_0^T \Pi_k\left(\mathbf{x}_k^*(t), \mathbf{p}_k^*(t)\right) dt$$
from which it can be further deduced that
$$\iota_k \leq \int_0^T \Pi_k\left(\mathbf{x}_k^*(t), \mathbf{p}_k^*(t)\right) dt \leq \sup_{(\mathbf{x}_k(\cdot), \mathbf{p}_k(\cdot)) \in \mathcal{D}_{\mathbf{x}_k} \times \mathcal{D}_{\mathbf{p}_k}} \int_0^T \Pi_k(\mathbf{x}_k(t), \mathbf{p}_k(t)) dt = \iota_k.$$
Consequently, we have $\int_0^T \Pi_k(\mathbf{x}_k^*(t), \mathbf{p}_k^*(t)) dt = \iota_k$, which implies that $(\mathbf{x}_k^*(\cdot), \mathbf{p}_k^*(\cdot))$ is the optimal solution pair to the upper level differential game. This completes the proof.

VI. NUMERICAL EXPERIMENTS
We conduct comprehensive numerical experiments to evaluate the dynamics of the federated learning training service
market, in which we consider K = 3 MDGs serving N = 2
MOs. The non-i.i.d. property, i.e., the EMD of these MDGs for
MO 1 and MO 2, is given by σ i = [0.1, 0.15, 0.2], i ∈ {1, 2},
which means that MDG 1 has the most balanced dataset among all MDGs.

Fig. 5. Model accuracy against dataset size under different EMD settings. Dataset size ranges from 100 to 6000, and EMD σ ranges from 0 to 1.

The maximum dataset size that each MDG can provide for each MO is identical and given as 4000. Besides, the weights of the accuracy term in (5) are also identical for a fair comparison and set as 6.
A. Accuracy Fitting and Energy Consumption Measurement
First, in order to present a proper empirical function
f (d, σ ), we devise a group of experiments to evaluate the
accuracy of a deep neural network trained by a federation
of mobile devices against different dataset sizes and EMD
settings. Apart from that, we also measure the energy consumption of model training on Raspberry Pi. Specifically,
we train a three-layer neural network with one 512-unit
fully connected hidden layer on a group of mobile devices
in federated learning fashion. The actual and corresponding
fitted model accuracies under different EMD and dataset size
settings are shown in Fig. 5. As we can see in the figure,
with any given EMD setting, the accuracy achieved by the
federation of mobile devices increases with the training dataset
size dramatically when the size is relatively small. However,
the accuracy converges to a certain level when the dataset size
is relatively large, and it has no further improvement with
the increase in size. On the other hand, with given dataset
size, the model can achieve higher accuracy with smaller
EMD, σ , which shows the impact of non-i.i.d. property of
the participants. To capture the marginal effect of training
dataset size and the impact of EMD on accuracy, we adopt
an exponential function to fit the experimental results. The
fitted function is given in (11), in which the EMD σ can be considered as a parameter that affects the performance ceiling
$$f(d, \sigma) = a(\sigma) - b \exp(-c \, d \, a(\sigma)) \qquad (11)$$
where $a(\sigma) = 0.999 \exp\left(-\left((\sigma + 0.3331)/1.767\right)^2\right)$, $b = 0.3578$, and $c = 4.3720$. The function fits well when $\sigma \leq 1.0$. In particular, the coefficients of determination for all given $\sigma$ ($\sigma \leq 1.0$) are higher than 0.90, i.e., $R^2 > 0.90$.
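The fit (11) is straightforward to evaluate. One caveat: the text does not state the unit of d, and the fitted constants only yield accuracies in the observed range if d is measured in thousands of samples, which is our assumption here:

```python
import numpy as np

def ceiling(sigma):
    # a(sigma) = 0.999 * exp(-((sigma + 0.3331) / 1.767)^2), the accuracy ceiling
    return 0.999 * np.exp(-((sigma + 0.3331) / 1.767) ** 2)

def accuracy(d_thousands, sigma, b=0.3578, c=4.3720):
    # f(d, sigma) = a(sigma) - b * exp(-c * d * a(sigma)), cf. (11);
    # d is assumed to be in units of 10^3 samples (100..6000 -> 0.1..6.0).
    a = ceiling(sigma)
    return a - b * np.exp(-c * d_thousands * a)

print(accuracy(6.0, 0.0))  # ~0.96: near the i.i.d. ceiling at 6000 samples
print(accuracy(6.0, 0.2))  # ~0.91: roughly 5% lower, consistent with Fig. 2
```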
On the other hand, to measure the energy consumption
of model training on mobile devices, we deploy the neural
network on the Raspberry Pi Model 3 B to mimic mobile
device training scenario.3 To gather more general results, we deploy three neural networks with different numbers of neurons in the hidden layer: 0, 64, and 512.
The energy consumption results against training data size are
shown in Fig. 6, as well as the fittings for each of these three
neural network structures. As we can conclude from the figure,
the energy consumption of mobile device is linear with the
training dataset size, which is intuitive. Besides, this linearity
is universal for different settings of the neural networks, which
makes it applicable for different model settings. Furthermore,
the slope of the line reflects the energy consumption of the machine learning model: the steeper the slope, the more complicated the model, and thus the higher the energy consumption.
Without loss of generality, we adopt the model with 512 units
in the hidden layer for the simulations later, and its energy
consumption coefficient is given by $c_{i,k} = 0.2148$ J per $10^3$ samples, which indicates that the MDG consumes 0.2148 J for the training of one thousand samples.

Fig. 6. Mobile device's energy consumption measurement. Three neural networks with different numbers of neurons in the hidden layer are considered: 0, 64, and 512.
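Given the linear energy model measured above, the per-task training energy follows directly from the coefficient (a trivial helper, with the 512-unit model's value as default):

```python
def training_energy_joules(num_samples, c=0.2148):
    # Linear model from the Raspberry Pi measurements:
    # c Joules per 10^3 samples for the 512-unit hidden-layer network.
    return c * num_samples / 1e3

print(training_energy_joules(6000))  # 1.2888 J for a 6000-sample training task
```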
With accuracy function and energy consumption coefficient
determined, we are ready to evaluate the dynamics of the
presented games and the impacts of different parameters.
3 The Raspberry Pi Model 3 B [38] can be charged by a USB port on a PC and has computational capability similar to that of a modern smartphone, which makes it suitable to represent a mobile device.
Specifically, we first give the evolutionary equilibrium of MOs' selections at the lower level game; then, the impact of the congestion coefficient on MOs' strategies is investigated. Finally, the equilibrium of MDGs' pricing strategies at the upper level differential game is presented, and a comparison to the static noncooperative equilibrium is given.

B. Evolutionary Equilibrium of MOs' Selections

Fig. 7. Evolutionary trajectories of the training strategies adaptation over time for (a) MO 1 and (b) MO 2.

For the lower level evolutionary game, we plot the evolutionary trajectories of MOs' selections over time in Fig. 7.
As we can see in the figure, both MO 1 and MO 2’s selections
evolve with time and gradually converge to stable states. For
example, MO 1 initializes its training strategy as $\mathbf{x}_1^{(0)} = [0.2, 0.3, 0.5]$. As the selection strategy of MO 1 evolves with time, the probability that MO 1 selects MDG 3 decreases, and meanwhile, the probabilities that MO 1 selects MDG 1 and MDG 2 increase, driven by the replicator dynamics defined in
(6). Eventually, the selection of MO 1 converges to x1 (T ) =
[0.3826, 0.3337, 0.2837], which indicates that MO 1 prefers
MDG 1 over other two MDGs. According to the EMD settings
of MDGs, i.e., σ 1 = [0.10, 0.15, 0.20], it is reasonable that
MO 1 has a higher probability to choose the MDG with more
balanced dataset, i.e., smaller EMD since all other parameters
are identical. Similarly, MO 2's selection strategy converges to a stable state symmetric to that of MO 1, as shown in Fig. 7(b), though it starts from a different initial strategy than $\mathbf{x}_1^{(0)}$.
Furthermore, we verify the asymptotic stability of the MOs’
evolutionary equilibrium via a direction field of replicator
dynamics. Without loss of generality, we give the direction
field of MO 1, as shown in Fig. 8. The arrow at each point
indicates the direction of the adaptation process for MO 1 at
that state. It is governed by replicator dynamics as in (6) and
defines MO 1’s selection evolving at the next step. As depicted
in the figure, the probabilities that MO 1 chooses MDG 1 and MDG 2, i.e., $x_{1,1}$ and $x_{1,2}$, always converge from any initial probabilities, which verifies that the adaptation leads MO 1 to the evolutionary equilibrium.

Fig. 8. Direction field of the replicator dynamics, indicating the stability of the evolutionary equilibrium.
Next, we evaluate the impact of learning rate, i.e., δ in (6),
on the convergence speed of the replicator dynamics, and the
Fig. 9. Impact of learning rate on the convergence time.
Fig. 10. Impact of congestion coefficient on the training strategies of MO 1.
The congestion coefficient, αk in (5), varies from 0.05 to 0.5.
results are shown in Fig. 9. The learning rate indicates the
frequency of MOs’ selection adaptations, which controls the
speed of strategy adaptation. As a comparison, we also implement a static noncooperative game for the MDGs, in which the MDGs' pricing decisions only take the evolutionary equilibrium of the lower level into account, rather than the replicator dynamics considered in the differential game. The solution of this static noncooperative game uses
a backward induction method, which is widely addressed in
the literature and thus omitted here. We observe in Fig. 9 that
the convergence speed increases with the increasing learning
rate. Specifically, when the MOs fully utilize their observations
(i.e., δ = 1), the MOs can achieve the evolutionary equilibrium
at fastest rate. Without loss of generality, in the following
numerical analysis, we set the learning rate δ = 1. Besides,
as shown in Fig. 9, the differential game outperforms the static
noncooperative game in terms of the convergence speed.
C. Impact of Congestion Coefficient
Fig. 10 shows the impact of congestion coefficient on
MO 1’s training strategy. We vary the congestion coefficient α
for both MOs from 0.05 to 0.5 and plot MO 1’s training strategies at evolutionary equilibrium. As we can see in Fig. 10,
MO 1’s probabilities of selecting MDGs show convergence
with increasing congestion coefficients. With larger congestion
coefficients, the impact on MOs’ utility degradation is severer.
Fig. 11. Average utility of MOs against the number of MOs, which ranges from 2 to 10.
Fig. 12. Pricing strategies over time of MDGs for (a) MO 1 and (b) MO 2.
Consequently, MOs tend to select MDGs dispersively in terms
of probability in order to lower the chance of congestion.
To investigate the scalability of the proposed game framework, we then vary the number of MOs from 2 to 10 and
extend the number of MDGs to 6. The MOs’ average utilities
against the number of MOs are shown in Fig. 11. As we
can see in the figure, for given K , i.e., number of MDGs,
the average utilities of the MOs decrease with N, i.e., the
number of MOs. The reason is that the increase of MOs in
the market with a given number of MDGs results in a more
crowded network, i.e., increases the penalty introduced by
the congestion term in (5). For the same reason, for any given
number of MOs, more MDGs result in higher average utilities
of the MOs.
D. Equilibrium of MDGs’ Pricing Strategies
As for the upper level differential game, the equilibrium of
dynamic pricing strategies for the MDGs is shown in Fig. 12.
As we can observe in Fig. 12(a) and (b), all MDGs adjust
their prices for both MO 1 and MO 2 gradually over time,
and the prices converge to static states eventually. These pricing dynamics occur simultaneously with the MOs' selection adaptation, since the replicator dynamics of the lower level evolutionary game are taken into account in the solution of the upper level differential game. Among these MDGs, MDG 1 offers the highest static price of the three MDGs for both MO 1 and MO 2. The reason is that MDG 1's data quality in
Fig. 13. Cumulative profit of MDGs under dynamic and static equilibrium strategy controls.
terms of EMD is better than that of the others, i.e., it has the smallest σ; consequently, a better service commands a higher price. Furthermore, each MDG offers the same static price for MO 1 and MO 2. For example, the prices that MDG 1 offers for MO 1 and MO 2 are both 0.4365. Since all MDGs have the same data quality for MO 1 and MO 2, it is reasonable that they charge the same price for the same service. On the other hand, Fig. 13 shows the cumulative profit of the MDGs. As shown in Fig. 13, MDG 1 has the highest cumulative profit among all MDGs due to its best data quality, i.e., lowest EMD. Besides, compared with the static noncooperative game, the differential game approach helps MDGs achieve better cumulative profits due to its high flexibility.
VII. CONCLUSION

With emerging social applications based on machine learning technologies, our daily life has been profoundly improved. Meanwhile, the data privacy issues that accompany these applications, which may acquire our personal data as training samples for machine learning models, are also gaining more and more attention. To tackle this issue, federated learning has become one of the most promising solutions due to its privacy-preserving property via on-device model training. To this end, in this article, we devise a two-layer dynamic game model, consisting of a lower level evolutionary game of the MOs and an upper level differential game of the MDGs, to study the incentive mechanism. The solutions of the proposed two-layer dynamic game are analyzed theoretically. The quality of federated learning service, which relates to the size and non-i.i.d. property of the dataset, is formulated based on extensive experiments, and the energy consumption of on-device training is measured. The solutions of the proposed two-layer dynamic game are then verified via numerical evaluations.
APPENDIX A
PROOF OF PROPOSITION 1

To prove Proposition 1, we give the derivative of $\Phi_{i,k}$ with respect to $x_{j,l}$ as follows:
$$\frac{d\Phi_{i,k}}{dx_{j,l}} = \delta \left[ \frac{dx_{i,k}}{dx_{j,l}} \left( u_{i,k} - \bar{u}_i \right) + x_{i,k} \left( \frac{du_{i,k}}{dx_{j,l}} - \frac{d\bar{u}_i}{dx_{j,l}} \right) \right]$$
where $du_{i,k}/dx_{j,l}$ is
$$\frac{du_{i,k}}{dx_{j,l}} = \gamma_k f'(h(\mathbf{x}_k)) M_j^d \frac{x_{i,k} M_i^d}{h(\mathbf{x}_k)} + f_k(h(\mathbf{x}_k)) \left( \frac{dx_{i,k}}{dx_{j,l}} \frac{M_i^d}{h(\mathbf{x}_k)} - \frac{x_{i,k} M_i^d M_j^d}{h^2(\mathbf{x}_k)} \right) - \eta_{i,k} M_i^d \frac{dx_{i,k}}{dx_{j,l}}$$
and $h(\mathbf{x}_k) \triangleq \sum_{i \in \mathcal{N}} x_{i,k} M_i^d$; $t$ is omitted here for convenience. In the sequel, for all $(i, k) \in \mathcal{N} \times \mathcal{K}$, $|du_{i,k}/dx_{j,l}|$ is bounded for all $(j, l) \in \mathcal{N} \times \mathcal{K}$ due to the continuity of $f(\cdot)$ and $f'(\cdot)$. Similarly, $|d\bar{u}_i/dx_{j,l}|$ is also bounded. These facts admit that $|d\Phi_{i,k}/dx_{j,l}|$ is bounded, which completes the proof.

APPENDIX B
PROOF OF THEOREM 2

To prove that the function defined in (7) meets the Lyapunov conditions, we need to verify that $\nabla_t(V(t)) \leq 0$ for all values of $V(t) \neq 0$. $\nabla_t(V(t))$ is given as follows:
$$\nabla_t(V(t)) = \sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} 2 x_{i,k}(t) \Phi_{i,k}(t) = 2N\delta \sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t) \left( u_{i,k}(t) - \bar{u}_i(t) \right) = 2N\delta \left[ \sum_{i \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{i,k}(t) u_{i,k}(t) - \sum_{i \in \mathcal{N}} \bar{u}_i \right] = 0.$$
This states that the evolutionary game defined in (6) satisfies the Lyapunov stability condition, i.e., the equilibrium is stable.
APPENDIX C
PROOF OF LEMMA 1
Proof: Here, we have (12), shown below, where $\|z\|_T \triangleq \int_0^T e^{-\mu t} |z(t)|_{L^1} dt$, and $\Phi_{i,k}(\cdot, \cdot)$ is Lipschitz continuous with respect to $\mathbf{x}_i(\cdot)$ and $\mathbf{p}_i(\cdot)$. As such, we have
$$\sum_{i \in \mathcal{N}, k \in \mathcal{K}} \left\| x^{\vartheta}_{i,k} - x_{i,k} \right\|_T \leq \frac{1}{\mu} \left[ \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \kappa^x_{i,k} \int_0^T e^{-\mu\tau} \left\| \mathbf{x}^{\vartheta}_i(\tau) - \mathbf{x}_i(\tau) \right\| d\tau + \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \kappa^p_{i,k} \int_0^T e^{-\mu\tau} \left\| \mathbf{p}^{\vartheta}_i(\tau) - \mathbf{p}_i(\tau) \right\| d\tau \right].$$
In the sequel, by letting $\kappa^x_{\max} = \max\{\kappa^x_{i,k}\}_{\forall i \in \mathcal{N}, k \in \mathcal{K}}$ and $\kappa^p_{\max} = \max\{\kappa^p_{i,k}\}_{\forall i \in \mathcal{N}, k \in \mathcal{K}}$, we accordingly have
$$\left\| \mathbf{x}^{\vartheta} - \mathbf{x} \right\|_T \leq \frac{1}{\mu} \left[ \kappa^x_{\max} K \left\| \mathbf{x}^{\vartheta} - \mathbf{x} \right\|_T + \kappa^p_{\max} K \left\| \mathbf{p}^{\vartheta} - \mathbf{p} \right\|_T \right]$$
$$\Leftrightarrow \left( 1 - \frac{\kappa^x_{\max} K}{\mu} \right) \left\| \mathbf{x}^{\vartheta} - \mathbf{x} \right\|_T \leq \frac{\kappa^p_{\max} K}{\mu} \left\| \mathbf{p}^{\vartheta} - \mathbf{p} \right\|_T.$$
$$\begin{aligned} \left\| \mathbf{x}^{\vartheta} - \mathbf{x} \right\|_T &= \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \left\| x^{\vartheta}_{i,k} - x_{i,k} \right\|_T = \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T e^{-\mu t} \left| \int_0^t \left[ \Phi_{i,k}\left(\mathbf{x}^{\vartheta}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}(\mathbf{x}_i(\tau), \mathbf{p}_i(\tau)) \right] d\tau \right| dt \\ &\leq \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T \int_0^t e^{-\mu t} \left| \Phi_{i,k}\left(\mathbf{x}^{\vartheta}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}(\mathbf{x}_i(\tau), \mathbf{p}_i(\tau)) \right| d\tau\, dt \\ &= \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T e^{-\mu\tau} \left| \Phi_{i,k}\left(\mathbf{x}^{\vartheta}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}(\mathbf{x}_i(\tau), \mathbf{p}_i(\tau)) \right| \int_{\tau}^T e^{-\mu(t-\tau)} dt\, d\tau \\ &\leq \left[ \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T e^{-\mu\tau} \left| \Phi_{i,k}\left(\mathbf{x}^{\vartheta}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}(\mathbf{x}_i(\tau), \mathbf{p}_i(\tau)) \right| d\tau \right] \int_0^{\infty} e^{-\mu t} dt \\ &\leq \frac{1}{\mu} \left[ \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T e^{-\mu\tau} \left| \Phi_{i,k}\left(\mathbf{x}^{\vartheta}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}\left(\mathbf{x}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) \right| d\tau + \sum_{i \in \mathcal{N}, k \in \mathcal{K}} \int_0^T e^{-\mu\tau} \left| \Phi_{i,k}\left(\mathbf{x}_i(\tau), \mathbf{p}^{\vartheta}_i(\tau)\right) - \Phi_{i,k}(\mathbf{x}_i(\tau), \mathbf{p}_i(\tau)) \right| d\tau \right] \end{aligned} \qquad (12)$$

As long as $K \kappa^x_{\max} < \mu$, we have $\mathbf{x}^{\vartheta}(\cdot) \to_s \mathbf{x}(\cdot)$ when $\mathbf{p}^{\vartheta}(\cdot) \to_w \mathbf{p}(\cdot)$. This completes the proof.
REFERENCES
[1] Y. Zou, S. Feng, J. Xu, S. Gong, D. Niyato, and W. Cheng, “Dynamic
games in federated learning training service market,” in Proc. IEEE
Pacific Rim Conf. Commun., Comput. Signal Process. (PACRIM),
Aug. 2019, pp. 1–6.
[2] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521,
no. 7553, pp. 436–444, May 2015.
[3] Stratistics MRC, “Machine learning—Global market outlook (2019–
2027),” Annapolis, MD, USA, Tech. Rep. SMRC19398, 2020.
[4] REGULATION (EU) 2016/679 of the European Parliament and of
the Council on the Protection of Natural Persons With Regard to
the Processing of Personal Data and on the Free Movement of Such
Data, and Repealing Directive 95/46/EC (General Data Protection
Regulation), EU, Brussels, Belgium, 2016.
[5] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. Y. Arcas,
“Communication-efficient learning of deep networks from decentralized
data,” in Proc. 20th Int. Conf. Artif. Intell. Statist. (AISTATS), 2017,
pp. 1273–1282.
[6] J. Konečnỳ, H. B. McMahan, D. Ramage, and P. Richtárik, "Federated optimization: Distributed machine learning for on-device intelligence," 2016, arXiv:1610.02527. [Online]. Available: http://arxiv.org/abs/1610.02527
[7] J. Konečnỳ, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh,
and D. Bacon, “Federated learning: Strategies for improving communication efficiency,” 2016, arXiv:1610.05492. [Online]. Available:
http://arxiv.org/abs/1610.05492
[8] Y. Lin, S. Han, H. Mao, Y. Wang, and W. J. Dally, “Deep gradient
compression: Reducing the communication bandwidth for distributed
training,” in Proc. ICLR, 2018, pp. 1–14.
[9] C. Fung, C. J. M. Yoon, and I. Beschastnikh, “Mitigating sybils in federated learning poisoning,” 2018, arXiv:1808.04866. [Online]. Available:
http://arxiv.org/abs/1808.04866
[10] K. Bonawitz et al., “Practical secure aggregation for privacy-preserving
machine learning,” in Proc. ACM SIGSAC Conf. Comput. Commun.
Secur., Oct. 2017, pp. 1175–1191.
[11] R. C. Geyer, T. Klein, and M. Nabi, “Differentially private federated
learning: A client level perspective,” in Proc. NIPS Workshop, Mach.
Learn. Phone Other Consum. Devices, 2017, pp. 1–7.
[12] J. Kang, Z. Xiong, D. Niyato, Y. Zou, Y. Zhang, and M. Guizani, “Reliable federated learning for mobile networks,” IEEE Wireless Commun.,
vol. 27, no. 2, pp. 72–80, Apr. 2020.
[13] T. S. Brisimi, R. Chen, T. Mela, A. Olshevsky, I. C. Paschalidis,
and W. Shi, “Federated learning of predictive models from federated
electronic health records,” Int. J. Med. Informat., vol. 112, pp. 59–67,
Apr. 2018.
[14] A. Hard et al., “Federated learning for mobile keyboard prediction,” 2018, arXiv:1811.03604. [Online]. Available: http://arxiv.org/
abs/1811.03604
[15] T. Yang et al., “Applied federated learning: Improving Google keyboard query suggestions,” 2018, arXiv:1812.02903. [Online]. Available:
http://arxiv.org/abs/1812.02903
[16] L. Liang, H. Ye, and G. Y. Li, “Toward intelligent vehicular networks:
A machine learning framework,” IEEE Internet Things J., vol. 6, no. 1,
pp. 124–135, Feb. 2019.
[17] S. Samarakoon, M. Bennis, W. Saad, and M. Debbah, “Distributed federated learning for ultra-reliable low-latency vehicular communications,”
IEEE Trans. Commun., vol. 68, no. 2, pp. 1146–1159, Feb. 2020.
[18] Z. Ning et al., “Mobile edge computing enabled 5G health monitoring
for Internet of medical things: A decentralized game theoretic approach,”
IEEE J. Sel. Areas Commun., vol. 39, no. 2, pp. 463–478, Feb. 2021.
[19] Z. Ning et al., “Partial computation offloading and adaptive task scheduling for 5G-enabled vehicular networks,” IEEE Trans. Mobile Comput.,
early access, Sep. 18, 2020, doi: 10.1109/TMC.2020.3025116.
[20] Z. Ning et al., “Intelligent edge computing in Internet of vehicles: A
joint computation offloading and caching solution,” IEEE Trans. Intell.
Transp. Syst., vol. 22, no. 4, pp. 2212–2225, Apr. 2021.
[21] K. A. Bonawitz et al., “Towards federated learning at scale: System design,” in Proc. SysML, 2019, pp. 1–15. [Online]. Available:
https://arxiv.org/abs/1902.01046
[22] R. Johnson and T. Zhang, “Accelerating stochastic gradient descent using
predictive variance reduction,” in Proc. Adv. Neural Inf. Process. Syst.,
vol. 26, 2013, pp. 315–323.
[23] O. Shamir, N. Srebro, and T. Zhang, “Communication-efficient distributed optimization using an approximate Newton-type method,” in Proc.
Int. Conf. Mach. Learn., 2014, pp. 1000–1008.
[24] L. Melis, C. Song, E. D. Cristofaro, and V. Shmatikov, “Inference attacks
against collaborative learning,” 2017, arXiv:1805.04049. [Online]. Available: https://arxiv.org/abs/1805.04049
[25] B. Hitaj, G. Ateniese, and F. Perez-Cruz, “Deep models under the GAN:
Information leakage from collaborative deep learning,” in Proc. ACM
SIGSAC Conf. Comput. Commun. Secur., Oct. 2017, pp. 603–618.
[26] A. Gascón et al., “Privacy-preserving distributed linear regression on
high-dimensional data,” Proc. Privacy Enhancing Technol., vol. 2017,
no. 4, pp. 345–364, 2017.
[27] S. Hardy et al., “Private federated learning on vertically partitioned data
via entity resolution and additively homomorphic encryption,” 2017,
arXiv:1711.10677. [Online]. Available: http://arxiv.org/abs/1711.10677
[28] J. Kang, Z. Xiong, D. Niyato, H. Yu, Y.-C. Liang, and D. I. Kim,
“Incentive design for efficient federated learning in mobile networks:
A contract theory approach,” in Proc. IEEE VTS Asia Pacific Wireless
Commun. Symp. (APWCS), Aug. 2019, pp. 1–5.
[29] Y. Jiao, P. Wang, D. Niyato, B. Lin, and D. I. Kim, “Toward an
automated auction framework for wireless federated learning services
market,” IEEE Trans. Mobile Comput., early access, May 14, 2020, doi:
10.1109/TMC.2020.2994639.
Authorized licensed use limited to: Nirma University Institute of Technology. Downloaded on January 17,2023 at 06:50:34 UTC from IEEE Xplore. Restrictions apply.
CHENG et al.: DYNAMIC GAMES FOR SOCIAL MODEL TRAINING SERVICE MARKET
[30] Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, “Federated
learning with non-IID data,” 2018, arXiv:1806.00582. [Online]. Available: http://arxiv.org/abs/1806.00582
[31] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11,
pp. 2278–2324, Nov. 1998.
[32] K. Zhu, E. Hossain, and D. Niyato, “Pricing, spectrum sharing,
and service selection in two-tier small cell networks: A hierarchical
dynamic game approach,” IEEE Trans. Mobile Comput., vol. 13, no. 8,
pp. 1843–1856, Aug. 2014.
[33] K. Zhu and E. Hossain, “Joint mode selection and spectrum partitioning for device-to-device communication: A dynamic Stackelberg
game,” IEEE Trans. Wireless Commun., vol. 14, no. 3, pp. 1406–1420,
Mar. 2015.
[34] Cauchy-Lipschitz Theorem. Accessed: Feb. 25, 2021. [Online]. Available: https://www.encyclopediaofmath.org/index.php/Cauchy-Lipschitz_
theorem
[35] S. Sastry, “Lyapunov stability theory,” in Nonlinear Systems (Interdisciplinary Applied Mathematics), vol. 10. New York, NY, USA: Springer,
1999, doi: 10.1007/978-1-4757-3108-8_5.
[36] E. J. Dockner, S. Jorgensen, L. N. Van, and G. Sorger, Differential
Games in Economics and Management Science. Cambridge, U.K.:
Cambridge Univ. Press, 2000.
[37] D. Tabak, “Numerical solutions of differential game problems,” Int. J.
Syst. Sci., vol. 6, no. 6, pp. 591–599, 1975.
[38] Raspberry Pi Foundation. Raspberry Pi 3 Model B. Accessed:
Feb. 25, 2021. [Online]. Available: https://www.raspberrypi.org/
products/raspberry-pi-3-model-b/
Wenqing Cheng received the B.E. degree in
telecommunication engineering and the Ph.D. degree
in electronics and information engineering from the
Huazhong University of Science and Technology,
Wuhan, China, in 1985 and 2005, respectively.
She is currently a Professor with the School
of Electronic Information and Communications,
Huazhong University of Science and Technology.
Her research interests include mobile communications and wireless sensor networks, information
systems, and e-learning applications.
75
Yuze Zou received the B.E. degree in electronic information communications (EIC) from the
Huazhong University of Science and Technology,
Wuhan, China, in 2015, where he is currently pursuing the Ph.D. degree with the School of Electronic
Information and Communications.
His research interests include federated learning,
intelligent reflecting surface, and game theory and
its applications in networked systems.
Jing Xu received the B.E. degree in
telecommunication engineering and the Ph.D.
degree in electronics and information engineering
from the Huazhong University of Science and
Technology, Wuhan, China, in 2001 and 2011,
respectively.
He is currently an Associate Professor with
the School of Electronic Information and
Communications, Huazhong University of Science
and Technology. His research interests include
wireless networks and network security, with an
emphasis on performance optimization, game theory, and reinforcement
learning and their application in networked systems.
Wei Liu (Member, IEEE) received the B.E. degree
in telecommunication engineering and the Ph.D.
degree in electronics and information engineering
from the Huazhong University of Science and
Technology, Wuhan, China, in 1999 and 2004,
respectively.
He is currently an Associate Professor with
the School of Electronic Information and
Communications, Huazhong University of Science
and Technology. His research interests include
wireless networks, the Internet measurement, and
e-learning applications.