Transfer Learning for Crop Classification by Inter-Regional
Crop Spectral Differences
Hengbin Wang1, Yu Yao1, Zijing Ye1, Wanqiu Chang1, Junyi Liu1, Yuanyuan Zhao1,2*, Shaoming Li1,2, Zhe Liu1,2, Xiaodong Zhang1,2
1 College of Land Science and Technology, China Agricultural University, Beijing, China
2 Key Laboratory of Remote Sensing for Agri-Hazards, Ministry of Agriculture and Rural Affairs, Beijing, China
* Correspondence: zhaoyuanyuan@cau.edu.cn
Abstract
Transferring a model trained in a region with sufficient samples to another region with sparse samples
can solve the problem of classification in sample sparse regions, but the effect of transferring is related
to the difference in geographical environment between regions. Therefore, this study proposes a transfer
learning strategy that uses Inter-Regional Differences of Crop Spectra (IRDCS) to adapt to the inter-regional differences in geographic environments, as a way to achieve large-scale transfer of
classification models. In addition, a new crop classification model named Symmetric convolutional
neural network with Position encoding (PoSyNet) is proposed. This model can use the temporal
information of critical period of crop growth, as a way to discriminate the differences of the same time
period between different crops. A simple and effective method for quantifying IRDCS is designed, and
the IRDCS obtained by this method is representative and generalizable. To test the approaches proposed
in this study, the northwest region of China, with a sufficient number of training samples, was used as the
source domain, and the northeast region of China, with fewer samples, was used as the transfer target
domain. PoSyNet was compared with Transformer, Random Forest (RF), and Convolution+Transformer
(CT). In the source domain crop classification experiments, PoSyNet achieved an Overall Accuracy (OA)
of 93.14%, an improvement of 3.58%, 4.8%, and 1.82%, respectively, compared to the other three
methods, and PoSyNet still dominated the classification results in the target domain. The results showed
that PoSyNet had the ability to learn multi-level semantic features and spatial generalization. We found
that IRDCS can adapt the differences in geographic environments between regions, and adding IRDCS
to model transfer can substantially improve classification accuracy. In the cross-regional transfer
experiments, all four methods improved the accuracy of model transfer across years by more than 25%
after adding IRDCS. In the scenario without using current-year samples, the accuracy improvement of
the four methods was still close to 25% using only the IRDCS of samples of historical years. The results
indicated the generality of the proposed transfer strategy. Overall, this study provided a new idea for
large-scale crop classification and cross-regional model transfer.
Keywords: transfer learning; IRDCS; PoSyNet; quantifying differences; crop classification
1. Introduction
Accurate, timely, and reliable crop-type distribution maps provide important support for crop
growth monitoring, yield prediction, and food security (Franch et al 2015, Gallego et al 2008). In recent
years, remote sensing technology has developed rapidly, and its characteristics of wide observation range
and long duration have been widely used in the field of crop monitoring. The high-resolution observation
images provided by remote sensing provide a great convenience for large-scale crop classification(Chen
et al 2015, Gong et al 2013, Yu et al 2013), and crop classification using the unique spectra of each crop
provided by remote sensing has become the most common method(Bargiel 2017, Skakun et al 2017,
Zhang et al 2014, Zhong et al 2019a).
Ground-labeled samples are an important basis for achieving high accuracy in crop classification,
especially when the labeled sample set is very rich, which can further improve the classification accuracy
(Ghazaryan et al 2018). Field surveys and sampling of multiple areas are required for large-scale crop
classification to meet the demand for classification samples, but such large-scale sample acquisition is
labor-intensive and very expensive. Some studies acquired manually labeled samples by visual
interpretation of Google Earth high-resolution images (Dong et al 2015, Zhang et al
2015). These methods are feasible for acquiring labeled samples of bulk crops, but require experts with
extensive knowledge and are not applicable for acquiring a large volume of labeled samples. The
Cropland Data Layer (CDL) released by the USDA is an effective way to obtain ground-based crop types
(Boryan et al 2011), but this way of acquiring crop type maps from a large number of field-labeled
samples is not easy to achieve in other countries (Song et al 2017).
Classification models with high robustness and generalization are another basis for crop
classification. Several studies have shown that Deep Learning(DL) models are more robust and more
generalizable than traditional machine learning methods for crop classification(Xu et al 2020, Yuan et al
2022, Yuan et al 2023, Zhong et al 2019a). In a few cases, machine learning methods such as Random
Forests and Multilayer Perceptron (MLP) remain general(Lin et al 2022, Skakun et al 2016, Wang et al
2019). Among DL models, Recurrent Neural Networks (RNN) (Minh et al 2018, Sharma et al 2018),
Transformer(Yuan & Lin 2021) and Convolutional Neural Networks (CNN) (Lei et al 2021, Wang et al
2022) perform particularly well in pixel-level crop classification because of their ability to model the
processing of high-dimensional series data. Further, fusing two deep learning models has also received
extensive attention(Wang et al 2023). Existing DL models in the crop classification are more often
derived from other fields, including Computer Vision (CV) and Natural Language Processing (NLP).
Although these models can also effectively solve the problems in crop classification, their application to
crop classification is not well matched (Güldenring & Nalpantidis 2021). In addition, DL depends on
sufficient samples and does not perform well in regions with sparse samples.
Existing approaches to solve the problem of crop classification in sample-sparse regions can be
divided into two categories based on the source of training samples(Aneece & Thenkabail 2021, Hao et
al 2020, Hao et al 2016, Huang et al 2020).(1) Sample generation: generate a large number of training
samples in regions with sparse samples, and then use the generated samples to train the model in the
target domain.(2) Transfer learning: train the model in sample-rich regions, and then use the trained
model for classification in sample-sparse regions. Specifically, while sample generation produces a large
number of samples, the generated samples depend on a small number of samples from the target domain,
which makes the similarity between the two extremely high and the trained model has obvious
drawbacks(Belgiu et al 2021b). In addition, some methods that generate samples need to find key
screening points from the spectral profile(Belgiu et al 2021a, Lin et al 2022, Malambo & Heatwole 2020),
but this is not applicable to most crops. Transfer learning is the most direct way to solve the crop
classification problem in sample-sparse regions, and models trained on the rich samples of the source
domain are robust (Nowakowski et al 2021a). There are limits, however: model transfer on a small scale
is advantageous, whereas larger-scale transfer is affected by climatic conditions and the growing
environment (Wang et al 2019).
Several recent studies have revealed changes in the consistency and stability of crop spectra over
time, both between years and between regions(Lin et al 2022, Wang et al 2019, Zhong et al 2014). The
impact of these changes in crop spectra on transfer learning is significant, especially in different weather
and management practice contexts (Johnson & Mueller 2021). In fact, the most important factor affecting
transfer learning is the existence of differences in the spectra of the same crop between regions. Due to
the similarity of spectra of the same crop within the same region, it is feasible to calculate the differences
in crop spectra between the sample-sparse and sample-rich regions. However, how to quantify and adapt
to this difference remains the biggest challenge for cross-regional model transfer.
Considering that direct model transfer cannot obtain reliable classification results and that the
phenology of the same crop differs between geographical environments, we propose in this
paper a transfer learning strategy incorporating IRDCS and verify the generality of the
transfer strategy using four classification methods in study areas geographically separated by more than
3000 km, with the sample-sufficient Northwest China region as the source domain and the sparsely
sampled Northeast China region as the target domain.
Our novel contributions are threefold:
1) we constructed a variable-dimensional convolution module to extract different forms of semantic
features, and designed a symmetrically structured network model to ensure no loss of spectral information;
2) we proposed a computational method to quantify IRDCS and incorporated IRDCS into transfer
learning to address the unsatisfactory performance of models transferred across regions;
3) we evaluated the cross-regional and cross-year generality of IRDCS, highlighting the superiority of
variable-dimensional convolutional modules and symmetric network structures.
2. Related work
2.1 Deep learning structures in crop classification
In recent years, DL has become a cutting-edge technology for crop classification, thanks to its
excellent performance in processing series data. RNN, Transformer and CNN are three commonly used
DL structures in crop classification, which are good at extracting temporal features, global features and
local features, respectively. For example, Xu et al (2020) used a variant of RNN, Long Short Term
Memory (LSTM), to obtain temporal features from multi-temporal remote sensing data and demonstrated
that LSTM outperformed RF, MLP and Transformer, which are not good at extracting temporal
information. Further, Yuan et al (2022) used the Transformer encoding module to extract global and
temporal features using temporal-spatial domain spectra as input. Wang et al (2023) added a local feature
extraction module to this for classification. The results show that the more complete the features the
better the classification performance. However, in another work, Rawat et al (2021) found that one-dimensional CNNs perform better than a hybrid one-dimensional CNN-LSTM. This indicates that although
fusion structures can theoretically lead to better results, the issue of compatibility between structures
needs to be considered.
Although significant progress has been made in DL architectures in crop classification, especially
RNN and Transformer, the application of CNN in crop classification has rarely been considered, mainly
because CNN is not good at processing in the temporal domain(Xu et al 2020). In fact, RNN cannot be
computed in parallel, and Transformer requires a large number of training samples, which limits its
further development. Theoretically, CNN Convolutional Layer can extract local features, Fully
Connected Layer can fuse all local features to form global features, and if temporal features are
introduced on this basis, complete features can be formed. Therefore, it is worth considering how to
further develop CNN in crop classification.
2.2 Transfer learning in crop classification
Transfer learning is an important solution to the problem of crop classification in regions with sparse
samples. The source domain for transfer learning can be other domains, for example, Nowakowski et al
(2021b) transfer a classification model from CV to crop classification achieving accuracy improvements.
Similar work has been done in(Wang et al 2021, Yuan & Lin 2021). However, the upper performance
bound of model transfer across domains is often limited by the difference between the source and target
domains. The source domain for transfer learning can also be the historical year of the sample sparse
region. These studies have used the historical samples from several years to train the model or generate
the current year samples to train the model and thus complete the current year crop classification(Hao et
al 2016, Konduri et al 2020, Yaramasu et al 2020). These efforts are based on the premise that inter-annual variation in crop phenology is consistent; however, inter-annual variation in phenology
develops over time (Lin et al 2022). The source domain of transfer learning is more often other regions.
For example, several studies have trained models in sample-rich regions to achieve cross-state level model
transfer in the US (Wang et al 2019, Xu et al 2020). On a larger scale, Hao et al (2020) used high-confidence CDL data as training samples, and the trained model achieved satisfactory results in China,
achieving cross-country level transfer.
The essence of transfer learning is the assumption that phenology and growth patterns are similar
for the same crop (Hao et al 2020), but this assumption holds only for a limited number of regions or
years because of differences in regional cropping environments or yearly cropping management. However,
existing studies have focused on the transfer models themselves. So far, this difference has not been
investigated in transfer learning.
3. Materials
3.1 Study area
Fig. 1. Geography of three study areas. The upper left study area is the Ili River Valley (Study area Ⅰ),
the upper right study area is the Western Heilongjiang(Study area Ⅱ) and the lower right study area is the
Eastern Heilongjiang(Study area Ⅲ).
Three study areas were selected for this study as shown in Fig. 1. The first study area is located in
the Ili River valley in northwest China, which has a continental temperate climate with high precipitation,
large diurnal temperature difference, abundant light and heat resources, and annual sunshine hours
between 2699 and 3158 h. The main crops in the study area are maize, rice, and wheat, which account
for more than 90% of the total cultivated area, while other crops such as soybean and sunflower are
grown in smaller proportions. The second and third study areas are located in the northeast region of
China, with unique black land cultivation conditions, a cold-temperate continental monsoon climate with
abundant rainfall and sunshine hours between 2560 and 2700 h. The main crops in the study area are
soybean, rice, maize, and wheat, which account for more than 95% of the total cultivated area, and the
transfer effects of soybean, rice, and maize are focused on in the study of this paper.
The three study areas are geographically extremely far apart, but the main cropping structures are
similar and suitable for verifying the performance of transfer learning. In addition, since the study areas
belong to different climates and have differences in landscape conditions as well as land resources,
differences in the phenology of the same crop can occur, which is an important reason for selecting these
areas as the study areas.
3.2 Satellite imagery and sample dataset
We used remote sensing images from the GaoFen-1 satellite with a temporal resolution of 4 days
and a spatial resolution of 16 m. The remote sensing images contain four bands: blue, green, red and
near-infrared. The remote sensing images from March 1 to November 1 were selected for the time series analysis
according to the crop cultivation rules and the climatic conditions of the two regions. After 10% cloud
screening, the number of time phases of remote sensing images in different regions would be different,
which led to inconsistent time series length, so we performed linear time interpolation of the time series
with an interval of 10 days.
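The resampling step can be illustrated with a short sketch. It assumes a single band of one pixel, observation dates given as sorted DOY values, and a grid from DOY 60 to DOY 305 (March 1 to November 1 in a non-leap year). The function name `interpolate_to_grid` and the clamping at both ends of the series are illustrative choices, not details taken from the processing chain used in this study.

```python
from bisect import bisect_left


def interpolate_to_grid(doys, values, start=60, end=305, step=10):
    """Linearly interpolate one band's cloud-free observations onto a regular DOY grid.

    doys must be sorted in ascending order. Grid points that fall outside the
    observed range are clamped to the nearest observation (an assumption).
    """
    grid = list(range(start, end + 1, step))
    out = []
    for g in grid:
        if g <= doys[0]:
            out.append(values[0])
        elif g >= doys[-1]:
            out.append(values[-1])
        else:
            hi = bisect_left(doys, g)  # first observation at or after g
            lo = hi - 1                # last observation before g
            w = (g - doys[lo]) / (doys[hi] - doys[lo])
            out.append(values[lo] + w * (values[hi] - values[lo]))
    return grid, out
```

Applying the function to each of the four bands gives every pixel a series of identical length, regardless of how many scenes survived cloud screening in its region.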
In study area Ⅰ, we have abundant field samples, while in study areas Ⅱ and Ⅲ field
samples are fewer. Study area Ⅰ used samples collected in Northwest China in 2017, and the other two
study areas used samples from two consecutive years, 2017 and 2018. The sample categories
common to the three study areas were rice, maize, and soybean, while other crops, as well as land cover
types, were combined into an "other" category. The number of samples in each category is shown in Table
Ⅰ.
Table Ⅰ Sample size of the three study areas
Category | 2017, Study area Ⅰ | 2017, Study area Ⅱ | 2017, Study area Ⅲ | 2018, Study area Ⅱ | 2018, Study area Ⅲ
Rice | 30771 | 39 | 197 | 139 | 70
Maize | 171126 | 440 | 515 | 896 | 722
Soybean | 9457 | 1207 | 571 | 896 | 223
Other | 25584 | 364 | 197 | 277 | 84
Total | 236938 | 2050 | 1480 | 2208 | 1099
4. Methods
Our framework is designed to address the problem of crop classification and mapping in regions
with sparse samples. As shown in Fig. 2, the
proposed framework is divided into three parts: data processing, crop classification and validation,
and crop mapping. In the first part, the data in the source and target domains are divided into training
samples, validation samples and samples for calculating IRDCS. Of these, the proportion of the first two
is 80% and the proportion of the latter is 20%. IRDCS is applied to the validation samples to form
new validation samples for the target domain. In the second part, rich training samples from the source
domain are input to VdPyNet, and the trained model is transferred to the target domain for use and three
transfer results are obtained. In the third part, the optimal weights are first obtained from the second part.
Then the IRDCS of the three crops obtained in the first part are introduced into the data to be mapped to
form the new data to be mapped. Finally, the new data to be mapped are fed to the trained VdPyNet for
prediction, and the prediction result after classification probability filtering is the final mapping result.
Fig. 2. Illustration of the proposed framework. The figure has three panels: Data Processing, Crop Classification and Validation, and Crop Mapping.
4.1 Transfer learning using spectral differences
Stage 1. Dividing samples.
To prevent pixels of the same plot from being divided into both training and validation sets, we
devised an iterative sample division method. First, all pixels are divided 8:2 to obtain
$n_{train}$ and $n_{difference}$, where $n_{train}$ in the source and target domains represents the training and
validation sets, respectively. Then all plots are divided 8:2 into $N_{train}$ and $N_{difference}$,
where the sets of pixels in $N_{train}$ and $N_{difference}$ are $\hat{n}_{train}$ and $\hat{n}_{difference}$, respectively. Finally, with
$\Delta = |\hat{n}_{difference} - n_{difference}| + |\hat{n}_{train} - n_{train}|$ as the optimization function, the division for which $\Delta$ is smallest
is taken as the final division result.
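The iterative division of Stage 1 can be sketched as a random search over plot-level splits. The sketch assumes the optimization function is the sum of absolute deviations between the plot-level and pixel-level subset sizes, and that "iterative" means repeated random shuffles of the plots. The function name `split_by_plot`, the number of trials, and the seed are hypothetical.

```python
import random


def split_by_plot(plot_ids, train_frac=0.8, n_trials=200, seed=0):
    """Split pixel indices so that no plot contributes to both subsets.

    plot_ids[k] is the plot label of pixel k. Plots are shuffled n_trials
    times; the plot-level split whose pixel counts deviate least from the
    pixel-level 8:2 target is kept.
    """
    rng = random.Random(seed)
    pixels_of = {}
    for idx, pid in enumerate(plot_ids):
        pixels_of.setdefault(pid, []).append(idx)
    plots = sorted(pixels_of)
    n_total = len(plot_ids)
    n_train_target = round(train_frac * n_total)
    n_diff_target = n_total - n_train_target
    n_train_plots = max(1, round(train_frac * len(plots)))

    best = None
    for _ in range(n_trials):
        rng.shuffle(plots)
        train_plots = plots[:n_train_plots]
        n_train = sum(len(pixels_of[p]) for p in train_plots)
        n_diff = n_total - n_train
        # Deviation of the plot-level split from the pixel-level 8:2 target.
        delta = abs(n_diff - n_diff_target) + abs(n_train - n_train_target)
        if best is None or delta < best[0]:
            best = (delta, set(train_plots))

    train_set = best[1]
    train = [i for i, p in enumerate(plot_ids) if p in train_set]
    held_out = [i for i, p in enumerate(plot_ids) if p not in train_set]
    return train, held_out, best[0]
```

Because whole plots are assigned to one side, the achieved ratio is only approximately 8:2; the returned deviation shows how close the best split came.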
Stage 2. Calculating spectral differences between individual samples.
The motivation for IRDCS comes from image denoising, where all the differences between an image
with noise and an image without noise are reflected in the noise, and more specifically in the differences
at each pixel. Thus, the crop spectral reflectance in the source domain can be seen as pure, while the crop
spectral reflectance in the target domain is noisy, as reflected in the spectral reflectance at each Day of Year
(DOY) in the crop growth period. The relationship between the two can be described by equation (1):
$$y(t) = x(t) + g(t) \qquad (1)$$
where y(t) and x(t) denote the crop spectral reflectance of the target and source domains, respectively,
g(t) denotes the noise between them, and t denotes the DOY. However, when x(t) is 0, the noise remains,
which is not physically reasonable. Therefore, in equation (1), we add a causal factor $\alpha = x(t)$
to the noise, i.e.
$$y(t) = x(t) + \alpha \cdot g(t) \qquad (2)$$
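Assuming the causal factor equals the source reflectance x(t), which is what the later division of the data to be mapped by one plus the crop-specific difference suggests, equation (2) can be rearranged to show that the noise term is a relative difference:

```latex
\begin{aligned}
y(t) &= x(t) + x(t)\,g(t) = x(t)\,\bigl(1 + g(t)\bigr), \\
g(t) &= \frac{y(t) - x(t)}{x(t)}, \qquad
x(t) = \frac{y(t)}{1 + g(t)} \qquad \bigl(x(t) \neq 0,\; g(t) \neq -1\bigr).
\end{aligned}
```

Under this reading, x(t) = 0 gives y(t) = 0, so the noise no longer persists when the source reflectance vanishes.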
Stage 3. Calculating IRDCS.
Since there is no correspondence between the source and target domain samples, a fuzzy
correspondence is used for the calculation of IRDCS. Taking each source domain sample $x_j$ as a unit, L
samples are randomly selected from the target domain as the fuzzy set $y_i$. The spectral difference
between each $x_j$ in the source domain and the $y_i$ is calculated, and the mean value of the result is
used as the IRDCS. The IRDCS is computed as follows:
$$G = \frac{\sum_{j=1}^{n} \sum_{i \in L} \left(y_i - x_j\right)}{n \sum_{j=1}^{n} x_j} \qquad (3)$$
where G denotes IRDCS, $L \in [0, c]$ is a random number, and c is the maximum number
of samples of each class used to calculate the spectral difference in the target domain. n denotes the
number of samples in the source domain. Further, $G_C$ denotes the IRDCS of different crops and C
denotes the crop category.
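A minimal sketch of the fuzzy-correspondence calculation is given below for one crop and one band. It assumes that G is a relative difference, computed per time step as the summed difference between randomly drawn target series and each source series, divided by the summed source reflectance over the same pairs. The normalization, the function name `irdcs`, and the choice to draw at least one target sample per source sample are assumptions made for illustration and may differ from the exact form of equation (3).

```python
import random


def irdcs(source, target, max_draw=5, seed=0):
    """Estimate a per-time-step relative spectral difference G.

    source, target: lists of equal-length reflectance time series.
    For every source series x_j, between 1 and max_draw target series y_i
    are drawn at random (the "fuzzy" correspondence). G[t] is the summed
    difference y_i[t] - x_j[t] divided by the summed x_j[t] over all pairs.
    """
    rng = random.Random(seed)
    n_steps = len(source[0])
    diff_sum = [0.0] * n_steps
    src_sum = [0.0] * n_steps
    for x in source:
        n_draw = rng.randint(1, min(max_draw, len(target)))
        for y in rng.sample(target, n_draw):
            for t in range(n_steps):
                diff_sum[t] += y[t] - x[t]
                src_sum[t] += x[t]
    # Guard against time steps where the source reflectance sums to zero.
    return [d / s if s != 0 else 0.0 for d, s in zip(diff_sum, src_sum)]
```

Running the function once per crop category and per band yields one difference curve per crop, which is the quantity the mapping stage relies on.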
Stage 4. Crop mapping using IRDCS.
That $G_C$ is unique for each crop is a prerequisite for crop mapping using IRDCS. The data to be
mapped X are combined with IRDCS to form the new data to be mapped, $\hat{X} = X / (1 + G_C)$. $\hat{X}$ is fed
into the trained model for prediction. It is worth noting that we did not use the source domain spectra
minus the IRDCS to generate the target domain spectra. Converting the rich and diverse samples in the
source domain into samples that are very similar to the sparse target domain spectra is very uneconomical,
and the resulting model has poor generalization ability.
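The correction applied to the data to be mapped can be sketched as an element-wise division, assuming the difference curve of each crop has the same length as the time series. The function names `apply_irdcs` and `correct_per_crop` are hypothetical, and the sketch handles one band of one pixel.

```python
def apply_irdcs(series, g):
    """Map a target-domain series toward the source domain: x_hat[t] = x[t] / (1 + g[t])."""
    if len(series) != len(g):
        raise ValueError("series and g must have the same length")
    return [v / (1.0 + gt) for v, gt in zip(series, g)]


def correct_per_crop(series, g_by_crop):
    """Return one corrected copy of the series per crop, since each crop has its own G_C."""
    return {crop: apply_irdcs(series, g) for crop, g in g_by_crop.items()}
```

Each corrected copy would then be passed to the trained model, and the crop label is accepted only where the predicted probability for that crop is high enough.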
4.2 Variable-dimension Position Symmetric Net
The time point of crop spectral reflectance acquisition is an important feature; combining the
spectral reflectance and the acquisition time as the input of the classification model can improve the model's
understanding of crop growth patterns. By setting different depths of convolutional layers, the features
of different layers can be extracted to learn the spectral variation of crops in a diversified way. Based on
this, we designed the Variable-dimension Position Symmetric Net (VPSNet) model for crop classification,
which takes the crop spectral and temporal information as input, a symmetric network architecture as the
feature extractor, and a single Fully Connected Layer as the classifier. VPSNet consists of several blocks,
each with different depths of hidden layers, to extract the features of different layers. The specific
network architecture is shown in Fig. 3.
Input module: the time series $(se_1, se_2, \ldots, se_T)$ of each pixel is encoded as a d-dimensional
spectral feature vector $sf_i$ by equation (4), where $se_t$ denotes the spectral reflectance of the four bands, T
denotes the length of the time series, and $i \in [0, T]$; the acquisition dates $(doy_1, doy_2, \ldots, doy_T)$ of the
time series are encoded as a d-dimensional temporal feature vector $tf_i$ by equation (5). The spectral and
temporal features are concatenated into $x_i$ by equation (6) as the input to the model.
$$sf_i = f(se_i) \qquad (4)$$
$$tf_i(p) = \begin{cases} \sin\left(doy_i / 1000^{2k/d}\right) & \text{if } p = 2k \\ \cos\left(doy_i / 1000^{2k/d}\right) & \text{if } p = 2k + 1 \end{cases} \qquad (5)$$
$$x_i = \mathrm{Concat}(tf_i, sf_i) \qquad (6)$$
where $k \in [0, d]$, d is the dimensionality of the feature vector, and i indexes the time step.
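The temporal encoding and the concatenation can be sketched in a few lines. The sketch assumes an even dimension d, the base 1000 from equation (5), and spectral features that have already been produced by some encoder f, which is not specified here. The function names `doy_encoding` and `build_input` are hypothetical.

```python
import math


def doy_encoding(doy, d=8, base=1000.0):
    """Return a d-dimensional sinusoidal encoding of one day-of-year value.

    Even positions p = 2k use sin and odd positions p = 2k + 1 use cos,
    both with the argument doy / base ** (2k / d).
    """
    if d % 2:
        raise ValueError("d must be even")
    enc = []
    for k in range(d // 2):
        angle = doy / base ** (2 * k / d)
        enc.extend([math.sin(angle), math.cos(angle)])
    return enc


def build_input(doys, spectral_features, d=8):
    """Concatenate temporal and spectral features per time step: x_i = Concat(tf_i, sf_i)."""
    return [doy_encoding(doy, d) + list(sf) for doy, sf in zip(doys, spectral_features)]
```

With d = 8 and eight spectral features per time step, each element of the model input is a 16-dimensional vector.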
VPSNet Module: In this study, each Block in VPSNet with different hidden layer depths extracts
features from the input vector at different levels. Although the depth of the middle hidden layer of each
Block is changed according to the number of Blocks, the input layer and the output layer are of equal
depth, which ensures that the information of the input feature vector is not lost. Compared with images,
the redundancy of crop time series is low, so it is not possible to use a CNN structure similar to that of
image classification. We designed the Block with variable depth of the middle hidden layer to ensure no
information loss and to satisfy the diverse hierarchy of the network structure. In addition, to prevent the
gradient from vanishing during the training process, we adopted a residual structure in each Block.
The hidden layer vector h, the input vector input, and the output vector output of each Block can be expressed as
$$input^{l} = output^{l-1} + input^{l-1} \qquad (7)$$
$$\begin{cases} h^{l}_{l_1} = \mathrm{LayerNorm}\left(w^{l}_{l_1} * input^{l} + b^{l}_{l_1}\right) \\ h^{l}_{l_2} = \mathrm{act}\left(w^{l}_{l_2} \cdot h^{l}_{l_1} + b^{l}_{l_2}\right) \\ output^{l} = w^{l}_{l_3} \cdot h^{l}_{l_2} + b^{l}_{l_3} \end{cases} \qquad (8)$$
where l denotes the lth Block, $l_i$ denotes the ith hidden layer in the lth Block, w and b denote the learnable weight
matrices and biases, act is the GELU activation function, and * denotes the convolution operation.
LayerNorm is used after the first convolution layer because it is more effective than BatchNorm for
sequence processing, as has been demonstrated in previous studies (Liu et al 2022). The first hidden layer
is a one-dimensional convolution, with the convolution kernel size chosen from the
candidate values 3, 5, 7, 9, and 11. The last two hidden layers consist of Linear Layers, extracting features in
multiple dimensions. Compared with the traditional CNN network structure, VPSNet uses fewer
activation layers. Previous network structures used a fixed pairing of convolution + activation, while
we use only one activation function in a Block. The inverted bottleneck structure allows VPSNet to
reduce the number of parameters while maintaining competitive performance and improving the
efficiency of the overall model.
Output module: This module consists of a Global Maximum Pooling Layer and a Linear Layer. The
Linear Layer output vector is computed using Softmax to obtain the classification labels and
classification probabilities.
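The output module can be illustrated with plain Python lists. The sketch assumes the feature extractor returns one d-dimensional vector per time step and that pooling takes the channel-wise maximum over time. The weights, the bias, and the `min_prob` threshold are hypothetical; the threshold stands in for the classification probability filtering mentioned in the framework description.

```python
import math


def global_max_pool(features):
    """features: list of T time steps, each a list of d channels -> d channel-wise maxima."""
    return [max(channel) for channel in zip(*features)]


def linear(vec, weights, bias):
    """weights holds one row of len(vec) coefficients per class."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, bias, min_prob=0.0):
    """Return (label, probability); label is None when the top probability is below min_prob."""
    probs = softmax(linear(global_max_pool(features), weights, bias))
    label = max(range(len(probs)), key=probs.__getitem__)
    return (label if probs[label] >= min_prob else None), probs[label]
```

Raising `min_prob` leaves low-confidence pixels unlabeled, which trades map coverage for reliability.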
The training process uses the cross-entropy function as the loss function and the Adam optimizer
for parameter updates. To prevent the model from overfitting, Dropout is used, which
randomly drops some of the units in each Block; the dropout rate is set to 0.1.
4.3 Experimental setup
4.3.1 Different forms of network architectures
To evaluate the performance of the proposed variable-dimension Block, a comprehensive
comparison was conducted to compare five different forms of net architectures (Fig. X). To ensure a fair
comparison, the number of Blocks is the same for all network structures and all with a position encoding
module. A brief description of the five structures is as follows:
(1) Invariant dimensional (ID): In this architecture, all blocks have the same dimensions and the
inputs and outputs of each block are of the same dimension. The overall structure is symmetric.
(2) Up-sampling (US): In this architecture, the dimensionality of blocks is variable, the
dimensionality of each block is incremental, and the input and output of each block are equally
dimensional. The overall structure is asymmetric.
(3) Down-sampling (DS): In this architecture, the dimensionality of blocks is variable, the
dimensionality of each block is decreasing, and the input and output of each block are equally
dimensional. The overall structure is asymmetric.
(4) Input-output non-reciprocal (IONR): In this architecture, the dimensionality of the blocks is
variable, and the inputs and outputs of each block are non-identical in dimensionality. The overall
structure is symmetric.
(5) No position encoding (NoPE): In this architecture, the dimensionality of blocks is variable,
and the input and output of each block are equally dimensional. The overall structure is symmetric, but
the inputs carry no position encoding.
4.3.2 Different classification methods
For comparison, we evaluated three other classification methods in the three study areas.
RF is an effective traditional classification method known for its resistance to overfitting and
low complexity. RF is often used as a baseline model, and it performs very well in crop
classification (Lin et al 2022, Wang et al 2019). Transformer consists of multiple self-attention layers that
can extract dependencies between long time sequences and has been successfully applied in several crop
classification works(Yuan & Lin 2021, Yuan et al 2022). Adding CNN to the Transformer has also
become a mainstream solution in crop classification, where features can be extracted from both global
and local perspectives(Li et al 2020, Wang et al 2023). VPSNet can be compared with RF to highlight
the importance of temporal-domain learning, with Transformer to highlight the importance of local
feature learning, and with CNN+Transformer to highlight the advantages of VPSNet structure.
For the hyperparameter settings, RF has two parameters to set, n_estimator and max_features, which
are 500 and 4 in this study. The Transformer hyperparameters are as follows:
the self-attention hidden layer dimension is 256, the number of heads of multi-head
attention is 8, the number of Transformer Block layers is 3, and the fully connected layer
dimension is 1024. For CNN+Transformer, the hyperparameter settings of the Transformer part remain
unchanged; in the CNN part, the convolutional layer dimension is 256, the convolutional kernel is 3 × 3, and the stride is 1.
The other settings of Transformer and CNN+Transformer, such as the optimizer and loss function, are the
same as for VPSNet.
4.3.3 Different ways of IRDCS utilization
In Section 4.1 we introduced a way of utilizing IRDCS. For comparison, we devised another way
of exploiting IRDCS: after obtaining it, we applied it to the source domain, generated training samples
in the target domain, and trained the model using the newly generated samples. In contrast to the approach
in Section 4.1, we generate a large number of samples in the target domain to solve the crop classification
problem in regions with sparse samples. The purpose is to emphasize the difference between generated
samples and model transfer.
4.3.4 Experimental setup and accuracy assessment
The number of blocks in VPSNet is set to 9, 11, and 13, and the dimension of the initial hidden layer is
256. Training uses epoch=20 for study area Ⅰ and epoch=1000 for study areas Ⅱ and Ⅲ; the batch size
is set to 256, the learning rate is set to 1e-5, and the dropout rate is set to 0.1. The optimizer is Adam
and the loss function is the cross-entropy function.
The entire experiment was run on a Windows platform configured with an i7-11700K @ 3.60 GHz,
32 GB RAM, and an NVIDIA GeForce RTX 3080 GPU (10 GB memory), and all programs were written in
Python.
In this study, several evaluation metrics were used to evaluate the proposed transfer learning strategy
and classification model. Overall Accuracy (OA) is used to evaluate the classification accuracy of the
classification model and the overall performance of the transfer learning strategy. Intersection over Union
(IoU) calculates the overlap of the prediction results of the two crop types and is used to evaluate the
mapping effectiveness of the transfer learning strategy. Equations (9), (10), and (11) were used to calculate OA,
IoU, and the F1 score.
$$OA = \frac{\sum_{i=1}^{N} n_i^{corr}}{\sum_{i=1}^{N} n_i^{all}} \qquad (9)$$
$$IoU_{i,j} = \frac{n_j^{pred} \cap n_i^{pred}}{n_i^{pred} \cup n_j^{pred}} \qquad (10)$$
$$F1_i = 2 \cdot \frac{UA_i \cdot PA_i}{UA_i + PA_i} \qquad (11)$$
where N represents the number of categories, i represents the ith category, $n_i^{corr}$ represents the number of
samples correctly classified in the ith category, and $n_i^{all}$ denotes the number of samples used for validation in
the ith category. $n^{pred}$ represents the number of samples predicted, $\cap$ represents the number of pixels
that overlap in the prediction of two categories of crops, and $\cup$ represents the sum of all
pixels in the prediction of two categories of crops. UA and PA represent user accuracy and producer
accuracy.
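The three metrics can be computed directly from label lists and sets of pixel indices. The sketch below assumes that user accuracy corresponds to precision (correct predictions divided by all predictions of a class) and producer accuracy to recall (correct predictions divided by all reference samples of a class). The function names are hypothetical.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of validation samples whose predicted class equals the reference class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred, cls):
    """F1 from user accuracy (precision) and producer accuracy (recall) for one class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    n_pred = sum(p == cls for p in y_pred)
    n_ref = sum(t == cls for t in y_true)
    ua = tp / n_pred if n_pred else 0.0
    pa = tp / n_ref if n_ref else 0.0
    return 2 * ua * pa / (ua + pa) if ua + pa else 0.0


def iou(pixels_i, pixels_j):
    """Overlap of two sets of predicted pixel indices divided by their union."""
    a, b = set(pixels_i), set(pixels_j)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

For example, with reference labels [0, 0, 1, 1, 2] and predictions [0, 1, 1, 1, 2], four of five samples are correct, so OA is 0.8; class 1 has a user accuracy of 2/3 and a producer accuracy of 1, giving an F1 of 0.8.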
5. Results and analysis
5.1 Comparison of different network architectures
Table 2 reports the accuracy results for different network architectures. It can be observed that
our method (VPSNet-B) always performs best, due to the use of the variable-dimensional symmetric
architecture. It can also be observed that the six architectures perform better in study area Ⅰ (SA
Ⅰ) than in study area Ⅱ (SA Ⅱ) and study area Ⅲ (SA Ⅲ), which shows that the model achieves higher
accuracy when samples are sufficient. Among all architectures, the US architecture performs the worst,
which shows that asymmetrically increasing the dimensions of hidden layers adds
redundancy that degrades the results. The IONR architecture performs second worst, and its inconsistent
input and output dimensions corrupt the information of the original sequence. It is worth noting that the
ID architecture achieves second-best results with fewer parameters, but performs very poorly in study
area Ⅰ. The DS and NoPE architectures also perform satisfactorily, which demonstrates the importance
of keeping the output and input dimensions consistent in crop classification.
Table 3 reports the F1 scores for each class obtained by the different network architectures in the
three study areas for the two years. VPSNet-B obtained the best F1 score in 7 out of 15 scenarios. In
some scenarios, such as 2017_SA Ⅱ soybean and 2018_SA Ⅱ soybean, the proposed method obtained the
second-best score. For a few network architectures the results vary widely across scenarios; for
example, the US architecture has an F1 score of 0% for rice and maize in 2017_SA Ⅱ but a much higher
score for soybean. In some scenarios, such as 2017_SA Ⅲ maize and 2018_SA Ⅲ rice, VPSNet-B scores
lower than the DS and ID architectures, respectively, possibly because variable dimensionality (e.g.,
upsampling) adds redundant information that is retained in the classification features in subsequent
layers; this occurs mostly in regions with sparse samples.
Table 2 Accuracy assessment (OA, %) of different network architectures
| Network architecture | 2017_SA Ⅰ | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ | Parameter count |
|---|---|---|---|---|---|---|
| ID | 79.15 | 80.27 | 65.50 | 80.98 | 72.34 | 98k |
| US | 79.02 | 64.23 | 47.66 | 66.59 | 62.13 | 24M |
| DS | 95.03 | 76.89 | 66.67 | 80.73 | 68.94 | 24M |
| IONR | 80.84 | 76.89 | 61.99 | 77.80 | 57.87 | 17M |
| NoPE | 91.38 | 71.53 | 65.20 | 73.66 | 71.49 | 16M |
| VPSNet-B | 95.53 | 81.02 | 66.37 | 80.73 | 76.17 | 16M |
VPSNet-B: VPSNet Base Model, configured with 11 blocks.
Parameter count: number of trainable parameters without/with auxiliary classifiers
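Table 3 F1-score per scenario (%) of different network architectures
| Crop | Network architecture | 2017_SA Ⅰ | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ |
|---|---|---|---|---|---|---|
| Rice | ID | 99.01 | 75.00 | 82.80 | 88.89 | 80.77 |
| Rice | US | 99.02 | 0.00 | 59.51 | 25.00 | 0.00 |
| Rice | DS | 98.41 | 61.54 | 79.29 | 89.66 | 68.00 |
| Rice | IONR | 98.81 | 76.89 | 78.37 | 96.25 | 33.33 |
| Rice | NoPE | 89.80 | 54.55 | 82.42 | 75.00 | 79.31 |
| Rice | VPSNet-B | 98.92 | 80.00 | 79.10 | 96.30 | 73.68 |
| Maize | ID | 85.63 | 75.15 | 54.54 | 81.70 | 81.02 |
| Maize | US | 85.52 | 0.00 | 44.25 | 68.57 | 76.39 |
| Maize | DS | 97.29 | 70.00 | 62.24 | 82.65 | 80.00 |
| Maize | IONR | 87.53 | 67.47 | 54.36 | 78.02 | 71.35 |
| Maize | NoPE | 94.96 | 47.95 | 57.14 | 74.01 | 79.61 |
| Maize | VPSNet-B | 97.69 | 76.83 | 56.99 | 80.38 | 84.32 |
| Soybean | ID | 81.45 | 88.38 | 65.92 | 87.12 | 43.37 |
| Soybean | US | 80.43 | 76.64 | 46.67 | 75.29 | 0.00 |
| Soybean | DS | 82.33 | 84.46 | 65.85 | 85.48 | 8.00 |
| Soybean | IONR | 80.84 | 84.97 | 62.73 | 85.71 | 21.33 |
| Soybean | NoPE | 77.77 | 80.59 | 64.34 | 80.31 | 39.51 |
| Soybean | VPSNet-B | 80.97 | 87.65 | 68.50 | 86.81 | 58.95 |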
5.2 Evaluation of VPSNet
In this set of experiments, we verified the effectiveness of the proposed VPSNet. On the basis of
VPSNet-B, we added VPSNet-S and VPSNet-L, which represent a small-scale and a large-scale VPSNet,
respectively.
As shown in Table 4, the results of VPSNet-B are generally better than those of the three comparison
methods (RF, Transformer and CT). Specifically, VPSNet-B improved the average OA by 3.14%, 1.7%, and
1.755% over the other methods in the three study areas, respectively. VPSNet-S also performed well,
matching or exceeding the best comparison method in three of the five scenarios with far fewer
parameters than the other deep learning models. VPSNet-L outperformed all the comparison methods in
every scenario. It is noteworthy that the methods generally achieve high accuracy in SA Ⅰ, which is
rich in samples, while the results in SA Ⅱ and SA Ⅲ, which are sparse in samples, are relatively low.
This shows that the choice of classification method alone cannot solve the crop classification
problem in sample-sparse areas.
Table 5 summarizes the F1 scores for each scenario for the different methods. VPSNet achieves the
best performance in the vast majority of scenarios (12 out of 15), for example with an improvement
of more than 6% for 2018_SA Ⅲ rice. Compared with CT, which can also extract complete features, the
proposed method is better in almost every scenario. This may be because, although CT is able to
extract complete features, its feature fusion is imperfect owing to the mutual exclusivity between
the individual network architectures.
Table 4 Accuracy assessment (OA, %) of different methods
| Method | 2017_SA Ⅰ | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ | Parameter count |
|---|---|---|---|---|---|---|
| RF | 91.32 | 78.59 | 67.54 | 79.76 | 71.49 | < 1k |
| Transformer | 77.89 | 76.64 | 62.28 | 78.29 | 69.79 | 24M |
| CT | 92.39 | 80.05 | 61.70 | 80.71 | 67.23 | 28M |
| VPSNet-S | 92.94 | 80.05 | 61.70 | 79.51 | 72.34 | 7.6M |
| VPSNet-B | 95.53 | 81.02 | 66.37 | 80.73 | 76.17 | 16M |
| VPSNet-L | 96.06 | 81.27 | 67.87 | 82.54 | 78.06 | 31M |
VPSNet-S: VPSNet Small Model, configured with 9 blocks.
VPSNet-L: VPSNet Large Model, configured with 13 blocks.
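Table 5 F1-score per scenario (%) of different methods
| Crop | Method | 2017_SA Ⅰ | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ |
|---|---|---|---|---|---|---|
| Rice | RF | 98.68 | 0.00 | 79.01 | 92.31 | 69.57 |
| Rice | Transformer | 97.79 | 82.35 | 76.47 | 76.47 | 71.64 |
| Rice | CT | 88.44 | 80.05 | 76.39 | 82.76 | 69.57 |
| Rice | VPSNet-S | 98.25 | 66.67 | 76.54 | 88.00 | 71.70 |
| Rice | VPSNet-B | 98.92 | 80.00 | 79.10 | 96.30 | 73.68 |
| Rice | VPSNet-L | 98.20 | 71.43 | 80.21 | 98.89 | 80.00 |
| Maize | RF | 94.99 | 70.73 | 63.16 | 80.59 | 81.11 |
| Maize | Transformer | 84.80 | 67.05 | 52.91 | 63.53 | 48.57 |
| Maize | CT | 95.65 | 75.27 | 59.59 | 80.65 | 78.48 |
| Maize | VPSNet-S | 96.03 | 72.05 | 46.86 | 77.99 | 81.22 |
| Maize | VPSNet-B | 97.69 | 76.83 | 56.99 | 80.38 | 84.32 |
| Maize | VPSNet-L | 98.38 | 75.43 | 57.48 | 80.38 | 85.19 |
| Soybean | RF | 86.30 | 86.05 | 67.67 | 85.64 | 32.43 |
| Soybean | Transformer | 83.94 | 84.71 | 63.53 | 85.41 | 51.49 |
| Soybean | CT | 83.18 | 76.68 | 60.98 | 86.60 | 33.73 |
| Soybean | VPSNet-S | 80.27 | 87.48 | 63.80 | 86.61 | 53.06 |
| Soybean | VPSNet-B | 80.97 | 87.65 | 68.50 | 86.81 | 58.95 |
| Soybean | VPSNet-L | 81.42 | 87.90 | 67.88 | 87.33 | 63.68 |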
5.3 Evaluation of IRCSD in Transfer Learning
In this set of experiments, we compared three transfer learning methods: 1. direct transfer (TL 1); 2.
adding IRCSD to the source domain (TL 2); 3. adding IRCSD to the target domain (TL 3).
As shown in Table 6, TL 3 has significant advantages over the other two transfer learning methods.
Specifically, TL 3 has a minimum improvement of more than 25% and a maximum improvement of more
than 70% in OA over TL 1, and a minimum improvement of more than 9% and a maximum improvement
of more than 45% over TL 2. The introduction of IRCSD significantly improved the transfer learning
results, which demonstrates the effectiveness of IRCSD. The improvement is clear for all four
methods, which demonstrates the generality of IRCSD. It is worth noting that TL 2, although it
improves the results in most scenarios, does not exceed the local results, while TL 3 obtains better
results than the local ones in most scenarios. This shows the superiority of adding IRCSD to the
target domain (TL 3) rather than to the source domain (TL 2). Although TL 2 generates samples for
the target domain, the trained model is less generalizable.
Table 7 provides the F1 scores for each scenario obtained with the different transfer learning methods
using VPSNet-B. The introduction of IRCSD resulted in a significant increase in F1 scores in almost
all scenarios, which demonstrates the effectiveness of the proposed IRCSD. In some scenarios, such
as 2017_SA Ⅲ maize, 2018_SA Ⅲ maize and 2018_SA Ⅲ rice, TL 2 performs worse than TL 1, which
suggests that the generated samples contain noise and false information. The transfer results of TL 1
are almost always unsatisfactory. Differences in geographic location and climatic conditions may be
responsible for this result; these differences appear as differences between the crop spectra of the
regions, as confirmed in Section 6.1.
Table 6 Accuracy assessment (OA, %) of different transfer learning methods
| Method | Transfer learning | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ |
|---|---|---|---|---|---|
| RF | TL 1 | 7.40 | 31.13 | 26.97 | 68.81 |
| RF | TL 2 | 53.85 | 46.36 | 61.58 | 42.66 |
| RF | TL 3 | 68.34 | 81.79 | 81.07 | 92.66 |
| Transformer | TL 1 | 6.80 | 29.80 | 19.77 | 62.11 |
| Transformer | TL 2 | 64.50 | 48.68 | 68.08 | 43.12 |
| Transformer | TL 3 | 83.79 | 83.71 | 77.01 | 91.58 |
| CT | TL 1 | 6.73 | 28.48 | 23.45 | 64.22 |
| CT | TL 2 | 71.01 | 51.66 | 73.73 | 62.39 |
| CT | TL 3 | 82.54 | 83.74 | 81.36 | 89.91 |
| VPSNet-B | TL 1 | 7.14 | 30.53 | 19.21 | 64.86 |
| VPSNet-B | TL 2 | 64.79 | 50.00 | 67.23 | 47.25 |
| VPSNet-B | TL 3 | 82.84 | 82.45 | 86.72 | 93.12 |
| Localmean | | 79.08 | 64.47 | 79.87 | 71.17 |
| Localmax | | 81.02 | 67.54 | 80.73 | 76.17 |
Localmean: Mean of local results of four methods.
Localmax: Max of local results of four methods.
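The Localmean and Localmax rows can be reproduced from the local OA values that Table 4 reports for
RF, Transformer, CT and VPSNet-B. The short check below is a sketch of that arithmetic; the dictionary
layout and the function name local_summary are illustrative choices, and the scenario keys use ASCII
"II" and "III" in place of the Roman numeral characters.

```python
# Local OA (%) of the four methods per scenario, copied from Table 4.
local_oa = {
    "2017_SA II": {"RF": 78.59, "Transformer": 76.64, "CT": 80.05, "VPSNet-B": 81.02},
    "2017_SA III": {"RF": 67.54, "Transformer": 62.28, "CT": 61.70, "VPSNet-B": 66.37},
    "2018_SA II": {"RF": 79.76, "Transformer": 78.29, "CT": 80.71, "VPSNet-B": 80.73},
    "2018_SA III": {"RF": 71.49, "Transformer": 69.79, "CT": 67.23, "VPSNet-B": 76.17},
}


def local_summary(oa_by_method):
    """Return (mean, max) of the local OA values of the four methods."""
    values = list(oa_by_method.values())
    return sum(values) / len(values), max(values)


summary = {scenario: local_summary(values) for scenario, values in local_oa.items()}
for scenario, (mean_oa, max_oa) in summary.items():
    print(f"{scenario}: Localmean = {mean_oa:.2f}, Localmax = {max_oa:.2f}")
```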
Table 7 F1-score per scenario (%) of different transfer learning methods using VPSNet-B
| Crop | Transfer learning | 2017_SA Ⅱ | 2017_SA Ⅲ | 2018_SA Ⅱ | 2018_SA Ⅲ |
|---|---|---|---|---|---|
| Rice | TL 1 | 4.82 | 32.40 | 11.67 | 49.41 |
| Rice | TL 2 | 26.09 | 58.56 | 53.33 | 40.00 |
| Rice | TL 3 | 73.68 | 77.92 | 46.15 | 84.21 |
| Maize | TL 1 | 11.76 | 45.98 | 15.13 | 66.15 |
| Maize | TL 2 | 9.43 | 41.18 | 63.95 | 52.94 |
| Maize | TL 3 | 75.89 | 82.73 | 93.09 | 96.30 |
| Soybean | TL 1 | 3.33 | 2.75 | 31.30 | 13.45 |
| Soybean | TL 2 | 79.39 | 48.11 | 72.09 | 40.68 |
| Soybean | TL 3 | 86.84 | 85.57 | 86.69 | 87.80 |
1. Using historical IRCSD
Table 8 Accuracy assessment of different transfer learning methods using historical IRCSD
Study
Transfer
Area
Learning
RF
Transformer
CT
VPSNet-B
19.21
TL 1
26.97
19.77
23.45
TL 2
25.71
31.07
44.63
2018_SA
TL 3
75.71
83.03
76.55
Ⅱ
TL 2max
73.73
TL 3max
86.72
TL 1
68.81
62.11
64.22
2018_SA
TL 2
26.61
50.46
59.17
Ⅲ
TL 3
90.83
92.87
94.95
TL 2max
62.39
TL 3max
93.12
87.87
Localmax
80.73
64.86
94.95
76.17
5.4 Crop Mapping
To evaluate the mapping effectiveness of the proposed VPSNet and transfer learning strategy, we
selected two 10 km × 10 km areas in each of study areas Ⅱ and Ⅲ for crop mapping; the crop types
mapped were rice, maize, and soybean. We mapped each region three times, retaining only the
corresponding crop in each mapping (determined by which crop's inter-regional spectral differences
were used). To ensure the accuracy of the mapping, only the 'correct pixels' with high classification
probability (>0.99) were retained to obtain the distribution layers of the three crops, and the
three layers were superimposed to obtain the final crop mapping results. Because each area is mapped
three times, some pixels are unavoidably assigned multiple crop categories. We call such pixels
'uncertain pixels' and used IoU to evaluate their proportion in each mapping result: a smaller IoU
indicates fewer uncertain pixels and less crop mixing. In addition, we removed pixels with very few
March-November images, which greatly interfered with the mapping. Crop mapping was performed using
local samples (Local), direct model transfer (TL 1), and transfer with IRCSD (TL 3); the results are
shown in Fig. 11.
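The layer-and-overlay procedure described above can be sketched as follows. This is an illustration
under stated assumptions, not the authors' implementation: each mapping run is represented as a
dictionary from pixel coordinates to the predicted probability of the crop targeted in that run, the
function names are hypothetical, and the share of uncertain pixels among all mapped pixels is used as
a stand-in for the IoU of the superimposed layers.

```python
THRESHOLD = 0.99  # probability above which a pixel is kept, as stated in the text


def crop_layer(probabilities, threshold=THRESHOLD):
    """Pixels whose probability for the targeted crop exceeds the threshold."""
    return {pixel for pixel, p in probabilities.items() if p > threshold}


def overlay(layers):
    """Superimpose per-crop layers; pixels claimed by several crops are uncertain."""
    claims = {}
    for crop, pixels in layers.items():
        for pixel in pixels:
            claims.setdefault(pixel, []).append(crop)
    crop_map = {px: crops[0] for px, crops in claims.items() if len(crops) == 1}
    uncertain = {px for px, crops in claims.items() if len(crops) > 1}
    return crop_map, uncertain


def uncertain_share(layers, uncertain):
    """Uncertain pixels divided by all mapped pixels (assumed stand-in for IoU)."""
    mapped = set().union(*layers.values())
    return len(uncertain) / len(mapped) if mapped else 0.0


# Toy probabilities from three mapping runs, one per crop (illustrative values).
runs = {
    "rice": {(0, 0): 0.995, (0, 1): 0.50},
    "maize": {(0, 1): 0.999, (1, 1): 0.997},
    "soybean": {(1, 1): 0.992, (2, 2): 0.999, (0, 0): 0.30},
}
layers = {crop: crop_layer(probs) for crop, probs in runs.items()}
crop_map, uncertain = overlay(layers)
share = uncertain_share(layers, uncertain)
print(crop_map, uncertain, share)
```

In this toy case pixel (1, 1) passes the threshold in both the maize and the soybean run, so it is
reported as uncertain rather than forced into one class.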
Fig. 11. Comparison of mapping results of different transfer learning methods. (a-f) are the results
of 2017 mapping and (g-l) are the results of 2018 mapping. (a-c) and (g-i) are study area Ⅲ mapping
results; (d-f) and (j-l) are study area Ⅱ mapping results. Panels: (a) TL 3, IoU = 10.9%; (b) Local;
(c) TL 1; (d) TL 3, IoU = 8.4%; (e) Local; (f) TL 1; (g) TL 3, IoU = 12.17%; (h) Local; (i) TL 1;
(j) TL 3, IoU = 5.9%; (k) Local; (l) TL 1.
Fig. 11 shows that the mapping results obtained by direct model transfer were very different from
the local mapping results, while the mapping results after adding IRCSD were very similar to the
local results, which indicates the effectiveness of the proposed transfer strategy and crop mapping
strategy. Taken together, the two years of mapping results show that the two study areas practiced
a soybean-maize rotation, which is consistent with reality and demonstrates the completeness and
validity of the VPSNet crop classification predictions. The IoU values of the four mapping results
using IRCSD were 10.9%, 8.4%, 5.9% and 12.17%, which is within an acceptable range. In fact,
soybean and maize are intercropped in the same plot in this study area because this practice can
increase the yield of both crops, so uncertain pixels are an appropriate description of this planting
scenario.
We also mapped using the IRCSD of historical years (TL3H), following the same procedure as above;
the results are shown in Fig. 12.
Fig. 12. Comparison of mapping results of different transfer learning methods. (a-c) are study
area Ⅲ mapping results and (d-f) are study area Ⅱ mapping results. Panels: (a) TL3H, IoU = 18.89%;
(b) TL 3, IoU = 12.17%; (c) Local; (d) TL3H, IoU = 5.98%; (e) TL 3, IoU = 5.9%; (f) Local.
Fig. 12 shows that the mapping results using the IRCSD of historical years were not as good as
those using the IRCSD of the current year, and the IoU increased (from 12.17% to 18.89% in study
area Ⅲ and from 5.9% to 5.98% in study area Ⅱ), but the mapping results were still acceptable in some
areas. Using historical inter-regional crop differences is a solution for crop mapping without
current-year field samples, and the mapping results provided in this study are an acceptable option in
this scenario.
6. Discussion
6.1 Analysis of inter-regional crop time series difference
Fig. 4. Crop temporal profile for the blue band and nir-red band of rice, maize and soybean in the three
study areas in 2017. The blue, green and red lines indicate the means of the time series for the different
regions. The buffers indicate one standard deviation from the mean.
Fig. 4 shows that the temporal profiles of the two bands of the three crops in study areas Ⅱ and Ⅲ
are similar, while the temporal profiles of the two bands of rice and maize in study area Ⅰ differ
very significantly from those in study areas Ⅱ and Ⅲ, especially during the growing season; the
differences for soybean are relatively small and are concentrated before crop maturity. For regions
that are geographically close and have similar climates, differences in the temporal profiles of the
same crop are less likely, whereas for regions that are geographically distant and climatically
different, the temporal profiles differ significantly. This is consistent with previous studies. Such
inter-regional differences in crop spectra pose a barrier to the direct transfer of classification
models, and eliminating or simulating such spectral differences is key for models to achieve
cross-regional transfer.
Using a small number of samples from the source and target domains, we applied the difference
calculation equation to simulate inter-regional crop spectral differences and fused the spectral
differences with the target domain spectra to form new target domain spectra close to the source
domain spectra. We calculated the crop spectral differences between study area Ⅰ and study areas Ⅱ
and Ⅲ using equation (2), and the resulting new target domain crop spectra are plotted against the
source domain crop spectra in Fig. 5.
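The idea of fusing a spectral difference with the target domain spectra can be illustrated with a
small sketch. Equation (2) is defined earlier in the paper and is not reproduced here; the relative
difference used below is only an assumed stand-in, chosen because it combines a difference between
the mean spectra of the two regions with a multiplication by the target spectrum. The function names
and reflectance values are hypothetical.

```python
def mean_profile(samples):
    """Mean time series of one band over a list of equally long sample series."""
    n = len(samples)
    return [sum(sample[t] for sample in samples) / n for t in range(len(samples[0]))]


def relative_difference(source_samples, target_samples):
    """Assumed stand-in for the spectral difference: (source - target) / target."""
    src = mean_profile(source_samples)
    tgt = mean_profile(target_samples)
    # Target mean reflectance is assumed to be strictly positive at every date.
    return [(s - t) / t for s, t in zip(src, tgt)]


def add_difference(profile, difference):
    """Fuse the difference with a target profile multiplicatively."""
    return [x * (1.0 + d) for x, d in zip(profile, difference)]


# A few labeled samples of one crop and one band (illustrative reflectance values).
source_samples = [[0.10, 0.30, 0.20], [0.12, 0.34, 0.24]]
target_samples = [[0.05, 0.20, 0.10], [0.05, 0.12, 0.12]]

difference = relative_difference(source_samples, target_samples)
source_mean = mean_profile(source_samples)
adapted_mean = add_difference(mean_profile(target_samples), difference)
adapted_pixel = add_difference([0.06, 0.18, 0.10], difference)  # an unlabeled target pixel
print(difference, adapted_mean, adapted_pixel)
```

Under this assumed form, the adapted mean of the target samples coincides with the source mean by
construction, while individual target pixels keep their own deviation from the mean.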
Fig. 5. 2017 crop temporal profile of rice, maize and soybean in blue and nir-red bands from three study
areas with IRCSD added.
Fig. 5 shows that, after adding IRCSD, the temporal profiles of the target domain were very similar
to those of the source domain, which is consistent with our assumptions. The mean temporal profiles
of rice and soybean in the target domain were very close to those of the source domain after adding
IRCSD, but the buffers of the two bands overlapped less between the source and target domains, and
the buffers were wider in the target domain than in the source domain, especially at the end of
growth. In contrast, the buffers of the maize temporal profiles were relatively wider in the source
domain, but the mean temporal profiles differed more than for the other two crops, mainly in the
mid-growth stage. This is related to the number of crop samples in the source domain: rice and
soybean samples were relatively few and geographically concentrated, which made the sample
diversity relatively weak, while maize samples were extremely abundant and widely distributed
geographically, which created sample diversity. The richer the sample diversity of the source domain,
the more significant the spectral differences obtained between regions, which is the key to the
successful implementation of the proposed transfer learning strategy.
Since only the model trained in the source domain is used in this study, IRCSD must also be added
to the unlabeled remote sensing data of the target domain (used for mapping) during crop mapping.
Fig. 10 shows that the IRCSD differs between crops, which is convenient for crop mapping because the
IRCSD of the crop to be mapped can be selected. We added crop spectral differences to the unlabeled
remote sensing data in the target domain i times, where i is related to the number of crop classes
in the target domain.
Fig. 10. Comparison of IRCSD for different crops.
Transferring a model trained in a region with sufficient samples to a region with sparse samples can
effectively solve the problem that crop classification cannot be performed in a region with sparse
samples. The similarity of the phenology of the same crop in the two regions is a prerequisite for
model transfer (Hao et al 2020). However, differences in geographical environment between regions
lead to differences in the phenology of the same crop, and the effect of transfer decreases as the
difference increases (Wang et al 2019). Crop phenology is described by the spectral reflectance of
remotely sensed images (Zhang et al 2014). Therefore, this study proposed the use of IRCSD to adapt
to inter-regional geographic environmental differences in transfer learning. IRCSD can reduce the
differences in phenology of the same crop in different regions, thus improving the transfer effect of
the model, and can also support the transfer of historical classification models across years. The key
to the success of the transfer strategy is that equation (2) considers not only the difference
relationship between the spectral reflectance of the regions, but also the multiplicative relationship
between the spectral reflectance of the target domain and the IRCSD, which makes the IRCSD
significantly correlated with the spectral reflectance of the target domain.
In crop classification, the classification method is one of the factors that affect the classification
accuracy. Each classification method has unique advantages for crop classification using the spectral
reflectance of remotely sensed images (Wang et al 2022, Zhong et al 2019b). Designing an effective
crop classification model therefore has a significant impact on the classification results. PoSyNet
achieved optimal classification accuracy by extracting multiform semantic features using a multilevel
network mechanism. This is due to both the design of the network architecture and the input of the
temporal and spectral dimensions, resulting in accurate classification of different crops. The
combination of a generic transfer strategy (IRCSD) and an effective classification model (PoSyNet) can
enable the transfer of models between regions with large differences in geographic environments.
7. Conclusions
In this study, a transfer learning strategy for the transfer of classification models across regions
was proposed, which addresses the problem that classification models cannot obtain satisfactory
classification accuracy when transferred across long distances. The strategy is based on the fact
that the phenology of the same crop can vary between regions due to factors such as geography, and
such differences can be characterized using crop spectra. The results showed that the temporal
profiles of the same crop in geographically similar regions were very similar, while those in
geographically different regions differed. The designed method can calculate the IRCSD using a small
number of samples, and adding the IRCSD improved transfer learning accuracy by more than 25% in all
transfer experiments, which demonstrates the generality of the proposed method. A multi-layer DL
structure (PoSyNet) was constructed that extracts different forms of features with semantic
information by setting up convolutional layers of different dimensions. Compared with RF,
Transformer, and CT, PoSyNet had advantages in classification results. Adding a position encoding
structure to the network improved the richness of model inputs, and the methods with position
encoding had significant advantages over RF, which has no position encoding.
The transfer learning strategy proposed in this study can effectively solve the problem of model
transfer, but still requires a small number of samples as support. In future work, the possibility of
using factors such as geography to generate IRCSD will be further explored, which would further
reduce the need for samples.
Acknowledgements
This work was supported in part by the National Natural Youth Science Foundation of China under Grant
42001352; in part by the National Key Research and Development Program of China under Grant
2021YFE0205100.
References
Aneece I, Thenkabail PS. 2021. Classifying Crop Types Using Two Generations of
Hyperspectral Sensors (Hyperion and DESIS) with Machine Learning on the Cloud.
Remote Sens-Basel 13: 4704
Bargiel D. 2017. A new method for crop classification combining time series of radar images
and crop phenology information. Remote Sens Environ 198: 369-83
Belgiu M, Bijker W, Csillik O, Stein A. 2021a. Phenology-based sample generation for
supervised crop type classification. Int J Appl Earth Obs 95
Belgiu M, Bijker W, Csillik O, Stein A. 2021b. Phenology-based sample generation for
supervised crop type classification. Int J Appl Earth Obs 95: 102264
Boryan C, Yang Z, Mueller R, Craig M. 2011. Monitoring US agriculture: the US department of
agriculture, national agricultural statistics service, cropland data layer program.
Geocarto International 26: 341-58
Chen J, Chen J, Liao A, Cao X, Chen L, et al. 2015. Global land cover mapping at 30 m
resolution: A POK-based operational approach. Isprs J Photogramm 103: 7-27
Dong J, Xiao X, Kou W, Qin Y, Zhang G, et al. 2015. Tracking the dynamics of paddy rice
planting area in 1986–2010 through time series Landsat images and phenology-based
algorithms. Remote Sens Environ 160: 99-113
Franch B, Vermote E, Becker-Reshef I, Claverie M, Huang J, et al. 2015. Improving the
timeliness of winter wheat production forecast in the United States of America, Ukraine
and China using MODIS data and NCAR Growing Degree Day information. Remote
Sens Environ 161: 131-48
Gallego J, Craig M, Michaelsen J, Bossyns B, Fritz S. 2008. Best practices for crop area
estimation with remote sensing. Ispra: Joint Research Center
Ghazaryan G, Dubovyk O, Löw F, Lavreniuk M, Kolotii A, et al. 2018. A rule-based approach
for crop identification using multi-temporal and multi-sensor phenological metrics.
European Journal of Remote Sensing 51: 511-24
Gong P, Wang J, Yu L, Zhao Y, Zhao Y, et al. 2013. Finer resolution observation and monitoring
of global land cover: First mapping results with Landsat TM and ETM+ data. Int J
Remote Sens 34: 2607-54
Güldenring R, Nalpantidis L. 2021. Self-supervised contrastive learning on agricultural images.
Comput Electron Agr 191: 106510
Hao PY, Di LP, Zhang C, Guo LY. 2020. Transfer Learning for Crop classification with Cropland
Data Layer data (CDL) as training samples. Sci Total Environ 733
Hao PY, Wang L, Zhan YL, Wang CY, Niu Z, Wu MQ. 2016. Crop classification using crop
knowledge of the previous-year: Case study in Southwest Kansas, USA. European
Journal of Remote Sensing 49: 1061-77
Huang H, Wang J, Liu C, Liang L, Li C, Gong P. 2020. The migration of training samples
towards dynamic global land cover mapping. Isprs J Photogramm 161: 27-36
Johnson DM, Mueller R. 2021. Pre-and within-season crop type classification trained with
archival land cover information. Remote Sens Environ 264: 112576
Konduri VS, Kumar J, Hargrove WW, Hoffman FM, Ganguly AR. 2020. Mapping crops within
the growing season across the United States. Remote Sens Environ 251: 112048
Lei L, Wang XY, Zhong YF, Zhao HW, Hu X, Luo C. 2021. DOCC: Deep one-class crop
classification via positive and unlabeled learning for multi-modal satellite imagery. Int J
Appl Earth Obs 105
Li ZT, Chen GK, Zhang TX. 2020. A CNN-Transformer Hybrid Approach for Crop Classification
Using Multitemporal Multisensor Images. Ieee J-Stars 13: 847-58
Lin C, Zhong L, Song X-P, Dong J, Lobell DB, Jin Z. 2022. Early-and in-season crop type
mapping without current-year ground truth: Generating labels from historical
information via a topology-based approach. Remote Sens Environ 274: 112994
Liu Z, Mao H, Wu C-Y, Feichtenhofer C, Darrell T, Xie S. 2022. A ConvNet for the 2020s.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition: 11976-86
Malambo L, Heatwole CD. 2020. Automated training sample definition for seasonal burned area
mapping. Isprs J Photogramm 160: 107-23
Minh DHT, Ienco D, Gaetano R, Lalande N, Ndikumana E, et al. 2018. Deep recurrent neural
networks for winter vegetation quality mapping via multitemporal SAR Sentinel-1. Ieee
Geosci Remote S 15: 464-68
Nowakowski A, Mrziglod J, Spiller D, Bonifacio R, Ferrari I, et al. 2021a. Crop type mapping by
using transfer learning. Int J Appl Earth Obs 98: 102313
Nowakowski A, Mrziglod J, Spiller D, Bonifacio R, Ferrari I, et al. 2021b. Crop type mapping by
using transfer learning. Int J Appl Earth Obs 98
Rawat A, Kumar A, Upadhyay P, Kumar S. 2021. Deep learning-based models for temporal
satellite data processing: Classification of paddy transplanted fields. Ecol Inform 61
Sharma A, Liu X, Yang X. 2018. Land cover classification from multi-temporal, multi-spectral
remotely sensed imagery using patch-based recurrent neural networks. Neural
Networks 105: 346-55
Skakun S, Franch B, Vermote E, Roger J-C, Becker-Reshef I, et al. 2017. Early season large-area winter crop mapping using MODIS NDVI data, growing degree days information
and a Gaussian mixture model. Remote Sens Environ 195: 244-58
Skakun S, Kussul N, Shelestov AY, Lavreniuk M, Kussul O. 2016. Efficiency Assessment of
Multitemporal C-Band Radarsat-2 Intensity and Landsat-8 Surface Reflectance
Satellite Imagery for Crop Classification in Ukraine. Ieee J-Stars 9: 3712-19
Song X-P, Potapov PV, Krylov A, King L, Di Bella CM, et al. 2017. National-scale soybean
mapping and area estimation in the United States using medium resolution satellite
imagery and field survey. Remote Sens Environ 190: 383-95
Wang H, Chang W, Yao Y, Liu D, Zhao Y, et al. 2022. CC-SSL: A Self-Supervised Learning
Framework for Crop Classification With Few Labeled Samples. Ieee J-Stars 15: 870418
Wang H, Chang W, Yao Y, Yao Z, Zhao Y, et al. 2023. Cropformer: A new generalized deep
learning classification approach for multi-scenario crop classification. Frontiers in Plant
Science 14
Wang S, Azzari G, Lobell DB. 2019. Crop type mapping without field-level labels: Random
forest transfer and unsupervised clustering techniques. Remote Sens Environ 222:
303-17
Wang Y, Zhang Z, Feng L, Ma Y, Du Q. 2021. A new attention-based CNN approach for crop
mapping using time series Sentinel-2 images. Comput Electron Agr 184: 106090
Xu JF, Zhu Y, Zhong RH, Lin ZX, Xu JL, et al. 2020. DeepCropMapping: A multi-temporal deep
learning approach with improved spatial generalizability for dynamic corn and soybean
mapping. Remote Sens Environ 247
Yaramasu R, Bandaru V, Pnvr K. 2020. Pre-season crop type mapping using deep neural
networks. Comput Electron Agr 176: 105664
Yu L, Wang J, Clinton N, Xin Q, Zhong L, et al. 2013. FROM-GC: 30 m global cropland extent
derived through multisource data integration. International Journal of Digital Earth 6:
521-33
Yuan Y, Lin L. 2021. Self-Supervised Pretraining of Transformers for Satellite Image Time
Series Classification. Ieee J-Stars 14: 474-87
Yuan Y, Lin L, Liu QS, Hang RL, Zhou ZG. 2022. SITS-Former: A pre-trained spatio-spectral-temporal representation model for Sentinel-2 time series classification. Int J Appl Earth
Obs 106
Yuan Y, Lin L, Zhou Z-G, Jiang H, Liu Q. 2023. Bridging optical and SAR satellite image time
series via contrastive feature extraction for crop classification. Isprs J Photogramm 195:
222-32
Zhang G, Xiao X, Dong J, Kou W, Jin C, et al. 2015. Mapping paddy rice planting areas through
time series analysis of MODIS land surface temperature and vegetation index data.
Isprs J Photogramm 106: 157-71
Zhang J, Feng L, Yao F. 2014. Improved maize cultivated area estimation over a large scale
combining MODIS–EVI time series data and crop phenological information. Isprs J
Photogramm 94: 102-13
Zhong L, Gong P, Biging GS. 2014. Efficient corn and soybean mapping with temporal
extendability: A multi-year experiment using Landsat imagery. Remote Sens Environ
140: 1-13
Zhong L, Hu L, Zhou H. 2019a. Deep learning based multi-temporal crop classification. Remote
Sens Environ 221: 430-43
Zhong LH, Hu LN, Zhou H. 2019b. Deep learning based multi-temporal crop classification.
Remote Sens Environ 221: 430-43