Professor Pun Man-On’s Team Publishes Two Articles Based on Large-Model and Diffusion Frameworks in the International Journal IEEE TGRS
Professor Pun Man-On’s team from the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, published two articles in the field of remote sensing in the international journal IEEE Transactions on Geoscience and Remote Sensing (TGRS) within one month. TGRS, the flagship journal of the IEEE Geoscience and Remote Sensing Society (GRSS), is one of the top journals in geoscience and remote sensing: it ranks among the top five geoscience journals worldwide by international influence and is classified as a Tier-1 journal in the Chinese Academy of Sciences journal ranking. Its 2023 impact factor was 7.5, reflecting its strong influence in remote sensing technology and geoscience.
01
DF4LCZ: A SAM-Empowered Data Fusion Framework for Scene-Level Local Climate Zone Classification
Paper link: https://ieeexplore.ieee.org/document/10556641
Paper introduction
Local Climate Zone (LCZ) is a standardized framework for classifying urban land types, commonly used to analyze the thermal characteristics of different urban areas, and is an important tool for studying the urban heat island effect. Recent advances in remote sensing technologies have highlighted their capability for accurate classification of local climate zones. However, traditional methods using convolutional neural networks (CNNs) often fall short of effectively incorporating prior knowledge of ground objects. In addition, data sources such as Sentinel-2 struggle with capturing detailed information on ground objects. To address these issues, we introduce a novel data fusion approach that combines high-resolution Google imagery, which provides ground object priors, with Sentinel-2 multispectral imagery. Our method, the Dual-stream Fusion framework for LCZ classification (DF4LCZ), merges instance-based location features from Google imagery and spatial-spectral features from Sentinel-2. This framework is enhanced by a graph convolutional network (GCN) module, powered by the segment anything model (SAM), to improve feature extraction from Google imagery. Concurrently, a 3D-CNN architecture is utilized to process the spectral-spatial features of Sentinel-2 imagery. The effectiveness of DF4LCZ is demonstrated through experiments conducted on a specialized multisource remote sensing image dataset for LCZ classification.
Figure 1 Structure of the LCZ classification model proposed in this work
Major contributions of this paper
1. Given the limitations of Sentinel-2 satellite imagery, this study is the first to propose using high-resolution Google Earth imagery to enhance LCZ classification performance;
2. A dual-stream fusion framework, DF4LCZ, is proposed that combines instance-based location features with scene-level spatial-spectral features to reap synergistic benefits from their complementarity;
3. Within this framework, DF4LCZ uses the Segment Anything Model (SAM), a large segmentation model, to extract ground-object instances from Google Earth RGB images, and a graph convolutional network (GCN) to extract scene-discriminative features from these instances for classification. In addition, DF4LCZ introduces a 3D-ResNet11 module to extract spatial-spectral features from Sentinel-2 imagery;
4. A multi-source remote sensing image dataset, LCZC-GES2, was constructed from Sentinel-2 multispectral and Google Earth RGB images, and extensive experiments were conducted on this dataset to verify the performance of DF4LCZ.
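The dual-stream design above can be summarized in a minimal pure-Python sketch. This is not the paper's implementation: the two stream functions are stand-ins for the SAM-plus-GCN branch and the 3D-ResNet11 branch, the feature dimension is illustrative, and the random features merely mark where learned embeddings would flow. Only the overall structure (two streams, concatenation-style fusion, softmax over the 17 standard LCZ classes) reflects the framework described in the paper.

```python
import math
import random

random.seed(0)
NUM_LCZ = 17   # standard LCZ scheme: 10 built types + 7 land-cover types
FEAT_DIM = 64  # per-stream embedding size (illustrative assumption)

def google_stream(rgb_patch):
    # Stand-in for the SAM + GCN stream: SAM segments ground-object
    # instances from the Google Earth RGB patch, and a GCN aggregates
    # the instance graph into a single scene embedding.
    return [random.gauss(0, 1) for _ in range(FEAT_DIM)]

def sentinel_stream(ms_patch):
    # Stand-in for the 3D-ResNet11 stream extracting spatial-spectral
    # features from the Sentinel-2 multispectral patch.
    return [random.gauss(0, 1) for _ in range(FEAT_DIM)]

def classify_lcz(rgb_patch, ms_patch, weights, bias):
    # Late fusion by concatenation, then a linear softmax head.
    fused = google_stream(rgb_patch) + sentinel_stream(ms_patch)
    logits = [sum(w * f for w, f in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

weights = [[random.gauss(0, 0.01) for _ in range(2 * FEAT_DIM)]
           for _ in range(NUM_LCZ)]
bias = [0.0] * NUM_LCZ
probs = classify_lcz(None, None, weights, bias)
```

The point of the sketch is the complementarity: the Google stream contributes object-level priors that Sentinel-2's 10 m resolution cannot resolve, while the Sentinel-2 stream contributes spectral bands that RGB imagery lacks.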
02
Diffusion Enhancement for Cloud Removal in Ultra-Resolution Remote Sensing Imagery
Paper link: https://ieeexplore.ieee.org/document/10552304
Paper introduction
The presence of cloud layers severely compromises the quality and effectiveness of optical remote sensing (RS) images. However, existing deep-learning (DL)-based cloud removal (CR) techniques tend to generate overly smooth results, often failing to reconstruct visually pleasing images and causing semantic loss. To tackle this challenge, this work proposes enhancements on both the data and methodology fronts. On the data side, an ultra-resolution benchmark named CUHK cloud removal (CUHK-CR), with 0.5 m spatial resolution, is established. This benchmark incorporates rich detailed textures and diverse cloud coverage, serving as a robust foundation for designing and assessing CR models. On the methodology side, a novel diffusion-based framework for CR named diffusion enhancement (DE) is introduced. This framework gradually recovers texture details, leveraging a reference visual prior that provides the foundational structure of the images to enhance inference accuracy. Additionally, a weight allocation (WA) network is developed to dynamically adjust the weights for feature fusion, further improving performance, particularly for ultra-resolution image generation. Furthermore, a coarse-to-fine training strategy is applied to expedite training convergence while reducing the computational complexity required to handle ultra-resolution images. The following figure shows a cloud removal example based on a visible-light satellite image of the CUHK-Shenzhen campus. On the far left is the cloud-free ground truth, and in the middle is the satellite image obscured by clouds. By effectively processing the cloud-obscured input, the cloud removal model proposed in this project closely restores the ground truth (leftmost) in its cloud-removed output (rightmost).
Figure 2 Cloud removal example based on satellite imagery of the CUHK-Shenzhen campus. From left to right: the ground truth, the cloud-obscured visible-light image (input), and the image after cloud removal (output)
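The core fusion idea in DE can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the paper's code: the real WA network *learns* the fusion weights adaptively from feature maps and operates per pixel, whereas here a fixed sigmoid schedule over the diffusion timestep stands in for it, and images are reduced to scalars for brevity.

```python
import math

T = 1000  # diffusion timesteps (assumption); t = T is pure noise, t = 0 is clean

def wa_weight(t, total_steps, steepness=10.0):
    # Illustrative stand-in for the learned weight-allocation (WA) network:
    # a sigmoid schedule that trusts the reference model's coarse output in
    # early (noisy) steps and shifts toward the diffusion model's detailed
    # prediction as t approaches 0.
    return 1.0 / (1.0 + math.exp(-steepness * (t / total_steps - 0.5)))

def fuse(x_diffusion, x_reference, t, total_steps):
    # Convex per-step blend of the diffusion model's intermediate denoised
    # image with the reference model's cloud-removal result.
    w = wa_weight(t, total_steps)
    return w * x_reference + (1.0 - w) * x_diffusion
```

This matches the division of labor described below: early in sampling the reference prior supplies the coarse image structure, and late in sampling the diffusion model contributes fine texture.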
Major contributions of this paper
1. A diffusion enhancement (DE) network was proposed to restore surface scenery under cloud cover. The DE network combines global visual information with progressive diffusion-based recovery, enhancing its ability to capture the data distribution; it is therefore adept at exploiting reference-model priors to predict fine details during inference;
2. A weight allocation (WA) network was designed to compute adaptive weighting coefficients for fusing the intermediate denoised images of the diffusion model with the cloud removal results generated by the reference model. As a result, in the initial steps the reference model drives coarse-grained content reconstruction, while the diffusion model focuses on generating rich detail in later stages. In addition, a coarse-to-fine training strategy is adopted to stabilize and accelerate the convergence of DE;
3. Finally, a high-resolution cloud removal dataset called CUHK-CR was established to evaluate cloud removal methods under different types of cloud coverage. The dataset includes 668 thin-cloud images and 559 thick-cloud images with multispectral information. To our knowledge, at 0.5 m this dataset has the highest spatial resolution among all existing cloud removal datasets.
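The coarse-to-fine training strategy mentioned above can be illustrated with a minimal curriculum sketch. The crop sizes and stage split below are assumptions for illustration, not the paper's actual values; the point is only the general mechanism of training on cheap low-resolution crops first and moving to larger crops as training progresses.

```python
def coarse_to_fine_sizes(total_epochs, sizes=(64, 128, 256, 512)):
    # Assign each epoch a training crop size, coarse first.
    # Early low-resolution stages are cheap and stabilize convergence;
    # later high-resolution stages adapt the model to ultra-resolution
    # inputs. Sizes and the even split are illustrative assumptions.
    per_stage = total_epochs // len(sizes)
    schedule = []
    for i, size in enumerate(sizes):
        if i == len(sizes) - 1:
            # the final (finest) stage absorbs any remainder epochs
            n = total_epochs - per_stage * (len(sizes) - 1)
        else:
            n = per_stage
        schedule.extend([size] * n)
    return schedule
```

A 10-epoch run, for example, would spend its first epochs on 64-pixel crops and its last on 512-pixel crops, so the most expensive resolution is only trained once the model is already near convergence.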
Major Authors
Corresponding author: Professor Pun Man-On from The Chinese University of Hong Kong, Shenzhen
Pun Man-On is an associate professor at the School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen. He received his B.Eng. from The Chinese University of Hong Kong (CUHK) in 1996, his M.Eng. from the University of Tsukuba, Japan, in 1999, and his Ph.D. from the University of Southern California (USC), Los Angeles, U.S.A., in 2006. He was a postdoctoral research associate at Princeton University (USA) from 2006 to 2008.
Prior to joining CUHK-Shenzhen in 2015, he held research positions at Huawei (USA), Mitsubishi (Boston) and Sony (Tokyo). Prof. Pun’s research interests include AI Internet of Things (AIoT) and applications of machine learning in communications and satellite remote sensing.
Prof. Pun has received three prestigious best paper awards, at IEEE VTC’06 Fall, IEEE ICC’08, and IEEE Infocom’09. He served as an associate editor in the channel modeling track of the IEEE Transactions on Wireless Communications from 2010 to 2014. He is the founding chair of the IEEE Joint Signal Processing Society–Communications Society Chapter, Shenzhen.
First author of the first paper is Wu Qianqian, a doctoral student at The Chinese University of Hong Kong, Shenzhen
Wu Qianqian obtained a bachelor’s degree in Natural Geography and Resource Environment and a master’s degree in Cartography and Geographic Information Engineering from China University of Geosciences in 2019 and 2022, respectively. She is currently pursuing a doctoral degree at The Chinese University of Hong Kong, Shenzhen. Her main research areas include data fusion, deep learning, and remote sensing image processing.
First author of the second paper is Sui Jialu, a doctoral student at The Chinese University of Hong Kong, Shenzhen
Sui Jialu obtained a bachelor’s degree in Computer Science and Technology from Shandong University in 2021. She is currently pursuing a doctoral degree at The Chinese University of Hong Kong, Shenzhen. Her main research areas are remote sensing, machine learning, super-resolution, and image enhancement. She was previously a short-term visiting student at Peking University.