Articles

Differences of molecular events driving pathological and radiological progression of lung adenocarcinoma

Jun Shang,a,f He Jiang,a,f Yue Zhao,b,c,f Jinglei Lai,b,c,f Leming Shi,a,c Jingcheng Yang,a,d,∗ Haiquan Chen,b,c,e,∗∗ and Yuanting Zheng,a,∗∗∗

a State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
b Department of Thoracic Surgery, Shanghai Cancer Center, Fudan University, Shanghai, China
c Institute of Thoracic Oncology, Fudan University, Shanghai, China
d Greater Bay Area Institute of Precision Medicine, 115 Jiaoxi Road, Guangzhou, China
e Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China

eBioMedicine 2023;94: 104728
Published Online xxx
https://doi.org/10.1016/j.ebiom.2023.104728

Summary

Background
Ground-glass opacity (GGO)-like lung adenocarcinoma (LUAD) is increasingly detected in the clinic, and its inert behavior and superior survival indicate unique biological characteristics. However, these characteristics remain poorly understood, which hampers identification of the key reasons for the inert property of GGO-like LUAD.

Methods
Using whole-exome sequencing and RNA sequencing, and taking into account both the radiological and the pathological classification of the same 197 patients, we systematically interrogated the genes driving the progression from GGO to solid nodule and the potential reasons for the inertia of GGO. Using flow cytometry and IHC, we validated the abundance of immune cells and the activity of cell proliferation.

Findings
Comparing GGO with solid nodule, we found that adenocarcinoma in situ/minimally invasive adenocarcinoma (AIS/MIA) and GGO-like LUAD exhibited a lower TP53 mutation frequency and less active cell proliferation-related pathways than solid nodule in LUAD.
Comparing GGO nodules between AIS/MIA and LUAD, we found that EGFR mutation frequency and CNV load were significantly higher in LUAD than in AIS/MIA. Regulatory T cell abundance was also higher in LUAD, whereas CD8+ T cell abundance decreased from AIS/MIA to LUAD. Finally, we constructed a transcriptomic signature to quantify the development from GGO to solid nodule, which was an independent predictor of patients’ prognosis in 11 external LUAD datasets.

Interpretation
Our results provide deeper insights into the indolent nature of GGO and provide a molecular basis for the treatment of GGO-like LUAD.

Funding
This study was supported in part by the National Natural Science Foundation of China (32170657), the National Natural Science Foundation of China (82203037), and Shanghai Sailing Program (22YF1408900).

Copyright © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Keywords: Lung adenocarcinoma; Ground-glass opacity; Pathological progression; Radiological progression; Molecular events

*Corresponding author. State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China.
**Corresponding author. Department of Thoracic Surgery, Shanghai Cancer Center, Fudan University, Shanghai, China.
***Corresponding author. State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China.
E-mail addresses: yjcyxky@163.com (J. Yang), hqchen1@yahoo.com (H. Chen), zhengyuanting@fudan.edu.cn (Y. Zheng).
f These authors contributed equally.

www.thelancet.com Vol 94 August, 2023

Research in context

Evidence before this study
The underlying molecular mechanisms that lead to GGO-like LUAD have attracted the attention of researchers in recent years. Compared with solid nodule, GGO-like nodule showed fewer genomic events, such as gene mutation load, driver gene mutation frequency, and loss of heterozygosity of HLA. In addition, higher immune cell activity, lower immune suppression, and lower fibroblast activation were found in GGO than in solid nodule. However, these studies may suffer from two major problems. First, there were uncontrolled variables when studying the molecular characteristics of different types of GGO. These studies confounded the effects of pathological staging (AIS/MIA/LUAD) and radiological staging (pGGO/mGGO/solid nodule), making it impossible to determine which molecular alterations may lead to the progression of pathological stage and which may lead to the transition from GGO to solid nodule. Therefore, the reasons for GGO inertia cannot be identified precisely after excluding the influence of pathological stage, nor can the molecular features that drive pathological progression be found after excluding the influence of radiological classification. Secondly, these studies were not comprehensive. Therefore, it is impossible to follow LUAD progression more completely from the earlier stage of the disease or to identify the characteristics of the same GGO type in different pathological stages. Only when we know the relationship between molecular alterations and pathological/radiological staging can we perform more precise intervention in patients with different pathological stages or different GGO types. To the best of our knowledge, no study has comprehensively and separately evaluated the effect of GGO components on molecular and immune characteristics in AIS&MIA and LUAD.

Added value of this study
We comprehensively compared the molecular characteristics of pGGO, mGGO, and solid nodule in AIS&MIA and LUAD. Firstly, we found that, compared with solid nodule in LUAD, GGO-like LUAD showed lower malignancy, with molecular characteristics similar to those of pre/minimally invasive lesions, including fewer genomic events and lower cell proliferation and matrix remodeling activation. Additionally, we noticed that there was almost no significant genomic difference between pGGO and mGGO in AIS&MIA, but molecular characteristics including EGFR mutation frequency and CNV load were significantly higher in LUAD than in AIS&MIA. Immunoreactive-related cells were also more associated with pathological stage than with GGO components. Finally, we constructed a gene signature associated with radiological, pathological and histological progression, which can serve as a good predictor of patient prognosis. To accurately identify whether a specific molecular event is associated with GGO components or with pathological stage, one needs to control the other variable, i.e. radiology or pathology. This is what our study design uniquely offered.

Implications of all the available evidence
These results will provide a molecular basis for the treatment of GGO-like LUAD. Meanwhile, the relative balance of malignant cancer cells and immune cells may lead to the inert property of GGO.

Introduction

With increasing concern about personal health and the popularity of chest low-dose computed tomography (LDCT), more and more nodules with ground-glass opacity (GGO) have been detected in the early screening of lung cancer. GGO is a radiological characteristic and appears on radiology as a dense, ground-glass shadow with well-defined vascular and bronchial texture within the lesion.1 According to the consolidation-to-tumor ratio (CTR) on radiology, tumors can be divided into pure GGO (pGGO, CTR = 0), mixed GGO (mGGO, 0 < CTR < 1) and solid nodule (CTR = 1).2

From a pathological point of view, lung adenocarcinoma (LUAD) can be divided into adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA) and invasive LUAD. Both AIS and MIA present as limited lesions ≤30 mm, with tumor cells growing along the alveolar structures. AIS does not break through the basement membrane, whereas MIA breaks through the basement membrane with a depth of invasion ≤5 mm. Compared to the above stages, LUAD is a more malignant stage, which can be further divided into stage I, II, III and IV according to the degree and site of invasion and metastasis.3 pGGO is predominantly observed in the pre/minimally invasive stage of LUAD, including AIS and MIA. It consistently signifies a highly favorable prognosis, with a 5-year relapse-free survival (RFS) rate of 100%,4 even when it is observed in invasive stage I non-small cell lung cancer (NSCLC). A lung cancer screening in Shanghai indicated that lesions on CT with a GGO component appeared in 84.9% of cases diagnosed with lung cancer.5 Moreover, a previous lung cancer screening study in Chinese hospital employees showed that 95.5% of the lung cancer patients detected by screening presented as GGO on radiological evaluation.6 Many studies have demonstrated that GGO-like lung cancer is an "inert" tumor with a slow growth rate and good prognosis.4,7

The underlying molecular mechanisms that lead to lung cancer with or without the GGO component remain to be explored, and have attracted the attention of researchers in recent years. Compared with solid nodule, GGO-like nodule showed fewer genomic events, such as gene mutation load, driver gene mutation frequency and loss of heterozygosity of HLA.8,9 In addition, the tumor microenvironment (TME) of GGO-like LUAD has been of interest to many investigators.
Higher immune cell activity, lower immune suppression, and lower fibroblast activation were found in GGO than in solid nodule.8,10 Meanwhile, malignant cells in GGO underwent thorough metabolic reprogramming and immune stress.11 Therefore, the relative balance of malignant cancer cells and immune cells may lead to the inert property of GGO.

However, these studies may suffer from two major caveats. First, there were uncontrolled variables when studying the molecular characteristics of different types of GGO. By “uncontrolled variables” we mean that most previous genomic and transcriptomic studies did not take the pathological stage into consideration. Some of these studies even compared pGGO of AIS&MIA with solid nodule of LUAD, so it cannot be determined whether the molecular differences found by such a comparison originated from pGGO vs solid or from AIS&MIA vs LUAD. To be more specific, these studies confounded the effects of pathological staging (AIS/MIA/LUAD) and radiological staging (pGGO/mGGO/solid nodule), making it impossible to determine which molecular alterations may lead to the progression of pathological stage and which may lead to the transition from GGO to solid nodule. Therefore, the reasons for GGO inertia cannot be identified precisely after excluding the influence of pathological stage, nor can the molecular features that drive pathological progression be found after excluding the influence of radiological classification. Secondly, these studies were not comprehensive. Therefore, it is impossible to follow the progression from AIS/MIA to LUAD more completely from the earlier stage of the disease or to identify the characteristics of the same GGO type in different pathological stages.
Only when we know the relationship between molecular alterations and pathological/radiological staging can we perform more precise intervention in patients with different pathological stages or different GGO types. To the best of our knowledge, no study has comprehensively and separately evaluated the effect of GGO components on molecular and immune characteristics in AIS&MIA and LUAD.

Based on our previous studies, which made a comprehensive genomic and transcriptomic comparison between AIS&MIA and LUAD,12,13 we further compared the molecular characteristics of different GGO components in AIS&MIA and LUAD. First and foremost, we found that, compared with solid nodule in LUAD, GGO-like LUAD showed low malignancy, with molecular characteristics similar to those of pre/minimally invasive lesions, including fewer genomic events and lower cell proliferation and matrix remodeling activation. Additionally, we noticed that there was almost no significant genomic difference between pGGO and mGGO in AIS&MIA, but molecular characteristics including EGFR mutation frequency and CNV load were significantly higher in LUAD than in AIS&MIA. Immunoreactive-related cells, e.g. Treg and CD8+ T cells, were also more associated with pathological stage than with GGO components. Finally, we constructed a gene signature associated with radiological, pathological and histological progression, which can serve as a good predictor of patient prognosis. These results will provide a molecular basis for the treatment of GGO-like LUAD.

Methods

Patient cohort and associated radiological evaluation
We collected tumor-normal paired samples from a total of 197 patients (24 AIS, 74 MIA, 83 stage I and 16 stage IIIA); the associated RNA-seq and WES data were generated from September 2011 to May 2016.12,13 Clinical characteristics of our cohort, including sex, age, tumor size, smoking status, histology and pathological stage, are shown in Table 1.
The sex of patients was self-reported by study participants. Notably, the proportion of the lepidic histological pattern dropped significantly from pGGO to solid nodule in LUAD, whereas the other histological patterns gradually increased. Based on previous methods,14 we further performed radiological evaluation using enhanced chest computed tomography (CT) and positron emission tomography-CT (PET-CT) images. The maximum diameter of the tumor was measured on the lung window. Three radiological subtypes, pGGO (CTR = 0), mGGO (0 < CTR < 1) and pure solid (CTR = 1), were classified based on the consolidation-to-tumor ratio (CTR, the ratio of the maximum diameter of the solid component to the maximum diameter of the whole tumor on CT scan). We retrospectively collected IHC results of Ki67 from the pathological reports of these patients. Ki67 staining was performed using the CONFIRM™ anti-Ki-67 (30-9) Rabbit Monoclonal Primary Antibody according to the manufacturer’s instructions.

Gene mutation profiling and copy number variation
The WES data were generated with 150 bp paired-end reads on the Illumina HiSeq X Ten platform, and the gene mutation files were from our previous study.12 TMB was calculated from the total number of non-synonymous mutations and a capture size of about 30 Mb. We reanalyzed the genomic mutation profiles of four subgroups (pGGO and mGGO in AIS/MIA, mGGO and pure solid in LUAD) in this cohort. Differentially mutated genes (DMGs) among the four subgroups were identified by comparing the mutation frequency of each gene using Fisher’s exact test with P < 0.05. Gene variant allele frequency (VAF) refers to the proportion of reads in a given gene that contain a specific variant allele, relative to the total number of reads in that gene. Based on our previous study, which utilized the ABSOLUTE algorithm,12 we have a quantitative estimate of the tumor purity for each sample analyzed.
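As an illustration only, and not the authors' pipeline, the radiological classification and the two mutation metrics defined above can be written as short functions. The sketch below is in Python; the function names, the argument names and the default capture size of 30 Mb are assumptions made for this example, and VAF is computed simply from the counts of variant-supporting and total reads.

```python
def ctr_subtype(solid_diameter, tumor_diameter):
    """Classify a nodule by consolidation-to-tumor ratio (CTR).

    CTR = maximum diameter of the solid component / maximum diameter of the
    whole tumor. Both diameters must use the same unit.
    """
    if tumor_diameter <= 0:
        raise ValueError("tumor_diameter must be positive")
    if not 0 <= solid_diameter <= tumor_diameter:
        raise ValueError("solid_diameter must lie between 0 and tumor_diameter")
    ctr = solid_diameter / tumor_diameter
    if ctr == 0:
        return "pGGO"   # pure GGO, CTR = 0
    if ctr < 1:
        return "mGGO"   # mixed GGO, 0 < CTR < 1
    return "solid"      # pure solid, CTR = 1


def tumor_mutation_burden(n_nonsynonymous, capture_size_mb=30.0):
    """Non-synonymous mutations per megabase of captured sequence.

    The 30 Mb default mirrors the approximate capture size stated in the
    text; it is an assumption of this sketch, not a measured value.
    """
    return n_nonsynonymous / capture_size_mb


def variant_allele_frequency(variant_reads, total_reads):
    """Fraction of reads that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads
```

For example, a nodule with a 6 mm solid component in a 12 mm tumor has CTR = 0.5 and is labeled mGGO, and a sample with 60 non-synonymous mutations has a TMB of 2 mutations per Mb under the 30 Mb assumption.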
Segment files of CNV were calculated using CNVkit (v0.9.7) with default parameters.15 Amplified and deleted regions were marked using mean segment value thresholds of 0.1 and −0.1, respectively. GISTIC2 was used to identify amplification and deletion peaks.16

Table 1: Clinical characteristics of AIS/MIA and LUAD among pGGO, mGGO and solid nodule in our study cohort (n = 197).

| Clinical variable | AIS/MIA pGGO (n = 67) | AIS/MIA mGGO (n = 30) | AIS/MIA Solid (n = 1) | P | LUAD pGGO (n = 3) | LUAD mGGO (n = 32) | LUAD Solid (n = 64) | P |
|---|---|---|---|---|---|---|---|---|
| Sex, n (%) | | | | 1 | | | | 0.22 |
| Female | 46 (68.7) | 21 (70.0) | 1 (100.0) | | 2 (66.7) | 20 (62.5) | 30 (46.9) | |
| Male | 21 (31.3) | 9 (30.0) | 0 (0.0) | | 1 (33.3) | 12 (37.5) | 34 (53.1) | |
| Age, n (%) | | | | 0.36 | | | | 0.42 |
| ≤60 | 46 (68.7) | 17 (56.7) | 1 (100.0) | | 1 (33.3) | 15 (46.9) | 23 (35.9) | |
| >60 | 21 (31.3) | 13 (43.3) | 0 (0.0) | | 2 (66.7) | 17 (53.1) | 41 (64.1) | |
| Smoking, n (%) | | | | 0.97 | | | | 0.14 |
| Never | 51 (76.1) | 22 (73.3) | 1 (100.0) | | 2 (66.7) | 26 (81.2) | 41 (64.1) | |
| Ever | 16 (23.9) | 8 (26.7) | 0 (0.0) | | 1 (33.3) | 6 (18.8) | 23 (35.9) | |
| Histology, n (%) | | | | – | | | | <0.05 |
| Lepidic | / | / | / | | 2 (66.7) | 9 (28.1) | 3 (4.7) | |
| Acinar | / | / | / | | 1 (33.3) | 17 (53.1) | 40 (62.5) | |
| Papillary | / | / | / | | 0 (0.0) | 6 (18.8) | 10 (15.6) | |
| Solid | / | / | / | | 0 (0.0) | 0 (0.0) | 9 (14.1) | |
| IMA | / | / | / | | 0 (0.0) | 0 (0.0) | 2 (3.1) | |
| P stage, n (%) | | | | 0.32 | | | | 0.15 |
| AIS | 19 (28.4) | 5 (16.7) | 0 (0.0) | | / | / | / | |
| MIA | 48 (71.6) | 25 (83.3) | 1 (100.0) | | / | / | / | |
| IA | / | / | / | | 3 (100.0) | 22 (68.7) | 31 (48.4) | |
| IB | / | / | / | | 0 (0.0) | 7 (21.9) | 20 (31.3) | |
| IIIA | / | / | / | | 0 (0.0) | 3 (9.4) | 13 (20.3) | |
| Tumor size, n (%) | | | | 0.27 | | | | <0.05 |
| (0,1] | 47 | 21 | 1 | | 1 | 4 | 2 | |
| (1,2] | 13 | 3 | 0 | | 2 | 18 | 21 | |
| (2,3] | 2 | 1 | 0 | | 0 | 8 | 25 | |
| (3,6] | 0 | 0 | 0 | | 0 | 2 | 16 | |
| NA | 5 | 5 | 0 | | 0 | 0 | 0 | |

AIS, adenocarcinoma in situ; MIA, minimally invasive adenocarcinoma; LUAD, lung adenocarcinoma; GGO, ground-glass opacity; pGGO, pure ground-glass opacity; mGGO, mixed ground-glass opacity; IMA, invasive mucinous adenocarcinoma.

APOBEC and COSMIC signature activity
APOBEC enrichment score was calculated using maftools (v2.10.0).17 Firstly, we generated a frequency matrix of single nucleotide variants (SNVs) using the trinucleotideMatrix function.
This matrix was then subjected to non-negative matrix factorization (NMF) using the extractSignatures function to identify the underlying mutational processes, including those associated with APOBEC activity. We used the signatureEnrichment function to perform association analysis and determine the APOBEC enrichment score in each sample, following the method described by Roberts et al.18 This involved comparing the frequency of C > T mutations occurring within the tCw trinucleotide context to the background frequency of C > T mutations in the sample, as well as the frequency of tCw occurring within ±20 bp of the mutated bases.

The SigProfiler framework was used to identify the underlying mutational processes driving the observed patterns of mutations in different samples or groups of samples. SigProfilerExtractor (v1.1.21) was used to extract mutation signatures from the mutation matrix. This includes both de novo identification of mutation signatures and decomposition of these signatures into known COSMIC (v3.3) signatures. A comparison analysis was then performed between different groups based on the mutation activity matrix of the SBS96 signatures of each sample.

Differentially expressed genes and gene expression modules
We obtained the gene expression matrix from raw RNA-seq data in our previous study.13 Differentially expressed genes (DEGs) in five comparisons (AIS&MIA-pGGO vs normal, AIS&MIA-mGGO vs pGGO, pGGO of LUAD vs AIS/MIA, mGGO vs pGGO of LUAD, and solid vs mGGO of LUAD) were identified from the RNA-seq data using the R package limma (v3.50.0). P < 0.05 and |log2FC| > 1 were used to select DEGs.19 To identify the gene modules associated with LUAD progression, we selected genes with significant changes from normal to pure GGO of AIS/MIA to pGGO, mGGO and solid of LUAD (P < 0.05 and |log2FC| ≥ 1) and further performed WGCNA on them.
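The threshold-based selection described above can be illustrated with a short Python sketch. It is not the limma workflow itself: it only shows the filtering step applied to a table of results, and the gene names, the tuple layout and the `inclusive` switch (covering the two stated variants, |log2FC| > 1 and |log2FC| ≥ 1) are hypothetical choices made for this example.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=1.0, inclusive=False):
    """Return the genes passing the P-value and |log2FC| thresholds.

    results: iterable of (gene, log2fc, p_value) tuples, for instance rows
    exported from a differential expression table.
    inclusive: if True use |log2FC| >= cutoff, otherwise |log2FC| > cutoff.
    """
    selected = []
    for gene, log2fc, p_value in results:
        magnitude = abs(log2fc)
        passes_lfc = magnitude >= lfc_cutoff if inclusive else magnitude > lfc_cutoff
        if p_value < p_cutoff and passes_lfc:
            selected.append(gene)
    return selected


# Hypothetical toy results, not data from the study.
toy_results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.0, 0.010),   # sits exactly on the fold-change boundary
    ("GENE_C", 2.4, 0.200),    # large change but not significant
    ("GENE_D", -2.1, 0.030),
]
strict_degs = select_degs(toy_results)
boundary_degs = select_degs(toy_results, inclusive=True)
```

With the strict rule only GENE_A and GENE_D are kept; with the inclusive rule GENE_B is kept as well, which shows why the choice between > and ≥ matters only for genes lying exactly on the boundary.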
Each module was required to contain at least 50 genes, and the P-value ratio threshold for reassigning genes between modules was set to 0.

Gene function and ontology annotation
The annotation GMT file of hallmark gene sets (v7.4) was downloaded from Gene Set Enrichment Analysis (GSEA, https://www.gsea-msigdb.org). Biological hallmarks significantly associated with gene expression modules (M1, M2 and M3) were identified with GSVA (v1.42.0). Biological hallmarks with significantly different enrichment (FDR < 0.05) between GGO and pure solid LUAD, or between lepidic-predominant LUAD and solid-predominant LUAD, were analyzed with GSEA software (v4.1.0).

Construction of GGO to solid (G2S) score
Differential expression analysis was performed between GGO and solid nodules of the 99 LUAD patients in the FUSCC dataset. In LUAD, DEGs between GGO and solid were filtered for |log2FC| ≥ 1.5 and P < 0.05. GGO-specific and solid-specific genes were identified based on whether log2FC ≤ −1.5 or log2FC ≥ 1.5. Lepidic-predominant LUAD containing a GGO component usually has a good prognosis, and vice versa. Differential expression analysis was therefore also performed between lepidic-predominant LUAD and solid-predominant LUAD in the FUSCC and TCGA datasets to identify lepidic-specific (close to GGO on radiology) and solid-specific (close to solid on radiology) genes. Based on the intersection of the DEGs in the two datasets, 24 GGO-specific genes and 21 solid-specific genes were finally identified. Then, the G2S score (“TotalScore” from singscore) was calculated with the R package singscore (v1.14.0), with the solid-specific genes as the Upset and the GGO-specific genes as the Downset. Compared to solid samples, GGO samples would have a lower G2S score. In each dataset, patients were divided into a GGO-like group and a Solid-like group based on the median value of the G2S score. The GGO-like and Solid-like groups can be further divided into two subgroups based on the presence or absence of EGFR or TP53 mutations.
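The grouping step can be sketched in Python as follows. The sketch starts from G2S scores that have already been computed (it does not reimplement singscore), and it assumes that samples with a score above the cohort median are labeled Solid-like and all others GGO-like; the text does not state how ties at the median are handled. Sample identifiers, scores and mutation calls are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each sample by comparing its score with the cohort median.

    scores: dict mapping sample id -> precomputed G2S score.
    Assumption of this sketch: score > median means 'Solid-like'.
    """
    cutoff = median(scores.values())
    return {
        sample: ("Solid-like" if value > cutoff else "GGO-like")
        for sample, value in scores.items()
    }


def stratify_by_mutation(groups, mutated_samples, gene):
    """Combine the score-based group with the mutation status of one gene."""
    return {
        sample: f"{group}, {gene} {'mutant' if sample in mutated_samples else 'wild type'}"
        for sample, group in groups.items()
    }


# Hypothetical toy cohort, not data from the study.
g2s_scores = {"P1": -0.31, "P2": -0.05, "P3": 0.12, "P4": 0.40}
groups = split_by_median(g2s_scores)
egfr_subgroups = stratify_by_mutation(groups, {"P2", "P4"}, "EGFR")
```

Applied once per gene, this yields four subgroups for EGFR and four for TP53, which can then be passed to a survival analysis.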
This results in a total of four subgroups for each gene: GGO-like and EGFR wild type, GGO-like and EGFR mutant, Solid-like and EGFR wild type, and Solid-like and EGFR mutant; and GGO-like and TP53 wild type, GGO-like and TP53 mutant, Solid-like and TP53 wild type, and Solid-like and TP53 mutant. Survival analysis can be conducted in the four subgroups defined by the G2S score and the mutation status of EGFR or TP53.

Tumor microenvironment analysis
Tumor microenvironment (TME) analysis was performed based on gene expression signatures. Proliferation rate and matrix remodeling for each sample were evaluated with Fges.20 The LM22 matrix was used to calculate the cell fractions of each sample.21

Flow cytometry
Thirty-two fresh tumor tissues and 11 distant normal lung tissues were collected, stored in Tissue Storage Solution (Miltenyi Biotec), and processed on the day of sample collection. The tissues were washed with PBS until there was no visible blood on the surface, and then minced into a homogenate. Subsequently, 10 mL of type IV collagenase (Gibco) was used for digestion at 37 °C for 1 h with agitation. The resulting single-cell suspension was filtered through a 70 μm cell strainer (BD Bioscience), and the supernatant was discarded after centrifugation at 500g for 10 min. After resuspension in PBS buffer, the suspension was centrifuged at 800g for 2 min, and the supernatant was discarded. This washing process was repeated twice. Fixable Viability Dye eFluor780 (eBioscience) and Zombie Yellow Viability Kit (BioLegend) were used to distinguish dead cells. Anti-human CD3 BUV395 (BD Bioscience), Anti-human CD4 APC/Cy7 (BioLegend), Anti-human CD8a Alexa Fluor700 (BioLegend), Anti-human Perforin FITC (BioLegend), and Anti-human FOXP3 PE/Cy5 (eBioscience) were used to stain cell membrane surface proteins and intracellular proteins. Samples were acquired on the BD LSR Fortessa (BD Bioscience).
Flow cytometry data were analyzed using FlowJo (TreeStar). The representative gating strategies for Tregs, CD8+ T cells, and CD8+Perforin+ T cells are shown in Supplementary Figure S1.

Published datasets
The published datasets used in this study were documented in our previous study.13 LUAD datasets containing gene expression and prognostic information were mainly from GEO and TCGA. The RNA-seq expression data and clinical information of the TCGA LUAD datasets were downloaded from the Genomic Data Commons (GDC). TCGA LUAD pathology information, such as lepidic predominant and solid predominant, was downloaded from UCSC Xena (https://tcga-xena-hub.s3.us-east-1.amazonaws.com/download/TCGA.LUAD.sampleMap%2FLUAD_clinicalMatrix). The microarray gene expression data and clinical information were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/).

Ethics statement
This study was approved by the research ethics review committee of the Fudan University Shanghai Cancer Center (FUSCC) Institutional Review Board (No. 090977-1). Informed consent was obtained from either the patients themselves or their relatives for the donation of their samples to the tissue bank of Fudan University Shanghai Cancer Center.

Statistical analysis
All statistical analysis was performed with R (v4.1.2). R packages survminer (v0.4.9) and survival (v3.2-13) were used to perform survival and Cox analysis, respectively. Cox analysis was performed with 95% confidence intervals (CIs) for overall survival (OS) and RFS. Kaplan–Meier survival analysis in conjunction with the log-rank test was employed to assess OS and RFS. Comparisons between two independent groups used Welch’s t-test. Cell fraction comparisons between paired tumor and normal samples were performed with a paired t-test. P-values were adjusted for multiple comparisons using the False Discovery Rate (FDR). The Chi-squared test was used to perform hypothesis testing on categorical clinical data.
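The text does not name the FDR procedure used for the adjustment mentioned above. Assuming the Benjamini-Hochberg procedure (which is what the "fdr" method of R's p.adjust computes), the adjustment can be sketched in Python as follows; the function name and the example P-values are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    For the i-th smallest P-value (rank i out of n) the raw adjusted value is
    p * n / i; a running minimum taken from the largest rank downwards keeps
    the adjusted values monotone, and the result is capped at 1.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda idx: p_values[idx])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):          # walk from largest to smallest P
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values, not results from the study.
example_adjusted = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

In this toy example the two smallest P-values (0.005 and 0.01) both map to an adjusted value of about 0.02, and the two largest (0.03 and 0.04) both map to about 0.04.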
If the expected count for any cell was less than five, a simulation-based test was performed.

Role of funders
The funders did not play a role in the study design, data collection, data analyses, data interpretation, or manuscript writing.

Results

Study design
As shown in Fig. 1, a total of 197 patients were enrolled in our study. RNA sequencing (RNA-seq) and whole-exome sequencing (WES) data from paired tumor and normal samples of the 197 patients were previously reported12 and used for this study. Based on CTR from radiological imaging, 98 AIS&MIA patients were classified into 67 pGGO, 30 mGGO and 1 solid nodule, while 99 LUAD patients were classified into 3 pGGO, 32 mGGO and 64 solid nodule (Supplementary Figure S2a). Among them, most of the patients with a GGO component were non-smoking women (Supplementary Figure S2b). The mean tumor purity was over 0.2 (Supplementary Figure S2c) and the mean sequencing depth of WES was over 170× (Supplementary Figure S2d). We comprehensively explored the molecular events driving radiological stage (pGGO, mGGO and solid nodule) or pathological stage (AIS&MIA and LUAD), including genomic profiling, gene expression and the tumor microenvironment. Finally, we constructed a gene signature associated with pathological, radiological and histological stage to predict patient prognosis.

Pathological progression and radiological progression driven by different genomic events
Genomic profiling suggested that pathological stage and GGO components were driven by different genomic events. We characterized gene mutation profiles in pGGO of AIS&MIA, mGGO of AIS&MIA, pGGO of LUAD, mGGO of LUAD, and solid nodule of LUAD (Fig. 2a). There was no significant difference in TMB between pGGO and mGGO within each stage (AIS&MIA or LUAD). However, LUAD had higher TMB than AIS/MIA (Fig. 2b, mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 6.78e-05). In LUAD, TMB was significantly higher in solid than in GGO nodules (Fig.
2b, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 9.14e-04). These results suggested that pathological progression and GGO components were both associated with the genomic mutation burden. In our datasets, there was a higher percentage of smoking patients in solid nodule than in GGO nodule (Supplementary Figure S2b). History of smoking was closely associated with higher TMB only in solid nodule of LUAD (Supplementary Figure S2e, Ever in Solid vs Never in Solid, Welch’s t-test, P.adj = 9.46e-04).

APOBEC-related mutations, defined as C > G and C > T mutations, are related to various cancer types and contribute to carcinogenesis.18 We found that patients with solid nodule in LUAD had a significantly higher APOBEC enrichment score than those with GGO nodule (Fig. 2c, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.037; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 4.74e-04). However, smoking had little effect on the APOBEC enrichment score in the different GGO component groups in AIS/MIA and LUAD (Supplementary Figure S2e).

The overall activity proportion of the SBS signatures decomposed to COSMIC was comprehensively explored in different pathological and radiological stages (Supplementary Figures S3, S4a and S4b). The results revealed that smoking was associated with SBS4 in solid nodules of LUAD in our datasets (Supplementary Figure S4c, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P = 0.025). SBS4 harbored single base deletions of predominantly cytosine (C) and was closely associated with tobacco smoking.22 Most mutations in SBS4 originated from the mutations in TP53,23,24 therefore SBS4 may distinguish smokers from non-smokers better in LUAD with more TP53 mutations. In our cohort, the mutation frequency of TP53 in solid nodule of LUAD was significantly higher than that in GGO-like LUAD (Fig. 2d, mGGO in LUAD vs Solid in LUAD, Fisher’s exact test, P.adj = 8.30e-03).
Meanwhile, smokers were more likely to present with solid nodules than with GGO-like LUAD (Supplementary Figure S2b). Therefore, solid nodules of LUAD had the highest proportion of smokers, and the larger sample size of this group may also have helped SBS4 to distinguish smokers from non-smokers. Besides, we found that patients with solid nodule in LUAD had a significantly higher mutation burden of APOBEC-related signatures (SBS2 and SBS13) than those with GGO nodule (Supplementary Figure S4d; SBS2, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P = 0.017; SBS13, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P = 0.02). The APOBEC-related signatures (SBS2 and SBS13) were correlated with the APOBEC enrichment score (Supplementary Figure S4e).

Fig. 1: Study design and main findings of molecular events driving pathological and radiological progression of lung adenocarcinoma. (a) 197 patients with different pathological and radiological stages were enrolled to generate genomic and transcriptomic data. pGGO and mGGO of AIS/MIA had low malignancy, mGGO of LUAD had moderate malignancy and solid nodule of LUAD had high malignancy. The malignancy of solid nodule of AIS/MIA and of pGGO of LUAD could not be assessed because our cohort had only one solid nodule of AIS/MIA and three pGGO of LUAD, which may lead to less convincing conclusions. (b) TP53 mutation frequency significantly increased in solid nodule of LUAD; EGFR mutation frequency and CNV load were significantly higher in LUAD than in AIS/MIA; TMB was significantly higher in LUAD than in AIS/MIA, and also significantly higher in solid nodule of LUAD than in GGO of LUAD. (c) Activity of the cell proliferation pathway and the E2F target pathway was significantly higher in solid nodule of LUAD. (d) CD8+ T cells gradually decreased and Tregs gradually increased from pGGO of AIS/MIA to solid nodule of LUAD. (e) GGO-to-Solid score from low to high represented patient prognosis from good to bad.
Fig. 2: Genomic events which may drive AIS/MIA to LUAD and GGO to solid. (a) Gene mutation profiling of pGGO and mGGO in AIS/MIA and pGGO, mGGO and solid nodule in LUAD. (b) Tumor mutation burden (TMB) in AIS/MIA divided into pGGO and mGGO and in LUAD divided into pGGO, mGGO and solid nodule (mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 6.78e-05; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 9.14e-04). (c) APOBEC enrichment score in AIS/MIA divided into pGGO and mGGO and in LUAD divided into pGGO, mGGO and solid nodule (mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.037; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 4.74e-04). (d) Gene mutation frequency of eight genes in pGGO and mGGO of AIS/MIA and pGGO, mGGO and solid nodule of LUAD (EGFR, pGGO in AIS/MIA vs Solid in LUAD, Fisher’s exact test, P.adj = 0.046; TP53, mGGO in LUAD vs Solid in LUAD, Fisher’s exact test, P.adj = 8.30e-03). (e, f) The VAF and purity distribution of EGFR (e) (mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.002; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.002) and TP53 (f) (mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.025) in LUAD among different stages. P-values were adjusted by false discovery rate (FDR). *P < 0.05; **P < 0.01; ***P < 0.001.

EGFR was the most frequently mutated gene in LUAD (Fig. 2a). EGFR mutation may be associated with pathological stage, in that LUAD had a higher mutation frequency than AIS/MIA (Fig. 2d, Supplementary Figure S5a and S5b, pGGO in AIS/MIA vs Solid in LUAD, Fisher’s exact test, P.adj = 0.046). We also noticed that the copy number variation (CNV) load increased sharply from mGGO in AIS&MIA to mGGO in LUAD (Supplementary Figure S6a, mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 1.91e-07 and Supplementary Figure S6b).
Mutations of TP53 and ERBB2 were closely associated with the GGO component. Solid nodule had a significantly higher TP53 mutation frequency (53%) than mGGO (13%) in LUAD (Fig. 2d and Supplementary Figure S5b), suggesting that TP53 non-synonymous mutation may drive the transformation from part-solid to solid nodule. ERBB2, by contrast, showed a specifically low mutation frequency in GGO-like LUAD in our dataset (Fig. 2d). No other genes recurrently mutated in LUAD,12 such as RBM10, KRAS, BRAF, MET and MAP2K1, were associated with the GGO component (Fig. 2d). In addition, we compared the variant allele frequency (VAF) of EGFR, TP53, and ERBB2 (Fig. 2e and f and Supplementary Figure S5c). Among samples with EGFR (Fig. 2e, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.002) or TP53 (Fig. 2f, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.025) mutations, the VAF in tumors containing a GGO component was significantly lower than in solid tumors. The VAF of EGFR mutation may also be related to pathological stage (Fig. 2e, mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.002). By analyzing the correlation between tumor purity and VAF, we found that the lower VAF in tumors with GGO components was not entirely due to low tumor purity (Supplementary Figure S5d–S5f). This result suggests that AIS/MIA and tumors with GGO components were at an earlier stage of tumor growth than solid tumors.

Proliferation-associated gene module was downregulated in both AIS&MIA and GGO-like LUAD

To identify the gene signatures associated with pathological stage and GGO components separately, we identified the differentially expressed genes (DEGs) in the following four pairwise comparisons: mGGO vs pGGO in AIS&MIA, GGO of LUAD vs GGO of AIS&MIA, mGGO vs pGGO in LUAD, and solid vs mGGO in LUAD (|log2FC| ≥ 1 & P < 0.05) (Fig. 3a).
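The DEG criterion stated above (|log2FC| ≥ 1 and P < 0.05) amounts to a simple two-condition filter. The sketch below illustrates it on made-up numbers; the function name `select_degs`, the input layout, and the gene labels are hypothetical and are not taken from the study.

```python
def select_degs(results, min_abs_log2fc=1.0, max_p=0.05):
    """Split genes passing |log2FC| >= cutoff and P < cutoff into up/down lists.

    `results` maps a gene label to a (log2 fold change, P-value) pair.
    """
    up, down = [], []
    for gene, (log2fc, p) in results.items():
        if p < max_p and abs(log2fc) >= min_abs_log2fc:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Toy input with invented values, for illustration only.
toy_results = {
    "GENE_A": (2.3, 0.001),    # passes, up-regulated
    "GENE_B": (-1.0, 0.04),    # passes at the boundary, down-regulated
    "GENE_C": (0.4, 0.0001),   # fails: fold change too small
    "GENE_D": (1.8, 0.20),     # fails: P-value too large
}
up_genes, down_genes = select_degs(toy_results)
```

Applying the same function with `min_abs_log2fc=1.5` would correspond to the stricter fold-change cutoff used later in the paper for the signature genes.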
DEGs identified in the above four comparisons suggested that both pathological progression and GGO components had significant effects on gene expression. PCA based on all the DEGs in Fig. 3a demonstrated that gene expression profiles of pGGO in LUAD were close to those of AIS/MIA (Fig. 3b). Gene enrichment scores, calculated for each sample by gene set variation analysis (GSVA), indicated that the biological function of GGO-like LUAD was close to that of AIS&MIA (Supplementary Figure S7a and S7b). This may explain why pGGO in LUAD had an excellent prognosis, just like AIS/MIA.4

Using weighted gene co-expression network analysis (WGCNA), we divided the 3948 pathologically and radiologically associated DEGs (pGGO in AIS&MIA vs normal, mGGO vs pGGO in AIS&MIA, GGO of LUAD vs GGO of AIS&MIA, mGGO vs pGGO in LUAD, and solid vs mGGO in LUAD) into three co-expression modules (M1, M2 and M3) (Supplementary Figure S7c). M0 contained 806 genes that were not assigned to any gene module (Supplementary Figure S7d). M1 and M2 were modules of DEGs between tumor and normal, while M3 was a module specifically highly expressed in solid nodule (Fig. 3c and d). Gene functions in M1 and M2 were mainly related to development, signaling and the immune system (Fig. 3e). The M3 genes, which were closely associated with proliferation, may explain why GGO-like nodules (down-regulation of M3 genes) progressed more slowly than solid nodules (up-regulation of M3 genes) (Supplementary Figure S7e–S7g). Moreover, we found that most of the genes in M3 were strongly associated with overall survival (OS) and RFS, and the distribution of hazard ratio (HR) was significantly higher in M3 than in M1 and M2 (Fig. 3f and g; HR of OS: M1 vs M2, Welch’s t-test, P.adj = 3.27e-15, M2 vs M3, Welch’s t-test, P.adj = 3.67e-10; HR of RFS: M1 vs M2, Welch’s t-test, P.adj = 8.22e-43, M2 vs M3, Welch’s t-test, P.adj = 2.97e-17).
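Most group comparisons in this section are reported with Welch’s t-test, which does not assume equal variances in the two groups. As a minimal sketch of what that test computes (not the authors' code; the function name `welch_t` and the sample values are illustrative), the statistic and its Welch-Satterthwaite degrees of freedom can be obtained with the Python standard library. Converting the statistic to a P-value additionally needs the t distribution, which is left out here.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two group means (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / (se2_a + se2_b) ** 0.5
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Invented values standing in for, e.g., per-gene hazard ratios in two modules.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_1, group_2)
```

Because the degrees of freedom shrink when one group is much more variable than the other, the test stays valid for modules of very different sizes and spreads.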
In brief, genes related to proliferation and poor prognosis were expressed at relatively low levels in GGO-like nodules. These results may explain, at the gene expression level, why GGO-like LUAD grows more slowly and has a better prognosis than solid nodule in LUAD.

Fig. 3: Cell proliferation-related gene expression module was less active in GGO-like LUAD and associated with better prognosis. (a) Fold change of gene expression in the following four comparisons: mixed GGO vs pure GGO in AIS&MIA; non-solid of LUAD vs non-solid of AIS&MIA; mixed GGO vs pure GGO in LUAD; solid vs mixed GGO in LUAD. Red dots represent relatively up-regulated genes and blue dots represent relatively down-regulated genes. (b) PCA based on expression of 1999 DEGs in (a). The color of each dot represents pure GGO, mixed GGO and solid nodule in AIS&MIA and LUAD. (c) Three gene expression modules based on WGCNA. M1 consists of 1898 genes with a gradual decrease in expression from normal to AIS&MIA to LUAD. M2 consists of 890 genes with a gradual increase in expression from normal to AIS&MIA to LUAD. M3 consists of 354 genes that are specifically highly expressed in solid nodule of LUAD. (d) Heatmap showing the gene expression trend in the three modules. (e) Functional annotation of genes in the three modules. The genes of M1 were enriched in development, signaling and immune functions. The genes of M2 were enriched in signaling. The genes of M3 were mainly enriched in proliferation and were also involved in development and metabolism. (f) Proportional distribution of genes significantly associated with OS or RFS in M1, M2 and M3. (g) Distribution of OS and RFS hazard ratios of genes in M1, M2 and M3 (HR of OS: M1 vs M2, Welch’s t-test, P.adj = 3.27e-15, M2 vs M3, Welch’s t-test, P.adj = 3.67e-10; HR of RFS: M1 vs M2, Welch’s t-test, P.adj = 8.22e-43, M2 vs M3, Welch’s t-test, P.adj = 2.97e-17). P-values were adjusted by false discovery rate (FDR). ***P < 0.001.

Pathological progression and radiological progression were driven by different immune responses

In the above analysis, we found significantly lower expression of proliferative pathway-related genes in GGO-like nodules than in solid nodules. To quantify the proliferation trends, we further evaluated the proliferation rates of different GGO components at different pathological stages. Using the Fges method,20 we quantified the proliferation rate in each sample. Compared to paired normal lung tissue, proliferation rate significantly increased from mGGO to solid nodule in LUAD (Supplementary Figure S8a and S8c; proliferation rate, mGGO in LUAD vs normal, paired t-test, P.adj = 5.33e-05, solid in LUAD vs normal, paired t-test, P.adj = 1.26e-16; matrix remodeling, pGGO in AIS/MIA vs normal, paired t-test, P.adj = 4.20e-4, mGGO in AIS/MIA vs normal, paired t-test, P.adj = 4.77e-4, mGGO in LUAD vs normal, paired t-test, P.adj = 1.02e-04, solid in LUAD vs normal, paired t-test, P.adj = 2.57e-19). LUAD containing a GGO component had a significantly lower proliferation rate than solid LUAD (Fig. 4a; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 2.86e-08; pGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.002; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.06). Compared to GGO-like LUAD, the Ki67 positive rate was significantly higher in solid nodule of LUAD, which was consistent with the RNA-seq based proliferation rate (Fig. 4a; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 8.82e-06; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.013). Meanwhile, using the Fges method, we also found that the matrix remodeling score increased from AIS&MIA to LUAD (Supplementary Figure S8a and S8c). In addition, matrix remodeling activity was significantly downregulated in GGO-like LUAD relative to solid LUAD (Fig. 4d; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 4.40e-04; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 1.80e-07; Normal vs pGGO in AIS/MIA, Welch’s t-test, P.adj = 3.41e-05). These results also indicated that proliferation and matrix remodeling may play an important role in the progression from AIS/MIA to LUAD as well as from GGO to solid nodule.

The growth of tumor cells depends on the one hand on their own ability to proliferate, and on the other hand on whether they can overcome immune cells. To explore the changes in the immune microenvironment in GGO-like LUAD, we deconvoluted the proportion of various immune cells for each sample using gene expression data (Fig. 4c). Compared with normal tissue, regulatory T cells (Tregs) significantly increased and CD8 T cells significantly decreased in tumor samples (Supplementary Figure S8b and S8d; Tregs, pGGO in AIS/MIA vs normal, paired t-test, P.adj = 0.02, mGGO in AIS/MIA vs normal, paired t-test, P.adj = 0.01, mGGO in LUAD vs normal, paired t-test, P.adj = 3.73e-04, solid in LUAD vs normal, paired t-test, P.adj = 4.34e-09; CD8 T cells, pGGO in AIS/MIA vs normal, paired t-test, P.adj = 0.027, mGGO in AIS/MIA vs normal, paired t-test, P.adj = 1.89e-05, mGGO in LUAD vs normal, paired t-test, P.adj = 1.89e-05, solid in LUAD vs normal, paired t-test, P.adj = 1.72e-03). No significant difference in Tregs was observed between different GGO components, but Tregs significantly increased from AIS&MIA to LUAD (Fig. 4b; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.003; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 5.96e-06). CD8 T cells showed a downward trend from pGGO to solid nodule in LUAD and reached the lowest point in solid nodule of LUAD, although this trend was not statistically significant (Fig. 4e, solid in LUAD vs normal, Welch’s t-test, P.adj = 7.58e-06; pGGO in AIS/MIA vs normal, Welch’s t-test, P.adj = 1.42e-04). The above results were confirmed by flow cytometry. Consistent with RNA-seq results, the number of Tregs steadily increased during disease progression and was significantly increased in solid nodule of LUAD (Fig. 4b; pGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 0.007; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.043). Conversely, CD8+ T cells decreased with the progression of the disease, although the trend was not significant in either deconvolution or flow cytometry (Fig. 4e). We found that the ability of CD8+ T cells to secrete perforin was inhibited in solid nodule of LUAD, significantly so relative to normal tissue and with borderline significance relative to mGGO in LUAD (Fig. 4f; Normal vs Solid in LUAD, Welch’s t-test, P.adj = 0.02; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.053). Another study found that Tregs promote the occurrence and development of tumors by inhibiting the endogenous secretion of killer factors by CD8+ T cells.25 This suggests that one route to immune escape in LUAD may be the upregulation of Tregs (Fig. 4b) to weaken the killing ability of CD8+ T cells (Fig. 4f), rather than a direct reduction in their numbers (Fig. 4e). Representative flow cytometry plots of CD8+ T cells expressing perforin and of Tregs are shown in Fig. 4g and h.

Fig. 4: Dynamic changes in the tumor microenvironment evaluation. (a) Proliferation (RNA-seq, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 2.86e-08; pGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.002; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.06) (IHC, mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 8.82e-06; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.013), (b) Tregs (RNA-seq, mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.003; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 5.96e-06) (Flow cytometry, pGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 0.007; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.043), (c) Immune cell content from gene expression showed immune response, (d) Matrix Remodeling (mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 4.40e-04; mGGO in AIS/MIA vs Solid in LUAD, Welch’s t-test, P.adj = 1.80e-07; Normal vs pGGO in AIS/MIA, Welch’s t-test, P.adj = 3.41e-05), (e) CD8+ T cells (RNA-seq, solid in LUAD vs normal, Welch’s t-test, P.adj = 7.58e-06; pGGO in AIS/MIA vs normal, Welch’s t-test, P.adj = 1.42e-04), and (f) CD8+Perforin+ T cells (Flow cytometry, Normal vs Solid in LUAD, Welch’s t-test, P.adj = 0.02; mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 0.053) distribution in normal, AIS&MIA divided into pGGO and mGGO and LUAD divided into pGGO, mGGO and solid nodule. (g) Representative flow cytometry plot of CD8+ T cells expressing perforin. (h) Representative flow cytometry plot of T regulatory cells (Tregs). P-values were adjusted by false discovery rate (FDR). *P < 0.05; **P < 0.01; ***P < 0.001.

Prognosis prediction based on gene signatures associated with pathological and radiological progression

The aforementioned results about gene modules and tumor microenvironment suggested that genes associated with GGO-to-solid progression in LUAD play an important role in patients’ prognosis. Therefore, quantifying this process based on genes that varied significantly between GGO and solid in LUAD may be helpful in the prognostic stratification of LUAD patients. Meanwhile, features related to histological progression can also predict the histological patterns and prognosis of tumor samples. LUAD can be classified into lepidic-predominant, acinar-predominant, papillary-predominant, micropapillary-predominant, and solid-predominant adenocarcinoma according to the main growth patterns.26,27 Among them, lepidic-predominant nodule always contains a GGO component and has a superior prognosis, whereas solid-predominant adenocarcinoma usually does not have a GGO component and is associated with poor prognosis.28–30 We speculated that the biological differences between GGO and solid nodule in radiology and between lepidic-predominant and solid-predominant adenocarcinoma in histology may be similar. To test this hypothesis, we performed gene set enrichment analysis (GSEA) of cancer hallmarks for the two groupwise comparisons: GGO vs solid and lepidic-predominant vs solid-predominant. As a result, eight hallmarks were significantly enriched in radiological solid and five hallmarks were significantly enriched in histological solid-predominant, with four hallmarks, namely G2/M checkpoint, E2F targets, mitotic spindle, and mTORC1 signaling, significantly enriched in both groups (Fig. 5a–d). We also identified four hallmarks that were significantly enriched in solid-predominant histology from the TCGA dataset, which were also identified in the FUSCC dataset (Fig. 5e and f).

We simulated radiological and histological progression using genes significantly altered from GGO to solid as well as from lepidic-predominant to solid-predominant. To identify GGO/lepidic-predominant and solid/solid-predominant associated features, we performed differential expression analysis in the following three comparisons: solid vs GGO in the FUSCC dataset, solid-predominant vs lepidic-predominant in the FUSCC dataset, and solid-predominant vs lepidic-predominant in the TCGA dataset. By setting |log2FC| ≥ 1.5 and P-value < 0.05, we obtained 61 genes significantly up-regulated in GGO/lepidic-predominant (GGO/lepidic-specific genes) and 36 genes significantly up-regulated in solid/solid-predominant (solid-specific genes) (Fig. 6a and b). Of these, 24 of the 61 GGO/lepidic-predominant specific genes and 21 of the 36 solid/solid-predominant specific genes were present among the 8108 genes in the intersection of 12 gene expression datasets (Fig. 6c). Then, we constructed a signature named G2S (lepidic/GGO to solid score) to quantify the progression from GGO to solid or from lepidic-predominant to solid-predominant using the 24 GGO/lepidic-predominant specific and 21 solid-predominant specific genes. G2S increased significantly along radiological or histological progression in both the FUSCC (Fig. 6d and e; 6d: mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 5.14e-09; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.007; Normal vs pGGO in AIS/MIA, Welch’s t-test, P.adj = 0.002; 6e: Papillary vs Solid, Welch’s t-test, P.adj = 0.007; Lepidic vs Acinar, Welch’s t-test, P.adj = 3.24e-06; Normal vs AIS, Welch’s t-test, P.adj = 0.007) and TCGA datasets (Fig. 6f, Micropapillary vs Solid, Welch’s t-test, P.adj = 0.001; Papillary vs Micropapillary, Welch’s t-test, P.adj = 2.16e-04; Lepidic vs Acinar, Welch’s t-test, P.adj = 0.003). Solid and solid-predominant had the highest score on average in the FUSCC and TCGA datasets.

In the FUSCC dataset, we classified patients into a GGO-like group and a solid-like group based on the median of G2S. Better prognosis in terms of overall survival (OS) and RFS was found in the GGO-like group than in the solid-like group (Fig. 6g). The other 11 independent datasets were also classified into GGO-like and solid-like based on the median of G2S and confirmed the prognostic performance of G2S (Fig. 6h and i, Supplementary Figure S9a and S9b).

To strengthen the relationship between genomic and transcriptional data, we integrated gene mutation and gene expression to construct a prognostic model. EGFR and TP53 were chosen for their significantly increased mutation frequency observed from AIS&MIA to LUAD, or from GGO-like LUAD to solid nodule in LUAD. We analyzed the correlation between EGFR/TP53 mutation status and prognosis, and found that EGFR mutations were associated with a favorable prognosis in LUAD (Supplementary Figure S10a and S10b), with the best prognosis observed in patients with GGO characteristics and EGFR mutations, and the worst prognosis observed in patients with solid features and wild-type EGFR (Supplementary Figure S10c and S10d). The same conclusion can be observed in the Asian LUAD dataset, GSE31210, which includes a high frequency of EGFR mutations (Supplementary Figure S10e–S10h). EGFR-mutated patients exhibited a better prognosis compared to EGFR wild-type patients, even in the absence of tyrosine-kinase inhibitor (TKI) administration (Supplementary Figure S10i–S10n). There was a trend towards a correlation between mutations of the tumor suppressor gene TP53 and poor prognosis (P = 0.15 and 0.35) (Supplementary Figure S11a and S11b). We integrated the gene expression-based G2S model with TP53 mutations and found that prognosis stratification was mainly driven by G2S, and TP53 mutation information did not significantly improve the prognostic performance of the model (Supplementary Figure S11c and S11d).

Fig. 5: GSEA of the FUSCC and TCGA datasets. (a) Eight significantly enriched hallmarks between solid and GGO in the FUSCC datasets. (b) Five significantly enriched hallmarks between solid predominant and lepidic predominant in the FUSCC datasets. (c) Enrichment score and FDR of the eight gene sets significantly enriched in solid. (d) Enrichment score and FDR of the five gene sets significantly enriched in solid predominant. (e) Four significantly enriched hallmarks between solid predominant and lepidic predominant in TCGA datasets. (f) Enrichment score and FDR of the four gene sets significantly enriched in solid predominant.

Fig. 6: Prognosis prediction features associated with pathological, radiological and histological in LUAD. (a) Venn plot showed up-regulated genes in GGO or lepidic predominant group by following three comparing groups: solid vs GGO in FUSCC dataset, solid predominant vs lepidic predominant in FUSCC dataset and solid predominant vs lepidic predominant in TCGA dataset. (b) Venn plot showed up-regulated genes in solid or solid predominant group by following three comparing groups: solid vs GGO in FUSCC dataset, solid predominant vs lepidic predominant in FUSCC dataset and solid predominant vs lepidic predominant in TCGA dataset. (c) Volcano plot, which derived from comparison between solid and GGO in FUSCC dataset, showed the log2FC and P-value distributions for 8108 genes from the intersection of 12 expression datasets. 24/61 GGO or lepidic predominant specific genes were marked with blue dots and 21/36 solid or solid predominant specific genes were marked with red dots. (d) G2S score distribution from normal to AIS&MIA and LUAD divided into pGGO, mGGO and solid in FUSCC dataset (mGGO in LUAD vs Solid in LUAD, Welch’s t-test, P.adj = 5.14e-09; mGGO in AIS/MIA vs mGGO in LUAD, Welch’s t-test, P.adj = 0.007; Normal vs pGGO in AIS/MIA, Welch’s t-test, P.adj = 0.002). (e) G2S score distribution from normal to different histological stages in FUSCC dataset (Papillary vs Solid, Welch’s t-test, P.adj = 0.007; Lepidic vs Acinar, Welch’s t-test, P.adj = 3.24e-06; Normal vs AIS, Welch’s t-test, P.adj = 0.007). (f) G2S score distribution of different histological stage in TCGA dataset (Micropapillary vs Solid, Welch’s t-test, P.adj = 0.001; Papillary vs Micropapillary, Welch’s t-test, P.adj = 2.16e-04; Lepidic vs Acinar, Welch’s t-test, P.adj = 0.003). (g) Overall survival (OS) and relapse-free survival (RFS) between GGO-like and solid-like divided by median value of G2S score. Hazard ratio and 95% confidence intervals (CIs) for OS (h) and RFS (i) calculated based on G2S score in 12 datasets. P-values were adjusted by false discovery rate (FDR). **P < 0.01; ***P < 0.001.

Discussion

GGO-like nodules often appear at a pre-invasive or early stage, with inert growth and good prognosis.4,14,31 Many studies, including our previous findings, have indicated that there are many significant molecular differences between AIS&MIA and LUAD.12 In this study, we further classified AIS&MIA and LUAD into three subtypes based on CTR, which was used to define the GGO proportion in a tumor. Then, molecular characteristics of pGGO and mGGO in AIS&MIA as well as pGGO, mGGO and solid in LUAD were comprehensively described. Our results revealed both shared and distinct molecular characteristics driving the pathological and radiological progression of GGO-like LUAD.

Although relatively lower tumor mutation burden (TMB) and less copy number variation have been reported in subsolid nodule of lung,8,9 our study provided a more detailed comparison of different GGO components in AIS&MIA and LUAD. AIS&MIA and GGO-like LUAD exhibited a similar level of TMB, suggesting that GGO-like LUAD may be the early stage in the development of tumor invasion (Fig. 2b). Meanwhile, our analysis revealed no significant difference in TMB and no DMG between pGGO of AIS&MIA and mGGO of AIS&MIA, suggesting that the effect of solid component on genomic events in AIS&MIA might be negligible (Fig. 2b and Supplementary Figure S5a). However, the genomic differences between pure solid and GGO nodule in AIS&MIA remain to be further explored, as the FUSCC dataset contained only one pure solid nodule in AIS&MIA.

GGO-like nodule was closely associated with relatively low TP53 mutation frequency compared to solid nodule.
TP53 is a tumor suppressor gene that is frequently mutated in cancer, and its mutation can promote tumor proliferation and invasion, and regulate tumor metabolism.32–35 Our results suggested that TP53 mutation may contribute to the emergence of the solid component in early-stage LUAD because a significantly higher TP53 mutation frequency was found in solid nodule of LUAD than in GGO-like nodule (Fig. 2d). Even though the TP53 mutation frequency of mGGO in AIS&MIA was not significantly higher than that of pGGO, there was an increasing trend in mGGO (Fig. 2d). EGFR mutation may be closely related to pathological progression, but not to GGO components. Although one study reported that the EGFR-mutated group had a significantly higher frequency of GGO than the EGFR non-mutated group,36 we found, consistent with several other studies,37,38 that EGFR mutation was largely not significantly associated with GGO component in AIS&MIA and LUAD (Fig. 2d). ERBB2 (also known as HER2) is an oncogene whose mutation is associated with EGFR tyrosine kinase inhibitor resistance, poor prognosis, and tumor invasion.39,40 However, GGO-associated nodule of AIS&MIA had a higher ERBB2 mutation frequency than solid nodule of LUAD in our dataset (Fig. 2d). This was consistent with a previous study,41 which showed a higher frequency of ERBB2 mutations in AIS&MIA compared to LUAD. However, the low frequency of ERBB2 mutations and the small sample size of each subgroup after dividing our dataset into AIS&MIA and LUAD may make our conclusions less robust. A larger sample size is needed to validate this conclusion.

AIS/MIA and GGO-like LUAD tumors exhibit lower proliferative activity and are less densely clustered compared to pure solid LUAD. Consequently, accurate analysis of AIS/MIA and GGO-like LUAD requires higher tumor purity and sequencing depth.
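The dependence on tumor purity can be made concrete with the standard mixture relationship between purity and expected VAF. The sketch below is illustrative only and is not part of the study's pipeline: it assumes a clonal mutation, diploid normal cells, and known copy numbers, and the function name `expected_vaf` and its parameters are hypothetical.

```python
def expected_vaf(purity, mutant_copies=1, tumor_copy_number=2, normal_copy_number=2):
    """Expected variant allele frequency of a clonal somatic mutation.

    Assumptions (illustrative): every tumor cell carries `mutant_copies`
    mutated alleles out of `tumor_copy_number` total copies at the locus,
    and normal cells contribute only wild-type alleles.
    """
    if not 0 <= purity <= 1:
        raise ValueError("purity must be between 0 and 1")
    mutant_alleles = purity * mutant_copies
    total_alleles = purity * tumor_copy_number + (1 - purity) * normal_copy_number
    return mutant_alleles / total_alleles


# A heterozygous clonal mutation at two hypothetical purities.
vaf_low_purity = expected_vaf(0.3)
vaf_high_purity = expected_vaf(0.8)
```

Under these assumptions a heterozygous mutation in a sample with 30% tumor cells is expected at a much lower VAF than in a sample with 80% tumor cells, so low-purity lesions need deeper sequencing for the same number of supporting reads.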
Previous studies have employed micro-dissection to enhance tumor purity for AIS/MIA and targeted sequencing to increase sequencing depth.42–44 Our findings were consistent with these studies. For example, TMB was higher and EGFR and TP53 mutations were more common in LUAD than in AAH/AIS/MIA, while ERBB2 had a higher mutation frequency in the earlier stage, and EGFR-mutated cases showed a favorable prognosis.

Cell proliferation and matrix remodeling were less activated in GGO nodule than in solid nodule of LUAD. Proliferation-related genes, which were closely associated with patients’ prognosis, were expressed at lower levels in GGO nodule than in solid nodule in LUAD (Fig. 3c and d). The activity of proliferation-related pathways, including the G2M checkpoint, E2F targets, and mitotic spindle pathways, which contribute to early lung cancer progression,45 was significantly lower in GGO nodules than in solid nodules (Figs. 3e and 5a and c). Aberrantly activated G2M checkpoint and mitotic spindle pathways cause more tumor cells to enter the mitotic period (M period), while overexpression of several E2F targets (e.g. CCNE and CDC6) induces replicative stress and decreases genomic stability,46 and all of these can result in tumor cell proliferation. Tumor proliferation rate also increased significantly from GGO nodule to solid nodule (Fig. 4a). Therefore, we speculated that the relatively low activity of proliferation-related pathways may lead to slower growth of tumor cells in GGO of LUAD, which may explain why GGO nodule exhibits inert property and good prognosis. Consistent with proliferation rate, matrix remodeling, which has been shown to be closely associated with tumor invasion,47 increased over the pathological course of LUAD and was highest in solid nodule of LUAD (Fig. 4d).
It has been reported that matrix remodeling plays an important role in the transition from pre-invasive to invasive disease.10,13 We further demonstrated that matrix remodeling activity was significantly lower in GGO nodule than in solid nodule in LUAD (Fig. 4d).

The low malignancy and slow progression of GGO in LUAD compared to solid nodules is partly due to its own weak cell proliferative capacity and also depends on the strong activity of immune cells in the tumor microenvironment. We found that immunosuppression was significantly enhanced from AIS&MIA to LUAD, reaching its highest level in solid nodule of LUAD. Consistent with a previous report,48 we also observed an increase in immune inhibition from pre-invasive to invasive LUAD. In addition, immunosuppression was significantly increased from GGO to solid nodule in LUAD (Fig. 4b).

To quantify the transition from GGO in LUAD with low malignancy to solid nodules in LUAD with high malignancy, we constructed a gene signature called G2S, which originated from the comparison of pure GGO nodule and pure solid nodule. The gene signature (G2S), reflecting the GGO to solid or lepidic-predominant to solid-predominant process, was an independent predictor of patients’ OS and RFS in 11 independent LUAD cohorts (Fig. 6g and i). Radiological subtype (pGGO, mGGO and solid) and histological subtype (lepidic, acinar, papillary, micropapillary, solid, etc.) both were vital to guide the prognostic stratification and therapy for LUAD patients.2,4,49–51 Therefore, G2S could be a good complement to radiological and histological subtypes.

It should be noted that there were some limitations in our study. First, our cohort contained only one case of solid nodule in AIS&MIA and only three cases of pGGO in LUAD, which may lead to less robust conclusions. Secondly, we could not obtain authentic mGGO containing both GGO and solid parts from fresh tumor samples.
Consequently, the specific contributions of the solid and subsolid components in the analysis of mGGO remain unclear. Nevertheless, these samples captured the molecular alterations of mGGO nodules as a whole, or of patients who present with mGGO radiologically, which are also worth studying, as the GGO part in mGGO may be different from pGGO, and the solid part in mGGO may also be different from solid nodules. We hope that methods will become available in the future to identify the radiological subtypes of fresh tumor tissue.

In summary, we comprehensively depicted the genomic profiling of different GGO component nodules of AIS&MIA and LUAD. We found that the molecular characteristics of GGO in LUAD are similar to those of AIS/MIA with low malignancy, which may be one of the reasons for the indolent state and good prognosis of GGO. Therefore, studies of GGO-like nodule should exclude the influence of pathological differences. Our results demonstrated that even though both AIS&MIA and LUAD were GGO-like nodule, there were significant differences in TMB, APOBEC-related mutation, gene mutation frequency, CNV frequency and gene expression between AIS&MIA and LUAD. In LUAD, lower TMB and TP53 mutation frequency were found in GGO nodule than in solid nodule. Almost no significant difference in genomic events was found between pGGO and mGGO of AIS&MIA. Proliferation rate and matrix remodeling were less active in GGO nodule than in solid nodule. To accurately identify whether a specific molecular event is associated with GGO components or pathological stage, one needs to control the other variable, i.e. radiology or pathology. This is what our study design uniquely offered.

Contributors
YZheng, HC, LS, JS and HJ conceived the study. JS, HJ and JY analyzed the data. YZhao and JL collected and interpreted clinical data. JL and HJ performed the flow cytometry experiments and analyzed the data. JS and HJ drafted the manuscript.
LS, HC and YZheng revised the manuscript and supervised the work. All authors reviewed and approved the manuscript. HJ and JS have accessed and verified the data.

Data sharing statement
WES and RNA-seq raw data have been deposited in the National Omics Data Encyclopedia (NODE) (https://www.biosino.org/node) with the accession number OEP000325. RNA-seq data of TCGA lung adenocarcinoma were downloaded from GDC (https://portal.gdc.cancer.gov/). All gene expression microarray data and corresponding clinical phenotypes were obtained from GEO (https://www.ncbi.nlm.nih.gov/geo/).

Declaration of interests
The authors declare no competing financial interests.

Acknowledgements
This study was supported in part by the National Natural Science Foundation of China (32170657), the National Natural Science Foundation of China (82203037), and Shanghai Sailing Program (22YF1408900). Fig. 1 was created with Biorender.com.

Appendix A. Supplementary data
Supplementary data related to this article can be found at https://doi.org/10.1016/j.ebiom.2023.104728.

References
1 Hansell DM, Bankier AA, MacMahon H, McLoud TC, Muller NL, Remy J. Fleischner society: glossary of terms for thoracic imaging. Radiology. 2008;246(3):697.
2 Suzuki K, Koike T, Asakawa T, et al. A prospective radiological study of thin-section computed tomography to predict pathological noninvasiveness in peripheral clinical IA lung cancer (Japan Clinical Oncology Group 0201). J Thorac Oncol. 2011;6(4):751–756.
3 Board WCoTE. WHO classification of tumours: thoracic tumours. 5th ed. 2021.
4 Fu F, Zhang Y, Wen Z, et al. Distinct prognostic factors in patients with stage I non–small cell lung cancer with radiologic part-solid or solid lesions. J Thorac Oncol. 2019;14(12):2133–2142.
5 Fan L, Wang Y, Zhou Y, et al. Lung cancer screening with low-dose CT: baseline screening results in Shanghai. Acad Radiol. 2019;26(10):1283–1291.
6 Zhang Y, Jheon S, Li H, et al.
Results of low-dose computed tomography as a regular health examination among Chinese hospital employees. J Thorac Cardiovasc Surg. 2020;160(3):824–831.e4.
7 Aokage K, Miyoshi T, Ishii G, et al. Influence of ground glass opacity and the corresponding pathological findings on survival in patients with clinical stage I non–small cell lung cancer. J Thorac Oncol. 2018;13(4):533–542.
8 Chen K, Bai J, Reuben A, et al. Multiomics analysis reveals distinct immunogenomic features of lung cancer with ground-glass opacity. Am J Respir Crit Care Med. 2021;204(10):1180–1192.
9 Li Y, Li X, Li H, et al. Genomic characterisation of pulmonary subsolid nodules: mutational landscape and radiological features. Eur Respir J. 2020;55(2):1901409.
10 Altorki NK, Borczuk AC, Harrison S, et al. Global evolution of the tumor microenvironment associated with progression from preinvasive invasive to invasive human lung adenocarcinoma. Cell Rep. 2022;39(1):110639.
11 Xing X, Yang F, Huang Q, et al. Decoding the multicellular ecosystem of lung adenocarcinoma manifested as pulmonary subsolid nodules by single-cell RNA sequencing. Sci Adv. 2021;7(5):eabd9738.
12 Chen H, Carrot-Zhang J, Zhao Y, et al. Genomic and immune profiling of pre-invasive lung adenocarcinoma. Nat Commun. 2019;10(1):5472.
13 Shang J, Zhao Y, Jiang H, et al. Poor prognosis of stage I lung adenocarcinoma patients determined by elevated expression over pre/minimally invasive status of COL11A1 and THBS2 in the focal adhesion pathway. medRxiv. 2021:2021.12.16.21267913. https://doi.org/10.1101/2021.12.16.21267913.
14 Ye T, Deng L, Wang S, et al. Lung adenocarcinomas manifesting as radiological part-solid nodules define a special clinical subtype. J Thorac Oncol. 2019;14(4):617–627.
15 Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: genome-wide copy number detection and visualization from targeted DNA sequencing. PLoS Comput Biol. 2016;12(4):e1004873.
16 Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12(4):R41.
17 Mayakonda A, Lin DC, Assenov Y, Plass C, Koeffler HP. Maftools: efficient and comprehensive analysis of somatic variants in cancer. Genome Res. 2018;28(11):1747–1756.
18 Roberts SA, Lawrence MS, Klimczak LJ, et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nat Genet. 2013;45(9):970–976.
19 Shi L, Reid LH, Jones WD, et al. The MicroArray quality control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol. 2006;24(9):1151–1161.
20 Bagaev A, Kotlov N, Nomie K, et al. Conserved pan-cancer microenvironment subtypes predict response to immunotherapy. Cancer Cell. 2021;39(6):845–865.e7.
21 Newman AM, Liu CL, Green MR, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods. 2015;12(5):453–457.
22 Alexandrov LB, Kim J, Haradhvala NJ, et al. The repertoire of mutational signatures in human cancer. Nature. 2020;578(7793):94–101.
23 Pfeifer GP, Denissenko MF, Olivier M, Tretyakova N, Hecht SS, Hainaut P. Tobacco smoke carcinogens, DNA damage and p53 mutations in smoking-associated cancers. Oncogene. 2002;21(48):7435–7451.
24 Kucab JE, Zou X, Morganella S, et al. A compendium of mutational signatures of environmental agents. Cell. 2019;177(4):821–836.e16.
25 Ganesan A-P, Johansson M, Ruffell B, et al. Tumor-infiltrating regulatory T cells inhibit endogenous cytotoxic T cell responses to lung adenocarcinoma. J Immunol. 2013;191(4):2009–2017.
26 Nicholson AG, Tsao MS, Beasley MB, et al. The 2021 WHO classification of lung tumors: impact of advances since 2015. J Thorac Oncol. 2022;17(3):362–387.
27 Travis WD, Brambilla E, Nicholson AG, et al. The 2015 World Health Organization classification of lung tumors.
J Thorac Oncol. 2015;10(9):1243–1260.
28 Fujikawa R, Muraoka Y, Kashima J, et al. Clinicopathologic and genotypic features of lung adenocarcinoma characterized by the International Association for the study of lung cancer grading system. J Thorac Oncol. 2022;17(5):700–707.
29 Warth A, Muley T, Meister M, et al. The novel histologic International Association for the study of lung cancer/American Thoracic Society/European Respiratory Society Classification System of lung adenocarcinoma is a stage-independent predictor of survival. J Clin Oncol. 2012;30(13):1438–1446.
30 Travis WD, Brambilla E, Noguchi M, et al. International Association for the study of lung cancer/American Thoracic Society/European Respiratory Society International Multidisciplinary Classification of lung adenocarcinoma. J Thorac Oncol. 2011;6(2):244–285.
31 Hattori A, Matsunaga T, Hayashi T, Takamochi K, Oh S, Suzuki K. Prognostic impact of the findings on thin-section computed tomography in patients with subcentimeter non–small cell lung cancer. J Thorac Oncol. 2017;12(6):954–962.
32 Di Agostino S, Strano S, Emiliozzi V, et al. Gain of function of mutant p53: the mutant p53/NF-Y protein complex reveals an aberrant transcriptional mechanism of cell cycle regulation. Cancer Cell. 2006;10(3):191–202.
33 Di Agostino S, Sorrentino G, Ingallina E, et al. YAP enhances the pro-proliferative transcriptional activity of mutant p53 proteins. EMBO Rep. 2016;17(2):188–201.
34 Wang S-P, Wang W-L, Chang Y-L, et al. p53 controls cancer cell invasion by inducing the MDM2-mediated degradation of Slug. Nat Cell Biol. 2009;11(6):694–704.
35 Zhou G, Wang J, Zhao M, et al. Gain-of-function mutant p53 promotes cell growth and cancer cell metabolism via inhibition of AMPK activation. Mol Cell. 2014;54(6):960–974.
36 Hasegawa M, Sakai F, Ishikawa R, Kimura F, Ishida H, Kobayashi K. CT features of epidermal growth factor receptor-mutated adenocarcinoma of the lung: comparison with nonmutated adenocarcinoma.
J Thorac Oncol. 2016;11(6):819–826.
37 Rizzo S, Petrella F, Buscarino V, et al. CT radiogenomic characterization of EGFR, K-RAS, and ALK mutations in non-small cell lung cancer. Eur Radiol. 2016;26(1):32–42.
38 Glynn C, Zakowski MF, Ginsberg MS. Are there imaging characteristics associated with epidermal growth factor receptor and KRAS mutations in patients with adenocarcinoma of the lung with bronchioloalveolar features? J Thorac Oncol. 2010;5(3):344–348.
39 Wang SE, Narasanna A, Perez-Torres M, et al. HER2 kinase domain mutation results in constitutive phosphorylation and activation of HER2 and EGFR and resistance to EGFR tyrosine kinase inhibitors. Cancer Cell. 2006;10(1):25–38.
40 Wei XW, Gao X, Zhang XC, et al. Mutational landscape and characteristics of ERBB2 in non-small cell lung cancer. Thorac Cancer. 2020;11(6):1512–1521.
41 Zhang C, Zhang J, Xu FP, et al. Genomic landscape and immune microenvironment features of preinvasive and early invasive lung adenocarcinoma. J Thorac Oncol. 2019;14(11):1912–1923.
42 Hu X, Fujimoto J, Ying L, et al. Multi-region exome sequencing reveals genomic evolution from preneoplasia to lung adenocarcinoma. Nat Commun. 2019;10(1):2978.
43 Sivakumar S, Lucas FAS, McDowell TL, et al. Genomic landscape of atypical adenomatous hyperplasia reveals divergent modes to lung adenocarcinoma. Cancer Res. 2017;77(22):6119–6130.
44 Wang S, Du M, Zhang J, et al. Tumor evolutionary trajectories during the acquisition of invasiveness in early stage lung adenocarcinoma. Nat Commun. 2020;11(1):6083.
45 Mascaux C, Angelova M, Vasaturo A, et al. Immune evasion before tumour invasion in early lung squamous carcinogenesis. Nature. 2019;571(7766):570–575.
46 Kent LN, Leone G. The broken cycle: E2F dysfunction in cancer. Nat Rev Cancer. 2019;19(6):326–338.
47 Winkler J, Abisoye-Ogunniyan A, Metcalf KJ, Werb Z. Concepts of extracellular matrix remodelling in tumour progression and metastasis. Nat Commun. 2020;11(1):5120.
48 Dejima H, Hu X, Chen R, et al. Immune evolution from preneoplasia to invasive lung adenocarcinomas and underlying molecular features. Nat Commun. 2021;12(1):2722.
49 Deng C, Zheng Q, Zhang Y, et al. Validation of the Novel International Association for the study of lung cancer grading system for invasive pulmonary adenocarcinoma and association with common driver mutations. J Thorac Oncol. 2021;16(10):1684–1693.
50 Hou L, Wang T, Chen D, et al. Prognostic and predictive value of the newly proposed grading system of invasive pulmonary adenocarcinoma in Chinese patients: a retrospective multicohort study. Mod Pathol. 2022;35(6):749–756.
51 Saji H, Okada M, Tsuboi M, et al. Segmentectomy versus lobectomy in small-sized peripheral non-small-cell lung cancer (JCOG0802/WJOG4607L): a multicentre, open-label, phase 3, randomised, controlled, non-inferiority trial. Lancet. 2022;399(10335):1607–1617.
www.thelancet.com Vol 94 August, 2023