
Screening of key biomarkers of tendinopathy based on bioinformatics and machine learning algorithms

Yaxi Zhu ( [email protected] )
Xiangtan Central Hospital https://orcid.org/0000-0003-0709-1374

Jia Qiang Huang
Xiangtan Central Hospital

Yun Dong Zhou
Sai Nuo Xi Scientific Research Institute

Yu Yang Ming
Xiangtan Central Hospital

Zhao Zhuang
Weifang Medical University

Hong Xia
Xiangtan Central Hospital

Research Article

Keywords: Bioinformatics, machine learning, tendinopathy, algorithms, biomarkers

Posted Date: August 9th, 2021

DOI: https://doi.org/10.21203/rs.3.rs-792462/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Tendinopathy is a complex, multifaceted tendon disease that is often associated with overuse and, given its high prevalence, incurs significant health care costs. At present, the pathogenesis of tendinopathy is not fully elucidated and effective treatment is lacking. This study aims to explore the key genes, functional pathways, and immune infiltration characteristics involved in the occurrence and development of tendinopathy.

Methods

The gene expression profiles of GSE106292, GSE26051 and GSE167226 were downloaded from the GEO database. WGCNA was performed on the GSE106292 data set using the corresponding R package, and differential gene analysis was performed on the merged GSE26051 and GSE167226 data sets. GO and KEGG gene enrichment analysis and immune cell infiltration analysis were then performed. Lasso logistic regression, support vector machine recursive feature elimination (SVM-RFE) and Gaussian mixture model (GMM) algorithms were used to screen and identify early diagnostic genes.

Results

We obtained a total of 171 DEGs by combining WGCNA with the screening of differentially expressed genes. GO and KEGG enrichment analysis showed that these dysregulated genes were related to the mTOR, HIF-1, MAPK, NF-κB and VEGF signaling pathways. Immune infiltration analysis showed significant infiltration of M1 macrophages, activated mast cells and activated NK cells. MACROD1, identified using the Lasso, SVM-RFE and GMM algorithms, may be an early diagnostic gene.

Conclusions

Based on comprehensive bioinformatics analysis, we identified the potential early diagnostic gene MACROD1, key regulatory pathways and immune infiltration characteristics of tendinopathy. These key genes and pathways may be used as biomarkers and molecular therapeutic targets for early tendon injury to guide drug development and basic research.

Introduction

Tendinopathy is usually described as pathological change in injured and diseased tendons that leads to limb pain and reduced function, and it is characterized by abnormalities in the molecular structure, composition, and cellular matrix of tendons[1]. In recent years, the prevalence of tendinopathy has increased, leading to long-term or permanent limb dysfunction and loss of function in some patients[2]. Tendinopathy occurs in the extremities and accounts for 30 to 50 percent of musculoskeletal and motor system disease[3, 4]. The causes of tendinopathy include both intrinsic and extrinsic factors: extrinsic factors cause acute tendon injury, while chronic tendinopathy results from both extrinsic and intrinsic factors[5]. Previous studies have suggested that an imbalance of the normal homeostatic responses in tendon tissue, including immune cell infiltration, stromal cell dysfunction, cell apoptosis, oxidative stress and stromal dysfunction, together leads to early tendon lesions[1]. Unfortunately, tendinopathy is at present often diagnosed in the middle and late stages, and most treatment options target the later stages of the disease, including eccentric exercises, drug injections and surgical treatment[6–8]. Studies have shown that the evidence for the efficacy of most treatment regimens for tendinopathy remains inadequate. Thus, early diagnosis and treatment are the key to complete recovery. At present, there are few studies on the key genes and pathways that lead to tendinopathy in the early stage. However, understanding the key pathways that affect the regulation and dynamic homeostasis of the extracellular matrix in the early stage is very important for future targeted therapy of tendinopathy. More in-depth studies are therefore needed to clarify the molecular mechanisms of key genes, functional pathways, inflammation and immunity, so as to provide new insights for later studies of tendinopathy and molecular targeted drug therapy.

With the development of high-throughput omics, bioinformatics analysis has become an important tool for discovering the key genes and signaling pathways of many diseases[9, 10]. In this study, we downloaded raw data from the NCBI Gene Expression Omnibus (GEO) database. A co-expression network was constructed from the GSE106292 dataset and the related gene modules were identified. The GSE26051 and GSE167226 data sets were combined and differentially expressed genes were screened. We intersected the differentially expressed genes with the gene modules to obtain common differential genes, and then carried out Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. We then used genome-wide expression data from tendon tissue to analyze immune cell infiltration. We used the LASSO regression model, the support vector machine recursive feature elimination (SVM-RFE) machine learning algorithm, and the Gaussian mixture model (GMM) clustering algorithm to screen early diagnostic genes[11–13]. The objective of this study was to identify key genes and pathways that contribute to the occurrence and progression of tendinopathy at the molecular level, and to identify candidate genes that could be used as early diagnostic and therapeutic targets.

Material and Methods

Data collection and download

The GEO (Gene Expression Omnibus) database (http://www.ncbi.nlm.nih.gov/geo/) is an international public database that stores and freely provides microarray, second-generation sequencing and other high-throughput functional genomics data sets[14, 15]. We downloaded the GSE106292, GSE26051 and GSE167226 data sets using the GEOquery package in R. The GSE106292 dataset included 35 gene expression profiles of tendon, bone, muscle, cartilage and ligament[15]. The GSE26051 dataset included gene expression profiles of 23 patients with chronic tendinopathy and 23 normal tendons[16]. The GSE167226 dataset included gene expression profiles from 19 patients with tendinopathy[17]. Institutional Review Board approval was not required because the study was based on a public database and did not involve animal or human studies.

Data preprocessing and DEG screening

The original data of the GSE106292, GSE26051 and GSE167226 datasets were corrected and normalized by log2 transformation in R. Probes in the expression matrices were annotated using gene annotation files from the Bioconductor platform (http://www.bioconductor.org/)[18]. The expression matrices of the GSE26051 and GSE167226 datasets were then merged, and differences between batches were eliminated using the removeBatchEffect function of the limma package; differential expression in the merged data was analyzed with limma[19]. After genes with very low expression were deleted, the expression values were converted to log2-counts per million (logCPM), and linear regression was performed against the constructed contrast matrix. Moderated t values, F values and log-odds were computed by the empirical Bayes method, and genes meeting the filter |log2(FC)| > 1 and P value < 0.05 were retained as differentially expressed. For data visualization, we drew the volcano plot using the ggplot2 package[20].
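The screening rule itself is simple to state in code. The following is a minimal Python sketch of the filter |log2(FC)| > 1 and P value < 0.05, not the limma workflow used in this study. The six real rows reuse the logFC and P.Value figures reported in Table 1; GENE_X and GENE_Y are hypothetical entries added only to show what the filter rejects, and the function name screen_degs is an assumption for illustration.

```python
# Rows are (gene, log2 fold change, P value).
# The first six come from Table 1; GENE_X and GENE_Y are hypothetical.
records = [
    ("MACROD1", 1.167552501, 8.56e-07),
    ("CES2", 1.809870576, 1.07e-06),
    ("SKIV2L", 1.026460579, 1.44e-05),
    ("LYNX1", 1.148012232, 5.13e-05),
    ("MFSD10", -1.250111926, 0.000548936),
    ("PIEZO1", -1.112748319, 0.000765663),
    ("GENE_X", 0.40, 0.001),  # hypothetical: significant, but fold change too small
    ("GENE_Y", 2.10, 0.20),   # hypothetical: large fold change, not significant
]


def screen_degs(rows, lfc_cutoff=1.0, p_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and P < p_cutoff, labelled Up or Down."""
    selected = {}
    for gene, log_fc, p_value in rows:
        if abs(log_fc) > lfc_cutoff and p_value < p_cutoff:
            selected[gene] = "Up" if log_fc > 0 else "Down"
    return selected


degs = screen_degs(records)
for gene, direction in degs.items():
    print(gene, direction)
```

Both thresholds must hold at the same time, which is why a gene with a large fold change but a non-significant P value is still excluded.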

Gene WGCNA analysis

The expression matrix of the GSE106292 dataset was selected to construct the expression profile of genes potentially related to tendinopathy. The WGCNA package is an open-source and widely used R tool for identifying co-expression networks[21]. The Pearson correlation coefficient was calculated to construct the correlation matrix, and the soft threshold function was used to transform the correlation matrix into a weighted adjacency matrix. To obtain a co-expression network that balances scale independence against mean connectivity, both quantities were calculated over a range of soft threshold powers. We then transformed the adjacency matrix into a topological overlap matrix (TOM). Using 1 - TOM as the distance measure, we clustered genes into co-expression modules with a deepSplit value of 2, a minimum module size of 20, and a maximum module size of 5,000. To determine the association between co-expression modules and clinical characteristics, we set a merge height of 0.25 as the clustering criterion for similar modules.
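The two transformations described above can be illustrated on a toy example. The sketch below is written in plain Python rather than with the WGCNA R package, and it assumes the unsigned network form, a_ij = |cor(x_i, x_j)|^β, together with the standard topological overlap definition TOM_ij = (Σ_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij). The four expression vectors, the choice β = 4 and all function names are illustrative assumptions, not data or code from this study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor(x_i, x_j)| ** beta."""
    g = len(expr)
    return [[1.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
             for j in range(g)] for i in range(g)]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    # Connectivity of each gene, excluding its self-connection.
    k = [sum(adj[i][u] for u in range(g) if u != i) for i in range(g)]
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(g) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical expression of four genes across five samples.
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
]
adj = adjacency(expr, beta=4)
tom = topological_overlap(adj)
# 1 - TOM is the distance passed to hierarchical clustering.
dissimilarity = [[1.0 - v for v in row] for row in tom]
```

Raising the correlation to the power β suppresses weak correlations more strongly than strong ones, which is what pushes the network toward a scale-free degree distribution.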

GO and KEGG enrichment analysis

Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed using the R package clusterProfiler[22]. This covers annotating and mining the functions and pathways of the modules and identifying the dysfunctional modules together with their functions and pathways. A P value < 0.05 was considered significant, and the significant terms identified were ranked by gene count.
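Over-representation analyses of this kind are commonly based on a hypergeometric test. Assuming that is the test applied here, the P value of a term is the probability of seeing at least k term genes in a list of n genes when K of the N background genes belong to the term. The Python sketch below computes that tail probability directly. Only the list size of 171 genes comes from this study; the background size, term size and overlap are hypothetical numbers chosen for illustration.

```python
from math import comb


def enrichment_p(background, in_term, list_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of annotated background genes (N)
    in_term:    background genes annotated to the term (K)
    list_size:  genes in the query list (n)
    overlap:    query genes annotated to the term (k)
    """
    upper = min(in_term, list_size)
    total = comb(background, list_size)
    hits = sum(
        comb(in_term, i) * comb(background - in_term, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Small check that can be verified by hand: 1 - C(7,2)/C(10,2) = 24/45.
p_small = enrichment_p(background=10, in_term=3, list_size=2, overlap=1)

# Hypothetical term: 150 of 20000 background genes, 8 of them among the 171 DEGs.
p_term = enrichment_p(background=20000, in_term=150, list_size=171, overlap=8)
print(p_small, p_term)
```

In practice the raw P values are usually corrected for multiple testing across all tested terms before the 0.05 cutoff is applied.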

Analysis of immune cell infiltration

CIBERSORT is an online analysis tool for immune cell infiltration that has been applied to a variety of diseases, including osteoarthritis, lupus nephritis, atopic dermatitis, acne and rosacea[23-26]. At present, few studies have used the CIBERSORT method to study the characteristics of immune cell infiltration in tendinopathy. Based on genome-wide expression data, we used CIBERSORT to assess the proportions of immune cells in tendinopathy patients and in the normal population[27]. The characteristics of immune cell infiltration in tendinopathy were then displayed as heat maps drawn with ggplot2.

Screening and identification of gene prediction model for early diagnosis

Lasso logistic regression is a machine learning method that selects variables by finding the λ value with the smallest classification error[28]. From the processed GSE26051 and GSE167226 data, 75% of the samples were selected as the training set and 25% as the validation set, and the glmnet package of R was used to fit a binomial LASSO model on the training set. We plotted the receiver operating characteristic (ROC) curve and determined its AUC. The diagnostic value of the key genes was evaluated using the pROC package in R[29]. SVM-RFE is a machine learning method based on the support vector machine, which finds the best variables by successively eliminating the features ranked lowest by the SVM. We used the SVM-RFE algorithm for feature selection, using fivefold cross-validation to split the data, assigning random seeds, performing feature selection, and finally evaluating the result. We used the above two algorithms to screen the key genes of tendinopathy and took the key genes common to both. Finally, these common key genes were classified and screened with a Gaussian mixture model (GMM), a feasible screening method with good clustering performance[30]. Through repeated training of the Gaussian mixture model algorithm, we selected the best-performing cluster, in which the optimal AUC value, found in the fifth category, was 0.95, and identified the final candidate key genes.
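Two steps of this procedure, the random 75/25 split and the AUC calculation, can be sketched without any modelling library. The Python example below is illustrative only and does not reproduce the glmnet or pROC calls. The sample identifiers, labels and scores are hypothetical, and the AUC is computed as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half.

```python
import random


def split_samples(sample_ids, train_fraction=0.75, seed=0):
    """Shuffle the sample identifiers and split them into training and validation sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases (1) and controls (0)."""
    cases = [s for lab, s in zip(labels, scores) if lab == 1]
    controls = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# 46 + 19 = 65 samples in the merged data; the identifiers are hypothetical.
samples = [f"S{i:02d}" for i in range(65)]
train_ids, valid_ids = split_samples(samples)

# Hypothetical predicted probabilities for six validation samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(len(train_ids), len(valid_ids), auc(labels, scores))
```

Fixing the seed makes the split reproducible, which matters when the training and validation AUCs are compared across repeated runs.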

Results

Data preprocessing and the screening of differential genes

After gene annotation and standardization of the data, the GSE26051 data set contained 46 samples (2088 genes) and the GSE167226 data set contained 19 samples (19402 genes). After combining the data sets, we used R packages to analyze gene expression in the cleaned data. Figure 1A shows the DEG expression heat map. Differential analysis of gene counts (2088 genes) between patient and normal tissue samples was performed using the R package limma. We used the R package ggplot2 to draw the volcano plot of the differentially expressed genes. The screening conditions were |log2(FC)| > 1 and P value < 0.05. A total of 2995 DEGs were identified, including 2729 up-regulated genes and 266 down-regulated genes (Fig. 1B).

Construction of weighted co-expression network and identification of key modules

We performed WGCNA on the 20,226 annotated genes of data set GSE106292. We preprocessed the sample expression values and retained 15034 genes with a standard deviation > 0.5. With β = 16 for the adjacency matrix, we made the gene distribution conform to a scale-free network and built the sample tree and the soft threshold estimates according to connectivity. The vertical axes show the scale-free topology fitting index R² (the values in the SFT.R.sq column of the statistical results) (Fig. 2A) and the mean connectivity of the network (Fig. 2B). The negative correlation between k and p(k) (correlation coefficient 0.84) shows that the selected β = 16 meets the standard for establishing a scale-free gene network (Fig. 2C, D).

With the soft threshold power tested over 1–10 and β = 4 selected (scale-free R² = 0.83), we used the dynamic tree cut method to cut the tree into modules (minimum of 30 genes per module), with a cut height of 0.3 and a merge threshold of 0.25. Modules below this height were merged, that is, modules with a correlation higher than 0.75. After re-pruning the cluster tree, we identified 20 modules in the gene dendrogram and the heat map of module-trait correlations. We selected the two modules with the highest correlation as the target modules: Purple (ModuleTraitCor = 0.54, ModuleTraitPvalue = 9E-04) and Skyblue2 (ModuleTraitCor = 0.41, ModuleTraitPvalue = 0.02).

GO and KEGG enrichment analysis of differential genes

A total of 171 genes were obtained from the intersection of the module genes and the differential genes, including 150 up-regulated genes and 21 down-regulated genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of these DEGs was performed using the clusterProfiler package in R. For the up-regulated genes, the biological processes (BP) were mainly related to the regulation of mitochondrion organization, the regulation of apoptotic signaling pathway, the regulation of reactive oxygen species biosynthetic process, the regulation of tumor necrosis factor-mediated signaling pathway and other pathways. In terms of cellular component (CC), they were mainly related to peroxisome, microbody and ubiquitin ligase complex. In terms of molecular function (MF), they were mainly related to guanyl-nucleotide exchange factor activity, RNA helicase activity, GTPase regulator activity and phosphatidylinositol binding (Figure 4A, B). For the down-regulated genes, the biological processes (BP) were mainly related to protein acylation, positive regulation of protein-containing complex assembly, and I-kappaB kinase/NF-kappaB signaling. In terms of cellular component (CC), they were mainly related to catalytic step 2 spliceosome and cluster of actin-based cell projections. In terms of molecular function (MF), they were mainly related to small GTPase binding and Ras GTPase binding (Figure 4A, B). In the KEGG pathway enrichment, the up-regulated genes were mainly concentrated in the mTOR, HIF-1, MAPK, NF-κB, NOD-like receptor and VEGF signaling pathways. The down-regulated genes were mainly concentrated in the T cell receptor signaling pathway, the spliceosome, and the sphingolipid signaling pathway (Figure 4C, D).

Analysis of immune cell infiltration

Using the CIBERSORT algorithm, we obtained the CIBERSORT absolute score, which reflects the absolute content of 22 immune cell types in each sample. From the absolute scores, we calculated the proportion of each infiltrating cell type with the total infiltration set to 100%. We mapped the overall proportions of the 22 immune cell subgroups in tendinopathy samples and healthy controls. The results showed that M2 macrophages, resting mast cells, neutrophils, activated NK cells and regulatory T cells (Tregs) were the most highly infiltrating cells in all samples (Figure 5A). Heat maps of immune cell infiltration in the tendinopathy group and the control group showed significant enrichment of M1 macrophages, activated mast cells, γδ T cells, regulatory T cells (Tregs), neutrophils, and M0 macrophages (Figure 5B). We found that M2 macrophages, resting memory CD4+ T cells, memory B cells, resting mast cells, CD8+ T cells, activated NK cells and regulatory T cells (Tregs) were highly infiltrated in all samples (Figure 5C). Comparison of the two groups showed that the infiltration of regulatory T cells (Tregs), activated NK cells and naive B cells was higher in the tendinopathy group than in the control group (p < 0.05), while the infiltration of plasma cells and memory B cells was lower in the tendinopathy group (p < 0.05) (Figure 5D).
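The conversion from absolute scores to proportions summing to 100% is a simple normalization. The Python sketch below shows it for a single sample. The three cell types and their scores are hypothetical and stand in for the 22 cell types estimated by CIBERSORT; the function name to_proportions is an assumption for illustration.

```python
def to_proportions(abs_scores):
    """Rescale absolute infiltration scores so that they sum to 100 percent."""
    total = sum(abs_scores.values())
    if total <= 0:
        raise ValueError("sample has no positive infiltration score")
    return {cell: 100.0 * score / total for cell, score in abs_scores.items()}


# Hypothetical absolute scores for one sample (only three cell types shown).
sample_scores = {
    "Macrophages M2": 0.30,
    "Mast cells resting": 0.15,
    "NK cells activated": 0.05,
}
proportions = to_proportions(sample_scores)
for cell, pct in proportions.items():
    print(f"{cell}: {pct:.1f}%")
```

Because each sample is rescaled separately, the proportions describe the composition of the immune infiltrate rather than its overall amount.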

Screening and Identification of Key Genes in tendinopathy

We combined the GSE26051 and GSE167226 data sets and randomly divided all samples into a training set (75%) and a validation set (25%). We used the predict function to generate model predictions and evaluated them in the training set and the validation set separately. We constructed the LASSO model in the training set and selected λ = 18 to determine the key genes that can accurately predict early tendinopathy (Figure 6A). Based on the optimal λ value of 18, we obtained the LASSO coefficient profile of the differentially expressed genes (Figure 6B). The AUC was 0.981 in the training set and 0.807 in the validation set, which indicates that the constructed model performs well in both sets (Figure 6C). Finally, we obtained 18 genes with nonzero coefficients: MACROD1, CES2, GFER, MRPL52, SKIV2L, B3GNT4, LYNX1, C19orf57, SAFB2, NOM1, C7orf43, FRMPD1, MLPH, MFSD10, PIEZO1, FAM222A, PRG4 and POU2AF1.

At the same time, we used the multiple support vector machine recursive feature elimination (MSVM-RFE) algorithm to screen the 171 differential genes and obtained 167 key genes. The SVM model was built from the 171 variables; the lowest error rate (0.139) and the highest accuracy (0.861) were both reached with 167 variables. The red circle marks the point with the lowest error rate (Figure 7A) and the point with the highest accuracy (Figure 7B). We obtained the key genes common to the above two algorithms: MACROD1, CES2, GFER, MRPL52, SKIV2L, B3GNT4, LYNX1, C19orf57, SAFB2, NOM1, C7orf43, FRMPD1, MLPH, MFSD10, PIEZO1, FAM222A, PRG4 and POU2AF1 (Figure 7C).

To predict genes for the early diagnosis of tendinopathy more accurately, we used the model-based hierarchical agglomerative clustering method for classification, based on the Gaussian finite mixture model. We used a Gaussian mixture model (GMM) to classify miRNA clusters, and used logistic regression analysis to establish a combined model for predicting recurrence. At the same time, we constructed receiver operating characteristic (ROC) curves and calculated the AUC to evaluate the predictive value of the model, using a predictive miRNA signature model. We used the Gaussian mixture model algorithm to calculate and verify the best key genes among the 18 candidate key genes by constructing 262143 AUC models. Finally, we screened 6 potential key genes: MACROD1, CES2, SKIV2L, LYNX1, MFSD10 and PIEZO1 (Figure 8A). Fold change and P values were used to display waterfall plots of the differential genes, and the predictive mRNA signature model was described using Gaussian finite mixture model markers (Figure 8B). The 6 potential key genes are summarized in Table 1.

Table 1   The 6 potential key genes

Gene      logFC          AveExpr       t              P.Value       Type
MACROD1   1.167552501    5.712454732   5.41405282     8.56E-07      Up
CES2      1.809870576    7.075108564   5.355859588    1.07E-06      Up
SKIV2L    1.026460579    6.111613386   4.671443796    1.44E-05      Up
LYNX1     1.148012232    6.163685681   4.321936756    5.13E-05      Up
MFSD10    -1.250111926   5.106939598   -3.626082073   0.000548936   Down
PIEZO1    -1.112748319   6.687175452   -3.52261718    0.000765663   Down
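The figure of 262143 models equals 2^18 - 1, the number of non-empty subsets of 18 genes, which suggests that every possible gene combination was evaluated. That reading is an inference from the arithmetic rather than something stated in the methods. The Python sketch below only counts the combinations and fits no models; the gene names are the 18 LASSO candidates listed above.

```python
from itertools import combinations

genes = [
    "MACROD1", "CES2", "GFER", "MRPL52", "SKIV2L", "B3GNT4",
    "LYNX1", "C19orf57", "SAFB2", "NOM1", "C7orf43", "FRMPD1",
    "MLPH", "MFSD10", "PIEZO1", "FAM222A", "PRG4", "POU2AF1",
]

# Enumerate every non-empty gene subset, from single genes up to all 18.
n_models = sum(
    1
    for size in range(1, len(genes) + 1)
    for _ in combinations(genes, size)
)

# Closed form: each gene is either in or out, minus the empty set.
closed_form = 2 ** len(genes) - 1
print(n_models, closed_form)
```

The count doubles with each added candidate gene, so an exhaustive search of this kind is only practical after an earlier step such as LASSO has reduced the candidates to a small set.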

 

Discussion

Tendinopathy is not only a very common chronic disease but also one that lacks truly effective treatment[31]. It remains a major challenge among musculoskeletal diseases because of its large patient population, low cure rate, and high medical expenditure. However, the mechanism of its occurrence and development is not yet fully understood. At present, there are many hypotheses about the etiology of tendinopathy, including the biomechanical theory[32], the inflammation theory[31], the apoptosis theory[33], and the vascular or neurogenic theory[34]. Although these theoretical models closely link the basic science of tendinopathy with clinical applications, none of them can fully clarify the pathological mechanism of tendinopathy or the complex relationship between tendon pain and function. Although current research on tendinopathy covers many aspects, the key genes of early tendinopathy are rarely studied. We believe that early diagnosis and treatment are essential to prevent further progression of tendinopathy and to avoid the subsequent pathological cascade. In this study, we attempted to identify the key genes for early tendinopathy by comparing gene expression profiles between tendinopathy patients and normal control samples. We screened 171 candidate key genes through comprehensive bioinformatics analysis and performed enrichment analysis of their biological functions and signaling pathways. KEGG enrichment analysis showed that the HIF-1, VEGF, Wnt, MAPK and NF-κB signaling pathways were significantly up-regulated. At the same time, using multiple bioinformatics screening algorithms, we screened 6 key genes from the 171 candidate genes: MACROD1, CES2, SKIV2L, LYNX1, MFSD10 and PIEZO1. MACROD1 was the gene with the smallest P value, and we speculated that this gene might play an important role in the occurrence and development of tendinopathy. The findings of this study reveal potential key genes and therapeutic targets in the signaling pathways of tendinopathy, which may provide basic support for future research on disease molecular mechanisms and targeted drugs.

The MACROD1 gene has rarely been studied and its specific function is unclear. A few studies have demonstrated functions of MacroD1 in the nucleus and the cytoplasm, including binding and regulating the transcription factor ERα and NF-κB related proteins[35, 36]. Recent studies have shown that endogenous MacroD1 protein is highly enriched in mitochondria and is highly expressed in human and mouse skeletal muscle[37]. Mitochondria are organelles with abundant biological functions in cells[38]. They are not only factories that produce ATP but also participate in many biological processes, such as steroid biosynthesis[39], metal ion homeostasis[40, 41], immune cell activation and regulation[42], cell signaling[43], apoptosis[44] and inflammation[45]. At the same time, mitochondrial dysfunction can cause many diseases, such as atherosclerosis[46] and Alzheimer’s disease[47]. Mitochondria produce reactive oxygen species (ROS) during oxidative phosphorylation. When the accumulation of ROS exceeds the capacity of the cellular antioxidant defense system, it causes oxidative stress. Oxidative stress can cause ROS-mediated damage to molecules such as proteins, nucleic acids, and lipids[48]. Some studies suggest that oxidative stress is closely related to vascular disease[49], neurodegeneration[50], and cancer[51]. ROS can also regulate the cascade reaction in the MAPK signaling pathway by activating apoptosis signal-regulating kinase 1 (ASK1) to induce apoptosis[52, 53]. Therefore, based on the high enrichment of MacroD1 in mitochondria, its high expression in skeletal muscle, and the biological functions of mitochondria, we speculate that the involvement of the MACROD1 gene in early tendinopathy is related to a combined cascade of hypoxic microenvironment, inflammatory response, apoptosis and related processes.

The MACROD1 gene may trigger hypoxic microenvironment-mediated tendon inflammation through mitochondrial dysfunction. Hypoxic cell injury has been considered a basic mechanism of tendinopathy[54], and research suggests that hypoxia may be a potential cause of early tendinopathy. Under mechanical stimulation and injury, hypoxia promotes the release of inflammatory cytokines in human tendon cells, the expression of key apoptosis mediators, and the formation of blood vessels[55], and significantly affects the synthesis of collagen matrix[56]. Previous research suggested that chronic tendinopathy is caused by a degenerative process without inflammation. In a study of biopsies of ruptured tendinopathic tissue, no obvious inflammation was observed, and more than 85% of the biopsy specimens had almost no inflammatory cells. On the contrary, there was a marked increase in tissue degeneration, including thinning and disorientation of collagen fibers, myxoid degeneration, hyaluronic acid degeneration, chondroid metaplasia, tissue calcification, angiogenesis, and fatty infiltration[57, 58]. Interestingly, some recent studies have reached the opposite conclusion, that there is an early and important inflammatory response in the process of tendinopathy. Molloy et al. found by microarray analysis that the expression of inflammatory cell receptors and immunoglobulins was up-regulated in a rat model of supraspinatus tendon disease[59]. Using biopsy samples of torn tendon tissue, Matthews et al. showed that small tears have more pronounced inflammatory infiltration of macrophages and mast cells, reflecting more degenerative changes[60]. Interestingly, in our study, KEGG enrichment showed that the NF-κB signaling pathway and the MAPK signaling pathway were significantly up-regulated. Studies have shown that these two pathways are related to inflammation, apoptosis and ossification. For example, ERK regulates apoptosis, while p38 mediates apoptosis and inflammation[61]. There are also studies suggesting that the activation of NF-κB can induce the activation of the MAPK pathway[62]. Another study showed that TNF-α can induce TDSC inflammation and apoptosis, and promote the development of tendinopathy by up-regulating the activation of the MAPK and NF-κB pathways[63]. Our genome-wide analysis of differences in immune cell infiltration showed that the infiltration of M1 macrophages, activated mast cells, activated NK cells, and regulatory T cells (Tregs) increased in tendinopathy samples, while the infiltration of plasma cells and memory B cells decreased. Through a cross-sectional and case-control study, Maja et al. found that, compared with healthy tendons, macrophages, T lymphocytes, mast cells and natural killer cells were observed in most (52%-96%) biopsy specimens of chronic tendinopathy tissue[64]. This indicates that M1 macrophages, activated mast cells, and activated NK cells play an important role in the progression and treatment of tendinopathy. The MACROD1 gene may therefore induce inflammatory cell infiltration by mediating the hypoxic microenvironment.

Dysregulation of apoptosis is thought to be one of the causes of tendinopathy[33]. Apoptosis is a form of programmed cell death that plays a key role in tissue homeostasis. Abnormal apoptosis contributes to many diseases, such as autoimmune diseases and skeletal muscle degeneration[65]. Current research holds that cell apoptosis is crucial in the occurrence and development of tendinopathy[33]. The reasons include mechanical overuse of tendons[66], the hypoxic microenvironment[67], and oxidative stress[68]. In our study, the differential gene enrichment showed that the HIF-1 signaling pathway was significantly up-regulated. Hypoxia-inducible factor 1 (HIF-1), an oxygen-sensitive transcriptional activator[69], is a key regulator of cell apoptosis[70]. In hypoxia, HIF-1α can initiate cell apoptosis by inducing high concentrations of pro-apoptotic proteins. Vascular endothelial growth factor (VEGF) is a glycosylated protein of about 45 kDa, composed of two subunits connected by disulfide bonds, which can mediate angiogenesis. Previous studies have suggested that VEGF is highly expressed in degenerative tendinopathy, while its expression is almost completely down-regulated in healthy Achilles tendons[71]. Many factors cause high expression of VEGF in tendon cells, including hypoxia, inflammatory factors and mechanical stress load[72]. As for the mechanism of action of VEGF in tendinopathy, some studies have shown that binding of VEGF to its receptor VEGFR-2 promotes angiogenesis in tendon tissue by up-regulating the expression of matrix metalloproteinases (MMPs) and down-regulating the expression of tissue inhibitor of metalloproteinases-3 (TIMP-3) in tendon cells[72–75]. Dakin et al. studied tendon biopsies from symptomatic patients with tendinopathy or rupture and found that the combination of inflammatory infiltration and neovascularization promotes tendon rupture[76]. Interestingly, in our study, KEGG enrichment showed that the VEGF signaling pathway was significantly up-regulated, which leads us to speculate that VEGF plays an important role in the early pathogenesis of tendinopathy under hypoxic conditions.

In the development of tendinopathy, erroneous differentiation of tendon cells may promote the occurrence of the disease[77]. Hypoxia and inflammation are related to the occurrence of heterotopic endochondral ossification (HEO), but the specific molecular mechanism is unclear. It has been shown that a hypoxic environment can stimulate the differentiation of progenitor cells into cartilage during the development of the skeletal system[78]. Wang et al. proposed that cell hypoxia promotes heterotopic ossification by amplifying BMP signal transduction[79]. Shailesh et al. found that HIF-1α was significantly up-regulated in the chondrogenic differentiation stage in rat Achilles tendon and muscle ossification models and in an FOP mouse model[80]. Tendon-derived stem cells (TDSC) have the potential to differentiate into tendon cells, osteoblasts, chondrocytes and fibroblasts[81]. TDSC play an important role in the healing of tendon injuries. However, if the tendon does not heal properly, tendon ossification occurs and promotes the formation of tendinopathy[82–84]. Studies have shown that TNF-α can induce apoptosis of TDSC[85], and chronic tendinopathy is closely related to the up-regulation of TNF-α[86]. This is consistent with our finding that the differential genes were enriched in the regulation of tumor necrosis factor-mediated signaling pathway in the GO analysis. Some studies suggest that Wnt pathway mediators are expressed in chondrocyte-like cells and ossification deposits, and are related to endochondral ossification[87]. Differentiation of tendon stem cells into non-tendon cells, such as osteoblasts, may reduce the total number of tendon stem cells available for tendon repair and lead to failure of tendon healing. Studies have shown that activation of the Wnt pathway can promote calcification of related tissues, such as cardiovascular calcification[88, 89]. In our study, the Wnt pathway was significantly up-regulated. We believe that the Wnt pathway is activated and highly expressed in tendinopathy injuries. Wnt induces the osteogenic differentiation of tendon stem cells and promotes the differentiation of tendon cells in the wrong direction, which ultimately leads to tendinopathy[90]. Studies have shown that in the rat Achilles tendon injury model, the key regulators of the Wnt pathway and the Notch pathway are activated at the wound[91]. Therefore, the MACROD1 gene may promote the misdifferentiation of tendon cells by mediating the hypoxic microenvironment of the cells and ultimately lead to tendinopathy.

Based on the above research, we speculate that up-regulation of the MACROD1 gene may cause a hypoxic microenvironment and an oxidative stress response in early tendinopathy through mitochondrial dysfunction in tendinocytes, and then activate multiple signal pathways under the combined action of inflammatory cytokines and angiogenic factors, leading to apoptosis of tendon cells. These cascades ultimately lead to the development of chronic tendinopathy. Therefore, we speculate that the MACROD1 gene may be a potential key gene for early tendinopathy. It is necessary to clarify the function of the MACROD1 gene in mitochondria and the specific molecular mechanism by which it may contribute to the early occurrence of tendinopathy. Exploring the mechanism of action of the MACROD1 gene in mitochondria may reveal new therapeutic targets in molecular pathways, which may offer a promising treatment for tendinopathy.

However, our research still has certain limitations. First, the data sets we studied contain different populations of tendon tear patients and controls, which may affect the results of the study. In addition, there is a slight difference between the immune cell infiltration obtained by the whole-gene immune analysis and the immune cell infiltration reported in experimental studies, which may be caused by differences between stages of the disease. Finally, because our research data come from public databases, molecular, cellular and animal experiments are needed to verify the results of this research.

Conclusion

In conclusion, using a comprehensive bioinformatics analysis, we identified potential early key genes, key regulatory pathways and immune infiltration characteristics of tendinopathy. These findings may provide new insights into future drug development and the molecular mechanisms of tendon disease.

Declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Availability of data and materials

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE106292

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE26051

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE167226

Competing interests

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Funding

Not applicable

Authors' contributions   

ZYX and XH designed this study. ZYX and HJQ collected data for analysis and drafted the manuscript. ZYX, MYY, ZYD and ZZ analyzed the data. ZYX, HJQ and XH made significant contributions to the design of the research and created numbers. All authors read and approved the final manuscript.

Acknowledgements

Not applicable

References

1. Millar NL, Silbernagel KG, Thorborg K, Kirwan PD, Galatz LM, Abrams GD, Murrell GAC, McInnes IB, Rodeo SA: Tendinopathy. Nature reviews Disease primers 2021, 7(1):1.

2. Hopkins C, Fu SC, Chua E, Hu X, Rolf C, Mattila VM, Qin L, Yung PS, Chan KM: Critical review on the socio-economic impact of tendinopathy. Asia-Pacific journal of sports medicine, arthroscopy, rehabilitation and technology 2016, 4:9-20.

3. Riley G: Chronic tendon pathology: molecular basis and therapeutic implications. Expert reviews in molecular medicine 2005, 7(5):1-25.

4. Lui PP, Maffulli N, Rolf C, Smith RK: What are the validated animal models for tendinopathy? Scandinavian journal of medicine & science in sports 2011, 21(1):3-17.

5. Sharma P, Maffulli N: Basic biology of tendon injury and healing. The surgeon : journal of the Royal Colleges of Surgeons of Edinburgh and Ireland 2005, 3(5):309-316.

6. Alfredson H, Pietilä T, Jonsson P, Lorentzon R: Heavy-load eccentric calf muscle training for the treatment of chronic Achilles tendinosis. The American journal of sports medicine 1998, 26(3):360-366.

7. Camargo PR, Alburquerque-Sendín F, Salvini TF: Eccentric training as a new approach for rotator cuff tendinopathy: Review and perspectives. World journal of orthopedics 2014, 5(5):634-644.

8. Irby A, Gutierrez J, Chamberlin C, Thomas SJ, Rosen AB: Clinical management of tendinopathy: A systematic review of systematic reviews evaluating the effectiveness of tendinopathy treatments. Scandinavian journal of medicine & science in sports 2020, 30(10):1810-1826.

9. Huang X, Liu S, Wu L, Jiang M, Hou Y: High Throughput Single Cell RNA Sequencing, Bioinformatics Analysis and Applications. Advances in experimental medicine and biology 2018, 1068:33-43.

10. Wang T, Zheng X, Li R, Liu X, Wu J, Zhong X, Zhang W, Liu Y, He X, Liu W et al: Integrated bioinformatic analysis reveals YWHAB as a novel diagnostic biomarker for idiopathic pulmonary arterial hypertension. Journal of cellular physiology 2019, 234(5):6449-6462.

11. Li Z, Sillanpää MJ: Overview of LASSO-related penalized regression methods for quantitative trait mapping and genomic selection. TAG Theoretical and applied genetics Theoretische und angewandte Genetik 2012, 125(3):419-435.

12. Sanz H, Valim C, Vegas E, Oller JM, Reverter F: SVM-RFE: selection and visualization of the most relevant features through non-linear kernels. BMC bioinformatics 2018, 19(1):432.

13. Zhao Y, Shrivastava AK, Tsui KL: Regularized Gaussian Mixture Model for High-Dimensional Clustering. IEEE transactions on cybernetics 2019, 49(10):3677-3688.

14. Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M et al: NCBI GEO: archive for functional genomics data sets--update. Nucleic acids research 2013, 41(Database issue):D991-995.

15. Hicks MR, Hiserodt J, Paras K, Fujiwara W, Eskin A, Jan M, Xi H, Young CS, Evseenko D, Nelson SF et al: ERBB3 and NGFR mark a distinct skeletal muscle progenitor cell in human development and hPSCs. Nature cell biology 2018, 20(1):46-57.

16. Ferguson GB, Van Handel B, Bay M, Fiziev P, Org T, Lee S, Shkhyan R, Banks NW, Scheinberg M, Wu L et al: Mapping molecular landmarks of human skeletal ontogeny and pluripotent stem cell-derived articular chondrocytes. Nature communications 2018, 9(1):3634.

17. Jelinsky SA, Rodeo SA, Li J, Gulotta LV, Archambault JM, Seeherman HJ: Regulation of gene expression in human tendinopathy. BMC musculoskeletal disorders 2011, 12:86.

18. Gautier L, Cope L, Bolstad BM, Irizarry RA: affy--analysis of Affymetrix GeneChip data at the probe level. Bioinformatics (Oxford, England) 2004, 20(3):307-315.

19. Parker HS, Leek JT, Favorov AV, Considine M, Xia X, Chavan S, Chung CH, Fertig EJ: Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction. Bioinformatics (Oxford, England) 2014, 30(19):2757-2763.

20. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK: limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic acids research 2015, 43(7):e47.

21. Langfelder P, Horvath S: WGCNA: an R package for weighted correlation network analysis. BMC bioinformatics 2008, 9:559.

22. Yu G, Wang LG, Han Y, He QY: clusterProfiler: an R package for comparing biological themes among gene clusters. Omics : a journal of integrative biology 2012, 16(5):284-287.

23. Deng YJ, Ren EH, Yuan WH, Zhang GZ, Wu ZL, Xie QQ: GRB10 and E2F3 as Diagnostic Markers of Osteoarthritis and Their Correlation with Immune Infiltration. Diagnostics (Basel, Switzerland) 2020, 10(3).

24. Cao Y, Tang W, Tang W: Immune cell infiltration characteristics and related core genes in lupus nephritis: results from bioinformatic analysis. BMC immunology 2019, 20(1):37.

25. Félix Garza ZC, Lenz M, Liebmann J, Ertaylan G, Born M, Arts ICW, Hilbers PAJ, van Riel NAW: Characterization of disease-specific cellular abundance profiles of chronic inflammatory skin conditions from deconvolution of biopsy samples. BMC medical genomics 2019, 12(1):121.

26. Yang L, Shou YH, Yang YS, Xu JH: Elucidating the immune infiltration in acne and its comparison with rosacea by integrated bioinformatics analysis. PloS one 2021, 16(3):e0248650.

27. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA: Robust enumeration of cell subsets from tissue expression profiles. Nature methods 2015, 12(5):453-457.

28. Antonacci Y, Toppi J, Mattia D, Pietrabissa A, Astolfi L: Single-trial Connectivity Estimation through the Least Absolute Shrinkage and Selection Operator. Annual International Conference of the IEEE Engineering in Medicine and Biology Society IEEE Engineering in Medicine and Biology Society Annual International Conference 2019, 2019:6422-6425.

29. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Müller M: pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC bioinformatics 2011, 12:77.

30. Ficklin SP, Dunwoodie LJ, Poehlman WL, Watson C, Roche KE, Feltus FA: Discovering Condition-Specific Gene Co-Expression Patterns Using Gaussian Mixture Models: A Cancer Case Study. Scientific reports 2017, 7(1):8617.

31. Rees JD, Stride M, Scott A: Tendons--time to revisit inflammation. British journal of sports medicine 2014, 48(21):1553-1557.

32. Almekinders LC, Weinhold PS, Maffulli N: Compression etiology in tendinopathy. Clinics in sports medicine 2003, 22(4):703-710.

33. Xu Y, Murrell GA: The basic science of tendinopathy. Clinical orthopaedics and related research 2008, 466(7):1528-1538.

34. Riley G: The pathogenesis of tendinopathy. A molecular perspective. Rheumatology (Oxford, England) 2004, 43(2):131-142.

35. Han WD, Zhao YL, Meng YG, Zang L, Wu ZQ, Li Q, Si YL, Huang K, Ba JM, Morinaga H et al: Estrogenically regulated LRP16 interacts with estrogen receptor alpha and enhances the receptor's transcriptional activity. Endocrine-related cancer 2007, 14(3):741-753.

36. Wu Z, Li Y, Li X, Ti D, Zhao Y, Si Y, Mei Q, Zhao P, Fu X, Han W: LRP16 integrates into NF-κB transcriptional complex and is required for its functional activation. PloS one 2011, 6(3):e18157.

37. Agnew T, Munnur D, Crawford K, Palazzo L, Mikoč A, Ahel I: MacroD1 Is a Promiscuous ADP-Ribosyl Hydrolase Localized to Mitochondria. Frontiers in microbiology 2018, 9:20.

38. McBride HM, Neuspiel M, Wasiak S: Mitochondria: more than just a powerhouse. Current biology : CB 2006, 16(14):R551-560.

39. Chien Y, Rosal K, Chung BC: Function of CYP11A1 in the mitochondria. Molecular and cellular endocrinology 2017, 441:55-61.

40. Bravo-Sagua R, Parra V, López-Crisosto C, Díaz P, Quest AF, Lavandero S: Calcium Transport and Signaling in Mitochondria. Comprehensive Physiology 2017, 7(2):623-634.

41. Paul BT, Manz DH, Torti FM, Torti SV: Mitochondria and Iron: current questions. Expert review of hematology 2017, 10(1):65-79.

42. Liu PS, Ho PC: Mitochondria: A master regulator in macrophage and T cell immunity. Mitochondrion 2018, 41:45-50.

43. Blajszczak C, Bonini MG: Mitochondria targeting by environmental stressors: Implications for redox cellular signaling. Toxicology 2017, 391:84-89.

44. Jeong SY, Seol DW: The role of mitochondria in apoptosis. BMB reports 2008, 41(1):11-22.

45. Kolmychkova KI, Zhelankin AV, Karagodin VP, Orekhov AN: Mitochondria and inflammation. Patologicheskaia fiziologiia i eksperimental'naia terapiia 2016, 60(4):114-121.

46. Suárez-Rivero JM, Pastor-Maldonado CJ, Povea-Cabello S, Álvarez-Córdoba M, Villalón-García I, Talaverón-Rey M, Suárez-Carrillo A, Munuera-Cabeza M, Sánchez-Alcázar JA: From Mitochondria to Atherosclerosis: The Inflammation Path. Biomedicines 2021, 9(3).

47. Kerr JS, Adriaanse BA, Greig NH, Mattson MP, Cader MZ, Bohr VA, Fang EF: Mitophagy and Alzheimer's Disease: Cellular and Molecular Mechanisms. Trends in neurosciences 2017, 40(3):151-166.

48. Ray PD, Huang BW, Tsuji Y: Reactive oxygen species (ROS) homeostasis and redox regulation in cellular signaling. Cellular signalling 2012, 24(5):981-990.

49. Paravicini TM, Touyz RM: Redox signaling in hypertension. Cardiovascular research 2006, 71(2):247-258.

50. Shukla V, Mishra SK, Pant HC: Oxidative stress in neurodegeneration. Advances in pharmacological sciences 2011, 2011:572634.

51. Trachootham D, Alexandre J, Huang P: Targeting cancer cells by ROS-mediated mechanisms: a radical therapeutic approach? Nature reviews Drug discovery 2009, 8(7):579-591.

52. Tobiume K, Matsuzawa A, Takahashi T, Nishitoh H, Morita K, Takeda K, Minowa O, Miyazono K, Noda T, Ichijo H: ASK1 is required for sustained activations of JNK/p38 MAP kinases and apoptosis. EMBO reports 2001, 2(3):222-228.

53. Ichijo H, Nishida E, Irie K, ten Dijke P, Saitoh M, Moriguchi T, Takagi M, Matsumoto K, Miyazono K, Gotoh Y: Induction of apoptosis by ASK1, a mammalian MAPKKK that activates SAPK/JNK and p38 signaling pathways. Science (New York, NY) 1997, 275(5296):90-94.

54. Kannus P: Etiology and pathophysiology of chronic tendon disorders in sports. Scandinavian journal of medicine & science in sports 1997, 7(2):78-85.

55. Järvinen TA: Neovascularisation in tendinopathy: from eradication to stabilisation? British journal of sports medicine 2020, 54(1):1-2.

56. Millar NL, Reilly JH, Kerr SC, Campbell AL, Little KJ, Leach WJ, Rooney BP, Murrell GA, McInnes IB: Hypoxia: a critical regulator of early human tendinopathy. Annals of the rheumatic diseases 2012, 71(2):302-310.

57. Maffulli N, Barrass V, Ewen SW: Light microscopic histology of achilles tendon ruptures. A comparison with unruptured tendons. The American journal of sports medicine 2000, 28(6):857-863.

58. Hashimoto T, Nobuhara K, Hamada T: Pathologic evidence of degeneration as a primary cause of rotator cuff tear. Clinical orthopaedics and related research 2003(415):111-120.

59. Molloy TJ, Kemp MW, Wang Y, Murrell GA: Microarray analysis of the tendinopathic rat supraspinatus tendon: glutamate signaling and its potential role in tendon degeneration. Journal of applied physiology (Bethesda, Md : 1985) 2006, 101(6):1702-1709.

60. Matthews TJ, Hand GC, Rees JL, Athanasou NA, Carr AJ: Pathology of the torn rotator cuff tendon. Reduction in potential for repair as tear size increases. The Journal of bone and joint surgery British volume 2006, 88(4):489-495.

61. Zhou J, Du T, Li B, Rong Y, Verkhratsky A, Peng L: Crosstalk Between MAPK/ERK and PI3K/AKT Signal Pathways During Brain Ischemia/Reperfusion. ASN neuro 2015, 7(5).

62. Wang L, Lee W, Cui YR, Ahn G, Jeon YJ: Protective effect of green tea catechin against urban fine dust particle-induced skin aging by regulation of NF-κB, AP-1, and MAPKs signaling pathways. Environmental pollution (Barking, Essex : 1987) 2019, 252(Pt B):1318-1324.

63. Moqbel SAA, Xu K, Chen Z, Xu L, He Y, Wu Z, Ma C, Ran J, Wu L, Xiong Y: Tectorigenin Alleviates Inflammation, Apoptosis, and Ossification in Rat Tendon-Derived Stem Cells via Modulating NF-Kappa B and MAPK Pathways. Frontiers in cell and developmental biology 2020, 8:568894.

64. Kragsnaes MS, Fredberg U, Stribolt K, Kjaer SG, Bendix K, Ellingsen T: Stereological quantification of immune-competent cells in baseline biopsy specimens from achilles tendons: results from patients with chronic tendinopathy followed for more than 4 years. The American journal of sports medicine 2014, 42(10):2435-2445.

65. Elmore S: Apoptosis: a review of programmed cell death. Toxicologic pathology 2007, 35(4):495-516.

66. Egerbacher M, Arnoczky SP, Caballero O, Lavagnino M, Gardner KL: Loss of homeostatic tension induces apoptosis in tendon cells: an in vitro study. Clinical orthopaedics and related research 2008, 466(7):1562-1568.

67. Kannus P, Natri A: Etiology and pathophysiology of tendon ruptures in sports. Scandinavian journal of medicine & science in sports 1997, 7(2):107-112.

68. Murrell GA, Szabo C, Hannafin JA, Jang D, Dolan MM, Deng XH, Murrell DF, Warren RF: Modulation of tendon healing by nitric oxide. Inflammation research : official journal of the European Histamine Research Society [et al] 1997, 46(1):19-27.

69. Ke Q, Costa M: Hypoxia-inducible factor-1 (HIF-1). Molecular pharmacology 2006, 70(5):1469-1480.

70. Greijer AE, van der Wall E: The role of hypoxia inducible factor 1 (HIF-1) in hypoxia induced apoptosis. Journal of clinical pathology 2004, 57(10):1009-1014.

71. Pufe T, Petersen W, Tillmann B, Mentlein R: The angiogenic peptide vascular endothelial growth factor is expressed in foetal and ruptured tendons. Virchows Archiv : an international journal of pathology 2001, 439(4):579-585.

72. Pufe T, Petersen WJ, Mentlein R, Tillmann BN: The role of vasculature and angiogenesis for the pathogenesis of degenerative tendons disease. Scandinavian journal of medicine & science in sports 2005, 15(4):211-222.

73. Wang H, Keiser JA: Vascular endothelial growth factor upregulates the expression of matrix metalloproteinases in vascular smooth muscle cells: role of flt-1. Circulation research 1998, 83(8):832-840.

74. Qi JH, Ebrahem Q, Moore N, Murphy G, Claesson-Welsh L, Bond M, Baker A, Anand-Apte B: A novel function for tissue inhibitor of metalloproteinases-3 (TIMP3): inhibition of angiogenesis by blockage of VEGF binding to VEGF receptor-2. Nature medicine 2003, 9(4):407-415.

75. Pufe T, Lemke A, Kurz B, Petersen W, Tillmann B, Grodzinsky AJ, Mentlein R: Mechanical overload induces VEGF in cartilage discs via hypoxia-inducible factor. The American journal of pathology 2004, 164(1):185-192.

76. Dakin SG, Newton J, Martinez FO, Hedley R, Gwilym S, Jones N, Reid HAB, Wood S, Wells G, Appleton L et al: Chronic inflammation is a feature of Achilles tendinopathy and rupture. British journal of sports medicine 2018, 52(6):359-367.

77. Rui YF, Lui PP, Chan LS, Chan KM, Fu SC, Li G: Does erroneous differentiation of tendon-derived stem cells contribute to the pathogenesis of calcifying tendinopathy? Chinese medical journal 2011, 124(4):606-610.

78. Mennan C, Garcia J, McCarthy H, Owen S, Perry J, Wright K, Banerjee R, Richardson JB, Roberts S: Human Articular Chondrocytes Retain Their Phenotype in Sustained Hypoxia While Normoxia Promotes Their Immunomodulatory Potential. Cartilage 2019, 10(4):467-479.

79. Wang H, Lindborg C, Lounev V, Kim JH, McCarrick-Walmsley R, Xu M, Mangiavini L, Groppe JC, Shore EM, Schipani E et al: Cellular Hypoxia Promotes Heterotopic Ossification by Amplifying BMP Signaling. Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research 2016, 31(9):1652-1665.

80. Agarwal S, Loder S, Brownley C, Cholok D, Mangiavini L, Li J, Breuler C, Sung HH, Li S, Ranganathan K et al: Inhibition of Hif1α prevents both trauma-induced and genetic heterotopic ossification. Proceedings of the National Academy of Sciences of the United States of America 2016, 113(3):E338-347.

81. Lui PP: A practical guide for the isolation and maintenance of stem cells from tendon. Methods in molecular biology (Clifton, NJ) 2015, 1212:127-140.

82. Chaudhury S: Mesenchymal stem cell applications to tendon healing. Muscles, ligaments and tendons journal 2012, 2(3):222-229.

83. Yee Lui PP, Wong YM, Rui YF, Lee YW, Chan LS, Chan KM: Expression of chondro-osteogenic BMPs in ossified failed tendon healing model of tendinopathy. Journal of orthopaedic research : official publication of the Orthopaedic Research Society 2011, 29(6):816-821.

84. Zhang Q, Zhou D, Wang H, Tan J: Heterotopic ossification of tendon and ligament. Journal of cellular and molecular medicine 2020, 24(10):5428-5437.

85. Han P, Cui Q, Yang S, Wang H, Gao P, Li Z: Tumor necrosis factor-α and transforming growth factor-β1 facilitate differentiation and proliferation of tendon-derived stem cells in vitro. Biotechnology letters 2017, 39(5):711-719.

86. Rath PC, Aggarwal BB: TNF-induced signaling in apoptosis. Journal of clinical immunology 1999, 19(6):350-364.

87. Kitagaki J, Iwamoto M, Liu JG, Tamamura Y, Pacifici M, Enomoto-Iwamoto M: Activation of beta-catenin-LEF/TCF signal pathway in chondrocytes stimulates ectopic endochondral ossification. Osteoarthritis and cartilage 2003, 11(1):36-43.

88. Shao JS, Cheng SL, Pingsterhaus JM, Charlton-Kachigian N, Loewy AP, Towler DA: Msx2 promotes cardiovascular calcification by activating paracrine Wnt signals. The Journal of clinical investigation 2005, 115(5):1210-1220.

89. Al-Aly Z, Shao JS, Lai CF, Huang E, Cai J, Behrmann A, Cheng SL, Towler DA: Aortic Msx2-Wnt calcification cascade is regulated by TNF-alpha-dependent signals in diabetic Ldlr-/- mice. Arteriosclerosis, thrombosis, and vascular biology 2007, 27(12):2589-2596.

90. Lui PP, Chan KM: Tendon-derived stem cells (TDSCs): from basic science to potential roles in tendon pathology and tissue engineering applications. Stem cell reviews and reports 2011, 7(4):883-897.

91. Molloy TJ, Wang Y, Horner A, Skerry TM, Murrell GA: Microarray analysis of healing rat Achilles tendon: evidence for glutamate signaling mechanisms and embryonic gene expression in healing tendon tissue. Journal of orthopaedic research : official publication of the Orthopaedic Research Society 2006, 24(4):842-855.

Figures

Figure 1

A Whole gene expression heat map of tendon tissue, with high expression in red and low expression in blue. B The DEG volcano map shows up-regulated genes in red and down-regulated genes in blue.

Figure 2

A Soft threshold (power) represents the weight, and the vertical axis shows the scale-free topology fitting index R². B Soft threshold (power) represents the weight, and the vertical axis shows the average connectivity of the network. C Distribution of node connectivity K. D Correlation graph of K and P(K).

Figure 3

A Clustering tree of all genes based on the dissimilarity measure (1-TOM). B Heat map of correlations between module feature genes and samples, with each cell containing the correlation coefficient and P value. C Expression heat map and feature vector histogram of the purple module. D Expression heat map and feature vector histogram of the skyblue2 module.

Figure 4

Baseball figure of differential gene enrichment analysis. The horizontal axis represents the proportion of differential genes in GO and KEGG enrichment analysis, and the vertical axis represents the enrichment category. A GO enrichment distribution of up-regulated differentially expressed genes. B GO enrichment distribution of down-regulated differentially expressed genes. C KEGG enrichment distribution of up-regulated differentially expressed genes. D KEGG enrichment distribution of down-regulated differentially expressed genes.

Figure 5

Infiltration patterns of immune cells in different groups. A Relative percentage of 22 immune cell subsets in tendon disease samples. B Heat map of immune cell infiltration between the tendinopathy group and the control group; green represents the tendinopathy group and red represents the control group. C Infiltration degree of 22 immune cell subsets in tendon disease samples. D Box plot of the difference in immune infiltration between the tendon disease group and the control group; green represents the tendon disease group and red represents the control group.

Figure 6

The potential key genes of tendinopathy were screened by the LASSO regression model. A Selection of the best parameter (λ) value in the constructed LASSO model. B LASSO coefficient profiles of the 18 differentially expressed genes selected by the optimal λ value. C Comparison of ROC curves between the training set and the validation set for the gene signature.

Figure 7

MSVM-RFE algorithm for screening key genes. A Error rate of the SVM model. B Accuracy of the SVM model. C Venn diagram of the key genes shared by the two algorithms.

Figure 8

Patterns of AUC for the 262143 logistic regression models, based on the Gaussian finite mixture model. A The pattern of each logistic regression model is related to its AUC score and is determined by the Gaussian mixture. B Waterfall diagrams of the 6 key genes among the differentially expressed genes.
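
A note on the figure of 262143 models: this number equals 2^18 − 1, the count of non-empty subsets of 18 items, which matches the 18 genes named in the Figure 6 caption. The captions do not state how the models were enumerated, so the following Python sketch only illustrates the arithmetic under the assumption that one logistic regression model was fitted per non-empty gene combination. The gene labels and function names are hypothetical placeholders, not taken from the study.

```python
from itertools import combinations
from math import comb

# Number of LASSO-selected genes, taken from the Figure 6 caption.
N_GENES = 18


def count_nonempty_subsets(n: int) -> int:
    """Count every non-empty combination of n candidate genes."""
    return sum(comb(n, k) for k in range(1, n + 1))


def iter_gene_subsets(genes):
    """Yield every non-empty combination of the given gene labels."""
    for k in range(1, len(genes) + 1):
        yield from combinations(genes, k)


# Placeholder labels (assumption): the real gene symbols are not listed here.
genes = [f"gene_{i}" for i in range(1, N_GENES + 1)]

total = count_nonempty_subsets(N_GENES)
print(total)  # equals 2**18 - 1
```

Under this assumption, each tuple yielded by `iter_gene_subsets(genes)` would define the predictor set of one candidate model, and the AUC of each model could then be grouped by a Gaussian mixture as the caption describes.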