LCLNCRdb: A Comprehensive Resource for Investigating long non-coding RNAs in Lung Cancer

Ayushi Dwivedi, Afrin Zulfia S, Mallikarjuna Thippana, Sai Nikhith Cholleti and Vaibhav Vindal*
Department of Biotechnology and Bioinformatics, School of Life Sciences, University of
Hyderabad, Hyderabad 500046, India
*Corresponding author:
Dr. Vaibhav Vindal
Professor,
Dept. of Biotechnology & Bioinformatics
School of Life Sciences,
University of Hyderabad
Hyderabad – 500046.
Telangana, INDIA.
Email: vaibhav@uohyd.ac.in
bioRxiv preprint doi: https://doi.org/10.1101/2025.02.14.638263; this version posted February 14, 2025. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Abstract
Lung cancer is a leading cause of death worldwide. It involves several molecular mechanisms that
are influenced by long non-coding RNAs (lncRNAs), a class of RNA molecules that do not code for
proteins. Several studies have revealed the importance of lncRNAs in the initiation and
progression of lung cancer and in the development of resistance to lung cancer therapy. However,
there are no centralized web resources or databases that collect and integrate information
regarding lung cancer-associated lncRNAs. This led to the development of LCLNCRdb, a manually
curated database that includes data from various sources, such as published research articles and
The Cancer Genome Atlas (TCGA) data portal. The database contains detailed information on 1102
lncRNAs that are differentially expressed in lung cancer patients, including lncRNA name, Entrez
ID, Ensembl ID, HGNC ID, NONCODE ID, lung cancer type, source, lncRNA expression pattern,
experimental techniques, network analysis, and survival analysis details. The database offers a
user-friendly platform for browsing, retrieving, and downloading data, and it features a dedicated
submission page for researchers to share newly identified lncRNAs related to lung cancer.
LCLNCRdb aims to enhance our knowledge of lncRNA deregulation in lung cancer and provides
a valuable and timely resource for lncRNA research. The database is freely accessible at
https://dbtcmi.in/tools/lclncrdb/main.html.
Keywords: Lung cancer, long non-coding RNAs, Clinical, Survival, Competing endogenous
RNA networks
1. Introduction
Lung cancer is a major health concern, causing millions of deaths annually. It is the second most
prevalent form of cancer and the leading cause of cancer-related deaths. Factors such as smoking,
genetics, and exposure to harmful substances contribute to its development. Non-small cell lung
cancer (NSCLC) is the most prevalent type, representing 85% of all cases, and its main subtypes
are adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC). Long non-coding RNAs
(lncRNAs) play crucial roles in the regulation of gene expression across multiple
levels, including epigenetic, transcriptional, and post-transcriptional processes. They can affect
chromosome structure, recruit chromatin-modifying enzymes, and interact with transcription
factors to either enhance or repress gene expression [1-4]. Additionally, they can affect mRNA
stability and translation and function as competing endogenous RNAs (ceRNAs) that sequester
microRNAs (miRNAs) [5]. Dysregulation of lncRNAs can contribute to abnormal gene expression
patterns, potentially leading to the development of various diseases, including cancer [6].
lncRNAs contribute to cancer progression, and various databases have addressed diverse aspects
of lung cancer progression. However, none has focused specifically on lncRNAs associated with
lung cancer progression. In lung cancer, dysregulated lncRNAs promote cell growth, migration,
and invasion, and inhibit apoptosis [7]. For example,
lncRNA DANCR is upregulated in various cancers and is associated with increased cell
proliferation and invasion. Similarly, MALAT1 and H19 have been implicated in lung cancer
development. The dysregulation of lncRNAs can also affect key signaling pathways and regulatory
networks, such as the p53 tumor suppressor pathway [6]. Therefore, lncRNAs are important
regulators of gene expression, and their dysregulation can lead to the disruption of cellular
homeostasis and lung cancer progression. The need for a database of lncRNAs associated with
lung cancer is underscored by a growing body of evidence highlighting their significant roles in
cancer biology. They participate in the regulation of gene expression at various levels and are
associated with the initiation, progression, and prognosis of lung cancer [8-9]. Furthermore,
lncRNAs have emerged as promising candidates for use as diagnostic and prognostic biomarkers,
as well as potential targets for therapeutic intervention [10-11]. The lack of a dedicated resource
presents a compelling case for a database that could consolidate current and future research
findings, facilitating a more comprehensive understanding of lncRNA functions and interactions
in lung cancer.
Several databases compile and organize data associated with lncRNAs and cancer. Among these,
Lnc2Cancer is a manually curated database that contains comprehensive information on the
mechanisms by which lncRNAs regulate cancer development [12-13]. Another database,
CRlncRNA, was created by Wang and focuses on the functional roles of cancer-related lncRNAs
[14]; it also provides information on the clinical and molecular characteristics of these lncRNAs.
In addition, LncRNADisease offers information about lncRNA-disease associations, along with
details on transcriptional regulatory relationships and a confidence score for each association
[15]. These databases are invaluable resources for researchers and clinicians seeking to understand
and explore the roles of lncRNAs in cancer. Nevertheless, none of them specifically addresses
lncRNAs linked to lung cancer, which represents a substantial gap in the available resources.
To address the lack of information on lncRNAs associated with lung cancer, a comprehensive
database designated LCLNCRdb was developed. This database includes 1102 differentially
expressed lncRNAs in lung cancer, which were manually curated from both the literature and
TCGA. LCLNCRdb offers a range of features, such as lncRNA expression patterns, target
information, type of lung cancer, source data, experimental techniques, survival analysis, and
network analysis. The database is user-friendly, allowing users to easily browse, retrieve, and
download data. In addition, users can submit newly validated lncRNAs
related to lung cancer. The LCLNCRdb was developed using the XAMPP webserver, HTML, PHP
8.2.0, JavaScript, MySQL, Bootstrap 5, and DataTables plug-in. The database is freely accessible
at https://dbtcmi.in/tools/lclncrdb/main.html and is a valuable resource for researchers studying
lung cancer and lncRNAs, with the potential to greatly advance research in this field.
2. Web resource content and methods
The data for transcriptomic profiling of LUAD and LUSC tumor and normal samples were
obtained using the TCGAbiolinks R package [16] from the TCGA-GDC portal. The projects were
TCGA-LUAD and TCGA-LUSC, comprising 598 and 551 samples, respectively. Of these, 537
LUAD and 502 LUSC samples were tumor-positive, whereas 59 LUAD and 49 LUSC samples
were normal. Following pre-processing to remove duplicates and low read count entries, gene
symbols were mapped to coding and non-coding entities to classify genes as mRNAs, lncRNAs,
and miRNAs. In addition, a comprehensive search of the PubMed database was conducted to
identify lncRNAs related to lung cancer using the entrez_search function of the rentrez R package
[17], covering articles up to January 24, 2024. Keywords such as "long non-coding RNA,"
"lncRNA," "long non-coding," and "lung cancer" were used to retrieve the information. The search
yielded 1924 hits reporting lncRNAs associated with lung cancer development, progression,
diagnosis, and treatment (Figure 1). Information on the lncRNAs was obtained from the HGNC
database, and their sequences were sourced from the Ensembl database. The GeneCards database
was used to determine their association with lung cancer and other forms of cancer. This
information was systematically stored and managed in MySQL tables.
Figure 1: Number of articles on lncRNA and lung cancer
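The literature search described above used the rentrez R package. As a rough illustration only, the same kind of request can be sketched in Python against the NCBI E-utilities ESearch endpoint. The way the keywords are combined with OR/AND, the `mindate` value, and the `retmax` value are assumptions made for this sketch; the exact query string used by the authors is not given in the text. The code only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(lncrna_terms, disease_term):
    """Join the lncRNA synonyms with OR, then require the disease term with AND.

    This boolean structure is an assumption, not the authors' documented query.
    """
    synonyms = " OR ".join(f'"{t}"' for t in lncrna_terms)
    return f'({synonyms}) AND "{disease_term}"'


def build_esearch_url(term, max_date, retmax=2000):
    """Return an ESearch URL limited to records published up to max_date."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",       # filter on publication date
        "mindate": "1900/01/01",  # ESearch expects mindate and maxdate together
        "maxdate": max_date,      # YYYY/MM/DD
        "retmax": retmax,         # hypothetical cap on returned IDs
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


term = build_pubmed_query(
    ["long non-coding RNA", "lncRNA", "long non-coding"], "lung cancer"
)
url = build_esearch_url(term, "2024/01/24")
print(term)
print(url)
```

Building the URL separately from sending it keeps the query reproducible and easy to record alongside the hit count.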
3. Analysis of transcriptomic data
3.1 Differential gene expression analysis
A differential gene expression analysis was carried out using the DESeq2 package [18] to identify
genes with significant changes in expression. Differentially expressed genes were selected using
an absolute log2 fold change (|log2FC|) of at least 2 and an adjusted P value of less than 0.05 as
filtering criteria. Differentially expressed genes were then categorized into protein-coding and
non-coding groups. Additionally, the regulatory networks of the target genes were examined using
differentially expressed non-coding elements such as DE-miRNA (DEM) and DE-lncRNA (DEL).
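The filtering step can be illustrated with a small Python sketch. It assumes the cutoff is applied to the absolute log2 fold change (at least 2) together with an adjusted P value below 0.05, and it uses the DESeq2 result column names `log2FoldChange` and `padj`. The gene names and values are invented for illustration and are not taken from the study.

```python
def is_differentially_expressed(log2fc, padj, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Apply the assumed cutoffs: |log2FC| >= 2 and adjusted P < 0.05."""
    if padj is None:  # DESeq2 can report NA for padj; treat it as not significant
        return False
    return abs(log2fc) >= lfc_cutoff and padj < padj_cutoff


def classify(results):
    """Map each gene that passes the filter to 'up' or 'down'."""
    selected = {}
    for row in results:
        if is_differentially_expressed(row["log2FoldChange"], row["padj"]):
            selected[row["gene"]] = "up" if row["log2FoldChange"] > 0 else "down"
    return selected


# Invented rows mimicking a DESeq2 results table.
results = [
    {"gene": "GENE_A", "log2FoldChange": 3.1, "padj": 0.001},    # passes, up
    {"gene": "GENE_B", "log2FoldChange": -2.4, "padj": 0.01},    # passes, down
    {"gene": "GENE_C", "log2FoldChange": 1.2, "padj": 0.0001},   # fold change too small
    {"gene": "GENE_D", "log2FoldChange": 2.8, "padj": 0.2},      # not significant
    {"gene": "GENE_E", "log2FoldChange": -5.0, "padj": None},    # padj missing
]

de_genes = classify(results)
print(de_genes)
```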
3.2 Target prediction and network construction
3.2.1 Prediction of target miRNAs of DE-lncRNA
In this study, 619 and 931 lncRNAs were identified as differentially expressed in LUAD and
LUSC, respectively, of which 448 were common between the two cancer types, resulting in a total
of 1,102 unique DE-lncRNAs. To identify their target miRNAs, two databases were utilized:
miRcode [19], which contains 10,000 long non-coding RNAs, and lncRNASNP2 [20], which
encompasses experimentally validated microRNA-long non-coding RNA interactions.
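The total of 1,102 follows from the counts above by inclusion-exclusion: 619 + 931 - 448 = 1,102. A minimal Python check, using placeholder identifiers rather than real lncRNA names, reproduces the arithmetic with sets:

```python
def union_size(n_a, n_b, n_shared):
    """Inclusion-exclusion for two sets: |A or B| = |A| + |B| - |A and B|."""
    return n_a + n_b - n_shared


# Placeholder IDs arranged so that exactly 448 are shared between the two sets.
luad = {f"lnc{i}" for i in range(619)}
lusc = {f"lnc{i}" for i in range(619 - 448, 619 - 448 + 931)}

shared = luad & lusc
unique = luad | lusc

print(len(luad), len(lusc), len(shared), len(unique))
print(union_size(619, 931, 448))
```

The same counts imply 171 lncRNAs specific to LUAD (619 - 448) and 483 specific to LUSC (931 - 448).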
3.2.2 Prediction of target mRNAs of DE-miRNA
The miRDB database [21], which comprises an extensive collection of predicted miRNA-mRNA
interactions, was used to identify mRNAs regulated by DE-miRNAs. To pinpoint specific
interactions, the analysis focused on predicted target mRNAs that were themselves differentially
expressed.
3.2.3 Target-gene interaction network construction
To construct the lncRNA-miRNA-mRNA competing endogenous RNA (ceRNA) network, the
DE-lncRNA-DE-miRNA and DE-miRNA-DE-mRNA target networks were integrated for both LUAD
and LUSC. This approach facilitates the understanding of complex regulatory mechanisms by
mapping the interactions between different RNA molecules in lung cancer subtypes. Integration
requires DE-miRNAs that appear in both target networks; there were 5 such DE-miRNAs for LUAD
and 6 for LUSC. Consequently, the network was constructed from only these 5 DE-miRNAs and
their corresponding target DE-mRNAs for LUAD, and 6 DE-miRNAs and their corresponding
target DE-mRNAs for LUSC.
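One way to picture this integration is as a join of the two edge lists on the shared miRNA. The sketch below is a simplified illustration under that assumption; the function name, edge lists, and identifiers are hypothetical and do not come from the LCLNCRdb data.

```python
from collections import defaultdict


def build_cerna_triples(lnc_mir_edges, mir_mrna_edges):
    """Join lncRNA-miRNA and miRNA-mRNA edges on the miRNA they share.

    Returns sorted (lncRNA, miRNA, mRNA) triples. A miRNA that is missing
    from either edge list cannot bridge the two networks and is dropped.
    """
    mrnas_by_mir = defaultdict(set)
    for mir, mrna in mir_mrna_edges:
        mrnas_by_mir[mir].add(mrna)

    triples = set()
    for lnc, mir in lnc_mir_edges:
        for mrna in mrnas_by_mir.get(mir, ()):
            triples.add((lnc, mir, mrna))
    return sorted(triples)


# Hypothetical toy edges.
lnc_mir = [("LNC1", "miR-a"), ("LNC2", "miR-a"), ("LNC2", "miR-b"), ("LNC3", "miR-c")]
mir_mrna = [("miR-a", "GENE1"), ("miR-b", "GENE2"), ("miR-d", "GENE3")]

triples = build_cerna_triples(lnc_mir, mir_mrna)
bridging_mirnas = sorted({mir for _, mir, _ in triples})
print(triples)
print(bridging_mirnas)
```

In this toy case only miR-a and miR-b bridge the two networks, which mirrors how a small number of DE-miRNAs can determine the final ceRNA network.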
3.3 Survival Analysis
Survival analysis was carried out to investigate the association of differentially expressed genes,
lncRNAs, and miRNAs with patient survival in the LUAD and LUSC groups. The lncExplore
database [22] was used to generate survival curves for patients with high or low expression levels
of DE-lncRNAs over time. The Kaplan-Meier plotter tool [23] was used to assess the prognostic
performance of DE-miRNAs and DE-mRNAs from the ceRNA network, with a Cox proportional
hazards ratio (HR) > 1 and a log-rank p-value < 0.05 as the cutoffs for identifying genes associated
with poor prognosis. Finally, the survival probability of patients with low or high expression levels
of DE-miRNAs and DE-mRNAs over time (in months) was compared.
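The survival curves in this study were produced with lncExplore and the Kaplan-Meier plotter. For readers unfamiliar with the underlying estimator, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation for two expression groups. The follow-up times and event flags are invented, and the sketch does not compute the hazard ratio or the log-rank test.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up time per patient (months)
    events : 1 if death was observed at that time, 0 if censored
    Returns [(time, survival_probability)] at each time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Invented follow-up data for a high-expression and a low-expression group.
high = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
low = kaplan_meier([10, 15, 24, 30, 36], [0, 1, 0, 1, 0])

for label, curve in (("high", high), ("low", low)):
    for month, prob in curve:
        print(f"{label}\t{month}\t{prob:.3f}")
```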
3.4 Database Construction
The database was constructed using the XAMPP web server, HTML, PHP 8.2.0, and JavaScript
for the front-end; MySQL for the backend to store database tables; and Bootstrap 5 for styling
purposes. Additionally, the DataTables plugin was employed to present tables with a large amount
of data in an organized manner.
4. Results
4.1 Data summary
The LCLNCRdb is a comprehensive database that contains information on 1,102 long non-coding
RNAs (lncRNAs) that exhibit differential expression patterns in lung cancer patients. In addition
to basic information, the database includes data on target-gene regulatory interactions and
predicted prognostic performance in patients with lung cancer.
4.2 Web interface and usage
4.2.1 User interface modules
The primary goal of LCLNCRdb is to identify and analyze lncRNAs that exhibit differential
expression in lung cancer. These lncRNAs have the potential to contribute significantly to lung
cancer development and progression and may serve as biomarkers or therapeutic targets. The
database provides information on 1,102 lncRNAs and their expression levels in lung
adenocarcinoma and squamous cell carcinoma. LCLNCRdb offers a user-friendly interface for
exploring differentially expressed lncRNAs, target networks, and survival analysis plots, and users
can download the information in CSV format from the download module. The platform provides
four user-friendly web interfaces to access the database.
4.2.2 Homepage
Our database comprises 1102 long non-coding RNAs (lncRNAs) that exhibit differential
expression profiles along with their corresponding target networks and survival analysis plots. The
homepage of our database, LCLNCRdb, features a fixed navigation bar at the top and displays the
name of the database, as shown in Figure 2.
Figure 2: Homepage of LCLNCRdb
4.2.3 Navigation Tabs
The database's user interface includes a navigation bar with six sections: Home, Search, Analysis,
Submit, Download, and About. The Search tab allows users to search for lncRNAs using gene
names, sequence information, or a list of lncRNAs. The Analysis tab offers
a network analysis of target genes and competing endogenous RNA (ceRNA) networks, as well as
survival analysis plots. The About tab provides information on how to use the database and
includes contact details, as illustrated in Figure 3.
Figure 3(a): Search module option of LCLNCRdb
Figure 3(b): Analysis module option of LCLNCRdb
Figure 3(c): About module option of LCLNCRdb
4.2.4 Search module
The search module features three options that can be accessed via a drop-down menu: search by
gene, search by sequence, and browse. With the two search methods, users can look up a gene by
entering its official symbol, Entrez ID, HGNC ID, Ensembl ID, or NONCODE ID, or by submitting
its FASTA sequence.
Figure 4 summarizes the information returned for each lncRNA: its unique identifiers (IDs),
aliases, map location, source, type, and expression pattern. By clicking on a lncRNA's RefSeq
ID, the user can access its FASTA sequence and view the network of microRNAs (miRNAs)
targeted by the lncRNA below the sequence. Additionally, the survival plot and p-value shown in
the figure indicate the prognostic significance of the lncRNA.
Figure 4: Search page of LCLNCRdb
Figure 4: Showing search results (a) General details of lncRNA “HOTAIR”
Figure 4: Showing search results (b) RefSeq fasta sequence of HOTAIR
Figure 4: Showing search results (c) miRNA - target network of HOTAIR
Figure 4: Showing search results (d) Survival plot of HOTAIR in LUAD and LUSC
4.2.5 Browse module
The Browse module displays a table of all 1102 lncRNAs with columns for their classification,
expression pattern, and origin. Users can refine the table by selecting a specific category: lung
adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), or lncRNAs shared by both.
Additionally, data can be filtered based on the expression pattern of lncRNAs, which can be
upregulated, downregulated, or both. Furthermore, the table can be sorted in ascending or
descending order, as shown in Figure 5(a) and 5(b).
Figure 5(a): Browse page showing all 1102 lncRNAs
Figure 5(b): Browse page with the LUAD and LUSC type filter and the Upregulated expression filter
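The filtering and sorting behaviour of the Browse table can be mimicked on a downloaded copy of the data. The sketch below assumes each record carries a cancer type and an expression pattern. The field names, category labels, and records are hypothetical and are not taken from the LCLNCRdb schema.

```python
def filter_lncrnas(records, cancer_type=None, expression=None, descending=False):
    """Filter records by cancer type and/or expression pattern, then sort by name.

    A filter set to None is not applied, which matches leaving a drop-down unset.
    """
    kept = [
        r for r in records
        if (cancer_type is None or r["type"] == cancer_type)
        and (expression is None or r["expression"] == expression)
    ]
    return sorted(kept, key=lambda r: r["name"], reverse=descending)


# Hypothetical records; names and labels are placeholders.
records = [
    {"name": "LNC-C", "type": "Both", "expression": "Upregulated"},
    {"name": "LNC-A", "type": "LUAD", "expression": "Downregulated"},
    {"name": "LNC-B", "type": "Both", "expression": "Upregulated"},
    {"name": "LNC-D", "type": "LUSC", "expression": "Upregulated"},
]

shared_up = filter_lncrnas(records, cancer_type="Both", expression="Upregulated")
print([r["name"] for r in shared_up])
```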
4.2.6 Analysis module
The Analysis tab offers two distinct modules: network and survival analyses.
4.2.6.1 Network Analysis module
The Network Analysis module offers three types of target networks: DE-lncRNA-DE-miRNA,
DE-lncRNA-DE-miRNA-DE-mRNA, and competitive endogenous RNA (ceRNA) networks, which are
designed for analyzing miRNAs in LUAD and LUSC. The Cytoscape.js JavaScript library (version
3.2.4) was used to visualize these networks and provide interactive features. The table below each
network lists the centrality measures of each gene, sorted by degree in descending order, as
depicted in Figure 6(a) and 6(b).
Figure 6(a): Network analysis page showing the networks available
Figure 6(b): Display of DE-lncRNA – DE-miRNA target network of LUAD
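The text does not state which centrality measures the table reports beyond degree. As an illustration of the degree-sorted table only, the following Python sketch computes the degree and normalized degree centrality from an undirected edge list and sorts nodes in descending order of degree. The edge list is hypothetical, and the sketch assumes a network with at least two nodes and no self-loops.

```python
from collections import Counter


def degree_table(edges):
    """Return [(node, degree, degree_centrality)] sorted by degree, descending.

    Edges are treated as undirected and duplicates are counted once.
    Degree centrality is degree / (n - 1), where n is the number of nodes.
    """
    unique_edges = {tuple(sorted(edge)) for edge in edges}
    degree = Counter()
    for a, b in unique_edges:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    rows = [(node, d, d / (n - 1)) for node, d in degree.items()]
    # Sort by degree (high to low), then by name so ties have a stable order.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical DE-lncRNA to DE-miRNA edges.
edges = [
    ("LNC1", "miR-a"),
    ("LNC2", "miR-a"),
    ("LNC2", "miR-b"),
    ("LNC3", "miR-a"),
    ("miR-a", "LNC1"),  # duplicate of the first edge, ignored
]

table = degree_table(edges)
for node, deg, centrality in table:
    print(f"{node}\t{deg}\t{centrality:.2f}")
```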
4.2.6.2 Survival Analysis module
The Survival Analysis module enables researchers to examine overall survival as a function of
lncRNA expression using Kaplan-Meier survival plots. To assess a given gene, the user enters the
lncRNA name in the search field and runs the search, which generates the survival plot for that
gene. The results page displays the plot for the lncRNA along with plots for the messenger RNAs
(mRNAs) or microRNAs (miRNAs) associated with it in the ceRNA networks of LUAD and LUSC.
Plots are shown for both LUAD and LUSC if the gene/lncRNA is shared by both types, or for only
one type if it is exclusive to that type, as illustrated in Figure 7(a), 7(b), and 7(c).
Figure 7(a): Survival Analysis page
Figure 7(b): Representation of Kaplan-Meier survival plot for the lncRNA “BBOX1-AS1”
Figure 7(c): Representation of Kaplan-Meier survival plot for the mRNA “ADAM12”
4.2.7 Submit module
Figure 8(a) and 8(b) illustrate how to complete the form on the Submit page. To submit an enquiry
or new information on lncRNAs and lung cancer, users must fill in all fields: name, email address,
subject, and message. Clicking the Submit button then transmits the form. Upon validation, the
form is sent as an email.
Figure 8(a): Submit page
Figure 8(b): Submitted form as an email
4.2.8 Download module
The 1102 lncRNAs in LUAD and LUSC, as well as their shared lncRNAs, can be accessed on the
download page, which also includes network analysis and centrality measures. The data can be
downloaded in the CSV format, as shown in Figure 9.
Figure 9: Download page of LCLNCRdb
5. Summary and future directions
Summary:
Lung cancer, a major global health concern, is intricately associated with the dysregulation of long
non-coding RNAs (lncRNAs), which play a crucial role in the disease's onset, progression, and
treatment resistance. Despite their significance, there is a notable absence of centralized databases
that compile comprehensive data on lncRNAs related to lung cancer. This gap in resources hinders
the consolidation of crucial information that could potentially advance research and therapy
development in this field. To address this gap, the LCLNCRdb database was developed, which
includes data from various sources, such as published research articles and TCGA. The database
contains detailed information on 1102 lncRNAs that are differentially expressed in lung cancer
patients, including lncRNA name, Entrez ID, Ensembl ID, HGNC ID, NONCODE ID, lung cancer
type, source, lncRNA expression pattern, experimental techniques, network analysis, and survival
analysis details. LCLNCRdb offers a user-friendly interface that enables users to browse, retrieve,
and download data, as well as a dedicated submission page for researchers to share newly
identified lncRNAs related to lung cancer. LCLNCRdb aims to enhance our understanding of
lncRNA deregulation in lung cancer and provides a valuable and timely resource for lncRNA
research. The database is freely accessible at https://dbtcmi.in/tools/lclncrdb/main.html.
LCLNCRdb is a comprehensive and user-friendly database that compiles information on
differentially expressed long non-coding RNAs (lncRNAs) and their targets related to lung cancer,
sourced from various data sources, providing an accessible and thorough resource for researchers.
This resource provides extensive details regarding lncRNAs associated with lung cancer, including
interactive and survival plots, to aid downstream analyses, such as identifying biomarkers and
selecting target genes for future experiments. LCLNCRdb contains information on 1102 lncRNAs
related to lung cancer and is freely accessible. Users can retrieve different types of target
networks, such as DE-lncRNA-DE-miRNA, DE-lncRNA-DE-miRNA-DE-mRNA, and competitive
endogenous RNA (ceRNA) networks for both upregulated and downregulated miRNAs. The database
aims to curate scientific literature on lncRNAs associated with lung cancer with high quality and
efficiency, ensuring that the information is relevant, comprehensive, and up-to-date. LCLNCRdb
also strives to improve the user experience and add new relevant content. To our knowledge, there
are no other specific and up-to-date databases dedicated to lncRNAs associated with lung cancer.
This database, which is specifically designed for lung cancer and its associated lncRNAs, provides
a comprehensive resource for researchers. The study explored the functions and underlying
processes of long non-coding RNAs (lncRNAs) in lung cancer, aiming to identify potential
biomarkers and therapeutic targets.
Acknowledgment:
The authors gratefully acknowledge the DBT-Centre for Microbial Informatics (https://dbtcmi.in)
for hosting the LCLNCRdb database. Vindal V would like to acknowledge the Institution of
Eminence (IoE), University of Hyderabad (No. UoH/IoE/RC3-21-052), Indian Council of Medical
Research (ICMR), GoI (ISRM/12(72)/2020, ID: 2020-2951), and Department of Biotechnology,
GoI (No. BUILDER-DBT-BT/INF/22/SP41176/2020) for their financial support. Mallikarjuna T
would like to thank ICMR, GoI, for the financial support as SRF (Ref. No: ISRM/11(47)/2019).
Author contributions:
Ayushi Dwivedi: Conceptualization, Methodology, Software, Data curation, Formal analysis,
Investigation, Validation, Writing – original draft. Afrin Zulfia S: Methodology, Data curation.
Mallikarjuna Thippana: Methodology, Data curation, Investigation. Sai Nikhith Cholleti:
Methodology, Data curation, Investigation. Vaibhav Vindal: Conceptualization, Methodology,
Software, Resources, Writing – review & editing, Supervision.
Declarations of competing Interests:
The authors have no relevant financial or non-financial interests to disclose.
Funding: This work has been supported by the Institution of Eminence (IOE)-University of
Hyderabad (UoH).