
Large-scale integration of microarray data: investigating the pathologies of cancer and infectious diseases


Abstract

DNA microarrays provide a high-throughput technique for genome-wide profiling of genes at the transcript level. With large amounts of microarray data deposited on many types and aspects of malignancy, microarray technology has revolutionized the study of cancer. Such experiments aid in the discovery of novel biomarkers and provide insight into disease diagnosis, prognosis and response to treatment. Nonetheless, microarray data contain obscuring non-biological variation and systematic biases, which can distort the extraction of true aberrations in gene expression. Moreover, the number of samples generated by a single experiment is typically smaller than is statistically required to support the large number of genes studied. As a result, biomarker gene lists produced from independent datasets show little overlap. Therefore, to understand the pathophysiology of cancers and the influence they exert on the cellular processes they override, methods for combining data from different sources are necessary. Meta-analysis techniques address this issue by conducting a separate statistical analysis on each acquired dataset, then combining the results to generate a final gene list based on aggregated p-values or ranks. However, many publicly accessible cancer microarray datasets are unbalanced or asymmetric, that is, they lack data from healthy samples. Consequently, meta-analysis overlooks a considerable amount of critical data. An integrative approach that combines data prior to analysis can incorporate asymmetric data. For this reason, a merged-data variant of a previously validated technique, the significance analysis of microarrays (SAM), is proposed. The merged SAM technique reproduced the known cancer literature with higher coverage than meta-analysis in the five independent cancer tissues considered. The same methodology was extended to a database of approximately 6000 healthy and cancer samples arising from thirteen tissues.
The integrative approach identified key genes common to the invasive paths of multiple cancers and can aid in drug discovery. Moreover, this integrative microarray approach was applied to viral data from HIV-1, hepatitis C and influenza to investigate the effect of these infections on iron-binding proteins. Iron is crucial for proteins involved in metabolism, DNA synthesis and immunity, which highlights such proteins as potential direct or indirect viral targets.
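The meta-analysis step described in the abstract (analyse each dataset separately, then aggregate p-values per gene) can be illustrated with a minimal sketch. It assumes Fisher's method as the aggregation rule, which is one common choice and not necessarily the one used in this work; the function names and gene labels are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the null (k = number of datasets).
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # sf(X; 2k) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def rank_genes(per_dataset_p):
    """Return gene names ordered from most to least significant.

    per_dataset_p maps a gene name to the list of p-values obtained
    from the separate per-dataset analyses (assumed independent).
    """
    combined = {gene: fisher_combine(ps) for gene, ps in per_dataset_p.items()}
    return sorted(combined, key=combined.get)


if __name__ == "__main__":
    # Hypothetical p-values from two independent datasets.
    example = {
        "GENE_A": [0.01, 0.02],
        "GENE_B": [0.50, 0.60],
        "GENE_C": [0.04, 0.90],
    }
    print(rank_genes(example))
```

A dataset that lacks healthy samples cannot produce a per-dataset p-value at all, so it never enters `per_dataset_p`. That is the limitation the merged approach is meant to avoid by pooling samples before testing.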
... The next sections will be dedicated to the comprehension of the role and the effectiveness of nanomaterials when coupled to SPEs. [38], Hepatitis C virus [39], Dengue virus [40], Zika Virus [41], Hepatitis B virus [42] and Sars-CoV-2 [43]. ...
There is growing interest in the development of portable, cost-effective, and easy-to-use biosensors for the rapid detection of diseases caused by infectious viruses: the COVID-19 pandemic has highlighted the central role of diagnostics in the response to global outbreaks. Among existing technologies, screen-printed electrodes (SPEs) are a valuable platform for the detection of various viral pathogens. During the last five years, various nanomaterials have been used to modify SPEs and improve the analytical performance of portable SPE-based diagnostics. Here we provide a comprehensive review of recent combinations of SPEs with various nanomaterials for detecting viral pathogens. Manufacturing methods and advances in features are critically discussed in the context of early-stage detection of diseases caused by HIV-1, HBV, HCV, Zika, Dengue, and SARS-CoV-2. A detailed table guides readers toward the “right” choice depending on the virus of interest.
DAVID bioinformatics resources consist of an integrated biological knowledgebase and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. This protocol explains how to use DAVID, a high-throughput and integrated data-mining environment, to analyze gene lists derived from high-throughput genomic experiments. The procedure first requires uploading a gene list containing any number of common gene identifiers, followed by analysis using one or more text and pathway-mining tools such as gene functional classification, functional annotation chart or clustering, and functional annotation table. By following this protocol, investigators can gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
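The idea of a biological theme being "enriched" in a gene list can be made concrete with a generic over-representation test. The sketch below uses a plain upper-tail hypergeometric p-value; it is an illustration only, not DAVID's own scoring, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p(list_hits, list_size, background_hits, background_size):
    """Upper-tail hypergeometric p-value, P(X >= list_hits).

    list_hits:        genes in the uploaded list annotated with the term
    list_size:        total genes in the uploaded list
    background_hits:  genes in the background annotated with the term
    background_size:  total genes in the background
    """
    if not (0 <= list_hits <= list_size <= background_size):
        raise ValueError("need 0 <= list_hits <= list_size <= background_size")
    if not (0 <= background_hits <= background_size):
        raise ValueError("need 0 <= background_hits <= background_size")
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    # Sum the probability of drawing k annotated genes for every k >= list_hits.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 8 of 50 list genes carry a term that
    # annotates 200 of 20000 background genes.
    print(enrichment_p(8, 50, 200, 20000))
```

A small p-value suggests the term appears in the list more often than chance would predict. In practice such p-values are corrected for the many terms tested.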