pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

Standard

pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data. / Hansen, Morten; Gerds, Thomas Alexander; Nielsen, Ole Haagen; Seidelin, Jakob Benedict; Troelsen, Jesper Thorvald; Olsen, Jørgen.

In: PLoS ONE, Vol. 7, No. 2, 2012, p. e32394.

Harvard

Hansen, M, Gerds, TA, Nielsen, OH, Seidelin, JB, Troelsen, JT & Olsen, J 2012, 'pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data', PLoS ONE, vol. 7, no. 2, p. e32394. https://doi.org/10.1371/journal.pone.0032394

APA

Hansen, M., Gerds, T. A., Nielsen, O. H., Seidelin, J. B., Troelsen, J. T., & Olsen, J. (2012). pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data. PLoS ONE, 7(2), e32394. https://doi.org/10.1371/journal.pone.0032394

Vancouver

Hansen M, Gerds TA, Nielsen OH, Seidelin JB, Troelsen JT, Olsen J. pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data. PLoS ONE. 2012;7(2):e32394. https://doi.org/10.1371/journal.pone.0032394

Author

Hansen, Morten ; Gerds, Thomas Alexander ; Nielsen, Ole Haagen ; Seidelin, Jakob Benedict ; Troelsen, Jesper Thorvald ; Olsen, Jørgen. / pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data. In: PLoS ONE. 2012 ; Vol. 7, No. 2. p. e32394.

Bibtex

@article{8c62a4e340e146c4ad7aef579d9edda9,
title = "pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data",
abstract = "Analyzing data obtained from genome-wide gene expression experiments is challenging due to the quantity of variables, the need for multivariate analyses, and the demands of managing large amounts of data. Here we present the R package pcaGoPromoter, which facilitates the interpretation of genome-wide expression data and overcomes the aforementioned problems. In the first step, principal component analysis (PCA) is applied to survey any differences between experiments and possible groupings. The next step is the interpretation of the principal components with respect to both biological function and regulation by predicted transcription factor binding sites. The robustness of the results is evaluated using cross-validation, and illustrative plots of PCA scores and gene ontology terms are available. pcaGoPromoter works with any platform that uses gene symbols or Entrez IDs as probe identifiers. In addition, support for several popular Affymetrix GeneChip platforms is provided. To illustrate the features of the pcaGoPromoter package a serum stimulation experiment was performed and the genome-wide gene expression in the resulting samples was profiled using the Affymetrix Human Genome U133 Plus 2.0 chip. Array data were analyzed using pcaGoPromoter package tools, resulting in a clear separation of the experiments into three groups: controls, serum only and serum with inhibitor. Functional annotation of the axes in the PCA score plot showed the expected serum-promoted biological processes, e.g., cell cycle progression and the predicted involvement of expected transcription factors, including E2F. In addition, unexpected results, e.g., cholesterol synthesis in serum-depleted cells and NF-$\kappa$B activation in inhibitor treated cells, were noted. In summary, the pcaGoPromoter R package provides a collection of tools for analyzing gene expression data. These tools give an overview of the input data via PCA, functional interpretation by gene ontology terms (biological processes), and an indication of the involvement of possible transcription factors.",
author = "Morten Hansen and Gerds, {Thomas Alexander} and Nielsen, {Ole Haagen} and Seidelin, {Jakob Benedict} and Troelsen, {Jesper Thorvald} and J{\o}rgen Olsen",
year = "2012",
doi = "10.1371/journal.pone.0032394",
language = "English",
volume = "7",
pages = "e32394",
journal = "PLoS ONE",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "2",

}

RIS

TY - JOUR

T1 - pcaGoPromoter--an R package for biological and regulatory interpretation of principal components in genome-wide gene expression data

AU - Hansen, Morten

AU - Gerds, Thomas Alexander

AU - Nielsen, Ole Haagen

AU - Seidelin, Jakob Benedict

AU - Troelsen, Jesper Thorvald

AU - Olsen, Jørgen

PY - 2012

Y1 - 2012

N2 - Analyzing data obtained from genome-wide gene expression experiments is challenging due to the quantity of variables, the need for multivariate analyses, and the demands of managing large amounts of data. Here we present the R package pcaGoPromoter, which facilitates the interpretation of genome-wide expression data and overcomes the aforementioned problems. In the first step, principal component analysis (PCA) is applied to survey any differences between experiments and possible groupings. The next step is the interpretation of the principal components with respect to both biological function and regulation by predicted transcription factor binding sites. The robustness of the results is evaluated using cross-validation, and illustrative plots of PCA scores and gene ontology terms are available. pcaGoPromoter works with any platform that uses gene symbols or Entrez IDs as probe identifiers. In addition, support for several popular Affymetrix GeneChip platforms is provided. To illustrate the features of the pcaGoPromoter package a serum stimulation experiment was performed and the genome-wide gene expression in the resulting samples was profiled using the Affymetrix Human Genome U133 Plus 2.0 chip. Array data were analyzed using pcaGoPromoter package tools, resulting in a clear separation of the experiments into three groups: controls, serum only and serum with inhibitor. Functional annotation of the axes in the PCA score plot showed the expected serum-promoted biological processes, e.g., cell cycle progression and the predicted involvement of expected transcription factors, including E2F. In addition, unexpected results, e.g., cholesterol synthesis in serum-depleted cells and NF-κB activation in inhibitor treated cells, were noted. In summary, the pcaGoPromoter R package provides a collection of tools for analyzing gene expression data. These tools give an overview of the input data via PCA, functional interpretation by gene ontology terms (biological processes), and an indication of the involvement of possible transcription factors.

AB - Analyzing data obtained from genome-wide gene expression experiments is challenging due to the quantity of variables, the need for multivariate analyses, and the demands of managing large amounts of data. Here we present the R package pcaGoPromoter, which facilitates the interpretation of genome-wide expression data and overcomes the aforementioned problems. In the first step, principal component analysis (PCA) is applied to survey any differences between experiments and possible groupings. The next step is the interpretation of the principal components with respect to both biological function and regulation by predicted transcription factor binding sites. The robustness of the results is evaluated using cross-validation, and illustrative plots of PCA scores and gene ontology terms are available. pcaGoPromoter works with any platform that uses gene symbols or Entrez IDs as probe identifiers. In addition, support for several popular Affymetrix GeneChip platforms is provided. To illustrate the features of the pcaGoPromoter package a serum stimulation experiment was performed and the genome-wide gene expression in the resulting samples was profiled using the Affymetrix Human Genome U133 Plus 2.0 chip. Array data were analyzed using pcaGoPromoter package tools, resulting in a clear separation of the experiments into three groups: controls, serum only and serum with inhibitor. Functional annotation of the axes in the PCA score plot showed the expected serum-promoted biological processes, e.g., cell cycle progression and the predicted involvement of expected transcription factors, including E2F. In addition, unexpected results, e.g., cholesterol synthesis in serum-depleted cells and NF-κB activation in inhibitor treated cells, were noted. In summary, the pcaGoPromoter R package provides a collection of tools for analyzing gene expression data. These tools give an overview of the input data via PCA, functional interpretation by gene ontology terms (biological processes), and an indication of the involvement of possible transcription factors.

U2 - 10.1371/journal.pone.0032394

DO - 10.1371/journal.pone.0032394

M3 - Journal article

C2 - 22384239

VL - 7

SP - e32394

JO - PLoS ONE

JF - PLoS ONE

SN - 1932-6203

IS - 2

ER -

ID: 38061286