Clustering and Column/Row Ordering • NGCHMR

This vignette describes options for ordering rows and columns of a NG-CHM, and the additional options available when the ordering is hierarchical clustering.

Specifying Row / Column Order

The chmNew() function allows for arguments “rowOrder” and “colOrder”, each of which can be a vector, dendrogram, or function specifying the order of the rows/columns.

The following command produces a NG-CHM with rows sorted alphabetically and columns ordered randomly:

library(NGCHM)
library(NGCHMDemoData)
hm <- chmNew('tcga-brca', TCGA.BRCA.ExpressionData, rowOrder=sort(rownames(TCGA.BRCA.ExpressionData)), colOrder=colnames(TCGA.BRCA.ExpressionData)[ sample.int(length(colnames(TCGA.BRCA.ExpressionData))) ])

Hierarchical Clustering Options

For hierarchical clustering, additional options are available to specify the distance measure and clustering method.

The distance measure is specified with the ‘rowDist’ and ‘colDist’ arguments to chmNew(), and available choices include those of the stats::dist() function. There are two additional distance metric options: ‘cosine’ and ‘correlation’. The default is ‘correlation’, which computes the distance measure as 1 minus the Pearson correlation among the rows/columns.

The clustering method is specified with the ‘rowAgglom’ and ‘colAgglom’ arguments, and available options are those of the stats::hclust function. The default is ‘ward.D2’.

This example specifies a NG-CHM with hierarchical clustering using the Euclidean distance metric and complete linkage clustering algorithm for both rows and columns:

hm <- chmNew('tcga-brca', TCGA.BRCA.ExpressionData, rowDist='euclidean', colDist='euclidean', rowAgglom='complete', colAgglom='complete')

Explicit Clustering

Users may perform the clustering explicitly, and use the results in constructing a NG-CHM. The following creates an object of class hclust from the distance matrix of the demo expression data, for both the rows and columns, and uses those clustering results to construct an NG-CHM.

rowClust <- stats::hclust(dist(TCGA.BRCA.ExpressionData))
colClust <- stats::hclust(dist(t(TCGA.BRCA.ExpressionData)))
hm <- chmNew('tcga-brca', TCGA.BRCA.ExpressionData, rowOrder=rowClust, colOrder=colClust)

Similarly, an hclust object can be transformed into a dendrogram and used for the row and/or column ordering. The example below uses the clutering results from above as dedrograms:

rowDendrogram <- as.dendrogram(rowClust)
colDendrogram <- as.dendrogram(colClust)
hm <- chmNew('tcga-brca', TCGA.BRCA.ExpressionData, rowOrder=rowDendrogram, colOrder=colDendrogram)