Japan Bioinformatics KK

 

Data management, distribution, security and analysis

 
 
 


Clustering

Background

Cluster analysis or clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense. Clustering is a method of unsupervised learning, and a common technique for statistical data analysis used in many fields, including bioinformatics.

Algorithms

Hierarchical clustering creates a hierarchy of clusters which may be represented in a tree structure called a dendrogram. The root of the tree consists of a single cluster containing all observations, and the leaves correspond to individual observations.
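As a rough illustration of how such a tree can be built, the sketch below merges the two closest clusters repeatedly until a single root remains. It assumes one-dimensional observations and single-linkage distance; both are choices made here for brevity, and the function names (`single_linkage`, `leaves`, `agglomerate`) are hypothetical rather than taken from Cluster 3.0 or R.

```python
from itertools import combinations

def single_linkage(a, b):
    """Smallest distance between any member of a and any member of b."""
    return min(abs(x - y) for x in a for y in b)

def leaves(tree):
    """Collect the individual observations found under a node of the tree."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def agglomerate(observations):
    """Merge the two closest clusters until a single root remains."""
    nodes = list(observations)  # every leaf starts as its own cluster
    while len(nodes) > 1:
        # Pick the pair of current clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda ij: single_linkage(leaves(nodes[ij[0]]),
                                          leaves(nodes[ij[1]])),
        )
        merged = (nodes[i], nodes[j])  # internal node with two children
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]  # root: one cluster containing all observations

tree = agglomerate([1.0, 2.0, 9.0, 10.0, 25.0])
print(tree)
```

The nested tuples play the role of the dendrogram: the outermost tuple is the root, and each bare number is a leaf. Other linkage rules (complete, average) would change only the `single_linkage` function.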

The k-means algorithm assigns each point to the cluster whose center (also called the centroid) is nearest. The center is the average of all the points in the cluster; that is, each of its coordinates is the arithmetic mean of that dimension over all the points in the cluster.
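The two steps described above, assignment to the nearest center and recomputation of each center as a per-dimension mean, can be sketched as follows. This is a minimal illustration with a fixed number of iterations and caller-supplied starting centers; the function names and the sample data are assumptions made for this example, not part of any particular implementation.

```python
from math import dist

def nearest_center(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))

def centroid(points):
    """Arithmetic mean of each dimension, taken separately over all points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, centers, iterations=10):
    """Alternate the assignment and update steps for a fixed number of rounds."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # A cluster that received no points keeps its previous center.
        centers = [centroid(c) if c else old
                   for c, old in zip(clusters, centers)]
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels

points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (11.0, 9.0)]
centers, labels = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers, labels)
```

In practice the result depends on the starting centers, so implementations commonly run the procedure several times from different initializations and keep the best outcome.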

A self-organizing map (SOM) or self-organizing feature map (SOFM) is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. Self-organizing maps are different from other artificial neural networks in the sense that they use a neighborhood function to preserve the topological properties of the input space.
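To make the role of the neighborhood function concrete, here is a small sketch of SOM training on a one-dimensional map (a chain of nodes). The Gaussian neighborhood, the linearly decaying learning rate, the fixed radius, and all names and parameter values are assumptions chosen for illustration; real implementations typically use a two-dimensional grid and also shrink the radius over time.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(weights, sample):
    """Index of the map node whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], sample))

def train_som(samples, n_nodes=5, epochs=50, rate=0.5, radius=1.0, seed=0):
    """Train a chain of n_nodes map nodes on the given samples."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        lr = rate * (1 - epoch / epochs)  # learning rate decays linearly
        for sample in samples:
            bmu = best_matching_unit(weights, sample)
            for i, w in enumerate(weights):
                # Gaussian neighborhood: nodes close to the winner on the map
                # move more, which keeps neighboring nodes similar.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    w[d] += lr * h * (sample[d] - w[d])
    return weights

samples = [(0.1, 0.1), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8)]
weights = train_som(samples)
mapped = [best_matching_unit(weights, s) for s in samples]
print(mapped)
```

Each sample is mapped to the index of its best-matching node, which is the discretized, low-dimensional representation mentioned above.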

Analysis

Simbiot microarray analysis integrates six implementations of three clustering algorithms. Each algorithm (k-means, SOM and hierarchical) is available both through Cluster 3.0 (de Hoon, Imoto et al. 2004) and through R built-in functions. For more information about the individual algorithms, please follow the links below:

Cluster 3.0: (k-means, SOM, Hierarchical)

R: k-means

R: SOM

R: Hierarchical

Free demo accounts are available at http://www.simbiot.net.

More information is also available about Simbiot Single User Accounts and Private Server installations, along with a brief introduction to microarray analysis.

References

de Hoon, M. J., S. Imoto, et al. (2004). "Open source clustering software." Bioinformatics 20(9): 1453-1454.


Please contact Japan Bioinformatics KK for more information.