Usage and example:
Pre-processing workflow:
Import packages:
import scanpy as sc
import numpy as np
import giniclust3 as gc
import anndata
Import the single-cell count matrix from a CSV file:
adataRaw=sc.read_csv("giniclust3/data/GSM1599495_ES_d0_biorep_techrep1.csv",first_column_names=True)
Filter gene expression matrix:
sc.pp.filter_cells(adataRaw,min_genes=3)
sc.pp.filter_genes(adataRaw,min_cells=200)
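To make the two filtering calls concrete, here is a minimal NumPy sketch of detection-based filtering: rows with too few nonzero entries are dropped first, then columns. It is an illustration, not scanpy's implementation. The function name `filter_by_detection` and the toy matrix are hypothetical. Note that if the CSV is laid out as genes by cells (which the transpose step below suggests), `filter_cells` acts on the gene rows and `filter_genes` on the cell columns at this point.

```python
import numpy as np

def filter_by_detection(matrix, min_per_row, min_per_col):
    """Drop rows, then columns, that have too few nonzero entries.

    Hypothetical helper for illustration; it mirrors the order of the two
    scanpy calls above (observations first, then variables).
    """
    matrix = np.asarray(matrix)
    # Keep rows detected (nonzero) in at least `min_per_row` columns.
    keep_rows = (matrix > 0).sum(axis=1) >= min_per_row
    matrix = matrix[keep_rows]
    # Keep columns detected in at least `min_per_col` of the remaining rows.
    keep_cols = (matrix > 0).sum(axis=0) >= min_per_col
    return matrix[:, keep_cols], keep_rows, keep_cols

# Toy 4 x 4 count matrix (made-up values).
toy = np.array([
    [1, 0, 2, 0],
    [3, 1, 4, 0],
    [0, 0, 0, 0],
    [5, 2, 1, 0],
])
filtered, kept_rows, kept_cols = filter_by_detection(toy, min_per_row=2, min_per_col=2)
print(filtered.shape)
```

The thresholds used here are tiny so the toy matrix survives; the values in the workflow above (3 and 200) are meant for real datasets.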
Transpose the expression matrix (skip this step if the input matrix already has genes as columns and cells as rows):
adataSC=anndata.AnnData(X=adataRaw.X.T,obs=adataRaw.var,var=adataRaw.obs)
Gene expression normalization:
sc.pp.normalize_per_cell(adataSC, counts_per_cell_after=1e4)
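As a rough picture of what per-cell normalization does, the sketch below rescales each cell (row) so that its counts sum to 10,000. This is a simplified stand-in written with NumPy only; the function name `normalize_rows` and the toy counts are hypothetical, and scanpy's own function has additional options that are not reproduced here.

```python
import numpy as np

def normalize_rows(counts, target_sum=1e4):
    """Scale every row (cell) so its counts add up to `target_sum`.

    Hypothetical helper for illustration only.
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Assumption: leave all-zero cells untouched instead of dividing by zero.
    totals[totals == 0] = 1.0
    return counts / totals * target_sum

# Two cells with the same composition but different sequencing depth,
# plus a third cell expressing only the second gene (made-up values).
toy_counts = np.array([
    [1, 3],
    [10, 30],
    [0, 5],
])
normalized = normalize_rows(toy_counts)
print(normalized)
```

After scaling, the first two cells become identical, which is the point of the step: differences in sequencing depth no longer dominate the comparison between cells.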
Apply GiniClust3 for both common and rare cluster identification:
Perform GiniIndexClust:
gc.gini.calGini(adataSC) ###Calculate Gini Index
adataGini=gc.gini.clusterGini(adataSC,neighbors=3) ###Use a higher neighbors value for larger datasets; recommended range: 5 to 15
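For intuition about the statistic behind this step, the following sketch computes the plain Gini index of one gene's expression across cells. A gene expressed in only a few cells scores close to 1, while a uniformly expressed gene scores 0. This is the textbook formula, not the code inside `gc.gini.calGini`, which may apply further normalization and gene selection. The function name `gini_index` is hypothetical.

```python
import numpy as np

def gini_index(values):
    """Plain Gini index of a non-negative 1-D vector.

    Textbook formula on sorted values x_1 <= ... <= x_n:
        G = sum_i (2*i - n - 1) * x_i / (n * sum_i x_i)
    Hypothetical helper for illustration only.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0  # Assumption: define the index as 0 for empty or all-zero input.
    ranks = np.arange(1, n + 1)
    return float((2 * ranks - n - 1) @ x / (n * total))

uniform_gene = [1, 1, 1, 1]   # same expression in every cell
rare_gene = [0, 0, 0, 10]     # expressed in a single cell
print(gini_index(uniform_gene), gini_index(rare_gene))
```

Genes with a high Gini index are the ones that tend to mark rare cell populations, which is why this step is paired with the rare-cluster search.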
Perform FanoFactorClust:
gc.fano.calFano(adataSC) ###Calculate Fano factor
adataFano=gc.fano.clusterFano(adataSC) ###Cluster based on Fano factor
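The Fano factor is the variance of a gene's expression divided by its mean. The sketch below computes it per gene for a cells-by-genes matrix using NumPy only. It illustrates the statistic and is not the code inside `gc.fano.calFano`; the function name `fano_factor` and the toy matrix are hypothetical.

```python
import numpy as np

def fano_factor(matrix):
    """Per-gene Fano factor (variance / mean) for a cells x genes matrix.

    Hypothetical helper for illustration only. Uses the population
    variance (ddof=0); genes with zero mean get a Fano factor of 0.
    """
    matrix = np.asarray(matrix, dtype=float)
    mean = matrix.mean(axis=0)
    var = matrix.var(axis=0)
    out = np.zeros_like(mean)
    # Divide only where the mean is positive to avoid division by zero.
    np.divide(var, mean, out=out, where=mean > 0)
    return out

# 4 cells x 3 genes (made-up values): a constant gene, a variable gene,
# and a gene that is never expressed.
toy = np.array([
    [2, 0, 0],
    [2, 0, 0],
    [2, 8, 0],
    [2, 0, 0],
])
fano = fano_factor(toy)
print(fano)
```

Highly variable genes stand out with a large Fano factor, which makes the statistic a natural basis for separating the common clusters.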
Generate the consensus matrix and apply the ConsensusClust step:
consensusCluster={}
consensusCluster['giniCluster']=np.array(adataSC.obs['rare'].values.tolist())
consensusCluster['fanoCluster']=np.array(adataSC.obs['fano'].values.tolist())
gc.consensus.generateMtilde(consensusCluster) ###Generate consensus matrix
gc.consensus.clusterMtilde(consensusCluster) ###Cluster consensus matrix
####output cluster results to final.txt####
np.savetxt("final.txt",consensusCluster['finalCluster'], delimiter="\t",fmt='%s')
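To illustrate the idea of a consensus matrix, the sketch below builds an unweighted co-association matrix from two label vectors: entry (i, j) is the fraction of partitions that place cells i and j in the same cluster. This is a deliberate simplification and an assumption on our part. The package's own `generateMtilde` may weight the Gini-based and Fano-based partitions differently. The function name `coassociation` and the toy labels are hypothetical.

```python
import numpy as np

def coassociation(labelings):
    """Unweighted co-association matrix over several partitions.

    Entry (i, j) is the fraction of partitions in which items i and j
    share a cluster label. Hypothetical helper for illustration only.
    """
    labelings = [np.asarray(labels) for labels in labelings]
    n = labelings[0].size
    total = np.zeros((n, n))
    for labels in labelings:
        # Pairwise comparison: True where two cells carry the same label.
        total += labels[:, None] == labels[None, :]
    return total / len(labelings)

# Toy labels for four cells from the two clustering steps (made-up values).
gini_labels = np.array(["0", "0", "0", "1"])
fano_labels = np.array(["0", "0", "1", "1"])
consensus = coassociation([gini_labels, fano_labels])
print(consensus)
```

Cells on which both partitions agree get a value of 1 or 0, and disagreements land at 0.5. Clustering this matrix then yields a single set of final labels.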
UMAP visualization:
adataGini.obs['final']=consensusCluster['finalCluster']
adataFano.obs['final']=consensusCluster['finalCluster']
gc.plot.plotGini(adataGini)
gc.plot.plotFano(adataFano)
Figure: UMAP visualization.