API

Import pandas and cellDancer modules as:

import pandas as pd
import celldancer as cd
import celldancer.cdplt as cdplt
import celldancer.utilities as cdutil
import celldancer.simulation as cdsim

After optional preprocessing (cdutil.adata_to_df_with_embed) and loading the data (pd.read_csv), RNA velocity can be estimated with cd.velocity. The projection of RNA velocity onto vector fields in the embedding space can be calculated with cd.compute_cell_velocity and visualized with cdplt.scatter_cell. Pseudotime can be calculated with cd.pseudo_time and visualized with cdplt.scatter_cell or cdplt.scatter_gene. The UMAP based on reaction rates can be calculated with cd.embedding_kinetic_para and visualized with cdplt.plot_kinetic_para. Genes with different kinetics can be simulated with cdsim.simulate. cellDancer results can be integrated with dynamo for downstream analysis through cdutil.to_dynamo and cdutil.export_velocity_to_dynamo.
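The workflow described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an official recipe: the input column list, the helper names check_input and run_workflow, the return values of cd.velocity, cd.compute_cell_velocity and cd.pseudo_time, and the grid and dt values are all assumptions made for illustration. cellDancer is imported lazily so that the input check can be used on its own.

```python
import pandas as pd

# Assumed input columns (one row per cell-gene pair). This list is an
# assumption inferred from the output columns documented for pseudo_time.
REQUIRED_COLUMNS = [
    "gene_name", "unsplice", "splice",
    "cellID", "clusters", "embedding1", "embedding2",
]


def check_input(cell_type_u_s):
    """Hypothetical helper: fail early if an expected column is missing."""
    missing = [c for c in REQUIRED_COLUMNS if c not in cell_type_u_s.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return cell_type_u_s


def run_workflow(csv_path, gene_list=None):
    """Hypothetical wrapper chaining the steps named in the text."""
    import celldancer as cd  # lazy import: only needed when the workflow runs

    cell_type_u_s = check_input(pd.read_csv(csv_path))

    # Assumption: cd.velocity returns a (loss_df, cellDancer_df) pair.
    loss_df, cellDancer_df = cd.velocity(cell_type_u_s, gene_list=gene_list)

    # Assumption: both calls return the dataframe with added columns.
    cellDancer_df = cd.compute_cell_velocity(cellDancer_df=cellDancer_df)
    cellDancer_df = cd.pseudo_time(
        cellDancer_df=cellDancer_df,
        grid=(30, 30),  # illustrative value
        dt=0.05,        # illustrative value
    )
    return cellDancer_df
```

Calling run_workflow("my_data.csv") with a hypothetical file path would run the three steps in order; plotting with cdplt is left out because it needs a matplotlib axis and dataset-specific color settings.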

Toolkit functions

Preprocessing and APIs to dynamo (cdutil)

Functions

adata_to_df_with_embed(adata[, us_para, ...])

Convert adata to pandas.DataFrame format and save it as a csv file with embedding info.

to_dynamo(cellDancer_df)

Convert the output dataframe of cellDancer to the input of dynamo.

export_velocity_to_dynamo(cellDancer_df, adata)

Replace the velocities in adata of dynamo (“adata” in parameters) with the cellDancer predicted velocities (“cellDancer_df” in parameters).

Velocity estimation and analysis (cd)

Functions

velocity(cell_type_u_s[, gene_list, ...])

Velocity estimation for each cell.

compute_cell_velocity(cellDancer_df[, ...])

Project the RNA velocity onto the embedding space.

pseudo_time(cellDancer_df, grid, dt[, ...])

Compute the gene-shared pseudotime based on the projection of the RNA velocity on the embedding space.

Parameters:

cellDancer_df (pandas.DataFrame)
    Dataframe of velocity estimation results. Columns=['cellIndex', 'gene_name', 'unsplice', 'splice', 'unsplice_predict', 'splice_predict', 'alpha', 'beta', 'gamma', 'loss', 'cellID', 'clusters', 'embedding1', 'embedding2', 'velocity1', 'velocity2']

grid (tuple)
    (n_x, n_y), where n_x, n_y are integers. The embedding space (2d, [xmin, xmax] x [ymin, ymax]) is divided into n_x * n_y grids. The cells in the same grid share the same velocity (mean); however, they may not share the pseudotime. If it is set to None, then a recommended value for n_x and n_y is the square root of the number of selected cells (rounded to the nearest tenth).

dt (float)
    Time step used to advance a cell on the embedding for generation of cell diffusion trajectories. Parameter dt should be set together with t_total. Excessively small values of dt demand a large t_total and drastically increase computing time; excessively large values of dt lead to low-resolution and unrealistic pseudotime estimation.

t_total (optional, float; default: 1000)
    Total number of time steps used for generation of cell diffusion trajectories. The diffusion is stopped by any of the criteria:
    - t_total is reached;
    - the magnitude of the velocity is less than a cutoff eps;
    - the cell goes to where no cell resides;
    - the cell is out of the diffusion box (the grid).

n_repeats (optional, int; default: 10)
    Number of repeated diffusions of each cell used for generation of cell diffusion trajectories.

psrng_seeds_diffusion (optional, list-like; default: None)
    Pseudo random number generator seeds for all the replicas in the generation of cell diffusion trajectories. Its length = n_repeats. Set this for reproducibility.

speed_up (optional, tuple; default: (60,60))
    The sampling grid used in compute_cell_velocity.compute. This grid is used for interpolating pseudotime for all cells.

n_jobs (optional, int; default: -1)
    Number of threads or processes used for cell diffusion. It follows the scikit-learn convention. -1 means all possible threads.

n_paths (optional, int; default: 5)
    Number of long paths to extract for cell pseudotime estimation. Note that the result is very sensitive to this parameter. For the best outcome, please set the number based on biological knowledge about the cell embedding.

plot_long_trajs (optional, bool; default: False)
    Whether to show the long trajectories whose traverse lengths are local maximums.

save (bool; default: False)
    Whether to save the pseudotime-included cellDancer_df as a .csv file.

output_path (optional, str; default: None)
    Save file path. By default, the .csv file is saved in the current directory.
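As an illustration of how the pseudo_time parameters documented above fit together, the following sketch assembles a set of keyword arguments. The helper names recommended_grid, pseudo_time_kwargs and run_pseudo_time are hypothetical, the dt value is only a placeholder, and reading "rounded to the nearest tenth" as "rounded to the nearest multiple of ten" is an assumption. The keyword names themselves are taken from the parameter list above.

```python
import math


def recommended_grid(n_cells):
    """Square root of the cell count, rounded to a multiple of ten.

    Assumption: "nearest tenth" in the docstring means "nearest ten".
    A floor of 10 is added here to avoid an empty grid for tiny inputs.
    """
    n = int(round(math.sqrt(n_cells) / 10.0)) * 10
    n = max(n, 10)
    return (n, n)


def pseudo_time_kwargs(n_cells, n_repeats=10, base_seed=0):
    """Hypothetical helper: build keyword arguments for cd.pseudo_time."""
    return {
        "grid": recommended_grid(n_cells),
        "dt": 0.05,          # placeholder; tune together with t_total
        "t_total": 1000,     # documented default
        "n_repeats": n_repeats,
        # One seed per replica, so the list length equals n_repeats.
        "psrng_seeds_diffusion": list(range(base_seed, base_seed + n_repeats)),
        "n_jobs": -1,        # all available threads
        "speed_up": (60, 60),
        "n_paths": 5,        # documented default; sensitive, see above
    }


def run_pseudo_time(cellDancer_df):
    """Hypothetical wrapper; assumes one 'cellID' value per cell."""
    import celldancer as cd  # lazy import: only needed for the actual call

    n_cells = cellDancer_df["cellID"].nunique()
    return cd.pseudo_time(cellDancer_df=cellDancer_df,
                          **pseudo_time_kwargs(n_cells))
```

Fixing psrng_seeds_diffusion this way is one simple route to reproducible diffusion trajectories; any list-like of length n_repeats should serve the same purpose.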

embedding_kinetic_para(cellDancer_df, ...[, ...])

Calculate the UMAP based on the kinetic parameter(s).

Plotting (cdplt)

Functions

scatter_gene([ax, x, y, cellDancer_df, ...])

Plot the velocity (splice-unsplice) of a gene, plot a parameter ('alpha', 'beta', 'gamma', 'splice', 'unsplice') against pseudotime, or plot custom parameters on the x-axis and y-axis for a gene.

scatter_cell(ax, cellDancer_df[, colors, ...])

Plot the RNA velocity on the embedding space; or plot the kinetic parameters ('alpha', 'beta', 'gamma', 'splice', 'unsplice', or 'pseudotime') of one gene on the embedding space.

plot_kinetic_para(ax, kinetic_para, ...[, ...])

Plot the UMAP calculated by the kinetic parameter(s).

PTO_Graph(ax, cellDancer_df[, node_layout, ...])

Graph visualization of selected cells reflecting their orders in pseudotime (PseudoTimeOrdered_Graph: PTO_Graph).

Simulation (cdsim)

Functions

simulate(kinetic_type[, alpha1, alpha2, ...])

Simulate a gene with the kinetic type of mono-kinetic, multi-forward, multi-backward, or transcriptional boost.
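To give a feel for what a single set of kinetic rates implies, the sketch below integrates the standard RNA velocity equations, du/dt = alpha - beta * u and ds/dt = beta * u - gamma * s, with a forward Euler scheme. It is an illustration only and is not the implementation behind cdsim.simulate; the function name simulate_mono_kinetic, its arguments, and the step size are hypothetical.

```python
def simulate_mono_kinetic(alpha, beta, gamma, dt=0.01, n_steps=5000,
                          u0=0.0, s0=0.0):
    """Forward-Euler integration of one gene with constant rates.

    alpha: transcription rate, beta: splicing rate, gamma: degradation rate.
    Returns the unspliced and spliced abundance at every step.
    """
    u, s = u0, s0
    unsplice, splice = [u], [s]
    for _ in range(n_steps):
        du = alpha - beta * u        # production minus splicing
        ds = beta * u - gamma * s    # splicing minus degradation
        u += dt * du
        s += dt * ds
        unsplice.append(u)
        splice.append(s)
    return unsplice, splice


# With constant rates the system relaxes toward u = alpha / beta and
# s = alpha / gamma.
unsplice, splice = simulate_mono_kinetic(alpha=1.0, beta=0.5, gamma=0.25)
```

Kinetic types with more than one regime could be imitated in this sketch by switching alpha partway through the loop, but how cdsim.simulate parameterizes them (for example through alpha1 and alpha2) is not covered here.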