celldancer.pseudo_time

celldancer.pseudo_time(cellDancer_df, grid, dt, t_total=1000, n_repeats=10, psrng_seeds_diffusion=None, n_jobs=-1, speed_up=(60, 60), n_paths=5, plot_long_trajs=False, save=False, output_path=None)

Compute the gene-shared pseudotime based on the projection of the RNA velocity on the embedding space.

Parameters
  • cellDancer_df (pandas.DataFrame) – Dataframe of velocity estimation results. Columns=[‘cellIndex’, ‘gene_name’, ‘unsplice’, ‘splice’, ‘unsplice_predict’, ‘splice_predict’, ‘alpha’, ‘beta’, ‘gamma’, ‘loss’, ‘cellID’, ‘clusters’, ‘embedding1’, ‘embedding2’, ‘velocity1’, ‘velocity2’]

  • grid (tuple) – (n_x, n_y), where n_x and n_y are integers. The embedding space (2d, [xmin, xmax] x [ymin, ymax]) is divided into n_x * n_y grids. The cells in the same grid share the same (mean) velocity; however, they may not share the same pseudotime. If it is set to None, a recommended value for n_x and n_y is the square root of the number of selected cells (rounded to the nearest tenth).

  • dt (float) – Time step used to advance a cell on the embedding when generating cell diffusion trajectories. dt should be set together with t_total. Excessively small values of dt demand a large t_total and drastically increase computing time; excessively large values of dt lead to low-resolution and unrealistic pseudotime estimation.

  • t_total (optional, float (default: 1000)) – Total number of time steps used for generation of cell diffusion trajectories. The diffusion stops as soon as any of the following criteria is met:
      - t_total is reached;
      - the magnitude of the velocity is less than a cutoff eps;
      - the cell moves to where no cell resides;
      - the cell is out of the diffusion box (the grid).

  • n_repeats (optional, int (default: 10)) – Number of times the diffusion of each cell is repeated when generating cell diffusion trajectories.

  • psrng_seeds_diffusion (optional, list-like (default: None)) – Pseudo random number generator seeds for all the replicas in the generation of cell diffusion trajectories. Its length should equal n_repeats. Set this for reproducibility.

  • speed_up (optional, tuple (default: (60,60))) – The sampling grid used in compute_cell_velocity.compute. This grid is used for interpolating pseudotime for all cells.

  • n_jobs (optional, int (default: -1)) – Number of threads or processes used for cell diffusion. It follows the scikit-learn convention. -1 means all possible threads.

  • n_paths (optional, int (default: 5)) – Number of long paths to extract for cell pseudotime estimation. Note that the result is very sensitive to this parameter. For the best outcome, set the number based on biological knowledge about the cell embedding.

  • plot_long_trajs (optional, bool (default: False)) – Whether to show the long trajectories whose traverse lengths are local maxima.

  • save (optional, bool (default: False)) – Whether to save the pseudotime-included cellDancer_df as a .csv file.

  • output_path (optional, str (default: None)) – Save file path. By default, the .csv file is saved in the current directory.
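
To make the roles of grid, dt, t_total, and the seeds more concrete, the following is a minimal sketch of a single diffusion trajectory on a grid-averaged velocity field. It is not cellDancer's implementation: the function names (box_index, grid_mean_velocity, diffuse), the default cutoff eps, and the Gaussian noise term are assumptions made purely for illustration.

```python
import numpy as np


def box_index(pos, lo, hi, shape):
    """Return the (i, j) grid box containing pos, or None if pos is outside the grid."""
    if np.any(pos < lo) or np.any(pos > hi):
        return None
    shape = np.asarray(shape)
    ij = np.floor((pos - lo) / (hi - lo) * shape).astype(int)
    return tuple(np.minimum(ij, shape - 1))  # points on the upper edge stay inside


def grid_mean_velocity(xy, vel, grid):
    """Average the velocities of all cells that fall into the same grid box."""
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    shape = np.asarray(grid)
    ij = np.minimum(np.floor((xy - lo) / (hi - lo) * shape).astype(int), shape - 1)
    counts = np.zeros(grid)
    sums = np.zeros((*grid, 2))
    np.add.at(counts, (ij[:, 0], ij[:, 1]), 1)
    np.add.at(sums, (ij[:, 0], ij[:, 1]), vel)
    # Boxes without cells keep a zero velocity and a zero count.
    mean = np.divide(sums, counts[..., None], out=np.zeros_like(sums),
                     where=counts[..., None] > 0)
    return mean, counts, lo, hi


def diffuse(start, mean, counts, lo, hi, dt, t_total, eps=1e-5, noise=0.0, seed=None):
    """Advance one cell until one of the four stopping criteria is met."""
    rng = np.random.default_rng(seed)  # one seed per replica gives reproducible paths
    pos = np.asarray(start, dtype=float)
    if box_index(pos, lo, hi, counts.shape) is None:
        raise ValueError("start must lie inside the grid")
    path = [pos]
    for _ in range(int(t_total)):                    # stop: t_total reached
        v = mean[box_index(pos, lo, hi, counts.shape)]
        if np.linalg.norm(v) < eps:                  # stop: velocity below cutoff
            break
        # Assumed update rule: drift along the box velocity plus optional noise.
        new = pos + v * dt + noise * np.sqrt(dt) * rng.standard_normal(2)
        box = box_index(new, lo, hi, counts.shape)
        if box is None or counts[box] == 0:          # stop: left the box or no cells here
            break
        pos = new
        path.append(pos)
    return np.array(path)


# Toy data: cells on a regular lattice in [0, 1] x [0, 1], all moving to the right.
g = np.linspace(0.0, 1.0, 21)
xy = np.array([(x, y) for x in g for y in g])
vel = np.tile([1.0, 0.0], (len(xy), 1))

mean, counts, lo, hi = grid_mean_velocity(xy, vel, grid=(10, 10))
path = diffuse((0.05, 0.5), mean, counts, lo, hi, dt=0.1, t_total=1000)
print(len(path), path[-1])
```

In this toy setting the trajectory ends when the cell would leave the grid, well before t_total is reached. Shrinking dt in the sketch lengthens the path in steps, which mirrors the advice to tune dt and t_total together.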

Returns

cellDancer_df – The updated cellDancer_df with the estimated pseudotime added as an additional column.

Return type

pandas.DataFrame
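
A call might be prepared as in the sketch below. The argument values (grid size, dt, t_total, seeds) are placeholders chosen for illustration, not recommendations, and the helper missing_columns is hypothetical. The call itself is left commented out because it needs cellDancer installed and a real velocity estimation result.

```python
import pandas as pd

# Columns listed on this page for the input dataframe.
EXPECTED_COLUMNS = [
    'cellIndex', 'gene_name', 'unsplice', 'splice', 'unsplice_predict',
    'splice_predict', 'alpha', 'beta', 'gamma', 'loss', 'cellID', 'clusters',
    'embedding1', 'embedding2', 'velocity1', 'velocity2',
]


def missing_columns(df):
    """Hypothetical helper: list the expected columns that df does not have."""
    return [c for c in EXPECTED_COLUMNS if c not in df.columns]


n_repeats = 10
kwargs = dict(
    grid=(30, 30),                                   # placeholder (n_x, n_y)
    dt=0.05,                                         # placeholder, tune with t_total
    t_total=200,                                     # placeholder
    n_repeats=n_repeats,
    psrng_seeds_diffusion=list(range(n_repeats)),    # one seed per replica
    n_jobs=-1,
    speed_up=(60, 60),
    n_paths=5,
    save=False,
)

# A dataframe lacking the embedding and velocity columns is flagged before the call.
toy_df = pd.DataFrame(columns=EXPECTED_COLUMNS[:12])
print(missing_columns(toy_df))

# import celldancer
# cellDancer_df = celldancer.pseudo_time(cellDancer_df, **kwargs)
```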