Case study 3: Pancreatic endocrinogenesis¶

This tutorial shows how cellDancer derives cell fates in the embedding space.

This tutorial also displays cellDancer’s ability to decipher cell identity based on the cell-specific reaction rates.

Below is the case study for pancreatic endocrinogenesis. We follow the gene and cell filtering methods of Bergen et.al. 3,696 cells with 2,000 genes are selected.

Import packages¶

To run the notebook locally, Installation could be referred to install the environment and dependencies.

[1]:

# import packages
import os
import sys
import glob
import pandas as pd
import math
import matplotlib.pyplot as plt
import celldancer as cd
import celldancer.cdplt as cdplt
from celldancer.cdplt import colormap

Load cellDancer result¶

We use the preprocessed data PancreaticEndocrinogenesis_cell_type_u_s.csv.zip to predict the RNA velocities (see the details of pre-processing steps Data Preparation). Below are the command we used for the RNA velocity estimation.

cellDancer_df = cd.velocity(cell_type_u_s, permutation_ratio=0.5, n_jobs=8)

The prediction result can be downloaded from PancreaticEndocrinogenesis_cellDancer_estimation.csv.zip.

[2]:

cellDancer_df_path = 'your_path/PancreaticEndocrinogenesis_cellDancer_estimation.csv'
cellDancer_df=pd.read_csv(cellDancer_df_path)
cellDancer_df

[2]:

	cellIndex	gene_name	unsplice	splice	unsplice_predict	splice_predict	alpha	beta	gamma	loss	cellID	clusters	embedding1	embedding2
0	0	Scaper	0.489151	0.211323	0.486841	0.185478	0.334391	0.629880	0.647696	0.043435	AAACCTGAGAGGGATA	Pre-endocrine	6.143066	-0.063644
1	1	Scaper	0.278262	0.125742	0.293904	0.107517	0.224909	0.626946	0.648537	0.043435	AAACCTGAGCCTTGAT	Ductal	-9.906417	0.197778
2	2	Scaper	0.374380	0.298116	0.348888	0.197525	0.213979	0.648890	0.665279	0.043435	AAACCTGAGGCAATTA	Alpha	7.559791	0.583762
3	3	Scaper	0.320497	0.198031	0.318165	0.146120	0.219611	0.636208	0.655523	0.043435	AAACCTGCATCATCCC	Ductal	-11.283765	4.218998
4	4	Scaper	0.239145	0.141970	0.251417	0.106599	0.192050	0.631560	0.652837	0.043435	AAACCTGGTAAGTGGC	Ngn3 high EP	1.721565	-4.753407
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
7391995	3691	Tmem63a	0.000000	0.136270	0.000447	0.125414	0.005472	0.110095	0.067564	0.045492	TTTGTCAAGTGACATA	Pre-endocrine	4.768472	-1.388353
7391996	3692	Tmem63a	0.028708	0.000000	0.021286	0.022931	0.017270	0.100517	0.072741	0.045492	TTTGTCAAGTGTGGCA	Ngn3 high EP	-1.873335	-4.182650
7391997	3693	Tmem63a	0.000000	0.312690	0.000200	0.288797	0.002445	0.116196	0.064809	0.045492	TTTGTCAGTTGTTTGG	Ductal	-9.882250	-0.105594
7391998	3694	Tmem63a	0.028052	0.154164	0.019519	0.165354	0.008143	0.107124	0.069812	0.045492	TTTGTCATCGAATGCT	Alpha	6.612424	4.531895
7391999	3695	Tmem63a	0.076389	0.046049	0.055438	0.101639	0.024521	0.098170	0.073695	0.045492	TTTGTCATCTGTTTGT	Epsilon	3.071044	1.120432

7392000 rows × 14 columns

Project the RNA velocity onto the embedding space¶

We calculate the projection of RNA velocity on the embedding with cd.compute_cell_velocity(). The projected direction on embedding space, i.e. columns ‘velocity1’ and ‘velocity2’ are added to the original dataframe. We use cdplt.scatter_cell() to display the predicted direction on embedding space.

[3]:

# Compute cell velocity
cellDancer_df=cd.compute_cell_velocity(cellDancer_df=cellDancer_df, projection_neighbor_size=100)

# Plot cell velocity
fig, ax = plt.subplots(figsize=(10,10))
im = cdplt.scatter_cell(ax, cellDancer_df, colors=colormap.colormap_pancreas, alpha=0.5, s=10, velocity=True, legend='on', min_mass=5, arrow_grid=(20,20))
ax.axis('off')
plt.show()

https://raw.githack.com/GuangyuWangLab2021/cellDancer_website/main/docs/_images/notebooks_case_study_pancreas_10_0.png

Build UMAP based on reaction rates¶

We use cd.embedding_kinetic_para() to build the UMAP based on the predicted alpha, predicted beta, predicted gamma, or all of the three.

[4]:

cellDancer_df=cd.embedding_kinetic_para(cellDancer_df,'alpha')
cellDancer_df=cd.embedding_kinetic_para(cellDancer_df,'beta')
cellDancer_df=cd.embedding_kinetic_para(cellDancer_df,'gamma')
cellDancer_df=cd.embedding_kinetic_para(cellDancer_df,'alpha_beta_gamma')

We visualize the UMAP of kinetic parameter(s) with cdplt.plot_kinetic_para().

[5]:

fig, ax = plt.subplots(ncols=4, figsize=(16,8))
cdplt.plot_kinetic_para(ax[0], 'alpha_beta_gamma', cellDancer_df, color_map=colormap.colormap_pancreas)
cdplt.plot_kinetic_para(ax[1], 'alpha', cellDancer_df, color_map=colormap.colormap_pancreas)
cdplt.plot_kinetic_para(ax[2], 'beta', cellDancer_df, color_map=colormap.colormap_pancreas)
cdplt.plot_kinetic_para(ax[3], 'gamma', cellDancer_df, color_map=colormap.colormap_pancreas)

[5]:

<AxesSubplot:title={'center':'UMAP of gamma'}>

https://raw.githack.com/GuangyuWangLab2021/cellDancer_website/main/docs/_images/notebooks_case_study_pancreas_15_1.png