loki.preprocess.generate_gene_df
- loki.preprocess.generate_gene_df(ad, house_keeping_genes, todense=True)
Generates a DataFrame with the top 50 genes for each observation in an AnnData object. It removes genes containing ‘.’ or ‘-’ in their names, as well as genes listed in the provided house_keeping_genes DataFrame/Series under the ‘genesymbol’ column.
- Parameters:
ad (anndata.AnnData) – An AnnData object containing gene expression data.
house_keeping_genes (pandas.DataFrame or pandas.Series) – DataFrame or Series, indexable by ‘genesymbol’, that lists the housekeeping genes to exclude.
todense (bool) – Whether to convert the sparse matrix (ad.X) to a dense matrix before creating a DataFrame. Defaults to True.
- Returns:
A DataFrame (top_k_genes_str) that contains a ‘label’ column. Each row in ‘label’ is a string with the top 50 gene names (space-separated) for that observation.
- Return type:
pandas.DataFrame
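
Example (illustrative sketch):
The filtering and ranking described above can be approximated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name top_genes_sketch, its k parameter, and the toy data are hypothetical, and it assumes that "top" means highest expression value per observation, in descending order. It takes an observations-by-genes DataFrame as input, which for a real AnnData object could be obtained with ad.to_df().

```python
import pandas as pd


def top_genes_sketch(expr, house_keeping_genes, k=50):
    """Hypothetical re-creation of the documented behaviour.

    expr: observations x genes DataFrame of expression values.
    house_keeping_genes: object whose 'genesymbol' entry lists genes to drop.
    k: number of genes kept per observation (the documented function uses 50).
    """
    excluded = set(house_keeping_genes["genesymbol"])

    # Drop genes whose names contain '.' or '-', and the housekeeping genes.
    kept = [
        gene
        for gene in expr.columns
        if "." not in gene and "-" not in gene and gene not in excluded
    ]
    filtered = expr[kept]

    # Assumption: rank genes by expression value, highest first, per row.
    labels = filtered.apply(
        lambda row: " ".join(row.nlargest(k).index), axis=1
    )
    return pd.DataFrame({"label": labels})


# Toy data: two observations, six genes (all names are made up).
expr = pd.DataFrame(
    [[5, 1, 9, 3, 7, 4], [2, 8, 0, 6, 4, 1]],
    index=["spot_0", "spot_1"],
    columns=["GENE_A", "MT-CO1", "ACTB", "GENE_B", "RP11.1", "GENE_C"],
)
house_keeping = pd.DataFrame({"genesymbol": ["ACTB"]})

result = top_genes_sketch(expr, house_keeping, k=2)
print(result)
```

With k=2, each row of the ‘label’ column holds two space-separated gene names; ‘MT-CO1’, ‘RP11.1’, and ‘ACTB’ never appear because they are filtered out before ranking.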