loki.preprocess.generate_gene_df

loki.preprocess.generate_gene_df(ad, house_keeping_genes, todense=True)

Generates a DataFrame with the top 50 genes for each observation in an AnnData object. It removes genes containing ‘.’ or ‘-’ in their names, as well as genes listed in the provided house_keeping_genes DataFrame/Series under the ‘genesymbol’ column.
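
The filtering rules described above can be sketched with pandas alone. This is a minimal illustration, not Loki's actual implementation: the gene names and the two-row housekeeping table are made up, and literal (non-regex) matching of ‘.’ and ‘-’ is an assumption.

```python
import pandas as pd

# Hypothetical gene names and housekeeping table, for illustration only.
genes = pd.Series(["ACTB", "GAPDH", "MT-CO1", "RP11-34P13.7", "AC004.1", "EPCAM", "KRT8"])
house_keeping_genes = pd.DataFrame({"genesymbol": ["ACTB", "GAPDH"]})

# Assumption: names are matched literally, so '.' means a dot, not a regex wildcard.
has_dot = genes.str.contains(".", regex=False)
has_dash = genes.str.contains("-", regex=False)
is_housekeeping = genes.isin(house_keeping_genes["genesymbol"])

# Keep only genes that trigger none of the three exclusion rules.
kept = genes[~(has_dot | has_dash | is_housekeeping)]
print(kept.tolist())
```

Only the names that contain neither character and are absent from the ‘genesymbol’ column survive the filter.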

Parameters:
  • ad (anndata.AnnData) – An AnnData object containing gene expression data.

  • house_keeping_genes (pandas.DataFrame or pandas.Series) – DataFrame or Series whose ‘genesymbol’ column (or entry, for a Series) lists the housekeeping genes to exclude.

  • todense (bool) – Whether to convert the sparse matrix (ad.X) to a dense matrix before creating the DataFrame. Defaults to True.

Returns:

A DataFrame (top_k_genes_str) with a single ‘label’ column. Each entry in ‘label’ is a space-separated string of the top 50 gene names for that observation.

Return type:

pandas.DataFrame
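
To make the shape of the return value concrete, the sketch below builds a ‘label’ column from a small dense expression table. It is not Loki's code: top_k_labels is a hypothetical helper, the data are invented, and ranking genes by their value in each row is an assumption, since the description above does not state the ranking criterion. k is set to 2 instead of 50 to keep the example readable.

```python
import pandas as pd


def top_k_labels(expr: pd.DataFrame, k: int = 50) -> pd.DataFrame:
    """Hypothetical helper: join the names of the k largest columns in each row."""
    # Assumption: "top" means highest value within the row (observation).
    labels = expr.apply(lambda row: " ".join(row.nlargest(k).index), axis=1)
    return pd.DataFrame({"label": labels})


# Invented expression values: rows are observations, columns are genes.
expr = pd.DataFrame(
    [[5.0, 0.0, 2.0, 1.0],
     [0.0, 3.0, 4.0, 1.0]],
    index=["spot_1", "spot_2"],
    columns=["EPCAM", "KRT8", "CD3E", "MS4A1"],
)

labels = top_k_labels(expr, k=2)
print(labels)
```

The result keeps one row per observation, indexed like the input, with gene names ordered from highest to lowest value inside each string.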