{ "cells": [ { "cell_type": "markdown", "id": "f79d7304", "metadata": {}, "source": [ "# Loki Retrieve\n", "This notebook demonstrates how to run *Loki Retrieve* on the demo ST-bank dataset. It takes about 10 seconds to run this notebook on a MacBook Pro." ] }, { "cell_type": "code", "execution_count": 1, "id": "f9c44d2c-2bef-4280-96fc-085fc72167da", "metadata": { "scrolled": true }, "outputs": [], "source": [ "import torch\n", "from PIL import Image\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import cv2\n", "import os\n", "\n", "import loki.retrieve\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "id": "0b68f903", "metadata": {}, "source": [ "We provide the embeddings generated from the OmiCLIP model.\n", "The sample data and embeddings are stored in the directory `data/loki_retrieve/`, which can be downloaded from [Google Drive](https://drive.google.com/file/d/1aPK1nItsOEPxTihUAKMig-vLY-DMMIce/view?usp=sharing).\n", "\n", "The following files are needed to run retrieval on the example data:\n", "```\n", "├── checkpoint_stbank\n", "│ ├── demo_image_embeddings.pt\n", "│ └── demo_text_embeddings.pt\n", "└── demo_STbank_data\n", " ├── ADI-TCGA-ACCKVFLM.tif\n", " ├── LYM-TCGA-AFIDYMYA.tif\n", " ├── MUS-TCGA-AASRLCCT.tif\n", " ├── NORM-TCGA-AIMANAKC.tif\n", " ├── TUM-TCGA-YRWIKDYQ.tif\n", " ├── AACGATAGAAGGGCCG-1_hires.png\n", " ├── AATCGCGCAGAGGACT-1_hires.png\n", " ├── ACACAAATATTCCTAG-1_hires.png\n", " ├── ACCCTCCCTTGCTATT-1_hires.png\n", " ├── ACTGAAACGCCGTTAG-1_hires.png\n", " ├── AGGTAACCTCCTATTC-1_hires.png\n", " ├── ATTACTTACTGGGCAT-1_hires.png\n", " ├── CAACCTGAACCTGCCA-1_hires.png\n", " ├── CAGATGTTTGTCCCAA-1_hires.png\n", " ├── CCCTCAGATCGAGAAC-1_hires.png\n", " ├── CGTTGTAAACGTCAGG-1_hires.png\n", " ├── GCACAAACGAGGCGTG-1_hires.png\n", " ├── GCCGGTCGTATCTCTC-1_hires.png\n", " ├── GCTAGCGATAGGTCTT-1_hires.png\n", " ├── GCTTAATGTAACTAAC-1_hires.png\n", " ├── GGACACAAGTTTACAC-1_hires.png\n", " ├── 
GGGCTACTATTTCGTG-1_hires.png\n", " ├── GGTGTAGGTAAGTAAA-1_hires.png\n", " ├── TACCAAATAGCCCAGA-1_hires.png\n", " ├── TACCTACTCCCAGTAT-1_hires.png\n", " ├── TATCAGTGGCGTAGTC-1_hires.png\n", " ├── TATGGCCCGGCCTCGC-1_hires.png\n", " ├── TCAGAACCTCCACAGG-1_hires.png\n", " ├── TTGCCAAGCAGAACCC-1_hires.png\n", " ├── TTGGACCTATAACAGT-1_hires.png\n", " └── demo_dataset.csv \n", "```" ] }, { "cell_type": "code", "execution_count": 2, "id": "09584693", "metadata": {}, "outputs": [], "source": [ "data_path = './data/loki_retrieve/' \n", "image_dir = os.path.join(data_path, 'demo_STbank_data/')" ] }, { "cell_type": "code", "execution_count": 3, "id": "bb771e65-f56c-454c-86ad-4da8d2e5d761", "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
| | label | img_idx |
|---|---|---|
| CAGATGTTTGTCCCAA-1 | COL1A1 COL1A2 COL3A1 S100A6 DCN IGKC FSTL1 C3 ... | CAGATGTTTGTCCCAA-1_hires |
| GGGCTACTATTTCGTG-1 | COL1A1 COL3A1 IGKC COL1A2 DCN FABP4 IGFBP7 LUM... | GGGCTACTATTTCGTG-1_hires |
| TATCAGTGGCGTAGTC-1 | COL1A1 COL3A1 COL1A2 FBN1 IGKC DCN GSN IGHG2 F... | TATCAGTGGCGTAGTC-1_hires |
| GCCGGTCGTATCTCTC-1 | TAGLN MYL6 MYL9 TPM2 EMILIN1 RARRES2 GREM1 SEL... | GCCGGTCGTATCTCTC-1_hires |
| GGTGTAGGTAAGTAAA-1 | MYL9 DES RPS12 TPM2 TAGLN ACTA2 TFF1 S100A6 IS... | GGTGTAGGTAAGTAAA-1_hires |
| TACCAAATAGCCCAGA-1 | TAGLN DES MYL6 ACTG2 IGFBP7 TPM2 GSN MYL9 SELE... | TACCAAATAGCCCAGA-1_hires |
| AATCGCGCAGAGGACT-1 | DEFA5 DEFA6 REG3A PHGR1 FABP6 REG1A RPS12 S100... | AATCGCGCAGAGGACT-1_hires |
| TCAGAACCTCCACAGG-1 | DEFA5 PHGR1 REG1A SPINK4 FABP1 DEFA6 MUC2 FABP... | TCAGAACCTCCACAGG-1_hires |
| AACGATAGAAGGGCCG-1 | DEFA5 DEFA6 RPS12 RPL37 OLFM4 PHGR1 REG1A REG3... | AACGATAGAAGGGCCG-1_hires |