# ProtVL Inference Pipeline

A multi-GPU inference pipeline for generating protein expression images using ProtVL.

## Overview

This script performs conditional image generation for proteins. It takes reference microscopy channels (DAPI, tubulin, ER) as input and generates predicted protein expression patterns. It supports distributed inference across multiple GPUs via HuggingFace Accelerate.
| 8 | + |
| 9 | +## Requirements |
| 10 | + |
| 11 | +- Python 3.x |
| 12 | +- PyTorch |
| 13 | +- HuggingFace Diffusers & Accelerate |
| 14 | +- timm |
| 15 | +- NumPy, Pandas, SciPy |
| 16 | +- tifffile |
| 17 | +- tqdm |
| 18 | + |
## Usage

CPU:

```bash
python ordinary_sampler_standalone.py \
    --csv_file_path p4ha2_example.csv \
    --model_path ./checkpoint-1020000/ \
    --vae_path ./vae \
    --cell_line_map_path ./cell_line_dict.pkl \
    --antibody_map_path ./antibody_map.pkl \
    --output_dir ./example_output \
    --batch_size 16 \
    --num_workers 4 \
    --num_inference_steps 100
```
| 35 | + |
| 36 | +Single GPU: |
| 37 | +```bash |
| 38 | +python ordinary_sampler_standalone.py \ |
| 39 | + --csv_file_path", "p4ha2_example.csv \ |
| 40 | + --model_path, ./checkpoint-1020000/ \ |
| 41 | + --vae_path ./vae \ |
| 42 | + --antibody_map_path ./antibody_map.pkl \ |
| 43 | + --cell_line_map_path ./cell_line_dict.pkl \ |
| 44 | + --antibody_map_path ./antibody_dict.pkl \ |
| 45 | + --mixed_precision ./example_output\ |
| 46 | + --batch_size 16 \ |
| 47 | + --num_workers, 4 \ |
| 48 | + --num_inference_steps 100 |
| 49 | +``` |
| 50 | + |
| 51 | +Multi-GPU with Accelerate: |
| 52 | +```bash |
| 53 | +accelerate launch --num_processes 4 ordinary_sampler_standalone.py \ |
| 54 | + --csv_file_path", "p4ha2_example.csv \ |
| 55 | + --model_path, ./checkpoint-1020000/ \ |
| 56 | + --vae_path ./vae \ |
| 57 | + --antibody_map_path ./antibody_map.pkl \ |
| 58 | + --cell_line_map_path ./cell_line_dict.pkl \ |
| 59 | + --antibody_map_path ./antibody_dict.pkl \ |
| 60 | + --mixed_precision ./example_output\ |
| 61 | + --batch_size 16 \ |
| 62 | + --num_workers 4 \ |
| 63 | + --num_inference_steps 100 |
| 64 | +``` |
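For intuition, here is how a CSV of input rows is typically divided when launching with `--num_processes 4`. This is a sketch of round-robin sharding, not necessarily the exact split this script or Accelerate's `DataLoader` performs internally:

```python
# Sketch: round-robin sharding of CSV rows across processes.
# Each rank processes rows[rank::num_processes]; the script's actual
# distribution strategy may differ, but the coverage property is the same.

def shard_rows(rows, rank, num_processes):
    """Return the subset of rows handled by one process."""
    return rows[rank::num_processes]

rows = list(range(10))  # stand-in for 10 CSV rows
shards = [shard_rows(rows, r, 4) for r in range(4)]
```

Note that `--batch_size` is per GPU, so the effective batch size with 4 processes is `4 * batch_size`.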
### Key Arguments

| Argument | Default | Description |
|----------|---------|-------------|
| `--model_path` | Required | Path to pretrained DiT model |
| `--vae_path` | Required | Path to VAE checkpoint |
| `--csv_file_path` | Required | CSV with image paths and metadata |
| `--cell_line_map_path` | Required | Cell line name-to-index mapping |
| `--antibody_map_path` | Required | Antibody name-to-index mapping |
| `--output_dir` | `output` | Output directory for generated images |
| `--batch_size` | 4 | Samples per GPU |
| `--num_inference_steps` | 50 | Diffusion sampling steps |
| `--mixed_precision` | `no` | Mixed precision mode (`no`, `fp16`, `bf16`) |
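The two map files are pickled name-to-index lookups. A minimal sketch of creating them, assuming they are plain `dict`s from name to integer index (the exact schema and the example names below are assumptions — the indices must match the ones used when the checkpoint was trained):

```python
import pickle

# Assumed format: a plain dict mapping name -> integer index.
# The names and indices here are placeholders; they must match the
# training-time vocabulary of your checkpoint.
cell_line_map = {"A-431": 0, "U2OS": 1, "U-251 MG": 2}
antibody_map = {"P4HA2": 0}

with open("cell_line_dict.pkl", "wb") as f:
    pickle.dump(cell_line_map, f)
with open("antibody_map.pkl", "wb") as f:
    pickle.dump(antibody_map, f)
```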
## Input Format

**CSV file** must contain columns:

- `image_path`: Path to input TIFF
- `cell_line_name`: Cell line identifier
- `gene_name`: Target protein/antibody name

**Image format**: 3- or 4-channel TIFF (DAPI, Antibody (optional), Tubulin, ER) with shape (H, W, C), normalized to [-1, 1]
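A minimal sketch of writing a conforming input CSV with the standard library (the paths and names are illustrative placeholders):

```python
import csv

# Column names are taken from this README; the row values are placeholders.
rows = [
    {"image_path": "images/sample_0.tif",
     "cell_line_name": "U2OS",
     "gene_name": "P4HA2"},
]

with open("p4ha2_example.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["image_path", "cell_line_name", "gene_name"])
    writer.writeheader()
    writer.writerows(rows)
```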
## Output

For each input image, the script generates:

- `{basename}_{cell_line}_{protein}_pred.tif`: Predicted protein + reference channels
- `{basename}_{cell_line}_{protein}_real.tif`: Ground truth + reference channels (if available)

Output TIFFs have 4 channels in this order: DAPI, Protein, Tubulin, ER.
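For downstream analysis, the channels can be split by name. The sketch below assumes channels-last `(H, W, 4)` layout with the order stated above; check one real output (e.g. `tifffile.imread(path).shape`) before batch-processing, since the saved axis order is an assumption here:

```python
import numpy as np

# Channel order follows this README; channels-last layout is an assumption.
CHANNELS = ("DAPI", "Protein", "Tubulin", "ER")

def split_channels(img):
    """Map channel name -> 2-D plane for an (H, W, 4) array."""
    if img.shape[-1] != len(CHANNELS):
        raise ValueError(f"expected {len(CHANNELS)} channels, got {img.shape}")
    return {name: img[..., i] for i, name in enumerate(CHANNELS)}

img = np.zeros((256, 256, 4), dtype=np.float32)  # stand-in for an output image
planes = split_channels(img)
```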
## Logging

Synchronized logs from all GPUs are written to `--log_dir`:

- `inference_log_{timestamp}.txt`: Human-readable log
- `metrics_{timestamp}.json`: Machine-parseable metrics
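The metrics schema is not documented here, so a collector should assume nothing beyond "it is JSON". A minimal sketch that gathers every `metrics_*.json` under a log directory:

```python
import json
from pathlib import Path

def load_metrics(log_dir):
    """Collect the parsed contents of every metrics_*.json under log_dir.

    Returns {filename: parsed_json}. No schema is assumed beyond valid JSON.
    """
    out = {}
    for path in sorted(Path(log_dir).glob("metrics_*.json")):
        out[path.name] = json.loads(path.read_text())
    return out
```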