Fine-tuning Llama 3.1 8B to automatically convert doctor-patient conversation transcripts into structured clinical notes using QLoRA.
- Python 3.10+
- Conda (Anaconda or Miniconda) — for local environment setup
- A Google account (if running on Google Colab)
- A HuggingFace account with access to Meta Llama models (https://huggingface.co/meta-llama)
- GPU — this project requires a CUDA-capable GPU. An A100 is recommended, but T4/V100 will also work (slower training).
```bash
conda create -n clinical-note-gen python=3.10 -y
conda activate clinical-note-gen
```

Install PyTorch with the appropriate CUDA version for your system. For CUDA 12.1:

```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia -y
```

For other CUDA versions, check https://pytorch.org/get-started/locally/

```bash
pip install -r requirements.txt
pip install git+https://github.com/google-research/bleurt.git
jupyter notebook fine_tuned_LLM.ipynb
```

If you don't have a local GPU, use Google Colab instead. No environment setup is needed — the notebook installs everything automatically.
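After installation, a quick sanity check can confirm the core dependencies resolved before launching the notebook. This is a minimal sketch — `missing_packages` is a hypothetical helper, and the package list is illustrative of the stack a QLoRA fine-tuning notebook typically needs:

```python
import importlib.util

def missing_packages(pkgs):
    """Return the packages from pkgs that are not importable in this environment."""
    return [p for p in pkgs if importlib.util.find_spec(p) is None]

# Illustrative core stack for QLoRA fine-tuning; an empty list means all resolved
print(missing_packages(["torch", "transformers", "peft", "bitsandbytes"]))
```

If anything is reported missing, rerun `pip install -r requirements.txt` inside the activated environment.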
Upload fine_tuned_LLM.ipynb to Google Colab, or open it directly if hosted on GitHub/Google Drive.
- Go to Runtime > Change runtime type
- Set Hardware accelerator to GPU
- Select A100 if available (under the "High-RAM" option), otherwise T4 or V100
- Click Save
- Go to Runtime > Run all
- Alternatively, run cells one by one using Shift + Enter
Important: After the first cell (installation cell), you may need to restart the runtime before continuing. Colab will prompt you if needed. After restarting, run all cells from the second cell onward.
That's it. The notebook handles everything automatically:
- Installs all dependencies
- Downloads the ACI-Bench dataset
- Loads and prepares the data
- Loads the pre-trained model
- Runs baseline evaluation
- Trains 3 hyperparameter configurations
- Evaluates and compares all models
- Performs error analysis
- Saves the best model
- Runs a demo inference
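To illustrate the data-preparation step, the sketch below shows one way a transcript/note pair could be rendered into a Llama 3.1 chat-style training string. The system prompt is invented for illustration, and while the special tokens follow Llama 3.1's published chat format, the notebook's actual template may differ — treat this as an assumption, not the notebook's exact code:

```python
# Hypothetical prompt builder for one training example.
# The special tokens follow Llama 3.1's chat format; verify against the notebook.
SYSTEM = "You are a clinical scribe. Convert the conversation into a structured clinical note."

def build_example(dialogue: str, note: str) -> str:
    """Render one transcript/note pair as a single chat-formatted training string."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{SYSTEM}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{dialogue}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
        f"{note}<|eot_id|>"
    )

example = build_example(
    "[doctor] How are you feeling today?",
    "CHIEF COMPLAINT: Follow-up visit.",
)
print(example)
```

During training, the loss is typically computed only on the assistant portion (the note), so the model learns to generate notes rather than transcripts.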
.
├── fine_tuned_LLM.ipynb # Main notebook — run this
├── requirements.txt # Python dependencies
├── README.md # This file
.
├── aci-bench/ # Cloned dataset (created automatically)
├── best_model_lora/ # Saved best fine-tuned LoRA adapters
├── outputs_config1/ # Training checkpoints for Config 1
├── outputs_config2/ # Training checkpoints for Config 2
├── outputs_config3/ # Training checkpoints for Config 3
├── evaluation_results.pkl # Full evaluation results (with predictions)
├── evaluation_results.json # Metrics summary in JSON format
├── evaluation_results.png # Visualization charts
├── sample_predictions.txt # Example generated notes for review
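Once a run completes, the configurations can be compared programmatically from `evaluation_results.json`. The snippet below is a sketch: the key names (`config1`, `rougeL`) are assumptions about the file's layout, not its actual schema — check the JSON produced by your run:

```python
import json

def best_config(results: dict, metric: str = "rougeL") -> str:
    """Return the config name with the highest value for the given metric."""
    return max(results, key=lambda name: results[name][metric])

# Toy numbers for illustration; in practice load the real file:
#   with open("evaluation_results.json") as f:
#       results = json.load(f)
results = {
    "config1": {"rougeL": 0.31},
    "config2": {"rougeL": 0.35},
    "config3": {"rougeL": 0.33},
}
print(best_config(results))  # config2
```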
- "CUDA out of memory" — Restart the runtime and try again. If it persists, reduce
MAX_SEQ_LENGTHfrom 4096 to 2048 in the model loading cell. - "Module not found" errors after installation — Restart the runtime (or your Jupyter kernel) after the installation cell, then run from the imports cell onward.
- Slow training — Make sure you are using a GPU runtime. CPU-only will not work for this project.
- Conda environment issues — If `pip install -r requirements.txt` fails, try installing packages one at a time to identify the problematic dependency.
- BLEURT installation fails — BLEURT is optional. The notebook will skip BLEURT scoring if it is not installed and still run all other metrics.
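The one-at-a-time install can be scripted. This rough sketch assumes a simple `requirements.txt` — one plain package spec per line, with no comments or `-r` includes:

```shell
# Install each requirement individually; record any spec that fails
while read -r pkg; do
  [ -z "$pkg" ] && continue            # skip blank lines
  pip install "$pkg" || echo "FAILED: $pkg" >> failed_packages.txt
done < requirements.txt
```

Afterwards, `failed_packages.txt` (if it exists) lists the problematic dependencies to investigate individually.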