CS329H_DiningbyDesign

This repository contains the implementation of a personalized restaurant recommendation system using Direct Preference Optimization (DPO) to fine-tune large language models on user preferences derived from Yelp data.

Overview

This project fine-tunes small language models (Qwen3-0.6B, LFM2-350M) using DPO on personalized restaurant recommendation tasks. We generate natural language user and restaurant profiles from Yelp review data, then train models to predict user preferences by comparing the model's perplexity on preferred vs. non-preferred restaurants.
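
The perplexity comparison described above can be sketched as follows. This is a minimal illustration rather than the repository's evaluation code: the function names and the toy per-token log-probabilities are hypothetical, and in practice the log-probabilities would come from the model scoring each restaurant text given the user profile.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def prefers_chosen(chosen_logprobs, rejected_logprobs):
    """True when the preferred restaurant gets the lower perplexity."""
    return perplexity(chosen_logprobs) < perplexity(rejected_logprobs)

def pairwise_accuracy(pairs):
    """Fraction of (chosen, rejected) pairs ranked in the right order."""
    correct = sum(prefers_chosen(c, r) for c, r in pairs)
    return correct / len(pairs)

# Toy natural-log probabilities (hypothetical values, not model output).
pairs = [
    ([-0.2, -0.4, -0.3], [-1.1, -0.9, -1.3]),  # chosen text is more likely
    ([-1.5, -1.2], [-0.5, -0.6]),              # rejected text is more likely
]
print(pairwise_accuracy(pairs))
```

Averaging per token before exponentiating keeps the comparison fair when the two restaurant profiles differ in length.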

Key Components:

Profile generation from Yelp reviews using GPT-based models
DPO training with LoRA parameter-efficient fine-tuning
Multiple experimental settings (single-item vs. full-list recommendations)
Baseline comparison with GPT-4o-mini
Evaluation metrics: Perplexity (chosen/rejected), MAP, NDCG, Pairwise Accuracy
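
For readers new to DPO, the standard per-pair objective can be written in a few lines. This is a sketch of the published DPO loss, not the training code in this repository (the notebooks install trl for training); the function name, the toy log-probabilities, and beta=0.1 are assumptions made for illustration.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, from summed sequence log-probs.

    pi_* are log-probabilities under the model being trained; ref_* are
    log-probabilities under the frozen reference model. beta=0.1 is an
    assumed value, not one taken from this repository.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)) computed as softplus(-margin) for stability
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# Hypothetical log-probabilities: the policy has moved toward the chosen text.
print(dpo_loss(pi_chosen=-10.0, pi_rejected=-20.0,
               ref_chosen=-15.0, ref_rejected=-15.0))
```

The loss equals log 2 when the policy and the reference agree, and falls as the policy raises the chosen response relative to the rejected one.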

Repository Structure

CS329H_DiningbyDesign/
├── LLM4Rec/                          # Main experimentation directory (Google Colab)
│   ├── data/                         # Shared data files
│   ├── hannah_qwen_list/             # Qwen3-0.6B length2-list recommendation
│   │   └── dpo_train_hannah_list.ipynb
│   ├── hannah_qwen_single/           # Qwen3-0.6B single-item recommendation
│   │   └── dpo_train_hannah_single.ipynb
│   ├── hannah_GPT4o_baseline/        # GPT-4o-mini baseline experiments
│   │   └── dpo_train_hannah3_baseline.ipynb
│   ├── justin_LFM_fulllist/          # LFM2-350M full-list recommendation
│   │   └── justin_LFM_Fulllist.ipynb
│   ├── justin_Qwen_fulllist/         # Qwen3-0.6B full-list (alternative)
│   │   └── justin_Qwen_Fulllist.ipynb
│   ├── yanzhen_final_list/           # LFM2 length2-list recommendation
│   │   └── dpo_train_yanzhen_list.ipynb
│   └── yanzhen_final_single/         # LFM2 single-item recommendation
│       └── dpo_train_yanzhen_single.ipynb
│
├── LLM4Rec_server/                   # Server-based training (basic DPO)
│   ├── script/                       # Training and inference scripts
│   │   ├── train_full.sh
│   │   ├── train_small.sh
│   │   └── run_inference_eval.sh
│   ├── train_dpo.py                  # DPO training script
│   ├── inference_dpo.py              # Inference and evaluation
│   ├── create_dpo_dataset.py         # Dataset preparation
│   ├── create_eval_split.py          # Evaluation split creation
│   ├── analyze_inference_results.py  # Results analysis
│   ├── download_data.py              # Data download utility
│   └── requirements.txt              # Python dependencies
│
├── profile_generation/               # Profile generation scripts
│   ├── generate_profiles.py          # Main profile generation script
│   ├── generate_profile_user.sh      # User profile generation
│   ├── generate_profile_resaurants.sh  # Restaurant profile generation
│   └── constants.py                  # Configuration constants
│
├── data_processing_code/             # Data preprocessing notebooks
│   ├── data_processing.ipynb
│   └── data_combine_clean.ipynb
│
└── README.md                         # This file

Environment Setup

For Google Colab (LLM4Rec)

The notebooks in LLM4Rec/ are designed to run on Google Colab with A100 GPUs. Each notebook contains installation cells:

!pip install trl
!pip install bitsandbytes
!pip install huggingface_hub

To run them locally, install the dependencies from the repository root:

pip install -r requirements.txt

Required packages (minimum versions; pip installs a compatible release of each):

torch>=2.0.0
transformers>=4.40.0
datasets>=2.14.0
peft>=0.10.0
trl>=0.8.0
bitsandbytes>=0.43.0
accelerate>=0.27.0
wandb>=0.16.0

For Server Environment (LLM4Rec_server)

cd LLM4Rec_server
pip install -r requirements.txt

See LLM4Rec_server/requirements.txt for exact dependency specifications.

Datasets

All datasets are hosted on Hugging Face and will be automatically downloaded when running the scripts.

User & Restaurant Profiles

zetianli/CS329H_Project_user_profiles
- 20,000 natural language user profiles generated from Yelp reviews
- Includes user review history with ratings
- Generated using GPT-based models (openai/gpt-oss-120b)
zetianli/CS329H_Project_business
- ~78,000 restaurant and non-restaurant business profiles
- Includes sampled review comments and business metadata
- Generated using GPT-based models

DPO Training & Evaluation Data Examples (Full List)

zetianli/CS329H_DPO_FullList_train
- DPO training set with user-restaurant preference pairs
- Constructed from users with rating gaps ≥ 2 stars
zetianli/CS329H_DPO_LFM_FullList_test_output
- Test set outputs for base and DPO-trained LFM2 models
zetianli/CS329H_DPO_Qwen_FullList_test_output
- Test set outputs for base and DPO-trained Qwen3 models
HannahGrj/LLM4Rec_DPO_List_test_with_responses
- GPT-4o-mini baseline outputs on test set

Data Format

Training Input Format:

User profile + Restaurant1 profile → Rating1
User profile + Restaurant2 profile → Rating2
...
User profile + RestaurantN profile → RatingN
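
One way to turn rated examples in this format into DPO preference pairs, using the rating gap of at least 2 stars mentioned in the Datasets section, is sketched below. The exact pairing rule used to build the released datasets is not stated here, so the all-pairs strategy, the function name, and the toy profiles are assumptions; the `prompt`/`chosen`/`rejected` keys follow the common DPO dataset convention.

```python
from itertools import combinations

def build_preference_pairs(user_profile, rated_restaurants, min_gap=2):
    """Turn one user's (restaurant_profile, stars) list into DPO-style records."""
    pairs = []
    for (prof_a, stars_a), (prof_b, stars_b) in combinations(rated_restaurants, 2):
        if abs(stars_a - stars_b) < min_gap:
            continue  # gap too small to count as a clear preference
        if stars_a > stars_b:
            chosen, rejected = prof_a, prof_b
        else:
            chosen, rejected = prof_b, prof_a
        pairs.append({"prompt": user_profile, "chosen": chosen, "rejected": rejected})
    return pairs

# Hypothetical toy data standing in for generated profiles.
history = [("Cozy ramen bar", 5), ("Chain burger spot", 2), ("Taco truck", 4)]
pairs = build_preference_pairs("Likes noodle soups and small venues.", history)
for p in pairs:
    print(p["chosen"], ">", p["rejected"])
```

Pairs whose ratings differ by less than `min_gap` are skipped so that each training example reflects a clear preference.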

Reproducing Results

Quick Start (Server Environment)

# 1. Download data
cd LLM4Rec_server
python download_data.py

# 2. Create DPO dataset
python create_dpo_dataset.py

# 3. Train on small dataset (for testing)
bash script/train_small.sh

# 4. Train on full dataset
bash script/train_full.sh

# 5. Run inference and evaluation
bash script/run_inference_eval.sh

# 6. Analyze results
python analyze_inference_results.py

Reproducing Paper Results (Google Colab)

Each experimental setting in LLM4Rec/ has a dedicated notebook:

1. Open the desired notebook in Google Colab
2. Mount Google Drive (for saving checkpoints)
3. Run all cells sequentially

Notebook	Model	Setting	Produces
`hannah_qwen_list/dpo_train_hannah_list.ipynb`	Qwen3-0.6B	Length2-list	Metrics & model checkpoints
`hannah_qwen_single/dpo_train_hannah_single.ipynb`	Qwen3-0.6B	Single-item	Metrics & model checkpoints
`hannah_GPT4o_baseline/dpo_train_hannah3_baseline.ipynb`	GPT-4o-mini	Baseline	Baseline metrics
`justin_LFM_fulllist/justin_LFM_Fulllist.ipynb`	LFM2-350M	Full-list	Metrics & model checkpoints
`justin_Qwen_fulllist/justin_Qwen_Fulllist.ipynb`	Qwen3-0.6B	Full-list (alt)	Metrics & model checkpoints
`yanzhen_final_list/dpo_train_yanzhen_list.ipynb`	LFM2-350M	Length2-list	Metrics & model checkpoints
`yanzhen_final_single/dpo_train_yanzhen_single.ipynb`	LFM2-350M	Single-item	Metrics & model checkpoints

Expected outputs:

Training logs in Weights & Biases
Model checkpoints saved to Google Drive
Evaluation metrics (PPL, MAP, NDCG, Pairwise Accuracy) printed in notebook
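
For reference, the ranking metrics named above can be computed as in the sketch below. It uses textbook definitions and is not copied from the notebooks: the linear-gain form of DCG, the binary relevance input for MAP, and all function names are assumptions.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(ranked_relevances):
    """DCG of the predicted order divided by DCG of the ideal order."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

def average_precision(ranked_hits):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits, total = 0, 0.0
    for k, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0

def mean_average_precision(lists_of_hits):
    """Average precision averaged over users (one hit list per user)."""
    return sum(average_precision(h) for h in lists_of_hits) / len(lists_of_hits)

# Hypothetical example: true star ratings listed in the model's ranked order.
print(ndcg([5, 2, 4]))
print(mean_average_precision([[1, 0, 1], [0, 1, 0]]))
```

Here `ranked_relevances` holds the true ratings in the order the model ranked the restaurants, and `ranked_hits` marks which ranked items count as relevant.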

Generating Profiles (Optional)

To regenerate user/restaurant profiles from raw Yelp data:

cd profile_generation

# Generate user profiles
bash generate_profile_user.sh

# Generate restaurant profiles
bash generate_profile_resaurants.sh

Note: Profile generation requires access to GPT-based models and may incur API costs.

Trained Models

The trained DPO model checkpoints are referenced in the notebooks. Because of repository size limits, we cannot upload all of them to GitHub. Links to the trained models on Hugging Face:

Computational Requirements

Hardware

Recommended (used in our experiments):

GPU: NVIDIA A100 (40GB) on Google Colab
RAM: 12GB+ system memory
Storage: 20GB+ free space (for datasets and checkpoints)

Minimum:

GPU: NVIDIA T4 (16GB) or equivalent
RAM: 8GB+ system memory
4-bit quantization can reduce memory requirements

Runtime Estimates

Task	Hardware	Approximate Time
Full DPO training (2 epochs, 2000 examples)	A100 (40GB)	2-3 hours
Small dataset training (100 examples)	A100 (40GB)	10-15 minutes
Inference on test set (200 examples)	A100 (40GB)	5-10 minutes
Profile generation (20k users)	CPU/GPU	4-6 hours

Total time to reproduce all experiments: ~20 hours on A100 GPUs (excluding pre-trained model loading)

Software

Python 3.8+
CUDA 11.8+ (for GPU acceleration)
Google Colab (for LLM4Rec notebooks)

Reproducibility

Random Seeds

All stochastic processes use fixed random seeds for reproducibility:

Python/NumPy: RANDOM_SEED = 42
PyTorch: seed = 42 (set in training configs)
Dataset splitting: random_state=42
Data sampling: random.seed(42)

Seeds are set in:

LLM4Rec_server/train_dpo.py:40 (--random_seed=42)
LLM4Rec_server/create_small_dataset.py:23
Each notebook in LLM4Rec/ (cell with RANDOM_SEED = 42)
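
A helper along the following lines would cover the seed settings listed above. It is a sketch, not the code in `train_dpo.py` or the notebooks: the function name `set_all_seeds` is hypothetical, and NumPy and PyTorch are seeded only when they are installed.

```python
import random

RANDOM_SEED = 42

def set_all_seeds(seed=RANDOM_SEED):
    """Seed every random number generator available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

set_all_seeds()
first = random.random()
set_all_seeds()
print(first == random.random())  # same seed, same draw
```

Calling the helper once at the top of a script or notebook, before any data sampling or model initialization, keeps every later random draw tied to the same seed.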

Deterministic Execution

Training uses deterministic algorithms where possible:

use_seedable_sampler: true in training configs
Gradient checkpointing enabled for memory consistency
Fixed learning rate schedules (no adaptive scheduling)

Package Versions

While requirements.txt specifies minimum versions (e.g., torch>=2.0.0), our experiments used:

torch: 2.9.0+cu126
transformers: 4.57.2
datasets: 4.0.0
peft: 0.18.0
trl: 0.25.1
bitsandbytes: 0.48.2

For exact reproducibility, these versions are recommended (though not strictly required).
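
To compare an environment against the versions listed above, a small check like the one below can help. It uses only the standard library (`importlib.metadata`, available from Python 3.8); the `PINNED` dictionary and the function name are illustrative and not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above as used in the experiments.
PINNED = {
    "torch": "2.9.0+cu126",
    "transformers": "4.57.2",
    "datasets": "4.0.0",
    "peft": "0.18.0",
    "trl": "0.25.1",
    "bitsandbytes": "0.48.2",
}

def check_versions(pinned=PINNED):
    """Map each package to 'ok', 'missing', or the installed version if it differs."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else installed
    return report

for name, status in check_versions().items():
    print(f"{name}: {status}")
```

Any entry that is not `ok` points to a package worth pinning before attempting an exact reproduction.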

Running Scripts End-to-End

All scripts and notebooks are designed to run without modification:

No manual data preprocessing required - datasets auto-download from Hugging Face
No hardcoded paths - all paths are configurable or relative
No missing dependencies - all requirements specified in requirements.txt
Consistent output formats - all scripts produce standard JSON/JSONL outputs

Mapping Scripts to Paper Results

Paper Section/Figure	Script/Notebook	Output
Training metrics (Loss, PPL)	All training notebooks	Weights & Biases logs + notebook output
Evaluation metrics (MAP, NDCG)	`LLM4Rec_server/analyze_inference_results.py`	Console output + JSON files
Baseline comparison	`hannah_GPT4o_baseline/dpo_train_hannah3_baseline.ipynb`	Notebook output
Model performance comparison	All notebooks in `LLM4Rec/`	Combined metrics in W&B
