More details in fine-tuning with MotionVid-QA #5

@Junhojuno

Hello, authors!

I have some questions about fine-tuning the VLM following the paper.
Appendix D states: "During SFT, the vision tower, LLM, and merger components were all trainable. During DPO training, we exclusively trained the LLM part." This is not clear to me. I assume the LLM is LoRA-tuned, but I am not sure about the vision tower. Is the vision tower also LoRA-tuned, or is it fully fine-tuned?

Additionally, the fine-tuning settings, such as the number of epochs, are also not clear.

Could you please help clarify these?
