Daily notes on papers I have read and related reproduction results, written in Chinese.
For more notes, please follow the blog: https://nopsled.blog.csdn.net/
- Diffusion:
  - DiT
- Flow:
  - Flow Matching [Link]
- MOE:
- Vision Transformer:
- Agent: LLM-based Single/Multi Agent model/system
- Base Model: Large Language Model
- Dataset: Data building and processing for model training
- Long Sequence
  - RLM [Link]
- Prompt: Prompt Engineering
- Omni: LLM-based full-modality model
- Quantization: Model Weight/Optimizer/Activation Compression
  - COAT [Link]
- Speech: Speech LLM
  - ALM: Audio LLM for audio input
    - Audio Flamingo 3 [Link]
- Survey
- Training: LLM Training
  - Pretrain
    - FIM (fill-in-the-middle) [Link]
  - RL
  - SFT:
    - EAFT
  - Speculative Decoding or MTP: Speculative Decoding or Multi-token Prediction
- VLM: Visual LLM