46 commits
a04ef3e
feat:add Qwen2.5omni text modal processing
KKkai0315 Jan 22, 2026
c9333ab
add qwen2.5omni vision, audio modal
KKkai0315 Jan 23, 2026
e959822
fix: Enhance quantization modules. Introduced FixedActivationQDQ for …
chenghuaWang Jan 17, 2026
0672432
fix: Suppress deprecated comma-subscript warnings in CMake and remove…
chenghuaWang Jan 17, 2026
927f7eb
feat(qualcomm): Add installation targets for flatbuffers and MllmQNNB…
chenghuaWang Jan 19, 2026
d2e6b36
feat(qualcomm): Refactor Qwen3 model to integrate ConcatObserver for …
chenghuaWang Jan 19, 2026
48c259a
feat(cpu): Implement fill operations for various data types including…
chenghuaWang Jan 20, 2026
e976d11
feat(qnn): Enhance QNNBackend initialization with improved logging an…
chenghuaWang Jan 21, 2026
224d68e
feat(qnn): Update quantization handling and embedding output data typ…
chenghuaWang Jan 23, 2026
d2d5c09
feat(qwen3): Integrate QEmbedding for quantized embeddings and refine…
chenghuaWang Jan 23, 2026
c4f2306
fix
KKkai0315 Jan 23, 2026
a235a13
fix
KKkai0315 Jan 23, 2026
eeac11f
Merge remote-tracking branch 'refs/remotes/origin/main'
KKkai0315 Jan 23, 2026
adc3b64
add ConvTranspose1dOp & TanhOp
KKkai0315 Jan 24, 2026
674f97c
fix: fix Tanh op and add test for Tanh Op and ConvTranspose1d Op
KKkai0315 Jan 25, 2026
e1ba448
add minicpmo45
KKkai0315 Feb 23, 2026
8c0cda7
merge
KKkai0315 Feb 23, 2026
af574ae
add
KKkai0315 Feb 24, 2026
06b754c
add qwen2.5o talker
KKkai0315 Mar 5, 2026
5676edc
add
KKkai0315 Mar 5, 2026
4baacd3
Merge branch 'main' into main
oreomaker Mar 12, 2026
3bdf6e0
fix
KKkai0315 Mar 12, 2026
571b93d
add minicpm-o4.5 system ref audio prompt path
KKkai0315 Mar 12, 2026
d7c1b30
fix
KKkai0315 Mar 25, 2026
f185440
feat(mllm_kernel): simplify JIT usage in README and update kernel exa…
chenghuaWang Feb 17, 2026
289b74b
feat: update dependencies and refactor mobile module structure
chenghuaWang Feb 18, 2026
45c2fb7
feat: enhance configuration management and update dependencies
chenghuaWang Feb 18, 2026
14ce9cd
feat: add main entry points and configuration for pymllm and mllm-kernel
chenghuaWang Feb 18, 2026
027b0df
feat: enhance layer implementations and add new components
chenghuaWang Feb 19, 2026
f6aee67
feat: add initial files for pymllm architecture and launch functionality
chenghuaWang Feb 19, 2026
4fd3d34
feat: update dependencies and enhance configuration structure
chenghuaWang Feb 21, 2026
57ef372
feat: implement store_cache functionality and related components
chenghuaWang Feb 21, 2026
7f78efa
refactor: improve socket initialization in TokenizerProcess
chenghuaWang Feb 21, 2026
7f5d7d9
feat(engine): support batch generation and enable shared memory queue…
chenghuaWang Feb 27, 2026
5d13411
feat(mllm-kernel): add high-performance create_kv_indices CUDA kernel…
chenghuaWang Mar 2, 2026
f10363c
feat(sampling): add sampling module with FlashInfer acceleration and …
chenghuaWang Mar 2, 2026
c366ffc
feat(cuda): add fused GDN decode and RMSNorm+SiLU gating kernels for …
chenghuaWang Mar 9, 2026
506d61a
fix(attention): refine FlashInfer backend logic and improve RadixCach…
chenghuaWang Mar 17, 2026
a420a05
refactor: improve code readability and structure across multiple modules
chenghuaWang Mar 17, 2026
9d33d0d
chore: update installation instructions and add new skills for pymllm
chenghuaWang Mar 17, 2026
fd16226
refactor: enhance installation instructions and improve cache management
chenghuaWang Mar 17, 2026
a6a993a
refactor: enhance configuration management and improve process health…
chenghuaWang Mar 18, 2026
a78e3a0
feat(mllm-kernel): introduce new Marlin kernel implementations for ef…
chenghuaWang Mar 18, 2026
9453134
feat(quantization): implement quantization configuration loading and …
chenghuaWang Mar 18, 2026
5560096
feat(docs): update README files with latest news and model integratio…
chenghuaWang Mar 18, 2026
e78ea11
fix
KKkai0315 Mar 25, 2026
486 changes: 486 additions & 0 deletions .claude/skills/impl-jit-kernel/SKILL.md

Large diffs are not rendered by default.

73 changes: 73 additions & 0 deletions .claude/skills/install-pymllm/SKILL.md
@@ -0,0 +1,73 @@
---
name: install-pymllm
description: Install the pymllm Python package. Asks the user whether to do a full build (with CMake C++ compilation) or a fast install (Python-only, skip CMake). Use when the user asks to install, set up, or reinstall pymllm.
---

# Install pymllm

## Goal

Help the user install the `pymllm` package with the right configuration for their use case.

## Workflow

### Step 1: Ask the user which install mode they want

Use `AskUserQuestion` to present two options:

**Full Install (with C++ build)**
- Compiles the C++ mllm runtime and FFI extension via CMake
- Required if the user needs mobile inference, model conversion with FFI, or CPU/QNN backends
- Slower (several minutes depending on the machine)
- Command: `pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall`

**Fast Install (Python-only, skip CMake)**
- Skips the entire CMake build step
- Only installs the pure Python package
- Recommended for users who only use CUDA backends (FlashInfer, TileLang) and do not need the C++ mllm runtime
- Much faster (seconds)
- Command: `SKBUILD_WHEEL_CMAKE=false pip install -e .`

### Step 2: Ask editable or non-editable

Use `AskUserQuestion` to ask:

- **Editable (`pip install -e .`)**: For active development. Python imports point to the source tree. Changes to `.py` files take effect immediately without reinstalling.
- **Non-editable (wheel)**: For stable usage. Installs a wheel into site-packages.

### Step 3: Ask whether the user needs CUDA optional dependencies

Use `AskUserQuestion` to ask whether the user needs CUDA support (FlashInfer, TileLang, pyzmq, etc.).

This determines whether to append `[cuda]` to the install specifier (e.g. `pip install -e ".[cuda]"` instead of `pip install -e .`).

**This applies to ALL install modes.** For fast-install users this is especially important since the CUDA packages are the primary compute backend.

### Step 4: Execute the install

Based on user choices, compose and run the appropriate command. The install specifier is either `.` or `".[cuda]"` depending on Step 3.

| Mode | Editable | CUDA | Command |
|------|----------|------|---------|
| Full | Yes | No | `pip install -v -e .` |
| Full | Yes | Yes | `pip install -v -e ".[cuda]"` |
| Full | No | No | `pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall` |
| Full | No | Yes | `pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall && pip install "pymllm[cuda]"` |
| Fast | Yes | No | `SKBUILD_WHEEL_CMAKE=false pip install -e .` |
| Fast | Yes | Yes | `SKBUILD_WHEEL_CMAKE=false pip install -e ".[cuda]"` |
| Fast | No | No | `SKBUILD_WHEEL_CMAKE=false pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall` |
| Fast | No | Yes | `SKBUILD_WHEEL_CMAKE=false pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall && pip install "pymllm[cuda]"` |
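The decision table above can be sketched as a small helper script (a hypothetical illustration; the `MODE`/`EDITABLE`/`CUDA` variable names are not part of the project):

```shell
#!/usr/bin/env sh
# Compose the pymllm install command from the three user choices.
# Usage: MODE=fast EDITABLE=yes CUDA=yes sh compose_install.sh
MODE="${MODE:-full}"        # full | fast
EDITABLE="${EDITABLE:-yes}" # yes | no
CUDA="${CUDA:-no}"          # yes | no

# Install specifier: "." or ".[cuda]" (quoted for shells that glob brackets)
SPEC="."
[ "$CUDA" = "yes" ] && SPEC='".[cuda]"'

# Fast mode skips the CMake build via an env-var override
PREFIX=""
[ "$MODE" = "fast" ] && PREFIX="SKBUILD_WHEEL_CMAKE=false "

if [ "$EDITABLE" = "yes" ]; then
  CMD="${PREFIX}pip install -v -e ${SPEC}"
else
  CMD="${PREFIX}pip wheel -v -w dist . && pip install dist/*.whl --force-reinstall"
  # Non-editable wheels do not carry extras; add them afterwards
  [ "$CUDA" = "yes" ] && CMD="${CMD} && pip install \"pymllm[cuda]\""
fi
echo "$CMD"
```

With the defaults (full, editable, no CUDA) this prints `pip install -v -e .`, matching the first table row.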

### Step 5: Post-install for editable + full build

If the user chose **editable + full build**, the compiled `.so` files live in a build directory (e.g. `build/bin/`), not in the source tree. The Python code at `pymllm/__init__.py` looks for libraries at `pymllm/lib/MllmFFIExtension.so`. A symlink is needed to bridge this gap.

**Invoke the `/link-pymllm-lib` skill** to help the user set up the symlink.

## Important Notes

- The project root must contain `pyproject.toml` with `scikit-build-core` as the build backend.
- The `wheel.cmake = true` flag in `pyproject.toml` controls whether CMake runs. The env var `SKBUILD_WHEEL_CMAKE=false` overrides it at install time without modifying the file.
- For non-editable full builds, the `.so` files are bundled inside the wheel automatically — no symlink needed.
- For fast installs, `pymllm.is_mobile_available()` will return `False` since no C++ libraries are present. This is expected.
- The `[cuda]` optional dependencies are defined in `pyproject.toml` under `[project.optional-dependencies]`.
83 changes: 83 additions & 0 deletions .claude/skills/link-pymllm-lib/SKILL.md
@@ -0,0 +1,83 @@
---
name: link-pymllm-lib
description: Create or update the pymllm/lib symlink to point to a C++ build directory's bin/ folder. Required after editable installs with C++ builds so that Python can find the compiled .so libraries. Use when the user asks to link, fix, or set up pymllm native libraries.
---

# Link pymllm lib

## Goal

Create a symlink at `pymllm/lib` pointing to the correct build output directory so that an editable-installed pymllm can load the compiled C++ shared libraries (`MllmFFIExtension.so`, `libMllmRT.so`, etc.).

## Background

When pymllm is installed in editable mode (`pip install -e .`), Python imports from the source tree directly. The C++ libraries are compiled into `<build-dir>/bin/` by CMake, but pymllm looks for them at `pymllm/lib/`. A symlink bridges this gap:

```
pymllm/lib -> <project-root>/<build-dir>/bin
```

## Workflow

### Step 1: Detect available build directories

Scan the project root for directories matching the pattern `build*/bin/` that contain `MllmFFIExtension.so` (or `.dylib` on macOS). List all valid candidates.

Common build directories and their corresponding platforms:

| Build directory | Platform / Config | Typical build command |
|----------------|-------------------|----------------------|
| `build/bin` | X86 CPU only | `python task.py tasks/build_x86.yaml` |
| `build-x86-cuda/bin` | X86 + CUDA | `python task.py tasks/build_x86_cuda.yaml` |
| `build-qnn-aot/bin` | X86 + QNN AOT | `python task.py tasks/build_x86_qnn_aot.yaml` |
| `build-android-arm64-v8a-qnn/bin` | Android ARM + QNN | `python task.py tasks/build_android_qnn.yaml` |
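The scan in Step 1 can be done with a short shell loop (a sketch; `find_builds` is an illustrative helper, and on macOS the extension is `.dylib` rather than `.so`):

```shell
# Print build directories under a project root that contain the compiled
# FFI extension, i.e. valid symlink targets for pymllm/lib.
find_builds() {
  for d in "$1"/build*/bin; do
    if [ -f "$d/MllmFFIExtension.so" ] || [ -f "$d/MllmFFIExtension.dylib" ]; then
      echo "$d"
    fi
  done
}

find_builds .   # scan the current project root
```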

### Step 2: Ask the user which build to link

Use `AskUserQuestion` to let the user pick from the detected build directories. Show each option with its path and the platform it corresponds to.

If no build directories with `.so` files are found, inform the user they need to build first:

```bash
pip install -r requirements.txt
python task.py tasks/build_x86.yaml # or another build task
```

### Step 3: Check existing symlink

Before creating a new symlink, check if `pymllm/lib` already exists:

- If it's a symlink, show where it currently points and confirm replacement.
- If it's a real directory, warn the user and ask before removing it.
- If it doesn't exist, proceed directly.
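The three cases above can be distinguished with standard file tests (a sketch; `check_lib` is an illustrative helper, not part of the repo):

```shell
# Report the state of a prospective symlink path before replacing it.
check_lib() {
  if [ -L "$1" ]; then
    # Existing symlink: show its target so the user can confirm replacement
    echo "symlink -> $(readlink "$1")"
  elif [ -d "$1" ]; then
    echo "real directory: ask before removing"
  else
    echo "absent: safe to create"
  fi
}

check_lib pymllm/lib
```

Note that `-L` must be tested before `-d`, since a symlink to a directory satisfies both.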

### Step 4: Create the symlink

```bash
ln -sfn <project-root>/<build-dir>/bin <project-root>/pymllm/lib
```

Use `ln -sfn` to atomically replace any existing symlink.

### Step 5: Verify

After creating the symlink, verify by checking that the target `.so` file is accessible:

```bash
ls -la pymllm/lib/MllmFFIExtension.so
```

Then run a quick Python check:

```bash
python -c "import pymllm; print('mobile available:', pymllm.is_mobile_available())"
```

If `is_mobile_available()` returns `True`, the link is correct.

## Important Notes

- The symlink target must be an **absolute path** for reliability.
- On macOS, the library extension is `.dylib` instead of `.so`.
- Android build directories (e.g., `build-android-arm64-v8a-qnn/bin`) contain ARM binaries that cannot run on x86 hosts. Warn the user if they select one of these on a non-ARM machine.
- If the user has multiple build directories, they can re-run this skill anytime to switch which build pymllm uses.
44 changes: 44 additions & 0 deletions .claude/skills/update-codeowners/SKILL.md
@@ -0,0 +1,44 @@
---
name: update-codeowners
description: Updates CODEOWNERS entries safely with consistent path and owner formatting. Use when the user asks to add, remove, or modify CODEOWNERS rules, ownership mappings, reviewers, or module maintainers.
---

# Update CODEOWNERS

## Goal
Maintain `CODEOWNERS` accurately while preserving the repository's existing section/comment style.

## Workflow
1. Read the current `CODEOWNERS` file before editing.
2. Identify requested changes as one of:
- Add new path rule
- Modify owners for existing path rule
- Remove obsolete path rule
- Reorganize section comments (only if requested)
3. Update rules in place instead of creating duplicates for the same path.
4. Keep existing section headers and comment style unless the user asks to refactor structure.
5. Return a concise changelog describing which paths were added, changed, or removed.

## Rule Format
- Use one rule per line: `<path-pattern> <owner1> <owner2> ...`
- Owners must be GitHub handles prefixed with `@`.
- Keep path style consistent with the file (in this repo, path patterns typically start with `/`).
- Do not leave rules with empty owner lists.

## Editing Guidelines
- Prefer minimal edits near related sections.
- If a path already exists, update that line instead of adding a second conflicting line.
- If a new rule logically belongs to an existing section, place it in that section.
- Preserve human-readable grouping and blank lines.
- Keep comments intact unless they are clearly outdated and the user asked for cleanup.

## Validation Checklist
- [ ] Every non-comment, non-empty line has at least one owner.
- [ ] Every owner token starts with `@`.
- [ ] No accidental duplicate rule for the exact same path pattern.
- [ ] Existing comments/sections were preserved unless explicitly changed.
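The first three checklist items can be automated with a small awk-based validator (a sketch; `check_codeowners` is an illustrative helper, not part of the repo):

```shell
# check_codeowners FILE — print one line per problem found in a CODEOWNERS file:
# rules with no owners, owner tokens missing '@', and duplicate path patterns.
check_codeowners() {
  awk '
    /^[[:space:]]*(#|$)/ { next }                 # skip comments and blank lines
    {
      if (NF < 2) print "line " NR ": no owners for " $1
      for (i = 2; i <= NF; i++)
        if ($i !~ /^@/) print "line " NR ": owner " $i " missing @"
      if (seen[$1]++) print "line " NR ": duplicate rule for " $1
    }
  ' "$1"
}

check_codeowners CODEOWNERS
```

An empty output means the mechanical checks pass; preserved comments and sectioning still need a human look.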

## Example Requests
- "Add `/mllm/models/new_model/ @alice @bob` under models."
- "Change `/core/Storage` owner to `@team-core`."
- "Remove ownership rule for deprecated path `/legacy/`."
2 changes: 1 addition & 1 deletion .codespellrc
@@ -1,3 +1,3 @@
[codespell]
ignore-words-list = ans, als, hel, boostrap, childs, te, vas, hsa, ment, cann, thi, makro, wil, rouge, PRIS, bfloat, constexpr, cuda, dlpack, expt, forceinline, ifndef, linalg, LPBQ, mllm, pymllm, Quantizaton, Qwen, ROCM, silu, torchao
ignore-words-list = ans, als, hel, boostrap, childs, te, vas, hsa, ment, cann, thi, makro, wil, rouge, PRIS, bfloat, constexpr, cuda, dlpack, expt, forceinline, ifndef, linalg, LPBQ, mllm, pymllm, Quantizaton, Qwen, ROCM, silu, torchao, flashinfer
skip = *.json,*.jsonl,*.patch,*.txt
4 changes: 2 additions & 2 deletions .gitignore
@@ -4,7 +4,7 @@
.cache/
.tmp/
compile_commands.json
.claude/
settings.local.json

# MLLM Team Specific
tasks/mllmteam*
@@ -13,7 +13,7 @@ tasks/mllmteam*

# Building files and binary
build*/
install*/
/install*/
mllm-sdk-*/
mllm-install-*/

2 changes: 2 additions & 0 deletions README-ZH.md
@@ -17,6 +17,7 @@ mllm

## 最新动态

- [2026 年 3 月 18 日] 🔥🔥🔥 `pymllm` 已支持在 Jetson Orin 和 Jetson Thor 设备上使用 CUDA(实验特性,仍在持续开发中)。
- [2026 年 2 月 3 日] 🔥🔥🔥 MLLM Qnn AOT 已支持在 NPU 上全图执行![快速开始](https://ubiquitouslearning.github.io/mllm/qnn_backend/aot_execute.html), [技术报告](https://chenghuawang.github.io/News/2026-01-29-mllm-qnn-aot-support/)
- [2025 年 11 月 27 日] Android Demo 更新:通过一种全新的 In-App Go 服务架构,在 Android 上实现了 Qwen3 和 DeepSeek-OCR 的稳定流式推理。
- [2025 年 11 月 23 日] MLLM v2 发布!
@@ -78,6 +79,7 @@ mllm 框架可以与主流社区框架的模型检查点无缝集成。通过 ml
|-----------------------------------------------------------------------------|------|-----------------------|
| [Qwen3-0.6B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-0.6B-w4a32kai) | |
| [Qwen3-1.7B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-1.7B-w4a8-i8mm-kai) | [W4A16-SM8650](https://modelscope.cn/models/mllmTeam/Qwen3-1.7B-Qnn-AOT-SM8650/summary) |
| [Qwen3-4B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-4B-w4a8-i8mm-kai) | |
| [DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/DeepSeek-OCR-w4a8-i8mm-kai) | |
| [SmolLM3](https://huggingface.co/blog/smollm3)| [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/SmolLM3-3B-w4a8-i8mm-kai) | |
| [Qwen2-VL-2B-Instruct](https://qwenlm.github.io/zh/blog/qwen2-vl/)|[✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen2-VL-2B-Instruct-w4a32kai) ||
11 changes: 11 additions & 0 deletions README.md
@@ -17,6 +17,7 @@ mllm

## Latest News

- [2026 Mar 18] 🔥🔥🔥 `pymllm` now supports CUDA on Jetson Orin and Jetson Thor devices (experimental; still under active development).
- [2026 Feb 03] 🔥🔥🔥 MLLM Qnn AOT Support for Full Graph Execution on NPU! [Quick Start](https://ubiquitouslearning.github.io/mllm/qnn_backend/aot_execute.html), [Technical Report](https://chenghuawang.github.io/News/2026-01-29-mllm-qnn-aot-support-en/)
- [2025 Nov 27] Android Demo Update: Enabled stable Qwen3 and DeepSeek-OCR streaming on Android via a novel In-App Go Server Architecture.
- [2025 Nov 23] MLLM v2 released!
@@ -76,6 +77,7 @@ The mllm framework integrates seamlessly with popular community frameworks' chec
|-----------------------------------------------------------------------------|------|-----------------------|
| [Qwen3-0.6B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-0.6B-w4a32kai) | |
| [Qwen3-1.7B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-1.7B-w4a8-i8mm-kai) | [W4A16-SM8650](https://modelscope.cn/models/mllmTeam/Qwen3-1.7B-Qnn-AOT-SM8650/) |
| [Qwen3-4B](https://github.com/QwenLM/Qwen3) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen3-4B-w4a8-i8mm-kai) | |
| [DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR) | [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/DeepSeek-OCR-w4a8-i8mm-kai) | |
| [SmolLM3](https://huggingface.co/blog/smollm3)| [✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/SmolLM3-3B-w4a8-i8mm-kai) | |
| [Qwen2-VL-2B-Instruct](https://qwenlm.github.io/zh/blog/qwen2-vl/)|[✔️ w4a8](https://www.modelscope.cn/models/mllmTeam/Qwen2-VL-2B-Instruct-w4a32kai) ||
@@ -308,6 +310,15 @@ mllm provides a set of model converters to convert models from other popular mod
bash ./scripts/install_pymllm.sh
```

> **Tip for CUDA-only users:** If you only use CUDA backends (e.g., FlashInfer, TileLang) and do not need the C++ mllm runtime, you can skip the CMake build to speed up installation significantly:
>
> ```shell
> SKBUILD_WHEEL_CMAKE=false pip install -e .
> pip install "pymllm[cuda]"
> ```
>
> This installs only the pure Python package without compiling the C++ components.

**future:**

Once PyPI approves the creation of the mllm organization, we will publish it there. Afterwards, you can use the command below to install it in the future.
Binary file added assets/pymllm-arch.png
11 changes: 11 additions & 0 deletions docs/index.rst
@@ -246,6 +246,17 @@ mllm provides a set of model converters to convert models from other popular mod

bash ./scripts/install_pymllm.sh

.. tip::

**For CUDA-only users:** If you only use CUDA backends (e.g., FlashInfer, TileLang) and do not need the C++ mllm runtime, you can skip the CMake build to speed up installation significantly:

.. code-block:: shell

SKBUILD_WHEEL_CMAKE=false pip install -e .
pip install "pymllm[cuda]"

This installs only the pure Python package without compiling the C++ components.

**future:**

Once PyPI approves the creation of the mllm organization, we will publish it there. Afterwards, you can use the command below to install it in the future.
5 changes: 5 additions & 0 deletions docs/qnn_backend/aot_execute.rst
@@ -60,6 +60,10 @@ Taking ``qwen3_qnn_aot`` as an example, the detailed steps are as follows.
pip install -e .

# link lib to pymllm's dir, so that tvm ffi can find the lib
#
# NOTE:! build x86 qualcomm aot first !
source <absolute path to where you install qnn>/bin/envsetup.sh
python task.py tasks/build_x86_qnn_aot.yaml
ln -s <absolute path to where you build mllm>/bin/ mllm/pymllm/lib


@@ -82,6 +86,7 @@ Taking ``qwen3_qnn_aot`` as an example, the detailed steps are as follows.
.. code-block:: shell

# In the mllm-v2 project root directory
source <absolute path to where you install qnn>/bin/envsetup.sh
python task.py tasks/build_x86_qnn_aot.yaml

# Run the compiler program
1 change: 1 addition & 0 deletions examples/CMakeLists.txt
@@ -2,6 +2,7 @@ add_subdirectory(qwen2vl)
add_subdirectory(qwen2vl_tracer)
add_subdirectory(qwen2_5vl)
add_subdirectory(qwen2_5vl_tracer)
add_subdirectory(minicpm_o45)
add_subdirectory(llama)
add_subdirectory(minicpm_o)
add_subdirectory(minicpm4)
11 changes: 11 additions & 0 deletions examples/minicpm_o45/CMakeLists.txt
@@ -0,0 +1,11 @@
add_executable(mllm-minicpm-o45-runner main.cpp)
target_link_libraries(mllm-minicpm-o45-runner PRIVATE MllmRT MllmCPUBackend)
target_include_directories(mllm-minicpm-o45-runner PRIVATE ${MLLM_INCLUDE_DIR})

add_executable(mllm-minicpm-o45-runner-dbg main_dbg.cpp)
target_link_libraries(mllm-minicpm-o45-runner-dbg PRIVATE MllmRT MllmCPUBackend)
target_include_directories(mllm-minicpm-o45-runner-dbg PRIVATE ${MLLM_INCLUDE_DIR})

# add_executable(mllm-minicpm-o45-runner-python main_python.cpp)
# target_link_libraries(mllm-minicpm-o45-runner-python PRIVATE MllmRT MllmCPUBackend)
# target_include_directories(mllm-minicpm-o45-runner-python PRIVATE ${MLLM_INCLUDE_DIR})