Running Gemma 3 on integrated Intel UHD Graphics with IPEX-LLM

Gemma 3 has been released.

Google AI Gemma open models | Google for Developers | Google AI for Developers

Gemma 3 is also billed as lightweight, and is said to deliver high performance on a single NVIDIA H100 GPU.

I don't play games on my PC, so I don't own a discrete NVIDIA or AMD GPU. However, with IPEX-LLM, which Intel develops, Gemma 3 can apparently run on the following:

Intel Core Ultra processors
Intel Core processors, 11th to 14th generation
Intel Arc A-series and B-series

intel/ipex-llm: Accelerate local LLM inference and finetuning on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max)

Environment

Hardware

CPU: Intel Core i5-14600T
Integrated graphics: Intel UHD Graphics 770
Memory: DDR5 64GB

Apparently up to half of main memory can be used as graphics memory, so on this machine up to 32GB should be available to the GPU.
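The arithmetic behind that figure can be written out as a small sketch. The function name is made up for illustration, and the one-half ratio is simply the figure quoted above, not something queried from the driver.

```python
def max_shared_graphics_memory_gb(main_memory_gb: float, ratio: float = 0.5) -> float:
    """Upper bound on graphics memory, assuming the iGPU may borrow `ratio` of main memory."""
    return main_memory_gb * ratio


# 64GB of DDR5, as in this article's machine
print(max_shared_graphics_memory_gb(64))  # 32.0
```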

Software

IPEX-LLM: 2.2.0 nightly build (llama-cpp-ipex-llm-2.2.0b20250313-win.zip)
GPU driver: 32.0.101.6647

According to the page below, driver version 31.0.101.5122 or later is required.
If you are using an OEM-supplied driver, you may need to check this.
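If you want to check a driver version against that minimum without eyeballing it, something like the following should do. It is a minimal sketch that assumes the dotted fields compare numerically from left to right; the helper name is hypothetical.

```python
def parse_driver_version(version: str) -> tuple[int, ...]:
    """Split a dotted driver version such as '32.0.101.6647' into integers."""
    return tuple(int(part) for part in version.split("."))


MINIMUM = "31.0.101.5122"    # minimum stated in the install guide
INSTALLED = "32.0.101.6647"  # driver used in this article

# Tuples compare field by field, so 32.x sorts after 31.x
meets_requirement = parse_driver_version(INSTALLED) >= parse_driver_version(MINIMUM)
print(meets_requirement)  # True
```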

ipex-llm/docs/mddocs/Quickstart/install_windows_gpu.md at main · intel/ipex-llm

Setup

Releases · intel/ipex-llm

Download the latest version from this page.

Latest at the time of writing: llama-cpp-ipex-llm-2.2.0b20250313-win.zip

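This package is called the llama.cpp Portable Zip.
Some of the documentation also describes steps such as preparing a Python environment, but the zip can be used as-is, with nothing to install or build.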

Side note: ollama

Upstream ollama added Gemma 3 support in 0.6.0.

At the time of writing, the IPEX-LLM build of ollama is still on the 0.5 series, so it cannot run Gemma 3.

> ollama --version
ggml_sycl_init: found 1 SYCL devices:
ollama version is 0.5.4-ipexllm-20250313
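The same check can be done programmatically. This is only a sketch: it assumes the output contains a line of the form `version is X.Y.Z` like the one above, and the function name is hypothetical.

```python
import re

GEMMA3_MIN_OLLAMA = (0, 6, 0)  # upstream ollama release that added Gemma 3


def parse_ollama_version(output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from `ollama --version` output."""
    match = re.search(r"version is (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no version string found")
    return tuple(int(group) for group in match.groups())


output = (
    "ggml_sycl_init: found 1 SYCL devices:\n"
    "ollama version is 0.5.4-ipexllm-20250313\n"
)
version = parse_ollama_version(output)
print(version, version >= GEMMA3_MIN_OLLAMA)  # (0, 5, 4) False
```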

Downloading the model

Gemma 3 – a ggml-org Collection

Download a model from here.

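After some trial and error on my machine with 64GB of physical memory, I found that models in the 7GB range, such as

gemma-3-4b-it-f16.gguf
gemma-3-12b-it-Q4_K_M.gguf

use slightly more than 32GB of memory, so anyone with 32GB of memory will need a somewhat smaller model.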

gemma-3-12b-it-Q8_0.gguf is 12GB and uses 40GB of memory right after startup, but it crashes often once you start actually using it, so I use gemma-3-4b-it-f16.gguf here.

Launch

Launching in this state prints the following.

llm_load_print_meta: model type       = 4B
llm_load_print_meta: model ftype      = F16
llm_load_print_meta: model params     = 3.88 B
llm_load_print_meta: model size       = 7.23 GiB (16.00 BPW)
(snip)
Found 1 SYCL devices:
|  |                   |                                       |       |Max    |        |Max  |Global |                     |
|  |                   |                                       |       |compute|Max work|sub  |mem    |                     |
|ID|        Device Type|                                   Name|Version|units  |group   |group|size   |       Driver version|
|--|-------------------|---------------------------------------|-------|-------|--------|-----|-------|---------------------|
| 0| [level_zero:gpu:0]|                 Intel UHD Graphics 770|   12.2|     32|     512|   32| 31709M|            1.6.32413|
(snip)
main: gemma-3-4b-it-f16.gguf

 Running in chat mode, available commands:
   /image <path>    load an image
   /clear           clear the chat history
   /quit or /exit   exit the program

>
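The `model size` line in the log above is consistent with a simple estimate of parameters times bits per weight. The sketch below is just that arithmetic, using the rounded figures from the log; it ignores file metadata and any tensors stored at a different precision, and the function name is made up.

```python
def model_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Estimate weight size in GiB from parameter count and bits per weight."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# 3.88 B params at 16.00 BPW, as reported by llm_load_print_meta
print(f"{model_size_gib(3.88e9, 16.0):.2f} GiB")  # 7.23 GiB
```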

Testing

Japanese support

Gemma 3 itself supports Japanese, but the IPEX-LLM build of llama.cpp does not seem to handle Japanese input, and the text comes out garbled.

> Can you translate "こんにちは" in English and one line.
This is a tricky one! “ããã«ã¡ã¯” is Japanese text written using the JIS (Japanese Industrial Standard) character set. It translates to:

**“おめでとうございます” (Omedetou gozaimasu)**

Which means: **“Congratulations!”**

Here’s a one-line translation:

It’s a formal way to express congratulations on a happy occasion.
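The garbled string in the transcript above looks like UTF-8 bytes being read one byte at a time as Latin-1. That is only my guess at the cause, not something I confirmed in the IPEX-LLM code, but a small sketch under that assumption reproduces the same characters (the function name is hypothetical):

```python
def simulate_mojibake(text: str) -> str:
    """Encode as UTF-8, misread as Latin-1, and drop invisible C1 control characters."""
    misread = text.encode("utf-8").decode("latin-1")
    # Code points 0x80-0x9F are control characters that do not show up on screen
    return "".join(ch for ch in misread if not 0x80 <= ord(ch) <= 0x9F)


print(simulate_mojibake("こんにちは"))  # ããã«ã¡ã¯
```

Each kana is three bytes in UTF-8 starting with 0xE3, which is why every character turns into an "ã" followed by at most one visible symbol.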

Output of Japanese and emoji worked.

> Can you translate "Cat" in Japanese with Emoji and one line.
 (Neko) - 猫 (Neko) means "cat" in Japanese.

Here’s a one-line translation:

猫は可愛いです。(Neko wa kawaii desu.) - "Cats are cute."

Note: the emoji was output, but this blog does not seem to support emoji.

Images

This time I had it describe an image from Irasutoya.

Illustration of a running owl chick | Irasutoya, a collection of cute free illustrations

> /image bird_fukurou_run.png
Encoding image bird_fukurou_run.png
Image encoded in 15234 ms
Image decoded in 4765 ms

> Please explain this image in Japanese in 5 lines or less.
Okay, here's a description of the image in Japanese, in 5 lines or less:

この画像は、可愛らしいフクロウのイラストです。
(Kono gazou wa, kawairashii fukurou no ilustto desu.)
This image is an illustration of a cute owl.

白と茶色の色使いが特徴で、少し不機嫌そうな表情をしています。
(Shiro to chairo no iro tsukai ga tokuchou de, sukoshi fukigen-sai na hyoujou o shiteimasu.)
It's characterized by the use of white and brown colors and has a slightly grumpy expression.

丸い目とピンクの頬が、さらに可愛らしさを強調しています。
(Marui me to pinku no hoho ga, sarani kawarashisa o kyouchou shiteimasu.)
Round eyes and pink cheeks further emphasize its cuteness.

全体的に、温かみのある優しい雰囲気です。
(Zentai ni kite, atatakama no aru yasashii fun'iki desu.)
Overall, it has a warm and gentle atmosphere.


Do you want me to describe it in a different way, or perhaps focus on a specific detail?
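To get a feel for the image-processing overhead, the two timing lines printed right after `/image` in the session above can be added up. This is a minimal sketch that assumes the log lines keep exactly the wording shown there; the function name is made up.

```python
import re

LOG = """\
Image encoded in 15234 ms
Image decoded in 4765 ms
"""


def total_image_time_ms(log: str) -> int:
    """Sum the encode and decode times reported in the log, in milliseconds."""
    times = re.findall(r"Image (?:encoded|decoded) in (\d+) ms", log)
    return sum(int(ms) for ms in times)


print(total_image_time_ms(LOG) / 1000)  # 19.999
```

So on this iGPU, roughly 20 seconds pass before the model can start answering a question about the image.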

Load while running

The 3D engine of the integrated graphics is fully utilized.

The CPU is used as well, but not at full load.

Summary

Even without a discrete GPU, I was able to confirm that Gemma 3 runs on Intel UHD Graphics 770 using IPEX-LLM, and that it can process images as well as text.

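4B is the practical limit on this machine, and 27B is a long way off.
That said, the NVIDIA H100 that 27B is said to run on has 80GB of memory. Since only about half of main memory can serve as graphics memory, matching that would take at least 160GB, so a machine with integrated graphics and 192GB or more of memory might at least be able to launch it with IPEX-LLM's llama.cpp, speed aside.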

Either way, I wanted to try out Gemma 3 a little, even if only with a small model, and I'm glad I managed to do that.
