A record of trying, and failing, to run Stable-Diffusion-WebUI (Automatic1111) on the RX 7900 XTX

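Note: let me say up front that this is a record of failure. Reading it will not get Stable-Diffusion WebUI (Automatic1111) running on a Radeon RX 7900 XTX, so please keep that in mind.

This time I am writing only as a reference for people who already know their way around, and as a memo for myself, so I will not walk through everything step by step as I usually do.

Think of it as a "step over my dead body and keep going" kind of record.

If I ever succeed, I may write a proper, careful guide.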

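Windows (DirectML version), part 1

For Radeon on Windows, a DirectML version is available.

lshqqytiger / stable-diffusion-webui-directml

The link above is its page.

I took the regular (CUDA) release below, replaced its contents with the DirectML version, and ran it.

AUTOMATIC1111 / stable-diffusion-webui v1.0.0-pre

For some reason, two folders under \repositories (k-diffusion and stable-diffusion-stability-ai) do not get copied when you do this.

The same problem is reported in the replies to gatttahaveit1's post in this discussion.

So you have to download them beforehand from the page below and add them yourself.

stable-diffusion-webui-directml/repositories/

If you set things up this way, the git pull in update.bat fails.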
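
Because the manual copy is easy to get wrong, a small check like the one below can confirm that the two folders are in place before launching. This is only a sketch: the function names are my own, the folder names come from the text above, and the second helper reflects a guess (not something confirmed here) that git pull fails because the copied tree has no .git folder, which would fit the "Commit hash: <none>" line in the log.

```python
from pathlib import Path

# Folder names taken from the text above; everything else here is illustrative.
REQUIRED_REPOS = ("k-diffusion", "stable-diffusion-stability-ai")


def missing_repositories(webui_root):
    """Return the required folders that are absent from <webui_root>/repositories."""
    repos_dir = Path(webui_root) / "repositories"
    return [name for name in REQUIRED_REPOS if not (repos_dir / name).is_dir()]


def is_git_checkout(webui_root):
    """True if the folder has a .git entry, i.e. git pull has something to work with."""
    return (Path(webui_root) / ".git").exists()
```

Calling missing_repositories(r"E:\sd.webui_dml\webui") (the path that appears in the log) should return an empty list once both folders have been added.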

Result

This time I am pasting the text shown in the terminal instead of screenshots.

Commit hash: <none>
Installing torch and torchvision
Collecting torch==1.13.1
Using cached torch-1.13.1-cp310-cp310-win_amd64.whl (162.6 MB)
Collecting torchvision==0.14.1
Using cached torchvision-0.14.1-cp310-cp310-win_amd64.whl (1.1 MB)
Collecting torch-directml
Using cached torch_directml-0.1.13.1.dev230301-cp310-cp310-win_amd64.whl (7.4 MB)
Collecting typing-extensions
Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting numpy
Using cached numpy-1.24.2-cp310-cp310-win_amd64.whl (14.8 MB)
Collecting requests
Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting pillow!=8.3.*,>=5.3.0
Using cached Pillow-9.5.0-cp310-cp310-win_amd64.whl (2.5 MB)
Collecting charset-normalizer<4,>=2
Using cached charset_normalizer-3.1.0-cp310-cp310-win_amd64.whl (97 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting urllib3<1.27,>=1.21.1
Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)
Collecting idna<4,>=2.5
Using cached idna-3.4-py3-none-any.whl (61 kB)
Installing collected packages: urllib3, typing-extensions, pillow, numpy, idna, charset-normalizer, certifi, torch, requests, torchvision, torch-directml
Successfully installed certifi-2022.12.7 charset-normalizer-3.1.0 idna-3.4 numpy-1.24.2 pillow-9.5.0 requests-2.28.2 torch-1.13.1 torch-directml-0.1.13.1.dev230301 torchvision-0.14.1 typing-extensions-4.5.0 urllib3-1.26.15
Installing gfpgan
Installing clip
Installing open_clip
Cloning Taming Transformers into repositories\taming-transformers...
Cloning CodeFormer into repositories\CodeFormer...
Cloning BLIP into repositories\BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments:
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
Traceback (most recent call last):
File "E:\sd.webui_dml\webui\launch.py", line 353, in <module>
start()
File "E:\sd.webui_dml\webui\launch.py", line 344, in start
import webui
File "E:\sd.webui_dml\webui\webui.py", line 15, in <module>
from modules import import_hook, errors, extra_networks, ui_extra_networks_checkpoints
File "E:\sd.webui_dml\webui\modules\ui_extra_networks_checkpoints.py", line 6, in <module>
from modules import shared, ui_extra_networks, sd_models
File "E:\sd.webui_dml\webui\modules\sd_models.py", line 15, in <module>
from modules import paths, shared, modelloader, devices, script_callbacks, sd_vae, sd_disable_initialization, errors, hashes, sd_models_config
File "E:\sd.webui_dml\webui\modules\sd_disable_initialization.py", line 1, in <module>
import ldm.modules.encoders.modules
File "E:\sd.webui_dml\webui\repositories\stable-diffusion-stability-ai\ldm\modules\encoders\modules.py", line 3, in <module>
import kornia
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\__init__.py", line 11, in <module>
from . import augmentation, color, contrib, core, enhance, feature, io, losses, metrics, morphology, tracking, utils, x
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\augmentation\__init__.py", line 1, in <module>
from kornia.augmentation._2d import (
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\augmentation\_2d\__init__.py", line 3, in <module>
from kornia.augmentation._2d.mix import *
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\augmentation\_2d\mix\__init__.py", line 1, in <module>
from kornia.augmentation._2d.mix.cutmix import RandomCutMix, RandomCutMixV2
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\augmentation\_2d\mix\cutmix.py", line 7, in <module>
from kornia.augmentation._2d.mix.base import MixAugmentationBase, MixAugmentationBaseV2
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\augmentation\_2d\mix\base.py", line 10, in <module>
from kornia.geometry.boxes import Boxes
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\geometry\boxes.py", line 582, in <module>
class Boxes3D:
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_script.py", line 1323, in script
_compile_and_register_class(obj, _rcb, qualified_name)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_recursive.py", line 47, in _compile_and_register_class
script_class = torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_recursive.py", line 863, in try_compile_fn
return torch.jit.script(fn, _rcb=rcb)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_script.py", line 1343, in script
fn = torch._C._jit_script_compile(
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_recursive.py", line 863, in try_compile_fn
return torch.jit.script(fn, _rcb=rcb)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_script.py", line 1343, in script
fn = torch._C._jit_script_compile(
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_recursive.py", line 863, in try_compile_fn
return torch.jit.script(fn, _rcb=rcb)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_script.py", line 1343, in script
fn = torch._C._jit_script_compile(
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_recursive.py", line 863, in try_compile_fn
return torch.jit.script(fn, _rcb=rcb)
File "E:\sd.webui_dml\system\python\lib\site-packages\torch\jit\_script.py", line 1343, in script
fn = torch._C._jit_script_compile(
RuntimeError:

aten::pad(Tensor self, int[] pad, str mode="constant", float? value=None) -> Tensor:
Expected a value of type 'List[int]' for argument 'pad' but instead found type 'Tensor (inferred)'.
Inferred the value for argument 'pad' to be of type 'Tensor' because it was not annotated with an explicit type.
:
File "E:\sd.webui_dml\webui\modules\devices.py", line 246
def pad(input, pad, mode='constant', value=None):
if input.dtype == torch.float16 and input.device.type == 'privateuseone':
return _pad(input.float(), pad, mode, value).type(input.dtype)
~~~~ <--- HERE
else:
return _pad(input, pad, mode, value)
'pad' is being compiled since it was called from 'convert_points_to_homogeneous'
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\geometry\conversions.py", line 199
raise ValueError(f"Input must be at least a 2D tensor. Got {points.shape}")

return torch.nn.functional.pad(points, [0, 1], "constant", 1.0)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
'convert_points_to_homogeneous' is being compiled since it was called from 'transform_points'
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\geometry\linalg.py", line 189
trans_01 = torch.repeat_interleave(trans_01, repeats=points_1.shape[0] // trans_01.shape[0], dim=0)
# to homogeneous
points_1_h = convert_points_to_homogeneous(points_1) # BxNxD+1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
# transform coordinates
points_0_h = torch.bmm(points_1_h, trans_01.permute(0, 2, 1))
'transform_points' is being compiled since it was called from '_transform_boxes'
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\geometry\boxes.py", line 53
)

transformed_boxes: torch.Tensor = transform_points(M, points)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
transformed_boxes = transformed_boxes.view_as(boxes)
return transformed_boxes
'_transform_boxes' is being compiled since it was called from 'Boxes3D.transform_boxes'
File "E:\sd.webui_dml\system\python\lib\site-packages\kornia\geometry\boxes.py", line 897
# Due to some torch.jit.script bug (at least <= 1.9), you need to pass all arguments to __init__ when
# constructing the class from inside of a method.
transformed_boxes = _transform_boxes(self._data, M)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
if inplace:
self._data = transformed_boxes

続行するには何かキーを押してください . . .

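Windows (DirectML version), part 2

Google search: automatic1111 スタンドアローンセットアップ法・改

The new one-touch version of automatic1111 was apparently made by an individual, so other sites are considerate and avoid linking to the article directly.

I will follow their example and not link directly either, so please find it yourself from the search link above.

The download is about halfway down the article.

The file name is "AUTOMATIC1111_webui_0301.ZIP".

Incidentally, these batch files are very well made.

They do not copy a snapshot of the files; they clone from GitHub, so I think the chance of something going wrong is very low.

Updates are also pulled from GitHub, so the same goes for updating.

Extract this file and edit the following three files that come out of it.

1_セットアップ.bat, 2_スタート_webui-user.bat, 3_通常更新 (git pull).bat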

What to change

Change

git.exe clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

to

git.exe clone https://github.com/lshqqytiger/stable-diffusion-webui-directml.git

and change

stable-diffusion-webui

to

stable-diffusion-webui-directml

The git URL contains the same string, so a naive search-and-replace can get it wrong. Be careful.

The git URL appears in only a few places (two in total across the three files), so I suggest noting where they are and fixing them afterwards.

The replace function of a text editor makes this easy. Not that anyone is likely to do it by hand anyway.
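
The pitfall described above (the folder name also occurs inside the git URL) can be avoided by replacing the URL first and then skipping anything that already ends in "-directml". The following is a minimal sketch of that idea; the function name retarget is hypothetical, and the strings are the ones quoted above.

```python
import re

OLD_URL = "https://github.com/AUTOMATIC1111/stable-diffusion-webui.git"
NEW_URL = "https://github.com/lshqqytiger/stable-diffusion-webui-directml.git"
OLD_DIR = "stable-diffusion-webui"
NEW_DIR = "stable-diffusion-webui-directml"

# Match the old folder name only when it is not already followed by "-directml".
_DIR_PATTERN = re.compile(re.escape(OLD_DIR) + r"(?!-directml)")


def retarget(text):
    """Rewrite one batch file's text so that it points at the DirectML fork."""
    # 1. Swap the clone URL as a whole, so the owner name changes too.
    text = text.replace(OLD_URL, NEW_URL)
    # 2. Rename the remaining folder references; the lookahead skips the URL
    #    rewritten in step 1, so it does not become "...-directml-directml".
    return _DIR_PATTERN.sub(NEW_DIR, text)
```

Running retarget twice gives the same result as running it once, so it is safe to re-apply. When reading and writing the actual .bat files, check their text encoding first; I have not verified which one they use.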

Somewhere on a wiki I read that with this version you need to apply a patch to use ControlNet, for some reason.

Result

1_セットアップ.bat

■ AUTOMATIC1111 webui 簡単セットアップ　2023/03/02版 ■

　！注意！

　大容量のファイルをダウンロードするのでとても時間がかかります
　安定した回線環境でのセットアップをオススメします
　回線の影響によっては、ダウンロードが正常に行われず中断してしまうことがあります

　Ｑ．セットアップの準備はよろしいですか？

　　　YES の場合は y と入力して Enter を押してください
　　　No の場合はそのまま Enter を押すか、ウィンドウを閉じてください

⇒ y

[各種ファイルのダウンロードを開始します]

hint: Using 'master' as the name for the initial branch. This default branch name
hint: is subject to change. To configure the initial branch name to use in all
hint: of your new repositories, which will suppress this warning, call:
hint:
hint: git config --global init.defaultBranch <name>
hint:
hint: Names commonly chosen instead of 'master' are 'main', 'trunk' and
hint: 'development'. The just-created branch can be renamed via this command:
hint:
hint: git branch -m <name>
Initialized empty Git repository in E:/sd.webui_dml2/.git/
Cloning into 'stable-diffusion-webui-directml'...
remote: Enumerating objects: 18428, done.
remote: Total 18428 (delta 0), reused 0 (delta 0), pack-reused 18428
Receiving objects: 100% (18428/18428), 28.57 MiB | 17.11 MiB/s, done.

Resolving deltas: 100% (12876/12876), done.
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: ae337fa39b6d4598b377ff312c53b14c15142331
Installing torch and torchvision
Collecting torch==1.13.1
Downloading torch-1.13.1-cp310-cp310-win_amd64.whl (162.6 MB)
---------------------------------------- 162.6/162.6 MB 19.2 MB/s eta 0:00:00
Collecting torchvision==0.14.1
Downloading torchvision-0.14.1-cp310-cp310-win_amd64.whl (1.1 MB)
---------------------------------------- 1.1/1.1 MB 34.9 MB/s eta 0:00:00
Collecting torch-directml
Downloading torch_directml-0.1.13.1.dev230301-cp310-cp310-win_amd64.whl (7.4 MB)
---------------------------------------- 7.4/7.4 MB 39.7 MB/s eta 0:00:00
Collecting typing-extensions
Downloading typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting requests
Downloading requests-2.28.2-py3-none-any.whl (62 kB)
---------------------------------------- 62.8/62.8 kB ? eta 0:00:00
Collecting numpy
Downloading numpy-1.24.2-cp310-cp310-win_amd64.whl (14.8 MB)
---------------------------------------- 14.8/14.8 MB 38.4 MB/s eta 0:00:00
Collecting pillow!=8.3.*,>=5.3.0
Downloading Pillow-9.5.0-cp310-cp310-win_amd64.whl (2.5 MB)
---------------------------------------- 2.5/2.5 MB 53.2 MB/s eta 0:00:00
Collecting idna<4,>=2.5
Downloading idna-3.4-py3-none-any.whl (61 kB)
---------------------------------------- 61.5/61.5 kB ? eta 0:00:00
Collecting urllib3<1.27,>=1.21.1
Downloading urllib3-1.26.15-py2.py3-none-any.whl (140 kB)
---------------------------------------- 140.9/140.9 kB ? eta 0:00:00
Collecting charset-normalizer<4,>=2
Downloading charset_normalizer-3.1.0-cp310-cp310-win_amd64.whl (97 kB)
---------------------------------------- 97.1/97.1 kB ? eta 0:00:00
Collecting certifi>=2017.4.17
Downloading certifi-2022.12.7-py3-none-any.whl (155 kB)
---------------------------------------- 155.3/155.3 kB ? eta 0:00:00
Installing collected packages: urllib3, typing-extensions, pillow, numpy, idna, charset-normalizer, certifi, torch, requests, torchvision, torch-directml
Successfully installed certifi-2022.12.7 charset-normalizer-3.1.0 idna-3.4 numpy-1.24.2 pillow-9.5.0 requests-2.28.2 torch-1.13.1 torch-directml-0.1.13.1.dev230301 torchvision-0.14.1 typing-extensions-4.5.0 urllib3-1.26.15

[notice] A new release of pip available: 22.2.1 -> 23.0.1
[notice] To update, run: E:\sd.webui_dml2\stable-diffusion-webui-directml\venv\Scripts\python.exe -m pip install --upgrade pip
Installing gfpgan
Installing clip
Installing open_clip
Cloning Taming Transformers into E:\sd.webui_dml2\stable-diffusion-webui-directml\repositories\taming-transformers...
Cloning CodeFormer into E:\sd.webui_dml2\stable-diffusion-webui-directml\repositories\CodeFormer...
Cloning BLIP into E:\sd.webui_dml2\stable-diffusion-webui-directml\repositories\BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments: --autolaunch
Traceback (most recent call last):
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\launch.py", line 353, in <module>
start()
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\launch.py", line 344, in start
import webui
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\webui.py", line 16, in <module>
from modules import paths, timer, import_hook, errors
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\modules\paths.py", line 19, in <module>
assert sd_path is not None, "Couldn't find Stable Diffusion in any of: " + str(possible_sd_paths)
AssertionError: Couldn't find Stable Diffusion in any of: ['E:\\sd.webui_dml2\\stable-diffusion-webui-directml\\repositories/stable-diffusion-stability-ai', '.', 'E:\\sd.webui_dml2']
続行するには何かキーを押してください . . .

2_スタート_webui-user.bat

Note: run after copying the model data and the two folders under \repositories.

Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Commit hash: ae337fa39b6d4598b377ff312c53b14c15142331
Installing requirements for Web UI
Launching Web UI with arguments: --autolaunch
Warning: caught exception 'Torch not compiled with CUDA enabled', memory monitor disabled
No module 'xformers'. Proceeding without it.
Loading weights [89d59c3dde] from E:\sd.webui_dml2\stable-diffusion-webui-directml\models\Stable-diffusion\model-001.ckpt
Creating model from config: E:\sd.webui_dml2\stable-diffusion-webui-directml\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (InvokeAI).
Textual inversion embeddings loaded(0):
Model loaded in 3.3s (load weights from disk: 1.2s, create model: 0.2s, apply weights to model: 0.4s, apply half(): 0.4s, move model to device: 1.1s).
Traceback (most recent call last):
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\launch.py", line 353, in <module>
start()
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\launch.py", line 348, in start
webui.webui()
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\webui.py", line 243, in webui
shared.demo = modules.ui.create_ui()
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\modules\ui.py", line 468, in create_ui
extra_networks_ui = ui_extra_networks.create_ui(extra_networks, extra_networks_button, 'txt2img')
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\modules\ui_extra_networks.py", line 175, in create_ui
page_elem = gr.HTML(page.create_html(ui.tabname))
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\modules\ui_extra_networks.py", line 91, in create_html
items_html += self.create_html_for_item(item, tabname)
File "E:\sd.webui_dml2\stable-diffusion-webui-directml\modules\ui_extra_networks.py", line 132, in create_html_for_item
return self.card_page.format(**args)
KeyError: 'style'
続行するには何かキーを押してください . . .

This closely resembles the error message that the CUDA version, automatic1111 v1.0.0-pre, produced before it was updated, so the DirectML version may need the same update.
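
The traceback above ends in self.card_page.format(**args). In Python, str.format raises KeyError whenever the template names a field that the supplied arguments lack, so one possible reading (an assumption, not a confirmed diagnosis) is that the HTML template and the Python code came from different versions. The sketch below reproduces only the mechanism; the template and the values are hypothetical stand-ins, not the WebUI's real ones.

```python
# Hypothetical stand-ins for the real template and arguments.
card_template = '<div class="card" style="{style}">{name}</div>'
args = {"name": "model-001"}  # note: no "style" entry


def render(template, values):
    """Fill the template; str.format raises KeyError for any missing field."""
    return template.format(**values)


try:
    render(card_template, args)
except KeyError as err:
    print("KeyError:", err)

# Supplying the missing field makes the same call succeed.
print(render(card_template, {**args, "style": ""}))
```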

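So both attempts failed, which is a rather sad result.

Both stop after spitting out a large amount of error output.

I believe it worked properly before the update to the upstream version went in (unconfirmed).

If you know why these failed, I would be grateful for any information in the comments.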

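Linux version

On Linux, the approach is to have ROCm recognize the RX 7900 XTX and then run Stable-Diffusion-WebUI (Automatic1111).

For the OS, the mainstream choice is Ubuntu 22.04 LTS.

I prefer something small and light, so I used Lubuntu 22.04 LTS.

Be warned that installation on Linux is an order of magnitude harder than on Windows.

Even if this makes you want to try, the kernel in the 22.04 LTS family does not support the RX 7900 series, so you are better off not attempting it with these cards.

I think you will hit a wall right away.

First, download the ISO and install as usual.

Notes from installation to first boot

The RX 7900 XTX is not recognized, so install and run in recovery mode (something like Windows safe mode).

A normal boot hangs partway through.

Also, with some WiFi adapters present the installer will not boot even in recovery mode, so it is better to turn off recent (WiFi 6E) wireless LAN in the BIOS.

Stable Linux releases get along very poorly with the latest DIY PC parts.

Naturally, that is because it takes time before open-source drivers become available.

If you are going to install Linux, well-established hardware is the best choice.

Depending on the combination, this becomes extremely difficult work, so beginners will be happier giving up gracefully.

Even once I got as far as installing ROCm, after the install and a reboot the mouse cursor disappeared and the machine became unusable.

If you cannot solve problems like this on your own, you should not try.

On Windows, a situation like this is 120% impossible.

Even if it did happen, it would become a big issue and a fix would appear right away.

The idea that anyone can use Linux is nonsense, so please do not take it seriously.

I have written a lot of criticism here, but if you have a clearly defined use such as development, I think Linux is a wonderful OS.

It just is not for PC builders or beginners.

Forcing yourself to use it means a world of pain, so I advise against it.

For what it is worth, here is the rocminfo log.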

ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE

==========
HSA Agents
==========
*******
Agent 1
*******
Name: 13th Gen Intel(R) Core(TM) i7-13700K
Uuid: CPU-XX
Marketing Name: 13th Gen Intel(R) Core(TM) i7-13700K
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 49152(0xc000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5300
BDFID: 0
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 16130584(0xf62218) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 16130584(0xf62218) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16130584(0xf62218) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1100
Uuid: GPU-XX
Marketing Name: Radeon RX 7900 XTX
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 98304(0x18000) KB
Chip ID: 29772(0x744c)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3220
BDFID: 768
Internal Node ID: 1
Compute Unit: 96
SIMDs per CU: 2
Shader Engines: 12
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 25149440(0x17fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***

It recognized the card properly, so I was convinced it would work. (Famous last words.)
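
For reference, the part of the rocminfo log that matters here (which agents exist and what they are called) can be pulled out with a few lines of Python. This is a rough sketch that assumes the output keeps the "Agent N" headers and "Key: value" lines shown above; the function names are my own.

```python
import re


def list_agents(rocminfo_text):
    """Return one dict per 'Agent N' section with its name, marketing name and device type."""
    agents = []
    current = None
    for raw in rocminfo_text.splitlines():
        line = raw.strip()
        if re.fullmatch(r"Agent \d+", line):
            current = {}
            agents.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Keep only the first occurrence, so the later "ISA 1 / Name:" line
        # does not overwrite the agent's own name.
        if key in ("Name", "Marketing Name", "Device Type") and key not in current:
            current[key] = value
    return agents


def has_gpu_agent(rocminfo_text):
    """True if at least one agent reports 'Device Type: GPU'."""
    return any(a.get("Device Type") == "GPU" for a in list_agents(rocminfo_text))
```

The text can come from subprocess.run(["rocminfo"], capture_output=True, text=True).stdout. As the rest of this post shows, a GPU agent being listed here did not mean PyTorch could use it.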

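I also installed Google Chrome and fcitx-mozc (a Japanese input method).

A very small number of sites only work properly in Google Chrome, so that was just in case.

When the OS was finally running normally and I opened a browser to look things up, I noticed that no Japanese input method was installed.

An OS that cannot take Japanese input by default probably tells you what level it is at.

I used Linux for the first time in a while out of necessity, and I am surprised that even an Ubuntu flavor, the most mainstream distribution family, is still at this level.

By the way, my monitor is 4K, and until I installed Radeon Software for Linux the text was the size of a grain of rice, which made the work extremely hard. (wry smile)

Linux plus a 4K monitor is bad news.

Now, in the 22.04 LTS family Python happens to be 3.10.6, the version said to be best suited to the WebUI.

I installed the package that makes python a symbolic link to python3, created a virtual environment with venv, and ran the webui as is.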
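
The venv step can also be done from Python's standard library instead of the command line. The sketch below checks that the interpreter is a 3.10.x release and creates an environment; the helper names are hypothetical, and the target folder is up to you.

```python
import sys
import venv
from pathlib import Path


def python_matches(major, minor, version_info=sys.version_info):
    """True if the given interpreter version is e.g. 3.10.x."""
    return (version_info[0], version_info[1]) == (major, minor)


def make_venv(target_dir, with_pip=True):
    """Create a virtual environment, much like running python3 -m venv <target_dir>."""
    venv.EnvBuilder(with_pip=with_pip).create(str(target_dir))
    # pyvenv.cfg is written at the root of every environment venv creates.
    return Path(target_dir) / "pyvenv.cfg"
```

For example, call python_matches(3, 10) first and then make_venv("venv") from inside the cloned stable-diffusion-webui folder.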

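AUTOMATIC1111 / stable-diffusion-webui

This is the upstream version.

It documents how to run on AMD (ROCm).

It is all in English, and I will not explain it here either.

Unlike on Windows, the program is not distributed as a compressed archive; you clone it from git on the command line.

git is preinstalled, so no worries there.

To repeat, the RX 7900 series (RDNA3) is not supported yet.

Following the procedure above will not give you a working setup.

Result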

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
Commit hash: 22bcc7be428c94e9408f589966c2040187245d81
Installing torch and torchvision
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/rocm5.1.1
Collecting torch
Using cached torch-2.0.0-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
Collecting torchvision
Using cached torchvision-0.15.1-cp310-cp310-manylinux1_x86_64.whl (6.0 MB)
Collecting nvidia-cusolver-cu11==11.4.0.1
Using cached nvidia_cusolver_cu11-11.4.0.1-2-py3-none-manylinux1_x86_64.whl (102.6 MB)
Collecting nvidia-cufft-cu11==10.9.0.58
Using cached nvidia_cufft_cu11-10.9.0.58-py3-none-manylinux1_x86_64.whl (168.4 MB)
Collecting nvidia-cusparse-cu11==11.7.4.91
Using cached nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
Collecting nvidia-cublas-cu11==11.10.3.66
Using cached nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)
Collecting typing-extensions
Using cached typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting jinja2
Using cached https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96
Using cached nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-nccl-cu11==2.14.3
Using cached nvidia_nccl_cu11-2.14.3-py3-none-manylinux1_x86_64.whl (177.1 MB)
Collecting triton==2.0.0
Using cached https://download.pytorch.org/whl/triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
Collecting nvidia-nvtx-cu11==11.7.91
Using cached nvidia_nvtx_cu11-11.7.91-py3-none-manylinux1_x86_64.whl (98 kB)
Collecting networkx
Using cached networkx-3.1-py3-none-any.whl (2.1 MB)
Collecting sympy
Using cached https://download.pytorch.org/whl/sympy-1.11.1-py3-none-any.whl (6.5 MB)
Collecting nvidia-curand-cu11==10.2.10.91
Using cached nvidia_curand_cu11-10.2.10.91-py3-none-manylinux1_x86_64.whl (54.6 MB)
Collecting filelock
Using cached filelock-3.11.0-py3-none-any.whl (10.0 kB)
Collecting nvidia-cuda-nvrtc-cu11==11.7.99
Using cached nvidia_cuda_nvrtc_cu11-11.7.99-2-py3-none-manylinux1_x86_64.whl (21.0 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99
Using cached nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cuda-cupti-cu11==11.7.101
Using cached nvidia_cuda_cupti_cu11-11.7.101-py3-none-manylinux1_x86_64.whl (11.8 MB)
Requirement already satisfied: setuptools in ./venv/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (59.6.0)
Requirement already satisfied: wheel in ./venv/lib/python3.10/site-packages (from nvidia-cublas-cu11==11.10.3.66->torch) (0.40.0)
Collecting cmake
Using cached cmake-3.26.1-py2.py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (24.0 MB)
Collecting lit
Using cached lit-16.0.0-py3-none-any.whl
Collecting pillow!=8.3.*,>=5.3.0
Using cached Pillow-9.5.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.4 MB)
Collecting numpy
Using cached numpy-1.24.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.3 MB)
Collecting requests
Using cached requests-2.28.2-py3-none-any.whl (62 kB)
Collecting MarkupSafe>=2.0
Using cached https://download.pytorch.org/whl/MarkupSafe-2.1.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (25 kB)
Collecting urllib3<1.27,>=1.21.1
Using cached urllib3-1.26.15-py2.py3-none-any.whl (140 kB)
Collecting idna<4,>=2.5
Using cached https://download.pytorch.org/whl/idna-3.4-py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
Using cached https://download.pytorch.org/whl/certifi-2022.12.7-py3-none-any.whl (155 kB)
Collecting charset-normalizer<4,>=2
Using cached charset_normalizer-3.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (199 kB)
Collecting mpmath>=0.19
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, lit, cmake, urllib3, typing-extensions, sympy, pillow, nvidia-nvtx-cu11, nvidia-nccl-cu11, nvidia-cusparse-cu11, nvidia-curand-cu11, nvidia-cufft-cu11, nvidia-cuda-runtime-cu11, nvidia-cuda-nvrtc-cu11, nvidia-cuda-cupti-cu11, nvidia-cublas-cu11, numpy, networkx, MarkupSafe, idna, filelock, charset-normalizer, certifi, requests, nvidia-cusolver-cu11, nvidia-cudnn-cu11, jinja2, triton, torch, torchvision
Successfully installed MarkupSafe-2.1.2 certifi-2022.12.7 charset-normalizer-3.1.0 cmake-3.26.1 filelock-3.11.0 idna-3.4 jinja2-3.1.2 lit-16.0.0 mpmath-1.3.0 networkx-3.1 numpy-1.24.2 nvidia-cublas-cu11-11.10.3.66 nvidia-cuda-cupti-cu11-11.7.101 nvidia-cuda-nvrtc-cu11-11.7.99 nvidia-cuda-runtime-cu11-11.7.99 nvidia-cudnn-cu11-8.5.0.96 nvidia-cufft-cu11-10.9.0.58 nvidia-curand-cu11-10.2.10.91 nvidia-cusolver-cu11-11.4.0.1 nvidia-cusparse-cu11-11.7.4.91 nvidia-nccl-cu11-2.14.3 nvidia-nvtx-cu11-11.7.91 pillow-9.5.0 requests-2.28.2 sympy-1.11.1 torch-2.0.0 torchvision-0.15.1 triton-2.0.0 typing-extensions-4.5.0 urllib3-1.26.15
Traceback (most recent call last):
File "/home/linadm/stable-diffusion-webui/launch.py", line 355, in <module>
prepare_environment()
File "/home/linadm/stable-diffusion-webui/launch.py", line 260, in prepare_environment
run_python("import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'")
File "/home/linadm/stable-diffusion-webui/launch.py", line 121, in run_python
return run(f'"{python}" -c "{code}"', desc, errdesc)
File "/home/linadm/stable-diffusion-webui/launch.py", line 97, in run
raise RuntimeError(message)
RuntimeError: Error running command.
Command: "/home/linadm/stable-diffusion-webui/venv/bin/python" -c "import torch; assert torch.cuda.is_available(), 'Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check'"
Error code: 1
stdout: <empty>
stderr: Traceback (most recent call last):
File "<string>", line 1, in <module>
AssertionError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

The message says "Torch is not able to use GPU", so PyTorch does not seem to see the GPU even though ROCm is installed.

I checked just to be sure.

Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
False

As shown above, "torch.cuda.is_available()" returned "False".
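
A slightly more detailed check can also show which kind of PyTorch build is installed. The install log above lists many nvidia-*-cu11 packages, so it may be that pip picked the regular CUDA wheel instead of a ROCm one; that is an assumption read off the log, not something verified here. The sketch below uses torch.version.cuda and torch.version.hip, which are None when the build is not a CUDA or ROCm build respectively; the function name is my own.

```python
def describe_torch_build():
    """Collect the facts that tell a CUDA wheel from a ROCm wheel."""
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "torch": torch.__version__,
        "cuda": torch.version.cuda,  # None unless this is a CUDA build
        "hip": torch.version.hip,    # None unless this is a ROCm build
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If "hip" comes back as None, the GPU test would fail regardless of whether ROCm itself supports the card.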

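With nowhere left to turn I was grumbling on Twitter, and a kind person pointed this out.

RDNA3 should not be supported by ROCm yet in the first place.
cyatarow (@cyatarow9), April 9, 2023

I thought "really?" and looked into it, and the answer was in a comment by Mushoz in this discussion.

There is a bigger discussion about 7900 XTX support here, in case you were not aware: #1880
A Dockerfile that can run PyTorch on the 7900 XTX has also been posted there (though performance is not what you would expect and there seem to be bugs). In addition, @saadrahim confirmed in that topic that official support will have to wait for 5.5.0.
Finally, @saadrahim also said he is asking around whether he can make a forward-looking statement about a rough timeline for when this support can be expected to land here: #1836 (reply in thread)
Unfortunately he has not been able to give a rough timeline yet, but hopefully he can share more details soon. Anyway, I wanted to share these two topics.

According to him, support for the RX 7900 series will come with ROCm 5.5.0 or later.

With a certain sense of wasted effort, I will wrap up the story of this futile struggle here.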

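通りすがり (passerby) says:
April 11, 2023, 12:56 PM
For directml, I think you need to download the latest lshqqytiger / stable-diffusion-webui-directml and, since module versions have been bumped, take the trouble to rebuild the venv as well.
With that and the latest driver, it is running fine on my 7900xtx. The inpaint feature did not work properly until I added various things to the cmd args.
Also, about another article: going from zen3 to zen4, the R23 single-core score is up 25%, the clock is up about 10%, and IPC is up by as much as 15%, so please check again.
- Mr.K says:
  April 12, 2023, 7:23 AM
  Thank you for the information.
  I will try it.
kitty says:
April 11, 2023, 2:39 PM
That's exactly the kind of thing I mean, AMD.
HiDPI support on Linux is provided by the desktop environment, so it is painful until you get that far.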