## MMIO (Memory Mapped I/O)

A while ago, while setting up the second GPU machine for 星河, the driver refused to load no matter what I tried. Running `dmesg` showed:

```
[ 886.661014] NVRM: The system BIOS may have misconfigured your GPU.
[ 886.661018] nvidia: probe of 0000:45:00.0 failed with error -1
[ 886.661058] NVRM: The NVIDIA probe routine failed for 2 device(s).
[ 886.661060] NVRM: None of the NVIDIA graphics adapters were initialized!
[ 886.661298] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241
[ 886.779812] nvidia-nvlink: Nvlink Core is being initialized, major device number 241
[ 886.780112] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:44:00.0)
[ 886.780118] NVRM: The system BIOS may have misconfigured your GPU.
[ 886.780132] nvidia: probe of 0000:44:00.0 failed with error -1
[ 886.780152] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:45:00.0)
[ 886.780156] NVRM: The system BIOS may have misconfigured your GPU.
[ 886.780166] nvidia: probe of 0000:45:00.0 failed with error -1
[ 886.780193] NVRM: The NVIDIA probe routine failed for 2 device(s).
[ 886.780195] NVRM: None of the NVIDIA graphics adapters were initialized!
[ 886.780285] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241
[ 886.902779] nvidia-nvlink: Nvlink Core is being initialized, major device number 241
[ 886.903055] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:44:00.0)
[ 886.903059] NVRM: The system BIOS may have misconfigured your GPU.
[ 886.903072] nvidia: probe of 0000:44:00.0 failed with error -1
[ 886.903088] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:45:00.0)
[ 886.903090] NVRM: The system BIOS may have misconfigured your GPU.
[ 886.903096] nvidia: probe of 0000:45:00.0 failed with error -1
[ 886.903117] NVRM: The NVIDIA probe routine failed for 2 device(s).
[ 886.903118] NVRM: None of the NVIDIA graphics adapters were initialized!
[ 886.903181] nvidia-nvlink: Unregistered the Nvlink Core, major device number 241
```

After asking [墨羲](https://blog.morxi.com), I found the culprit: MMIO (Memory Mapped I/O). The option may go by a different name in each BIOS, but it is almost always memory-related.

The principle: if the MMIO space is set too large (beyond the address range the operating system supports) or too small, the system may be unable to assign addresses properly, and PCI devices cannot be attached. In the log above, `BAR1 is 0M @ 0x0` means that no address window was assigned to BAR1 of either GPU.

Practically every server motherboard has this setting; reading the manual patiently is enough to find it.

For a VMWare ESXi passthrough machine, see my earlier post [VMWare ESXi GPU passthrough (PCI device passthrough) fails with a DevicePowerOn error](https://c4a15wh.cn/index.php/archives/12/); following it should work in most cases.

PVE passthrough rarely runs into this problem. Keep in mind, though, that passthrough is only possible once the host itself has addressed and recognized the compute card, so the motherboard's MMIO setting still has to be configured.

## Notes

This post expands on the earlier [Notes on installing GPUs and configuring the environment](https://c4a15wh.cn/index.php/archives/14/); see that post for details.

Q.E.D

C4a15Wh_5.1, 2021-12-17 13:50. Last modified: 2025-01-15. © Reposting permitted with proper attribution.
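To spot the same symptom on another machine without reading the whole kernel log, a small script can pull out the devices whose BAR was left unassigned. The following is only a minimal sketch: the regular expression is modeled on the exact `NVRM: BAR1 is 0M @ 0x0 (PCI:...)` wording in the dmesg output above (other driver versions may phrase it differently), and the function name `find_unassigned_bars` is hypothetical, not part of any NVIDIA tool.

```python
import re

# Pattern modeled on the message in the dmesg output above, e.g.
#   NVRM: BAR1 is 0M @ 0x0 (PCI:0000:44:00.0)
# Assumption: the driver in use prints the message in this exact form.
BAR_RE = re.compile(
    r"NVRM: BAR(?P<bar>\d+) is (?P<size>\d+)M @ (?P<addr>0x[0-9a-fA-F]+) "
    r"\(PCI:(?P<pci>[0-9a-fA-F:.]+)\)"
)


def find_unassigned_bars(log_text):
    """Return sorted, de-duplicated (pci_address, bar_index) pairs
    that the log reports with size 0 at address 0."""
    hits = set()
    for match in BAR_RE.finditer(log_text):
        size_mb = int(match.group("size"))
        address = int(match.group("addr"), 16)
        if size_mb == 0 and address == 0:
            hits.add((match.group("pci"), int(match.group("bar"))))
    return sorted(hits)


# Two lines copied from the log above; the message repeats on every
# probe attempt, so duplicates are collapsed.
sample = """\
[ 886.780112] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:44:00.0)
[ 886.780152] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:45:00.0)
[ 886.903055] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid:
               NVRM: BAR1 is 0M @ 0x0 (PCI:0000:44:00.0)
"""

for pci_address, bar_index in find_unassigned_bars(sample):
    print(f"{pci_address}: BAR{bar_index} has no MMIO window assigned")
```

In practice the text would come from saving the output of `dmesg` to a file and passing its contents to the function. If any device shows up, the MMIO setting in the BIOS is the first thing to check.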