I built a special PCIe card to test GPUs on the Pi

I partnered up with Mirek (of Mirkotronics / @Mirko_DIY on Twitter) to build the Pi4GPU (or ‘PiG’ for short):
This journey started almost three years ago: almost immediately after the Raspberry Pi Compute Module 4 was launched, I started testing graphics cards on it.
First I tried some low-spec cards like the Zotac Nvidia GT 710 and the VisionTek AMD Radeon 5450. They kept locking up regardless of the driver and Linux versions.
Over the next couple of years, I kept testing more and more cards (over 14 at the time of this writing). The reason? Each card (really, each generation of each vendor’s cards) had quirks that made it more or less likely to run on the Pi.
Why’s that? Well, the Pi has problems with cache coherency on the PCI Express bus beyond 32 bits. And many (well, all these days) of the drivers expect that to work. That wasn’t the first problem, though: early on, there were issues with the BAR (Base Address Register) space allocated by the Pi’s OS. Luckily this could be worked around by re-compiling a DTB file.
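To get a feel for how much BAR space a given card asks for, here is a minimal sketch that parses the per-device `resource` file Linux exposes under `/sys/bus/pci/devices/<address>/resource`, where each line holds a start address, an end address, and flags in hex. The function name and the sample values are made up for illustration; they are not taken from any card I tested.

```python
# Hypothetical sample of a sysfs PCI "resource" file (values are invented).
# Each line is: start address, end address, flags. All-zero lines are unused.
SAMPLE_RESOURCE = """\
0x0000000600000000 0x000000060fffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000610000000 0x00000006101fffff 0x000000000014220c
"""


def parse_bar_sizes(resource_text):
    """Return a list of (region_index, size_in_bytes) for populated regions."""
    sizes = []
    for index, line in enumerate(resource_text.strip().splitlines()):
        fields = line.split()
        if len(fields) < 2:
            continue  # skip malformed lines
        start = int(fields[0], 16)
        end = int(fields[1], 16)
        if start == 0 and end == 0:
            continue  # region not in use
        # Address ranges are inclusive, so add 1 to get the size.
        sizes.append((index, end - start + 1))
    return sizes


bar_sizes = parse_bar_sizes(SAMPLE_RESOURCE)
total_mib = sum(size for _, size in bar_sizes) / (1024 * 1024)
for index, size in bar_sizes:
    print(f"region {index}: {size / (1024 * 1024):.0f} MiB")
print(f"total: {total_mib:.0f} MiB")
```

On a real system you would read the file contents for the card's PCI address and pass them to `parse_bar_sizes`; if the total is larger than the window the device tree reserves for PCIe memory, the card cannot be fully mapped.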
This whole endeavor is what inspired me to create the Raspberry Pi PCIe Device Database, which documents (often in excruciating detail) the travails of bringing up various PCI Express devices on a Raspberry Pi, and now on other ARM SBCs like the Radxa Rock 5 and the Pine64 SOQuartz.
Making a Custom PCB
Last year I floated the idea of a custom PCB to ‘plug a computer into a graphics card’ to Mirek. He didn't immediately say ‘no’, so I started pursuing the idea, coming up with this rather basic post-it note illustration:
Miraculously, through a series of emails, we refined that concept into a working PCB design. Mirek had it printed by JLCPCB, soldered on a bunch of SMD parts, and shipped the PCBs (along with some metal brackets he had his friend Adam fabricate) by February.
In the midst of that journey, I had another major surgery, and wound up working on a so-far-still-secret-project that soaked up the entire month of March 2023.
So here we are today: between spare hours in March and a few weeks’ time finally testing this thing with all the cards this month, I’ve uncovered one or two minor quirks with the build, designed a 3D-printed base (pictured above) that supports up to a 4090-sized behemoth PCIe card (pictured below), and documented everything in our open source Pi4GPU repository.
The card has a standard PCIe x4 (physical) edge connector, and it plugs into a special PCIe-to-PCIe adapter board, which fits neatly into the recessed part of a 3D-printed base.
The graphics card slots into the x16 slot on the adapter board (x16 physical, with only the x1 pins connected), and if needed, you use an external power supply to power beefier GPUs.
The Pi4GPU board itself can be powered via USB-C or 6-pin PCIe (inside edge), or via a 12V barrel plug (outside edge). It also has 2 USB 2.0 ports, a full-size HDMI port, and a 1 Gbps LAN port (all via the rear PCIe bracket):
It could physically slot into a motherboard inside a computer, but that’s not recommended, as I haven't tested that configuration…
External GPUs on a Raspberry Pi (or other Arm SBCs)
So now we have this card, and I can plug it into a variety of graphics cards… do any of them work? Or, for avid readers of this blog: has anything changed since last year’s update?
Well… it's still a bit grim. So far we only have some older AMD cards working with a kernel patch for the radeon graphics driver, and the SM750 GPU, which is used on ASRock Rack’s M2_VGA, using this patch to the sm750 driver.
In some positive news, though, Nvidia’s proprietary driver no longer hard-locks-up when it hits the memory errors on the Pi. Now it will try to load, but then error out. That could make debugging easier… if the driver source were fully open and available.
Sadly, it is not.
And on AMD’s side (as well as other vendors of other PCI Express cards, like Google’s Coral TPU), there is no desire to either maintain a fork of their drivers or spend time hacking their driver to work on the buggy PCIe bus on the Pi.
And the Pi is not alone: it seems other SBC SoCs like the Rockchip RK3566 and RK3588 have a slightly broken PCIe implementation as well. ‘Cache coherency’ is the problem: these ARM SoCs, which have their heritage in TV boxes and embedded devices, don't have fully working PCI Express implementations.
Other ARM chips do, however, like this Ampere Altra Development Platform that just arrived courtesy of Ampere:
It has a fully functional PCI Express implementation, and also has 128 lanes of PCI Express Gen 4… which is about a zillion times more bandwidth than the single PCIe Gen 2 lane on the Pi and similar-era SoCs.
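For the curious, ‘a zillion’ can be roughed out from the published per-lane signaling rates: Gen 2 runs at 5 GT/s with 8b/10b encoding, and Gen 4 runs at 16 GT/s with 128b/130b encoding. The sketch below is a back-of-the-envelope estimate of theoretical one-direction throughput only; it ignores protocol overhead and assumes every lane is usable at full speed. The function and variable names are just for illustration.

```python
def lane_bandwidth_mb_s(gigatransfers_per_s, payload_bits, encoded_bits):
    """Theoretical one-direction throughput of a single PCIe lane in MB/s."""
    bits_per_second = gigatransfers_per_s * 1e9 * payload_bits / encoded_bits
    return bits_per_second / 8 / 1e6  # bits -> bytes -> megabytes


# Per-lane figures from the PCIe signaling rates and line encodings.
gen2_lane = lane_bandwidth_mb_s(5.0, 8, 10)       # PCIe Gen 2, 8b/10b
gen4_lane = lane_bandwidth_mb_s(16.0, 128, 130)   # PCIe Gen 4, 128b/130b

pi_total = gen2_lane * 1       # single Gen 2 lane on the Pi
altra_total = gen4_lane * 128  # 128 Gen 4 lanes on the Ampere Altra

print(f"Pi:    {pi_total:,.0f} MB/s")
print(f"Altra: {altra_total:,.0f} MB/s")
print(f"Ratio: about {altra_total / pi_total:.0f}x")
```

So not quite a zillion, but the theoretical gap is still a factor of several hundred.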
Check out my initial review of the Ampere system: Testing a 96-core Ampere Altra Developer Platform.
Apple’s M-series chips might have even more bandwidth (per lane), but there is no easy way to get at the PCI Express expansion on them. Maybe the upcoming Mac Pro will make it happen, but I'm not holding my breath.
Video
Check out my video with a lot more detail about the project: