Void Computing – More AR glasses USB protocols: the Worse, the Better and the Prettier

We have found a drop-in replacement for the Nreal Light, called
the Grawoow G530 (or Metavision M53, and who knows how many other
names), so we finally have 3 more protocols to write about on this blog.
The previous blog post
on the topic has become somewhat of a reference in
the community, and has actually driven some sales for the company, so it
seemed like a good idea to write about our newer findings, and share
them with everyone.
The post itself will probably be a bit dry for the casual reader.
Sorry about that.
We started searching for a replacement for the XREAL Light
from Day 1, because it isn’t supported or manufactured by XREAL anymore.
We needed glasses with stereo cams and active support.
Some months ago I got contacted on LinkedIn by a Chinese vendor, and after a bit of talking,
we bought a test piece. It wasn’t as easy as just going on a webshop (and I had to do all kinds of
import paperwork), but it was still smooth: send some mails, wire money, receive glasses.
I call it worse, because it’s a tiny bit worse in every regard: it looks cheaper,
the plastic parts fit worse, and the protocol is missing some information that
is crucial for good support. The main thing it has going for it is that it’s still
available from vendors.
The architecture is extremely similar to the XREAL Light, so much so that I am
only going to link the draft architecture pic.
Main components:
- All the standard DP-over-USB-C stuff driving two micro OLED displays, on two USB 3 lanes
- A USB 3 RGB cam on the remaining two lanes
- Stereo grayscale cams and IMU driven by an OV-580 (exactly the same as the Light, even
the protocol… mostly)
- Distance sensor on the forehead
- 4 physical buttons: brightness up/down, and volume up/down
- Judging by the USB device descriptors, the MCU and audio functionalities are
done by the same chip.
Gotta love Chinese copying culture. By the way, the glasses seem to be widely
white-labeled; the Metavision M53 seems to be the same hardware. Even the
firmware and SDK say G530 and not M53.
USB interfaces
The device comes up as two hubs (one USB 3 and one USB 2) and 5 devices:
- Realtek Semiconductor Corp. RGB Camera
  - VID: 0bda, PID: 5880
  - Bog-standard USB3 camera, capable of full HD at some pretty high frame rates
- CVT Electronics.Co.,Ltd G530
  - VID: 1ff7, PID: 0ff4
  - Interfaces
    - 0: HID. This is the glasses control (MCU) endpoint.
    - 1,2,3: Audio
- OmniVision Technologies, Inc. USB Camera-OV580
  - VID: 05a9, PID: 0f87
  - The OV580 with a UVC (stereo cam) and a HID (IMU) interface
MCU Protocol
The MCU control is predominantly through control packets, although there
is an interrupt endpoint for the forehead detector event.
The control protocol is always two control packets, one to send the command
and one to receive the result. The magic libusb
parameters are:
Send:
bmRequestType: 0x21 (CTRL_TYPE_CLASS|CTRL_RECIPIENT_INTERFACE|ENDPOINT_OUT)
bRequest: 9
wValue: 0x201
wIndex: 0
Receive:
bmRequestType: 0xa1 (CTRL_TYPE_CLASS|CTRL_RECIPIENT_INTERFACE|ENDPOINT_IN)
bRequest: 1
wValue: 0x102
wIndex: 0
Note that these are the standard SetReport
and GetReport
HID requests (see
Section 7.2 in the HID Device Class definition),
so these might be accessible with some standard report-based HID APIs.
The packet structure is as follows:
- Header: 2 bytes, fixed 0xaa, 0xbb
- Command: 2 bytes, big endian
- Additional data size: 2 bytes, big endian
- Additional data: variable, can be 0 bytes
- Checksum: 1 byte, the sum of all previous bytes modulo 256, excluding the 0xaa, 0xbb part.
Example packets:
- Get serial number: [0xaa, 0xbb, 0x80, 5, 0, 0, 0x85]
- Response: [0xaa, 0xbb, 0x80, 5, 0, 5, 0x33, 0x31, 0x33, 0x33, 0x37, 0x8b]
  - (I’ve accidentally overwritten the serial before I could read the original one.)
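The packet layout above can be sketched in a few lines of Python. This is only an illustration, not code from the SDK: the function names are made up here, and the checksum is assumed to be a single byte (the sum truncated to 8 bits), which is what the two example packets suggest.

```python
HEADER = bytes([0xAA, 0xBB])


def checksum(body: bytes) -> int:
    """Sum of every byte after the 0xaa 0xbb header, truncated to one byte."""
    return sum(body) & 0xFF


def build_packet(command: int, data: bytes = b"") -> bytes:
    """Serialize a command: header, command ID, data size, data, checksum."""
    body = command.to_bytes(2, "big") + len(data).to_bytes(2, "big") + data
    return HEADER + body + bytes([checksum(body)])


def parse_packet(packet: bytes) -> tuple[int, bytes]:
    """Return (command, data) from a reply; trailing padding is ignored."""
    if packet[:2] != HEADER:
        raise ValueError("bad header")
    command = int.from_bytes(packet[2:4], "big")
    size = int.from_bytes(packet[4:6], "big")
    if len(packet) < 7 + size:
        raise ValueError("truncated packet")
    if checksum(packet[2:6 + size]) != packet[6 + size]:
        raise ValueError("bad checksum")
    return command, packet[6:6 + size]
```

Feeding the example response through `parse_packet` gives command 0x8005 and the five data bytes 0x33, 0x31, 0x33, 0x33, 0x37.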
Commands (data is empty for “Get” commands here):
Command | ID | Data |
---|---|---|
Get firmware version | 0xffe1 | 8 bytes, unknown format |
Get serial number | 0x8005 | The serial number as UTF-8 string |
Set serial number | 0x8004 | Same as above |
Get display mode | 0x8007 | Display mode as a single byte: 0 is mirrored, any nonzero value is SBS 60Hz |
Set display mode | 0x8008 | Same as above |
Get display brightness | 0x801d | Brightness as a single byte, 0-4 |
Set display brightness | 0x801e | Same as above |
Some commands I haven’t listed (or tested really), but can be easily obtained from
the SDK libraries:
- Sensor and camera enable/disable: All sensors and cameras are enabled and streaming
by default, so no need to touch these
- Display settings, like brightness difference per channel, and all kinds of low-level DP stuff
- Audio volume
- Firmware update
- Sketchy stuff like getting and setting a HDCP key
You can also continuously read the interrupt endpoint on endpoint number 0x85,
where you should get key and distance sensor events (in the same 0xaa 0xbb format
as the control packets), but I only ever got the “glasses taken off” event,
and it was not worth implementing.
Getting the calibration data
As opposed to the XREAL protos, where you can get the calibration JSON
from the OV580, you actually have to do it over the above MCU protocol, using
command IDs 0x8009 (metadata) and 0x800a (actual calibration data).
The metadata response looks something like this: [0, 0, 0, 241, 0, 0, 10, 210, 3, 142]
- 2 bytes header, which should be 0
- 2 bytes is the “max packet size” (big endian). We’ll be doing 256 byte control packets anyway, but good to know I guess?
- 4 bytes data size (big endian)
- 2 more unknown bytes.
The “get calibration data” packet needs additional data: a 0 byte, and then
a 4 byte offset, in big endian. So it’s [0, 0, 0, 0, 0]
for the first packet,
[0, 0, 0, 0, 241]
for the next, and so on.
The response is the same 5 bytes followed by a 0 byte (so 6 in total), and then
the actual data. If you request more data than the calibration file size,
the packet will be smaller, or even empty. So requesting the metadata is kind
of useless, you can just request data until you get an empty response.
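As a rough sketch, the download loop could look like the following. `transact` is a hypothetical helper (not part of any SDK) that sends one command with the given additional data and returns the additional data of the reply; the function names are also made up for illustration.

```python
import struct
from typing import Callable

GET_CALIBRATION_DATA = 0x800A


def parse_metadata(payload: bytes) -> tuple[int, int]:
    """Return (max_packet_size, data_size) from the 0x8009 reply payload."""
    header, max_packet_size, data_size = struct.unpack_from(">HHI", payload)
    if header != 0:
        raise ValueError("unexpected metadata header")
    return max_packet_size, data_size


def read_calibration(transact: Callable[[int, bytes], bytes]) -> bytes:
    """Request chunks at increasing offsets until an empty one comes back."""
    blob = bytearray()
    while True:
        # Additional data: one 0 byte, then the 4-byte big-endian offset.
        request = b"\x00" + len(blob).to_bytes(4, "big")
        reply = transact(GET_CALIBRATION_DATA, request)
        chunk = reply[6:]  # skip the 5 echoed bytes and the extra 0 byte
        if not chunk:
            return bytes(blob)
        blob += chunk
```

For the metadata example above, `parse_metadata` yields a max packet size of 241 and a data size of 2770 bytes. The loop advances by however many bytes actually arrived, so it does not need the metadata at all.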
IMU protocol
Fortunately this is another pair of glasses that gives you an IMU stream out of the box,
and you don’t need to fight for it. All you have to do is continuously read 0x80-byte
chunks
on the HID interrupt endpoint 0x89
of the OV580 device.
It is a large packet, and the SDK only parses the raw accelerometer, gyro and temperature
data. A lot of the packet seems to be fixed bytes, and the only things that change
(apart from what we already know) are two sequence numbers. Yeah, sequence numbers, not
even proper timestamps.
All data are transferred as little endian signed ints. The conversion factors
are the same as in the Invensense MPU6050 docs.
Data | Offset | Size | Conversion |
---|---|---|---|
Acceleration | 0x58 | 3*4 | Divide by 16384.0 and then convert g to m/s² |
Gyroscope | 0x3c | 3*4 | Divide by 16.4 after which convert °/s to rad/s |
Temperature | 0x2a | 2 | Divide by 326.8, then add 25.0 |
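The table translates fairly directly into a parser. The sketch below assumes that “3*4” means three 4-byte signed integers in x, y, z order and uses standard gravity for the g to m/s² conversion; the function name is made up for illustration.

```python
import math
import struct

STANDARD_GRAVITY = 9.80665  # m/s² per g


def parse_g530_imu(packet: bytes):
    """Return (accel in m/s², gyro in rad/s, temperature in °C)."""
    if len(packet) < 0x80:
        raise ValueError("IMU packets are 0x80 bytes long")
    accel_raw = struct.unpack_from("<3i", packet, 0x58)
    gyro_raw = struct.unpack_from("<3i", packet, 0x3C)
    (temp_raw,) = struct.unpack_from("<h", packet, 0x2A)
    accel = tuple(v / 16384.0 * STANDARD_GRAVITY for v in accel_raw)
    gyro = tuple(math.radians(v / 16.4) for v in gyro_raw)
    temperature = temp_raw / 326.8 + 25.0
    return accel, gyro, temperature
```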
The Rokid Max is a logical evolution of the Rokid Air.
Better design, better fit, better protocol, and the DisplayPort part is apparently 2ms faster,
reducing motion-to-photon latency. Everything else is pretty much the same,
so much so that most of it can be handled by the same code.
They even kept the gimmicky focal adjustment knobs (although
they are still unusable for people who have astigmatism).
Protocol
The main new protocol thing is “sensor data marker = 17” in the IMU data packets, which combines
all previous packets into one. Its structure looks like this:
Index | Bytes | Description |
---|---|---|
0x00 | 1 | Sensor data marker (17) |
0x01 | 8 | Timestamp (little endian) |
0x09 | 3×4 | Gyroscope x, y and z reading in f32 format |
0x15 | 3×4 | Accelerometer x, y and z reading in f32 format |
0x21 | 3×4 | Magnetometer x, y and z reading in f32 format |
0x2d | 1 | Physical key statuses (bitfield) |
0x2e | 1 | Proximity sensor status (near=0, far=1) |
0x2f | 1 | ? |
0x30 | 8 | Timestamp of last VSYNC (little endian) |
0x38 | 3 | ??? |
0x3b | 1 | Display brightness |
0x3c | 1 | Volume |
0x3d | 3 | ??? |
Display modes
The Max added a bunch of new display modes:
Mode | SBS | Resolution | Refresh rate |
---|---|---|---|
0 | | 1920×1080 | 60Hz |
1 | Yes | 3840×1080 | 60Hz |
2 | Yes* | 1920×1080 | 60Hz |
3 | | 1920×1080 | 120Hz |
4 | Yes | 3840×1200 | 90Hz |
5 | Yes | 3840×1200 | 60Hz |
*: This is a “half SBS” mode, meaning that it splits the regular HD
image in half, and then stretches each half horizontally over each of
the two displays.
Modes above 6 are equal to mode 3.
The XREAL Air isn’t an evolution of the Light, it is much more
similar to the Rokid Max, but with a way better design. And I mean a lot better,
the thing actually looks like regular (albeit a bit big) sunglasses.
It is the first pair of AR glasses that passes the “Tram #4 test”: I could wear it on Tram #4
and people wouldn’t really notice. Maybe the cable hanging down.
Unfortunately it doesn’t have a camera, so no inside-out 6DOF anymore.
On the other hand, it has the absolute lowest display delay out of all 6 we
described in these blog posts, so the image is rock stable even with dynamic
head movements.
The protocol is weird. They kept the separate USB interfaces for the MCU
and IMU + DSP pair. Both are different from the Light’s.
Unfortunately this post was written way after I finished work on the Air,
so I am writing it based on the code of ar-drivers-rs.
MCU protocol
Packets are sent over regular HID read()
and write()
primitives, over interface 4
(endpoints 0x86 and 0x07). Packet size is 0x40
both ways.
Index | Bytes | Description |
---|---|---|
0x00 | 1 | Header (0xfd) |
0x01 | 4 | Checksum (see below) |
0x05 | 2 | Length of additional data |
0x07 | 4 | Request ID (not checked by the MCU, only used to identify answers. Can be anything) |
0x0b | 4 | Timestamp (also not checked, can be 0) |
0x0f | 2 | Command ID |
0x11 | 5 | Zeros (probably) |
0x16 | n | Additional data |
Every int is Little Endian.
The checksum is CRC32(Adler) like
the Light’s. The checksummed data is from byte 5 to the end of the packet (i.e. the length field + 17).
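A hedged sketch of a packet builder following this table. Several things here are assumptions rather than facts from the device: “CRC32(Adler)” is taken to mean the standard zlib CRC-32 (not Adler-32), the length field is taken to hold the number of additional data bytes (so 17 + n bytes get checksummed), and unused bytes are zero padded up to 0x40. Check against ar-drivers-rs before relying on any of it.

```python
import struct
import zlib

PACKET_SIZE = 0x40
MCU_HEADER = 0xFD


def build_air_mcu_packet(command_id: int, data: bytes = b"",
                         request_id: int = 0, timestamp: int = 0) -> bytes:
    """Serialize one MCU command into a 0x40-byte HID packet."""
    # Bytes 5 onward: length, request ID, timestamp, command ID, 5 zeros, data.
    body = struct.pack("<HIIH5x", len(data), request_id, timestamp,
                       command_id) + data
    if 5 + len(body) > PACKET_SIZE:
        raise ValueError("additional data does not fit in one packet")
    checksum = zlib.crc32(body)  # assumption: plain zlib CRC-32
    packet = struct.pack("<BI", MCU_HEADER, checksum) + body
    return packet.ljust(PACKET_SIZE, b"\x00")
```

With these assumptions, `build_air_mcu_packet(0x0015)` produces a packet whose command ID sits at offset 0x0f and whose additional data would start at 0x16, matching the table.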
Again, there is no need to separately enable events or hardware, so we
only need the bare minimum commands:
Command | ID | Data |
---|---|---|
Get MCU FW version | 0x0026 | Version as UTF-8 string |
Get serial number | 0x0015 | The serial number as UTF-8 string |
Get display mode | 0x0007 | Display mode as a single byte |
Set display mode | 0x0008 | Same as above |
There is a .js file in the official app that describes a lot more
commands for both the Air and the Light. There aren’t many interesting things,
just a couple of version strings, firmware update, reboot, and fiddling with
the display.
Some asynchronous events also arrive on the same channel (sometimes between
a command and its reply). They use the same packet format as the commands and replies.
The only one worth looking for is ID 0x6c05, which is the key press (more precisely key release)
event.
Display modes
They also added a lot more display modes:
Mode | SBS | Resolution | Refresh rate |
---|---|---|---|
1 | | 1920×1080 | 60Hz |
3 | Yes | 3840×1080 | 60Hz |
4 | Yes | 3840×1080 | 72Hz |
5 | | 1920×1080 | 72Hz |
8 | Yes* | 1920×1080 | 60Hz |
9 | Yes | 3840×1080 | 90Hz |
10 | | 1920×1080 | 90Hz |
11 | | 1920×1080 | 120Hz |
*: This is a “half SBS” mode, meaning that it splits the regular HD
image in half, and then stretches each half horizontally over each of
the two displays. This is the replacement for Mode 1, which was vertically
stretched half-SBS on the Light.
Invalid display modes cause an error, and I checked all 256 values.
The IMU protocol
IMU packets are also sent/received with regular HID read()
and write(),
over interface 3 (endpoints 0x84 and 0x05), with 0x40-sized packets.
Index | Bytes | Description |
---|---|---|
0x00 | 1 | Header (0xaa) |
0x01 | 4 | Checksum (same as MCU checksum) |
0x05 | 2 | Length of additional data |
0x07 | 1 | Command ID |
0x08 | n | Additional data |
Every int is Little Endian.
Interestingly, while the packet format is very different, the commands are
exactly the same as the Light’s:
Command | ID | Command data |
---|---|---|
Get calibration file length | 0x14 | Calibration file ID according to the SDK, doesn’t seem to affect anything. Can be empty |
Get calibration file part | 0x15 | Should be the block number. Doesn’t do anything, can be empty. |
Enable IMU stream | 0x19 | 0: disable, 1: enable |
The calibration file format is similar, although this time they didn’t stuff
3 different files in there, you only have the JSON.
The IMU packet format is different, more compact, but the logic is the same:
Index | Bytes | Description |
---|---|---|
0x00 | 2 | Header (0x01, 0x02) |
0x02 | 2 | Temperature (raw data from the ICM-20602) |
0x04 | 8 | Timestamp (nanoseconds) |
0x0c | 2 | Gyroscope multiplier |
0x0e | 4 | Gyroscope divisor |
0x12 | 3 | Gyroscope X reading |
0x15 | 3 | Gyroscope Y reading |
0x18 | 3 | Gyroscope Z reading |
0x1b | 2 | Accelerometer multiplier |
0x1d | 4 | Accelerometer divisor |
0x21 | 3 | Accelerometer X reading |
0x24 | 3 | Accelerometer Y reading |
0x27 | 3 | Accelerometer Z reading |
0x2a | 2 | Magnetometer offset |
0x2c | 4 | Magnetometer divisor |
0x30 | 2 | Magnetometer X reading |
0x32 | 2 | Magnetometer Y reading |
0x34 | 2 | Magnetometer Z reading |
Yes, there are 3 byte signed integers there. They are encoded the same
way as “regular” 4 byte integers (little endian, two’s complement), but on 3 bytes.
Luckily the Rust parsing library I use has built-in support for these,
because manually converting is a pain.
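In Python the 3-byte integers are painless too, since `int.from_bytes` accepts any width. The sketch below assumes the usual two’s complement sign handling, and it assumes that a scaled reading is raw × multiplier ÷ divisor with unsigned multiplier and divisor; the table only names the fields, so treat the scaling (and the function names) as a guess.

```python
def read_int(packet: bytes, offset: int, size: int, signed: bool = True) -> int:
    """Little-endian integer of arbitrary width, e.g. the 3-byte readings."""
    return int.from_bytes(packet[offset:offset + size], "little", signed=signed)


def scaled_axes(packet: bytes, mul_off: int, div_off: int, axes_off: int):
    """Assumed scaling: raw * multiplier / divisor for three 3-byte readings."""
    multiplier = read_int(packet, mul_off, 2, signed=False)
    divisor = read_int(packet, div_off, 4, signed=False)
    if divisor == 0:
        raise ValueError("zero divisor in IMU packet")
    return tuple(
        read_int(packet, axes_off + 3 * i, 3) * multiplier / divisor
        for i in range(3)
    )


def parse_air_imu(packet: bytes):
    """Return (timestamp in ns, gyroscope xyz, accelerometer xyz)."""
    timestamp_ns = read_int(packet, 0x04, 8, signed=False)
    gyro = scaled_axes(packet, 0x0C, 0x0E, 0x12)
    accel = scaled_axes(packet, 0x1B, 0x1D, 0x21)
    return timestamp_ns, gyro, accel
```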
One thing to note is that the coordinate system of the raw sensor
readings is different from the calibration file’s coordinate system.
I always wanted to support the Rokid Max, but I didn’t really want to buy one
just to do it. Luckily, a kind soul from Canada actually got in contact with me on github
and paid for both the glasses and my time to do it. Thanks again Mauve.
The only extra was that I had to also make a Monado driver.
Monado is a nice piece of software that implements the OpenXR API,
so any OpenXR-using apps (major 3D engines, some AR desktops for example)
can use any Monado-supported hardware. They have a very friendly discord, and the code is very good quality,
it was a pleasure to work with, and my code got reviewed basically instantly. Once the
comments were fixed, it was in trunk the next day.
Support for the Rokid Max has been merged to main. Some people are working
on supporting the Nreal Air, and (as of writing) it works well, but there are
some kinks to be ironed out.
Maybe you can help 🙂
If you need Augmented Reality problem solving,
or want help implementing an AR or VR idea, drop us a mail at info@voidcomputing.hu