Fixing the Volume on my Bluetooth Earbuds

I recently bought a pair of Tozo T6 earbuds. They’re nice and I like them, but they play a sound every time you pair, unpair, or connect them, and it’s way too loud for my preference. I also wasn’t able to fix it by e.g. setting the equalizer to subtract a few decibels across the board. I asked them via email about this and they responded promptly and said that there was nothing they could do, which is understandable – it’s probably not a common request. But it was too loud for me to continue using them, so I decided to try to solve the problem myself.

Tozo T6 earbuds. They’re nice. I only have one complaint…
To solve the problem, I need to modify the firmware that runs on the device. My initial expectation of how this would work was:
- I’d get a binary file from somewhere for my device. People often share firmware files online, maybe I could find a copy using a search engine.
- The firmware file would be in some easily-understood binary structure like ELF.
- The audio files would be contained in the binary somewhere, maybe as an ELF symbol (going from the last point). Knowing how the image format worked would let me modify the data inside it while making sure that I didn’t accidentally give it a corrupt image and possibly brick my device.
- The audio files would be in a format that would be easy to transform, maybe PCM since it’s running on an embedded device with perhaps limited computational power to decode audio (of course, it’s also a headset, so probably it can decode compressed audio just fine).
- Once I can modify data within the firmware image (either unpack/repack it or modify data in-situ by knowing its offset and length within the image), do so to make the audio quieter (e.g. if it’s PCM then maybe halve each sample, etc).
- Finally, I’d flash my modified firmware to the device with some kind of tool made for my model of equipment or its underlying chipset, etc.
Some of these assumptions ended up being completely false and unwarranted (I have no idea why I hoped that the audio would be uncompressed on a low-power device like this, for example), but this was my thinking starting out. It also doesn’t include any reverse engineering, which ended up taking up most of the time that wasn’t spent on setting up infrastructure (such as an intercepting proxy), but this was mostly just going down rabbitholes. In the end, I actually didn’t need to reverse engineer much at all. So this post is less about reverse engineering and more about the general process of solving my particular problem.
The first step is to collect information about what exactly the device is. There seem to be several different entities involved in the production of cheap electronics:
- The vendor, who actually brands and sells the device – Tozo, in this case.
- The chipset, a particular piece of hardware the device is mostly designed around, which runs the firmware code and might have special features for the application at hand.
- The ISA – the chipset will run code of a particular instruction set, with a “core” derived from some other base tech like ARM, MIPS, etc (or perhaps they rolled their own).
- Additional features – the chipset might integrate technology from other people onto their chip, such as additional coprocessors, chips to interface with different kinds of hardware, etc.
The chipset for my device ended up being an Airoha AB1562, which apparently is based on a Tensilica Xtensa ISA and includes a “Cadence HiFi DSP coprocessor”. I wasn’t able to find this out by searching for my device model, however, so I ended up just looking through the disassembly of their Android app. There I found an SDK for a company called “Airoha”, with references to specific chip models and containing all of the primitives for talking to devices. I wasn’t able to find any other chipset SDKs, so I assumed this was the chipset manufacturer.
I still needed to find the specific model though. After a bit more searching, I found a Reddit community for discussing AirPods clones called /r/airreps, which gave me some good advice about how to proceed. They’ve also written an Android application called “AirReps156X” which also uses the Airoha SDK, and can show diagnostic information about Airoha devices. I was able to connect to this app with my device, so it’s definitely an Airoha chipset, and one of the diagnostic strings was “QW_1562U_SDK1.5.1”, which leads me to believe that my device chipset is in the Airoha AB1562 series:

The “Airreps 156X” app displaying device information.
The application also allows you to flash new firmware, which is a pretty necessary item on the checklist. So now that we’ve gotten that prerequisite out of the way and identified the chipset, all that’s left to do is to find the actual firmware and modify it.
The Tozo application is my first lead. When you connect your earbuds to the Tozo app, it displays the firmware version and whether or not it’s “current”. So it must talk to a server somewhere, which knows what the latest version of the firmware is:

Tozo’s app showing the current firmware version.
In theory, I could probably just read through the decompiled code in jadx or something until I find out what servers it’s talking to, how, and what it does when it checks for updates (such as, presumably, downloading the firmware files I’m looking for). But that’s a lot of work, so I have a better idea – when the app checks if the firmware is up to date, it might also make API requests that could shed some light on where to find the actual firmware files. It might even be possible to trick it into trying to update, which would also lead me to the actual URLs. So some quick and dirty “dynamic analysis” via traffic interception is the logical next step.
Setting up an intercepting proxy
To this end I set up an intercepting proxy using my wireless NIC with hostapd and mitmproxy, and patched the Tozo app with apktool + uber apk signer so that it’ll let us strip the TLS and snoop on its raw network traffic.
Patching the APK is pretty standard stuff – I just followed this gist. The idea is that Android applications have two CA stores, one that can be easily modified by the user, and one that can’t, and by default most Android apps only check TLS certificates against the latter. But, by patching the APK, we can tell it to use the former too, which is where we put our mitmproxy-provided TLS certificate that we’re going to use to snoop on all of the network traffic used by our app. Then we have to sign it so Android will accept it.
The intercepting proxy setup was pretty easy – just set up the AP, set up some iptables rules to direct traffic to mitmproxy’s listening port, and do the usual NAT song and dance:
The script I used to start and stop the TLS-stripping wireless AP. I know make isn’t really the right tool for this, but it’s a force of habit at this point.
Snooping on the application’s network traffic
Once it was all working, I saw that when I connected the device to the app and the “current” string popped up next to the firmware version, it made a request to an endpoint /api/v1/getOtaVersionV3. And, lo and behold, the response contains links to all of the firmware bins we’re looking for! How nice. No trickery needed.
getOtaVersionV3 request in Wireshark.
getOtaVersionV3 response in mitmproxy.
There are four files, two per earbud, each having a “FotaPackage” and a “FileSystemImage”. The two filesystem images are identical, so we wind up with three distinct files – two FotaPackages for the L and R earbuds, and the filesystem image.
The first thing anybody does when they get a weird file is run the Linux command “file” on it, to see if it has a magic number that indicates its file format:

Unhelpful.
…and, if that doesn’t help, they might run strings or hexdump over it, to see if there are any interesting human-readable ASCII strings in it:

From the filesystem image – some filenames, at least.
and then binwalk to see if there are any files embedded in it:

The LZ4 region contains the NVROM, which isn’t useful to us – it doesn’t contain any audio files.
Unfortunately, binwalk didn’t find anything useful, not even the mp3 files whose filenames are referenced directly in the image. They’re indeed in there, it’s just that the mp3 file format is not very easy to carve out of arbitrary binary data because it doesn’t have any kind of magic number (an MPEG audio frame starts with an 11-bit sync pattern of all ones, so for mp3 the first two bytes are typically 0xFFFB or 0xFFFA, neither of which is particularly distinctive, and there’s no footer). So although you can tell they’re in there, it’s not immediately obvious how to unambiguously calculate the offset and length for each mp3 file. So, I decided the best way to figure this out would be to decipher the filesystem image format, which probably has information that tells you where each file starts and ends.
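To illustrate why carving is ambiguous, here is a minimal Python sketch that lists every offset where an MPEG audio frame could begin, i.e. where 11 consecutive set bits appear on a byte boundary. The function name `sync_candidates` is made up for this example, and it deliberately does no further header validation, so on real data most hits will be false positives.

```python
def sync_candidates(data: bytes) -> list[int]:
    """Return offsets where an MPEG audio frame sync (11 set bits) could begin."""
    hits = []
    pos = data.find(b"\xff")
    while pos != -1 and pos + 1 < len(data):
        # The sync pattern is 0xFF followed by a byte whose top three bits are set.
        if data[pos + 1] & 0xE0 == 0xE0:
            hits.append(pos)
        pos = data.find(b"\xff", pos + 1)
    return hits
```

Running something like this over compressed or encrypted data tends to return a long list of offsets, which is the problem described above: a two-byte pattern alone cannot tell you where a file really starts or how long it is.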
Entropy analysis
The next step for this is entropy analysis. This basically tells you what parts of a file are constant (0x00 or 0xFF are popular bytes for this), which parts resemble random noise, which parts are valid ASCII text, and the offsets at which one of those things changes into another. It’s useful because it sometimes allows you to visualize the structure of something without actually knowing anything about it.
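The basic idea can be sketched in a few lines of Python: split the file into fixed-size blocks and compute the Shannon entropy of each one. This is only an illustration of the concept, not a reimplementation of any particular tool; the function names and the 256-byte block size are arbitrary choices made for the example.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(data: bytes, block_size: int = 256) -> list[tuple[int, float]]:
    """Return (offset, entropy) for each consecutive block of the input."""
    return [
        (off, shannon_entropy(data[off:off + block_size]))
        for off in range(0, len(data), block_size)
    ]
```

Blocks of padding score close to 0, ASCII text lands somewhere in the middle, and compressed or encrypted data sits near 8 bits per byte, so plotting the profile against the offset gives a rough map of the file.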
The filesystem image looked promising (generated with http://binvis.io/):

It does some grouping of the data into blocks to make the structure more visible, so it’s not row-by-row like you might expect.
Unfortunately, the FotaPackage files were clearly encrypted or compressed somehow:
It’s not looking good. Oof.
I also noticed that the left and right FotaPackage binaries had some curious differences – their headers only differed sporadically, while the body was identical except for the end, where there was about 7KB of complete difference.
The header (0x00-0x1000) appears to be unencrypted and only differs in small segments. Then, at the end, the tail suddenly becomes completely different, starting at around 0xC4E38.
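A comparison like this is easy to reproduce. The following is a minimal Python sketch that reports the byte ranges in which two blobs differ; the function name `differing_ranges` is hypothetical, and it only compares up to the length of the shorter input.

```python
def differing_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ.

    Only the common length is compared; a length mismatch is not reported.
    """
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i  # a run of differing bytes begins here
        elif x == y and start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, min(len(a), len(b))))
    return ranges
```

Many short ranges near the start followed by one long range at the end would match the pattern described above.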
I wasn’t entirely sure as to the meaning of this, beyond the fact that there was clearly some kind of opaque transformation at work. My initial guess was encryption, with the same key/IV but different plaintexts, and that the sudden difference corresponds to a single-byte difference (perhaps an #ifdef EARBUD_R doBluetoothMasterThings(); #else doBluetoothSlaveThings(); #endif kind of thing) that then leads to the rest of the file being different, but I wasn’t able to verify this. Regardless of why, it was obvious I wasn’t going to get anything out of them without serious effort.
A quick appraisal of the situation
The fact that we know the audio is mp3 was actually pretty bad news to me at first glance. My understanding of media encoders is that they often have a lot of options for how to encode something, while sometimes a given decoder will barf on a perfectly well-formed file that happens to use a feature it wasn’t expecting.
This is very bad for us for two reasons:
- Our decoder is of totally unknown provenance, who knows what it might barf on.
- The audio gets played right when the device first pairs, so if we produce an mp3 file that the decoder doesn’t like and the device crashes before we can connect to it again, then we risk putting it into an unrecoverable state.
Furthermore, even if we use precisely the expected encoding parameters while producing our volume-adjusted mp3 files, if we change the length of the file while modifying it then we must also make sure that we account for this when we modify the filesystem image, where what that specifically means depends on the exact structure of the filesystem format. Presumably, it records the length of each file somewhere, and we need to make sure we adjust this number to accurately reflect the new length of the file (or else it will either be truncated or have garbage added to the end). This is a lot of work, and with an uncertain outcome.
So, at this point I was somewhat worried about the future of this project, and was desperately trying to figure out how to proceed without any re-encoding.
Luckily, it turns out that you can indeed modify the volume (or “gain”) of an mp3 file without changing its length, or re-encoding it, or even modifying its metadata. It’s kind of like how you can rotate a JPEG file without re-encoding it as well – you can just look inside its data structures and modify them for this one particular transformation without needing to change anything else. Pretty neat!

An mp3 file transformed with mp3gain – see, only a few bytes of difference.
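The reason so few bytes change is that mp3 frames carry a small integer “global gain” field, and the decoder scales its output by a factor of 2^(1/4) (roughly 1.5 dB) per unit of that field, so a tool only has to add or subtract a constant from those integers. The arithmetic can be sketched in Python as follows; the function names are made up for this example, and this is just the math, not mp3gain’s actual code.

```python
import math

# Amplitude ratio of one global-gain step (about 1.5 dB).
STEP_FACTOR = 2 ** 0.25


def db_to_steps(db: float) -> int:
    """Nearest whole number of global-gain steps for a desired change in dB."""
    step_db = 20 * math.log10(STEP_FACTOR)
    return round(db / step_db)


def amplitude_ratio(steps: int) -> float:
    """Linear amplitude multiplier that results from the given number of steps."""
    return STEP_FACTOR ** steps
```

For example, `db_to_steps(-19.5)` comes out to -13 steps, and `amplitude_ratio(-13)` is a little over a tenth of the original amplitude.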
ROFS
Back to the filesystem image – it seems to contain the mp3 files that correspond to the sounds I’m trying to enquiet, and I want to replace them with modified versions, so at a bare minimum I need to know where files begin and end within the image. At this point, binwalk couldn’t identify them, so I thought that the problem was that either they were obfuscated somehow (compressed/encrypted) or the structure of the filesystem might be making it difficult to identify them. So, I decided that the next point of attack was to understand the structure of the filesystem image, which starts with the ASCII string “ROFS”:

Hexdump of ROFS image.
The first step is to search for information about anything with that name online, but no dice. I’m pretty sure it’s bespoke to this particular chipset manufacturer, as I was completely unable to find any reference or documentation of something called “ROFS” that would describe the file I have, and the Airoha SDK I’d later find contains an implementation of an interface for reading files from it.
At this point, I made a somewhat regrettable decision that the next course of action was to try to attack the firmware, so that I could get to the code that presumably understands the filesystem image format. But the firmware code was (seemingly) encrypted, so I decided to see if they were doing something silly with their encryption and check if maybe the FotaPackage files were decrypted client-side by the SDK before being sent out over the wire. I was eventually able to verify with some certainty that the SDK doesn’t transform the firmware in any way before sending it out, but it took me several hours of reading decompiled code before I came to this conclusion. So of course, I didn’t succeed in attacking the firmware crypto and it was all a waste of time. Oh well.
SDK Breakthrough
The final breakthrough happened when I searched for the chipset name online, and found a copy of their SDK. Looking through it, I could see that it had a bunch of .mp3 files in it – the same ones I could hear on the device. I wrote a quick python program to check if a file was contained within another file (probably a tool already exists for this?) and verified that the mp3 files in the SDK were contained in the filesystem image verbatim.
bincontains.py
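A check like this only takes a few lines. The following is a minimal Python sketch of the idea, assuming both files fit comfortably in memory; it is an illustration rather than the bincontains.py listing itself, and the function names `find_all` and `find_file_in_file` are made up for the example.

```python
def find_all(haystack: bytes, needle: bytes) -> list[int]:
    """Return every offset at which needle occurs in haystack (overlaps included)."""
    offsets = []
    if not needle:
        return offsets
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = haystack.find(needle, pos + 1)
    return offsets


def find_file_in_file(needle_path: str, haystack_path: str) -> list[int]:
    """Read both files fully and return the offsets of one inside the other."""
    with open(needle_path, "rb") as f:
        needle = f.read()
    with open(haystack_path, "rb") as f:
        haystack = f.read()
    return find_all(haystack, needle)
```

A non-empty result means the smaller file appears verbatim in the larger one, and together with the smaller file’s size it gives the exact byte range that file occupies.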
I was slightly worried that the ROFS image might contain additional data, such as checksums for the files inside it, but I briefly skimmed the ROFS-related code from the SDK (unfortunately it only seemed to exist as prebuilt object files) and it only had a few symbols in it for operating on the filesystem, none of which suggested the presence of checksumming:

There’s not a lot of code in here, and there are no external references to anything relating to checksumming.
So with that out of the way, at this point I actually have everything I need to complete the task of modifying the volume of the sound files in the firmware image with no further reverse engineering. I have:
- A way to flash updated firmware to the device, as well as the firmware files themselves.
- Knowledge that the mp3 files from the SDK are included verbatim in the filesystem image (no compression, splitting into blocks, etc). This means I have their lengths and offsets in the filesystem image.
- Knowledge that you can modify the gain of an mp3 file without re-encoding it or changing its length.
- The assumption that the filesystem format doesn’t include any checksumming or additional information about its files that would be invalidated upon modifying the byte range they occupy in the image.
Then, it’s just as simple as looping over the mp3 files, and if one is contained in the image, running mp3gain on the file and then replacing it in the image with the gain-modified version. I used an adjustment of -19.5 decibels.
binsearch.sh
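The splice step on its own can be sketched in Python as below. This is only an illustration under stated assumptions, not the shell script itself: it assumes the gain-adjusted copy of each mp3 has already been produced separately (for instance by running mp3gain on a copy of the SDK file), and the function name `patch_in_place` is hypothetical.

```python
def patch_in_place(image: bytes, original: bytes, modified: bytes) -> bytes:
    """Replace the single occurrence of `original` in `image` with `modified`.

    Raises ValueError if the lengths differ, or if `original` is empty,
    missing, or present more than once.
    """
    if not original:
        raise ValueError("original file is empty")
    if len(original) != len(modified):
        raise ValueError("replacement must be exactly the same length")
    offset = image.find(original)
    if offset == -1:
        raise ValueError("original file not found in image")
    if image.find(original, offset + 1) != -1:
        raise ValueError("original file occurs more than once")
    # Same-length splice: every other byte of the image keeps its offset.
    return image[:offset] + modified + image[offset + len(original):]
```

Refusing to patch when the lengths differ is the important part: since the filesystem format was never decoded, the only safe edit is one that leaves every offset and length in the image unchanged.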

A quick binary diff of the final firmware image – only a few bytes of difference, as expected.
Finally, I flashed it to the device and it worked!

Hooray!
I quickly verified that the device was fully operational and the sound was, indeed, much quieter than it was when I started.
Mission accomplished!
I didn’t end up having to decrypt the firmware (probably impossible for me) or understand the ROFS image format at all – most of the time spent reverse-engineering was actually going down rabbitholes that didn’t help me in the end.
I also kind of wish that volume control of system sounds was a first-class feature – from a UI perspective, I think it’s an error for a device that plays audio to not have a volume control that modifies all of the sound that gets produced by the device. But apparently it has a workaround, so I guess it’s fine.
But yeah, all in all this was a pretty fun little project, would do again/10.