DJI – The ART of obfuscation

2024-02-20 01:31:03

Study of an Android runtime (ART) hijacking mechanism for bytecode
injection through a step-by-step analysis of the packer used to protect the
DJI Pilot Android application.



Introduction

In the world of Android applications, it is not uncommon to come across
applications protected by a packer. The role of a packer is to protect all
or part of the application code from static analysis. There are many reasons
why a developer might want to protect an application:

  • Protect valuable business logic;
  • Protect application monetization logic (e.g. a license management mechanism);
  • Evade conventional analysis tools to hide malicious logic.

Here, we take a look at the DJI Pilot
application, not to understand why developers want to protect their code (this
has already been the subject of previous work, see in particular this
DJI Pilot analysis)
but to highlight a runtime mechanism implemented by DJI to protect its
application code. This protection is based on the use of a modified version of
the SecNeo packer.

The article details the various stages of the analysis to understand how the
application code is obfuscated. A Python proof-of-concept named
DxFx for statically unpacking
the DJI Pilot application is provided as practical support for this article.
DxFx does not claim to be a SecNeo unpacker. Its sole purpose is to improve the
reader's understanding of the various mechanisms implemented by the packer
through Python code. It will not be maintained in the future.

Targeted application

The analysis is performed on the latest version of the DJI Pilot application:

  • Version: 2.5.1.17
  • SHA256: 642aa123437c259eea5895fe01dc4210c4a3a430842b79612074d88745f54714
  • Download link

DxFx, provided in support of the article, has also been tested on the following
versions of the DJI Pilot application:

  • Version: 2.5.1.15
  • SHA256: d6f96f049bc92b01c4782e27ed94a55ab232717c7defc4c14c1059e4fa5254c8

and

  • Version: 2.5.1.10
  • SHA256: 860d9d75dc2b2e9426f811589b624b96000fea07cc981b15005686d3c55251d9

Bytecode, where are you?

Primary analysis

Static analysis of the APK initially shows that the result of bytecode
decompilation is, to say the least, uncluttered…


Decompiled tree

This is because, like other packers, SecNeo leaves only bootstrap code in
the bytecode to launch the application's unpacking phase. Here, the packer
bootstrap code loads the native libDexHelper.so library:


Decompiled tree

The first step in the analysis is therefore to find the bytecode containing the
application's business logic.

The packer logic is present in the native library libDexHelper.so. However,
the code of this library is itself packed. So, we have to unpack… the packer
to analyze its logic.

Since the purpose of this article is not to understand how the packer itself is
protected, this part is not treated in depth, and we simply dump the
library at runtime from the DJI Pilot application process memory space. There
are many ways to do this, using tools such as gdb or Frida.

However, you may be in for a few surprises:

Cannot attach to process 25562: Operation not permitted (1), process 25562 is already traced by process 25598

or:

Failed to attach: process not found

The packer contains some countermeasures, as partially described in this
issue,
to prevent the use of dynamic tools. Fortunately, these can be easily bypassed.

Once libDexHelper.so has been dumped from memory, it can be analyzed with a
disassembly tool.

First look at the packer binary

An initial brief analysis of the libDexHelper.so library reveals the presence
of the decrypt_jar_128K symbol. A hook of the associated function with
Frida shows that a buffer is passed as input and contains the contents of a
DEX file as output:

'use strict';

const dlopen_ext = Module.getExportByName(null, '__loader_android_dlopen_ext');

function main() {
  const decrypt_jar_128K_addr = Module.getExportByName(
    'libDexHelper.so', 'decrypt_jar_128K'
  );

  /**
   * decrypt_jar_128K function hook: remember the buffer pointer on entry
   * and dump its first 16 bytes once the function has returned
   */
  Interceptor.attach(decrypt_jar_128K_addr, {
    onEnter: function(args) {
      this.dex_buffer_ptr = args[1];
    },
    onLeave: function() {
      console.log(`\nReading dex buffer @ ${this.dex_buffer_ptr}`);
      console.log(this.dex_buffer_ptr.readByteArray(16));
    }
  });
}

/**
 * Bootstrap: wait for the linker to load libDexHelper.so, then install the hook
 */
const boot_intercept = Interceptor.attach(dlopen_ext, {
  onEnter: function(args) {
    this.name = args[0].readUtf8String();
  },
  onLeave: function() {
    if (this.name !== null && this.name.includes('libDexHelper.so')) {
      main();
      boot_intercept.detach();
    }
  }
});

The result of the script is:

Reading dex buffer @ 0x74d1e63140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 4a 8b b5 fd 1b 58 54 1f  dex.035.J....XT.

Reading dex buffer @ 0x74d268c140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 6f 02 2a 0b 48 26 a5 e0  dex.035.o.*.H&..

Reading dex buffer @ 0x74d3005140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 8a b4 08 1c 90 61 5a 34  dex.035......aZ4

Reading dex buffer @ 0x74d3643140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 cb b9 8e 72 35 3a d8 bc  dex.035....r5:..

Reading dex buffer @ 0x74d4055140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 c2 8b a3 7b 64 3b c6 54  dex.035....{d;.T

Reading dex buffer @ 0x74d4a5f140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 dd 47 c2 4e a1 39 cc 79  dex.035..G.N.9.y

Reading dex buffer @ 0x74d552f140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 58 17 ae a9 56 21 f1 1f  dex.035.X...V!..

Reading dex buffer @ 0x74d5a77140
           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  64 65 78 0a 30 33 35 00 84 62 14 0d ac 5f b7 f8  dex.035..b..._..

So, here we can see that 8 DEX files (with the dex.035 magic) are
unpacked. It is possible to modify the previous hook to dump the
various DEX files as they are unpacked. Another solution is to understand where
the packed DEX files are stored in the APK and how we can unpack them
statically.
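
Whichever route is taken, it helps to be able to check quickly that a buffer
really starts with a DEX header. The helper below is a minimal sketch of that
check, not part of DxFx: the function name dex_version is ours, and it only
looks at the 8-byte magic ("dex\n", three version digits, a NUL byte) seen in
the dumps above.

```python
DEX_MAGIC_PREFIX = b"dex\n"


def dex_version(buffer: bytes):
    """Return the DEX version string (e.g. "035"), or None if the magic is absent.

    The DEX magic is 8 bytes long: b"dex\\n", three ASCII digits, then a NUL byte.
    """
    if len(buffer) < 8:
        return None
    if buffer[:4] != DEX_MAGIC_PREFIX or buffer[7] != 0:
        return None
    version = buffer[4:7]
    return version.decode("ascii") if version.isdigit() else None


# First 16 bytes of the first buffer dumped by the Frida hook above.
header = bytes.fromhex("6465780a303335004a8bb5fd1b58541f")
print(dex_version(header))  # prints 035
```

Run against the encrypted chunks stored in the APK, the same check is expected
to return None, since the magic is hidden there.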

Static unpacking of DEX files

The advantage of the dynamic extraction method lies in its quick
implementation. However, it requires the application to be run and an
environment set up to allow instrumentation of the process. Static extraction,
on the other hand, allows cold unpacking of DEX files directly from the APK.
The drawback of the static approach is that it requires a slightly deeper
understanding of how the packer works.

DEX files, where are you?

Some versions of the SecNeo packer store the bytecode in the classes0.jar
file located in the APK assets. Unfortunately, this is not the case here as
the file does not exist.

However, if we take a closer look at the classes.dex file located at the root
of the APK and supposed to contain only the packer bootstrap code, we can see
that something is wrong with its size:

du -h classes.dex
63M     classes.dex

63MB is a very large size for the code we saw in the first analysis.
Usually, the multidex mechanism
will split the bytecode file into several .dex files well before reaching
this size. File entropy analysis also gives us some interesting clues:


Entropy of classes.dex

We can see 8 peaks tending towards an entropy of 8, which may
suggest that these chunks are encrypted. The previous Frida hook revealed
that 8 DEX files were unpacked, which is probably no coincidence. The 8 chunks
shown in the graph correspond to 128KB sections, so we can make the connection
with the decrypt_jar_128K symbol of the function. A differential analysis
with the dynamically obtained files finally confirms that the classes.dex
file contains all 8 DEX files after the SecNeo bootstrap code. The first 128KB
chunk of each DEX file is encrypted, probably to hide information that
could be used to detect the presence of the hidden files, such as the
magic number
in the header.
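
The entropy scan itself is easy to reproduce. The sketch below computes the
Shannon entropy of fixed-size windows and reports those close to 8 bits per
byte. It is an illustration only: the function names and the 7.9 threshold are
our own choices, and it assumes windows aligned on multiples of the window
size, which the encrypted chunks in classes.dex need not be.

```python
import math
from collections import Counter

CHUNK_SIZE = 128 * 1024  # size of the encrypted sections discussed above


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_chunks(data: bytes, chunk_size: int = CHUNK_SIZE,
                        threshold: float = 7.9) -> list:
    """Return the offsets of the windows whose entropy reaches the threshold."""
    offsets = []
    for offset in range(0, len(data), chunk_size):
        if shannon_entropy(data[offset:offset + chunk_size]) >= threshold:
            offsets.append(offset)
    return offsets


# Usage (commented out because it needs the file extracted from the APK):
# with open("classes.dex", "rb") as f:
#     print([hex(o) for o in high_entropy_chunks(f.read())])
```

Compressed data also scores high, so a hit is only a hint of encryption; here
the differential analysis against the dumped files is what settles it.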

Encryption analysis

To understand how the first 128KB of each DEX is decrypted, we need to analyze
how the decrypt_jar_128K function works.

One of the function's basic blocks contains the encryption logic:

loc_8DC78
ADD             W3, W3, #1      ; i++
LDRB            W6, [X5],#1     ; x = buffer[cursor++]
AND             W7, W3, #0xFF   ; i %= 256
SUB             W0, W5, W1
MOV             X3, X7
CMP             X2, X0
LDRB            W0, [X8,X7]     ; +--
ADD             W4, W4, W0      ; | j = (j + S[i]) % 256
AND             W9, W4, #0xFF   ; +--
MOV             X4, X9
LDRB            W10, [X8,X9]    ; +--
STRB            W10, [X8,X7]    ; |
STRB            W0, [X8,X9]     ; | S[i], S[j] = S[j], S[i]
LDRB            W7, [X8,X7]     ; +--
ADD             W0, W7, W0      ; +--
UXTB            W0, W0          ; |
LDRB            W0, [X8,X0]     ; | x = S[(S[i] + S[j]) % 256] ^ x
EOR             W0, W0, W6      ; +--
STURB           W0, [X5,#-1]    ; buffer[cursor-1] = x
B.HI            loc_8DC78

This is RC4's pseudo-random generation
algorithm (PRGA):

i := 0
j := 0
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap values of S[i] and S[j]
    t := (S[i] + S[j]) mod 256
    K := S[t]
    output K
endwhile
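
For reference, the PRGA translates to a few lines of Python. The sketch below
is a generic RC4, assuming the packer initializes S with the standard
key-scheduling algorithm (KSA). It is not the DxFx code, and the function
names rc4_ksa and rc4_crypt are ours.

```python
def rc4_ksa(key: bytes) -> list:
    """Standard RC4 key scheduling: build the permutation S from the key."""
    if not key:
        raise ValueError("RC4 key must not be empty")
    s = list(range(256))  # identity permutation
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        # Same steps as the basic block above: advance i and j, swap,
        # then XOR the input byte with the keystream byte S[S[i] + S[j]].
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Applying the cipher twice with the same key gives back the input.
assert rc4_crypt(b"Key", rc4_crypt(b"Key", b"Plaintext")) == b"Plaintext"
```

Only the first 128KB of each embedded DEX would go through rc4_crypt. The rest
of the file is stored in the clear.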

Analysis of the decrypt_jar_128K CFG tells us where the different
parts of the RC4 algorithm are located:


decrypt_jar_128K CFG

Encryption key generation

The key's cross-references lead to a generation function based on a simple XOR
between a 16-byte hardcoded constant and the first 16 bytes of the string
com.dji.industry.pilot:


Generate RC4 key DEX
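
The derivation described above is a plain byte-wise XOR. The sketch below
shows its shape only: the 16-byte constant is not reproduced in this article,
so PLACEHOLDER_CONSTANT is a dummy value and the resulting key is NOT the real
one. The function name derive_dex_key is ours, and the package name is passed
as a parameter.

```python
def derive_dex_key(constant: bytes, package_name: str) -> bytes:
    """XOR a 16-byte constant with the first 16 bytes of the package name."""
    name = package_name.encode("ascii")[:16]
    if len(constant) != 16:
        raise ValueError("the hardcoded constant must be 16 bytes long")
    if len(name) != 16:
        raise ValueError("the package name must be at least 16 bytes long")
    return bytes(c ^ n for c, n in zip(constant, name))


# Dummy value for illustration; the real constant lives in libDexHelper.so.
PLACEHOLDER_CONSTANT = bytes(16)

key = derive_dex_key(PLACEHOLDER_CONSTANT, "com.dji.industry.pilot")
print(key.hex())
```

With an all-zero constant the key is simply the first 16 bytes of the package
name, which makes the sketch easy to check by hand.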

We are now able to statically unpack DEX files.

The DEX decryption is currently implemented in the DexPool class of DxFx.

However, disassembly of the unpacked DEX files reveals a problem. The code of
numerous methods seems to have been stolen, overwritten, and replaced
mainly by nop instructions:


Stolen bytecode

We can therefore assume that the packer has a second bytecode protection
mechanism.

Bytecode, where are you? Again…

Method debug info

The various methods whose code is stolen all seem to contain a
debug info offset
(debug_info_off) which also appears in the body of the method:


Method debug_info_off

There seems to be something fishy with debug_info_off: this field may
play a role in the method code unpacking mechanism, perhaps as an identifier.
Moreover, a classes.dgc file located in the APK assets contains a large
number of the debug info offsets used in stolen methods… The classes.dgc file
therefore seems a potentially interesting candidate for further analysis.

The classes.dgc file

An entropy analysis shows that the beginning of the file (oddly enough, a
128KB chunk) probably contains encrypted data:


Entropy of classes.dgc

This is a good lead to follow in the libDexHelper.so binary.

Encryption analysis

Seemingly, a mechanism similar to the 128KB chunk encryption of DEX
files is used for the classes.dgc file. Analysis of libDexHelper.so reveals
a function whose scheme also corresponds to an RC4 encryption algorithm:


DGC RC4 decryption

We can confirm that this is the classes.dgc decryption function by using a simple
Frida hook:

'use strict';

const dlopen_ext = Module.getExportByName(null, '__loader_android_dlopen_ext');

function main() {
  const rc4_fct_addr = Module.getExportByName(
    'libDexHelper.so',
    'p416302DA23BEF5D5A81473ACFAC4DA25'
  );

  // Dump the first 32 bytes of the buffer handed to the suspected RC4 routine
  Interceptor.attach(rc4_fct_addr, {
    onEnter: function(args) {
      console.log(args[0].readByteArray(32));
    }
  });
}

// Wait for libDexHelper.so to be loaded successfully before hooking it
Interceptor.attach(dlopen_ext, {
  onEnter: function(args) {
    this.name = args[0].readUtf8String();
  },
  onLeave: function(retval) {
    if (!retval.isNull() && this.name !== null &&
        this.name.includes('libDexHelper.so'))
      main();
  }
});

The result is:

           0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F  0123456789ABCDEF
00000000  ef bd de 50 8b bb 81 c7 80 63 35 ca 95 6e 1d 1d  ...P.....c5..n..
00000010  36 d5 ef 02 df 2a 50 2b e8 88 03 c3 9b 45 da 5f  6....*P+.....E._

It matches the first bytes of the classes.dgc file:


First bytes of the classes.dgc file

As with the decrypt_jar_128K function, the basic block initializing S to the
identity permutation reveals the presence of a cross-reference to the key.

Encryption key generation

From the cross-references, it is possible to locate the key generation function.
The CFG of the function looks a bit like the one used to generate the DEX
decryption key. However, a slightly more complex mechanism is used to generate
the key:


DGC RC4 key generation

First, the MD5 hash of a 4096-byte binary
blob in memory is computed. MD5 is identified by looking at a sub-function called
in the previous CFG. This sub-function corresponds to the
MD5 algorithm for processing one
block (512 bits). The algorithm is flattened and contains hardcoded K
constants (0xe8c7b756, 0xd76aa478, …).

The binary blob is loaded directly from libDexHelper.so and can be found even
in the packed version of the library. This chunk appears to be preceded by a
kind of header containing the name mthfilekey:


mthfilekey entry in libDexHelper.so

Once the MD5 has been calculated, a deterministic sequence is generated by
calling another sub-function. Analysis of the function shows that it is a
Fibonacci sequence:


Fibonacci generator function CFG

Next, the 16 bytes of the MD5 hash are XORed with 16 bytes retrieved directly
from the 4096-byte chunk (mthfilekey) following a deterministic walk based
on the Fibonacci sequence previously generated.


We are now able to statically generate the RC4 key that decrypts the first
128KB of the classes.dgc file.
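
Put together, the three steps (MD5 of the blob, Fibonacci sequence, XOR) can be
sketched as follows. The exact walk is not spelled out in this article, so the
indexing rule below (the i-th Fibonacci term, taken modulo the blob size, used
as the offset of the i-th byte) is purely an assumption made for illustration,
as are the function names. Refer to DxFx for the actual derivation.

```python
import hashlib

BLOB_SIZE = 4096  # size of the mthfilekey blob
KEY_SIZE = 16     # size of an MD5 digest, and of the resulting key


def fibonacci(count: int) -> list:
    """Return the first `count` Fibonacci numbers, starting at 1, 1."""
    sequence = []
    a, b = 1, 1
    for _ in range(count):
        sequence.append(a)
        a, b = b, a + b
    return sequence


def derive_dgc_key(blob: bytes) -> bytes:
    """Sketch of the key derivation: MD5(blob) XOR bytes picked from the blob."""
    if len(blob) != BLOB_SIZE:
        raise ValueError("expected a 4096-byte blob")
    digest = hashlib.md5(blob).digest()
    walk = fibonacci(KEY_SIZE)
    # ASSUMPTION: each Fibonacci term is used directly as an offset into the
    # blob. The real walk implemented by the packer may differ.
    picked = bytes(blob[offset % BLOB_SIZE] for offset in walk)
    return bytes(d ^ p for d, p in zip(digest, picked))
```

Whatever the exact walk, the point is that the key depends only on data
shipped in libDexHelper.so, so it can be recomputed offline.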

classes.dgc file format

Once decrypted, classes.dgc reveals that the beginning of the
file contains a table indexing all the application methods
(code_item)
whose code has been stolen:


classes.dgc index layout

Each table item points to the code_item of a method:


classes.dgc index layout

However, as it stands, the Dalvik opcodes present in the method bodies seem
inconsistent and therefore probably obfuscated… At this stage, we have all
the elements needed to link the stolen bytecode (even if it is still obfuscated
for the moment, we will address this later) to the application's various
damaged methods. First of all, it is interesting to understand when the packer
repairs the methods so that the application can run normally. This mechanism is
particularly interesting because it relies on an ART feature.

ART hijacking

ART in a nutshell

The Android Runtime (ART) is
Dalvik's successor runtime in charge of optimizing and executing code for
Android applications and other Android system components. The Android Runtime — How Dalvik and ART work?
article by Paulina Sadowska is a great introduction to ART.

Class loading mechanism

When a method is to be executed, the runtime must first check that the class to
which the method belongs is loaded. If this is not the case, the runtime will
load and link the class. The linking process involves several stages as
described in the Java Language Specification:

  1. Class verification;
  2. Class preparation;
  3. Resolution.

The stage we are interested in here is class verification because it is
precisely this stage that is instrumented by the packer. Among other things,
this step checks the bytecode of the class's various methods for
inconsistencies. It is implemented in the ClassLinker::VerifyClass method of
ART.

One of the interesting features of VerifyClass is that it calls the
UpdateClassAfterVerification method:

static void UpdateClassAfterVerification(Handle<mirror::Class> klass,
                                         PointerSize pointer_size,
                                         verifier::FailureKind failure_kind)
    REQUIRES_SHARED(Locks::mutator_lock_) {

  // [...]

  // Now that the class has passed verification, try to set nterp entrypoints
  // to methods that currently use the switch interpreter.
  if (interpreter::CanRuntimeUseNterp()) {
    for (ArtMethod& m : klass->GetMethods(pointer_size)) {
      if (class_linker->IsQuickToInterpreterBridge(m.GetEntryPointFromQuickCompiledCode())) {
        runtime->GetInstrumentation()->InitializeMethodsCode(&m, /*aot_code=*/nullptr);
      }
    }
  }
}

UpdateClassAfterVerification updates the entry points of the various methods
of the verified class. So, it has to iterate over all
the methods of the class and call the Instrumentation::InitializeMethodsCode
method:


InitializeMethodsCode callgraph

Anatomy of the hook

The Instrumentation::InitializeMethodsCode method provides a crossing point
for every method in the application that can be executed. It is precisely this
crossing point that is exploited by the packer to repair methods whose code has
been stolen. To do this, libDexHelper.so places a hook on
InitializeMethodsCode:


Hook call graph

The prologue of the Instrumentation::InitializeMethodsCode method is patched to
redirect the execution flow to a function in libDexHelper.so that we call
PatchMethodCode:


PatchMethodCode CFG

A few moments later… we
can deduce the hook's anatomy and the different operations performed by
PatchMethodCode:


Hook anatomy with callgraph

Once the PatchMethodCode function is called, it first loads the
obfuscated bytecode of the current method, using debug_info_off as an
identifier to look it up in the method index table of the classes.dgc file. The
code is passed to the function we call here DecryptMethodCode to be
de-obfuscated. Then the code_item (dex::CodeItem)
of the method (art::ArtMethod)
is patched to point to the buffer containing the de-obfuscated bytecode.
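
The sequence of operations can be modeled in a few lines of Python. This is a
toy model only, not the packer's code nor DxFx: the Method class, the
dictionary standing in for the classes.dgc index and the decrypt callback are
all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Method:
    """Toy stand-in for a method: an identifier and its bytecode."""
    debug_info_off: int
    code: bytes


def patch_method_code(method: Method,
                      dgc_index: Dict[int, bytes],
                      decrypt: Callable[[bytes, int], bytes]) -> bool:
    """Repair one method, as the hook does before the method can run.

    dgc_index maps a debug_info_off value to the obfuscated bytecode stored
    in classes.dgc. Returns True when the method was repaired.
    """
    obfuscated = dgc_index.get(method.debug_info_off)
    if obfuscated is None:
        return False  # this method's code was not stolen: nothing to do
    method.code = decrypt(obfuscated, method.debug_info_off)
    return True


# Toy usage with a stand-in decrypt function (NOT the real algorithm).
def toy_decrypt(code: bytes, off: int) -> bytes:
    return bytes(b ^ (off & 0xFF) for b in code)


index = {0x1234: bytes([0x3A, 0x34])}
damaged = Method(debug_info_off=0x1234, code=b"\x00\x00")
patch_method_code(damaged, index, toy_decrypt)
print(damaged.code.hex())  # prints 0e00
```

A static unpacker can apply the same lookup to every method of every unpacked
DEX instead of waiting for the runtime to reach the hook.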

This mechanism ensures that the damaged code in each method is repaired before
the method is executed. At this point, the last thing we need to understand is
how bytecode is obfuscated in classes.dgc. To do this, we need to analyze the
DecryptMethodCode function.

Bytecode de-obfuscation

The function is rather small, and an analysis of a few basic blocks gives a
good idea of how it works:


DecryptMethodCode CFG

The function iterates over each opcode. The obfuscated opcodes are XORed with
the low byte of the method's debug_info_off offset. The result of this
operation is then used as the index into a substitution table. The obfuscated
opcode is replaced by the one obtained from the substitution table:

opcode = S[obfuscated_opcode ^ (debug_info_off & 0xff)]
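
As a minimal sketch, the substitution for a single opcode byte can be written
as below. TOY_SBOX is an arbitrary permutation built for the example, NOT the
table used by the packer, and the inverse function exists only to check the
round trip. Both function names are ours.

```python
def deobfuscate_opcode(obfuscated: int, debug_info_off: int, sbox: bytes) -> int:
    """Recover one opcode byte: XOR with the low byte of the method's
    debug info offset, then look the result up in the substitution table."""
    return sbox[(obfuscated ^ (debug_info_off & 0xFF)) & 0xFF]


def obfuscate_opcode(opcode: int, debug_info_off: int, sbox: bytes) -> int:
    """Inverse operation, used here only to verify the sketch."""
    return sbox.index(opcode) ^ (debug_info_off & 0xFF)


# Arbitrary 256-entry permutation (7 is coprime with 256), for illustration.
TOY_SBOX = bytes((i * 7 + 3) % 256 for i in range(256))

debug_info_off = 0x1A2B
for opcode in range(256):
    hidden = obfuscate_opcode(opcode, debug_info_off, TOY_SBOX)
    assert deobfuscate_opcode(hidden, debug_info_off, TOY_SBOX) == opcode
```

Only the opcode byte of each instruction goes through this transformation, so
a complete implementation also has to walk the instruction stream using the
length of each Dalvik instruction, which this sketch leaves out.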

Since the substitution table is at most 256 bytes long, one might
think that one of the RC4 KSAs previously reversed is reused to generate it,
but… no.

The S substitution table is simply stored in the libDexHelper.so library
and can be directly extracted from the packed binary. We have everything we
need to fix all the damaged methods, and the unpacked DEX can be decompiled
properly:


Fixed method

We are now able to perform static unpacking of the application.

  • The method fixing step is implemented in the Dex class of DxFx.
  • The bytecode de-obfuscation is located in the MethodCipher class of DxFx.

Conclusion

By walking through the analysis method used to create a static
unpacker, we have seen the different encryption and obfuscation algorithms used
by the packer at different stages. In addition, we were able to highlight an
interesting protection mechanism involving bytecode injection and
exploiting Android runtime hijacking.


