Main Blog

    This blog has been closed. We will publish Android related malware analysis posts on our main blog at:

    http://joe4security.blogspot.ch/

    Analyzing Android/SpyBanker

    Today we will take a look at an Android malware sample that is generally labeled as "Android/SpyBanker" (part of the Android.Gepew family). The sample we will be looking at has the MD5 0d28fa54f9c0d41801e8fb5a7b0433dd. It came to our attention through Symantec's blogpost, so we were quite interested to see if Joe Sandbox Mobile, our Android analysis system, would allow similar or even more insights into the malware behavior in minimal time and without much reverse engineering effort.

    To be more concrete, the sample is part of a larger attack that starts on Windows and attempts to infect a USB-connected device with a fake Android App Store. The fake store acts as a banking trojan, but has some interesting additional functionality. This type of attack vector is rather new, although we have seen the opposite direction (a mobile device trying to infect Windows) before (see Android.Backdoor.Ssucl, found about a year ago). Finally, we may note that we've introduced a "harmful classification" of samples, and Joe Sandbox Mobile was able to mark the APK as malicious straight away. A fresh VirusTotal scan showed that only 29/50 AV engines detected the APK as malicious (timestamp 2014-02-18).

    Report Overview

    As always when analyzing an APK (or any file in another Joe Sandbox product), we start out by looking at the head of the report. That is where the behavior signatures are located; they give a good first idea of what the sample is capable of doing and offer entrypoints into the disassembly code.


    Besides the suspicious signatures dealing with the network part, the "Boot Survival" (starting a service on phone boot), SMS Sending and Phone call Blocking signatures are usually a strong indicator for malicious behavior, if they appear in combination. The "Deletes other packages" signature, as we will see later, is part of an automatic "app re-installation" functionality that overwrites local banking apps with malicious versions.
    As a next step, let's take a look at the "Static Info" area of the report to get a better idea of the permissions requested or additional entrypoints, such as the registered receivers or services. It is usually quite easy to find interesting code locations following these points.


    As we can see, the APK wants to be notified about BOOT_COMPLETED, PACKAGE_ADDED, PACKAGE_REMOVED and some phone state/SMS related events. All of these are probably used to intercept incoming banking SMS (TAN numbers) and for persistence. We also see that a service called "CoreService" is created, where we will probably find a lot more interesting code. Seems like an easy deal so far. Let's dive into the code!
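
    The receiver entrypoints discussed above can also be listed programmatically. The following is a minimal Python sketch, assuming the binary AndroidManifest.xml has already been decoded to plain XML. The manifest fragment and the receiver name com.example.BootReceiver are made up for illustration and are not taken from the sample.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree uses for android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def receiver_actions(manifest_xml: str) -> dict[str, list[str]]:
    """Map each declared <receiver> to the intent actions it listens for."""
    root = ET.fromstring(manifest_xml)
    result: dict[str, list[str]] = {}
    for receiver in root.iter("receiver"):
        name = receiver.get(ANDROID_NS + "name", "<unnamed>")
        result[name] = [
            action.get(ANDROID_NS + "name")
            for action in receiver.iter("action")
        ]
    return result


# Hypothetical manifest fragment, for illustration only.
SAMPLE = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application>
    <receiver android:name="com.example.BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
        <action android:name="android.intent.action.PACKAGE_ADDED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

if __name__ == "__main__":
    for name, actions in receiver_actions(SAMPLE).items():
        print(name, actions)
```

    A receiver that combines boot, package and SMS actions is a natural first place to look in the disassembly.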

    Taking a deeper look

    Joe Sandbox Mobile uses Hybrid Code Analysis and allows the viewer to differentiate between executed and non-executed functions (at least for what is algorithmically determined as executed based on the instrumented subset). It also provides annotated disassembly code for a better understanding of in-out parameter data. Either way, it is easy for us to find the main entrypoint by browsing the disassembly listings, as we know the main activity name from the prior static info analysis. So what does the APK do? Upon startup, the user is prompted to activate device administrator rights if the phone is not rooted yet. Then, regardless of the user's choice, the CoreService is started.


    In the CoreService, sensitive user data is gathered and sent initially to a C&C server. The C&C server URL is read from a shared preference file (in this case it is http://www.slmoney.XXX):


    Sensitive phone information, such as the phone contacts, the IMSI and the IMEI are extracted and appended to an opening URL connection:


    Depending on the C&C server response, a variety of commands may be executed. The command handler expects a JSON string and supports several commands, among which the "changeapp" command is probably the most interesting, as it is used to replace existing APKs with malicious versions (in this case, banking apps):


    The C&C "auto installer" functionality to replace existing apps can be found here:


    First, as we can see in the screenshot above, a list of installed APKs is queried (we have a signature hit on the call, thus the function name is highlighted). The list of installed APKs is compared against a list of known banking target package names (e.g. com.korea.kr_nhbank):

    An "app.auto.install" event is sent to the core service handler:


    ... and finally CoreService.autoChangeApk is called.
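
    The matching step described above boils down to intersecting the installed package list with a target list. Here is a minimal Python sketch of that idea. Only the package name com.korea.kr_nhbank comes from the sample; the function names and the shape of the returned event tuples are assumptions made for illustration, not the trojan's actual code.

```python
from __future__ import annotations

# One target package name is quoted in the post; the real list is longer.
BANKING_TARGETS = {"com.korea.kr_nhbank"}


def find_replace_candidates(installed: list[str],
                            targets: set[str] = BANKING_TARGETS) -> list[str]:
    """Return installed packages that are on the target list, sorted by name."""
    return sorted(set(installed) & targets)


def plan_events(installed: list[str]) -> list[tuple[str, str]]:
    """Pair each candidate with the event name seen in the report (hypothetical shape)."""
    return [("app.auto.install", pkg) for pkg in find_replace_candidates(installed)]


if __name__ == "__main__":
    print(plan_events(["com.android.settings", "com.korea.kr_nhbank"]))
```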

    The last parts were not executed, because the C&C server is down and did not respond to our initial request. In a future release of Joe Sandbox Mobile, we will extend the cookbook with a command to simulate custom, APK-specific broadcast events. That would allow us to simulate some internal C&C trojan commands in a second run, but it is out of the scope of this blogpost.

    Full Analysis Report:



    Happy New Year!


    The Joe Security team wishes you success, satisfaction and many pleasant moments in 2013!

    Analyzing "Android-Trojan/FakeInst": Plug & Play premium SMS fraud

    Introduction

    As may be known by now, Joe Security offers free services to analyze APKs and other files. We often check submitted files to see if we come across anything special. Today, we found an interesting sample (MD5 0123078fac53446ab5d9527b6da1ab14): a typical SMS fraud APK that sends premium SMS to a specific number. The software is labelled by AV as "Android-Trojan/FakeInst" (and similar), although it actually does implement a working installation mechanism and has some kind of "License Agreement" that outlines the SMS costs. Nevertheless, what is interesting about it is the way it works: it seems to implement a "Plug & Play" mechanism that makes it very easy for anyone to take advantage of the dubious functionality and "turn over" the APK to implement their own premium SMS "registration service". ;-) What we will learn in this blogpost is how easy it is to understand the sample in only a couple of minutes using the comprehensive Joe Sandbox report. Let us take a look at what information we can extract using the report alone and then get down to the configuration mechanisms.

    Getting an overview

    The first thing we always do when we look at a sample is take a look at the signatures and static information to get a quick idea of what the APK might be up to. Here is the overview of matched signatures and permissions at the top of the report:


    As we can see in the signature overview, the APK does a couple of suspicious things. APKs that combine these behavior signatures are usually malicious:

    • Connects to the internet (posts data, downloads data)
    • Executes code after phone reboot
    • Sends SMS
    • Obfuscated method names/uses reflection
    Malware often comes with a lot more "bad stuff", but in this case it is bad enough, and looking at the signatures alone gives us a good idea of what the APK might be doing (e.g. potentially leaking sensitive data, sending premium SMS, trying to hide behavior). Taking a look at the static information overview:


    As highlighted, we have some typical patterns: the APK is "play store compatible", which means it is meant to be spread as widely as possible; it is signed with a valid certificate from Russia (which fits the Cyrillic text); and it requires some unusual permissions (like the "MOCK_LOCATION" permission).

    Next, we take a look at the network traffic, if there is anything generated. Here we go:


    Taking a look at the dependency graph and the first HTTP packets, we quickly understand that it tried to download something from "trashbox.ru", which is actually a legitimate Russian file host/news site. On a side note: we took a look at the downloaded APK, and it is actually the "play store" app, certificate-signed by Google. Obviously no one wants to pay money for something that can be had for free and that is pre-installed on Android to begin with (maybe that is where the "Fake Installer" name comes from). ;-)

    Getting into the nitty-gritty

    Everything we got to learn about the APK so far we might have been able to extract from competing sandbox systems, but let's take a deeper look now. Before we dive into the disassembly, let us see what our "automatic button interaction" engine clicked and check out some more screenshots.


    As we see at the top, Joe Sandbox Mobile did quite a few clicks on the buttons and managed to advance the "installation process" of the APK that way. Also, we can see that the APK created a "lasttime" file in a ".pay" directory on our SD card (which is downloadable as part of the full report in Joe Sandbox Mobile). The following screenshots show that the "button clicking" actually worked:


    After clicking the "далее" (Continue) button (see first screenshot in the Introduction section).


    The final screen of the analysis. Google translates it as "Thank you for using our services. Now you will receive a SMS-message with a password to the private site. By clicking on the link you will be able to download the file.". The two buttons скачать and выход read "Download" and "Exit". As we will see, the paid app was already downloaded and it is the Play Store APK. Possibly an SMS really is received later on in the process, but unfortunately our analysis system does not actually send (or receive) SMS over a mobile carrier, so we could not follow this path. Nevertheless, a costly SMS was already sent.

    Now let us take a look at some interesting functions that outline the configuration.


    The above function reads the ".dat" files (you can find them in the APK under the "assets" folder), which basically resemble a variety of "configuration files" that determine the way the installer behaves. The most important asset files are:

    • link.dat: Contains the cleartext URL from which the APK is downloaded, in this case hxxp://trashbox.ru/files2/74704_96f1f8/googleplaymarket_3.8.17.apk
    • data.res: Contains the "parameter data" for the "pay process". It is UTF-8 Base64 encoded.
    • command.dat, title.dat, etc.: Contain "Label Texts" that are printed on the buttons.
     

    Here we see the "data.res" file contents being "decoded" into the parameter data "SMSNum-1: 2011SMSText-1: PM04333000276". Essentially, it is a "framework" that can be used to generate premium SMS apps for different countries in slightly different fashions. Of course, the downloaded APK could also be malware and does not need to be something legitimate, as it is in this case.
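
    The decoding step can be reproduced outside the sandbox. The following Python sketch assumes data.res is plain Base64 over UTF-8 text, as stated above. The encoded input is produced here by re-encoding the decoded string quoted in this post, so it is not a byte-exact copy of the asset, and the key pattern (SMSNum-N, SMSText-N) is inferred from that single example.

```python
import base64
import re

# Keys as they appear in the decoded parameter data quoted in the post.
KEY = r"(?:SMSNum|SMSText)-\d+"
PAIR = re.compile(rf"({KEY}):\s*(.*?)(?={KEY}:|$)", re.S)


def decode_data_res(raw: bytes) -> str:
    """Base64-decode the asset and interpret the result as UTF-8 text."""
    return base64.b64decode(raw).decode("utf-8")


def parse_params(text: str) -> dict:
    """Extract 'Key-N: value' pairs; a value ends where the next key starts."""
    return {key: value.strip() for key, value in PAIR.findall(text)}


if __name__ == "__main__":
    # Round trip: encode the decoded string from the post, then decode it again.
    plaintext = "SMSNum-1: 2011SMSText-1: PM04333000276"
    asset = base64.b64encode(plaintext.encode("utf-8"))
    print(parse_params(decode_data_res(asset)))
```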

    Now let us take a look at how the configuration is loaded and executed:


    As we can see at the top, a "JSON"-String is actually used to execute functions with parameters as defined in data.res. In this case the "pay process" is obfuscated as the following JSON command:

    {\"c0\":\"android.telephony.SmsManager\",\"m0\":\"getDefault\",\"m1\":\"sendTextMessage\",\"p0\":\"java.lang.String\",\"p1\":\"android.app.PendingIntent\"}

    Obviously, an entire program could be encapsulated in this format. The cX variables indicate class lookup instructions, the mX variables indicate the associated method lookup instructions and the pX variables indicate the parameter types for the lookups. The lookup happens purely through the Java reflection API. Luckily for us, Joe Sandbox Mobile resolves reflective invokes automatically, so we can follow this tricky process quite easily.
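
    To make the cX/mX/pX convention concrete, here is a small analyst-side Python sketch that splits such a command into its class, method and parameter-type lists. The JSON string is the one shown above with the escaping removed; the function name and the output layout are our own choices, and the APK's actual interpreter is not reproduced here.

```python
import json
import re

COMMAND = ('{"c0":"android.telephony.SmsManager","m0":"getDefault",'
           '"m1":"sendTextMessage","p0":"java.lang.String",'
           '"p1":"android.app.PendingIntent"}')


def split_command(command_json: str) -> dict:
    """Group cX/mX/pX entries into ordered class, method and parameter-type lists."""
    entries = json.loads(command_json)
    # Keep only keys of the form c<N>, m<N>, p<N>; ignore anything else.
    keys = [k for k in entries if re.fullmatch(r"[cmp]\d+", k)]
    groups = {"c": [], "m": [], "p": []}
    for key in sorted(keys, key=lambda k: (k[0], int(k[1:]))):
        groups[key[0]].append(entries[key])
    return {
        "classes": groups["c"],
        "methods": groups["m"],
        "param_types": groups["p"],
    }


if __name__ == "__main__":
    print(split_command(COMMAND))
```

    Read this way, the command names one class to look up, two methods to resolve on it and two parameter types for the method lookup.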


    In the screenshot above we see the call to "getDefault" of "android.telephony.SmsManager", which returns a SmsManager object that is later used to send the text message (as indicated by the JSON command). A few lines further down we see what we were looking for:


    As we can see, an alleged premium SMS "PM04333000276" is sent to the phone number "2011". Finally, a "lasttime" file with the device timestamp is created under /mnt/sdcard/.pay/lasttime, which we again see quite nicely in the enriched disassembly listing.

    Conclusion

    What we basically learned in this blogpost is that the Hybrid Code Analysis (HCA) technology implemented in Joe Sandbox Mobile, which combines dynamic and static analysis, is a fundamental key to understanding any kind of software and its mechanisms. Today, it is by far not enough to analyze malware purely statically (see the previous blogposts dealing with heavy obfuscation), nor is it enough to analyze malware purely dynamically. We can only understand targeted threats and malicious behavior if we get down into the "nitty gritty" and take a deep look at the inner workings. Of course, we do not want to spend a huge amount of time analyzing anything "by hand", which is why we need comprehensive reports and a fully automated system that takes care of the tedious work. As shown in this blogpost, we were able to fully understand the sample in just 15 minutes. Also, we learned that it is possible to dynamically execute code using the Java reflection API. Understanding these threats requires fine-grained instrumentation as implemented by Joe Sandbox Mobile.


    Fully-Automated String Decryption and Data Leakage Detection using Hybrid Code Analysis (HCA)

    Introduction

    In June earlier this year we demonstrated our generic instrumentation engine with Opfake.C (MD5: 001a42a555b4bd39bf6ecd8b11441870) and showed how it was easily possible to hook calls to local methods matching certain method signatures (see this blogpost). In this concrete case, we log all invokes to static methods that take a String as input parameter and return a String, e.g.
     
    public static String method(String s)

    We call this type of method signature a "DecryptString" method signature. Often, mildly sophisticated malware stores its Strings in an encrypted form in order to hinder pattern-based matches from static analysis AV engines. Usually, malware authors do not make the effort to implement complex decryption algorithms and use simple techniques instead, such as substitution-based ciphers.
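
    As a rough illustration of how such a signature can be matched statically, the sketch below checks a method's access flags and its descriptor in the notation used by class and dex files. The flag value ACC_STATIC = 0x0008 and the descriptor syntax come from those file formats; the function name is ours, and this is not the matching code used by Joe Sandbox Mobile.

```python
ACC_STATIC = 0x0008  # access flag bit for static methods in class/dex files

# Descriptor for a method that takes one String and returns a String.
DECRYPT_STRING_DESCRIPTOR = "(Ljava/lang/String;)Ljava/lang/String;"


def is_decrypt_string_candidate(access_flags: int, descriptor: str) -> bool:
    """True for static methods with the shape 'static String f(String)'."""
    is_static = bool(access_flags & ACC_STATIC)
    return is_static and descriptor == DECRYPT_STRING_DESCRIPTOR


if __name__ == "__main__":
    # 0x0009 = ACC_PUBLIC | ACC_STATIC
    print(is_decrypt_string_candidate(0x0009, DECRYPT_STRING_DESCRIPTOR))
```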

    Usually, the encrypted strings are spread throughout the entire package and need to be decrypted quickly on-the-fly. The decrypted payload is usually a class/method name used to look up class objects and method objects for reflective invokes, or it hides C&C URLs. Also, samples using encrypted strings usually try to encrypt all possible strings, so we can assume that there are going to be a lot of "DecryptString" method calls overall during runtime. So we had an idea: what if we record all I/O Strings of all invokes matching the "DecryptString" method signature, build a character-based "conversion map" from them, and use that map to try to decrypt the arguments of other, non-executed invokes to the same method? And if that succeeds, can we build behavior signatures off of that data? After all, combining dynamic analysis results with static analysis to obtain behavior data is what Hybrid Code Analysis (HCA) is all about. Let's get to work.


    Building Input/Output Character Maps

    The first step was to improve our engine to build input/output character maps for all runtime invokes to methods matching the "DecryptString" method signature as noted above. Of course, the character maps we build need to take overloaded method names into account, so that we can reliably attribute the logged data to the right function. Also, we only consider input/output data if the input and output Strings have the same length and their characters differ. In the case of Opfake.C, there is really only one method that gives good results and which is used heavily to decrypt Strings. Here is the calculated Input/Output Character Map for "public static String mkfkejkpu.mkfkejkpu.mkfkejkpu(String s)" based on 400+ observed runtime calls:
     
    Input  Output    Input  Output    Input  Output
    n      0         +      a         Z      m
    9      1         8      A         0      N
    C      2         2      B         s      n
    R      3         @      b         3      o
    ;      4         ]      c         l      O
    ,      5         o      C         F      p
    E      6         <      d         Y      P
    i      7         A      D         p      q
    M      8         .      e         Q      R
    7      -         B      E         e      r
    K      (         k      f         t      s
    *      )         h      F         z      S
    U      *         H      g         ^      T
    4      ,         w      G         j      t
    :      .         r      h         1      u
    ?      /         x      H         X      U
    b      :         -      i         D      V
    g      ?         u      I         J      v
    `      [         _      j         L      w
    V      _         m      J         S      W
    G      }         N      K         c      X
    W      +         P      k         a      x
    O      <         6      l         >      y
    )      =         v      L         y      Y
    f      >         5      M         (      Z
                                      [      z
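
    The map-building step can be sketched in a few lines of Python. This is a simplified illustration, not the engine's implementation: it assumes the cipher substitutes each character independently of its position, and the example pairs are constructed from rows of the table rather than copied from logged calls.

```python
from __future__ import annotations


def build_char_map(
    observations: list[tuple[str, str]],
) -> tuple[dict[str, str], set[str]]:
    """Derive an input->output character map from logged (argument, return) pairs.

    Pairs are skipped unless both strings have the same length and differ,
    mirroring the filter described above. Characters observed with more than
    one output are reported as conflicts and dropped from the map.
    """
    char_map: dict[str, str] = {}
    conflicts: set[str] = set()
    for encrypted, decrypted in observations:
        if len(encrypted) != len(decrypted) or encrypted == decrypted:
            continue
        for src, dst in zip(encrypted, decrypted):
            if src in conflicts:
                continue
            if char_map.setdefault(src, dst) != dst:
                conflicts.add(src)
                del char_map[src]
    return char_map, conflicts


if __name__ == "__main__":
    # Illustrative pairs built from table rows (r->h, j->t, F->p, s->n, .->e).
    observed = [("rjjF", "http"), ("s.j", "net")]
    print(build_char_map(observed))
```

    A non-empty conflict set would be a hint that the method is not a plain per-character substitution and that the map should not be trusted for that method.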

    Wow! :-) With the exception of a few characters (like the number "9"), we have an almost complete table of the main human-readable ASCII characters. Also, the conversion map does not seem to be a simple substitution cipher such as "ROT-13" or the like. Before we check whether the character map gives good results on other, non-executed invokes to the same method, let us take a look at what a typical non-executed code sequence looks like:


    As we can see above, without reverse engineering the "Decryption"-method mkfkejkpu and implementing some custom decryption algorithm, it will not be possible to understand what is going on there. Using some data flow analysis for the parameter (which is easy) and our previously calculated character map, it is possible for Joe Sandbox Mobile to fully automatically decrypt Strings for these calls, even though the code is never executed. This is what the results look like for the same code sequence:


    Aha! The code seems to be part of a routine that is building a C&C URL http://m-l1g.net/q.php that is probably used to post some data. Scrolling down a bit, we find this code sequence in the same method:


    which confirms our assumption that an HTTP based request will be executed (the reflective invoke happens shortly after). The "synthetically" (or heuristically) calculated return values are marked as "Synthetic Return" instead of "Return", as usual.
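
    Applying the map to a constant argument of a non-executed call is then a plain character translation. The Python sketch below uses a subset of the character map from the table earlier in this post. The encrypted argument in the usage example is obtained by inverting that map for the URL quoted above, so it is constructed for illustration and not copied from the sample's code.

```python
from __future__ import annotations

UNKNOWN = "\ufffd"  # placeholder for characters that are missing from the map

# Subset of the input -> output character map shown earlier in this post.
CHAR_MAP = {
    "r": "h", "j": "t", "F": "p", "b": ":", "?": "/", "Z": "m", "7": "-",
    "6": "l", "9": "1", "H": "g", ":": ".", "s": "n", ".": "e", "p": "q",
}


def synthetic_return(encrypted: str,
                     char_map: dict[str, str] = CHAR_MAP) -> tuple[str, bool]:
    """Translate a constant argument; the flag says whether all characters were known."""
    decoded = "".join(char_map.get(ch, UNKNOWN) for ch in encrypted)
    return decoded, UNKNOWN not in decoded


if __name__ == "__main__":
    # Build an example argument by inverting the map for the URL from the post.
    inverse = {out: inp for inp, out in CHAR_MAP.items()}
    argument = "".join(inverse[ch] for ch in "http://m-l1g.net/q.php")
    print(argument, "->", synthetic_return(argument))
```

    Returning a completeness flag alongside the string makes it possible to treat partially decoded values with more caution than fully decoded ones.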

    Creating Behavior Signatures based on Synthetic Strings

    The decryption mechanism is applied fully automatically at every non-executed invoke to the same method, so we were able to understand the entire payload of Opfake.C. Using the data, we built a proof-of-concept signature that detects SMS sending code, even if the code isn't executed and the lookup Strings reside in the package fully encrypted. Here is the code sequence:


    .. and here is the Signature:


    The signature matches if the Strings "android.telephony.SmsManager", "sendTextMessage" and a reflective invoke happen within the same code context. Of course, the signature offers a "Source" link to quickly jump to the relevant code location. Besides the signature above, we came up with two more signatures to help getting an overview of decrypted strings in the package quickly, especially if decrypted Strings appear in the same code context as a reflective invoke (a good indicator for hidden payload):



    See the "Uses an encrypted string to lookup and invoke a method via reflection" Signature for "payload hiding" code locations and the "Probably tries to hide strings using a DecryptString routine" signature for a full list.
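
    The matching condition of the SMS signature can be expressed as a simple predicate over a code context. The Python sketch below is only an illustration of that condition: the CodeContext class and its fields are hypothetical and do not reflect the real signature interface of Joe Sandbox Mobile.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CodeContext:
    """Hypothetical view of one method: resolved strings plus a reflection flag."""
    method: str
    strings: set[str] = field(default_factory=set)
    has_reflective_invoke: bool = False


# Strings that must both appear (real or synthetically decrypted).
REQUIRED_STRINGS = {"android.telephony.SmsManager", "sendTextMessage"}


def matches_sms_reflection(ctx: CodeContext) -> bool:
    """True if both lookup strings and a reflective invoke share one code context."""
    return ctx.has_reflective_invoke and REQUIRED_STRINGS <= ctx.strings


if __name__ == "__main__":
    ctx = CodeContext(
        method="example.Hidden.send",  # made-up method name
        strings={"android.telephony.SmsManager", "sendTextMessage"},
        has_reflective_invoke=True,
    )
    print(matches_sms_reflection(ctx))
```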

    Detecting Sensitive Information Leakage

    Besides the really cool "auto-decryption" feature that we added to Joe Sandbox Mobile, we also added a second signature that detects if sensitive phone information is possibly being leaked. As outlined in the Chuli.A blogpost from August, we have been creating signatures that are more context-aware and work on dynamic session data, such as critical phone identifying information being leaked. In that post we showed how sensitive phone information was being posted in a base64 encoded format as part of HTTP post parameters. Posting data to a C&C server PHP file is not new and a lot of malware uses encrypted payload and not only simple encodings. In the case of Opfake.C, the malware authors decided to encrypt sensitive phone information using the AES cipher algorithm. Here is the relevant code location:


    In the figure above, we see an AES cipher instance being initialized.


    Shortly after the initialization code, we see a call to Cipher.doFinal with a String that contains sensitive phone information, such as the IMEI and IMSI. In the new version of Joe Sandbox Mobile, whenever a Cipher encrypts a payload that contains sensitive phone information, the following signature triggers:


    The "Leaked:" part of the comment indicates which sensitive phone information has been identified, and a quick entrypoint to the relevant code location is provided as well. Of course, implementing this signature would not have been possible without context-awareness (the session information) and full parameter data of the runtime invoke.
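
    The core of such a check is a comparison between the plaintext handed to the cipher and the identifiers known for the analysis session. Here is a minimal Python sketch of that comparison. The session dictionary, its keys and the identifier values are made up for illustration; how the engine stores session data is not shown in this post.

```python
from __future__ import annotations


def leaked_identifiers(plaintext: bytes, session: dict[str, str]) -> list[str]:
    """Return the names of session identifiers whose values occur in the plaintext."""
    text = plaintext.decode("utf-8", errors="replace")
    return sorted(
        name for name, value in session.items() if value and value in text
    )


if __name__ == "__main__":
    # Made-up session values standing in for the analysis device's identifiers.
    session = {"IMEI": "000000000000000", "IMSI": "310260000000000", "Number": ""}
    payload = b"id=000000000000000&sim=310260000000000&os=android"
    print("Leaked:", ", ".join(leaked_identifiers(payload, session)))
```

    Because the check runs on the argument of the encryption call, it still works when the data that finally leaves the device is encrypted.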

    Conclusion

    In this blogpost we demonstrated the power of Hybrid Code Analysis (HCA) that combines dynamic and static analysis in Joe Sandbox Mobile. Using HCA, it was possible to understand how Strings are decrypted in Opfake.C and re-apply the learned character mapping to other encrypted Strings on non-executed invokes (essentially "simulating" a decryption). That way, it was possible to understand the full payload of Opfake.C and create intelligent behavior signatures. Furthermore, we outlined that context-awareness and parameter-level instrumentation, as implemented in Joe Sandbox Mobile, can open doors to more complex signatures that detect data leakage.


    Analyzing "Chuli.A" even if the C&C server is down

    Introduction

    In the past weeks we have been working on Joe Sandbox Mobile, implementing engine improvements to enhance code coverage, adding more powerful behavior signatures and extracting even more dynamic data from APK sample analysis. In the previous blogposts we focused more on the resolution of reflective invokes and on string decryption techniques.

    In this post, we will introduce some of the new features and show how to trigger payload even if the C&C server of a trojan is down. To do so, we will take a deeper look at "Chuli.A" (MD5 c4c4077e9449147d754afd972e247efc), an interesting Android Trojan that was found in some targeted attacks against Tibetan and Uyghur activists (see Android Trojan Found in Targeted Attack). The goal will be to see if we can extract the same (in the best case, even more) information as the three Kaspersky Lab Experts using only the Joe Sandbox Mobile report. This is a useful exercise that we carry out regularly to test the quality of our sandboxing system, as it quickly reveals weaknesses and shows room for improvement if the desired goal cannot be achieved.



    Taking a deeper look at Chuli.A

    Running the sample on our free apk-analyzer.net service does not show much activity (see first run here). We see some "relatively harmless" signatures, very little dynamic analysis data and no internet traffic, as can be seen in the following screenshots:



    In Kaspersky's blogpost the authors claim that "It is important to note that the data won't be uploaded to C&C server automatically. The Trojan waits for incoming SMS messages (...)". This is not quite true, as we will later see. So, why is the sample not showing its real behavior? Let's take a look at the receivers defined in the AndroidManifest.xml (see "Static File Info" - "Receivers" in the report):


    We noticed that none of the intents are simulated by our default cookbook and that most of the intent actions are protected intents that are only sent by the operating system. In some cases, such as android.intent.action.TIME_TICK, it is not even possible to receive the intent through components declared in the manifest:

    "Broadcast Action: The current time has changed. Sent every minute. You can not receive this through components declared in manifests, only by explicitly registering for it with Context.registerReceiver(). This is a protected intent that can only be sent by the system." (see the Android Reference Manual)

    Taking a look at the "onReceive" method of ScreenReceiver in the disassembly shows that it checks whether the "com.google.services.PhoneService" is running and starts it if it is not. Following this, a chain of receivers and services is registered and executed.


    Before we continued the analysis here, we decided to quickly implement a new cookbook command "_JBSimulateTimeTick()", which sends the android.intent.action.TIME_TICK action to all components that have an intent-filter specified. Why? Because it is easier to understand malware if we combine static analysis with dynamic data. After rerunning the sample, the first glance at the detected signatures and internet traffic looks very promising:



    Now that we obviously triggered a lot of behavior, let us take a look at the PhoneService. To summarize the analysis, when the PhoneService.onCreate() is executed this is what happens:

    • The service calls sendInfo using the "create" command (see first HTTP POST above): hxxp://64.78.161.133/android.php?create=phone<timestamp>
    • It checks if the sendInfo was successful
    • *IF* sendInfo was successful (status code 200), it continues execution and does this:
      • Another receiver "sendReceiver" is registered for the action "com.google.system.receiver":
      • Afterwards, PhoneService.serviceInit() is called, which registers an "AlarmService":

    Let's continue with AlarmService. AlarmService.onCreate() does the following:

    • Sets up a receiver for android.provider.Telephony.SMS_RECEIVED (the "alarmReceiver" mentioned in the blogpost by Kaspersky)
    • Gathers sensitive information, such as phone name, location, contacts and sms data
    • Creates a "com.google.system.receiver" action, stores the gathered data in a Bundle and sends a broadcast

    The last broadcast causes the previously registered "sendReceiver" to be triggered, which in turn sends the sensitive phone information. A good entrypoint into these code locations is the "APK behavior" -> "Installation" section of the report:


    As can be seen in the screenshot above, the started services and registered receivers that happen during runtime are outlined quite nicely. Clicking the links directly navigates into the associated disassembly code.
     

    Dynamic Data Refined: HTTP POST parameters

    As outlined previously, sensitive phone information is being sent to the C&C server. This data is sent using the following URL scheme: hxxp://64.78.161.133/data/phone<timestamp>/process.php?datatype=<base64encodeddata>
    The actual data transmission is visible in the report's dynamic data column as one of the calls to DefaultHttpClient.execute() in SendInfo.run() (duplicate code exists in SendInfo.reSendInfo()):


    Besides presenting simple toString() output per parameter object/primitive, the Hybrid Code Analysis engine of Joe Sandbox Mobile also displays special meta-data for certain parameters. The "HttpPost" object is one of these. In this case, we also print the "getURI", "getEntity", "getEntity.getContentType" and a special base64 decode result for "getEntity", which also URL decodes the getEntity string (see EntityUtils.toString()). This way the POST variables can become visible to the analyst in some cases where a simple encoding, such as base64 is chosen by malware authors.
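
    The "URL decode, then base64 decode" treatment of POST variables can be approximated with the Python standard library. The sketch below is a heuristic illustration and not the engine's code: the variable name "datatype" follows the URL scheme above, the encoded value is made up, and values that do not decode to valid UTF-8 text are simply left as they are.

```python
import base64
from urllib.parse import parse_qsl, urlencode


def decode_post_entity(entity: str) -> dict:
    """URL-decode each POST variable, then try a strict base64 decode of its value."""
    decoded = {}
    for name, value in parse_qsl(entity, keep_blank_values=True):
        try:
            decoded[name] = base64.b64decode(value, validate=True).decode("utf-8")
        except ValueError:
            # Not base64, or not text after decoding: keep the URL-decoded value.
            decoded[name] = value
    return decoded


if __name__ == "__main__":
    # Made-up payload; the real sample's field contents are not reproduced here.
    secret = base64.b64encode(b"imei=000000000000000").decode("ascii")
    entity = urlencode({"datatype": secret, "plain": "hello world"})
    print(entity)
    print(decode_post_entity(entity))
```

    Catching ValueError covers both malformed base64 (binascii.Error) and byte sequences that are not valid UTF-8 (UnicodeDecodeError), since both are subclasses of it.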

    Code Coverage Improved: Spoofing the C&C Server Status Code

    The reason why the action broadcast might have slipped by is that the "sendReceiver" registration and the "AlarmService" registration depend fully on whether or not the HTTP POST request to hxxp://64.78.161.133/android.php succeeds (status code 200). It is a lot easier to detect these kinds of checks when you see "live data" in the disassembly. Here is where the status code is checked (part of SendInfo.sendInfo()):


    As we can see, the status code prevents execution from properly advancing (the C&C server is down). That is why we added a new feature to our engine that allows us to spoof the status code to 200 (OK) no matter what the real status code was. This feature is turned off by default and can be enabled using the _JBSetEngineOption('spoofHttpStatusCode', 'true') command as part of a custom cookbook:


    Rerunning the sample with the new cookbook command provides the desired result:



    Using this spoofing technique, we were able to enhance code coverage and progress execution as far as possible. Without this trick (or patching the code manually) the APK would not execute its malicious payload.
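
    The effect of the option can be modelled in a few lines. The Python sketch below is a simplified model of the gated start-up chain described in this post, not the trojan's code or the engine's implementation; the function names and step labels are our own, and None stands for "no response from the server".

```python
def effective_status(real_status, spoof_enabled):
    """Status code the app gets to see: forced to 200 when spoofing is enabled."""
    return 200 if spoof_enabled else real_status


def phone_service_on_create(real_status, spoof_enabled=False):
    """Return the steps reached during start-up for a given server response."""
    steps = ["sendInfo(create)"]
    if effective_status(real_status, spoof_enabled) == 200:
        # Only reached when the initial POST appears to have succeeded.
        steps += ["register sendReceiver", "serviceInit -> AlarmService"]
    return steps


if __name__ == "__main__":
    print("C&C down, no spoofing:", phone_service_on_create(None))
    print("C&C down, spoofing on:", phone_service_on_create(None, spoof_enabled=True))
```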

    New Behavior Signature: Detecting Phone Information Leakage

    As the most important static and dynamic data is propagated and made available to the behavior signature interface of Joe Sandbox Mobile, we decided to write a new signature that checks whether or not phone information, such as the device id, SIM serial number, phone number, etc., is "leaked" via HTTP POST parameters to web servers. In this case, not only the "raw" POST bytes are analyzed, but also the base64-decoded version. Here is how the signature looks in the Chuli.A sample when triggered:


    How easy is it to create signatures like that? Well, the open signature interface allows users to quickly implement and activate new signatures. In this case, the signature itself was not all that difficult to implement, here is an excerpt:


    As can be seen, the signature interface is quite straightforward and offers the ability to detect malicious behavior in a generic way and mark samples as malicious (only 1/46 antivirus solutions detect Chuli.A as malicious).

    Conclusion

    In this blogpost we were able to see that sophisticated sandboxing systems can be a powerful tool to get a deep understanding of malware without the need to be a reverse engineering expert. The comprehensive report offers a lot of entrypoints into the code off the shelf. In this context, a sophisticated sandbox system means triggering malicious behavior not only by simulating user interaction, but at times also by manipulating data, such as optionally spoofing the HTTP status code, which makes it possible to analyze malware beyond the day of release (when C&C servers are down). Using refined (e.g. decoded) dynamic data, it is possible to achieve more accurate analysis results, both for the analysis system and for the analysts. In this case, we were able to quickly detect the base64 decoded parameters and create a "phone information leakage" behavior signature that will be used to classify similar malware in the future.