Rock appround the clock: our research at DEFCON

Wednesday, August 29, 2018


In the world of threat intelligence, the attacker's geographical location is one of the most valuable pieces of data for attribution. Even when it is not perceived as such, this information may steer an investigation one way or another. Among the most sought-after details are where the author comes from, where he or she lives, and where the computer was located at the time of an attack.

We focused our research on taking advantage of “time zone” bugs to track Android malware developers. We will describe two very effective ways to find out the developer's time zone. We have also measured whether these circumstances bear any real relation to malware by diving into our database of 10 million APKs.

AAPT time zone disclosure bug 
The Android software development kit (SDK) comes with a tool called “aapt”. This program packs the files that make up the application and generates an .apk file, which is basically a zip archive.

If the aapt tool is used directly from the command line, or via a development plugin outside Android Studio, the files composing the APK will be generated with a date following this format: 1980-01-01 [offset_GMT]:00:00, where [offset_GMT] represents the time zone configured in the operating system on which the app is being packed.
The figure shows a simple .apk generated from the command line with aapt on a computer whose time zone is set to GMT +3.

Offset GMT in the modified time field
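The pattern in the figure can also be checked programmatically. The following is a minimal sketch using Python's standard zipfile module; the function names are hypothetical and not part of any tool mentioned here. It assumes that entries dated 1980-01-01 HH:MM:00 carry a positive offset, and (an assumption derived from the year correction described further on, not something shown in the figure) that entries dated 1980-12-31 correspond to negative offsets.

```python
import zipfile
from collections import Counter

def aapt_offset_hours(date_time):
    """Return the UTC offset (in hours) implied by one ZIP entry, or None.

    date_time is the 6-tuple exposed by zipfile.ZipInfo.date_time.
    """
    year, month, day, hour, minute, second = date_time
    if year != 1980 or second != 0:
        return None
    if (month, day) == (1, 1):        # offsets east of UTC
        return hour + minute / 60
    if (month, day) == (12, 31):      # assumed: offsets west of UTC
        return hour + minute / 60 - 24
    return None

def guess_offset(apk):
    """Most common offset among the entries of an APK (path or file object)."""
    with zipfile.ZipFile(apk) as zf:
        offsets = [aapt_offset_hours(info.date_time) for info in zf.infolist()]
    votes = Counter(o for o in offsets if o is not None)
    return votes.most_common(1)[0][0] if votes else None
```

Calling guess_offset("app.apk") on a file packed as in the figure would be expected to return 3. An offset of 0 is ambiguous, since other zip tools also write 1980-01-01 00:00:00 as a default date.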

As observed, the modification date of the files is 01-01-80 and the hour is “03”, which corresponds to GMT +3. We observed this issue with different real apps and time zones. Why?
While aapt adds a new file to an .APK (ZipFile.cpp, line 358), line 500 makes a call to “setModWhen”, using the variable “modWhen” as an argument.

Calling setModWhen in aapt source code

But going back through the code, nothing ever assigns “modWhen” a useful value. It simply keeps the “0” it is initially given in line 367 of the same file (ZipFile.cpp):

Setting modWhen in aapt code

Function setModWhen will then be called (always) like:

pEntry->setModWhen(0);

Inside this function (ZipEntry.cpp – line 340), the modWhen variable (from now on referenced as “when”) is used in line 351 as part of this operation:

even = (time_t)(((unsigned long)(when) + 1) & (~1));

Given the value of “modWhen”, this evaluates as:

even = (time_t)(((unsigned long)(0) + 1) & (~1));

The result is (obviously) “0”. This value is stored in the variable “even”, which is later passed as an argument to the “localtime” function. That function fills in the “tm * ptm” date structure, which is then used to set the date and time of the modified field for the files added to the .APK itself.

setModWhen function inside aapt source code.

Because the timestamp passed to localtime (the “even” variable) is not a real one, the date generated for the files is not the actual build date but the one that corresponds to 0. The year is then corrected (it is set to 80 if lower), and the result finally takes the format already described: “01-01-80 [offset_gmt]:00:00”.
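The steps described so far (rounding to an even second, converting to local time, correcting the year) can be reproduced outside aapt. This is only a sketch, not the aapt code itself: set_mod_when and decode are hypothetical names, the time zone is passed explicitly instead of being read from the operating system, and the packing into two 16-bit fields follows the standard MS-DOS date/time layout used by zip files.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def set_mod_when(when, utc_offset_hours):
    """Return the (dos_date, dos_time) pair for timestamp `when`.

    utc_offset_hours stands in for the operating system's time zone,
    which the C code obtains implicitly through localtime().
    """
    even = (when + 1) & ~1                       # round up to an even second
    tz = timezone(timedelta(hours=utc_offset_hours))
    local = (EPOCH + timedelta(seconds=even)).astimezone(tz)
    year = max(local.year - 1900, 80)            # tm_year counts from 1900
    dos_date = (year - 80) << 9 | local.month << 5 | local.day
    dos_time = local.hour << 11 | local.minute << 5 | local.second >> 1
    return dos_date, dos_time

def decode(dos_date, dos_time):
    """Unpack the two 16-bit DOS fields into a readable string."""
    year = (dos_date >> 9) + 1980
    month, day = (dos_date >> 5) & 0x0F, dos_date & 0x1F
    hour, minute = dos_time >> 11, (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2
    return f"{year:04d}-{month:02d}-{day:02d} {hour:02d}:{minute:02d}:{second:02d}"

print(decode(*set_mod_when(0, 3)))   # prints 1980-01-01 03:00:00
```

With a timestamp of 0, the only information left in the result is the offset itself, which is why it shows up in the hours field.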

The next figure shows how “even” is set to 0 just before the localtime function receives it as an argument.


Variable “even” in runtime

The code goes on and splits the data (day, month, year, hours, minutes and seconds) so that each field can be used separately (in this case, printed separately on the screen). The order in which localtime returns the result is: seconds, minutes, hours, day, month, and year. That is, for example, the first position (0x006A0E10) holds 4 bytes for the seconds, and the last one (0x006A0E24) holds another 4 bytes for the year.


Result from localtime function as in memory
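The memory layout described above can be double-checked with a small sketch. It models only the first six fields of C's struct tm with ctypes, assuming 4-byte ints as in the figure; the Tm class and field_addresses helper are hypothetical names used for illustration, and the base address is the one quoted in the text.

```python
import ctypes

class Tm(ctypes.Structure):
    """First six fields of C's struct tm, in declaration order."""
    _fields_ = [
        ("tm_sec", ctypes.c_int),
        ("tm_min", ctypes.c_int),
        ("tm_hour", ctypes.c_int),
        ("tm_mday", ctypes.c_int),
        ("tm_mon", ctypes.c_int),    # 0-based: January is 0
        ("tm_year", ctypes.c_int),   # years since 1900
    ]

BASE = 0x006A0E10  # address of the first field, taken from the text above

def field_addresses(base=BASE):
    """Map each field name to the address where it would sit in memory."""
    return {name: base + getattr(Tm, name).offset for name, _ in Tm._fields_}

for name, address in field_addresses().items():
    print(f"{name:8s} 0x{address:08X}")
```

Under these assumptions the hours field, where the offset appears, sits 8 bytes after the seconds, and the year lands at 0x006A0E24, matching the addresses mentioned.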


Following the colours in the figure, the returned information goes like this:

Returned information example

So localtime is definitely the function returning this offset (in this case, +3), taking it from the operating system. Aapt later rounds the date up to 01-01-80 because this is the “epoch” of the PKZip standard. The reason may be that localtime tries to adapt every date to the time zone where the computer is supposed to be located.

According to our reading of the localtime documentation, this should not happen: if the function gets a null or “0” value as an argument, the return value should be null. So, when does localtime pick up the GMT offset and return it? On Windows, if the TZ (time zone) variable is not set in the application itself, localtime extracts the time zone information from the system, and it does so whether or not the argument it receives is a real timestamp. An invalid timestamp such as “null” or “0” is simply taken as hour “0”, and the returned value contains the GMT offset, which ends up cleanly added where the hours should be.
This particularity exists in UNIX/Linux as well. If a developer uses aapt from the command line, the GMT offset of his or her time zone will be “added” to the modification time of the files inside the APK. In the aapt source code, setModWhen uses localtime_r instead of localtime (the code is the same, but the call depends on the system it is compiled for), and the argument passed to it is still the “even” variable (with a value of 0). The function behaves basically as it does on Windows: by default it simply applies the time zone configured in the operating system.
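The behaviour is easy to observe on any machine. The sketch below uses Python's time.localtime, which relies on the platform's own local time conversion, and simply asks for the local time of timestamp 0; epoch_zero_local is a hypothetical helper name. The output depends on the time zone configured on the machine running it, so no fixed result is shown.

```python
import time

def epoch_zero_local():
    """Return (hour, minute, utc_offset_seconds) that localtime() gives for 0."""
    tm = time.localtime(0)
    return tm.tm_hour, tm.tm_min, tm.tm_gmtoff

hour, minute, gmtoff = epoch_zero_local()
# The hour printed here is the one that would end up in the zip entries.
print(f"localtime(0) -> {hour:02d}:{minute:02d}, UTC offset {gmtoff / 3600:+g} h")
```

One caveat worth keeping in mind: the offset applied is the one the system's time zone rules give for that instant, which may differ from the zone's current offset if the rules have changed over the years.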

What can we conclude? In our view, localtime is not handling errors as it should: when it receives a 0 or null argument, it should return null, not 0 plus whatever your GMT offset (TZ on Windows) is. On the other hand, aapt makes a mistake by feeding this function a constant 0 as its argument.

GMT zone certificate calculation
As said, .APK files (and .jar files, for this particular technique) follow the PKZIP standard. That is, they are .zip files for all practical purposes and share most of the PKZIP specification. If the APK is not built directly with aapt, there is no chance of learning the creator's time zone that way, and all the “modification time” fields of the files inside the zip should be the “right ones”. However, a few years ago we found another factor that lets us determine the time zone where the developer compiled the application, just as interesting as the previous one and complementary to it. The method consists of calculating the difference between the correct timestamp of the files (stored in the developer's local time) and the timestamp of the certificate included in the APK to sign it (stored in UTC), which gives us enough references to calculate the time zone.

UTC Time - ZIPs file gets the offset and thus, time zone (map from timeanddate.com).
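The subtraction itself can be sketched as follows. This is an illustration under stated assumptions, not our actual tooling: estimate_utc_offset is a hypothetical name, both timestamps are assumed to have been extracted beforehand (the zip entry time as naive local time, the certificate's start of validity as naive UTC), and the certificate is assumed to have been created shortly before the files were packed. The result is rounded to the nearest quarter of an hour, and implausible values are rejected.

```python
from datetime import datetime, timedelta

def estimate_utc_offset(zip_local, cert_not_before_utc, step_minutes=15):
    """Estimate the developer's UTC offset as a timedelta, or None.

    zip_local:            modification time of a file inside the APK (local time)
    cert_not_before_utc:  start of validity of the signing certificate (UTC)
    """
    step = timedelta(minutes=step_minutes)
    delta = zip_local - cert_not_before_utc
    offset = round(delta / step) * step          # snap to a real-world offset
    # Offsets in use today range from UTC-12 to UTC+14.
    if not timedelta(hours=-12) <= offset <= timedelta(hours=14):
        return None                              # timestamps are unrelated
    return offset

# Illustrative values only, not taken from a real sample.
files = datetime(2018, 8, 29, 15, 0, 40)
cert = datetime(2018, 8, 29, 12, 0, 5)
print(estimate_utc_offset(files, cert))          # prints 3:00:00
```

Because the file timestamps follow local time, the estimate includes any Daylight Saving Time shift in effect when the app was built.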

Relation with malware
We tried to establish a relation between: 
  • Malware/adware creators and the way APKs are compiled (using aapt in command line). 
  • Malware/adware creators and the way ad-hoc and disposable certificates are created. 
For this experiment, we took 1000 files (unless stated otherwise) from those showing the leak in every flavor (1000 files leaking GMT+1, 1000 leaking GMT+2, etc.) and checked them for malware.
With the AAPT bug:

Samples with AAPT time zone disclosure

Green columns are not representative because they are based on too few samples.

Leaking because of a disposable certificate:

APKs with file/certificate datetimes (do not forget DST!)

So we can conclude that, basically, GMT+4, GMT+5, GMT+8, GMT-6 and GMT-7 are the time zones producing the most malware. Why the slight difference between techniques? With the aapt bug, the predominant time zones producing malware are GMT+4, GMT+8 and GMT-7. With the certificate technique, they are GMT+5, GMT+8 and GMT-6. These offsets correspond to some parts of Russia, China, and the United States West Coast. We think the difference is due to Daylight Saving Time: these techniques are tied to DST, so some countries may show a difference of plus or minus one hour depending on the season. China does not use DST, and neither has Russia for some years.

In addition, we know that about 6% of any set from our database that lacks these characteristics is malware. Using this figure as a “correction factor” for comparison, we finally get these numbers:

Comparison table


Metadata
As one of the techniques related to metadata, we show how all the strings automatically generated by Android Studio live in specific components created by the IDE itself, while the text strings written by the developer are found in other files, not associated with a specific component. For example, when executing:

./aapt dump --values resources app.APK | grep '^ *resource.*:string/' --after-context=1 > output.txt

Extracting all the resources of an Android application, filtering by text strings.
We get the strings written directly by the developer, which will very likely be in his or her native language.
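For further processing, the same filter can be applied from a script. The sketch below reimplements only the grep step (same regular expression, one line of trailing context) on lines that have already been captured from the aapt command above; string_resources is a hypothetical name, and the sample input is made up to exercise the filter, not copied from real aapt output.

```python
import re

# Same pattern as the grep call above.
RESOURCE_RE = re.compile(r"^ *resource.*:string/")

def string_resources(lines):
    """Yield (resource_line, following_line) pairs, like grep --after-context=1."""
    lines = list(lines)
    for i, line in enumerate(lines):
        if RESOURCE_RE.search(line):
            following = lines[i + 1].strip() if i + 1 < len(lines) else ""
            yield line.strip(), following

# Made-up input, for illustration only.
sample = [
    "    resource 0x7f050000 com.example:string/app_name: t=0x03",
    '      (string8) "Mi aplicación"',
    "    resource 0x7f020000 com.example:drawable/icon: t=0x03",
    '      (string8) "res/drawable/icon.png"',
]
for resource, value in string_resources(sample):
    print(resource, "->", value)
```

Unlike grep, this version does not print "--" separators between groups; it returns pairs that can be passed directly to, for example, a language detection step.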

Conclusions and future work
We have presented two techniques to leak the time zone from an app. One of them, related to an aapt bug, not only reveals a bug in the way dates are handled, but also a possible problem with a system function (localtime) not honoring its specification. This may affect other programs in other ways. By studying these techniques, we have a new way of possibly detecting automated malware creation by analyzing when and how the certificates used to sign these apps are created. Besides the statistics on where malware comes from based on its time zones, this may be used as an important feature in machine learning systems for early detection of Android malware.

In addition, we have shown some tools and tricks for getting a quick view of all this useful information in APK metadata. Future work should be more accurate about DST, taking the season into account when classifying malware, and perhaps use more samples to reach better conclusions. This is just a summary of the complete paper, which you may find here:



Innovation and Labs
