Android Pseudo-Encryption and How to Work Around It

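As we all know, an Android apk file is simply a file in zip format. For work I often need to unpack apk files to get at the resources inside, but recently many apks fail to extract with all sorts of archive tools, even though they still install fine and aapt2 can still list their contents. I used to batch-extract them with Python's zip support; having to install every apk just to look inside would be far too painful. So I dug into the zip extraction library in the Android 11 source to see what it does differently.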

The zip format

https://pkware.cachefly.net/webdocs/APPNOTE/APPNOTE-6.2.0.txt is the official specification; see it for the most detailed description of the format.

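[Figure: the three parts of a zip file]
Roughly speaking, a zip file has three parts. The first part holds the file data. The second part, the central directory, holds information about the files in the first part. The last part is the end record, which serves two purposes: it marks the end of the zip file, and it stores information about the central directory. That is why a zip file is actually parsed from back to front.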

End of central directory record (EOCD)
  I.  End of central directory record:

end of central dir signature    4 bytes  (0x06054b50)  // 4-byte signature 0x06054b50, used to locate the EOCD
number of this disk             2 bytes  // number of the current disk
number of the disk with the
start of the central directory  2 bytes  // number of the disk on which the central directory starts
total number of entries in the
central directory on this disk  2 bytes  // number of central directory entries stored on this disk
total number of entries in
the central directory           2 bytes  // total number of central directory entries
size of the central directory   4 bytes  // size of the central directory
offset of start of central
directory with respect to
the starting disk number        4 bytes  // offset of the start of the central directory, relative to the starting disk
.ZIP file comment length        2 bytes  // comment length
.ZIP file comment       (variable size)  // comment content

The first step in extracting a zip file is to read the location and size of the central directory from the EOCD.
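As a rough illustration of that first step, the fixed 22-byte part of the EOCD can be decoded with Python's struct module. This is only a sketch: `parse_eocd` and the dictionary keys are names made up for this example, and it assumes a plain single-disk archive without zip64 extensions.

```python
import struct

EOCD_SIGNATURE = 0x06054B50
# Little-endian layout of the fixed part of the EOCD, in spec order:
# signature, disk number, disk with CD start, entries on this disk,
# total entries, CD size, CD offset, comment length (22 bytes in total).
EOCD_STRUCT = struct.Struct("<IHHHHIIH")


def parse_eocd(record: bytes) -> dict:
    """Decode the fixed-size part of an EOCD record (hypothetical helper)."""
    (signature, _disk_no, _cd_disk, _entries_here, entries_total,
     cd_size, cd_offset, comment_length) = EOCD_STRUCT.unpack_from(record)
    if signature != EOCD_SIGNATURE:
        raise ValueError("not an EOCD record")
    return {
        "entries": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_length": comment_length,
    }
```

For an archive with no trailing comment, the record is simply the last 22 bytes of the file, so `parse_eocd(data[-22:])` would be enough in that case.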

Central directory
  Central directory structure:

[file header 1]
...
[file header n]
[digital signature]

File header:

central file header signature   4 bytes  (0x02014b50)  // magic number
version made by                 2 bytes  // version used to create the archive
version needed to extract       2 bytes  // minimum version needed to extract
general purpose bit flag        2 bytes  // general purpose flags; lowest bit 1 means encrypted, 0 means not encrypted
compression method              2 bytes  // compression method
last mod file time              2 bytes  // last modification time
last mod file date              2 bytes  // last modification date
crc-32                          4 bytes  // CRC-32 checksum
compressed size                 4 bytes  // compressed size
uncompressed size               4 bytes  // uncompressed size
file name length                2 bytes  // file name length
extra field length              2 bytes  // extra field length
file comment length             2 bytes  // file comment length
disk number start               2 bytes  // number of the disk on which the file starts
internal file attributes        2 bytes  // internal file attributes
external file attributes        4 bytes  // external file attributes
relative offset of local header 4 bytes  // relative offset of the local file header

file name (variable size)       // file name of the entry
extra field (variable size)     // extra field
file comment (variable size)    // file comment

Digital signature:

header signature                4 bytes  (0x05054b50)
size of data                    2 bytes
signature data (variable size)

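The central directory is made up of file headers, one per file. Each file header gives the file name and the location and size of the file data, which is enough to go to the data section and extract the file. The lowest bit of the general purpose bit flag (general purpose bit flag & 0x01) indicates whether the entry is encrypted. Setting that bit to 1 is the simplest form of pseudo-encryption: nothing was actually encrypted and no password was set at packaging time; only the flag was changed. Android does not read this flag when installing an apk, whereas many zip libraries and archive tools use it to decide whether to ask for a password, so flipping the bit is enough to block ordinary extraction.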
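One way to undo this kind of pseudo-encryption is to clear bit 0 again before handing the file to an ordinary zip library. The sketch below does that on an in-memory copy of the archive. `clear_encryption_flags` is a hypothetical helper written for this article, not part of any library, and it assumes a single-disk archive without zip64 whose offsets are relative to the start of the file. It locates the EOCD with a plain `rfind`, which could be fooled by a signature embedded in the comment. It must only be used on archives that are pseudo-encrypted; for a really encrypted archive it would just produce garbage on extraction.

```python
import struct


def clear_encryption_flags(data: bytes) -> bytes:
    """Return a copy of the archive with bit 0 of every general purpose
    bit flag cleared, in the central directory and in the local headers."""
    buf = bytearray(data)
    eocd = buf.rfind(b"PK\x05\x06")  # EOCD signature 0x06054b50
    if eocd < 0:
        raise ValueError("EOCD not found")
    # total entries (2 bytes), CD size (4 bytes), CD offset (4 bytes)
    entries, _cd_size, cd_offset = struct.unpack_from("<HII", buf, eocd + 10)

    pos = cd_offset
    for _ in range(entries):
        if buf[pos:pos + 4] != b"PK\x01\x02":
            raise ValueError("missed a central directory signature")
        # The flag sits 8 bytes into a central directory file header.
        (flags,) = struct.unpack_from("<H", buf, pos + 8)
        struct.pack_into("<H", buf, pos + 8, flags & 0xFFFE)

        name_len, extra_len, comment_len = struct.unpack_from("<HHH", buf, pos + 28)
        (local_offset,) = struct.unpack_from("<I", buf, pos + 42)
        # The same flag sits 6 bytes into the local file header.
        if buf[local_offset:local_offset + 4] == b"PK\x03\x04":
            (local_flags,) = struct.unpack_from("<H", buf, local_offset + 6)
            struct.pack_into("<H", buf, local_offset + 6, local_flags & 0xFFFE)

        pos += 46 + name_len + extra_len + comment_len
    return bytes(buf)
```

The patched bytes can then be written to a new file or wrapped in `io.BytesIO` and opened with a regular zip library.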

[local file header 1]
[file data 1]
[data descriptor 1]
...
[local file header n]
[file data n]
[data descriptor n]

A.  Local file header:

local file header signature     4 bytes  (0x04034b50)  // signature
version needed to extract       2 bytes  // minimum version needed to extract
general purpose bit flag        2 bytes  // general purpose bit flag
compression method              2 bytes  // compression method
last mod file time              2 bytes  // last modification time
last mod file date              2 bytes  // last modification date
crc-32                          4 bytes  // CRC-32 checksum
compressed size                 4 bytes  // compressed size
uncompressed size               4 bytes  // uncompressed size
file name length                2 bytes  // file name length
extra field length              2 bytes  // extra field length

file name (variable size)       // file name
extra field (variable size)     // extra field

B.  File data

Immediately following the local header for a file
is the compressed or stored data for the file.
The series of [local file header][file data][data
descriptor] repeats for each file in the .ZIP archive.

C.  Data descriptor:  // usually absent; present only when bit 3 of the general purpose bit flag is set

crc-32                          4 bytes
compressed size                 4 bytes
uncompressed size               4 bytes

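The local file header is almost identical to the corresponding file header in the central directory. The file data follows immediately after the local file header, and it can be extracted given its length and compression method.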
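To make this concrete, here is a minimal sketch that reads one entry straight from its local file header. `read_local_entry` is a name invented for this example. It assumes the caller already knows the header offset and the compressed size (normally taken from the central directory, since the sizes in the local header may be zero when a data descriptor is used), and it only handles the two methods commonly seen in apks, stored (0) and deflate (8).

```python
import struct
import zlib

# Fixed 30-byte part of a local file header, in spec order.
LOCAL_HEADER = struct.Struct("<IHHHHHIIIHH")


def read_local_entry(data: bytes, header_offset: int, compressed_size: int):
    """Return (file name, uncompressed bytes) for the entry at header_offset."""
    (signature, _version, _flags, method, _time, _date, _crc,
     _csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, header_offset)
    if signature != 0x04034B50:
        raise ValueError("not a local file header")

    name_start = header_offset + LOCAL_HEADER.size
    name = data[name_start:name_start + name_len].decode("utf-8", "replace")

    # File data starts right after the file name and the extra field.
    data_start = name_start + name_len + extra_len
    payload = data[data_start:data_start + compressed_size]

    if method == 0:   # stored: the bytes are kept as-is
        return name, payload
    if method == 8:   # deflate: raw stream without a zlib header
        return name, zlib.decompress(payload, -15)
    raise NotImplementedError(f"compression method {method}")
```

Note that nothing in this function looks at the encryption bit, which is the same reason a pseudo-encrypted apk still installs.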

How Android extracts a zip

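In the framework, files can be extracted through frameworks/base/libs/androidfw/ZipUtils.cpp. A closer look at the code shows that this class only wraps functions from the ziparchive library, and every call ends up there. The library's source lives in system/core/libziparchive/.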

system/core/libziparchive/zip_archive.cc

int32_t OpenArchive(const char* fileName, ZipArchiveHandle* handle) {
  const int fd = open(fileName, O_RDONLY | O_BINARY, 0);
  ZipArchive* archive = new ZipArchive(fd, true);
  *handle = archive;

  if (fd < 0) {
    ALOGW("Unable to open '%s': %s", fileName, strerror(errno));
    return kIoError;
  }

  return OpenArchiveInternal(archive, fileName);
}
  1. Open the file by path to get an fd
  2. Create a ZipArchive object
  3. Call OpenArchiveInternal to parse the file

static int32_t OpenArchiveInternal(ZipArchive* archive, const char* debug_file_name) {
  int32_t result = -1;
  // Parse the EOCD to get the location of the central directory and related information
  if ((result = MapCentralDirectory(debug_file_name, archive)) != 0) {
    return result;
  }

  // Parse the zip file
  if ((result = ParseZipArchive(archive))) {
    return result;
  }

  return 0;
}

At this point the all-important central directory has come into view. Let's see how MapCentralDirectory obtains it.

​
/*
 * Find the zip Central Directory and memory-map it.
 *
 * On success, returns 0 after populating fields from the EOCD area:
 *   directory_offset
 *   directory_ptr
 *   num_entries
 */
static int32_t MapCentralDirectory(const char* debug_file_name, ZipArchive* archive) {
  // (some error-handling code removed here)

  /*
   * Perform the traditional EOCD snipe hunt.
   *
   * We're searching for the End of Central Directory magic number,
   * which appears at the start of the EOCD block.  It's followed by
   * 18 bytes of EOCD stuff and up to 64KB of archive comment.  We
   * need to read the last part of the file into a buffer, dig through
   * it to find the magic number, parse some values out, and use those
   * to determine the extent of the CD.
   *
   * We start by pulling in the last part of the file.
   */
  off64_t read_amount = kMaxEOCDSearch;
  if (file_length < read_amount) {
    read_amount = file_length;
  }

  std::vector<uint8_t> scan_buffer(read_amount);
  int32_t result =
      MapCentralDirectory0(debug_file_name, archive, file_length, read_amount, scan_buffer.data());
  return result;
}

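This function only does some error handling; the actual parsing is done by MapCentralDirectory0. The error handling refers to the familiar EocdRecord, the struct that describes the EOCD.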

​
static int32_t MapCentralDirectory0(const char* debug_file_name, ZipArchive* archive,
                                    off64_t file_length, off64_t read_amount,
                                    uint8_t* scan_buffer) {
  const off64_t search_start = file_length - read_amount;

  if (!archive->mapped_zip.ReadAtOffset(scan_buffer, read_amount, search_start)) {
    ALOGE("Zip: read %" PRId64 " from offset %" PRId64 " failed", static_cast<int64_t>(read_amount),
          static_cast<int64_t>(search_start));
    return kIoError;
  }

  /*
   * Scan backward for the EOCD magic.  In an archive without a trailing
   * comment, we'll find it on the first try.  (We may want to consider
   * doing an initial minimal read; if we don't find it, retry with a
   * second read as above.)
   */
  // Loop to find the EOCD
  int i = read_amount - sizeof(EocdRecord);
  for (; i >= 0; i--) {
    if (scan_buffer[i] == 0x50) {
      uint32_t* sig_addr = reinterpret_cast<uint32_t*>(&scan_buffer[i]);
      // kSignature = 0x06054b50; the EOCD is located by its signature
      if (get_unaligned<uint32_t>(sig_addr) == EocdRecord::kSignature) {
        ALOGV("+++ Found EOCD at buf+%d", i);
        break;
      }
    }
  }
  if (i < 0) {
    ALOGD("Zip: EOCD not found, %s is not zip", debug_file_name);
    return kInvalidFile;
  }

  const off64_t eocd_offset = search_start + i;
  // View the bytes as an EocdRecord, which decodes the data according to the zip EOCD layout
  const EocdRecord* eocd = reinterpret_cast<const EocdRecord*>(scan_buffer + i);
  /*
   * Verify that there's no trailing space at the end of the central directory
   * and its comment.
   */
  const off64_t calculated_length = eocd_offset + sizeof(EocdRecord) + eocd->comment_length;
  if (calculated_length != file_length) {
    ALOGW("Zip: %" PRId64 " extraneous bytes at the end of the central directory",
          static_cast<int64_t>(file_length - calculated_length));
    return kInvalidFile;
  }

  /*
   * Grab the CD offset and size, and the number of entries in the
   * archive and verify that they look reasonable.
   */
  if (static_cast<off64_t>(eocd->cd_start_offset) + eocd->cd_size > eocd_offset) {
    ALOGW("Zip: bad offsets (dir %" PRIu32 ", size %" PRIu32 ", eocd %" PRId64 ")",
          eocd->cd_start_offset, eocd->cd_size, static_cast<int64_t>(eocd_offset));
#if defined(__ANDROID__)
    if (eocd->cd_start_offset + eocd->cd_size <= eocd_offset) {
      android_errorWriteLog(0x534e4554, "31251826");
    }
#endif
    return kInvalidOffset;
  }
  if (eocd->num_records == 0) {
    ALOGW("Zip: empty archive?");
    return kEmptyArchive;
  }

  // All the sanity checks are done: the EOCD is valid and gives the number of file headers in the central directory
  ALOGV("+++ num_entries=%" PRIu32 " dir_size=%" PRIu32 " dir_offset=%" PRIu32, eocd->num_records,
        eocd->cd_size, eocd->cd_start_offset);

  /*
   * It all looks good.  Create a mapping for the CD, and set the fields
   * in archive.
   */
  // InitializeCentralDirectory creates the related variables and stores them
  if (!archive->InitializeCentralDirectory(debug_file_name,
                                           static_cast<off64_t>(eocd->cd_start_offset),
                                           static_cast<size_t>(eocd->cd_size))) {
    ALOGE("Zip: failed to intialize central directory.\n");
    return kMmapFailed;
  }

  archive->num_entries = eocd->num_records;
  archive->directory_offset = eocd->cd_start_offset;

  return 0;
}
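  1. Start looking for the EOCD at file_length - read_amount. read_amount is the largest size the EOCD can have, so the search covers the last read_amount bytes of the file.
  2. After the various sanity checks, the EOCD that was found is known to be valid. This is also where a lot of pseudo-encryption tricks operate: Android searches the whole read_amount region, whereas many libraries and archive tools assume by default that there is no comment or extra data.
  3. InitializeCentralDirectory maps the central directory and stores the related variables.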
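The backward scan described in these steps can be sketched in a few lines of Python. This is not a port of the Android code: `find_eocd` and `eocd_ends_file` are hypothetical helpers, and the search window (a 22-byte record plus a comment of at most 65535 bytes) follows from the field sizes in the zip specification.

```python
import struct

EOCD_FIXED_SIZE = 22      # size of the EOCD without its comment
MAX_COMMENT_SIZE = 0xFFFF  # the comment length field is 2 bytes wide


def find_eocd(data: bytes) -> int:
    """Return the offset of the EOCD signature, or -1 if none is found.

    Scans backward over the last 22 + 65535 bytes, so the record is
    found even when the archive ends with a comment.
    """
    search_start = max(0, len(data) - (EOCD_FIXED_SIZE + MAX_COMMENT_SIZE))
    for i in range(len(data) - EOCD_FIXED_SIZE, search_start - 1, -1):
        if data[i:i + 4] == b"PK\x05\x06":
            return i
    return -1


def eocd_ends_file(data: bytes, eocd_offset: int) -> bool:
    """Check that the EOCD plus its comment reaches exactly the end of the
    file, similar in spirit to the calculated_length check quoted above."""
    (comment_length,) = struct.unpack_from("<H", data, eocd_offset + 20)
    return eocd_offset + EOCD_FIXED_SIZE + comment_length == len(data)
```

A tool that only looks at the last 22 bytes would miss the record as soon as a comment is appended, which is the kind of difference in behavior noted in step 2.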

Back in OpenArchiveInternal, once MapCentralDirectory has gathered this information, ParseZipArchive is called to parse the archive.

// The function is long, so part of the error-handling code has been removed
static int32_t ParseZipArchive(ZipArchive* archive) {
  const uint8_t* const cd_ptr = archive->central_directory.GetBasePtr();
  const size_t cd_length = archive->central_directory.GetMapLength();
  const uint16_t num_entries = archive->num_entries;

  /*
   * Create hash table.  We have a minimum 75% load factor, possibly as
   * low as 50% after we round off to a power of 2.  There must be at
   * least one unused entry to avoid an infinite loop during creation.
   */
  archive->hash_table_size = RoundUpPower2(1 + (num_entries * 4) / 3);  // create the hash table
  archive->hash_table =
      reinterpret_cast<ZipStringOffset*>(calloc(archive->hash_table_size, sizeof(ZipStringOffset)));

  /*
   * Walk through the central directory, adding entries to the hash
   * table and verifying values.
   */
  const uint8_t* const cd_end = cd_ptr + cd_length;
  const uint8_t* ptr = cd_ptr;
  for (uint16_t i = 0; i < num_entries; i++) {  // loop over every CentralDirectoryRecord
    if (ptr > cd_end - sizeof(CentralDirectoryRecord)) {
      ALOGW("Zip: ran off the end (item #%" PRIu16 ", %zu bytes of central directory)", i,
            cd_length);
#if defined(__ANDROID__)
      android_errorWriteLog(0x534e4554, "36392138");
#endif
      return kInvalidFile;
    }

    const CentralDirectoryRecord* cdr = reinterpret_cast<const CentralDirectoryRecord*>(ptr);
    // kSignature = 0x02014b50; the signature is checked for every record
    if (cdr->record_signature != CentralDirectoryRecord::kSignature) {
      ALOGW("Zip: missed a central dir sig (at %" PRIu16 ")", i);
      return kInvalidFile;
    }

    const off64_t local_header_offset = cdr->local_file_header_offset;

    const uint16_t file_name_length = cdr->file_name_length;
    const uint16_t extra_length = cdr->extra_field_length;
    const uint16_t comment_length = cdr->comment_length;
    const uint8_t* file_name = ptr + sizeof(CentralDirectoryRecord);

    // Add the CDE filename to the hash table.
    // Build entry_name from the file name
    std::string_view entry_name{reinterpret_cast<const char*>(file_name), file_name_length};
    // Add to the hash table, keyed by entry_name; the stored value locates this entry within the central directory
    const int add_result = AddToHash(archive->hash_table, archive->hash_table_size, entry_name,
                                     archive->central_directory.GetBasePtr());

    ptr += sizeof(CentralDirectoryRecord) + file_name_length + extra_length + comment_length;
  }

  ALOGV("+++ zip good scan %" PRIu16 " entries", num_entries);

  return 0;
}
  1. Create a hash table
  2. Starting from the address and entry count obtained from the EOCD, loop over and parse every CentralDirectoryRecord
  3. Store every parsed CentralDirectoryRecord in the hash table

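The hash table of CentralDirectoryRecord entries is now built. To extract a file, look up its CentralDirectoryRecord in the hash table, use the record to find the address and length of the corresponding data, and read that range.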
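The same walk can be mimicked in Python, with an ordinary dict standing in for the hash table. This is a sketch for illustration rather than a translation of ParseZipArchive: `index_central_directory` and the dictionary keys are invented names, the EOCD is located with a simple `rfind`, and zip64 and multi-disk archives are not handled.

```python
import struct

# Fixed 46-byte part of a central directory file header, in spec order.
CD_HEADER = struct.Struct("<IHHHHHHIIIHHHHHII")


def index_central_directory(data: bytes) -> dict:
    """Map each entry name to the fields needed to locate and unpack it."""
    eocd = data.rfind(b"PK\x05\x06")
    if eocd < 0:
        raise ValueError("EOCD not found")
    entries, _cd_size, cd_offset = struct.unpack_from("<HII", data, eocd + 10)

    index = {}
    pos = cd_offset
    for _ in range(entries):
        fields = CD_HEADER.unpack_from(data, pos)
        if fields[0] != 0x02014B50:
            raise ValueError("missed a central directory signature")
        name_len, extra_len, comment_len = fields[10:13]
        name_start = pos + CD_HEADER.size
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        index[name] = {
            "flags": fields[3],            # general purpose bit flag
            "method": fields[4],           # compression method
            "compressed_size": fields[8],
            "local_offset": fields[16],    # offset of the local file header
        }
        # The next record starts after the variable-length fields.
        pos = name_start + name_len + extra_len + comment_len
    return index
```

Looking up `index["AndroidManifest.xml"]`, for example, would give the offset and size needed to read that entry's data, and `flags & 0x01` shows whether the encryption bit has been set on it.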

Summary

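That is the whole zip extraction flow; Android still follows the standard procedure. Find the EOCD and locate the central directory -> build a hash table of CentralDirectoryRecord entries from the central directory -> finally use the file offset, length, and compression method in the CentralDirectoryRecord to fetch and decompress the data. If extraction fails in the future because some other field has been tampered with, it should be easy to track down as well.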
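For batch extraction with Python, one possible workaround is to ignore the flag in the same way Android does. The sketch below relies on Python's standard zipfile module and clears bit 0 on each ZipInfo object before reading. `read_entries_ignoring_flag` is a hypothetical helper, and the approach assumes that zipfile decides whether to ask for a password from `ZipInfo.flag_bits`, which is how current CPython versions behave but is an implementation detail. It only makes sense for pseudo-encrypted archives.

```python
import zipfile


def read_entries_ignoring_flag(source) -> dict:
    """Read every entry of a pseudo-encrypted archive into memory.

    `source` is a path or a binary file object. Returns a dict mapping
    entry names to their uncompressed bytes.
    """
    contents = {}
    with zipfile.ZipFile(source) as zf:
        for info in zf.infolist():
            # Drop the "encrypted" bit so zipfile does not ask for a password.
            info.flag_bits &= ~0x1
            contents[info.filename] = zf.read(info)
    return contents
```

For large apks it would be more sensible to stream each entry to disk instead of keeping everything in memory; the dict is used here only to keep the example short.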

