Improve compatibility with iOS / macOS 27 shared caches #8260
These shared caches contain symbols pointing into address ranges that are no longer mapped, such as `objc_msgSend$stub` functions that are now merged into stub island regions.
…shared caches
* `objc_msgSend$stub` functions no longer appear in the `__objc_stubs` section of their dylib. Instead they're coalesced across multiple dylibs and appear in a stub island region of the shared cache. This means that `AnalyzeStubFunction` can no longer determine the type of stub it is processing purely based on the containing section name. It now considers the target of the call to determine the type of the stub.
* `objc_msgSend` and friends now have definitions in multiple dylibs throughout the shared cache (`/usr/lib/objc/libobjcMsgSendN.dylib`). This means that loading the target of `objc_msgSend` calls within `objc_msgSend$stub` functions is not sufficient to make selector definitions visible to analysis. Instead, we explicitly load `/usr/lib/libobjc.A.dylib` whenever we process a stub function that references `libobjcMsgSendN.dylib`.
…e to selector base address These show up in iOS 27 shared caches.
This helps for stripped binaries, and in cases such as the macOS 27 shared cache, where the symbols for stub functions are no longer accurate because the stubs are coalesced into stub island regions outside of any dylib.
| return Ok(()); | ||
| } | ||
| let func = ac.function(); |
I think we need to update `AnalysisContext::function` to return an `Option<Ref<Function>>`, since it can be executed in the context of the binary view for a module-level workflow.
This isn't an issue with your PR; I just noticed it when reviewing.
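A minimal sketch of what the suggested signature change could look like from the caller's side. The `Function` and `AnalysisContext` types here are hypothetical stand-ins, not the real `binaryninja` crate types, and the `process` helper is invented for illustration only.

```rust
// Hypothetical stand-ins for illustration; the real binaryninja types differ.
#[derive(Debug, Clone, PartialEq)]
struct Function {
    start: u64,
}

struct AnalysisContext {
    // Assumed to be None when the activity runs for a module-level workflow,
    // where only the binary view is available.
    function: Option<Function>,
}

impl AnalysisContext {
    // Returning Option forces callers to handle the module-level case.
    fn function(&self) -> Option<Function> {
        self.function.clone()
    }
}

// Hypothetical activity body: bail out quietly when there is no function.
fn process(ac: &AnalysisContext) -> Result<Option<u64>, ()> {
    let Some(func) = ac.function() else {
        return Ok(None);
    };
    Ok(Some(func.start))
}
```

With this shape, the existing `let func = ac.function();` call sites would need a `let ... else` (or similar) to compile, which makes the module-level case explicit at each one.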
| let Some(insn) = block.iter().last() else { | ||
| return Ok(()); | ||
| }; | ||
| let LowLevelILInstructionKind::TailCallSsa(call_op) = insn.kind() else { |
Is this supposed to capture jumps as well? I see https://github.com/Vector35/binaryninja-api/pull/8261/changes#diff-153013bd9df2d3608be397874180b01d8e12bf3415cfe2cf26f001ab287d4120R47 and it seems like it should, but I'm not sure.
The code you link to deals with cross-image stub functions into an image that is not yet loaded. In that case the call will often be represented as a jump. This case is a little different.
objc_msgSend$stub functions always end with a call to objc_msgSend. AnalyzeStubFunction in SharedCacheWorkflow.cpp automatically loads libobjc (and/or the new /usr/lib/objc/libobjcMsgSendN.dylib) when it processes one of these stub functions, so any jump instruction that may have existed will have been converted to a tailcall.
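To illustrate the distinction, here is a simplified sketch of matching only a tail call as a block's final instruction. `InstrKind` is a hypothetical, cut-down stand-in for `LowLevelILInstructionKind`, and `tail_call_target` is an invented helper; neither reflects the real API.

```rust
// Hypothetical, simplified stand-in for the LLIL SSA instruction kinds.
#[derive(Debug, Clone, PartialEq)]
enum InstrKind {
    TailCallSsa { target: u64 },
    Jump { target: u64 },
    Other,
}

/// Returns the target of the block's final instruction only if that
/// instruction is a tail call; jumps and everything else are ignored.
fn tail_call_target(block: &[InstrKind]) -> Option<u64> {
    // An empty block has no final instruction to inspect.
    let Some(last) = block.last() else {
        return None;
    };
    // Mirrors the let-else in the diff: anything but a tail call bails out.
    let InstrKind::TailCallSsa { target } = last else {
        return None;
    };
    Some(*target)
}
```

Under the assumption stated above (the target image is already loaded by the time this runs), a trailing `Jump` is not expected for these stubs, so skipping it loses nothing.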
* `objc_msgSend$stub` functions no longer appear in the `__objc_stubs` section of their dylib. Instead they're coalesced across multiple dylibs and appear in a stub island region of the shared cache. This means that `AnalyzeStubFunction` can no longer determine the type of stub it is processing purely based on the containing section name. It now considers the target of the call to determine the type of the stub.
* `objc_msgSend` and friends now have definitions in multiple dylibs throughout the shared cache (`/usr/lib/objc/libobjcMsgSendN.dylib`). This means that loading the target of `objc_msgSend` calls within `objc_msgSend$stub` functions is not sufficient to make selector definitions visible to analysis. Instead, we explicitly load `/usr/lib/libobjc.A.dylib` whenever we process a stub function that references `libobjcMsgSendN.dylib`.
* …`objc_msgSend$stub` stub functions. They no longer have symbol information since they live in stub island regions outside of images. This also helps with regular stripped Mach-O binaries.
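The target-based classification described above can be sketched roughly as follows. This is not the actual `AnalyzeStubFunction` logic (which lives in `SharedCacheWorkflow.cpp`): `CallTarget`, `StubKind`, `classify_stub`, `is_msgsend_shard`, and `images_to_load` are hypothetical names, and both the symbol-prefix check and the path pattern used for `libobjcMsgSendN.dylib` are assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum StubKind {
    ObjcMsgSend,
    Regular,
}

// Hypothetical description of a stub's resolved call target.
struct CallTarget<'a> {
    symbol: &'a str,
    image_path: &'a str,
}

const LIBOBJC: &str = "/usr/lib/libobjc.A.dylib";

// Classify by what the stub calls, not by the section that contains it.
// Assumption: "objc_msgSend and friends" share the objc_msgSend prefix.
fn classify_stub(target: &CallTarget) -> StubKind {
    let name = target.symbol.strip_prefix('_').unwrap_or(target.symbol);
    if name.starts_with("objc_msgSend") {
        StubKind::ObjcMsgSend
    } else {
        StubKind::Regular
    }
}

// Assumption: the per-region copies match /usr/lib/objc/libobjcMsgSend*.dylib.
fn is_msgsend_shard(image_path: &str) -> bool {
    image_path
        .strip_prefix("/usr/lib/objc/libobjcMsgSend")
        .and_then(|rest| rest.strip_suffix(".dylib"))
        .is_some()
}

// The target image alone is not enough for selector definitions, so
// libobjc.A.dylib is added explicitly for the msgSend copies.
fn images_to_load<'a>(target: &CallTarget<'a>) -> Vec<&'a str> {
    let mut images = vec![target.image_path];
    if classify_stub(target) == StubKind::ObjcMsgSend && is_msgsend_shard(target.image_path) {
        images.push(LIBOBJC);
    }
    images
}
```

Keying off the call target keeps the check independent of where the stub itself lives, which matters once stubs sit in island regions outside any image.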