I recently analyzed a malware sample that parsed and modified raw network packet data, which meant dealing with many register-relative offsets in IDA Pro. The most practical way to handle these is to define and apply structs.
While IDA’s type libraries contain some of the packet structures (e.g., ETHERNET_FRAME, IP, TCP, and UDP_HEADER), other protocols (e.g., ARP) are missing. Additionally, I did not find structures encompassing multiple communication layers, for example ETHERNET_FRAME, IP, and TCP grouped in one structure.
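To illustrate why a combined structure is handy, here is a minimal sketch in plain Python (not IDAPython) that computes field offsets relative to the start of a packet when the Ethernet, IPv4, and TCP headers are stacked. The field names are hypothetical and are not IDA's type library names. The layout assumes an untagged Ethernet frame and an IPv4 header without options (20 bytes).

```python
import struct

# Hypothetical field names; struct format codes give each field's size.
ETHERNET = [("eth_dst", "6s"), ("eth_src", "6s"), ("eth_type", "H")]
IPV4 = [  # assumes no IP options (IHL == 5)
    ("ip_ver_ihl", "B"), ("ip_tos", "B"), ("ip_total_len", "H"),
    ("ip_id", "H"), ("ip_flags_frag", "H"), ("ip_ttl", "B"),
    ("ip_proto", "B"), ("ip_checksum", "H"),
    ("ip_src", "4s"), ("ip_dst", "4s"),
]
TCP = [
    ("tcp_sport", "H"), ("tcp_dport", "H"), ("tcp_seq", "I"),
    ("tcp_ack", "I"), ("tcp_off_flags", "H"), ("tcp_window", "H"),
    ("tcp_checksum", "H"), ("tcp_urgent", "H"),
]


def field_offsets(*layers):
    """Return ({field name: offset from packet start}, total size)."""
    offsets = {}
    pos = 0
    for layer in layers:
        for name, fmt in layer:
            offsets[name] = pos
            # "!" selects network byte order with no padding.
            pos += struct.calcsize("!" + fmt)
    return offsets, pos


offsets, total = field_offsets(ETHERNET, IPV4, TCP)
for name, off in offsets.items():
    print(f"+0x{off:02X}  {name}")
```

With a table like this, an operand such as [esi+24h] can be matched to a field name directly instead of subtracting header sizes by hand.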
Highlighting important and suspicious instructions helps me tremendously to understand and quickly navigate a disassembled binary. Every time I browse a freshly opened binary in IDA Pro I feel that something is missing. Reversing without colors is less fun!
Look at the two screenshots below. I’m having a much easier time navigating the highlighted disassembly on the right.
When triaging malicious executable files, I always run the FireEye Labs Obfuscated String Solver (FLOSS) to quickly decode obfuscated strings. In short, FLOSS uses heuristics to identify decoding routine candidates and emulates them using vivisect’s disassembly and emulation modules.
While vivisect is an awesome tool, it is sometimes not as robust as IDA Pro in parsing and disassembling binaries. In addition, IDA Pro provides the Fast Library Identification and Recognition Technology (FLIRT), which helps to distinguish standard library functions from functions written by the program’s author.
One of IDA Pro’s most important features is that it allows us to interactively modify the disassembly, hence the I in IDA. This includes renaming functions, variables, and addresses. IDA Pro refers to these names as identifiers and enforces a certain naming scheme on them. After working with IDA Pro for a couple of weeks, most people develop a good understanding of which names are valid and what to avoid when renaming identifiers. However, I wanted to know how IDA Pro actually checks identifiers, and this blog post describes my findings. In addition, I discuss the character encoding used for comments in IDA Pro. Adding comments to a disassembled program is another useful feature many reverse engineers take advantage of. While users normally don’t have to worry about the comment encoding, this information can be handy in certain situations, especially when dealing with comments in IDAPython scripts.
In this blog post I am going to discuss how you can interact with basic blocks in IDAPython. Before we jump into the technical details, I want to provide some context and show why I became interested in exploring this feature of IDA Pro.
Background and Motivation
The other day I reverse engineered a backdoor that was heavily armored with two classic anti-disassembly techniques. The first technique substitutes jmp instructions with sequences of push and retn instructions. Figure 1 shows how this hinders the program’s control flow analysis. First, IDA Pro interprets the retn instruction as the end of a function. Second, IDA Pro is not able to identify the target addresses as code and hence does not disassemble them.
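The push/retn substitution can be sketched outside of IDA with a simple byte scan. The snippet below is a minimal illustration, not the approach from this post. It assumes 32-bit x86 code and the specific encoding push imm32 (opcode 0x68) immediately followed by retn (opcode 0xC3); the sample may well use other variants. The function name and base address are hypothetical. A raw byte scan can also match inside data or in the middle of other instructions, so hits are only candidates.

```python
import struct


def find_push_ret(code: bytes, base: int = 0):
    """Return (address, jump target) pairs for 'push imm32; retn' candidates."""
    hits = []
    # The pattern is 6 bytes long: 68 <imm32, little-endian> C3
    for i in range(len(code) - 5):
        if code[i] == 0x68 and code[i + 5] == 0xC3:
            (target,) = struct.unpack_from("<I", code, i + 1)
            hits.append((base + i, target))
    return hits


# nop; push 0x401000; retn; nop
sample = b"\x90\x68\x00\x10\x40\x00\xc3\x90"
hits = find_push_ret(sample, base=0x401000)
for addr, target in hits:
    print(f"0x{addr:08X}: push/retn acts like jmp 0x{target:08X}")
```

Each candidate pair gives the address of the push and the value it pushes, which is where execution continues after the retn, so it behaves like the jmp it replaced.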