Introduction
Suppose your program runs on Linux and is not supposed to terminate for a long time, as is typical of UNIX daemons. You want to upgrade it in some simple way without stopping it: for example, make a known function do some additional work without compromising its usual behavior. One approach is to inject new code into the running program so that it is triggered whenever an existing function is called. The example may be somewhat contrived, but it shows why injecting code into a running program is sometimes needed. The same technique is also relevant to how viruses inject themselves into running code.
In this article, I'll explain how to inject a C function into a running program on Linux without terminating it. Along the way we'll talk about the Executable and Linkable Format (ELF) used for Linux object files, and about object file sections, symbols and relocations.
Working Example Overview
I will explain step by step the code injection technique using the following simple example. The example consists of 3 components:
- Dynamic (shared) library libdynlib.so that is built from dynlib.hpp and dynlib.cpp C++ source files.
- Application app that is built from the app.cpp source file and is linked with the libdynlib.so library.
- The injection function located in the injection.cpp file.
Let us review the components' code.
print();
The dynlib.hpp header declares the print() function.
<< endl; }
The dynlib.cpp file implements the print() function, which just prints a counter (incremented on every call), the process id and a message.
The application app.cpp calls the print() function (from the libdynlib.so dynamic library), then sleeps for a few seconds, and keeps doing the same in an infinite loop.
The injection() function call is going to replace the print() function call in the application's main() function. The injection() function still calls the original print() function but also does some additional work. For example, it can run an external executable with the system() function, or just print the current date, as I do. (Judging by the output at the end of this article, the date is printed before the print() line.)
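For orientation, here is a minimal single-file sketch of what the three components might look like. It is an assumption based on the descriptions above and on the output shown later, not the article's actual listings: the helper format_line(), the run_app() wrapper (the real app.cpp loops forever in main()), the pause length and the counter's starting value are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>
#include <unistd.h>  // getpid(), sleep()

// Hypothetical helper: builds the line that print() writes.
std::string format_line(int counter, int pid) {
    std::ostringstream os;
    os << counter << ": PID " << pid << ": In print()";
    return os.str();
}

// dynlib.cpp (sketch). extern "C" keeps the symbol name unmangled,
// so it shows up as plain "print" in gdb and readelf.
extern "C" void print() {
    static int counter = 0;  // assumed starting value
    std::cout << format_line(counter++, static_cast<int>(getpid()))
              << std::endl;
}

// injection.cpp (sketch): extra work plus the original print().
extern "C" void injection() {
    std::system("date");  // prints the current date
    print();
}

// app.cpp (sketch): the real program runs this loop forever in main().
// `iterations` is a hypothetical bound so that the sketch can terminate.
void run_app(int iterations, unsigned pause_seconds) {
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i) {
        print();
        std::cout << "Going to sleep ..." << std::endl;
        sleep(pause_seconds);
        std::cout << "Waked up ..." << std::endl;
    }
}
```

In the real setup the three parts live in separate files: print() is built into libdynlib.so, injection.cpp only declares print(), and injection.o is never linked into app.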
Compile and Run the Application
Let us first compile the components with the g++ C++ compiler and the gcc C compiler.
    g++ -ggdb -Wall dynlib.cpp -fPIC -shared -o libdynlib.so
    g++ -ggdb app.cpp -ldynlib -ldynlib -L./ -o app
    gcc -Wall injection.cpp -c -o injection.o

    -rwxr-xr-x 1 gregory ftp 52248 Feb 12 02:05 app
    -rw-r--r-- 1 gregory ftp 1088 Feb 12 02:05 injection.o
    -rwxr-xr-x 1 gregory ftp 52505 Feb 12 02:05 libdynlib.so
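Note that the dynamic library libdynlib.so is compiled and linked with the -fPIC flag, which produces position-independent code, and that the injection object is compiled with the gcc driver. (Because the source file has a .cpp extension, gcc still treats it as C++, which is why a __gxx_personality_v0 symbol shows up in injection.o later.) We can now run the application app executable.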
    [lnx63:code_injection] ==> ./app
    : In print()
    Going to sleep ...
Getting into Debugger
The application app has passed a few loop iterations, but let's pretend that it has already been running for a few weeks, so it's now time to inject our new code without terminating the application. We'll use the Linux gdb debugger during the injection process. First we need to attach gdb to the application process 4184 (the application's process id, as printed in its output).
    [lnx63:code_injection] ==> gdb app 4184
    GNU gdb 6.3
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "i686-pc-.so.2
    (gdb)
Loading the Injection Code into the Executable Process Memory
As I mentioned above, the injection.o object file is not initially included in the app executable's process image. We first need to load injection.o into the process address space. This can be done with the mmap() system call, which maps the injection.o file into the app process address space. Let us do it in the debugger.
    (gdb) call open("injection.o", 2)
    $1 = 3
    (gdb) call mmap(0, 1088, 1 | 2 | 4, 1, 3, 0)
    $2 = 1073754112
    (gdb)
We first open the injection.o file with O_RDWR (value 2) for reading and writing. We need write access because later we'll make changes in the loaded injection code. The file descriptor returned for the opened file is 3. Then we bring the file into the process address space with the mmap() call. The call takes the file size (1088 bytes), the mapping protection PROT_READ | PROT_WRITE | PROT_EXEC (reading, writing and executing: 1 | 2 | 4), the mapping flags (1, which is MAP_SHARED on Linux), the open file descriptor (3) and a file offset of 0. It returns the starting address of the mapped file within the process address space: 1073754112.
We can verify that injection.o was indeed mapped into the process address space by looking at the /proc/[pid]/maps file (where pid is the process id, 4184 in our example), which on Linux describes the memory layout of a running process.
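For reference, the two gdb calls above correspond roughly to the following C++ code. This is only a sketch: map_file() is a hypothetical helper name, and it obtains the size with fstat() instead of hard-coding 1088. The open(), fstat(), mmap() and close() calls are the standard POSIX ones.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>     // open(), O_RDWR
#include <sys/mman.h>  // mmap(), PROT_*, MAP_SHARED, MAP_FAILED
#include <sys/stat.h>  // fstat()
#include <unistd.h>    // close()

// Hypothetical helper: maps a whole file with the given protection and
// returns the start address, or nullptr on failure.
// On success *size_out receives the file size in bytes.
void* map_file(const char* path, int prot, std::size_t* size_out) {
    assert(path != nullptr && size_out != nullptr);

    int fd = open(path, O_RDWR);  // same flag value (2) as in the gdb call
    if (fd < 0) return nullptr;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return nullptr;
    }

    // MAP_SHARED is the flag value 1 passed in the gdb session.
    void* addr = mmap(nullptr, static_cast<std::size_t>(st.st_size), prot,
                      MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return nullptr;

    *size_out = static_cast<std::size_t>(st.st_size);
    return addr;
}
```

The session above would correspond to map_file("injection.o", PROT_READ | PROT_WRITE | PROT_EXEC, &size). Because the mapping is shared, anything written through it is also carried back to the underlying file.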
    [lnx63:code_injection] ==> cat /proc/4184/maps
    006e1000-006f6000 r-xp 00000000 fd:00 394811 /lib/ld-2.3.4.so
    006f6000-006f7000 r-xp 00015000 fd:00 394811 /lib/ld-2.3.4.so
    006f7000-006f8000 rwxp 00016000 fd:00 394811 /lib/ld-2.3.4.so
    006ff000-00824000 r-xp 00000000 fd:00 394812 /lib/tls/libc-2.3.4.so
    00824000-00825000 r-xp 00124000 fd:00 394812 /lib/tls/libc-2.3.4.so
    00825000-00828000 rwxp 00125000 fd:00 394812 /lib/tls/libc-2.3.4.so
    00828000-0082a000 rwxp 00828000 00:00 0
    00832000-00853000 r-xp 00000000 fd:00 394813 /lib/tls/libm-2.3.4.so
    00853000-00855000 rwxp 00020000 fd:00 394813 /lib/tls/libm-2.3.4.so
    0096e000-00975000 r-xp 00000000 fd:00 394816 /lib/libgcc_s-3.4.6-20060404.so.1
    00975000-00976000 rwxp 00007000 fd:00 394816 /lib/libgcc_s-3.4.6-20060404.so.1
    00978000-00a38000 r-xp 00000000 fd:00 45535 /usr/lib/libstdc++.so.6.0.3
    00a38000-00a3d000 rwxp 000bf000 fd:00 45535 /usr/lib/libstdc++.so.6.0.3
    00a3d000-00a43000 rwxp 00a3d000 00:00 0
    08048000-08049000 r-xp 00000000 00:34 30468731 /store/fileril104/project/gregory/code_injection/app
    08049000-0804a000 rwxp 00000000 00:34 30468731 /store/fileril104/project/gregory/code_injection/app
    0804a000-0806b000 rwxp 0804a000 00:00 0
    40000000-40001000 r-xp 00000000 00:34 30468725 /store/fileril104/project/gregory/code_injection/libdynlib.so
    40001000-40002000 rwxp 00000000 00:34 30468725 /store/fileril104/project/gregory/code_injection/libdynlib.so
    40002000-40003000 rwxp 40002000 00:00 0
    40003000-40004000 rwxs 00000000 00:34 30468724 /store/fileril104/project/gregory/code_injection/injection.o
    4000f000-40011000 rwxp 4000f000 00:00 0
    bfffe000-c0000000 rwxp bfffe000 00:00 0
    ffffe000-fffff000 ---p 00000000 00:00 0
You can verify that /store/fileril104/project/gregory/code_injection/injection.o starts at address 0x40003000 (decimal 1073754112) and ends at address 0x40004000 within the process address space. The mappings of the other dynamic libraries are also shown in the above output. We now have all the components loaded in the process memory.
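The check above (hex range in the maps file against the decimal value returned by mmap()) can also be done mechanically. The following sketch parses the first two fields of a maps line. MapRange and parse_maps_line() are hypothetical names, and the parser assumes the "start-end perms ..." layout seen in the listing above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Start address, end address and permission string of one mapping.
struct MapRange {
    unsigned long start;
    unsigned long end;
    std::string perms;
};

// Hypothetical helper: parses "start-end perms ..." from one maps line.
// Returns false if the line does not begin with that pattern.
bool parse_maps_line(const std::string& line, MapRange* out) {
    assert(out != nullptr);
    unsigned long start = 0;
    unsigned long end = 0;
    char perms[5] = {0};  // four permission characters plus terminator
    if (std::sscanf(line.c_str(), "%lx-%lx %4s", &start, &end, perms) != 3) {
        return false;
    }
    out->start = start;
    out->end = end;
    out->perms = perms;
    return true;
}
```

Applied to the injection.o line from the listing, the parsed start address equals 1073754112, the value gdb printed for the mmap() call, and the permission string is "rwxs" (the trailing "s" marks a shared mapping).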
Relocations
Now it's time to look inside the application executable, which is in ELF format. We'll use the Linux readelf utility, which displays various data from ELF object files (that is, any object, library or executable file on Linux). We look at the symbol relocations in the app executable. We are interested in the relocation for the print() function call.
[lnx63:code_injection] ==> read R_386_JUMP_SLOT 0804874c _ZNSt8ios_base4InitD1E
As you can see, the print symbol relocation is located at the absolute (virtual) address (offset) 0x08049d24 in the app executable, and the type of this relocation is R_386_JUMP_SLOT. The relocation address is an absolute virtual address within the executable once it has been loaded into memory. Note that this relocation resides in the .rel.plt section of the executable binary image. PLT stands for Procedure Linkage Table, the table that provides an indirect call to a function. This means that when you call a function you don't jump directly to the function's location; you first jump to an entry in the Procedure Linkage Table, and from the PLT you jump to the actual function code. The location that the relocation patches, 0x08049d24, is the slot in the executable's Global Offset Table through which the PLT entry for print() jumps. This indirection is necessary when you call a function that resides in a dynamic library (libdynlib.so in our example), because you do not know in advance at what address in the process address space the dynamic libraries will be loaded, or in which dynamic library the required function (print() in our example) will be found first. All this knowledge becomes available only when the application is loaded into memory, and it is then the job of the dynamic linker (ld-linux.so on Linux) to resolve relocations so that the requested function is called correctly (for function calls this is typically deferred until the function is first called). In our example the dynamic linker loads the libdynlib.so library into the process address space, finds the address of the print() function in the library and stores this address at the relocation address 0x08049d24.
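The indirection described above can be modelled in a few lines of ordinary C++. This is only an analogy, not real PLT code: print_slot stands in for the writable slot at 0x08049d24, call_print() for the PLT entry, and all names here are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> call_log;  // records which functions ran

// Stand-in for the library function print().
void print_impl() { call_log.push_back("print"); }

// Stand-in for injection(): extra work plus the original function.
// It calls print_impl directly rather than through the slot, so
// redirecting the slot cannot make it call itself.
void injection_impl() {
    call_log.push_back("extra work");
    print_impl();
}

using Fn = void (*)();

// Plays the role of the slot that the dynamic linker fills in.
Fn print_slot = &print_impl;

// Plays the role of the PLT entry: an indirect call through the slot.
void call_print() {
    assert(print_slot != nullptr);
    print_slot();
}
```

Overwriting print_slot with &injection_impl changes what every later call_print() does without touching the caller's code, which is the effect the injection aims for.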
Our goal is to replace the address of the print() function with the address of the injection() function from the injection.o object file, which was not part of the process image when the program started running.
More information on the ELF format, relocations and the dynamic linker can be found in the Executable and Linkable Format (ELF) document.
We can check that the address 0x08049d24 currently contains the address of the print() function.
    (gdb) p & print
    $4 = (void (*)(void)) 0x40000be8 <print>
    (gdb) p/x * 0x08049d24
    $5 = 0x40000be8
    (gdb)
The address of the injection() function can be found by running readelf -s (which displays an object file's symbol table) on the injection.o file.
    [lnx63:code_injection] ==> readelf -s injection.o

    Symbol table '.symtab' contains 13 entries:
       Num:    Value  Size Type    Bind   Vis      Ndx Name
         0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 00000000     0 FILE    LOCAL  DEFAULT  ABS injection.cpp
         2: 00000000     0 SECTION LOCAL  DEFAULT    1
         3: 00000000     0 SECTION LOCAL  DEFAULT    3
         4: 00000000     0 SECTION LOCAL  DEFAULT    4
         5: 00000000     0 SECTION LOCAL  DEFAULT    5
         6: 00000000     0 SECTION LOCAL  DEFAULT    6
         7: 00000000     0 SECTION LOCAL  DEFAULT    8
         8: 00000000     0 SECTION LOCAL  DEFAULT    9
         9: 00000000    25 FUNC    GLOBAL DEFAULT    1 injection
        10: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND system
        11: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND print
        12: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __gxx_personality_v0
The function (symbol) injection is located at offset 0 in the .text section of the injection.o object file (section index 1 in the Ndx column). The .text section itself starts at offset 0x000034 in the injection.o file.
    [lnx63:code_injection] ==> readelf -S injection.o
    There are 13 section headers, starting at offset 0x104:

    Section Headers:
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            00000000 000000 000000 00      0   0  0
      [ 1] .text             PROGBITS        00000000 000034 000019 00  AX  0   0  4
      [ 2] .rel.text         REL             00000000 000418 000018 08     11   1  4
      [ 3] .data             PROGBITS        00000000 000050 000000 00  WA  0   0  4
      [ 4] .bss              NOBITS          00000000 000050 000000 00  WA  0   0  4
      [ 5] .rodata           PROGBITS        00000000 000050 000005 00   A  0   0  1
      [ 6] .eh_frame         PROGBITS        00000000 000058 000038 00   A  0   0  4
      [ 7] .rel.eh_frame     REL             00000000 000430 000010 08     11   6  4
      [ 8] .note.GNU-stack   NOTE            00000000 000090 000000 00      0   0  1
      [ 9] .comment          PROGBITS        00000000 000090 000012 00      0   0  1
      [10] .shstrtab         STRTAB          00000000 0000a2 00005f 00      0   0  1
      [11] .symtab           SYMTAB          00000000 00030c 0000d0 10     12   9  4
      [12] .strtab           STRTAB          00000000 0003dc 00003b 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings)
      I (info), L (link order), G (group), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
Replacing the print() Function with injection() Function
I would like to remind you that the injection.o file was loaded into the process memory at address 0x40003000 (see above). So the final absolute address of the injection() function within the process is 0x40003000 + 0x000034 = 0x40003034.
We now store this address at the print() function relocation address 0x08049d24.
    (gdb) set * 0x08049d24 = 0x40003000 + 0x000034
    (gdb)
At this point, we have successfully replaced the call to print() with a call to the injection() function.
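The address arithmetic used in this section can be written down as a small helper. This is a sketch for a 32-bit process, and address_in_process() is a hypothetical name. It relies on the whole file being mapped from file offset 0, as was done above, so that a file offset translates directly into an offset from the mapping's start.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turns "offset inside a section of a mapped object
// file" into an absolute address in a 32-bit process.
std::uint32_t address_in_process(std::uint32_t map_base,
                                 std::uint32_t section_file_offset,
                                 std::uint32_t offset_in_section) {
    std::uint64_t sum = std::uint64_t{map_base} + section_file_offset +
                        offset_in_section;
    assert(sum <= UINT32_MAX);  // must still fit in the 32-bit address space
    return static_cast<std::uint32_t>(sum);
}
```

With the values from the session (mapping base 0x40003000, .text at file offset 0x34, injection at offset 0 in .text), address_in_process(0x40003000, 0x34, 0) gives 0x40003034, the value written to 0x08049d24.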
Resolving injection() Function Relocations
However, there is still some work to be done. The code of the injection() function is not ready to run yet because it has 3 unresolved relocations.
    [lnx63:code_injection] ==> readelf -r injection.o

    Relocation section '.rel.text' at offset 0x418 contains 3 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00000009  00000501 R_386_32          00000000   .rodata
    0000000e  00000a02 R_386_PC32        00000000   system
    00000013  00000b02 R_386_PC32        00000000   print

    Relocation section '.rel.eh_frame' at offset 0x430 contains 2 entries:
     Offset     Info    Type            Sym.Value  Sym. Name
    00000011  00000c01 R_386_32          00000000   __gxx_personality_v0
    00000024  00000201 R_386_32          00000000   .text
The first relocation, .rodata, points to the "date" constant string stored in the .rodata read-only data section; the second, system, refers to the system() function call; and the third, print, refers to the print() function call. Note that all three relocations reside in the .rel.text section, that is, their offsets are relative to the beginning of the .text section.
We resolve all three relocations manually by writing the appropriate addresses to these three memory locations. The addresses of the relocations within the process address space are calculated by summing up:
- The injection.o starting address (0x40003000) within the process address space.
- The .text section starting offset 0x000034 within the injection.o object file.
- The relocation offset relative to the .text section (0x00000009 for .rodata, 0x0000000e for system and 0x00000013 for print).
Note that the system and print relocations are of the R_386_PC32 type. This means that the value (resolved address) stored at the relocation location must be calculated relative to the PC (program counter), that is, relative to the relocation location itself. An R_386_PC32 relocation also requires that the value stored at the relocation location before resolution (the addend) be added to the resolved address. The R_386_32 .rodata relocation also adds the addend to its resolved address. In the usual ELF notation, with S the symbol address, A the addend and P the address of the relocation location, R_386_32 stores S + A and R_386_PC32 stores S + A - P.
    (gdb) p & system
    $7 = (<text> *) 0x733650 <system>    // Address of the system() function
    (gdb) p * (0x40003000 + 0x000034 + 0x0000000e)
    $8 = -4                              // Addend of the system relocation
    (gdb) set * (0x40003000 + 0x000034 + 0x0000000e) = 0x733650 - (0x40003000 + 0x000034 + 0x0000000e) - 4
    (gdb) p & print
    $9 = (void (*)(void)) 0x40000be8 <print>    // Address of the print() function
    (gdb) p * (0x40003000 + 0x000034 + 0x00000013)
    $10 = -4                             // Addend of the print relocation
    (gdb) set * (0x40003000 + 0x000034 + 0x00000013) = 0x40000be8 - (0x40003000 + 0x000034 + 0x00000013) - 4
    (gdb) p * (0x40003000 + 0x000034 + 0x00000009)
    $11 = 0                              // Addend of the .rodata relocation
    (gdb) set * (0x40003000 + 0x000034 + 0x00000009) = 0x40003000 + 0x000050
    // 0x000050 is the offset of the .rodata section within the injection.o object file.
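The arithmetic in the gdb session above can be cross-checked with a short sketch. The function names are hypothetical. The formulas are the usual i386 ELF ones, with S the symbol address, A the addend read from the relocation location and P the address of that location. All arithmetic is done modulo 2^32, which is what storing into a 32-bit word amounts to.

```cpp
#include <cassert>
#include <cstdint>

// Address of a relocation location inside the mapped injection.o:
// mapping base + .text file offset + relocation offset within .text.
std::uint32_t reloc_place(std::uint32_t map_base, std::uint32_t text_offset,
                          std::uint32_t r_offset) {
    std::uint64_t sum = std::uint64_t{map_base} + text_offset + r_offset;
    assert(sum <= UINT32_MAX);  // must fit in a 32-bit address
    return static_cast<std::uint32_t>(sum);
}

// R_386_32: absolute address, S + A.
std::uint32_t reloc_386_32(std::uint32_t S, std::int32_t A) {
    return S + static_cast<std::uint32_t>(A);
}

// R_386_PC32: address relative to the relocation location, S + A - P.
// Unsigned arithmetic wraps around, matching a 32-bit store.
std::uint32_t reloc_386_pc32(std::uint32_t S, std::int32_t A,
                             std::uint32_t P) {
    return S + static_cast<std::uint32_t>(A) - P;
}
```

For the system relocation, for example, P is reloc_place(0x40003000, 0x34, 0x0e) = 0x40003042, and reloc_386_pc32(0x733650, -4, P) is the 32-bit value that the corresponding gdb set command stores.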
We have now resolved all three relocations within the injection() function code, and we are done. We exit the debugger. The application continues running and now does the additional job of printing the current date.
    (gdb) quit
    The program is running. Quit anyway (and detach it)? (y or n) y
    Detaching from program: /store/fileril104/project/gregory/code_injection/app, process 4184
    [lnx63:code_injection] ==>

    // The application execution continues
    Waked up ...
    Thu Feb 12 20:09:40 IST 2009
    4: PID 4184: In print()
    Going to sleep ...
    Waked up ...
    Thu Feb 12 20:09:43 IST 2009
    5: PID 4184: In print()
    Going to sleep ...
    Waked up ...
    Thu Feb 12 20:09:46 IST 2009
    6: PID 4184: In print()
    Going to sleep ...
    Waked up ...
    Thu Feb 12 20:09:49 IST 2009
    7: PID 18138: In print()
    Going to sleep ...
    Waked up ...
That's it.
Conclusion
I showed how one can inject a C function into a running program on Linux without terminating it. Note that the process memory manipulations demonstrated here are allowed only for processes that you own or for which you have the appropriate permissions.