,, MMP""MM""YMM `7MM P' MM `7 MM MM MMpMMMb. .gP"Ya MM MM MM ,M' Yb MM MM MM 8M"""""" MM MM MM YM. , .JMML. .JMML JMML.`Mbmmd' `7MMF' `7MF' `7MMF' `7MMF' `MA ,V MM MM VM: ,V `7M' `MF' MM MM .gP"Ya ,6"Yb.`7M' `MF'.gP"Ya `7MMpMMMb. MM. M' `VA ,V' MMmmmmmmMM ,M' Yb 8) MM VA ,V ,M' Yb MM MM `MM A' XMX MM MM 8M"""""" ,pm9MM VA ,V 8M"""""" MM MM :MM; ,V' VA. MM MM YM. , 8M MM VVV YM. , MM MM VF .AM. .MA..JMML. .JMML.`Mbmmd' `Moo9^Yo. W `Mbmmd'.JMML JMML. ,, ,, ,, .g8"""bgd `7MM `7MM mm db .dP' `M MM MM MM dM' ` ,pW"Wq. MM MM .gP"Ya ,p6"bo mmMMmm `7MM ,pW"Wq.`7MMpMMMb. MM 6W' `Wb MM MM ,M' Yb 6M' OO MM MM 6W' `Wb MM MM MM. 8M M8 MM MM 8M"""""" 8M MM MM 8M M8 MM MM `Mb. ,'YA. ,A9 MM MM YM. , YM. , MM MM YA. ,A9 MM MM `"bmmmd' `Ybmd9'.JMML..JMML.`Mbmmd' YMbmd' `Mbmo.JMML.`Ybmd9'.JMML JMML. -- Contact -- https://twitter.com/vxunderground email@example.com
In this paper, I'll discuss how to make a linux virus. Of course you won't use this to make one.
Before let's play, lemme tell ya that the enligsh language is not my native one. There are few code here, as many things are theorical (but applicable, don't fear). Some asm skillz is required, and *nx knowleadge is better. Of course this paper wuz written with vi :)
Here i'll describe what you need, and how to use it.
you can use strip to get some stripped ELFs (for testing purposes) for oldschooler like me, vi is a decent asm editor. many values, infos, structures are in your kernel srcs. use grep. The syntax to compil is quiet easy:
nasm -f elf virus.asm cc virus.o -o virus
Linux assembly is 32 bits, and use syscalls. It's something between win32 and dos - that's why i love it. The syscalls are functions called by int 80h. It's a gate to C libz. eax holds the function number and the others hold the arguments in order: ebx 1st, ecx 2nd, and so on up to ebp, the 7th. You have the list of the function numberz in /usr/include/asm/unistd.h For example, opening a file looks like:
The return is an handle used for subsequent calls, like for dos or win32. Finding syscall usage is not always an easy task. The few docs i found about it are mentionned at the 4.3 section. Return values are stored in eax, or structures pointed by argz. Some syscalls require you to have superuser priviledges.
I hope you like brut infos :) I won't describe everything here, as I ripped the stuff from better docs. You'll find where to get them at the 4.3 section.
ELF format is really flexible; the only fixed part is the ELF header : at BOF. In practice, most of the ELFs you'll find looks like:
elf_header (34h bytes) program_header (6*20h btyes) sections 1 sections 2 ... sections x sections_header (x*28h bytes)
Basically, elf_header is used to give general informations about other headers.
|0||e_ident||holds the magic values 0x7f,'ELF' and some flags|
|+10||e_type||this word contains the file type (core, exe, ...)|
|+12||e_machine||give the machine needed for running (3 = x86)|
|+14||e_version||ELF header version. Currently 1.|
|+18||e_entry||virutal address of entry point. I can see you smiling.|
|+1c||e_phoff||program header offset (see below)|
|+20||e_shoff||sections header offset (see below)|
|+24||e_flags||some other flags (processor specific nfos)|
|+28||e_ehsize||size of the ELF header|
|+2A||e_phentsize||size of one entry in the program header|
|+2C||e_phnum||number of entrys in the program header|
|+2E||e_shentsize||size of one entry in the section header|
|+30||e_shunum||number of entrys in the section header|
|+32||e_shstrndx||give the entry number of the name string section (if exists)|
|0||sh_name||contains a pointer to the name string section giving the|
|+4||sh_type||give the section type [name of this section|
|+8||sh_flags||some other flags ...|
|+c||sh_addr||virtual addr of the section while running|
|+10||sh_offset||offset of the section in the file|
|+14||sh_size||zara white phone numba|
|+18||sh_link||his use depends on the section type|
|+1c||sh_info||depends on the section type|
|+24||sh_entsize||used when section contains fixed size entrys|
|0||p_type||type of segment|
|+4||p_offset||offset in file where to start the segment at|
|+8||p_vaddr||his virtual address in memory|
|+c||p_addr||physical address (if relevant, else equ to p_vaddr)|
|+10||p_filesz||size of datas read from offset|
|+14||p_memsz||size of the segment in memory|
|+18||p_flags||segment flags (rwx perms)|
As i said above, i won't waste your HD space with tons of flag descriptions. You'll find them in the - nice - ELF documention (urlz at 4.3). Everything is document. It's the linux world. So use the source dudez !!!
A lotta things happen when executing the file. (don't forget to check you kernel srcs) The kernel reads the program header and compute segments from the file, then the sections are defined and allocated. You'll notice the modifications while debugging with gdb: before and after 'run' the memory is not in the same state, so set a breakpoint at the virutal entry point to read allocated datas. Finally, miscellanous dynamic values are set.
Sorry for this ego abuse, but after all, that's a nice way to infect ELFs that i'll describe here, and here's the story of how i elaborated it.
I tryied a lotta things. At first i though that adding a section at EOF and writing my viral code right after should work. The code is no more accessible at run time, so it segmentation faulted :(
Then i noticed the hole in virutal memory between the code segment and the data one. But i wuzn't able to use it (and believe me, i tryied many things). Overwriting not really used section is not safe, so forget it.
Finally I decided to look aroung the segments, not the section. Adding a segment wuz possible, but when i tryied to relocate the program header i came in a lot of troubles. So i decided to enlarge one. As i wanted to code a non-destructive virus, i choose to append to the file, not to overwrite a part of it. The only way then wuz to enlarge a segment. As i append to the file, the only enlargable segment was the data one. The new virtual entry point is easy to calculate from the virutal address of the data segment. I wuz able to run code before the host. I used the old virutal entry point to get back to the host, but it segf'd once again.
The data segment size is not the same in memory and in the file. It now contains the rest of the file in memory, with our viral code appended. I thought the segfs came from that. And i wuz right. (for once) The .bss section problem arised. Normally, this section's flag is SHT_NOBITS. As you're not talking kernelian, i'll tell ya what it means: this section should not contain bytes from the programs. But i know understand why the data segment stops just before it in the file; the program bytes are copied in memory up to our code. And the .bss section has to be filled of zero. So i decided to overwrite it w/ zeros at runtime before returning to the host. Once again it segf'd. Damn.
With further testing I discovered the role of the .rel sections. Especially the .rel.bss section. At runtime, the beginning of the .bss section containes relocated values. And i did overwrite then. My first attempt to bypass this problem wuz to check the other sections between the .bss file offset and our viral code, and to overwrite it with zeroes. No, i didn't overwrite the section headers. I'm not so dumb :) When the kernel execute the file, the .bss is already zeroed and the recolation won't be overwritten. This should work! The code wuz bigger (computing zeroable section). But once again i got a lotta troubles ; the .bss section if often bigger than the rest of the file. Yes I overwritten my viral code before discovering it.
Crashing half of the hosts is not a solution (at least for me). So i took the other solution (may be you already got it) : getting back to the runtime patch and overwrite the .bss section before returning, but avoiding the relocations. Hopefully the relocations things are quiet simple. I included a rutine to get the farest relocation in memory and started to zero the .bss section from there up to the viral code. No. This should have been to easy. Once again i overwritten my viral code.
May be you're clever than me and already guess why; because of the .bss size. The last thing to do wuz to add a rutine to migrate behind the .bss section in memory before patching it. et voila.
This is a nice story, but it's not yet ended. A bad point is that's only applicable to regular ELFs: compiled C, (6 entry in the program header, a .bss section, regular entrys sizes, ...) And if you strip an infected ELF, it won't work anymore. I also encountered some problemz w/ proggyz using the kdelib.
Now that we can infect files, let's find some.
For linux, a directory is a file like the others. This is a nice string, but that's not really true :) You can use the open function to access it, but you can't read it with the 'read' function. You have to use 'readdir'. Close it with 'close'. You can't open it with r/w flags. Open it r/o. Many syscalls which requires a file descriptor are applicable with a directory descriptor. Thoses descriptorz are called handle in dos/win32. In case it's not allowed, the syscall return EISDIR (-21). This is a polite way to say "get the hell outta here, it's a fuckin directory."
As I not yet ended the way to stay TSR, do it yourself or stay a runtimer linux virus writer. Being a RLVW :) obliges you to seek directories. You can seek the current directory by opening '.' . You can of course access the '..' directory, but beware to not loop when reaching the root. Don't waste the user time by seeking the common directories filled of ELFs if you're not suid, so check it before.
Please don't waste this promised land with lame viruses. Respect the linux users and don't fill the world of shit. So add plenty of nice functions. Here are the ten (old) gold rules;
As you found a lotta great ideas while reading the ralf's int list, you'll have some nice one while reading syscalls descriptions. It's a pitie that it's not better documented, but i don't think that many ppl thought it could be usefull. "who will code in asm under linux anyway ?" The umask syscall can be funny in a payload. Adding socket operation can be nice too. Adding a security breach to the system is really tempting. I'm currently looking aroung the ptrace syscall to make a kind of TSRing... But it may has many other interesting implementations. Thanx to kernel panic who pointed it. (in xine 2 or 3, i don't remember) I'm sure they are a lotta things to do w/ the various signals.
It's still small for now :/
Most of the stuff will be found here: (be sure to get the ELF documentation and the syscall descriptions) http://lightning.voshod.com/asm
Some other interesting (?) piece of paperz can be found at: http://bewoner.dma.be/janw/eng.html
And the unix-virus mailing list that i just discovered (thnx to Quantum): http://virus.beergrave.net
Greetz to the whole FSA, and in misorder: Doctor L, mist, darkman, yesna, reptile, raid, billy b., SSR, mammoth, T2, buzz, acidbytes, owl, evul, morphine, gigabyte, and all others.. And to you, linux users: forgive me for having written this paper.
A special though to all arrested vxers. past, present, and futur.
A special handshake to a non-coder friend of mine, YOGI.
Linux is not what it used to be anymore.