
Hooking the native API and controlling process creation on a system-wide basis

Introduction

Recently I came across the description of a quite interesting security product called Sanctuary. This product prevents execution of any program that does not appear on the list of software that is allowed to run on a particular machine. As a result, the PC user is protected against various add-on spyware, worms and trojans: even if some piece of malware finds its way onto the computer, it has no chance of being executed and, hence, no chance of causing any damage to the machine. I found this feature interesting and, after a bit of thinking, came up with my own implementation of it. This article therefore describes how process creation can be programmatically monitored and controlled on a system-wide basis by hooking the native API.

This article makes a "bold" assumption that the target process is being created by user-mode code (shell functions, CreateProcess(), manual process creation as a sequence of native API calls, etc). Although, theoretically, a process may be launched by kernel-mode code, this possibility is, for practical purposes, negligible, so we don't have to worry about it. Why? Think about it logically: in order to launch a process from kernel mode, one has to load a driver, which, in turn, implies execution of some user-mode code in the first place. Therefore, in order to prevent execution of unauthorized programs, we can safely limit ourselves to controlling process creation by user-mode code on a system-wide basis.

Defining our strategy

First of all, let's decide what we have to do in order to monitor and control process creation on a system-wide basis.

Process creation is a fairly complex thing that involves quite a lot of work (if you don't believe me, disassemble CreateProcess() and see it with your own eyes). In order to launch a process, the following steps have to be taken:

  1. The executable file has to be opened for FILE_EXECUTE access.
  2. The executable image has to be loaded into RAM.
  3. The Process Executive Object (EPROCESS, KPROCESS and PEB structures) has to be set up.
  4. Address space for the newly created process has to be allocated.
  5. The Thread Executive Object for the primary thread of the process (ETHREAD, KTHREAD and TEB structures) has to be set up.
  6. The stack for the primary thread has to be allocated.
  7. The execution context for the primary thread of the process has to be set up.
  8. The Win32 subsystem has to be informed about the new process.

In order for any of these steps to succeed, all previous steps have to be completed successfully (you cannot set up an Executive Process Object without a handle to the executable section; you cannot map an executable section without a file handle, etc). Therefore, if we abort any of these steps, all subsequent ones will fail as well, and process creation will be aborted. Each of the above steps is carried out by calling certain native API functions. Therefore, in order to monitor and control process creation, all we have to do is hook those API functions that cannot be bypassed by the code that is about to launch a new process.

Which native API functions should we hook? Although NtCreateProcess() seems to be the most obvious answer to the question, this answer is wrong - it is possible to create a process without calling this function. For example, CreateProcess() sets up process-related kernel-mode structures without calling NtCreateProcess(). Therefore, hooking NtCreateProcess() is of no help to us.

In order to monitor process creation, we have to hook either NtCreateFile() and NtOpenFile(), or NtCreateSection() - there is no way to run an executable file without making these API calls. If we decide to monitor calls to NtCreateFile() and NtOpenFile(), we have to distinguish between process creation and regular file IO operations. This task is not always easy. For example, what should we do if some executable file is being opened for FILE_ALL_ACCESS? Is it just an IO operation, or is it a part of process creation? It is hard to make any judgment at this point - we need to see what the calling thread is about to do next. Therefore, hooking NtCreateFile() and NtOpenFile() is not the best possible option.

Hooking NtCreateSection() is a much more reasonable thing to do - if we intercept a call to NtCreateSection() that requests mapping the executable file as an image (SEC_IMAGE attribute), combined with a request for page protection that allows execution, we can be sure that a process is about to be launched. At this point we are able to make a decision and, if we don't want the process to be created, make NtCreateSection() return STATUS_ACCESS_DENIED. Therefore, in order to gain full control over process creation on the target machine, all we have to do is hook NtCreateSection() on a system-wide basis.
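The test described above can be sketched as a small predicate. This is only an illustration in portable C, not the driver code itself: the constants carry the values defined in the Windows SDK headers, while the helper name looks_like_image_mapping() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the Windows SDK (winnt.h). */
#define PAGE_EXECUTE           0x10u
#define PAGE_EXECUTE_READ      0x20u
#define PAGE_EXECUTE_READWRITE 0x40u
#define PAGE_EXECUTE_WRITECOPY 0x80u
#define SEC_IMAGE              0x01000000u

/* Hypothetical helper: returns true when a section request asks for an
 * image mapping AND for a page protection that allows execution. */
static bool looks_like_image_mapping(uint32_t page_protection,
                                     uint32_t allocation_attributes)
{
    const uint32_t exec_mask = PAGE_EXECUTE | PAGE_EXECUTE_READ |
                               PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_WRITECOPY;

    return (page_protection & exec_mask) != 0 &&
           (allocation_attributes & SEC_IMAGE) != 0;
}
```

A request that fails either half of the test is treated as ordinary section creation and is let through without further checks.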

Like any other stub from ntdll.dll, NtCreateSection() loads EAX with the service index, makes EDX point to the function parameters, and transfers execution to the kernel-mode system service dispatcher, KiSystemService() (this is done by the INT 0x2E instruction under Windows NT/2000 and by the SYSENTER instruction under Windows XP). After validating the function parameters, the dispatcher transfers execution to the actual implementation of the service, the address of which is available from the Service Descriptor Table (a pointer to this table is exported by ntoskrnl.exe as the KeServiceDescriptorTable variable, so it is available to kernel-mode drivers). The Service Descriptor Table is described by the following structure:


struct SYS_SERVICE_TABLE { 
    void **ServiceTable; 
    unsigned long CounterTable; 
    unsigned long ServiceLimit; 
    void **ArgumentsTable; 
};      

The ServiceTable field of this structure points to the array that holds the addresses of all the functions that implement system services. Therefore, all we have to do in order to hook any native API function on a system-wide basis is to write the address of our proxy function to the i-th entry (i is the service index) of the array pointed to by the ServiceTable field of KeServiceDescriptorTable.
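The idea of swapping one table entry while remembering the original can be shown with an ordinary array of function pointers. The following is a user-mode simulation under stated assumptions: struct service_table, hook_service() and the two toy services are hypothetical names, and nothing here touches the real KeServiceDescriptorTable.

```c
#include <stddef.h>

typedef long (*service_fn)(int arg);

/* Simplified stand-in for the Service Descriptor Table. */
struct service_table {
    service_fn   *entries;  /* plays the role of ServiceTable */
    unsigned long limit;    /* plays the role of ServiceLimit */
};

static service_fn real_callee;  /* original entry, saved before hooking */

/* Toy "service implementation". */
static long real_service(int arg) { return arg + 1; }

/* Toy proxy: rejects negative arguments, forwards everything else. */
static long proxy_service(int arg)
{
    if (arg < 0)
        return -1;
    return real_callee(arg);
}

/* Saves the current entry at `index` into *saved and installs `proxy`.
 * Returns 0 on success, -1 when the index is out of range. */
static int hook_service(struct service_table *table, unsigned long index,
                        service_fn proxy, service_fn *saved)
{
    if (index >= table->limit)
        return -1;
    *saved = table->entries[index];
    table->entries[index] = proxy;
    return 0;
}
```

Note the order inside hook_service(): the original pointer is saved before the entry is overwritten, so the proxy always has a valid target to forward to.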

Looks like now we know everything we need to know in order to monitor and control process creation on a system-wide basis. Let's proceed to the actual work.

Controlling process creation

Our solution consists of a kernel-mode driver and a user-mode application. In order to start monitoring process creation, our application passes the service index corresponding to NtCreateSection(), plus the address of the exchange buffer, to our driver. This is done by the following code:


//open device
device=CreateFile("\\\\.\\PROTECTOR",GENERIC_READ|GENERIC_WRITE, 
       0,0,OPEN_EXISTING, FILE_ATTRIBUTE_SYSTEM,0);

// get index of NtCreateSection, and pass it to the driver, along with the
//address of output buffer
DWORD * addr=(DWORD *)
   (1+(DWORD)GetProcAddress(GetModuleHandle("ntdll.dll"), 
                                     "NtCreateSection"));
ZeroMemory(outputbuff,256);
controlbuff[0]=addr[0];
controlbuff[1]=(DWORD)&outputbuff[0];
DeviceIoControl(device,1000,controlbuff,256,controlbuff,256,&dw,0);      

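The code is almost self-explanatory - the only thing that deserves a bit of attention is the way we get the service index. All stubs from ntdll.dll start with the instruction MOV EAX, ServiceIndex, which applies to any 32-bit x86 version and flavour of Windows NT. This is a 5-byte instruction, with the MOV EAX opcode as the first byte and the service index as the remaining four bytes. That is why the code above reads a DWORD at offset 1 from the start of the stub.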
Now let's look at what our driver does when it receives IOCTL from our application:


NTSTATUS DrvDispatch(IN PDEVICE_OBJECT device,IN PIRP Irp)
{
    UCHAR*buff=0; ULONG a,base;

    PIO_STACK_LOCATION loc=IoGetCurrentIrpStackLocation(Irp);

    if(loc->Parameters.DeviceIoControl.IoControlCode==1000)
    {
        buff=(UCHAR*)Irp->AssociatedIrp.SystemBuffer;
        
        // hook service dispatch table
        memmove(&Index,buff,4);
        a=4*Index+(ULONG)KeServiceDescriptorTable->ServiceTable;
        base=(ULONG)MmMapIoSpace(MmGetPhysicalAddress((void*)a),4,0);
        a=(ULONG)&Proxy;
        
        _asm
        {
            mov eax,base
            mov ebx,dword ptr[eax]
            mov RealCallee,ebx
            mov ebx,a
            mov dword ptr[eax],ebx
        }
        
        MmUnmapIoSpace(base,4);
        
        memmove(&a,&buff[4],4);
        output=(char*)MmMapIoSpace(MmGetPhysicalAddress((void*)a),256,0);
    }

    Irp->IoStatus.Status=0;
    IoCompleteRequest(Irp,IO_NO_INCREMENT);
    return 0;
}      

As you can see, there is nothing special here either - we just map the exchange buffer into the kernel address space with MmMapIoSpace(), and write the address of our proxy function to the Service Table (naturally, we do it after having saved the address of the actual service implementation in the RealCallee global variable). In order to overwrite the appropriate entry of the Service Table, we map the target address with MmMapIoSpace(). Why do we do it? After all, we already have access to the Service Table, don't we? The problem is that the Service Table may reside in read-only memory. Therefore, we would have to check whether we have write access to the target page, and if we don't, we would have to change the page protection before overwriting the Service Table. Too much work, don't you think? Instead, we just map our target address with MmMapIoSpace(), so we don't have to worry about page protection any more - from now on we can take write access to the target page for granted. Now let's look at our proxy function:


//this function decides whether we should 
//allow the NtCreateSection() call to succeed
ULONG __stdcall check(PULONG arg)
{

    HANDLE hand=0;PFILE_OBJECT file=0;
    POBJECT_HANDLE_INFORMATION info;ULONG a;char*buff;
    ANSI_STRING str; LARGE_INTEGER li;li.QuadPart=-10000;

    //check the flags. If PAGE_EXECUTE access to the section is not requested,
    //it does not make sense to be bothered about it
    if((arg[4]&0xf0)==0)return 1;
    if((arg[5]&0x01000000)==0)return 1;
    
    
    //get the file name via the file handle
    hand=(HANDLE)arg[6];
    ObReferenceObjectByHandle(hand,0,0,KernelMode,&file,&info);
    if(!file)return 1;
    RtlUnicodeStringToAnsiString(&str,&file->FileName,1);
    
    a=str.Length;buff=str.Buffer;
    while(1)
    {
        if(buff[a]=='.'){a++;break;}
        a--;
    }
    ObDereferenceObject(file);
    
    //if it is not executable, it does not make sense to be bothered about it
    //return 1
    if(_stricmp(&buff[a],"exe")){RtlFreeAnsiString(&str);return 1;}
    
    //now we are going to ask the user's opinion. 
    //Write the file name to the buffer, and wait until
    //the user-mode code posts its response 
    //(1 in the second DWORD means we can proceed)
    
    //synchronize access to the buffer
    KeWaitForSingleObject(&event,Executive,KernelMode,0,0);
    
    
    // copy the string into the buffer, set the first DWORD 
    // to 1 to signal a pending request, and loop
    // until the user-mode code resets the first DWORD to 0.
    // The value of the second DWORD indicates the user's 
    //response
    strcpy(&output[8],buff);
    RtlFreeAnsiString(&str);

    a=1;
    memmove(&output[0],&a,4);
    while(1)
    {
        KeDelayExecutionThread(KernelMode,0,&li);
        memmove(&a,&output[0],4);
        if(!a)break;
    }
    memmove(&a,&output[4],4);
    KeSetEvent(&event,0,0);
    
    return a;
}

//just saves execution context and calls check() 
_declspec(naked) Proxy()
{
    _asm{
        //save execution context and call check() 
        //-the rest depends upon the value check() returns
        // if it is 1, proceed to the actual callee. 
        //Otherwise,return STATUS_ACCESS_DENIED
        pushfd
        pushad
        mov ebx,esp
        add ebx,40
        push ebx
        call check
        cmp eax,1
        jne block
        
        //proceed to the actual callee
        popad
        popfd
        jmp RealCallee
        
        //return STATUS_ACCESS_DENIED
        block:popad
        mov ebx, dword ptr[esp+8]
        mov dword ptr[ebx],0
        mov eax,0xC0000022L
        popfd
        ret 32
    }
}      

Proxy() saves registers and flags, pushes a pointer to the service parameters on the stack, and calls check(). The rest depends on the value check() returns. If check() returns TRUE (i.e. we want to proceed with the request), Proxy() restores registers and flags, and transfers control to the service implementation. Otherwise, Proxy() writes STATUS_ACCESS_DENIED to EAX, restores ESP and returns - from the caller's perspective it looks as if the NtCreateSection() call had failed with the STATUS_ACCESS_DENIED error status.

How does check() make its decision? Once it receives a pointer to the service parameters as an argument, it can examine these parameters. First of all, it checks flags and attributes - if a section is not requested to be mapped as an executable image, or if the requested page protection does not allow execution, we can be sure that the NtCreateSection() call has nothing to do with process creation. In such a case check() returns TRUE straight away. Otherwise, it checks the extension of the underlying file - after all, the SEC_IMAGE attribute and a page protection that allows execution may also be requested for mapping some DLL file. If the underlying file is not a .exe file, check() returns TRUE. Otherwise, it gives the user-mode code a chance to make the decision: it writes the file name and path to the exchange buffer, and polls the buffer until it gets the response.
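The extension test is the step most sensitive to unusual input, for example a path that contains no dot at all. Below is a minimal portable sketch of a bounded version of that test; it is not the driver's code, and has_exe_extension() is a hypothetical name. It assumes backslash-separated paths, as in the file names the driver sees.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when the last path component ends in ".exe"
 * (case-insensitive). A path with no dot, or with a dot only in a
 * directory name, is reported as "not an executable". */
static bool has_exe_extension(const char *path)
{
    const char *dot = strrchr(path, '.');   /* last dot, if any       */
    const char *sep = strrchr(path, '\\');  /* last separator, if any */
    const char *ext;
    const char *want = "exe";

    if (dot == NULL || (sep != NULL && dot < sep))
        return false;

    /* Compare the extension against "exe", ignoring case. */
    for (ext = dot + 1; *ext != '\0' && *want != '\0'; ext++, want++) {
        if (tolower((unsigned char)*ext) != *want)
            return false;
    }
    return *ext == '\0' && *want == '\0';
}
```

Searching from the end with strrchr() never reads before the start of the string, whatever the file name looks like.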

Before opening our driver, our application creates a thread that runs the following function:


void thread()
{
    DWORD a,x; char msgbuff[512];
    
    while(1)
    {
        memmove(&a,&outputbuff[0],4);
        
        //if nothing is there, Sleep() 10 ms and check again
        if(!a){Sleep(10);continue;}
            
        // looks like our permission is asked. If the file
        // in question is already in the white list,
        // give a positive response
        char*name=(char*)&outputbuff[8];
        for(x=0;x<stringcount;x++)
        {
            if(!stricmp(name,strings[x])){a=1;goto skip;}
        }
    
        // ask user's permission to run the program
        strcpy(msgbuff, "Do you want to run ");
        strcat(msgbuff,&outputbuff[8]);
        
        // if user's reply is positive, add the program to the white list
        if(IDYES==MessageBox(0, msgbuff,"WARNING",
           MB_YESNO|MB_ICONQUESTION|0x00200000L))
        {a=1; strings[stringcount]=_strdup(name);stringcount++;}
        else a=0;
    
        // write response to the buffer, and driver will get it
        skip:memmove(&outputbuff[4],&a,4);

        //tell the driver to go ahead
        a=0;
        memmove(&outputbuff[0],&a,4);
    }
}      

This code is self-explanatory - our thread polls the exchange buffer every 10 ms. If it discovers that our driver has posted a request to the buffer, it checks the file name and path against the list of programs that are allowed to run on the machine. If a match is found, it gives an OK response straight away. Otherwise, it displays a message box, asking the user whether the program in question should be allowed to run. If the response is positive, we add the program to the list of software that is allowed to run on the machine. Finally, we write the user's response to the buffer, i.e., pass it to our driver. As a result, the user gets full control over process creation on the PC - as long as our program runs, no process can be launched through NtCreateSection() without asking the user's permission.
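The exchange-buffer protocol can be summarized as three small operations on a 256-byte buffer: the first DWORD is the "request pending" flag, the second DWORD is the verdict, and the file name starts at offset 8. The sketch below models that layout in portable, single-threaded C. The function names are hypothetical, the white list comparison is case-sensitive for simplicity (the original uses stricmp()), and an unknown name is simply denied instead of prompting the user.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXCHANGE_SIZE 256
#define NAME_OFFSET   8

/* Driver side: publish a file name, then raise the request flag. */
static void post_request(unsigned char *buf, const char *name)
{
    uint32_t pending = 1;
    size_t n = strlen(name);
    size_t max = EXCHANGE_SIZE - NAME_OFFSET - 1;

    if (n > max)
        n = max;                      /* never overrun the buffer */
    memcpy(buf + NAME_OFFSET, name, n);
    buf[NAME_OFFSET + n] = '\0';
    memcpy(buf, &pending, sizeof pending);
}

/* Application side: one polling step. Returns 0 when nothing is
 * pending, 1 when a request was answered. */
static int answer_request(unsigned char *buf,
                          const char *const *whitelist, size_t count)
{
    uint32_t pending, verdict = 0;
    size_t i;

    memcpy(&pending, buf, sizeof pending);
    if (!pending)
        return 0;
    for (i = 0; i < count; i++) {
        if (strcmp((const char *)buf + NAME_OFFSET, whitelist[i]) == 0) {
            verdict = 1;
            break;
        }
    }
    memcpy(buf + 4, &verdict, sizeof verdict); /* verdict first...  */
    pending = 0;
    memcpy(buf, &pending, sizeof pending);     /* ...then the flag  */
    return 1;
}

/* Driver side: returns -1 while the request is still pending,
 * otherwise the verdict (1 = allow, 0 = deny). */
static int read_verdict(const unsigned char *buf)
{
    uint32_t pending, verdict;

    memcpy(&pending, buf, sizeof pending);
    if (pending)
        return -1;
    memcpy(&verdict, buf + 4, sizeof verdict);
    return (int)verdict;
}
```

The verdict is written before the flag is cleared, mirroring the order used by the thread above, so the waiting side never reads a stale verdict. Code that shares such a buffer between real threads would additionally need proper synchronization, which this sketch leaves out.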

As you can see, we make the kernel-mode code wait for the user response. Is it really a wise thing to do? In order to answer this question, you have to ask yourself whether you are blocking any critical system resources - everything depends on the situation. In our case everything happens at IRQL PASSIVE_LEVEL, no IRPs are involved, and the thread that has to wait for the user response is not of critical importance. Therefore, in our case everything works fine. However, this sample is written for demonstration purposes only. In order to make any practical use of it, it makes sense to rewrite our application as an auto-start service. In that case, I suggest making an exemption for the LocalSystem account: if NtCreateSection() is called in the context of a thread with LocalSystem account privileges, proceed to the actual service implementation without performing any checks - after all, the LocalSystem account runs only those executables that are specified in the Registry. Therefore, such an exemption is not going to compromise our security.

Conclusion

In conclusion I must say that hooking the native API is definitely one of the most powerful programming techniques that has ever existed. This article gives you just one example of what can be achieved by hooking the native API - as you can see, we managed to prevent execution of unauthorized programs by hooking a single native API function. You can extend this approach further, and gain full control over hardware devices, file IO operations, network traffic, etc. However, our current solution is not going to work for kernel-mode API callers - since kernel-mode code is allowed to call ntoskrnl.exe's exports directly, these calls don't need to go via the system service dispatcher. Therefore, in my next article we are going to hook ntoskrnl.exe itself.

This sample has been successfully tested on several machines that run Windows XP SP2. Although I haven't yet tested it in any other environment, I believe that it should work on other versions as well - after all, it does not use any structure that may be system-specific. In order to run the sample, all you have to do is place protector.exe and protector.sys in the same directory, and run protector.exe. Until protector.exe's application window is closed, you will be prompted every time you attempt to run any executable.

I would highly appreciate it if you sent me an e-mail with your comments and suggestions.

License