Accessing static variables from kernel modules

Elazar Leibovich elazarl at gmail.com
Wed Apr 1 08:33:36 IDT 2015


Hi,

I was extending perf counters to sample the stack of a KVM guest from
a module[0].

The current KVM profiling architecture keeps a CPU-local variable,
current_vcpu, which is set to the vcpu about to run before vm_enter and
cleared after a vm_exit.

Then, when an NMI occurs, the handler can check current_vcpu and, if
the NMI arrived while the VM was running, get statistics of the guest
from it.
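
As a rough userspace model of that pattern (not the KVM code itself): a
thread-local pointer stands in for the CPU-local variable, and the names
before_guest_enter, after_guest_exit, nmi_sample and struct vcpu are all
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the vcpu state an NMI handler would read. */
struct vcpu {
    unsigned long guest_ip;
};

/* Thread-local pointer modelling the static CPU-local current_vcpu. */
static _Thread_local struct vcpu *current_vcpu;

/* Called just before entering the guest. */
static void before_guest_enter(struct vcpu *v)
{
    current_vcpu = v;
}

/* Called right after the guest exits. */
static void after_guest_exit(void)
{
    current_vcpu = NULL;
}

/*
 * Models the NMI path: returns 1 and fills *ip if the "NMI" arrived
 * while a guest was running, 0 otherwise.
 */
static int nmi_sample(unsigned long *ip)
{
    assert(ip != NULL);
    if (current_vcpu == NULL)
        return 0;
    *ip = current_vcpu->guest_ip;
    return 1;
}
```

The point of the model is only that the sampling side needs read access
to current_vcpu, which is exactly what a static variable denies to an
out-of-tree module.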

What I needed was to sample the guest stack every time an NMI occurs.
That requires two things:

1. A way to add code that runs when a PMI occurs.
   - Possible with the public register_nmi_handler API.
2. A way to access the CPU-local variable current_vcpu.
   - Problematic, since current_vcpu is static.

What I eventually did: since KVM exposes a "setter" for current_vcpu,
I scanned the assembly code of the setter and looked for a direct
move from a register to gs (where CPU-local variables are stored) plus
an offset. I then take this offset and use it to access the
current_vcpu variable.

What can fail?

1. The KVM performance implementation is completely changed.
2. The compiler uses different instructions to set CPU-local
variables (e.g., accesses the CPU-local variable by "mov $offset, %r2;
mov $value, (%r2)").

I think both cases are unlikely. This mechanism was written in 2010
and had only a cosmetic change in 2011 (an access function for
CPU-local variables). I expect it will be a few years before this
approach could fail.

Obviously, the correct approach is to fix perf counters in the kernel
to support stack sampling (not trivial). But sometimes you need a
solution now, without patching all your host kernels.

I would be grateful for feedback on this approach, and especially on
possible pitfalls I haven't considered.

The gist of the code is[1]:

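    for (;;) {
        u8 *p;
        /* Find the next GS segment-override prefix */
        c = memchr(c, GS_SEG_OVERRIDE, end - c);
        if (c == NULL)
            return -1;
        c++;
        /* REX, opcode, ModRM, SIB and disp32 must all fit before end */
        if (end - c < 8)
            return -1;
        p = c;
        /* REX prefix with the W bit set (64-bit operand) */
        if (!IS_RX_W(*p))
            continue;
        p++;
        if (*p != MOV_M_TO_R_OPCODE)
            continue;
        /* We need direct access to memory with displacement */
        /* Don't care which registers are used */
        p++;
        /* ModRM: mod == 00 and rm == 100 means a SIB byte follows */
        if (MOD(*p) != 0 || RM(*p) != 0b100)
            continue;
        p++;
        /* SIB: no base and no index, i.e. disp32 only */
        if (BASE(*p) != 0b101 || INDEX(*p) != 0b100)
            continue;
        p++;
        /* grab displacement32 value */
        return *(u32 *)p;
    }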
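
To try the scan outside the kernel, here is a minimal userspace sketch of
the same idea. It is not the module's code: the function name
find_gs_disp32 is hypothetical, and the byte values are my assumptions
about the encoding being matched (0x65 for the GS override, 0x48..0x4f
for a REX prefix with W set, 0x89 for a 64-bit register-to-memory MOV),
since the macros used above are not shown here. It returns a 64-bit value
so that -1 cannot be confused with a valid 32-bit displacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed encodings; the original macro definitions are not shown. */
#define GS_PREFIX      0x65  /* GS segment-override prefix */
#define MOV_STORE_OP   0x89  /* MOV r/m64, r64 (register to memory) */

/*
 * Scan code[0..len) for "mov %reg, %gs:disp32" and return disp32,
 * or -1 if no such instruction is found.
 */
static int64_t find_gs_disp32(const uint8_t *code, size_t len)
{
    const uint8_t *c = code;
    const uint8_t *end = code + len;

    assert(code != NULL);
    while (c < end) {
        const uint8_t *p;

        c = memchr(c, GS_PREFIX, (size_t)(end - c));
        if (c == NULL)
            break;
        p = ++c;                  /* p points just past the prefix */
        /* REX + opcode + ModRM + SIB + 4 displacement bytes */
        if (end - p < 8)
            break;
        if ((p[0] & 0xf8) != 0x48)    /* REX with W bit set */
            continue;
        if (p[1] != MOV_STORE_OP)
            continue;
        if ((p[2] & 0xc7) != 0x04)    /* ModRM: mod=00, rm=100 (SIB) */
            continue;
        if ((p[3] & 0x3f) != 0x25)    /* SIB: index=100, base=101 */
            continue;
        /* Little-endian disp32, decoded byte by byte */
        return (int64_t)((uint32_t)p[4] |
                         (uint32_t)p[5] << 8 |
                         (uint32_t)p[6] << 16 |
                         (uint32_t)p[7] << 24);
    }
    return -1;
}
```

For example, feeding it the bytes 65 48 89 3c 25 34 12 00 00 (which
should be "mov %rdi,%gs:0x1234") yields 0x1234, while the load form with
opcode 0x8b is skipped. That also illustrates pitfall 2 above: any other
encoding of the store makes the scan come back empty.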


[0] https://github.com/elazarl/gueststack
[1] https://github.com/elazarl/gueststack/blob/master/module.c#L114


