For future reference.<div><br></div><div>I examined what perf does when sampling the stack (e.g. with "-g").</div><div><br></div><div>0. Indeed, it does not support callchains when sampling a KVM guest OS, probably because it is not trivial to safely find out where the guest's stack starts:</div><div><br></div><div><a href="http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/perf_event.c#L1965">http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/perf_event.c#L1965</a></div><div><div>void</div><div>perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)</div><div>{</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {</div><div><b><span class="Apple-tab-span" style="white-space:pre"> </span><span class="Apple-tab-span" style="white-space:pre"> </span>/* TODO: We don't support guest os callchain now */</b></div><div><span class="Apple-tab-span" style="white-space:pre"> </span><span class="Apple-tab-span" style="white-space:pre"> </span>return;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>}</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>perf_callchain_store(entry, regs->ip);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>dump_trace(NULL, regs, NULL, 0, &backtrace_ops, entry);</div><div>}</div></div><div><br></div><div>1. It assumes the kernel is compiled with frame pointers, and walks the chain of frame pointers until it reaches an invalid one. 
Presumably there is an invalid frame pointer at the top of the stack.</div><div>(The stack-walking function is print_context_stack_bp. Its loop stops when the return-address slot is no longer on the stack or the return address is not kernel text:)</div><div><br></div><div><a href="http://lxr.free-electrons.com/source/arch/x86/kernel/dumpstack.c#L122">http://lxr.free-electrons.com/source/arch/x86/kernel/dumpstack.c#L122</a><br></div><div><br></div><div><div>while (valid_stack_ptr(tinfo, ret_addr, sizeof(*ret_addr), end)) {</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>unsigned long addr = *ret_addr;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>if (!__kernel_text_address(addr))</div><div><span class="Apple-tab-span" style="white-space:pre"> </span><span class="Apple-tab-span" style="white-space:pre"> </span>break;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>ops->address(data, addr, 1);</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>frame = frame->next_frame;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>ret_addr = &frame->return_address;</div><div><span class="Apple-tab-span" style="white-space:pre"> </span>print_ftrace_graph_addr(addr, data, ops, tinfo, graph);</div><div>}</div></div><div><br><div class="gmail_quote">On Sun Dec 21 2014 at 9:28:01 AM Muli Ben-Yehuda <<a href="mailto:mulix@mulix.org">mulix@mulix.org</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Fri, Dec 19, 2014 at 02:19:07PM +0000, Elazar Leibovich wrote:<br>
<br>
> I know where the stack ends, but how can I know where it begins?<br>
<br>
What assumptions can you make? Can you run kernel code in the VM<br>
(e.g., by cloning and restarting it)? Can you assume it's running<br>
Linux and/or Windows? Can you assume the kernel was compiled with<br>
frame pointers? Or is it a completely black box VM and you can't make<br>
any assumptions about what's running inside?<br>
<br>
> I can check the memory mapping, and assume nothing would take the<br>
> virtual address before the start of the kernel's stack, but I don't<br>
> know if I can count on it for most mainstream OSes.<br>
<br>
That's a pretty good heuristic but see questions above.<br>
<br>
By the way, some OS's have separate interrupt stacks, so you may be on<br>
an interrupt stack or on a regular stack.<br>
<br>
> Maybe there's a known method I'm missing, I'll be happy for any<br>
> comments.<br>
<br>
Cheers,<br>
Muli<br>
</blockquote></div></div>
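<div><br></div><div>To make the frame-pointer walk from point 1 concrete, here is a minimal user-space sketch of the same idea. It is not the kernel's code: walk_limits, frame_on_stack and walk_frames are hypothetical names, the text-range check only stands in for __kernel_text_address, and the "next frame must be at a higher address" guard is my own addition (it assumes a downward-growing stack). struct stack_frame just mirrors the two fields the quoted loop uses.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout the quoted kernel loop relies on: saved frame pointer, then return address. */
struct stack_frame {
	struct stack_frame *next_frame;
	unsigned long return_address;
};

/* Hypothetical description of what counts as "valid" for this sketch. */
struct walk_limits {
	uintptr_t stack_lo;    /* lowest valid stack address (inclusive) */
	uintptr_t stack_hi;    /* one past the highest valid stack address */
	unsigned long text_lo; /* start of the code range (inclusive) */
	unsigned long text_hi; /* end of the code range (exclusive) */
};

/* Rough analogue of valid_stack_ptr(): the whole frame must lie on the stack. */
static int frame_on_stack(const struct walk_limits *lim,
			  const struct stack_frame *frame)
{
	uintptr_t p = (uintptr_t)frame;

	return p >= lim->stack_lo && p < lim->stack_hi &&
	       lim->stack_hi - p >= sizeof(*frame) &&
	       p % sizeof(void *) == 0;
}

/*
 * Follow the frame-pointer chain starting at 'frame', storing each return
 * address in 'out'. Returns the number of addresses stored.
 */
size_t walk_frames(const struct walk_limits *lim,
		   const struct stack_frame *frame,
		   unsigned long *out, size_t max_out)
{
	size_t n = 0;

	assert(lim != NULL && out != NULL);

	while (n < max_out && frame_on_stack(lim, frame)) {
		unsigned long addr = frame->return_address;
		const struct stack_frame *next;

		/* Stand-in for __kernel_text_address(): stop on a non-code address. */
		if (addr < lim->text_lo || addr >= lim->text_hi)
			break;
		out[n++] = addr;

		next = frame->next_frame;
		/*
		 * Extra guard, not in the quoted loop: a caller's frame must be
		 * at a higher address, otherwise a corrupt chain could loop.
		 */
		if ((uintptr_t)next <= (uintptr_t)frame)
			break;
		frame = next;
	}
	return n;
}
```

<div>A caller would fill walk_limits with the known stack bounds and code range, pass the current frame pointer, and get back the recorded return addresses. The walk ends on its own at the first frame that falls outside those bounds, which is why the chain needs an invalid frame pointer (or a non-text return address) at the top of the stack.</div>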