eBPF for Linux Admins: Part VII

Author: Ansil H (DevOps Guy)
Tags: ebpf, kernel, kprobes
eBPF - This article is part of a series.
Part 7: This Article

Let’s look at kprobes.

Kprobes is one of the dynamic tracing facilities available in the Linux kernel.

But why are we learning kprobes, and how is it related to eBPF?

As I said earlier, things will get interesting going forward. Please be patient and keep learning.

Kprobes enables you to dynamically break into any kernel routine and collect debugging and performance information non-disruptively. You can trap at almost any kernel code address, specifying a handler routine to be invoked when the breakpoint is hit. Read more about kprobes

Here is the basic working principle of kprobes:

  • Identify the kernel function you want to probe.
  • Register a kprobe on that function.
  • The first instruction of the function is replaced with a breakpoint, and the original instruction is saved as a copy.
  • When the breakpoint is hit, the user-defined pre-handler is called; it can inspect the data passed to the original function.
  • Once the pre-handler completes, the saved copy of the original instruction is executed.
  • After that, the optional post-handler, another user-defined routine, is executed.
  • Finally, control returns to the original flow and the following instructions are executed.

Most of our focus will be on the pre-handler, where we can examine the data passed to the function.

Let’s write a kernel module that probes the kernel function behind the openat syscall and prints the program name, PID and file name. We are not going to trace all programs; instead, we will trace the function only when the program name matches sample_write, the program we wrote earlier.

Let’s write the kprobe module and compile it.

mkdir -p lkmpg/kprobes
cd !$
vi kprobe_example.c
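#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/sched.h>
#include <linux/string.h>
#include <linux/uaccess.h>

static struct kprobe kp = {
  // The actual function that implements openat syscall
  // Detailed explanation of this part is at the end of the article.
  .symbol_name = "do_sys_openat2",
};

static int
handler_pre (struct kprobe *p, struct pt_regs *regs)
{
  // x86_64 only: the 2nd argument (the file name) arrives in the RSI register.
  // See "man syscall" and look for the ABI section for more details.
  const char __user *ufname = (const char __user *) regs->si;
  char fname[256];
  long len;

  // we are interested in 'sample_write' program we wrote in previous chapter
  if (strcmp (current->comm, "sample_write"))
    return 0;

  // The file name lives in user memory, so copy it into a kernel buffer
  // instead of dereferencing the user pointer directly. Page faults are
  // disabled around the copy because a kprobe handler must not sleep.
  pagefault_disable ();
  len = strncpy_from_user (fname, ufname, sizeof (fname));
  pagefault_enable ();
  if (len < 0)
    return 0;
  fname[sizeof (fname) - 1] = '\0';

  // Print the information to 'dmesg'
  pr_info ("do_sys_openat2 called by:%s pid=%i fname=%s\n", current->comm,
	   current->pid, fname);
  return 0;
}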

static void
handler_post (struct kprobe *p, struct pt_regs *regs, unsigned long flags)
{
  /* Optional handler */
}

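static int __init
kprobe_init (void)
{
  int ret;

  kp.pre_handler = handler_pre;
  kp.post_handler = handler_post;
  // register_kprobe can fail, e.g. when the symbol cannot be resolved
  ret = register_kprobe (&kp);
  if (ret < 0)
    {
      pr_err ("register_kprobe failed: %d\n", ret);
      return ret;
    }
  printk ("Kprobe attached to do_sys_openat2\n");
  return 0;
}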

static void __exit
kprobe_exit (void)
{
  unregister_kprobe (&kp);
  printk ("Kprobe detached from do_sys_openat2\n");
}

module_init (kprobe_init);
module_exit (kprobe_exit);
MODULE_LICENSE ("GPL");
MODULE_AUTHOR ("Ansil H");
MODULE_DESCRIPTION ("Simple Kprobe to trace file open operations from sample_write program");

You might be wondering why we are tracing do_sys_openat2 instead of the openat syscall itself. Long story short, when the openat syscall is executed, the function inside the kernel that actually does the work is do_sys_openat2. More details are available in the last section of this article if you are interested.

Now compile the code and load it.

vi Makefile
obj-m += kprobe_example.o
PWD := $(CURDIR)

# Recipe lines below must start with a TAB character, not spaces.
all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
make

Load the module.

sudo insmod ./kprobe_example.ko 

If you run journalctl -f in another terminal, you will see the message Kprobe attached to do_sys_openat2.

Jan 22 18:23:19 ebpf sudo[4176]:    ansil : TTY=pts/0 ; PWD=/home/ansil/lkmpg/kprobes ; USER=root ; COMMAND=/usr/sbin/insmod ./kprobe_example.ko
Jan 22 18:23:19 ebpf sudo[4176]: pam_unix(sudo:session): session opened for user root(uid=0) by ansil(uid=1000)
Jan 22 18:23:19 ebpf kernel: Kprobe attached to do_sys_openat2
Jan 22 18:23:19 ebpf sudo[4176]: pam_unix(sudo:session): session closed for user root

Good, the module is loaded.

Now the next step is to execute our sample_write program which we wrote in our previous chapter.

./sample_write

The journalctl output will look like the lines below, which show that the openat syscall was made three times. The first two calls come from loading the C library, and the last one opens our text file, sample.txt.

Jan 22 18:25:17 ebpf kernel: do_sys_openat2 called by:sample_write pid=4182 fname=/etc/ld.so.cache
Jan 22 18:25:18 ebpf kernel: do_sys_openat2 called by:sample_write pid=4182 fname=/lib/x86_64-linux-gnu/libc.so.6
Jan 22 18:25:18 ebpf kernel: do_sys_openat2 called by:sample_write pid=4182 fname=./sample.txt

Yay!!! 🎉

Now we know how to write a module that uses kprobes to trace a kernel function. Instead of tracing the program (like we did with strace), we traced the kernel function that implements the syscall!

You can unload the module using the command below.

sudo rmmod kprobe_example

journalctl will show the following:

Jan 22 18:28:14 ebpf sudo[4183]:    ansil : TTY=pts/0 ; PWD=/home/ansil/lkmpg/kprobes ; USER=root ; COMMAND=/usr/sbin/rmmod kprobe_example
Jan 22 18:28:14 ebpf kernel: Kprobe detached from do_sys_openat2

The topic below is completely optional, but understanding how to navigate the Linux kernel source code will make your life easier.

Finding the function call/symbol of a syscall.

Look at https://elixir.bootlin.com/linux/v6.5/source/include/linux/syscalls.h#L446 for all syscalls.

This shows the line below.

asmlinkage long sys_openat(int dfd, const char __user *filename, int flags,

Then click on sys_openat and follow the function definition, which points to fs/open.c:

https://elixir.bootlin.com/linux/v6.5/source/fs/open.c#L1433

Here you will see:

SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
		umode_t, mode)
{
	if (force_o_largefile())
		flags |= O_LARGEFILE;
	return do_sys_open(dfd, filename, flags, mode);
}

In the SYSCALL_DEFINE4 body, the return value comes from do_sys_open. Now click on do_sys_open.

That will take you to the function definition in the same file: https://elixir.bootlin.com/linux/v6.5/source/fs/open.c#L1419

long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
	struct open_how how = build_open_how(flags, mode);
	return do_sys_openat2(dfd, filename, &how);
}

This function in turn returns the result of another function, do_sys_openat2.

If you click on do_sys_openat2, you can see that none of its return statements hands the call on to yet another function.

static long do_sys_openat2(int dfd, const char __user *filename,
			   struct open_how *how)
{
	struct open_flags op;
	int fd = build_open_flags(how, &op);
	struct filename *tmp;

	if (fd)
		return fd;

	tmp = getname(filename);
	if (IS_ERR(tmp))
		return PTR_ERR(tmp);

	fd = get_unused_fd_flags(how->flags);
	if (fd >= 0) {
		struct file *f = do_filp_open(dfd, tmp, &op);
		if (IS_ERR(f)) {
			put_unused_fd(fd);
			fd = PTR_ERR(f);
		} else {
			fd_install(fd, f);
		}
	}
	putname(tmp);
	return fd;
}

As a final step, we can look up this function in the kernel symbol table to make sure our kprobe module can find it.

sudo grep -w do_sys_openat2 /proc/kallsyms

Output:

ffffffff8e2aadf0 t do_sys_openat2

Yes, it’s available. So we are good to use do_sys_openat2.

So this confirms that the function the kernel executes for the openat syscall is do_sys_openat2!! 🎉

If it’s too much to digest, you can come back to this article later. There are tools that make these steps easier, and we will see them going forward.
