eBPF for Linux Admins: Part VI

Ansil H
ebpf kernel kprobes
eBPF - This article is part of a series.
Part 6: This Article

In the previous chapters, we saw how XDP and eBPF can be used to filter packets.

Now we will look at what a syscall is and how kprobes can be used to trace one.

Yes, from this chapter onwards, we are no longer dealing with the network. I started with the network stack so that, as a Linux admin, you can easily connect with the concepts of eBPF.

"Hmm... syscall? Kprobes, syscalls, routines, breakpoints and the rest sound like an alien language to me."

Don't worry, it was the same for me too. We will cover the fundamentals of syscalls before we move on to kprobes.

As a Linux admin, you should know what a syscall is; if you don't, this article is for you.

The diagram below shows how an application interacts with the system.

The entry point from an application into kernel space is the syscall interface.
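To make the "entry point" idea a bit more concrete, here is a minimal sketch that crosses into the kernel through the generic syscall() function from glibc, instead of the usual wrappers such as write() or printf(). The helper names raw_write_stdout and raw_getpid are made up for this illustration, and the sketch assumes Linux with glibc.

```c
#include <sys/syscall.h>   /* SYS_write, SYS_getpid syscall numbers */
#include <unistd.h>        /* syscall(), STDOUT_FILENO */

/* Hypothetical helper: write len bytes to stdout by invoking the
   write syscall by number. Returns bytes written, or -1 on error. */
long raw_write_stdout(const char *buf, unsigned long len)
{
    return syscall(SYS_write, STDOUT_FILENO, buf, len);
}

/* Hypothetical helper: ask the kernel for our PID through the raw
   syscall number instead of the getpid() wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

Calling raw_write_stdout("Hello\n", 6) from main should behave like a plain write() to stdout, and raw_getpid() should return the same value as getpid(). In day-to-day code you would normally use the wrappers; the raw form is only here to show that they all funnel into the same interface.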

You can use the strace command to see the syscalls made by a process.

Use the command below to install strace on Debian or Ubuntu if it is not already installed.

sudo apt-get install strace

The strace command below prints a summary of the syscalls (last column) made by the echo command.

ansil@ebpf:~$ strace -c echo Hello
Hello
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0.00    0.000000           0         1           read
  0.00    0.000000           0         1           write
  0.00    0.000000           0        18           close
  0.00    0.000000           0        21           mmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         2           pread64
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2         1 arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0        30        14 openat
  0.00    0.000000           0        17           newfstatat
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           prlimit64
  0.00    0.000000           0         1           getrandom
  0.00    0.000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100.00    0.000000           0       107        16 total
ansil@ebpf:~$ 
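The errors column in the summary counts syscalls that returned a failure. That is usually nothing to worry about: many of those calls are just probes for files that may or may not exist. As a rough sketch of what such a failed call looks like from the program's side, the hypothetical helper probe_path below calls access() and hands back the errno the kernel reported.

```c
#include <errno.h>
#include <unistd.h>   /* access(), F_OK */

/* Hypothetical helper: returns 0 if path exists, otherwise the errno
   value set by the failed access() call (for example ENOENT). */
int probe_path(const char *path)
{
    if (access(path, F_OK) == 0)
        return 0;
    return errno;
}
```

Probing a path that does not exist should return ENOENT, and under strace -c that call would be counted once in the errors column of the access row.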

Sample Write

To understand syscalls further, let’s write a simple C program that writes a line to a file.

vi sample_write.c
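#include <stdio.h>
#include <errno.h>
#include <string.h>

int main (void)
{
  /* Open (or create and truncate) the file for writing. */
  FILE *fp = fopen ("./sample.txt", "w");
  if (fp == NULL)
    {
      /* Report the failure instead of silently returning success. */
      fprintf (stderr, "err=%d: %s\n", errno, strerror (errno));
      return 1;
    }

  if (fprintf (fp, "Random text\n") < 0)
    {
      /* Save errno first: the calls below may overwrite it. */
      int saved_errno = errno;
      fprintf (stderr, "err=%d: %s\n", saved_errno, strerror (saved_errno));
      fclose (fp);
      return saved_errno;
    }

  /* fclose flushes the buffered text to the file and closes it. */
  fclose (fp);
  return 0;
}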

Compile the program.

gcc sample_write.c -o sample_write

You can execute it and examine the file content.

ansil@ebpf:~$ ./sample_write
ansil@ebpf:~$ cat sample.txt 
Random text
ansil@ebpf:~$ 

From the user’s perspective, the program and its outcome look simple, but from the kernel’s point of view, a lot is going on.

As a user, you are creating a file on the disk. The operation goes through several layers: the standard library, the syscall interface, the virtual file system, the file system driver, the disk driver, and finally the disk.

The kernel abstracts the complexity of those interactions away from the user through the syscall interface.

As a user, your application interacts with the syscall interface, and everything else is taken care of by the kernel.
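To see the layer just below the standard library, here is a minimal sketch of the same "write one line to a file" job done with the thin POSIX wrappers open(), write() and close() instead of fopen() and fprintf(). The function name write_line is hypothetical and only used for this illustration.

```c
#include <fcntl.h>    /* open(), O_* flags */
#include <string.h>   /* strlen() */
#include <unistd.h>   /* write(), close() */

/* Hypothetical helper: write text to path without stdio buffering.
   Returns 0 on success, -1 on failure (errno is set by the failing call). */
int write_line(const char *path, const char *text)
{
    /* Same intent as fopen(path, "w"): write-only, create, truncate. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    ssize_t written = write(fd, text, len);
    if (written < 0 || (size_t)written != len) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling write_line("./sample.txt", "Random text\n") from main should leave the same file content as the stdio version. Each of the three calls maps almost one-to-one to a syscall, although on recent glibc versions open() is usually issued to the kernel as openat, so that is the name you would most likely see in strace.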

Now, let’s see how many syscalls were made by our program.

ansil@ebpf:~$ strace -c ./sample_write 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.54    0.000912         912         1           execve
 13.90    0.000378          47         8           mmap
  9.64    0.000262          87         3           close
  8.16    0.000222          74         3           openat
  6.84    0.000186          62         3           mprotect
  6.36    0.000173         173         1           munmap
  4.30    0.000117          39         3           newfstatat
  4.05    0.000110          55         2           pread64
  3.38    0.000092          30         3           brk
  2.43    0.000066          66         1           write
  1.62    0.000044          22         2         1 arch_prctl
  1.10    0.000030          30         1         1 access
  0.96    0.000026          26         1           getrandom
  0.88    0.000024          24         1           read
  0.85    0.000023          23         1           prlimit64
  0.70    0.000019          19         1           set_robust_list
  0.66    0.000018          18         1           set_tid_address
  0.63    0.000017          17         1           rseq
------ ----------- ----------- --------- --------- ----------------
100.00    0.002719          73        37         2 total
ansil@ebpf:~$ 

You can also examine individual calls. Here I’m interested in the openat syscall.

ansil@ebpf:~$ strace -e openat ./sample_write 
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "./sample.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
+++ exited with 0 +++

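There are three openat syscalls: the first two come from the dynamic loader (reading the library cache /etc/ld.so.cache and loading libc), and the last one opens our text file sample.txt.

In the output, we can clearly see the syscall our program makes to open the file for writing (O_WRONLY|O_CREAT|O_TRUNC).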

Now you know what a syscall is and how to trace the syscalls a program makes.

Let’s take a scenario where you want to see every openat syscall happening on the system without strace and without even interacting with the program 🤯

In the next chapter, we will discuss how to do it using dynamic tracing with kprobes.

Please re-visit if you want to brush up on the kernel module concepts.
