FANOTIFY

Section: Linux Programmer's Manual (7)
Updated: 2014-03-22
 

NAME

fanotify - monitoring filesystem events  

DESCRIPTION

The fanotify API provides notification and interception of filesystem events. Use cases include virus scanning and hierarchical storage management.

The following system calls are used with this API: fanotify_init(2), fanotify_mark(2), poll(2), ppoll(2), read(2), write(2), and close(2).

fanotify_init(2) creates and initializes a fanotify notification group and returns a file descriptor referring to it.

A fanotify notification group is an internal object of the kernel which holds a list of files, directories and mount points for which events shall be created.

Two bit masks exist for each entry. One mask (the mark mask) defines the file activities for which an event shall be created. The other mask (the ignore mask) defines the activities for which no event shall be created. Having these two types of masks allows a mount point or directory to be marked for receiving events while no events are raised for specified filesystem objects it contains.

One possible use of the ignore mask is a file cache. The events of interest for a file cache are the modification and the closing of a file. Hence the cached directory or mount point is marked to receive these events. After the first event reporting that a file has been modified is received, the corresponding cache entry is invalidated. No further modification events for this file are of interest until the file is closed, so the modify event can be added to the ignore mask. Upon receiving the close event, the modify event can be removed from the ignore mask and the file cache entry can be updated.
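The cache scenario can be sketched roughly as follows. This is a minimal sketch, not a complete cache: the helper names ignore_modify and unignore_modify are hypothetical, and it assumes that the ignore mask is set per file through the FAN_MARK_IGNORED_MASK flag of fanotify_mark(2), combined with FAN_MARK_IGNORED_SURV_MODIFY so that the ignore mask is not cleared by the next modification.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/fanotify.h>

/* Hypothetical helper: stop reporting FAN_MODIFY for one cached file.
   Intended to be called after the first modify event has invalidated
   the cache entry. Returns 0 on success, -1 on error. */
static int
ignore_modify(int fanotify_fd, const char *path)
{
    return fanotify_mark(fanotify_fd,
                         FAN_MARK_ADD | FAN_MARK_IGNORED_MASK |
                         FAN_MARK_IGNORED_SURV_MODIFY,
                         FAN_MODIFY, AT_FDCWD, path);
}

/* Hypothetical helper: report FAN_MODIFY again, intended to be called
   once the close event for the file has been received. */
static int
unignore_modify(int fanotify_fd, const char *path)
{
    return fanotify_mark(fanotify_fd,
                         FAN_MARK_REMOVE | FAN_MARK_IGNORED_MASK,
                         FAN_MODIFY, AT_FDCWD, path);
}
```

Both helpers simply pass on the return value of fanotify_mark(2), so the caller can inspect errno on failure.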

Two types of events exist. Notification events are only informative and require no action to be taken by the receiving application except for closing the file descriptor passed in the event. Permission events are requests to decide whether permission for a file access shall be granted. For these events a response has to be sent.

When all file descriptors referring to the fanotify notification group are closed, the group is released and the resources are freed for reuse by the kernel.

fanotify_mark(2) adds a file, a directory, or a mount to the group and specifies which events shall be reported (or ignored), or removes or modifies such an entry.
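Creating a group and adding a first entry might look like the sketch below. The helper name watch_mount is hypothetical, and the chosen flags (a notification-only group, open and close-after-write events) are just one plausible combination; both calls usually require the CAP_SYS_ADMIN capability.

```c
#define _GNU_SOURCE /* for O_LARGEFILE */
#include <fcntl.h>
#include <stdio.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: create a notification group and mark one mount.
   Returns the fanotify file descriptor, or -1 on error. */
static int
watch_mount(const char *mount_path)
{
    int fd = fanotify_init(FAN_CLOEXEC | FAN_CLASS_NOTIF,
                           O_RDONLY | O_LARGEFILE);
    if (fd == -1) {
        perror("fanotify_init");
        return -1;
    }

    /* Report files being opened and writable files being closed. */
    if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT,
                      FAN_OPEN | FAN_CLOSE_WRITE,
                      AT_FDCWD, mount_path) == -1) {
        perror("fanotify_mark");
        close(fd);
        return -1;
    }
    return fd;
}
```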

When a fanotify event occurs, the fanotify file descriptor is reported as readable when passed to epoll(7), poll(2), or select(2).

Calling read(2) for the file descriptor returned by fanotify_init(2) blocks (if flag FAN_NONBLOCK is not set in the call to fanotify_init(2)) until either a file event occurs or it is interrupted by a signal (see signal(7)).

The return value of read(2) is the length of the filled buffer or -1 in case of an error. In case of success the read buffer contains one or more of the following structures:

struct fanotify_event_metadata {
    __u32 event_len;
    __u8 vers;
    __u8 reserved;
    __u16 metadata_len;
    __aligned_u64 mask;
    __s32 fd; 
    __s32 pid;
};

event_len
This is the length of the data for the current event and the offset to the next event in the buffer. This length might be longer than the size of structure fanotify_event_metadata. Therefore it is recommended to use a larger buffer size when reading, e.g. 4096 bytes.
vers
The structures fanotify_event_metadata and fanotify_response have been changed repeatedly. This field holds the version information about the structures. It must be compared to FANOTIFY_METADATA_VERSION to verify that the structures at runtime match the structures at compile time. In case of a mismatch the fanotify file descriptor has to be closed.
reserved
This field is not used.
metadata_len
This is the length of the structure. The field was introduced to facilitate the implementation of optional headers per event type.
mask
This is a bitmask describing the event.
fd
This is an open file descriptor for the object being accessed or FAN_NOFD if a queue overflow occurred. The reading application is responsible for closing this file descriptor.
pid
This is the ID of the process that caused the event. A program listening to fanotify events can compare this pid to the pid returned by getpid(2) to detect if the event is caused by the listener itself or is due to a file access by another program.
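The checks described for the vers and pid fields could be written as two small predicates. This is only a sketch; the helper names event_version_ok and event_from_self are hypothetical.

```c
#include <stdbool.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: true if the structure layout seen at run time
   matches the one this program was compiled against. */
static bool
event_version_ok(const struct fanotify_event_metadata *ev)
{
    return ev->vers == FANOTIFY_METADATA_VERSION;
}

/* Hypothetical helper: true if the event was caused by the listening
   process itself rather than by another program. */
static bool
event_from_self(const struct fanotify_event_metadata *ev)
{
    return ev->pid == getpid();
}
```

A listener would typically stop processing and close the fanotify file descriptor when event_version_ok() fails, and skip events for which event_from_self() is true.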

The bitmask in mask signals which events have occurred for a single filesystem object. More than one of the following flags can be set at once in the bitmask.

FAN_ACCESS
A file was accessed.
FAN_OPEN
A file was opened.
FAN_MODIFY
A file was modified.
FAN_CLOSE_WRITE
A writable file was closed.
FAN_CLOSE_NOWRITE
A read-only file was closed.
FAN_Q_OVERFLOW
The event queue exceeded the limit of 16384 entries. This limit can be overridden in the call to fanotify_init(2) by setting the flag FAN_UNLIMITED_QUEUE.
FAN_ACCESS_PERM
An application wants to access a file. A decision has to be made as to whether permission to access the file is granted.
FAN_OPEN_PERM
An application wants to open a file. A decision has to be made as to whether permission to open the file is granted.
FAN_ONDIR
The event concerns a monitored directory.
FAN_EVENT_ON_CHILD
The event concerns the child of a monitored directory.

To check for any close event the following bitmask may be used:

FAN_CLOSE
A file was closed (FAN_CLOSE_WRITE | FAN_CLOSE_NOWRITE).
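Because several flags can be set at once, a listener usually tests each flag separately. The following sketch turns a mask into a readable string for logging; the helper name describe_mask and the '|' separated output format are assumptions made for illustration.

```c
#include <string.h>
#include <sys/fanotify.h>

/* Hypothetical helper: write the names of all flags set in mask to out,
   separated by '|'. Names that do not fit into the buffer are dropped. */
static void
describe_mask(unsigned long long mask, char *out, size_t size)
{
    static const struct {
        unsigned long long flag;
        const char *name;
    } names[] = {
        { FAN_ACCESS,         "FAN_ACCESS" },
        { FAN_OPEN,           "FAN_OPEN" },
        { FAN_MODIFY,         "FAN_MODIFY" },
        { FAN_CLOSE_WRITE,    "FAN_CLOSE_WRITE" },
        { FAN_CLOSE_NOWRITE,  "FAN_CLOSE_NOWRITE" },
        { FAN_Q_OVERFLOW,     "FAN_Q_OVERFLOW" },
        { FAN_ACCESS_PERM,    "FAN_ACCESS_PERM" },
        { FAN_OPEN_PERM,      "FAN_OPEN_PERM" },
        { FAN_ONDIR,          "FAN_ONDIR" },
        { FAN_EVENT_ON_CHILD, "FAN_EVENT_ON_CHILD" },
    };
    size_t i;

    if (size == 0)
        return;
    out[0] = '\0';
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (!(mask & names[i].flag))
            continue;
        /* Room is needed for a separator, the name, and the final NUL. */
        if (strlen(out) + strlen(names[i].name) + 2 > size)
            break;
        if (out[0] != '\0')
            strcat(out, "|");
        strcat(out, names[i].name);
    }
}
```

Passing FAN_CLOSE yields both close flags, since it is defined as their combination.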

The following macros are provided to iterate over a buffer containing fanotify event metadata.

FAN_EVENT_OK(meta, len)
This macro checks the remaining length len of the buffer meta against the length of the metadata structure and the event_len field of the first metadata structure in the buffer.
FAN_EVENT_NEXT(meta, len)
This macro lets the pointer meta point to the next metadata structure using the length indicated in the event_len field of the metadata structure and reduces the remaining length of the buffer len.
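The two macros are normally used together in a loop. The sketch below only counts the events in a buffer; the helper name count_events is hypothetical, and a real listener would inspect each event and close its file descriptor inside the loop.

```c
#include <stddef.h>
#include <sys/fanotify.h>

/* Hypothetical helper: count the events in a buffer of len bytes
   that was filled by read(2) on a fanotify file descriptor. */
static int
count_events(const void *buf, long len)
{
    const struct fanotify_event_metadata *meta = buf;
    int count = 0;

    /* FAN_EVENT_OK() fails once fewer bytes remain than one event needs. */
    while (FAN_EVENT_OK(meta, len)) {
        count++;
        /* Advances meta by event_len and reduces len accordingly. */
        meta = FAN_EVENT_NEXT(meta, len);
    }
    return count;
}
```

Note that FAN_EVENT_NEXT() modifies its len argument, so len must be a variable and not an expression such as a sizeof.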

For permission events, the application must write(2) a structure of the following form to the fanotify file descriptor:

struct fanotify_response {
        __s32 fd;
        __u32 response;
};

fd
This is the file descriptor from structure fanotify_event_metadata.
response
This field contains the decision whether the permission is granted. Its value must be either FAN_ALLOW to allow the file operation or FAN_DENY to deny the file operation.
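Answering a permission event might be wrapped in a helper like the one below. This is a sketch under the assumption that a short write should be treated as an error; the name send_response is hypothetical.

```c
#include <stdint.h>
#include <sys/fanotify.h>
#include <unistd.h>

/* Hypothetical helper: answer a permission event.
   event_fd is the fd field of the event, decision is FAN_ALLOW or
   FAN_DENY. Returns 0 on success, -1 on error. */
static int
send_response(int fanotify_fd, int event_fd, uint32_t decision)
{
    struct fanotify_response response;

    response.fd = event_fd;
    response.response = decision;
    if (write(fanotify_fd, &response, sizeof(response))
            != (ssize_t) sizeof(response))
        return -1;
    return 0;
}
```

The event's file descriptor still has to be closed by the caller after the response has been written.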

To end listening, it is sufficient to close(2) the fanotify file descriptor. Outstanding permission events will be set to allowed, and all resources will be returned to the kernel.

The file /proc/<pid>/fdinfo/<fd> contains information about fanotify marks for file descriptor fd of process pid. See Documentation/filesystems/proc.txt for details.  
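For debugging, the fdinfo file can be printed from within a program. The sketch below only builds the path and copies the file to standard output; the helper names fdinfo_path and print_fdinfo are hypothetical, and the content of the file is not interpreted.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: build "/proc/<pid>/fdinfo/<fd>" in out.
   Returns the value returned by snprintf(3). */
static int
fdinfo_path(pid_t pid, int fd, char *out, size_t size)
{
    return snprintf(out, size, "/proc/%ld/fdinfo/%d", (long) pid, fd);
}

/* Hypothetical helper: copy the fdinfo file to standard output.
   Returns 0 on success, -1 if the file cannot be opened. */
static int
print_fdinfo(pid_t pid, int fd)
{
    char path[64], line[256];
    FILE *fp;

    fdinfo_path(pid, fd, path, sizeof(path));
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof(line), fp) != NULL)
        fputs(line, stdout);
    fclose(fp);
    return 0;
}
```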

ERRORS

The following errors may occur when reading from the fanotify file descriptor:

EAGAIN
A nonblocking call did not return any data.
EFAULT
The read buffer is outside of the accessible address space.
EINTR
The call was interrupted by a signal handler.
EINVAL
The buffer is too short to hold the event.
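A read wrapper that distinguishes these cases might look like the sketch below. The name read_events and the convention of returning 0 for EAGAIN are assumptions chosen for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read events, retrying when a signal handler
   interrupts the call. Returns the number of bytes read, 0 if a
   nonblocking descriptor has no data (EAGAIN), or -1 on any other
   error, with errno left set by read(2). */
static ssize_t
read_events(int fd, void *buf, size_t size)
{
    for (;;) {
        ssize_t len = read(fd, buf, size);

        if (len >= 0)
            return len;
        if (errno == EINTR)
            continue;       /* interrupted, try again */
        if (errno == EAGAIN)
            return 0;       /* nothing to read right now */
        return -1;          /* e.g. EFAULT or EINVAL */
    }
}
```

If the helper returns -1 with errno set to EINVAL, the buffer passed in was too short for the next event and a larger one should be used.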

The following errors may occur when writing to the fanotify file descriptor:

EFAULT
The write buffer is outside of the accessible address space.
EINVAL
Fanotify access permissions are not enabled or the value of response in the response structure is not valid.
ENOENT
The file descriptor fd in the response structure is not valid. This might occur because the file was already deleted by another thread or process.
 

VERSIONS

The fanotify API was introduced in version 2.6.36 of the Linux kernel and enabled in version 2.6.37. Fdinfo support was added in version 3.8.  

CONFORMING TO

The fanotify API is Linux-specific.  

NOTES

The notification is based on the kernel filesystem notification system fsnotify.

To enable the fanotify API, the following setting in the kernel configuration is needed: CONFIG_FANOTIFY=y. For permission handling, CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y must be set.

EXAMPLE

The following program demonstrates the usage of the fanotify API. It marks the mount passed as an argument and waits for events of type FAN_OPEN_PERM and FAN_CLOSE_WRITE. When a permission event occurs, a FAN_ALLOW response is given.

The following output was recorded while editing the file /home/zfsdt/temp/notes. Before the file was opened, a FAN_OPEN_PERM event occurred. After the file was closed, a FAN_CLOSE_WRITE event occurred. The example program ended when the enter key was pressed.

Example output

# ./fanotify_example /home
Press enter key to terminate.
Listening for events.
FAN_OPEN_PERM: File /home/zfsdt/temp/notes
FAN_CLOSE_WRITE: File /home/zfsdt/temp/notes

Listening for events stopped.
 

Program source

#define _GNU_SOURCE // needed for O_LARGEFILE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/fanotify.h>
#include <unistd.h>

// Handle available events.
static void
handle_events(int fd)
{
    const struct fanotify_event_metadata *metadata;
    char buf[4096];
    int len;
    char path[PATH_MAX];
    int path_len;
    struct fanotify_response response;

    // Loop while events can be read from fanotify file descriptor.
    for(;;) {

        // Read next events.
        len = read(fd, buf, sizeof(buf));
        if (len > 0) {

            // Point to the first event in the buffer.
            metadata = (struct fanotify_event_metadata *) buf;

            // Loop over all events in the buffer.
            while (FAN_EVENT_OK(metadata, len)) {

                // Assure that run time and compile time structures match.
                assert(metadata->vers == FANOTIFY_METADATA_VERSION);

                // Check event contains a file descriptor.
                if (metadata->fd >= 0) {

                    // Handle open permission event.
                    if (metadata->mask & FAN_OPEN_PERM) {
                        printf("FAN_OPEN_PERM: ");

                        // Allow file to be opened.
                        response.fd = metadata->fd;
                        response.response = FAN_ALLOW;
                        if (write(fd, &response, sizeof(response)) == -1)
                            perror("write");
                    }

                    // Handle closing of writable file event.
                    if (metadata->mask & FAN_CLOSE_WRITE) {
                        printf("FAN_CLOSE_WRITE: ");
                    }

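                    // Determine path of the file accessed. A separate
                    // buffer holds the /proc path, because readlink(2)
                    // must not use the same buffer for input and output.
                    char procfd_path[64];
                    snprintf(procfd_path, sizeof(procfd_path),
                             "/proc/self/fd/%d", metadata->fd);
                    path_len = readlink(procfd_path, path,
                                        sizeof(path) - 1);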

                    // Write path.
                    if (path_len > 0) {
                        path[path_len] = '\0';
                        printf("File %s", path);
                    }

                    // Close the file descriptor of the event.
                    close(metadata->fd);
                    printf("\n");
                }

                // Forward pointer to next event.
                metadata = FAN_EVENT_NEXT(metadata, len);
            }
        } else {

            // No more events are available.
            break;
        }
    }
}

int
main(int argc, char *argv[])
{
    char buf;
    int fd, ret;
    nfds_t nfds;
    struct pollfd fds[2];

    // Check mount point is supplied.
    if (argc != 2) {
        printf("Usage: %s MOUNT\n", argv[0]);
        return EXIT_FAILURE;
    }

    printf("Press enter key to terminate.\n");

    // Create the file descriptor for accessing the fanotify API.
    fd = fanotify_init(FAN_CLOEXEC | FAN_CLASS_CONTENT | FAN_NONBLOCK,
                       O_RDONLY | O_LARGEFILE);
    if (fd == -1) {
        perror("fanotify_init");
        return EXIT_FAILURE;
    }

    // Mark the mount for
    // - permission events before opening files
    // - notification events after closing a write enabled file descriptor.
    if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_MOUNT,
                      FAN_OPEN_PERM | FAN_CLOSE_WRITE, FAN_NOFD,
                      argv[1]) == -1) {
        perror("fanotify_mark");
        close(fd);
        return EXIT_FAILURE;
    }

    // Prepare for polling.
    nfds = 2;
    // Console input.
    fds[0].fd = STDIN_FILENO;
    fds[0].events = POLLIN;
    fds[0].revents = 0;
    // Fanotify input.
    fds[1].fd = fd;
    fds[1].events = POLLIN;
    fds[1].revents = 0;

    // This is the loop to wait for incoming events.
    printf("Listening for events.\n");
    for(;;) {
        ret = poll(fds, nfds, -1);
        if (ret > 0) {
            if (fds[0].revents & POLLIN) {
                // Console input is available. Empty stdin and quit.
                while(read(STDIN_FILENO, &buf, 1) > 0 && buf != '\n');
                break;
            }
            if (fds[1].revents & POLLIN) {
                // Fanotify events are available.
                handle_events(fd);
                fds[1].revents = 0;
            }
        } else if (ret == -1 && errno != EINTR) {
            perror("poll");
            break;
        }
    }

    // Close fanotify file descriptor.
    close(fd);
    printf("Listening for events stopped.\n");
    return EXIT_SUCCESS;
}
 

SEE ALSO

fanotify_init(2), fanotify_mark(2), inotify(7)


 
