I/O Buffering in the Standard Libraries of Perl and Python

The other day, while setting up logging for deadlock events with pt-deadlock-logger, I ran into a peculiar issue: even though the connection to the database was clearly successful, deadlock events were not printed to STDOUT immediately, but only after a delay.

Interestingly, the issue only occurred when running the command via docker run or when deploying it to an ECS cluster. When running it with docker run -t (which allocates a pseudo-terminal), or when executing pt-deadlock-logger directly from Terminal.app on my local machine (where a terminal is also attached), the output appeared immediately.

After investigating, I found the reason. pt-deadlock-logger is implemented in Perl, and Perl buffers output streams such as STDOUT by default; writes sit in the buffer until it fills up, so when the output volume is low they only appear after a delay.
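
This is easy to reproduce outside of pt-deadlock-logger. Here is a minimal sketch (the script name is just for illustration): when it is run with its output piped, e.g. perl demo.pl | cat, the message only shows up once the script exits, because a single short line is nowhere near enough to fill the block buffer.

use strict;
use warnings;

# With STDOUT piped (block-buffered), this line stays in the buffer
# until the buffer fills or the program exits, so it shows up late.
print "deadlock event logged\n";
sleep 10;    # simulate a long-running process that writes very little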

Buffering output at the standard-library level makes sense. It is a common optimization in languages like Ruby and Python as well, improving CPU efficiency by reducing the number of write syscalls. Moreover, Perl, Ruby, and Python all seem to adjust this buffering dynamically based on whether a terminal is attached, roughly line-buffered when one is and block-buffered when not (which explains why docker run -t made a difference).
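
Whether a terminal is attached is easy to check yourself. The sketch below is my own illustration (not code from pt-deadlock-logger) using Perl's -t filetest operator, which reports whether a filehandle is connected to a terminal, and which is, as far as I can tell, the same kind of isatty() check the runtimes rely on:

use strict;
use warnings;

# -t is true when the filehandle is attached to a terminal: true under
# `docker run -t` or a local terminal, false under a plain `docker run`
# or when the output is piped.
if (-t STDOUT) {
    print "STDOUT is a terminal\n";
} else {
    print "STDOUT is not a terminal\n";
}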

Of course, disabling buffering indiscriminately in production applications is not advisable. Still, there are cases where it needs to be turned off under specific conditions.

In Python, this can conveniently be controlled from the outside via the PYTHONUNBUFFERED environment variable (or the -u flag). In Perl, by contrast, it requires modifying the source code; for instance, $io->autoflush() can be used to effectively disable buffering on a handle:

use IO::Handle;
STDOUT->autoflush(1);   # flush STDOUT after every write
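
As far as I understand, the classic one-line equivalent is the special $| variable, which enables autoflush on the currently selected output handle (STDOUT by default):

$| = 1;    # flush after every print on the currently selected handle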

Since I'm not deeply familiar with Perl, I'm not sure whether this is best practice, but I've also submitted a PR to pt-deadlock-logger to address the issue.

2021-01-12