Author
Andreas Fast has more than 15 years of experience in the technology industry, from starting as a developer to leading high-performance teams at international organizations. As Principal at Moove It, he works with business leaders to help define their technology needs and objectives.
The Linux Process in Ruby
Creating a new process
In Linux we use fork to create a new child process. In Ruby we have access to this through Process::fork.
It accepts an optional block, and we can call it either as Process.fork or simply fork. When a block is given, the child process runs the block and terminates with a status of zero. When no block is given, the fork call returns twice: once in the parent process, returning the child’s pid, and once in the child process, returning nil. Only the thread that called fork keeps running in the child process; no other thread is copied. Fork relies on copy-on-write: instead of duplicating all of the parent’s memory, the kernel copies a page only when one of the processes modifies it. Since Ruby 2.0 the garbage collector is copy-on-write friendly (it keeps its mark bits in separate bitmaps rather than inside the objects), so a GC run no longer forces the shared pages to be copied.
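The point about threads can be checked with a small sketch. It assumes a Unix-like system where fork is available; the background thread and the pipe used to report back are illustrative helpers, not part of the original examples.

```ruby
reader, writer = IO.pipe

# A background thread that exists only to show it is not carried over by fork.
background = Thread.new { sleep }

child_pid = fork do
  reader.close
  # Report what the child sees: how many threads exist, and whether the
  # parent's background thread is alive here.
  writer.puts Thread.list.size
  writer.puts background.alive?
  writer.close
end

writer.close
Process.wait(child_pid)
child_thread_count = reader.gets.to_i
child_sees_background_alive = reader.gets.strip
reader.close

puts "threads in the parent: #{Thread.list.size}"
puts "threads in the child: #{child_thread_count}"
puts "background thread alive in the child? #{child_sees_background_alive}"

background.kill
```

The child should report a single thread (the one that called fork), while the parent still has its background thread.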
Files & Network Connections
After a fork, open files and network connections are shared between the parent and the child. This can be a good or a bad thing depending on what you are doing. Sometimes you do want both processes to write to the same file. With a database connection, however, two processes talking over the same socket can interleave their traffic and produce very confusing errors. So remember to re-connect, or close and re-open files, in the child if you do not want it to use the same connection or file as the parent process.
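As a rough illustration of what sharing means for a file, the sketch below has the child and then the parent write through the same inherited file handle. The scratch file path is a hypothetical name chosen for this demo, and the parent waits for the child before writing so that the order is predictable.

```ruby
require 'tmpdir'

# Hypothetical scratch file, used only for this demonstration.
path = File.join(Dir.tmpdir, "fork_shared_#{Process.pid}.txt")
file = File.open(path, 'w+')

child_pid = fork do
  # The child inherits the open file, including its current offset.
  file.write("child\n")
  file.flush
end
Process.wait(child_pid)

# The parent's write lands after the child's because the offset is shared.
file.write("parent\n")
file.flush

file.rewind
contents = file.read
file.close
File.delete(path)

puts contents
```

If the two processes had each opened the file on their own, the parent’s write would start at offset zero and overwrite the child’s line instead of following it.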
Fork Example
def puts_with_pid(string)
  puts "[#{Process.pid}]$ #{string}"
end

puts_with_pid "Fork Example"

child_pid = Process.fork { puts_with_pid "I'm the child process in a fork block" }
Process.wait(child_pid)

child_pid = fork
if child_pid.nil?
  puts_with_pid "This is the child process because child_pid.nil? => true and the PID is #{Process.pid}"
else
  puts_with_pid "This is the parent process with pid #{Process.pid} and the child's pid is #{child_pid}"
end

puts_with_pid "Both processes print this and child_pid is the way to differentiate between them (child_pid=#{child_pid.inspect})"
Process.wait(child_pid) if child_pid
Example run
$ ruby fork.rb
[5109]$ Fork Example
[5111]$ I'm the child process in a fork block
[5109]$ This is the parent process with pid 5109 and the child's pid is 5114
[5109]$ Both processes print this and child_pid is the way to differentiate between them (child_pid=5114)
[5114]$ This is the child process because child_pid.nil? => true and the PID is 5114
[5114]$ Both processes print this and child_pid is the way to differentiate between them (child_pid=nil)
Process ID (PID)
To get a process’ PID we can use Process::pid and to get the parent’s pid we use Process::ppid
$ ruby -e'puts "pid=#{Process.pid} and parent pid=#{Process.ppid}"'
pid=5251 and parent pid=2220
Reaping a child’s status
When using fork in Linux, the parent needs to collect the child’s exit status; otherwise terminated children linger as zombies. There are a number of ways to do this in Ruby. Here’s a list:
- Process.wait(pid=-1, flags=0)
  - pid > 0: Waits for the child whose process ID equals pid.
  - pid = 0: Waits for any child whose process group ID equals that of the calling process.
  - pid = -1: Waits for any child process (the default if no pid is given).
  - pid < -1: Waits for any child whose process group ID equals the absolute value of pid.
- Process.waitall
  - Waits for all children, returning an array of pid/status pairs (where status is a Process::Status object).
- Process.detach
  - Sets up a separate Ruby thread whose sole job is to reap the status of the process pid when it terminates.
  - Use detach only when you do not intend to explicitly wait for the child to terminate.
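The three options above can be sketched side by side. The exit codes are arbitrary values picked for the demo, and the snippet assumes the script has no other child processes running.

```ruby
# Process.wait blocks until the given child exits; $? then holds its status.
pid = fork { exit 3 }
Process.wait(pid)
first_status = $?
puts "child #{pid} exited with status #{first_status.exitstatus}"

# Process.waitall reaps every remaining child in one call.
pids = 2.times.map { |i| fork { exit i } }
results = Process.waitall
results.each do |child_pid, child_status|
  puts "reaped #{child_pid} (exit status #{child_status.exitstatus})"
end

# Process.detach returns a thread that reaps the child in the background.
# The thread's value is the child's Process::Status.
detached_pid = fork { exit 0 }
reaper = Process.detach(detached_pid)
detached_status = reaper.value
puts "detached child finished successfully? #{detached_status.success?}"
```

Calling value on the detach thread is only done here to observe the result; in real use you would typically detach and move on.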
Spawn
spawn executes the specified command and returns its pid. It does not wait for the command to finish, so the parent process should either wait for it or use detach if it does not care about the exit status.
pid = spawn(RbConfig.ruby, "-eputs'Hello, world!'")
Process.wait pid
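spawn also accepts redirection options, which makes it easy to capture what the command prints. The following is a minimal sketch that sends the child’s standard output into a pipe; the one-line script passed to the child is made up for the demo.

```ruby
require 'rbconfig'

reader, writer = IO.pipe

# Single quotes keep the interpolation for the child: it prints its own pid.
pid = spawn(RbConfig.ruby, '-e', 'puts "hello from #{Process.pid}"', out: writer)
writer.close               # the parent only reads

output = reader.read       # returns once the child has closed its stdout
reader.close

Process.wait(pid)
spawn_status = $?
puts "child #{pid} printed #{output.inspect} and exited with #{spawn_status.exitstatus}"
```

Closing the writer in the parent matters here: as long as any copy of the write end stays open, the read never sees end-of-file.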
Daemonize
Process::daemon allows a Ruby process to detach from the controlling terminal and run in the background as a system daemon.
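To watch this happen without losing the current script, the sketch below daemonizes from inside a forked child and has the daemon report back through a pipe. The pipe and the launcher process are scaffolding added for the demo, and both arguments are set to true (keep the working directory, keep the standard streams open), which is not the default.

```ruby
reader, writer = IO.pipe

launcher_pid = fork do
  reader.close
  # nochdir = true keeps the working directory, noclose = true keeps
  # stdin/stdout/stderr open. Both default to false.
  Process.daemon(true, true)
  # From here on this code runs in the daemon, a different process
  # living in a new session.
  writer.puts "#{Process.pid} #{Process.getsid}"
  writer.close
end

writer.close
Process.wait(launcher_pid)   # the launcher exits as soon as it has daemonized
daemon_pid, daemon_sid = reader.gets.split.map(&:to_i)
reader.close

puts "launcher #{launcher_pid} handed off to daemon #{daemon_pid}"
puts "daemon session #{daemon_sid}, our session #{Process.getsid}"
```

The daemon ends up with a different pid than the process that called Process.daemon, and it no longer belongs to the session of the terminal that started it.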
Signals
Trapping Signals
Signals play a big part in IPC. To handle a signal in Ruby you can use:
Signal::trap(signal, command)
Signal::trap(signal) { |signo| block }
trap receives the signal and either a command or a block. The special commands (such as "IGNORE" and "DEFAULT") are listed in the Signal::trap documentation. The more interesting scenario is to pass a block that runs whatever we want to do when the signal arrives. Note that signal handlers need to be reentrant and that signals are deferred: if the same signal is triggered many times in quick succession, your process might receive it only once, and there’s no way to tell beforehand.
Sending Signals
To send a signal in Ruby, use:
Process::kill(signal, pid)
kill takes the signal (by name or by number) and the pid of the process that should receive it.
Example
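ready_reader, ready_writer = IO.pipe

pid = fork do
  trap 'TERM' do
    puts "OK, I'm done"
    exit
  end
  ready_writer.puts 'ready' # tell the parent the handler is installed
  loop { sleep }
end

ready_reader.gets # without this the signal could arrive before the trap is set
Process.kill 'TERM', pid
Process.wait pid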
Execution
$ ruby signal.rb
OK, I'm done
Reentrant
When trapping signals you need to make sure your handler code is reentrant. This means it must be safe to run again while a previous invocation is still in progress, because a handler may itself be interrupted by another signal. A good practice is to keep the handler minimal: store the signal in a queue and work through the queue in another thread or in the main loop of the process.
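A minimal sketch of the queue approach follows. It uses a plain Array as the queue and has the process signal itself so the example fits in one script; the USR1/USR2 signals, the two-second deadline, and the variable names are all choices made for this demo.

```ruby
signal_queue = []

# The handlers do nothing but record which signal arrived.
%w[USR1 USR2].each do |name|
  trap(name) { signal_queue << name }
end

# Send two signals to ourselves to have something to process.
Process.kill('USR1', Process.pid)
Process.kill('USR2', Process.pid)

# The real work happens here, outside of the handlers.
handled = []
deadline = Time.now + 2
until handled.size == 2 || Time.now > deadline
  if (name = signal_queue.shift)
    handled << name
    puts "handling SIG#{name} in the main loop"
  else
    sleep 0.01
  end
end
```

Because the handlers only append to the queue, it does not matter much if one of them is interrupted by another signal; the slow or non-reentrant work runs in the main loop at a time of our choosing.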
Deferred
Not every signal that is sent will make it to the process, because standard signals are not queued: the kernel only records that a given signal is pending. If you send a signal while the same signal is still pending for that process, the two are merged and the handler runs only once.
Pipes
On the Linux command line we like to combine commands using the pipe operator “|”. This is very useful for connecting one process to another: small programs each do one specific thing, and pipes combine them to get more complex work done. In Ruby we can access the pipe functionality using IO::pipe.
require 'io/wait'

reader, writer = IO.pipe
writer.puts 'Hello'
writer.puts 'World'
puts reader.gets
puts reader.gets
# reader.gets # would block!

puts reader.ready? # => false
writer.puts '!'
puts reader.ready? # => true
puts reader.gets
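Writes to a pipe of up to PIPE_BUF bytes are atomic: POSIX guarantees at least 512 bytes, and on Linux PIPE_BUF is 4096 bytes. There is also a limit to how much data a pipe can buffer. The capacity is set by the kernel and defaults to 64 KiB on Linux. Once the buffer is full no data is lost: a blocking write waits until the reader has consumed some data, and a non-blocking write fails with EAGAIN.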
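One way to observe the buffer limit is to fill a pipe with non-blocking writes and count how many bytes it accepts before it reports that it is full. This is only a sketch; the 1024-byte chunk size is an arbitrary choice, and the number printed depends on the operating system and its configuration.

```ruby
reader, writer = IO.pipe
chunk = 'x' * 1024   # arbitrary chunk size for the demo
written = 0

loop do
  # With exception: false a full pipe yields :wait_writable instead of raising.
  result = writer.write_nonblock(chunk, exception: false)
  break if result == :wait_writable
  written += result
end
puts "the pipe accepted #{written} bytes before it was full"

# Everything that was accepted can still be read back.
writer.close
data = reader.read
reader.close
puts "read back #{data.bytesize} bytes"
```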
IPC using pipes
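Now the interesting part is how to make processes communicate using pipes. After a fork the pipe remains open and is shared by both processes. It’s then only a matter of each process closing the end it does not use: the reading side closes its writer and the writing side closes its reader. This matters because a reader only sees end-of-file once every copy of the write end has been closed. In the example below the children write and the parent reads.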
reader, writer = IO.pipe

3.times do
  fork do
    reader.close # the children only write
    writer.puts "I'm child with pid #{Process.pid}"
  end
end

writer.close # the parent only reads
3.times { puts reader.gets }
reader.close
Process.waitall # reap the children
$ ruby pipe_ipc.rb
I'm child with pid 13434
I'm child with pid 13437
I'm child with pid 13440
We can start more processes and keep communicating back and forth through pipes. A pipe carries data in one direction only, so two-way communication needs one pipe per direction for each process.
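Here is a rough sketch of two-way communication with a single child, using one pipe per direction. The variable names and the upcasing "protocol" are invented for the demo.

```ruby
# One pipe per direction: requests go to the child, replies come back.
to_child_reader, to_child_writer = IO.pipe
to_parent_reader, to_parent_writer = IO.pipe

child_pid = fork do
  # The child reads requests and writes replies, so it closes the other ends.
  to_child_writer.close
  to_parent_reader.close
  while (line = to_child_reader.gets)
    to_parent_writer.puts line.chomp.upcase
  end
  to_parent_writer.close
end

# The parent writes requests and reads replies.
to_child_reader.close
to_parent_writer.close

to_child_writer.puts 'hello'
to_child_writer.puts 'world'
to_child_writer.close # end-of-file tells the child there are no more requests

replies = to_parent_reader.read.split("\n")
to_parent_reader.close
Process.wait(child_pid)

puts replies.inspect
```

Closing the request pipe is what ends the child’s loop, and the child closing its reply pipe is what lets the parent’s read return.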
Conclusion
Linux has small but very powerful inter-process communication tools, and they are usable through the Ruby API. It’s up to us to find the right tool for the job and to leverage proven tools that work fast and have been around for years. They’re not going anywhere, so we can definitely rely on them.