December 1, 2014

Just Enough Linux to be Dangerous

Author

Andreas Fast

Andreas Fast has more than 15 years of experience in the technology industry, from starting as a developer to leading high-performance teams at international organizations. As Principal at Moove It, he works with business leaders to help define their technology needs and objectives.

This post is inspired by a talk about Ruby & Linux at RubyConf Uruguay 2014.

The Linux Process in Ruby

Creating a new process

In Linux we use fork to create a new child process. In Ruby we have access to this through Process::fork.

It accepts an optional block, and can be called as either Process.fork or simply fork. When a block is given, the child process runs the block and terminates with a status of zero. When no block is given, the fork call returns twice: once in the parent process, returning the child’s pid, and once in the child process, returning nil. Only the thread that calls fork keeps running in the child process; no other thread is copied.

Linux implements fork with copy-on-write: instead of duplicating all the data, it copies a memory page only when one of the processes modifies it. Since Ruby 2.0 the garbage collector is copy-on-write friendly, so a GC run in the child no longer forces most of the shared memory to be copied.
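The threading rule can be checked with a small sketch. It assumes a platform where fork is available, such as Linux. The sleeping worker thread and the trick of carrying the thread count back through the child's exit status are illustrative choices, not something taken from the talk.

```ruby
# A second thread that just sleeps, so the parent has at least two threads.
worker = Thread.new { sleep }

pid = fork do
  # Only the thread that called fork exists in the child.
  # exit! skips at_exit handlers; the exit status carries the count back.
  exit!(Thread.list.size)
end

Process.wait(pid)
child_threads  = $?.exitstatus
parent_threads = Thread.list.size

puts "threads in parent: #{parent_threads}, threads in child: #{child_threads}"
worker.kill
```

If the child relies on background threads (a connection pool reaper, for instance), they have to be started again after the fork.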

Files & Network Connections

When using fork, open files and network connections are shared between the processes. This can be a good or a bad thing depending on what you are doing. Sometimes you are interested in writing to the same file from both processes at the same time. But in the case of a database connection you could run into weird scenarios. So remember to reconnect, or close and reopen files, if you are not interested in using the same connection as the parent process.
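A minimal sketch of what sharing means for a file, assuming Linux and a writable temporary directory. The Tempfile and the two messages are made up for the illustration. Both processes write through the same open file, so they also share the file offset.

```ruby
require 'tempfile'

log = Tempfile.new('shared')
log.sync = true # hand every write straight to the kernel, no Ruby-side buffering

pid = fork do
  log.puts 'written by the child'
  exit! # leave without running the cleanup hooks inherited from the parent
end
Process.wait(pid)

# The file offset is shared too, so this line lands after the child's line
# instead of overwriting it.
log.puts 'written by the parent'

log.rewind
contents = log.read
puts contents
log.close!
```

The same sharing is what makes an inherited database socket risky: both processes would be reading and writing one protocol stream.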

Fork Example

def puts_with_pid(string)
  puts "[#{Process.pid}]$ #{string}"
end

puts_with_pid "Fork Example"

child_pid = Process.fork { puts_with_pid "I'm the child process in a fork block" }
Process.wait(child_pid)

child_pid = fork

if child_pid.nil?
  puts_with_pid "This is the child process because child_pid.nil? => true and the PID is #{Process.pid}"
else
  puts_with_pid "This is the parent process with pid #{Process.pid} and the child's pid is #{child_pid}"
end

puts_with_pid "Both processes print this and child_pid is the way to differentiate between them (child_pid=#{child_pid.inspect})"

Process.wait(child_pid) if child_pid

Example run

$ ruby fork.rb 
[5109]$ Fork Example
[5111]$ I'm the child process in a fork block
[5109]$ This is the parent process with pid 5109 and the child's pid is 5114
[5109]$ Both processes print this and child_pid is the way to differentiate between them (child_pid=5114)
[5114]$ This is the child process because child_pid.nil? => true and the PID is 5114
[5114]$ Both processes print this and child_pid is the way to differentiate between them (child_pid=nil)

Process ID (PID)

To get a process’s PID we can use Process::pid, and to get the parent’s PID we use Process::ppid.

$ ruby -e'puts "pid=#{Process.pid} and parent pid=#{Process.ppid}"'
pid=5251 and parent pid=2220

Reaping a child’s status

When using fork in Linux, the child’s exit status needs to be collected; otherwise the operating system will accumulate zombies. There are a number of ways to do this in Ruby. Here’s a list:

  • Process.wait(pid=-1, flags=0)
    • pid > 0  Waits for the child whose process ID equals pid.
    • 0   Waits for any child whose process group ID equals that of the calling process.
    • -1  Waits for any child process (the default if no pid is given).
    • < -1  Waits for any child whose process group ID equals the absolute value of pid.
  • Process.waitall
    • Waits for all children, returning an array of pid/status pairs (where status is a Process::Status object).
  • Process.detach
    • Sets up a separate Ruby thread whose sole job is to reap the status of the process pid when it terminates.
    • Use detach only when you do not intend to explicitly wait for the child to terminate.
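The three options in the list can be tried side by side. This is a sketch assuming a fork-capable platform; the exit codes are arbitrary values chosen so the results are easy to tell apart.

```ruby
# Process.wait: block until one specific child exits; $? then holds its status.
pid = fork { exit 3 }
Process.wait(pid)
wait_status = $?.exitstatus

# Process.waitall: reap every remaining child in one call.
2.times { |i| fork { exit i } }
all_statuses = Process.waitall.map { |_pid, status| status.exitstatus }.sort

# Process.detach: a background thread reaps the child for us.
pid = fork { exit 0 }
reaper = Process.detach(pid)
# Reading the thread's value is only done here to show the result;
# normally you detach precisely because you do not want to wait.
detached_status = reaper.value.exitstatus

puts "wait: #{wait_status}, waitall: #{all_statuses.inspect}, detach: #{detached_status}"
```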

Spawn

Spawn executes the specified command and returns its pid. It does not wait for the command to finish, so the parent process should either wait for it or use detach if it doesn’t care about the exit status.

pid = spawn(RbConfig.ruby, "-eputs'Hello, world!'")
Process.wait pid

Daemonize

Process::daemon allows a Ruby process to detach from the controlling terminal and run as a system daemon.
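Here is a rough sketch of what that looks like from the outside, assuming Linux. Calling Process.daemon directly would send the current script to the background, so the sketch daemonizes a forked helper instead and uses a pipe to report back. The helper, the pipe and the variable names are all illustrative.

```ruby
reader, writer = IO.pipe

launcher = fork do
  reader.close
  # true = stay in the current directory (the default is to chdir to /).
  # stdin, stdout and stderr are still redirected to /dev/null.
  Process.daemon(true)
  # From here on we are the daemon: a different process in its own session.
  writer.puts "#{Process.pid} #{Process.getsid}"
  writer.close
  exit!
end

writer.close
Process.wait(launcher) # the launcher exits as soon as it has daemonized
daemon_pid, daemon_sid = reader.gets.split.map(&:to_i)
reader.close

puts "launcher pid: #{launcher}, daemon pid: #{daemon_pid}"
puts "our session: #{Process.getsid}, daemon session: #{daemon_sid}"
```

The daemon ends up with a new pid and a new session id, which is what detaches it from the terminal that started it.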

Signals

Trapping Signals

Signals play a big part in IPC. To handle a signal in Ruby you can use

Signal::trap(signal, command)
Signal::trap(signal) { |signo| block }

It receives the signal and either a command or a block. The special commands, such as "IGNORE" and "DEFAULT", are listed in the Signal::trap documentation. The more interesting scenario here is to pass a block to trap and execute whatever it is we want to do when receiving the signal. Note that signal handlers need to be reentrant and that signals are deferred. This means that your process might receive a signal only once even if it was triggered many times in quick succession. There’s no way to tell beforehand.
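As a small illustration of the command form, the sketch below uses two of the documented command strings, "IGNORE" and "DEFAULT". It sends the signal to the current process with Process.kill, which is covered in the next section. The choice of USR1 is arbitrary.

```ruby
# "IGNORE" discards the signal instead of running its default action.
Signal.trap('USR1', 'IGNORE')

# Without the line above, USR1's default action would terminate this process.
Process.kill('USR1', Process.pid)
survived = true
puts 'still alive after USR1'

# "DEFAULT" restores the operating system's default action.
Signal.trap('USR1', 'DEFAULT')
```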

Sending Signals

To send a signal in Ruby use

Process::kill(signal, pid)

With kill you indicate the signal and the pid. This will send the signal to the process indicated by pid.

Example

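pid = fork do
  trap 'TERM' do
    puts "OK, I'm done"
    exit
  end

  loop { sleep }
end

# Crude synchronization: give the child a moment to install its handler.
# If TERM arrived before trap ran, the default action would kill the child silently.
sleep 0.1
Process.kill 'TERM', pid
Process.wait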

Execution

$ ruby signal.rb 
OK, I'm done

Reentrant

When trapping signals you need to make sure your handler is reentrant. This means it must behave correctly even when it is invoked again while a previous invocation is still running, because a new signal may interrupt the handling of an earlier one. A good practice here is to have the handler only store the signal in a queue and work through the queue in another thread or in the main loop of the process.
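The queue idea can be sketched as follows. This is one possible shape, not the approach from the talk: a plain Array serves as the queue, the process sends USR1 and USR2 to itself as a stand-in for outside signals, and the two-second deadline only keeps the sketch from looping forever.

```ruby
pending = [] # signals recorded by the handlers, oldest first

%w[USR1 USR2].each do |name|
  # The handler does the bare minimum: remember which signal arrived.
  trap(name) { pending << name }
end

# Stand-in for signals arriving from outside: send two to ourselves.
Process.kill('USR1', Process.pid)
Process.kill('USR2', Process.pid)

handled = []
deadline = Time.now + 2 # give up after two seconds rather than loop forever
while handled.size < 2 && Time.now < deadline
  name = pending.shift
  if name
    handled << name # the real work would happen here, outside the handler
  else
    sleep 0.01
  end
end

puts "handled: #{handled.inspect}"
```

Because the handler only appends to the array, it does not matter if it is interrupted by another signal. All the slow or non-reentrant work happens in the main loop.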

Deferred

Not every signal that is sent will make it to the process, because standard signals are not queued; they are only marked as pending. So if you send a signal while the same signal is still pending, the two are merged and the process sees it only once.

Pipes

On the Linux command line we like to combine commands using the pipe operator “|”. This is very useful for connecting one process to another: small programs each do one specific thing, and pipes combine them to get more complex things done. In Ruby we can access the pipe functionality using IO::pipe.

reader, writer = IO.pipe

writer.puts 'Hello'
writer.puts 'World'

puts reader.gets
puts reader.gets
# reader.gets # would block!

require 'io/wait'
puts reader.ready? # => false

writer.puts '!'
puts reader.ready? # => true
puts reader.gets

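Writes to a pipe are atomic up to PIPE_BUF bytes: POSIX guarantees at least 512 bytes, and on Linux the limit is 4096 bytes. There is also a limit to how much data a pipe can buffer. The capacity is set by the kernel and defaults to 64 KiB on Linux. Once the pipe is full, a write blocks (or fails with EAGAIN in non-blocking mode) until a reader drains some data. No data is lost.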
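What happens when a pipe fills up can be observed directly. The sketch below writes in non-blocking mode until the kernel refuses more data, then reads everything back. The 1 KiB chunk size is arbitrary, and the number of bytes accepted depends on the kernel's pipe capacity.

```ruby
reader, writer = IO.pipe
chunk = 'x' * 1024
written = 0

begin
  # write_nonblock returns the number of bytes the kernel accepted.
  loop { written += writer.write_nonblock(chunk) }
rescue IO::WaitWritable
  # The pipe is full: the write is refused, nothing is silently dropped.
end

received = 0
begin
  loop { received += reader.read_nonblock(4096).bytesize }
rescue IO::WaitReadable
  # The pipe is empty again.
end

puts "pipe accepted #{written} bytes, reader got #{received} bytes back"
writer.close
reader.close
```

A blocking writer would simply have waited at the point where write_nonblock raised.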

IPC using pipes

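Now the interesting part is how to communicate between processes using pipes. After a fork, both ends of the pipe remain open and are shared. So it’s only a matter of closing the end each process doesn’t use. Close the writer in the child and the reader in the parent and we have a communication channel from the parent to the child. Do it the other way around and the channel goes from the child to the parent, as in this example where three children report back to the parent.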

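reader, writer = IO.pipe

3.times do
  fork do
    reader.close # the children only write
    writer.puts "I'm child with pid #{Process.pid}"
  end
end

writer.close # the parent only reads
3.times { puts reader.gets }
reader.close
Process.waitall # reap the children so they don't linger as zombies

Example run

$ ruby pipe_ipc.rb 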
I'm child with pid 13434
I'm child with pid 13437
I'm child with pid 13440

We can start more processes and keep communicating back and forth through pipes. Since a pipe only carries data in one direction, we need one pipe per direction per process.
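A minimal sketch of one pipe per direction, assuming a fork-capable platform. The upcasing child and the variable names are only there to make the round trip visible.

```ruby
to_child_r,  to_child_w  = IO.pipe # parent -> child
to_parent_r, to_parent_w = IO.pipe # child -> parent

pid = fork do
  # The child keeps the read end of one pipe and the write end of the other.
  to_child_w.close
  to_parent_r.close

  while (line = to_child_r.gets)
    to_parent_w.puts line.chomp.upcase
  end
  to_parent_w.close
end

# The parent keeps the opposite ends.
to_child_r.close
to_parent_w.close

to_child_w.puts 'hello'
to_child_w.puts 'world'
to_child_w.close # end of input: the child's gets returns nil and its loop ends

replies = to_parent_r.read.split("\n")
to_parent_r.close
Process.wait(pid)

puts replies.inspect
```

Closing the unused ends matters: the child only sees end of input once every copy of the write end, including the one it inherited, has been closed.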

Conclusion

Linux has small but very powerful inter-process communication tools, and they are usable through the Ruby API. It’s up to us to find the right tool for the job and to leverage proven tools that work fast and have been around for years. They’re going nowhere, so we can definitely rely on them.
