You know what? In one of our (tekSymmetry LLC) projects we have a lot of background calculations,
which usually take many hours to complete. Ever since we introduced those processes,
we have had problems with their execution time. Sometimes it gets on our nerves,
because, as you know, a single Ruby process can only use one processor core at a time.
This is probably one of the reasons why the Ruby on Rails community picked a
multi-process deployment strategy.
Anyway, these days our servers have more than one core! More precisely,
in our case each production server has an 8-core Intel Xeon processor.
So the question arose: if we could run those long-running, expensive processes on multiple cores,
our system would have a better chance of getting faster!
Well, this blog post is intended to show you the technique we used to do it in Ruby on Rails.
For better understanding, let me give you some hints so you can get the context:
- our database tables have a lot of rows!
- processing a single row doesn't depend on any other row of the same table.
- we are using Linux (in our case Debian Lenny).
So here is how we did it:
- we took the total row count for the main query
- and divided it by the number of cores we have
- then we forked a child process for each subset of the rows
- and executed the logic and related stuff in it!
- in the parent process we started a loop that keeps checking the status of the forked processes
- once all the pid files (which are created by the forked children) have been removed,
the parent process flags the run as completed and ends the loop.
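The row split in the steps above can be sketched in plain Ruby. This is only an illustration: `slice_ranges` is a made-up name and not part of the helper in this post, and it rounds the per-core size up so that leftover rows still land in the last slice.

```ruby
# Hypothetical helper (not from the original project): split total_rows
# into at most `cores` contiguous [offset, limit] pairs.
def slice_ranges(total_rows, cores)
  # Ceiling division, so rows left over by the division are not dropped.
  per_core = (total_rows + cores - 1) / cores

  ranges = (0...cores).map do |i|
    offset = i * per_core
    limit  = [per_core, total_rows - offset].min
    [offset, limit]
  end

  # Drop slices that start past the last row (more cores than rows).
  ranges.select { |_offset, limit| limit > 0 }
end

slice_ranges(10, 4).each do |offset, limit|
  puts "offset=#{offset} limit=#{limit}"
end
```

Each pair maps directly onto the offset and limit options of the query that a child process runs.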
So you see, it is damn simple :) and it is working for us :).
It has made our execution 8x faster, because the new server has 8 cores.
Here is the Ruby code showing how we did it. (We created a helper "multicore_execution_helper.rb" and included it in the model, so execute_in_multicores became usable.)
    require 'fileutils'

    module MulticoreExecutionHelper

      def execute_in_multicores(
          p_cores, p_total_rows, p_model, p_conditions = {}, &block)

        # Fall back to 2 processes when no usable core count is given.
        p_cores = p_cores.to_i
        p_cores = 2 if p_cores == 0
        # Round up so the rows left over by the division are not skipped.
        total_items_per_core = (p_total_rows + p_cores - 1) / p_cores
        logger.info "[BATCH-PROCESS-LOG] Total processes - #{p_cores}, " +
                    "total rows - #{p_total_rows} [#{total_items_per_core} / 1 core]"

        # Create job id for each process
        job_ids = Array.new(p_cores) { rand.to_s }

        # Fork process for each core and execute the block
        p_cores.times do |offset|
          Process.fork do
            logger.info "[BATCH-PROCESS-LOG] Starting process - #{offset} " +
                        "assigned # #{job_ids[offset]}"

            # Keep job track through the created process pid file.
            pid_file = File.join(RAILS_ROOT, 'tmp/pids/', "#{job_ids[offset]}.pid")
            File.open(pid_file, 'w') {|f| f.puts Process.pid.to_s}

            # Since the forked process is created from a copy of the parent
            # process's memory, we need to reconnect all live connections.
            begin
              ActiveRecord::Base.connection.reconnect!

              # Retrieve this process's subset of rows through the defined
              # offset and limit
              teams = p_model.find(
                :all, {
                  :offset => (offset * total_items_per_core),
                  :limit => total_items_per_core}.merge(p_conditions))

              block.call(teams)
            rescue => e
              logger.error "[BATCH-PROCESS-LOG] Exception raised during " +
                           "execution - #{e.inspect}"
            ensure
              # Remove pid since we are done here!
              FileUtils.rm(pid_file)
            end
          end
        end

        # Monitor whether the processes are completed or still in progress.
        # Don't return from this method unless all the forked processes have
        # completed their job.
        sleep(2)

        loop do
          still_running = job_ids.any? do |job_id|
            File.exist?(File.join(RAILS_ROOT, 'tmp/pids/', "#{job_id}.pid"))
          end

          break unless still_running
          sleep(2)
          logger.debug '[BATCH-PROCESS-LOG] again...'
        end
      end

    end
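As an aside, the parent could also wait for its children through the operating system instead of polling pid files. The following is a minimal sketch under that assumption, not the helper above: `run_slices_in_forks` is a hypothetical name, it works on plain arrays instead of ActiveRecord rows, and it assumes Ruby 2.4 or later on a platform that supports fork (such as Linux).

```ruby
require 'tmpdir'

# Hypothetical helper (not the helper from this post): fork one child per
# slice, run the given block in the child, and wait for every child.
# Returns true only if all children exited with status 0.
def run_slices_in_forks(slices)
  pids = slices.map do |slice|
    Process.fork do
      begin
        yield slice
      rescue StandardError => e
        warn "child #{Process.pid} failed: #{e.inspect}"
        exit!(1) # non-zero status tells the parent this slice failed
      end
      exit!(0)   # exit! skips at_exit handlers inherited from the parent
    end
  end

  # Process.wait2 blocks until the given child exits and returns
  # [pid, Process::Status], so no pid files or sleep loop are needed.
  pids.map { |pid| Process.wait2(pid).last.success? }.all?
end

Dir.mktmpdir do |dir|
  slices = [[1, 2, 3], [4, 5], [6]]
  ok = run_slices_in_forks(slices) do |slice|
    # A child cannot change the parent's memory, so each child writes its
    # result to its own file.
    File.write(File.join(dir, "#{Process.pid}.out"), slice.sum.to_s)
  end
  totals = Dir[File.join(dir, '*.out')].map { |f| File.read(f).to_i }
  puts "all children succeeded: #{ok}, total: #{totals.sum}"
end
```

With `Process.wait2` the parent learns each child's exit status directly, so it does not depend on a child having written its pid file before the first check.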
Here is the usage code:

    execute_in_multicores(p_total_cores, SomeStuff.count, SomeStuff) do |some_stuffs|
      # Do whatever you want to do with the stuff here! These are going to run on multiple cores!
    end

See, it is really simple! :) If you like it, let me know how much you like it :)
You can find the code on GitHub. Best wishes!





