Running a Ruby process & ActiveRecord data set on multiple cores

You know what? in one of our (tekSymmetry LLC) projects we have a lot of background calculations,
which usually take many hours to complete. ever since we introduced those processes,
we have had problems with their execution time. sometimes it gets on our nerves.

as you know, a single ruby process can use only one processor core at a time.
this is probably one of the reasons why a multi-process deployment
strategy was picked by the ruby on rails community.

anyway, these days our servers have more than one core! more precisely,
in our case each production server has an 8-core intel xeon processor.

so the question arose: if we could run those long-running, expensive processes on multiple cores,
our system would have a better chance of getting faster!

well, this blog post is intended to show you the technique we used to do it in ruby on rails.

for better understanding, let me give you some hints about the context -

  • we have a database table with a large number of rows!
  • processing a single row doesn't require anything else from the same database table.
  • we are using linux (in our case debian lenny).

so here is the way we have done it -

  1. we took the total row count for the main query
  2. and divided it by the number of cores we have
  3. then we forked one child process for each subset of the rows
  4. and executed the logic and related stuff in each child!
  5. in the parent process we started a loop that checks the status of the newly forked processes
  6. once all the pid files (which are generated by the newly forked children) are removed,
    the parent process flags the run as finished and ends the loop.
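
the splitting in steps 1 to 3 can be sketched in plain ruby, without rails. this is only a minimal sketch, not the helper itself: `slices_for` is a hypothetical name, and it rounds the per-core size up so that the rows left over by the division still land in the last slice.

```ruby
# Hypothetical helper: split total_rows into one [offset, limit] pair per core.
def slices_for(total_rows, cores)
  per_core = (total_rows.to_f / cores).ceil   # round up so no row is dropped
  (0...cores).map do |i|
    offset = i * per_core
    # the last slices may be shorter, or empty when there are few rows
    limit = [[per_core, total_rows - offset].min, 0].max
    [offset, limit]
  end
end

slices_for(10, 4).each do |offset, limit|
  puts "offset=#{offset} limit=#{limit}"
end
```

with 10 rows and 4 cores this gives three slices of 3 rows and a last slice of 1 row, so every row is covered exactly once.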

so you see, it is damn simple :) and it is working for us :)
it has made our execution roughly 8x faster, because the rows are now processed on all 8 cores of the new server.

here is the ruby code showing how we did it. (we created a helper "multicore_execution_helper.rb", included it in the model, and thus execute_in_multicores became usable)

    require 'fileutils'

    module MulticoreExecutionHelper

      def execute_in_multicores(
          p_cores, p_total_rows, p_model, p_conditions = {}, &block)

        # Fall back to 2 processes when no usable core count is given
        p_cores = p_cores.to_i
        p_cores = 2 if p_cores == 0
        # Round up so the rows left over by the division are not skipped
        total_items_per_core = (p_total_rows.to_f / p_cores).ceil
        logger.info "[BATCH-PROCESS-LOG] Total processes - #{p_cores}, " +
                    "total rows - #{p_total_rows} [#{total_items_per_core} / 1 core]"

        # Create job id for each process
        job_ids = p_cores.times.collect { |i| rand.to_s }

        # Fork process for each core and execute the block
        p_cores.times do |offset|
          Process.fork do
            logger.info "[BATCH-PROCESS-LOG] Starting process - #{offset} " +
                        "assigned # #{job_ids[offset]}"

            # Keep job track through the created process pid file.
            pid_file = File.join(RAILS_ROOT, 'tmp/pids/', "#{job_ids[offset]}.pid")
            File.open(pid_file, 'w') { |f| f.puts Process.pid.to_s }

            # A forked process starts from a copy of the parent process's
            # memory, so we need to reconnect all live connections.
            begin
              ActiveRecord::Base.connection.reconnect!

              # Retrieve this process's slice of rows through the defined
              # offset and limit
              rows = p_model.find(
                  :all, {
                      :offset => (offset * total_items_per_core),
                      :limit => total_items_per_core}.merge(p_conditions))

              block.call(rows)
            rescue => e
              logger.error "[BATCH-PROCESS-LOG] Exception raised during " +
                           "execution - #{e.inspect}"
            ensure
              # Remove pid file since we are done here, even after an error
              FileUtils.rm_f(pid_file)
            end
          end
        end

        # monitor whether the processes are completed or still in progress
        # don't return from this method unless all the forked processes have
        # completed their job
        sleep(2)

        loop do
          fully_completed = true
          job_ids.each do |job_id|
            pid_file = File.join(RAILS_ROOT, 'tmp/pids/', "#{job_id}.pid")
            if File.exist?(pid_file)
              fully_completed = false
              break
            end
          end

          break if fully_completed
          sleep(2)
          logger.debug '[BATCH-PROCESS-LOG] again...'
        end
      end

    end

here is the usage code -

    execute_in_multicores(p_total_cores, SomeStuff.count, SomeStuff) do |some_stuffs|
      # do whatever you want with the rows here! this block runs on multiple cores
    end
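
one possible variation on the pid file polling: the parent can wait on its children directly. the sketch below is an assumption about how that could look, not what our helper does. `run_in_children` is a hypothetical name, and it uses ruby's standard `Process.fork` and `Process.wait2` to block until every child has exited and to collect each exit status.

```ruby
# Hypothetical alternative to pid-file polling: fork one child per slice,
# then block in the parent until each child has exited.
def run_in_children(slices)
  pids = slices.map do |slice|
    Process.fork do
      begin
        yield slice          # do the work for this slice in the child
        exit!(0)
      rescue StandardError
        exit!(1)             # report failure through the exit status
      end
    end
  end

  # Process.wait2 returns [pid, status]; success? is true for exit code 0
  pids.map { |pid| Process.wait2(pid).last.success? }
end

results = run_in_children([[0, 3], [3, 3], [6, 3]]) do |offset, limit|
  raise "boom" if offset == 6   # simulate one failing slice
end
puts results.inspect
```

this way the parent does not need the initial `sleep(2)`, and a child that crashes shows up as a `false` entry instead of looking like a finished job.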

see, it is really simple! :) if you like it, let me know how much you like it :)
here you can find the code on github 

best wishes!

BDD (behavior driven development) with easyb

hi,
just wondering, is there anyone who has started using easyb,
a behavior driven development framework? if you are not familiar with BDD, here is my explanation.

as you have heard of and practice TDD (test driven development), you know that (if you follow the test-first approach) you keep your specification up front in the test case.
for example -

public void testShouldCreateAnUserWhenItHasValidData() {…}

as you can see, you are actually writing a test case for the behavior (specification) of your expected code.
and based on that test case you implement your logic in code. this is how TDD works. for more explanation, google it :)

in BDD, this process is simplified further. for example, if you look at my previous example -
public void testShouldCreateAnUserWhenItHasValidData() {…}

as you can see, i have written one scenario: when the user object has valid data.
the same test can cover different scenarios, such as when the user object has no valid data, or the caller is not authorized, and so on.

so to make such things clear in java code, it requires code like the following, i.e.
public void testShouldCreateAnUserWhenItHasNoValidData();
public void testShouldCreateAnUserWhenItHasValidDataButCallerIsNotPermitted();
public void testShouldCreateAnUserWhenItHasValidDataButCallerIsPermitted();

this is where BDD proposes a new approach, making this more fluent through a simplified test framework.
like JUnit, easyb is another test framework, where you write your test context and behavior in groovy code.

actually, the beauty is that test scenarios are written following the user story convention,
which is similar to the ideal convention
"as an Author
i want to write a book
so that users can understand me".
you can also generate the user story from the groovy code, which you can't do with JUnit.
so you don't need to maintain a separate document for your user stories.

so when you are preparing a user story, you can use easyb and groovy to format it rather than using ms word, excel or a notepad text file ;)
i.e.

import com.somewherein.bdd.UserService
import com.somewherein.bdd.UserServiceImpl
import com.somewherein.bdd.User

scenario "create a new user with valid data", {

  given "an user with the valid data", {
    user = new User()
    userService = new UserServiceImpl()
    state = false
  }

  when "creating a new user", {
    state = userService.createUser(user)
  }

  then "returned state should be true", {
    state.shouldBe true
  }

  and "newly created user should be found", {
    // assert the lookup result instead of only calling it
    userService.exists(user).shouldBe true
  }
}

when i run this test it says -

Running user service story (UserServiceStory.groovy)
Scenarios run: 1, Failures: 0, Pending: 0, Time Elapsed: 0.649 sec
1 behavior run with no failures

so if i ask it to generate the user story, it generates the following text -

1 scenario executed successfully
Story: user service

scenario create a new user with valid data
given an user with the valid data
when creating a new user
then returned state should be true
then newly created user should be found

this type of practice is very common in ruby on rails based development.
in ruby we have several options, but RSpec was the early comer that showed how cool it could be.

anyway, this is something you should try over the Eid vacation. happy test-first development.

easyb makes it easy, man

here is an article from javalobby
Is easyb Easy? | Javalobby

you can use it with spring framework, here is the example -
best wishes,

test first development, does it run fast on ruby on rails or java environment?

i have been practicing the test-first development approach (from TDD) for the last 1.5 years. i wrote a lot of junit test cases and a lot of jmock based mock objects. it was amazing to work with these tools. jmock especially was fantastic and is my favorite mocking tool.

anyway, i seriously started with ruby on rails about 2 or 3 months ago, when i formed my own rails team and started working on our first commercial rails project.

at the beginning i was working with IntelliJ IDEA 6M2 in a windows environment, where i found that running ruby unit test cases took longer than we would expect. it was really annoying.

luckily i got my mac book pro, where i set up all the rails stuff with IntelliJ IDEA 6M2. now i can see a significant difference: the test environment for my rails project runs faster on this machine than i expected. it really leads me to believe that the test driven approach in a rails environment is really fast.

there must be some significant platform related difference, which only an expert on those platforms could confirm for me.

by the way, i am really loving IntelliJ IDEA and rails on my mac.

A RESTful url using Stripes.

A RESTful url may contain the content type within the url string. For example: http://host/action/xml/bookmark. Here "xml" is the content type; it can be changed to "text", "json", etc.

So the following urls are valid too: http://host/action/text/bookmark, http://host/action/json/bookmark, etc.

These days I am playing with the "Stripes" web framework a lot. Those who haven't come across "Stripes" yet, have a look at the following URL: http://stripes.mc4j.org, hope you will enjoy it.

Stripes has fantastic flexibility and allows a lot of customization. This tutorial demonstrates how you can enable this url pattern, so that a single "ActionBean" is invoked for the following url patterns:

http://host/action/xml/bookmark, http://host/action/text/bookmark, http://host/action/json/bookmark etc…

My target audience:
Those who have already been introduced to Stripes, or who are interested in the Stripes framework.

Routing plan:
Request Dispatch
[visitor] --> http://host/action/xml/bookmark --> [web server] --> [Stripes servlet] --> [Custom Action resolver] --> [remove user defined content type "xml" and retrieve action bean]

Response Dispatch
[clean url (after removing content type)] --> http://host/action/bookmark --> [BookmarkActionBean] --> [Resolution] -->

Prerequisite:
Stripes servlet is configured and web context is up and running.

ActionBean code:
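*  create an ActionBean, for example "BookmarkActionBean"

import java.io.StringReader;

import net.sourceforge.stripes.action.ActionBean;
import net.sourceforge.stripes.action.ActionBeanContext;
import net.sourceforge.stripes.action.DefaultHandler;
import net.sourceforge.stripes.action.Resolution;
import net.sourceforge.stripes.action.StreamingResolution;
import net.sourceforge.stripes.action.UrlBinding;

// Log and LogFactory are assumed to come from Apache Commons Logging
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

@UrlBinding("/action/bookmark")
public class BookmarkActionBean implements ActionBean {

    private static final Log LOG = LogFactory.getLog(BookmarkActionBean.class);

    private ActionBeanContext mActionBeanContext;

    private String mContentType = "text/xml";

    public void setContext(ActionBeanContext actionBeanContext) {
        mActionBeanContext = actionBeanContext;

        // retrieve the user requested content type constant, for example "text", "json", "xml"
        // and convert it to a MIME type such as "text", "text/xml", "application/JSON"
        mContentType = URLHelper.getMimeType(URLHelper.getContentType(getContext().getRequest().getPathInfo()));
    }

    public ActionBeanContext getContext() {
        return mActionBeanContext;
    }

    @DefaultHandler
    public Resolution save() {
        if (LOG.isDebugEnabled())
            LOG.debug( "storing bookmark object on the data store, " +
                       "response content type is requested - " + mContentType );

        // respond with an empty body in the requested content type
        return new StreamingResolution( mContentType, new StringReader("") );
    }
}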

Description:
Here I have bound our action to the "/action/bookmark" url, but the url pattern our users request is "/action/<content type>/actionBean".

Look at the "setContext" method, where I have added an extra line:

mContentType = URLHelper.getMimeType(URLHelper.getContentType(getContext().getRequest().getPathInfo()));

Now have a look at the URLHelper class.

public static String getContentType( String pRequestedUrl ) {
    // look the url up in the cache first
    String contentType = CACHED_URL_CONTENT_TYPE.get(pRequestedUrl);
    if (contentType == null) {
        contentType = DEFAULT_CONTENT_TYPE;
        Perl5Util util = new Perl5Util();
        // group 1 of the regex holds the content type part of the url
        if (util.match( REGEX_ACTION_CONTENT_TYPE, pRequestedUrl ))
            contentType = util.group(1) + "";
        CACHED_URL_CONTENT_TYPE.put( pRequestedUrl, contentType );
    }
    return contentType;
}

This method is responsible for retrieving the content type constant the user requested,
for example "text", "xml" or "json".

public static String getMimeType( String pContentType ) {
    // convert content type to MIME Type
    if (pContentType.equalsIgnoreCase(CONTENT_TYPE_CONST_XML))
        pContentType = CONTENT_TYPE_XML;
    else if (pContentType.equalsIgnoreCase(CONTENT_TYPE_CONST_JSON))
        pContentType = CONTENT_TYPE_JSON;
    return pContentType;
}

This method converts the constant to a MIME content type.
For example "xml" --> "text/xml".

Now return to my base action bean. I have added the following method, which is defined as the default handler.

@DefaultHandler
public Resolution save() {
    ....
    return new StreamingResolution( mContentType, new StringReader("") );
}

Here "mContentType" is repopulated whenever "setContext" is invoked.

ActionResolver code:
This is my action resolver.

public class NBookmarkSystemActionResolver extends NameBasedActionResolver {

    // .........

    @Override
    public ActionBean getActionBean(ActionBeanContext pActionBeanContext, String pRequestedUrl) throws StripesServletException {
        // strip the content type so the url matches the @UrlBinding of the bean
        String cleanUrl = URLHelper.getCleanUrl(pRequestedUrl);

        if (LOG.isDebugEnabled())
            LOG.debug( "user requested url - " + pRequestedUrl + " clean url - " + cleanUrl);

        return super.getActionBean(pActionBeanContext, cleanUrl);
    }
}

Now let's have a look at the "getCleanUrl" method.

public static String getCleanUrl( String pRequestedUrl ) {
    return pRequestedUrl.replaceAll("/" + getContentType(pRequestedUrl), "");
}
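
The regular expression behind REGEX_ACTION_CONTENT_TYPE is not shown in this post, so here is a minimal sketch of the idea, written in Ruby rather than Java. The pattern, the DEFAULT_CONTENT_TYPE value and both method names are assumptions made for illustration, not the real URLHelper code.

```ruby
# Assumed pattern: /action/<content type>/<bean name>
ACTION_CONTENT_TYPE = %r{\A/action/([^/]+)/[^/]+\z}
DEFAULT_CONTENT_TYPE = "xml"   # assumed default, not taken from the post

# Return the content type segment, or the default when none is present.
def content_type_of(path)
  match = ACTION_CONTENT_TYPE.match(path)
  match ? match[1] : DEFAULT_CONTENT_TYPE
end

# Remove only the content type segment so the path matches the url binding.
def clean_url(path)
  match = ACTION_CONTENT_TYPE.match(path)
  match ? path.sub("/#{match[1]}", "") : path
end

puts content_type_of("/action/json/bookmark")
puts clean_url("/action/json/bookmark")
puts clean_url("/action/bookmark")
```

In this sketch a url without a content type segment is returned unchanged, and only the first matching segment is removed, so a bean name that happens to contain the content type string is left alone.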

Custom ActionResolver configuration:

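<filter>
    <display-name>Stripes Filter</display-name>
    <filter-name>StripesFilter</filter-name>
    <filter-class>net.sourceforge.stripes.controller.StripesFilter</filter-class>
    <init-param>
        <param-name>ActionResolver.UrlFilters</param-name>
        <param-value>/WEB-INF/classes/</param-value>
    </init-param>
    <init-param>
        <param-name>ActionResolver.PackageFilters</param-name>
        <param-value>com.we4tech.nBookmarkSystem.service.webService/*</param-value>
    </init-param>
    <init-param>
        <param-name>ActionResolver.Class</param-name>
        <param-value>com.we4tech.nBookmarkSystem.configuration.NBookmarkSystemActionResolver</param-value>
    </init-param>
</filter>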

That’s all for today,

Hope you will enjoy stripes. Best of luck…

Nice Stripes web framework

hi,
at last i could manage some time to write about Stripes (http://stripes.mc4j.org/)

stripes is a web framework with very few dependencies, and convention over configuration is heavily adopted in Stripes.

you don't need to bother with any separate XML file or properties file; it is as simple as a web.xml servlet configuration. just put all your configuration over

moreover it has default convention-centric url discovery... for example "HelloAction": this action can be accessed over "http:///Hello.htm" or whatever you map in the url mapping.

it is a very flexible system. it has very simple work flow management (those already familiar with JSF navigation or Spring web flow will be interested in its work flow).

by default it supports multiple actions in each class, and you can override the url or any setting with an @annotation, which is the most powerful feature of stripes.

web form validation is one of the nice and simple approaches in stripes. i am really loving it..
you can apply validation rules with @annotations. here are a few examples.. hope they give you a clear understanding.

@ValidateNestedProperties({
    @Validate( field = "accountName", required = true, on = {"register"}),
    @Validate( field = "password", required = true, minlength = 6, on = {"register"})})
private User user;

@Validate( expression = "user.password == this", on = "register", required = true)
private String confirmPassword;

it has support for the spring container, so spring developers won't feel left out...

best wishes…

http://hasan.we4tech.com
