Mauro Morales

software developer

Tag: Ruby

  • Running a Patched Ruby on Heroku

    You use a PaaS because you want all the underlying infrastructure and configuration of your application to be hidden from you. However, there are times when you are forced to look deeper into the stack. In this article I want to share how simple it is to run a patched version of Ruby on Heroku.

    CONTEXT

    It all started while trying to upgrade Ruby in an application. Unfortunately, every newer version I tried made the application break. After some searching around, I came across a bug report from 3 years ago in Ruby upstream.

    The issue was actually not in Ruby but in Onigmo, the regular expression library that Ruby uses under the hood. All versions since 2.4 were affected, i.e. all supported versions at the time of writing, including 2.5.8, 2.6.6 and 2.7.1. Luckily for me, Onigmo had been patched upstream, but the patch will only land in Ruby 2.7 later this year.

    This meant that I was going to have to patch Ruby myself. For local development this is not a big deal, but I wasn’t sure if it was possible on Heroku. I remembered from CloudFoundry and Jenkins-X that the part of the platform that takes care of building and installing the language is the buildpack, so I decided to investigate buildpacks on Heroku.

    HEROKU’S RUBY BUILDPACK

    Heroku’s Ruby buildpack is used to run your application whenever it contains a Gemfile and a Gemfile.lock. By parsing these, it figures out which version of Ruby it’s meant to use.
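    To get a feel for what reading the Ruby version out of these files involves, here is a minimal sketch in Ruby. It is not the buildpack’s actual code: the helper name ruby_version_from_lockfile and the sample lockfile are made up for illustration, and it assumes the lockfile contains a RUBY VERSION section, which Bundler writes when the Gemfile declares a Ruby version.

```ruby
# Hypothetical helper (not part of the Heroku buildpack): extract the Ruby
# version from the text of a Gemfile.lock, or return nil if none is recorded.
def ruby_version_from_lockfile(lockfile_text)
  match = lockfile_text.match(/^RUBY VERSION\n\s+ruby (\d+\.\d+\.\d+)/)
  match && match[1]
end

# Made-up lockfile contents, trimmed down to the sections relevant here.
SAMPLE_LOCKFILE = <<~LOCK
  GEM
    remote: https://rubygems.org/
    specs:

  PLATFORMS
    ruby

  DEPENDENCIES

  RUBY VERSION
     ruby 2.6.6p146

  BUNDLED WITH
     2.1.4
LOCK

puts ruby_version_from_lockfile(SAMPLE_LOCKFILE) # prints 2.6.6
```

    A lockfile without a RUBY VERSION section makes the helper return nil, and a real tool would then need to fall back to some default version.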

    Once it knows which version of Ruby to install, it runs bin/support/download_ruby to download a prebuilt package and extract it so that it is available to your application. As a quick hack, I decided to modify this file to do what I did in my development environment to patch Ruby.

    1. First, download the Ruby source code from upstream instead of the version prebuilt by Heroku.

       curl --fail --silent --location -o /tmp/ruby-2.6.6.tar.gz https://cache.ruby-lang.org/pub/ruby/2.6/ruby-2.6.6.tar.gz
       tar xzf /tmp/ruby-2.6.6.tar.gz -C /tmp/src
       cd /tmp/src/ruby-2.6.6

    2. Then apply a patch from a file I placed under bin/support/ (probably not the best place, but OK while I was figuring things out).

       patch < "$BIN_DIR/support/onigmo-fix.diff"

    3. And finally, build and install Ruby.

       autoconf
       ./configure --disable-install-doc --prefix "$RUBY_BOOTSTRAP_DIR" --enable-load-relative --enable-shared
       make
       make install

    You can find an unpolished but working version of what I did here

    USING THE BUILDPACK IN YOUR APPLICATION

    Now all that is left is to tell your application to use your custom buildpack instead of Heroku’s supported one. You can do this from the command line by running:

    heroku buildpacks:set https://github.com/some/buildpack.git -a myapp

    Alternatively, add a file called app.json to the root directory of your application sources (not the buildpack sources). I ended up using this form since I prefer to keep as much of the platform configuration as possible in code.

    {
      "environments": {
        "staging": {
          "addons": ["heroku-postgresql:hobby-dev"],
          "buildpacks": [
            {
              "url": "https://github.com/some/buildpack.git"
            }
          ]
        }
      }
    }

    Now every time a deployment is made to this environment, the buildpack will download the Ruby sources, patch them, and build and install the result.

    This is of course far from optimal, since every deployment wastes a lot of time building Ruby. Instead, you should do something similar to what Heroku does: prebuild the patched version of Ruby and download it from an S3 bucket.

    CONCLUSION

    Using a patched version of Ruby comes with a heavy price tag: maintenance. You should still apply updates (at least security updates) until the fix lands upstream. You also need to use the patched version in all your environments, e.g. production, staging and your CI. Whether all this extra work is worth it is something you’ll need to analyze. In the cases where the benefits outweigh the costs, it’s great to know that you don’t have to give up all the benefits of a platform like Heroku to run your own version of Ruby.

  • Ruby’s DATA Stream

    The STDIN and ARGF streams are commonly used in Ruby, but there’s also the less popular DATA stream. Here is how it works, along with some examples in the wild.

    HOW TO READ FROM DATA?

    As with any other stream, you can use gets and readlines. This behaviour is defined by the IO class. There’s a caveat, however: your script needs to have a data section. To define it, use the __END__ keyword to separate code from data.

    $ cat hello_world.rb
    puts DATA.gets
    __END__
    hello world!
    
    $ ruby hello_world.rb
    hello world!

    Look at that, another way to code hello world in Ruby. Without the __END__ keyword, you’ll get the following error:

    NameError: uninitialized constant DATA
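    Both behaviours can be checked side by side with a small sketch. Since DATA is only defined for the script file that Ruby is executing, the sketch writes each sample to a temporary file and runs it in a child Ruby process. The helper name run_script and the two sample scripts are made up for illustration.

```ruby
require "open3"
require "rbconfig"
require "tempfile"

# Hypothetical helper: write the source to a temporary script file, run it
# with the current Ruby interpreter and return stdout and stderr combined.
def run_script(source)
  Tempfile.create(["data_demo", ".rb"]) do |file|
    file.write(source)
    file.flush
    output, _status = Open3.capture2e(RbConfig.ruby, file.path)
    output
  end
end

# A script with a data section: everything after __END__ is readable via DATA.
WITH_DATA = <<~SCRIPT
  DATA.each_line { |line| puts line.upcase }
  __END__
  hello
  world
SCRIPT

# The same idea without a data section: DATA is never defined.
WITHOUT_DATA = <<~SCRIPT
  puts DATA.gets
SCRIPT

RESULT_WITH_DATA = run_script(WITH_DATA)
RESULT_WITHOUT_DATA = run_script(WITHOUT_DATA)

puts RESULT_WITH_DATA    # prints HELLO and WORLD on separate lines
puts RESULT_WITHOUT_DATA # error output mentioning "uninitialized constant DATA"
```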

    WHEN TO USE IT?

    You could use the data section of the script if you wanted to keep the data and code really close together, or if you wanted to do some sort of preprocessing of your sources. But to be honest, the only real benefit I can think of is performance. Instead of starting a second IO operation to read a file containing the data, the data gets loaded at the same time as the script.

    EXAMPLES

    One thing I’ve learned while working with Go is to check Go’s source files for good examples. Even though you cannot do this with Ruby to the same degree, because the sources are in C, you can still check the parts of the sources that are written in Ruby, as well as the gems and tools maintained within the Ruby sources. Here are some examples:

  • Numbered Parameters in Ruby 2.7

    A new feature called “numbered parameters” will see the light of day in the Ruby 2.7 release at the end of the year. What caught my attention was not the feature itself but the mixed reception it got from the community.

    BLOCK PARAMETERS

    Whenever you open a block, you have the chance to declare a list of parameters:

    object.method { |parameter_1, parameter_2, ... parameter_n| ... }

    For example if you were iterating over a hash to print its keys with matching values you’d do something like this:

    my_hash.each { |key, value| puts "#{key}: #{value}" }

    NUMBERED PARAMETERS

    With the new numbered parameters, you will be able to save yourself some keystrokes: use @ followed by the number that represents the position of the parameter you want to use. Our previous code would now look like this:

    my_hash.each { puts "#{@1}: #{@2}" }

    NO DEFAULT VARIABLE NAME

    Other languages, like Kotlin, use “it” as the default variable name within a block.

    collection.map { println(it) }

    This is not the case with this new feature.

    object.method { p @1 }

    is syntactic sugar for

    object.method { |parameter_1,| p parameter_1 }

    and not for

    object.method { |parameter| p parameter } 

    So pay attention to the dataset you are passing because you might get some unexpected behaviour like this one:

    [1, ['a', 'b'], 3, {foo: "bar"}].map { @1 }
    => [1, "a", 3, {:foo=>"bar"}]

    As you can see, 1 and 3 are taken as the first numbered parameter, as expected. The nested array, however, is destructured: each of its elements becomes one of the numbered parameters, so @1 => 'a' and @2 => 'b'. The hash is treated as a single object, so it won’t get split.

    This shouldn’t come as a surprise since it’s the expected behaviour of doing

    [1, ['a', 'b'], 3, {foo: "bar"}].map { |x,| x }

    but in this case we make it clear to the reader when we say |x,|. There is no plan to make @1 behave like a default variable name, which is odd because that’s exactly what was requested in the original issue.
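    The difference the trailing comma makes can be checked on any Ruby version, without numbered parameters. The following is a small sketch; the constant names are only there for illustration.

```ruby
DATASET = [1, ['a', 'b'], 3, { foo: 'bar' }].freeze

# |x,| destructures array elements and keeps only their first item.
WITH_TRAILING_COMMA = DATASET.map { |x,| x }

# |x| receives every element as-is, nested arrays included.
WITHOUT_TRAILING_COMMA = DATASET.map { |x| x }

p WITH_TRAILING_COMMA[1]    # => "a"
p WITHOUT_TRAILING_COMMA[1] # => ["a", "b"]
```

    Only the nested array is affected; the integers and the hash come out the same in both versions.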

    BACKWARDS COMPATIBILITY IS A HIGH PRIORITY

    As I already mentioned, this is what the person who opened the issue wanted to have, but it was not accepted in its original form because of backwards compatibility. Introducing new keywords to the Ruby language is a no-go at the moment because Matz is not a fan of breaking developers’ old code with newer versions of Ruby.

    I appreciate that Matz takes such a strong stance on this matter. I think it’s important to update your code bases to use the latest version of Ruby, but the harder it is to make an update, the less likely it is that you’ll end up doing it. If I update to Ruby 2.7 and start seeing breaking changes everywhere in my code base, I’m just going to put the upgrade on hold for as long as possible. Instead, this experience should be a welcoming one.

    PAIN OR GAIN?

    I don’t know how many times you pass a list of parameters to a block versus how many times you pass a single parameter, but I’m pretty sure in every code base you can find many more instances of the latter than the former. So the question is: How valuable is this new feature?

    Nobody seems to like the fact that numbered parameters start with @ and some community members are also saying that developers could get confused thinking that the numbered parameters are instance variables.

    There is currently an open issue requesting that numbered parameters be reconsidered because, in its current state, the feature brings more pain than value. What do you think? Do you like numbered parameters? Do you think they should be implemented in a different way? Would you rather not have them at all? There’s some informal voting happening in case you want to chip in.

  • Getting Started With Continuous Delivery

    More and more companies require developers to understand Continuous Integration and Continuous Delivery, but starting to implement them in your projects can be a bit overwhelming. Start with a simple website and soon enough you will feel confident enough to do it with more complex projects.

    THE RIGHT MINDSET

    TDD/BDD, CI/CD, XP, Agile, Scrum …. Ahhhhh, leave me alone I just want to code!

    Yes, all these methodologies can be a bit complicated at first, but simply because you are not used to them. Like a muscle, you need to train them, and the more you do so, the sooner you will stop feeling that they are a total waste of time.

    Once you have made up your mind that CD is for you, your team or your project, you will need to define a process and follow it. Don’t make it easy to break the process, and before you know it you and your team will feel like a fish in water.

    AUTOMATE A SIMPLE WEBSITE DEPLOYMENT

    There are many ways you can solve this problem. I will use a certain stack. If you don’t have experience with any of the tools, try to implement it with one you do have experience with.

    Stack                     Tool/Service   Alternatives
    VPS                       DigitalOcean   Linode or Vagrant
    Configuration Management  Ansible        Chef or Puppet
    Static site generator     Middleman      Jekyll or pure HTML
    CI/CD Server              Semaphore      Codeship or Jenkins

    The first thing is to create a new droplet in DigitalOcean (you could also do this with Ansible, but we won’t in this tutorial). Make sure there is a deploy user and set up SSH keys for it (again, something we could do with Ansible, but we’ll leave that for another post). Set up your domain to point to the new server’s IP address; I will use ‘example.com’.

    ANSIBLE

    Create a folder for your playbook and, inside it, start with a file called ansible.cfg. There we will override the default configuration by pointing to a new inventory inside your playbook’s folder and specifying the deploy user.

    [defaults]
    inventory=inventory
    remote_user=deploy
    

    Now in our inventory file we specify a group called web and include our domain.

    [web]
    example.com
    

    Our tasks will be defined in simple-webserver.yml

    ---
    - name: Simple Web Server
      hosts: example.com
      become: true
      tasks:
        - name: Install nginx
          apt: pkg=nginx state=present update_cache=true
          notify: start nginx
        - name: remove default nginx site
          file: path=/etc/nginx/sites-enabled/default state=absent
        - name: Ensure project root dir exists
          file: >
            path=/srv/www/example.com
            state=directory
            owner=deploy
            group=www-data
        - name: copy nginx config file
          template: >
            src=templates/nginx.conf.j2
            dest=/etc/nginx/sites-available/example.com
          notify: restart nginx
        - name: enable configuration
          file: >
            dest=/etc/nginx/sites-enabled/example.com
            src=/etc/nginx/sites-available/example.com
            state=link
          notify: restart nginx
      handlers:
        - name: start nginx
          service: name=nginx state=started
        - name: restart nginx
          service: name=nginx state=restarted
    

    In it we make reference to a template called templates/nginx.conf.j2 where we will specify a simple virtual host.

    server {
            listen *:80;
    
            root /srv/www/example.com;
            index index.html index.htm;
    
            server_name example.com;
    
            location / {
                    try_files $uri $uri/ =404;
            }
    }
    

    I’ll show you in another post how to do this same setup but with multiple virtual hosts in case you run multiple sites.

    Run it by calling:

    ansible-playbook simple-webserver.yml
    

    MIDDLEMAN

    Middleman has a very simple way to deploy over rsync. Just make sure you have the following gem in your Gemfile:

    gem 'middleman-deploy'
    

    And then add something like this to your config.rb:

    activate :deploy do |deploy|
      deploy.method = :rsync
      deploy.host   = 'example.com'
      deploy.path   = '/srv/www/example.com'
      deploy.user   = 'deploy'
    end
    

    Before you can deploy, you need to remember to build your site. This is easy to forget, so instead we will add rake tasks to our Rakefile to do it for us.

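    desc 'Build site'
    task :build do
      # sh prints the command's output and aborts the task if it fails,
      # whereas backticks would silently swallow both
      sh 'middleman build'
    end

    desc 'Deploy site'
    task :deploy do
      sh 'middleman deploy'
    end

    # No body needed: the task only runs its prerequisites in order
    desc 'Build and deploy site'
    task build_deploy: [:build, :deploy]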
    

    GIT FLOW

    Technically, you don’t really need git flow for this process, but I do believe that having a proper branching model is key to a successful CD environment. Depending on your team’s process you might want to use something else, but if you don’t have anything defined, please take a look at git flow. It might be just what you need.

    For this tutorial I will oversimplify the process and just use the develop, master and release branches by following these three steps:

    1. Commit all the desired changes into the develop branch
    2. Create a release and add the release’s information
    3. Merge the release into master

    Let’s go through the steps in the command line. We start by adding the new features and committing them.

    git add Rakefile
    git commit -m 'Add rake task for easier deployment'
    

    Now we create a release.

    git flow release start '1.0.0'
    

    This would be a good time to test everything out. Bump the version number of your software (in my case 1.0.0), update the change log and do any last minute fixes.

    Commit the changes and let’s wrap up this step by finishing our release.

    git flow release finish '1.0.0'
    

    Try to write something meaningful for your tag message so you can easily refer to a version later on by its description.

    git tag -n
    

    Hold your horses and don’t push your changes just yet.

    SEMAPHORE

    Add a new project from GitHub or Bitbucket.

    For the build you might want to have something along the lines of:

    bundle install --path vendor/bundle
    bundle exec rake spec
    

    Now go into the project’s settings and, inside the Deployment tab, add a server.

    Because we are using a generic option, Semaphore will need access to our server. Generate an SSH key pair, paste the private key into Semaphore and add the public key to your server.

    For the deploy commands you need to have something like this:

    ssh-keyscan -H -p 22 example.com >> ~/.ssh/known_hosts
    bundle exec rake build_deploy
    

    PUSH YOUR CHANGES

    Push your changes to the master branch and voilà, Semaphore will build and deploy your site.

    Once you get into the habit of doing this with your website, you will feel more confident doing it with something like a Rails application.

    If you have any questions please leave them below, I’ll respond to every single one of them.