Working With Ruby Libraries

Ruby has a rich and active ecosystem of libraries. A library is a collection of reusable code (modules, classes, and methods) written to help with a particular task. Building your program on top of an existing library is usually easier than reinventing the wheel.

In Ruby parlance, a library is called a "gem". Each gem has a name (like "rake" or "middleman") and a version number (like 3.3.3).
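
Version numbers are compared segment by segment rather than as plain strings, which matters once a segment reaches two digits. A minimal sketch using the Gem::Version class that ships with RubyGems (the helper name newer? is made up for this illustration):

```ruby
require 'rubygems' # loaded by default in modern Ruby; shown for completeness

# Hypothetical helper: true if version string `a` is newer than `b`.
def newer?(a, b)
  Gem::Version.new(a) > Gem::Version.new(b)
end

puts newer?('3.3.10', '3.3.3') # prints true: 10 > 3 in the last segment
puts '3.3.10' > '3.3.3'        # prints false: string comparison gets this wrong
```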

This document is intended to give you a brief overview of how to obtain gems, manage their installation, and make use of them in your Ruby programs.

Read this entire guide before installing any gems. In particular, you should not install any gems until you have understood the section on Bundler and reviewed the step-by-step summary at the end.

Obtaining Gems

In order to use a gem in your program, you first have to obtain the gem by downloading it from some repository. Most gems are written in Ruby, so downloading the source code is enough to make use of them immediately. Some gems, however, include "native extensions", which are pieces of code written in C (or some other language) and must be compiled after downloading.

Another complication is that many gems are, themselves, built on top of other gems. In order to use gem X, you may need to download gems Y and Z too.
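
To see why installing one gem can pull in several others, here is a toy sketch of collecting transitive dependencies. The dependency table and the gem names (x, y, z, w) are invented for illustration and do not describe real gems; the real resolution is done by the gem command.

```ruby
# Hypothetical dependency table: each gem maps to the gems it needs directly.
TOY_DEPENDENCIES = {
  'x' => %w[y z],
  'y' => %w[w],
  'z' => [],
  'w' => []
}.freeze

# Collects every gem required to use `name`, whether direct or transitive.
def all_dependencies(name, table = TOY_DEPENDENCIES, seen = [])
  table.fetch(name, []).each do |dep|
    next if seen.include?(dep) # already collected, avoid repeats and cycles

    seen << dep
    all_dependencies(dep, table, seen)
  end
  seen
end

p all_dependencies('x') # prints ["y", "w", "z"]
```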

The basic command for working with gems on your system is gem. For example, you already installed the rails gem as part of setting up your virtual machine:

```
$ gem install rails
```

You can also list all the gems currently installed on your machine:

```
$ gem list
```

The gems you see are those that come with Ruby (e.g., irb, pp, fileutils) as well as those that have been installed with the rails gem. The rails gem depends on several more gems (e.g., rake, sqlite3), which depend on others and so on. Some of these have native extensions. These were all downloaded and compiled as necessary by the one gem install rails command you used when setting up your virtual machine.
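
The same information is available from inside Ruby. A small sketch using RubyGems' Gem::Specification class; the helper names are made up for this example, and the output depends on what is installed on your machine:

```ruby
require 'rubygems'

# Maps each installed gem name to the list of its installed version strings.
def installed_gem_versions
  Gem::Specification.group_by(&:name).transform_values do |specs|
    specs.map { |spec| spec.version.to_s }
  end
end

# Names of the gems that `gem_name` directly depends on at runtime,
# or nil if that gem is not installed.
def runtime_dependency_names(gem_name)
  Gem::Specification.find_by_name(gem_name).runtime_dependencies.map(&:name)
rescue Gem::LoadError
  nil
end

# Print a few installed gems in roughly the style of `gem list`.
installed_gem_versions.first(5).each do |name, versions|
  puts "#{name} (#{versions.join(', ')})"
end

p runtime_dependency_names('rails') # nil unless rails is installed
```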

You can find descriptions of many popular gems at The Ruby Toolbox. This site organizes gems according to their application domain. The standard repository from which gems are downloaded is RubyGems.org.

Using Gems

Within a Ruby program, you use a gem by first issuing the require instruction. This instruction tells the Ruby interpreter to load the gem's code, making its modules, classes, and methods available to the program.

```
$ irb
> ap Array.new 3 # won't work: the 'amazing_print' gem is installed but hasn't been loaded
undefined method 'ap' for main (NoMethodError)
> require 'amazing_print' # loads the 'amazing_print' gem
=> true
> ap Array.new 3 # works now!
[
    [0] nil,
    [1] nil,
    [2] nil
]
=> nil
```
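
When a library cannot be found, require raises LoadError, which a program can rescue in order to report a clearer message. A minimal sketch; the helper name try_require is invented here, and json is used only because it ships with Ruby:

```ruby
# Hypothetical helper: attempts to load a library and reports whether it worked.
def try_require(name)
  require name
  true
rescue LoadError
  false
end

puts try_require('json')                  # prints true
puts try_require('no_such_library_12345') # prints false
```

Note that require itself returns true only the first time a library is loaded and false on later calls, which is why the helper returns true explicitly.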

More information on gems is available here.

Bundler: Managing Gems

Gems make your code cleaner and more succinct. But their use introduces a dependency: Anyone wanting to use your program must also have the proper gems installed. Furthermore, they should have the proper versions of the proper gems installed.

Similarly, for groups collaborating on a project, every group member should have the same set of gems and the same versions of each gem installed on their machine.

This complication gets worse over time. Gems are frequently updated and sometimes these updates can introduce significant changes in behavior. The decision of whether or not to update a project to use a particular gem's new version must be made by the entire group: Either everyone updates to the new version, or no one does.

This complication also gets worse as you become involved in more projects. You might have two projects, for example, that both use the rails gem, but one uses version 7.0.2 while the other uses 5.2.8.

Bundler is the standard tool for managing this complexity. The core idea behind Bundler is that every project includes two files: Gemfile and Gemfile.lock.

The first (Gemfile) lists the immediate dependencies of the program. These are the gems that are used directly (i.e., with a require instruction) in the program. In addition, the Gemfile can list constraints on the version numbers of these gems.

Here is a sample Gemfile:

```
source 'https://rubygems.org'

gem 'middleman', '~> 4.3'
gem 'middleman-autoprefixer', '~> 2.7'
gem 'tzinfo-data', platforms: [:mswin, :mingw, :jruby, :x64_mingw]
gem 'wdm', '~> 0.1', platforms: [:mswin, :mingw, :x64_mingw]
```
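
The ~> operator in the sample is the "pessimistic" version constraint: '~> 4.3' accepts 4.3 and any later 4.x release, but not 5.0. RubyGems exposes this logic through Gem::Requirement, so a quick check can be sketched as follows (the helper name allowed? is made up for this example):

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy the Gemfile-style `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?('~> 4.3', '4.3.0') # prints true
puts allowed?('~> 4.3', '4.5.1') # prints true
puts allowed?('~> 4.3', '5.0.0') # prints false
puts allowed?('~> 0.1', '0.1.1') # prints true
```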

The Gemfile.lock file, on the other hand, lists all the gems on which the program depends, even the transitive dependencies. Furthermore, this file lists specific version numbers for each of these gems. The Gemfile.lock file, then, serves as the ultimate authority for which gems and which versions of these gems must be used when running the program.
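
Conceptually, the lock file records one concrete version for each constraint. The sketch below imitates that step by picking the newest version that satisfies a constraint. The list of available versions is invented for illustration, and Bundler's real resolver must also reconcile the constraints of every transitive dependency, so treat this as a simplification rather than a description of how Bundler works internally.

```ruby
require 'rubygems'

# Hypothetical list of published versions for some gem.
TOY_AVAILABLE_VERSIONS = %w[4.2.1 4.3.0 4.4.2 5.0.0].freeze

# Returns the newest version string satisfying `constraint`, or nil if none does.
def pick_version(constraint, available = TOY_AVAILABLE_VERSIONS)
  requirement = Gem::Requirement.new(constraint)
  candidates = available.map { |v| Gem::Version.new(v) }
                        .select { |v| requirement.satisfied_by?(v) }
  best = candidates.max
  best && best.to_s
end

puts pick_version('~> 4.3') # prints 4.4.2
p pick_version('>= 6.0')    # prints nil
```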

While Gemfile can be created and edited by hand, you should use Bundler to manage the Gemfile.lock file:

If a Gemfile.lock file does not exist, Bundler can generate one from the Gemfile. Command: bundle install.
If a Gemfile.lock file does exist, Bundler enforces that the gems (and versions) listed there are used when running the program. Command: bundle exec.
Bundler can install the gems (and versions) listed in the Gemfile.lock file. Command: bundle install.
Bundler can update a Gemfile.lock file if you want to use a different version of a gem or if you make changes to the Gemfile. Command: bundle update.

The upshot is that both of these files become part of your project. They should both be under version control, for example, and shared on the central repository just like your source code.

For a 3901 project, you will probably generate the Gemfile.lock file once, then use it for the lifetime of that project. The basic life-cycle of a project that uses gems is:

One person (once) creates the Gemfile, generates the Gemfile.lock, and commits these files to version control (pushing these new files to the central repository, of course).
Everyone on the team (once, after having fetched the Gemfile and Gemfile.lock from the central repository) installs the specified gems on their machine.
Everyone on the team (always) runs their program using Bundler to ensure that the right version of each gem is used every time.

The Bundler documentation is available at bundler.io.

Step-by-step

The following instructions use the Mechanize gem to illustrate the steps for using a gem in a Ruby program.

Create a Gemfile in the project's root directory. Modify this file, either directly using a text editor or via Bundler's CLI, to specify the gems you want to use.
```
$ bundle init # creates a bare-bones Gemfile
$ bundle add "mechanize" # updates Gemfile, creates Gemfile.lock, installs gems
```

Commit both files (Gemfile and Gemfile.lock) to the repository.

```
$ git add Gemfile Gemfile.lock
$ git commit -m "add bundler files to manage gem versions"
$ git push origin main
```

Install gems. Use Bundler (not the gem command) to install the right gems (and the right versions). This step creates the Gemfile.lock file if it doesn't already exist. On the other hand, if Gemfile.lock is already present, this step installs the gems specified by that file.
```
$ bundle install # installs gems, respecting/creating Gemfile.lock
```

Use the gem in your program.

```
# scraper.rb
require 'mechanize'
agent = Mechanize.new
page = agent.get 'http://www.osu.edu'
```

Always run the program using the right gems.
```
$ bundle exec ruby scraper.rb
```
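
To confirm which version of a gem a running program actually loaded, RubyGems keeps a table of activated gems in Gem.loaded_specs. A small sketch (the helper name is made up; json stands in for whichever gem you want to check, such as mechanize):

```ruby
require 'json'

# Version string of an activated gem, or nil if it has not been activated.
def loaded_gem_version(name)
  spec = Gem.loaded_specs[name]
  spec && spec.version.to_s
end

p loaded_gem_version('json')                 # a version string, or nil if not activated
p loaded_gem_version('not-a-real-gem-12345') # prints nil
```

When the program is started with bundle exec, the version reported here should match the entry for that gem in Gemfile.lock.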