After spending some time working with Python and discovering a module called xlrd that can open Excel files, I wanted to see if Ruby had an equivalent to add to my repertoire. I should have known there would be an awesome gem to handle it…
Roo — https://github.com/roo-rb/roo — Works with xls, xml, ods, csv formats
The Roo gem makes it easy to open and work with spreadsheets in Ruby. I was impressed how fast it was to implement and quickly parse through the data I was working with. Using roo & pry gems, we can quickly load in an xlsx file and drop in a binding.pry to see what objects are created. Feel free to download the sample dataset to follow along with a few examples!
The example data set is a spreadsheet that has four sheets composed of instructions, orders, lists, and links. We are going to load this spreadsheet with a simple Ruby script, and turn orders into objects. I’ll make a new Ruby file called example.rb, and place the downloaded spreadsheet in the same folder. Next we will require the necessary gems, and then assign the contents of the xlsx file to a spreadsheet variable .
If you place a binding.pry below that last line and run your file, you will notice that our spreadsheet is a Roo::Excelx object. Calling .info on the spreadsheet variable will give us even more information like the name of the file, the number of sheets, and the names of each sheet.
Sheets with roo numerically start at 0, similar to a Ruby array. So to access the orders we want, we need to iterate over sheet 1. Call the .sheet method on the spreadsheet variable and then we can pass an argument of the number or name of the sheet we want to access.
You could get the same response by passing the sheet name to the .sheet method, and adding .row(row_number) on the end.
Using .each to iterate over the spreadsheet is pretty neat as well. The syntax is a bit different, since you need to pass in key/value pairs with the names of the columns. Doing this will generate a hash for each record with the key being the name you provided, and the value being the actual value of the row.
We can save the Orders sheet to a new variable, and call a few other methods on it. We can use methods to find the number of the first and last rows as well as the first and last columns. By calling .cell on our variable, we can also retrieve a specific cells information.
The ability to easily export to different formats is really helpful. The current formats supported by roo are csv, matrix, xml & yaml. To convert our example to a csv, it is as simple as calling the method .to_csv on our spreadsheet variable.
You can use the same options available to the CSV library to parse CSV files using the csv_options key. Here is the CSV Documentation for more info — https://ruby-doc.org/stdlib-2.6.1/libdoc/csv/rdoc/CSV.html