Downloading a File

The Problem

Just like with file uploads, we hit the same issue with downloading them -- a dialog box just out of Selenium's reach.

Conventional wisdom holds that automating file downloads may be a foolish endeavor since, in addition to the dialog box problem, it is a deep rabbit hole to fall down. But here is a reasonable approach that will get you there without too much fuss.

A Solution

We are going to side-step the dialog box by telling the browser to skip it and to save files automatically to a location we specify. Depending on the file's type, we will also need to provide some additional configuration -- overriding some sensible defaults.

Once the file is downloaded we will make a quick and dirty assertion on it to see that it exists and is not empty.

Let's dig in with an example in Ruby using Selenium WebDriver against our test app.

An Example

First we will include the requisite libraries and wire up our setup, teardown, and run actions in helper methods.

We are using the selenium-webdriver gem so we can demonstrate the browser profile configuration inline, and rspec-expectations for our assertion.

require 'selenium-webdriver'
require 'rspec-expectations'

def setup
  @download_dir = "/Users/more/Desktop/tmp/#{Time.now.strftime("%m%d%y_%H%M%S")}"
  profile = Selenium::WebDriver::Firefox::Profile.new
  profile['browser.download.folderList'] = 2
  profile['browser.download.dir'] = @download_dir
  profile['browser.helperApps.neverAsk.saveToDisk'] = 'image/jpeg, application/pdf'
  profile['pdfjs.disabled'] = true

  @driver = Selenium::WebDriver.for :firefox, :profile => profile
end

def teardown
  @driver.quit
  system("rm -rf #{@download_dir}")
end

def run
  setup
  yield
  teardown
end

The setup method is where the magic is happening. In it we are creating an instance variable and setting it to an absolute path for the directory we would like to use (e.g. a uniquely named temp folder that exists on my Desktop). This will come in handy later when we need to access the downloaded file and make an assertion against it.
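The hard-coded Desktop path only works on one machine. As a minimal sketch, assuming you would rather let Ruby pick the location, the standard library's Dir.mktmpdir creates a uniquely named directory under the system temp folder. The helper name unique_download_dir and the 'downloads' prefix are made up for illustration and are not part of the original example.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of the original tip): creates a uniquely
# named directory under the system temp folder and returns its path.
def unique_download_dir(prefix = 'downloads')
  Dir.mktmpdir(prefix)
end

download_dir = unique_download_dir
puts download_dir

# Clean up when finished, much like the teardown method does.
FileUtils.rm_rf(download_dir)
```

The returned path could be assigned to @download_dir in setup in place of the Desktop path.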

Next we create a Firefox profile object and ply it with some configuration parameters:

  • browser.download.folderList tells Firefox where to place its downloads. A 2 tells it to use a custom download path, whereas a 1 is the browser's default path and a 0 is the Desktop.
  • browser.download.dir is where we set the custom download path.
  • browser.helperApps.neverAsk.saveToDisk tells Firefox when NOT to prompt us for a file download. It requires us to pass in a comma-separated string of the MIME types that we care about. You can find a full list of MIME types here. For this example we have PDF and JPG files to play with.
  • pdfjs.disabled is needed when downloading PDFs. It overrides the sensible default in Firefox that previews them in the browser.
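To make the four settings easier to inspect or reuse, here is one possible way to collect them into a plain Hash built from a list of MIME types. The helper name firefox_download_prefs and the '/tmp/downloads' path are assumptions made for this sketch.

```ruby
# Hypothetical helper: gathers the Firefox preferences described above
# into a Hash keyed by preference name.
def firefox_download_prefs(download_dir, mime_types)
  {
    # 2 = use the custom download path given in browser.download.dir
    'browser.download.folderList' => 2,
    'browser.download.dir' => download_dir,
    # comma-separated MIME types that should download without a prompt
    'browser.helperApps.neverAsk.saveToDisk' => mime_types.join(', '),
    # skip the in-browser PDF preview
    'pdfjs.disabled' => true
  }
end

prefs = firefox_download_prefs('/tmp/downloads', ['image/jpeg', 'application/pdf'])
prefs.each { |key, value| puts "#{key} = #{value.inspect}" }

# With a profile object like the one in setup, each pair could be applied as:
#   prefs.each { |key, value| profile[key] = value }
```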

After setting these bits we instantiate Selenium WebDriver and pass in the now configured profile object.

Our teardown and run methods are pretty straightforward. Teardown quits the Selenium object when done and cleans up the temp directory with a system call. And we wrap everything up neatly into a single run method that we can pass a block of commands to.

Now that we have set up Firefox appropriately, we move on to our Selenium and assertion actions.

run {
  @driver.get 'http://the-internet.herokuapp.com/download'
  download_link = @driver.find_element(css: 'a')
  download_link.click

  downloaded_files = Dir.glob("#{@download_dir}/**/*")
  sorted_downloads = downloaded_files.sort_by { |file| File.mtime(file) }
  File.size(sorted_downloads.last).should > 0
}

After loading the page we find the first link available and click it. Alternatively, we could have clicked the second link:

  @driver.find_elements(css: 'a')[1].click

...or we could have clicked all of the links.

  download_links = @driver.find_elements(css: 'a')
  download_links.each do |download_link|
    download_link.click
  end

All of these actions will trigger an automatic download to the folder specified in setup. So we then check to see that a file is there and not empty.

This is done by getting a list of the files stored in the directory and sorting them from oldest to newest by modification time. This ensures that the last item in the list is the newest one.

Once we have it, we check its file size to make sure that it is greater than 0.
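One caveat: the check can run before the browser has finished writing the file. As a rough sketch, assuming Firefox keeps in-progress downloads in files ending in .part, the assertion could poll for a finished file first. Both helper names (newest_download, wait_for_download) and the 10 second default are made up for this illustration.

```ruby
require 'timeout'

# Hypothetical helper: returns the most recently modified file in dir,
# ignoring directories and in-progress ".part" files. Returns nil if
# there is no such file.
def newest_download(dir)
  Dir.glob(File.join(dir, '**', '*'))
     .select { |path| File.file?(path) && File.extname(path) != '.part' }
     .max_by { |path| File.mtime(path) }
end

# Hypothetical helper: polls until a non-empty download shows up, or
# raises Timeout::Error once the given number of seconds has passed.
def wait_for_download(dir, seconds = 10)
  Timeout.timeout(seconds) do
    loop do
      file = newest_download(dir)
      return file if file && File.size(file) > 0
      sleep 0.2
    end
  end
end

# Possible usage inside the run block, after clicking the link:
#   File.size(wait_for_download(@download_dir)).should > 0
```

Polling with a timeout keeps the test from failing on a slow download while still failing if nothing ever arrives.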

Expected Behavior

  • Load the page
  • Find the first download link
  • Click it
  • Download the file to a specified location on disk without prompting
  • Check that the downloaded file is there and not empty


Hopefully this tip saves you some time and sets you on a path that doesn't involve heavy GUI manipulation with a 3rd party tool.

The same approach can be applied to some browsers (e.g. Chrome) with a slightly different setup configuration, but not others (e.g. Internet Explorer). To that end, we'll cover a browser agnostic approach in a future tip.
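For Chrome, a rough sketch of the equivalent settings might look like the following. The preference names are Chrome's own, but the helper name chrome_download_prefs and the '/tmp/downloads' path are assumptions, and how the Hash gets handed to the driver depends on your selenium-webdriver version (recent releases accept individual preferences through the Chrome options object's add_preference method), so treat this as a starting point rather than a drop-in replacement.

```ruby
# Hypothetical helper: Chrome preferences that roughly mirror the Firefox
# profile settings used in this tip.
def chrome_download_prefs(download_dir)
  {
    # custom download location
    'download.default_directory' => download_dir,
    # do not show the save dialog
    'download.prompt_for_download' => false,
    # download PDFs instead of previewing them in the browser
    'plugins.always_open_pdf_externally' => true
  }
end

chrome_download_prefs('/tmp/downloads').each do |name, value|
  puts "#{name} = #{value.inspect}"
end
```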

Until then, Happy Testing!
