When the customer says: “I need a Download Report button”
Table Of Contents
- A bad story
- The true story
- Ruby Environment
- Routes
- Index route
- Generate route
- Report download
- Authorization
- Final thoughts
A bad story
Sometimes the user needs to schedule a complex report generation and download it once ready.
Putting this into a “user story” fashion it will become:
as logged User i want to generate a monthly report and download it.
This simple scenario hides several problems:
- UX problem
- Request timeout
- Scheduling a long running task
- Persisting the generated content
- Making the content available
- Setup ACL on the content
- Disk space of the system
You can’t let a user click a button and let him waiting five minutes or more to get the generated file, because he will have a bad user experience and probably will see a timeout error instead. He will eventually start hating and cursing you..
The true story
We need to find another way to accomplish this task. We could try splitting it in separate phases.
Lets’ reformulate the previous “user story”
as logged User i want to provide a one-time only password and generate a monthly report.
I want to receive an email with detailed instruction explaining how to download the archive.
I want to visit the URL inside the email, insert the previous password and download the archive.
The archive must be deleted after two hours.
The new phases are:
- Report generation
- Email delivery
- Report Download
- Report cleanup
Now we need to have new “actors” on the main stage.
- A job to generate the report, we will call it
ReportExporterWorker
- A Mailer to send email with download instruction, we will call it
ReportMailer
- A controller used to check user ACL and serve the report, we will call it
ReportsController
- A job used to cleanup the report after two hours, we will call it
ReportExportCleaner
Now we can try addressing all the previous problems.
Ruby Environment
The application and gems used in this post are:
- Ruby on Rails 3.x
- Devise Gem
- Sidekiq Gem
Routes
We need to add some new routes inside route.rb
file.
scope 'reports' do
get '/downloads', controller: 'reports', action: 'index'
post '/generate', controller: 'reports', action: 'generate', as: :generate
get '/downloads/:id/decode', controller: 'reports', action: 'decode', as: :decode
post '/downloads', controller: 'reports', action: 'download', as: :download
end
Index route
The first route is used to display a form with the email field inside.
scope 'reports' do
get '/downloads', controller: 'reports', action: 'index'
end
This is the index
action inside the ReportsController
def index
@download = DownloadRequest.new
end
The form will show two fields. The email will be filled with the User’s email but will be editable to allow the use of a different one.
The password is used to make data unreadable by others.
Here the form’s part of the view:
= simple_form_for @download, url: generate_path, html: { autocomplete: 'off', role: 'presentation' } do |f|
.form-inputs{style: 'margin-top: 50px'}
= f.input :email, input_html: {value: current_user.email}
= f.input :password, type: :password, input_html: {autocomplete: 'off'}
%div.control-group
%div.controls
%p
%strong
please provide set a one-time only password used to encrypt sensitive data
.actions
- if request.xhr?
= f.button :wrapped, :cancel => "#"
- else
= f.button :wrapped, :value => 'Send email with archive link'
This is the DownloadRequest
object used inside the form and controller (@download). It’s PORO object plus the methods needed to be used inside a form and some validation rules for its fields.
class DownloadRequest
extend ActiveModel::Naming
include ActiveModel::Conversion
include ActiveModel::Validations
attr_accessor :email
attr_accessor :password
validates :email, presence: true, format: /\w+@\w+\.{1}[a-zA-Z]{2,}/
validates :password, presence: true, length: 8..120
def persisted?
false
end
def new_record?
true
end
end
Generate route
The second route is triggered with the click on submit button inside the previous form.
scope 'reports' do
post '/generate', controller: 'reports', action: 'generate', as: :generate
end
Generate action
The generate action inside the ReportsController
initialize a new DownloadRequest
object with the request’s parameters and perform the validation
on email and password field.
def generate
@download = DownloadRequest.new
@download.email = params[:download_request][:email]
@download.password = params[:download_request][:password]
...
If everything is ok a new ReportExporterWorker
job will be scheduled.
...
if @download.valid?
ReportExporterWorker.perform_async(@download.email, @download.password)
...
This is the generate
action inside the ReportsController
ReportExporterWorker
def generate
@download = DownloadRequest.new
@download.email = params[:download_request][:email]
@download.password = params[:download_request][:password]
if @download.valid?
ReportExporterWorker.perform_async(@download.email, @download.password)
redirect_to root_path, notice: 'Check your email.'
else
flash[:error] = 'Email is invalid'
render :index
end
end
Generate the report
The ReportExporterWorker
has several steps into his perform
method. Let’s dive into it.
Scheduling the job
First we can create a Sidekiq job to handle the report creation. This job create a new file under "#{Rails.root}/tmp/reports"
folder.
The perform
method will invoke the LastMonthReportGenerator
, a service used to generate the report.
now_formatted = Time.now.strftime('%Y%m%d%H%M')
report_file_name = "#{now_formatted}_report.csv"
report_file = File.join(Rails.root, 'tmp', 'reports', report_file_name)
report_result = LastMonthReportGenerator.new(report_file_name).generate
When the file is created we compress it to lower the size.
if report_result
compressed_file_path = File.join(Rails.root, 'tmp', 'reports', "#{report_file_name}.gz")
Zlib::GzipWriter.open(compressed_file_path) do |gz|
gz.orig_name = "#{report_file_name}.gz"
gz.mtime = File.mtime(report_file)
gz.write IO.binread(report_file)
gz.close
end
Sending the content
Now we have a compressed file.
We setup the data that will be crypted.
payload = PayloadBuilder.compose_data(compressed_file_path)
We crypt the expiration and the file path with DownloadEncrypter
.
archive_path_for_mail = DownloadEncrypter.encrypt(payload, password)
The encoded path is passed to ReportMailer
mailer. The email contains the URL needed to download the report.
ReportMailer.notification(archive_path_for_mail, email).deliver!
The Mailer is pretty simple. It pick the path and email from the argument and deliver the email.
Full ReportMailer
source code:
class ReportMailer < ActionMailer::Base
default from: 'admin@evilcorp.com'
def notification(file_path, email)
@path = report_download_url(file_path)
mail(subject: "Report Generated at: #{Time.now}", :to => email)
end
end
and the email template
<H3>Here you will find your Report</H3>
click <%= link_to 'here', @path %> to Download the archive.
<p>Please note the that the link will be accessible only for 2 hours. After that period the file will be removed.</p>
<p>Kind Regards</p>
<p>Evil Corp<br/>
Cleanup our system
After the email delivery we’ll schedule another job named ReportExportCleaner
. It’s responsible for the report file deletion from the system.
ReportExportCleaner.perform_at(2.hours.from_now, File.basename(compressed_file_path, '.gz'))
We also delete the uncompressed report file from system.
File.delete(report_file)
Full ReportExporterWorker
source code:
require 'yaml'
require 'zlib'
class ReportExporterWorker
include Sidekiq::Worker
sidekiq_options backtrace: true, queue: :reports, unique: :until_executed
def perform(email, password)
now_formatted = Time.now.strftime('%Y%m%d%H%M')
report_file_name = "#{now_formatted}_report.csv"
report_file = File.join(Rails.root, 'tmp', 'reports', report_file_name)
report_result = LastMonthReportGenerator.new(report_file_name).generate
if report_result
compressed_file_path = File.join(Rails.root, 'tmp', 'reports', "#{report_file_name}.gz")
Zlib::GzipWriter.open(compressed_file_path) do |gz|
gz.orig_name = "#{report_file_name}.gz"
gz.mtime = File.mtime(report_file)
gz.write IO.binread(report_file)
gz.close
end
payload = PayloadBuilder.compose_data(compressed_file_path)
archive_path_for_mail = DownloadEncrypter.encrypt(payload, password)
ReportMailer.notification(archive_path_for_mail, email).deliver!
ReportExportCleaner.perform_at(2.hours.from_now, File.basename(compressed_file_path, '.csv.gz'))
File.delete(report_file)
end
end
end
Avoid leaking sensitive data
In this scenario we don’t use any external storage service, we just save the report in a temporary folder.
We don’t persist any report information into our database, so we need a way to pass the archive’s path between each step.
PayloadBuilder
will help us to create a data with the following information:
- report expiration
- report file path
The expiration will be within two hours after the report creation.
The file path is taken from the argument.
The extract_expiration
the extract_path
methods do the opposite.
Full PayloadBuilder
source code:
require 'base64'
class PayloadBuilder
TIME_FORMAT = "%Y-%m-%d-%R%z"
def self.compose_data(data)
expiration = 2.hours.from_now.utc.strftime(TIME_FORMAT)
payload = expiration + '|' + data
Base64.urlsafe_encode64(payload)
end
def self.extract_expiration(data)
raise ArgumentError if data.nil? || data.index('|').nil?
separator_index = data.index('|')
expiration_raw = data[0..separator_index-1]
expiration = expiration_raw[0..expiration_raw.size]
Time.strptime(expiration, TIME_FORMAT)
end
def self.extract_path(data)
raise ArgumentError if data.nil? || data.index('|').nil?
separator_index = data.index('|')
data[separator_index+1..data.size]
end
end
DownloadEncrypter
will encrypt/decrypt our data:
It uses AES 256 GCM to make our data safe.
Here you can find some details about it:
- “aes256 gcm can someone explain how to use it securely ruby” on crypto.stackexchange.com
- Galois Counter Mode on Wikipedia
The encrypt method will return the combination of IV + separator + our_secret_data + separator + tag in Base64
The decrypt method perform decode the Base64, then split the result by the separator and collect each needed piece of information to decrypt our data.
Full DownloadEncrypter
source code:
class DownloadEncrypter
def self.bin2hex(str)
str.unpack('C*').map {|b| "%02X" % b}.join('')
end
def self.hex2bin(str)
[str].pack "H*"
end
def self.encrypt(payload, password)
cipher = OpenSSL::Cipher::Cipher.new('aes-256-gcm')
cipher.encrypt
salt = hex2bin('SOME VERY VERY LONG string Used As salt to be safe. ')
key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(password, salt, 20000, cipher.key_len)
cipher.key = key
iv = cipher.random_iv
cipher.iv = iv
cipher.auth_data = ''
encrypted_binary = cipher.update(payload) + cipher.final
tag = cipher.auth_tag
secret = Base64.urlsafe_encode64(bin2hex(iv) + bin2hex('$$$$$') + bin2hex(encrypted_binary) + bin2hex('$$$$$') + bin2hex(tag))
secret
end
def self.decrypt(encrypted_payload, password)
raw_data_array = Base64.urlsafe_decode64(encrypted_payload)
raw_data = raw_data_array.split(bin2hex('$$$$$'))
iv = hex2bin(raw_data[0])
data = hex2bin(raw_data[1])
tag = hex2bin(raw_data[2])
salt = hex2bin('SOME VERY VERY LONG string Used As salt to be safe. ')
cipher = OpenSSL::Cipher::Cipher.new('aes-256-gcm')
cipher.decrypt
key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(password, salt, 20000, cipher.key_len)
cipher.key = key
cipher.iv = iv
cipher.auth_tag = tag
cipher.auth_data = ''
plaintext = cipher.update(data) + cipher.final
plaintext
end
end
Report cleanup
ReportExportCleaner
will delete the report from our system. The perform
method receive a file_name
as argument.
Sanitize the argument
This job could be exploited by malicious users trying passing path argument like '../../..some_file_name'
and deceiving the job and deleting file into our system so we had to find a way to sanitize the argument.
First step
The first step is checking if the file_name is valid.
unless is_valid?(file_name)
message = "wrong file_name argument: #{file_name}"
logger.error(message)
return
end
This is accomplished by sanitize_file_name
method called inside is_valid?
def is_valid?(file_name)
sanitize_file_name(file_name.dup) == file_name
end
The sanitize_file_name
method was taken from this blog post
of Gavin Miller (@gavin_miller).
I decide to use both whitelist and blacklist approach. To do this, I only use the basename of file without the extension part as argument (the . character is not allowed in the whitelist).
def sanitize_file_name(file_name)
# WHITELIST APPROACH
# Remove any character that aren't 0-9, A-Z, or a-z
file_name.gsub!(/[^0-9A-Z]/i, '_')
# BLACKLIST APPROACH
# Bad as defined by wikipedia: https://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
# Also have to escape the backslash
bad_chars = ['/', '\\', '?', '%', '*', ':', '|', '"', '<', '>', '.', ' ']
bad_chars.each do |bad_char|
file_name.gsub!(bad_char, '_')
end
file_name
end
Second step
The second step is checking if a file "#{Rails.root}/tmp/reports/#{file_name}.gz"
exists (where file_name is the argument of perform).
name_complete = File.join(Rails.root, 'tmp', 'reports', file_name + '.gz')
unless File.exist?(name_complete)
message = "unable to found a valid file: #{name_complete}"
raise ArgumentError.new(message)
end
If all the steps are ok we can safely delete the file.
File.delete(name_complete)
Full ReportExportCleaner
source code:
class ReportExportCleaner
include Sidekiq::Worker
sidekiq_options backtrace: true, queue: :reports, unique: :until_executed
def perform(file_name)
unless is_valid?(file_name)
message = "wrong file_name argument: #{file_name}"
logger.error(message)
return
end
name_complete = File.join(Rails.root, 'tmp', 'reports', file_name + '.gz')
unless File.exist?(name_complete)
message = "unable to found a valid file: #{name_complete}"
raise ArgumentError.new(message)
end
File.delete(name_complete)
end
private
def is_valid?(file_name)
sanitize_file_name(file_name.dup) == file_name
end
def sanitize_file_name(file_name)
# WHITELIST APPROACH
# Remove any character that aren't 0-9, A-Z, or a-z
file_name.gsub!(/[^0-9A-Z]/i, '_')
# BLACKLIST APPROACH
# Bad as defined by wikipedia: https://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
# Also have to escape the backslash
bad_chars = ['/', '\\', '?', '%', '*', ':', '|', '"', '<', '>', '.', ' ']
bad_chars.each do |bad_char|
file_name.gsub!(bad_char, '_')
end
file_name
end
def logger
@logger ||= begin
log = File.open(File.join(Rails.root, 'log', 'malicius_calls.log'), "a")
log.sync = true
log
end
end
Report download
The third route is triggered when the User click inside the report link inside the email.
scope 'reports' do
get '/downloads/:id/decode', controller: 'reports', action: 'decode', as: :decode
end
Decode action
The action reads the id
parameter and shows a view with a form containing a password field and an hidden field with the id
inside.
def decode
@download = Download.new
@download.id = params[:id]
end
The view lets the User input the previous password and click download to invoke the download action.
This is the view:
= simple_form_for @download, url: download_path, html: { autocomplete: 'off', role: 'presentation' } do |f|
.form-inputs{style: 'margin-top: 50px'}
%div.control-group
%div.controls
%p
%strong
Insert the archive password
= f.input :password, input_html: {autocomplete: 'off', type: 'password'}
= f.input :id, type: :hidden, input_html: {type: :hidden, autocomplete: 'off'}, label_html: {style: 'display: none'}
.actions
= f.button :wrapped, :value => 'Download'
Download action
The last route is called after the User has submitted the form after filling it with the previous password.
scope 'reports' do
post '/downloads', controller: 'reports', action: 'download', as: :download
end
The action reads id
and password
parameter from the request and tries to decrypt it using decrypt
method of DownloadEncrypter
.
After it will decode the base64 unencrypted data.
def download
@download = Download.new
@download.id = params[:download][:id]
@download.password = params[:download][:password]
begin
base64_data = DownloadEncrypter.decrypt(@download.id, @download.password)
data = Base64.urlsafe_decode64(base64_data)
rescue OpenSSL::Cipher::CipherError
flash[:error] = 'password is wrong'
render :decode
return
end
...
After it retrieves the expiration with extract_expiration
method of PayloadBuilder
and check for invalidation
parsed_expiration = PayloadBuilder.extract_expiration(data)
if Time.now > parsed_expiration
redirect_to root_path, status: :gone
return
end
The last part retrieves the report path using extract_path
of PayloadBuilder
and use send_file
to start the download.
file_path = PayloadBuilder.extract_path(data)
send_file file_path
This is the download
action inside the ReportsController
def download
@download = Download.new
@download.id = params[:download][:id]
@download.password = params[:download][:password]
begin
begin
base64_data = DownloadEncrypter.decrypt(@download.id, @download.password)
data = Base64.urlsafe_decode64(base64_data)
rescue OpenSSL::Cipher::CipherError
flash[:error] = 'password is wrong'
render :decode
return
end
parsed_expiration = PayloadBuilder.extract_expiration(data)
if Time.now > parsed_expiration
redirect_to root_path, status: :gone
return
end
file_path = PayloadBuilder.extract_path(data)
send_file file_path
rescue ArgumentError
redirect_to root_path, status: :unprocessable_entity, alert: 'something went wrong'
return
rescue ActionController::MissingFile
redirect_to root_path, status: :not_found, alert: 'archive not found'
return
end
end
This is the Download
object used inside the form and ReportsController
’s download
method.
It’s PORO object plus the methods needed to be used inside a form and some validation rules for his fields.
class Download
extend ActiveModel::Naming
include ActiveModel::Conversion
include ActiveModel::Validations
attr_accessor :password
attr_accessor :id
validates :password, presence: true
def persisted?
false
end
def new_record?
true
end
end
Authorization
We use Devise’s directive to check authorizations on each action:
before_filter :authenticate_user!
Full ReportsController
source code:
class ReportsController < ApplicationController
before_filter :authenticate_user!
def index
@download = DownloadRequest.new
end
def generate
@download = DownloadRequest.new
@download.email = params[:download_request][:email]
@download.password = params[:download_request][:password]
if @download.valid?
ReportExporterWorker.perform_async(@download.email, @download.password)
redirect_to root_path, notice: 'Check your email.'
else
render :index
end
end
def decode
@download = Download.new
@download.id = params[:id]
end
def download
@download = Download.new
@download.id = params[:download][:id]
@download.password = params[:download][:password]
begin
begin
base64_data = DownloadEncrypter.decrypt(@download.id, @download.password)
data = Base64.urlsafe_decode64(base64_data)
rescue OpenSSL::Cipher::CipherError
flash[:error] = 'password is wrong'
render :decode
return
end
parsed_expiration = PayloadBuilder.extract_expiration(data)
if Time.now > parsed_expiration
redirect_to root_path, status: :gone
return
end
file_path = PayloadBuilder.extract_path(data)
send_file file_path
rescue ArgumentError
redirect_to root_path, status: :unprocessable_entity, alert: 'something went wrong'
return
rescue ActionController::MissingFile
redirect_to root_path, status: :not_found, alert: 'archive not found'
return
end
end
end
Final thoughts
There is still space for improvements.
My solution is far from being the best way to address this task, but I hope it is a starting point to help you tackling this problem.
There are several topics that i’d like to improve which may be subject of next blog posts.
Strong Parameters
You should use strong parameters to validate the params inside each controller action. This example is based on an old application that needs to be updated.
If you have a similar scenario this answer could be a starting point.
Filename sanitization
I’m not sure about this solution. I fear there are additional ways to circumvent the checks done.
Initialization vector into our data
I had some doubts about putting components such as the initialization vector (IV) inside our output but, according to this answer, it should be legit.
Waiting for your feedbacks
Thanks for reading up here.
I’d like to hear some suggestions from you about my solution.
Leave A Comment