Wednesday, 29 May 2019

Fixing dataset download problems for uWSGI+nginx Galaxy configuration

We recently experienced problems downloading datasets via a web browser from one of our local Galaxy instances, which runs release 18.09 and uses a uWSGI+nginx configuration.

While small files (e.g. of the order of MB) downloaded without problems, larger files (e.g. of the order of GB) would fail, with a dialog box appearing in the user's web browser complaining that "the source file can't be read". (The Galaxy logs also reported an IOError from the uwsgi_response_write_body_do() function.)

The initial problem seemed to be with the temporary directory being used for managing the download on the server. Explicitly setting uwsgi_temp_path in the nginx configuration seemed to help, for example:

uwsgi_temp_path /tmp/uwsgi;

This got rid of the dialog box, but the larger downloads still failed before completing. Although the user's browser didn't give any more information, the Galaxy logs now reported a timeout error. To address this we explicitly set the uwsgi timeout limits for reading and sending in the nginx configuration, e.g.:

uwsgi_read_timeout 600s;
uwsgi_send_timeout 600s;

The choice of 600s (10 minutes) was arbitrary but seemed long enough to allow the downloads to complete.

Finally, as the temporary area on the server is quite small, we also explicitly set the maximum size of temporary files to 1 MB:

uwsgi_max_temp_file_size 1024k;

Together these addressed the download problem in our local instance.
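
For reference, the settings can be combined in a single location block, as in the minimal sketch below. The uwsgi_pass address and the use of "location /" are assumptions for illustration only and should be replaced with whatever your own Galaxy instance uses. Note that the nginx uwsgi module names its directive for the sending side uwsgi_send_timeout.

```nginx
location / {
    # Assumed socket address for the Galaxy uWSGI server; adjust to your setup
    uwsgi_pass 127.0.0.1:4001;
    include uwsgi_params;

    # Directory where nginx stores temporary files for uwsgi responses
    uwsgi_temp_path /tmp/uwsgi;

    # Allow up to 10 minutes between successive reads from, or writes to, uWSGI
    uwsgi_read_timeout 600s;
    uwsgi_send_timeout 600s;

    # Cap each temporary file at 1 MB, as the temporary area is small
    uwsgi_max_temp_file_size 1024k;
}
```

After editing, the configuration can be checked with "nginx -t" before reloading nginx.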