Thursday, 12 February 2015

FTP upload to Galaxy using ProFTPd and PBKDF2

Recently I've been looking at enabling FTP upload for a local instance of Galaxy (based on the latest_2014.08.11 version of galaxy-dist).

The Galaxy documentation for integrating an FTP upload server can be found at https://wiki.galaxyproject.org/Admin/Config/UploadviaFTP, and I've been working with their recommended choice of ProFTPd. Overall the instructions are pretty straightforward, but I've encountered a few issues, mostly to do with Galaxy's change from SHA1 to PBKDF2 as its default choice of password authentication. This post details how I handled these to get the upload working.

Note that I'm assuming that Galaxy is using Postgres as its database engine.

1. Get ProFTPd

ProFTPd is simple to install on Scientific Linux 6 via yum:
  • yum install proftpd proftpd-postgresql
with proftpd-postgresql providing the extensions required by ProFTPd to access PostgreSQL databases.

If you need to build ProFTPd manually from source (for example because the default version doesn't have features that you need, such as handling PBKDF2 passwords - see below) then download the code from the ProFTPd website and run, for example:

# yum install postgresql-devel openssl-devel
# tar zvxf proftpd-1.3.5.tar.gz
# cd proftpd-1.3.5
# ./configure --prefix=/opt/apps/proftpd/1.3.5 --disable-auth-file --disable-ncurses --disable-ident --disable-shadow --enable-openssl --with-modules=mod_sql:mod_sql_postgres:mod_sql_passwd
# make && make install

Note that the final step must be performed with superuser privileges.

2. Check how your Galaxy installation handles password encryption

Galaxy appears to support two schemes for storing passwords: older versions hash them with SHA1, whereas newer versions use a more sophisticated key-derivation function called PBKDF2.

If you're using SHA1 then configuring ProFTPd is pretty straightforward, and the instructions on the Galaxy wiki should work out of the box. If you're using PBKDF2 then the configuration is a little more involved.

You can configure Galaxy to explicitly revert to SHA1 by setting use_pbkdf2 = False in the configuration files, in the [app:main] section; however by default PBKDF2 is used, and this blog post assumes that this is the case.
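As a quick way of checking which scheme an instance is configured for, the setting can be read with a few lines of Python. This is only a minimal sketch: the function name and the sample configuration text are made up for illustration, and it assumes (as described above) that the option lives in the [app:main] section and that a missing option means PBKDF2 is in use.

```python
import configparser


def uses_pbkdf2(config_text):
    """Return True if the config text implies PBKDF2 password hashing."""
    # interpolation=None so that any '%' characters in values are taken literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Assumption taken from the post: PBKDF2 is the default when the option is absent
    return parser.getboolean("app:main", "use_pbkdf2", fallback=True)


# Hypothetical config fragment, for illustration only
sample = """
[app:main]
use_pbkdf2 = False
"""
print(uses_pbkdf2(sample))
```

To check a real installation, pass in the contents of the Galaxy config file instead of the sample text.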

3. (Optionally) Set up a database user specifically for FTP authentication

This is not critical but is recommended. When users try to upload files to the FTP server they will log in using their Galaxy username and password. In order to enable this ProFTPd needs to be able to query Galaxy's database to check these credentials, and doing it via a database user with limited privileges (essentially only SELECT on the galaxy_user table) is more secure than via the one that Galaxy itself uses.

For PostgreSQL the instructions given on the Galaxy wiki are fine.

4. Create an area where ProFTPd will put uploaded files (and point Galaxy to it)

This should be a directory on the system which is readable by the Galaxy user. The ftp_upload_dir parameter in the Galaxy config file should be set to point to this location.

(It appears that you also need to set a value for ftp_upload_site in order for the uploaded files to be presented to the user when they go to "Upload Files".)

5. Configure ProFTPd

ProFTPd's default configuration file is located at /etc/proftpd.conf (if using the default system installation), or otherwise in the etc subdirectory of the location where you installed ProFTPd if you built your own.

5.1 Configuring ProFTPd to use SHA1 password authentication

The Galaxy documentation gives an example ProFTPd config file that should work for the old SHA1 password encryption. I don't cover using SHA1 any further in this post.

5.2 Configuring ProFTPd to use PBKDF2 password authentication

As this is not documented on the Galaxy wiki, I used a sample ProFTPd configuration posted by Ricardo Perez in this thread from the Galaxy Developers mailing list as a starting point: http://dev.list.galaxyproject.org/ProFTPD-integration-with-Galaxy-td4660295.html - his example was invaluable to me for getting this working.

Here's a version of the ProFTPd conf file that I created to enable PBKDF2 authentication:


Note that the SQLPasswordPBKDF2 directive is not available in ProFTPd before version 1.3.5rc3, so check which version you're using.

(It should also be possible to configure ProFTPd to use both SHA1 and PBKDF2 authentication, and there are hints on how to do this in Ricardo's message linked above. However I haven't tried implementing it yet.)

6. Test your ProFTPd settings

ProFTPd can be run as a system service but during initial setup and debugging I found it useful to run directly from a console. In particular:
  • proftpd --config /path/to/conf_file -t performs basic checks on the conf file and warns if there are any syntax errors or other problems
  • proftpd --config /path/to/conf_file -n starts the server in "no daemon" mode
  • proftpd --config /path/to/conf_file -n -d 10 runs in debugging mode with maximal output, which is useful for diagnosing problems with the authentication.
On our system I also needed to update the Shorewall firewall settings to allow external access to port 21 (the default port used by FTP services), by editing the /etc/shorewall/rules file.

You can then test by ftp'ing to the server and checking that you can log in using your Galaxy credentials and upload a file, that the file appears in the correct place on the filesystem with the correct ownership and permissions (it should be readable and writeable by the user running the Galaxy process), and that Galaxy's upload tool presents it as an option.
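The ownership and permissions part of this check can be scripted. The following is a rough sketch, not part of Galaxy or ProFTPd: the function name is hypothetical, and it needs to be run as the user that runs the Galaxy process for the access results to be meaningful.

```python
import os
import stat


def describe_upload(path):
    """Summarise ownership and access for an uploaded file."""
    info = os.stat(path)
    return {
        "uid": info.st_uid,
        "gid": info.st_gid,
        # Human-readable mode string in the style of 'ls -l'
        "mode": stat.filemode(info.st_mode),
        # os.access tests against the real UID/GID of the calling process
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
```

Calling describe_upload() on a freshly uploaded file and comparing the "uid" and "gid" values with the output of the "id" command for the Galaxy user shows whether the file has ended up with the expected owner.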

If any of these steps fail then running ProFTPd with the debugging option can be really helpful in understanding what's happening behind the scenes.

One other gotcha is that if the Galaxy user UID or GID is less than 999, then you will need to set SQLMinID (or similar) in the ProFTPd conf file to a suitable value, otherwise the uploaded files will not be assigned to the correct user (you can get the UID/GID using the "id" command).
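A small script can flag whether this applies to a given account. This is a sketch under the assumption stated above, namely that 999 is the threshold in question; the function names are hypothetical.

```python
import pwd

# Threshold quoted in the post; adjust if your ProFTPd settings differ
SQL_MIN_ID_THRESHOLD = 999


def ids_below_threshold(uid, gid, threshold=SQL_MIN_ID_THRESHOLD):
    """Return which of 'uid' and 'gid' fall below the threshold."""
    low = []
    if uid < threshold:
        low.append("uid")
    if gid < threshold:
        low.append("gid")
    return low


def check_account(username, threshold=SQL_MIN_ID_THRESHOLD):
    """Look up a local account and report any IDs below the threshold."""
    entry = pwd.getpwnam(username)  # raises KeyError if there is no such account
    return ids_below_threshold(entry.pw_uid, entry.pw_gid, threshold)
```

A non-empty result from check_account() for the Galaxy user suggests that SQLMinID (or similar) needs to be set in the ProFTPd conf file.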

7. Make ProFTPd run as a service

If everything appears to be working then you can set up ProFTPd to run as a system service. If you're using the system-installed version then there should already be an /etc/init.d/proftpd file to allow you to do

service proftpd start

Otherwise you will need to make your own init.d script for ProFTPd - I used the one in the documentation at http://www.proftpd.org/docs/howto/Stopping.html as a starting point, put it into /etc/init.d/ and edited the FTPD_BIN and FTPD_CONF variables to point to the appropriate files for my installation.

Once this is done you should have FTP uploads working with Galaxy using PBKDF2 password authentication.

Updates: fixed typos in name of "PBKDF2" and clarified that SHA1 is not used (27/02/2015).

14 comments:

  1. Hello blogger,

    Thank you for this detailed procedure on FTP-based file transfer in Galaxy.
    Since I am new to SQL/database handling, I'm wondering whether the term 'dbpassword' in the following statement is a literal or whether it should be replaced with a real password.

    galaxydb=# ALTER ROLE galaxyftp PASSWORD 'dbpassword';
    ALTER ROLE

    Also, I am getting an error while running the following command

    galaxydb=# GRANT SELECT ON galaxy_user TO galaxyftp;

    ERROR: relation "galaxy_user" does not exist

    I appreciate any help in sorting out this problem...

    Thank you!

    Replies
    1. Hello Jana

      Sorry you're having problems, to try and answer your questions:

      1. Yes, the real password should be substituted for 'dbpassword' in the examples

      2. I haven't encountered the error about "galaxy_user" not existing before. This is a table in the Galaxy database that should be created when Galaxy is installed and run for the first time.

      Assuming you're following the instructions in the Galaxy documentation https://wiki.galaxyproject.org/Admin/Config/UploadviaFTP it suggests to me that there is a problem with the setup of the Postgres database.

      Searching around I've found this forum thread which might help:
      http://dev.list.galaxyproject.org/Database-and-FTP-setup-tp4663433p4663434.html

      If you're still having problems then my suggestion would be to post the details of your problem and your Galaxy setup to the Galaxy developer list (see http://dev.list.galaxyproject.org/ for more details).

      Sorry not to be able to help more with this one - good luck!

  2. Hi pjb

    Great post, very helpful thanx a million.

    Following this, setting everything up using ProFTPd and MySQL, everything works perfectly fine except that the FTP upload option is not presented to me in the Galaxy interface. The file is transferred to the correct directory, permissions are correct etc. I just don't have the FTP upload option...

    Any tips?
    gordon

    Replies
    1. Hello Gordon

      Thanks for your comment - the only thing I can think of is, did you also set the "ftp_upload_dir" and "ftp_upload_site" parameters in your Galaxy config file (i.e. galaxy.ini, or universe_wsgi.ini if using an older version of the Galaxy codebase)?

      I think if either are not set then you won't be able to access the uploaded files from Galaxy.

      Hope this helps!

    2. It seems the error crept in from legacy job runner settings. The old [galaxy:tool_runners] format for setting them was being read incorrectly which caused the FTP settings to be read incorrectly as they are sequential in the galaxy.ini file.

      Anyhow, it's working now. Thanx again
      g

    3. Great - glad you managed to work it out.

  3. Hi Peter, I needed to change "password from 38 for 69" to "password from 38 for 32" in LookupGalaxyUser. Now it all works! Thanks for the help!

    Regards.

    Replies
    1. Hello Rodrigo

      Thanks for your comment and the additional information.

      In fact it looks as if the values you use should be generally correct, as the hashed password part of the 'password' database field does seem to be 32 characters long (not 69 as implied by my example).

      As the value of 69 works for us, I wonder if different versions of PostgreSQL handle the 'substring' function differently - with some returning an error if it tries to get more characters than the string contains. We have version 8.4.18.

      (For anyone else interested in what's happening here, the 'password' field in the database is constructed by the 'hash_password_PBKDF2' function in .../lib/galaxy/security/passwords.py - which stores various pieces of information - separated by dollar signs - in a single string which looks like e.g.:

      PBKDF2$sha256$10000$DRQfphKq7tL/TLFa$uZ31gvmzOP2Jqy00Wbbg4o0Cb2PG/ZTr

      So ProFTPd needs to cut this string up using the Postgres 'substring' function in order to get the actual hashed password, which in this example starts at position 38 and is 32 characters long.)
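The position and length can be worked out from any stored value with a few lines of Python. This is just an illustrative sketch based on the dollar-separated layout described above; the function name is hypothetical and it is not how Galaxy or ProFTPd do it internally.

```python
def hash_field_position(stored):
    """Return (start, length) of the hashed part, with start counted from 1."""
    # Layout described above: scheme$digest$iterations$salt$hash
    scheme, digest, iterations, salt, hashed = stored.split("$")
    # SQL 'substring' positions start at 1, hence the +1
    start = len(stored) - len(hashed) + 1
    return start, len(hashed)


example = "PBKDF2$sha256$10000$DRQfphKq7tL/TLFa$uZ31gvmzOP2Jqy00Wbbg4o0Cb2PG/ZTr"
start, length = hash_field_position(example)
print(f"substring(password from {start} for {length})")
```

For the example string this gives a start of 38 and a length of 32, matching the values quoted above; the whole example string is 69 characters long.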

      Thanks again!

  4. Hi,

    I am new to Linux and just started to establish a local instance of galaxy.
    When I am trying to run this command

    sudo yum install proftpd proftpd-postgresql

    I get the following error.

    There are no enabled repos.
    Run "yum repolist all" to see the repos you have.
    You can enable repos with yum-config-manager --enable

    I don't know how to setup a repo for my case.

    Kindly help
    Thank you
    Mohan

    Replies
    1. Hello Mohan

      Sorry you're having problems setting up a Galaxy instance. I'm not sure I can help with this specific problem however, as it sounds like a more general issue with the configuration of your server.

      You could try googling the error message that you get from the "sudo yum install ..." command and see if anything comes up.

      Otherwise I suggest you try contacting the Galaxy developers mailing list to see if someone there can help: https://lists.galaxyproject.org/listinfo/galaxy-dev/

      Good luck!

  5. Hi,
    I'm working on uploading big files via FTP. I have to install FTP at my boss's request, although I am just an intern. I hope I can get your help, please.
    Here is the thing: I installed ProFTPd with the command `yum install proftpd proftpd-postgresql`. There was no error during the installation process, and the ProFTPd version is 1.3.5e.
    According to the documentation, I need to complete the part titled `Allow your FTP server to read Galaxy's database`. I think the reason for this part is to ensure that FTP can use the username and password of the Galaxy user. However, I just can't understand the documentation. It's too abstract for me...
    Regards,
    hangz

    Replies
    1. Hello hangz

      I'm not sure how much help I can give you unless you're more specific about the parts which are failing.

      However it might be that you no longer need to enable FTP upload - if you're using Galaxy release 18.05 or newer then this supports unlimited browser upload size without any configuration, which "... should effectively eliminate browser-based limitations on the size of files that can be uploaded to Galaxy".

      See https://docs.galaxyproject.org/en/release_18.05/releases/18.05_announce.html for more details.

      HTH!

  6. Hi, is it mandatory to be root to run proFTPD? I would like to install a local instance on a cluster but without root permissions. Do I need specific activation in the conf file?

    Thanks,
    Luc
    I have the following problem when i try proftpd:
    [galaxy@node01 ~]$ proftpd --config /media/vol2/home/galaxy/proftpd.conf -t
    Checking syntax of configuration file
    node01 - ROOT PRIVS: unable to seteuid(): Operation not permitted
    node01 - ROOT PRIVS: unable to setegid(): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to seteuid(PR_ROOT_UID): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to setegid(session.gid): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to seteuid(session.uid): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to seteuid(PR_ROOT_UID): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to setegid(session.gid): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to seteuid(session.uid): Operation not permitted
    Syntax check complete.
    node01 - RELINQUISH PRIVS: unable to seteuid(PR_ROOT_UID): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to setegid(session.gid): Operation not permitted
    node01 - RELINQUISH PRIVS: unable to seteuid(session.uid): Operation not permitted

    Replies
    1. Hello Luc

      It's been a while since I set up proFTPD and I'm not sure I can remember all the details, however I'll try and help.

      In the configuration that we're using, proFTPD is run as a system service on the server and that requires root permissions to set up. However within the service proFTPD itself is run as user 'nobody'.

      So that suggests to me that you should in principle be able to run proFTPD manually without root privileges.

      For your specific errors the only thing I can think of is that in your 'proftpd.conf' file you might be specifying 'User' and/or 'Group' to be different from the (non-root) user you're trying to run proFTPD as. If this is the case then maybe try removing those from the '.conf' file.

      Otherwise I'd suggest searching a bit on Google or ProFTPD forums to see if there are other suggestions.

      Good luck with resolving your issues!
