Installation

This section documents the process of deploying the GA4GH reference server in a production setting. The intended audience is therefore server administrators. If you are looking for a quick demo of the GA4GH API using a local installation of the reference server please check out the GA4GH API Demo. If you are looking for instructions to get a development system up and running, then please go to the Development section.

Deployment on Apache

To deploy on Apache on Debian/Ubuntu platforms, do the following.

First, install the basic prerequisite packages:

sudo apt-get install python-dev python-virtualenv zlib1g-dev libxslt1-dev libffi-dev libssl-dev libcurl4-openssl-dev

Install Apache and mod_wsgi, and enable mod_wsgi:

sudo apt-get install apache2 libapache2-mod-wsgi
sudo a2enmod wsgi

Create the Python egg cache directory, and make it writable by the www-data user:

sudo mkdir /var/cache/apache2/python-egg-cache
sudo chown www-data:www-data /var/cache/apache2/python-egg-cache/

Create a directory to hold the GA4GH server code, configuration and data. For convenience, we make this owned by the current user (but make sure all the files are world-readable):

sudo mkdir /srv/ga4gh
sudo chown $USER /srv/ga4gh
cd /srv/ga4gh

Make a virtualenv, and install the ga4gh-server package:

virtualenv ga4gh-server-env
source ga4gh-server-env/bin/activate
pip install ga4gh-server
deactivate

Download and unpack the example data:

wget https://github.com/ga4gh/ga4gh-server/releases/download/data/ga4gh-example-data_4.6.tar
tar -xf ga4gh-example-data_4.6.tar

Create the WSGI file at /srv/ga4gh/application.wsgi and write the following contents:

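# mod_wsgi looks for a callable named "application" in this file.
from ga4gh.server.frontend import app as application
import ga4gh.server.frontend as frontend
# Load the server settings from the configuration file.
frontend.configure("/srv/ga4gh/config.py")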

Create the configuration file at /srv/ga4gh/config.py, and write the following contents:

DATA_SOURCE = "/srv/ga4gh/ga4gh-example-data/registry.db"

Note that the user running the server, www-data, must have read and write access to the directories containing data files.
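
One way to confirm this access is a small check run as the server user. The following is a minimal sketch, not part of the ga4gh-server package: check_data_access is a hypothetical helper name, and the path in the example is the DATA_SOURCE value used above.

```python
import os


def check_data_access(path):
    """Return a list of problems that would stop the current user from
    reading and writing files in the directory that contains `path`."""
    problems = []
    directory = path if os.path.isdir(path) else os.path.dirname(path)
    if not os.path.isdir(directory):
        problems.append("missing directory: " + directory)
        return problems
    # os.access answers for the user running this script, so run it
    # as www-data to test the server's view of the directory.
    if not os.access(directory, os.R_OK | os.X_OK):
        problems.append("cannot read: " + directory)
    if not os.access(directory, os.W_OK):
        problems.append("cannot write: " + directory)
    return problems


if __name__ == "__main__":
    # Path taken from the DATA_SOURCE setting in config.py.
    for problem in check_data_access("/srv/ga4gh/ga4gh-example-data/registry.db"):
        print(problem)
```

Running the script with sudo -u www-data and getting no output suggests the directory permissions are sufficient.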

(Many more configuration options are available — see the Configuration section for a detailed discussion on the server configuration and input data.)

Configure Apache. These instructions are for Apache 2.4 or greater. Edit the file /etc/apache2/sites-available/000-default.conf and insert the following contents towards the end of the file (within the <VirtualHost *:80>...</VirtualHost> block):

WSGIDaemonProcess ga4gh \
    processes=10 threads=1 \
    python-path=/srv/ga4gh/ga4gh-server-env/lib/python2.7/site-packages \
    python-eggs=/var/cache/apache2/python-egg-cache
WSGIScriptAlias /ga4gh /srv/ga4gh/application.wsgi

<Directory /srv/ga4gh>
    WSGIProcessGroup ga4gh
    WSGIApplicationGroup %{GLOBAL}
    Require all granted
</Directory>

Warning

Be sure to keep the number of threads limited to 1 in the WSGIDaemonProcess setting. Performance tuning should be done using the processes setting.
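
Before restarting Apache, the directive can be checked against this rule. The sketch below assumes the directive text has already been copied out of the site configuration file; parse_daemon_process and threads_setting_ok are hypothetical helper names, and the parser only handles name=value options like those shown above.

```python
def parse_daemon_process(directive):
    """Split a WSGIDaemonProcess directive into (name, options)."""
    # Apache joins lines that end in a backslash, so do the same here.
    tokens = directive.replace("\\\n", " ").split()
    if len(tokens) < 2 or tokens[0] != "WSGIDaemonProcess":
        raise ValueError("not a WSGIDaemonProcess directive")
    options = dict(token.split("=", 1) for token in tokens[2:])
    return tokens[1], options


def threads_setting_ok(directive):
    """True only if the directive explicitly sets threads=1."""
    _, options = parse_daemon_process(directive)
    return options.get("threads") == "1"
```

A directive that omits the threads option is reported as not OK, because the check requires the value to be set explicitly.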

The instructions for configuring Apache 2.2 (on Ubuntu 14.04) are the same as above, with the following exceptions:

You need to edit /etc/apache2/sites-enabled/000-default

instead of /etc/apache2/sites-enabled/000-default.conf

And while in that file, you need to set permissions for the directory to

Allow from all

instead of

Require all granted

Now restart Apache:

sudo service apache2 restart

Now test that the server started properly by requesting the landing page:

curl http://localhost/ga4gh/ --silent | grep GA4GH
#         <title>GA4GH reference server 0.2.3.dev4+nge0b07f3</title>
#    <h2>GA4GH reference server 0.2.3.dev4+nge0b07f3</h2>
# Welcome to the GA4GH reference server landing page! This page describes
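
The same check can be scripted, for example as part of a deployment smoke test. This is a minimal Python 3 sketch, assuming the landing page title starts with "GA4GH reference server" as in the output above; extract_title, landing_page_ok and fetch are hypothetical helper names.

```python
import re
from urllib.request import urlopen


def extract_title(html):
    """Return the text of the first <title> element, or None."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def landing_page_ok(html):
    """True if the page title looks like the reference server landing page."""
    title = extract_title(html)
    return title is not None and title.startswith("GA4GH reference server")


def fetch(url, timeout=10):
    """Download a page and decode it as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling landing_page_ok(fetch("http://localhost/ga4gh/")) on the server host then returns True when the landing page is being served.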

We can also test the server by running some API commands. Please refer to the instructions in the GA4GH API Demo for how to access data made available by this server.

There are many ways to set up a WSGI application under Apache, and a different setup may be preferable for a given installation. (In particular, the Apache configuration here may be specific to Ubuntu 14.04, where this was tested.) See the mod_wsgi documentation for more details. These instructions are also specific to Debian/Ubuntu; other platforms will require different commands and directory structures.

The server can be deployed on any WSGI-compliant web server. See the instructions in the Flask documentation for more details on how to deploy on various other servers.

Troubleshooting

Server errors will be output to the web server’s error log by default (in Apache on Debian/Ubuntu, for example, this is /var/log/apache2/error.log). Each client request will be logged to the web server’s access log (in Apache on Debian/Ubuntu this is /var/log/apache2/access.log).
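
When diagnosing a failed request, the most recent log entries are usually the relevant ones. The sketch below reads the last few lines of a log file; tail is a hypothetical helper name, and on Debian/Ubuntu the Apache logs typically need elevated privileges to read.

```python
from collections import deque


def tail(path, count=20):
    """Return the last `count` lines of a text file, oldest first."""
    with open(path, errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, tail("/var/log/apache2/error.log", 50) returns the 50 most recent error log lines.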

For more server configuration options, see the Configuration section.

Deployment on Docker

It is also possible to deploy the server using Docker.

First, you need an environment running the Docker daemon. For non-production use, we recommend boot2docker. For production use, install Docker on a stable Linux distribution. Please refer to the platform-specific Docker installation instructions; the OS X and Windows instructions cover boot2docker.

Local Dataset Mounted as Volume

If you already have a dataset on your machine, you can download and deploy the Apache server in one command:

docker run -e GA4GH_DATA_SOURCE=/data -v /my/ga4gh_data/:/data:ro -d -p 8000:80 --name ga4gh_server ga4gh/ga4gh-server:latest

Replace /my/ga4gh_data/ with the path to your data.

This will:

  • pull the automatically built image from Docker Hub
  • start an Apache server running mod_wsgi on container port 80
  • mount your data read-only in the docker container
  • assign a name to the container
  • forward host port 8000 to container port 80.

For more information on docker run options, see the run reference.
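
If you start the container from a script, it can help to assemble the command as an argument list so that the data path and port are easy to change. The sketch below uses only the options from the command above; docker_run_args is a hypothetical helper name and its default values are copied from that command.

```python
def docker_run_args(data_dir, host_port=8000, name="ga4gh_server",
                    image="ga4gh/ga4gh-server:latest"):
    """Build the argument list for the docker run command shown above."""
    return [
        "docker", "run",
        "-e", "GA4GH_DATA_SOURCE=/data",        # data location inside the container
        "-v", "{}:/data:ro".format(data_dir),   # mount host data read-only
        "-d",                                   # run in the background
        "-p", "{}:80".format(host_port),        # host port -> container port 80
        "--name", name,
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(docker_run_args("/my/ga4gh_data/")))
```

The returned list can be passed to subprocess.check_call to start the container.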

Demo Dataset Inside Container

If you do not have a dataset yet, you can deploy a container which includes the demo data:

docker run -d -p 8000:80 --name ga4gh_demo ga4gh/ga4gh-server:latest

This is identical to the production container, except that a copy of the demo data is included and appropriate defaults are set.

Developing Client Code: Run a Client Container and a Server

In this example you run a server as a daemon in one container, and the client as an ephemeral instance in another container. From the client, the server is accessible at http://server/, and the /tmp/mydev directory is mounted at /app/mydev/. Any changes you make to scripts in mydev will be reflected on the host and container and persist even after the container dies.

# make a development dir and place the example client script in it
mkdir /tmp/mydev
curl https://raw.githubusercontent.com/ga4gh/ga4gh-server/master/scripts/demo_example.py > /tmp/mydev/demo_example.py
chmod +x /tmp/mydev/demo_example.py

# start the server daemon
# assumes the demo data on host at /my/ga4gh_data
docker run -e GA4GH_DEBUG=True -e GA4GH_DATA_SOURCE=/data -v /my/ga4gh_data/:/data:ro -d --name ga4gh_server ga4gh/ga4gh-server:latest

# start the client and drop into a bash shell, with mydev/ mounted read/write
# --link adds a host entry for server, and --rm destroys the container when you exit
docker run -e GA4GH_DEBUG=True -v /tmp/mydev/:/app/mydev:rw -it --name ga4gh_client --link ga4gh_server:server --entrypoint=/bin/bash --rm ga4gh/ga4gh-server:latest

# call the client code script
root@md5:/app# ./mydev/demo_example.py

# call the command line client
root@md5:/app# ga4gh_client variantsets-search http://server/current

# exit and destroy the client container
root@md5:/app# exit

Ports

The -p 8000:80 argument to docker run maps port 8000 in your host environment to port 80 in the docker container (the -d flag is what runs the container in the background). At that point you should be able to access it like a normal website, albeit on port 8000. When running in boot2docker, you also need to forward the port from the boot2docker VM to the host. To forward traffic from localhost:8000 to port 8000 on the VM on OS X, run the following from a terminal on the host:

VBoxManage controlvm boot2docker-vm natpf1 "ga4gh,tcp,127.0.0.1,8000,,8000"

For more info on port forwarding see the VirtualBox manual and this wiki article.
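
The natpf1 rule is a comma-separated string of the form name,protocol,host IP,host port,guest IP,guest port, with the guest IP left empty in the command above. As a minimal sketch, the rule and the full command can be built as follows; natpf_rule and vboxmanage_args are hypothetical helper names, and the defaults mirror the command above.

```python
def natpf_rule(name, host_port, guest_port, host_ip="127.0.0.1",
               guest_ip="", protocol="tcp"):
    """Build a VirtualBox NAT port-forwarding rule string."""
    fields = [name, protocol, host_ip, str(host_port), guest_ip, str(guest_port)]
    return ",".join(fields)


def vboxmanage_args(rule, vm="boot2docker-vm"):
    """Build the VBoxManage argument list that applies the rule to a running VM."""
    return ["VBoxManage", "controlvm", vm, "natpf1", rule]


if __name__ == "__main__":
    print(" ".join(vboxmanage_args(natpf_rule("ga4gh", 8000, 8000))))
```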

Advanced

If you want to build the images yourself, that is possible. The ga4gh/ga4gh-server repo builds automatically on new commits, so this is only needed if you want to modify the Dockerfiles, or build from a different source.

The prod and demo builds are based on mod_wsgi-docker, a project from the author of mod_wsgi. Please refer to the Dockerfiles and documentation for that project when developing these builds.

Examples

Build the code at server/ and run it for production, serving a dataset located on the local host at /my/dataset:

cd server/
docker build -t my-repo/my-image .
docker run -e GA4GH_DATA_SOURCE=/dataset -v /my/dataset:/dataset:ro -itd -p 8000:80 --name ga4gh_server my-repo/my-image

Build and run the production build from above, with the demo dataset in the container (you will need to modify the FROM line in /deploy/variants/demo/Dockerfile if you want to use your image from above as the base):

Troubleshooting Docker

DNS

The docker daemon’s DNS may be corrupted if you switch networks, especially if run in a VM. For boot2docker, running udhcpc on the VM usually fixes it. From a terminal on the host:

eval "$(boot2docker shellinit)"
boot2docker ssh
>     sudo udhcpc
(password is tcuser)

DEBUG

To enable DEBUG on your docker server, call docker run with -e GA4GH_DEBUG=True:

docker run -itd -p 8000:80 --name ga4gh_demo -e GA4GH_DEBUG=True ga4gh/ga4gh-server:latest

This sets the environment variable, which is read by config.py.

You can then get logs from the docker container by running docker logs (container), for example docker logs ga4gh_demo.
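
The config.py used by the image is not reproduced here. As an illustration of the general pattern only, a configuration file could turn such an environment variable into a boolean setting as sketched below; env_flag is a hypothetical helper name, and the set of accepted spellings is an assumption.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean flag."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value is None:
        return default
    # Assumed spellings for "enabled"; anything else counts as disabled.
    return value.strip().lower() in ("1", "true", "yes", "on")


# With -e GA4GH_DEBUG=True passed to docker run, this evaluates to True.
DEBUG = env_flag("GA4GH_DEBUG")
```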

Installing the development version on Mac OS X

Prerequisites

First install libraries and header code for Python 2.7. It will be a lot easier if you have Homebrew, the “missing package manager” for OS X, installed first. To install Homebrew, paste the following at a Terminal prompt ($):

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

If you don’t have Python 2.7 installed, use brew install to install Python. Then use pip, which comes with Python, to install virtualenv:

brew install python
pip install virtualenv

Install

If you haven’t already done so, set up GitHub to work from your command line. Then download the source code from GitHub to the project target folder, here assumed to be ~/ga4gh:

git clone https://github.com/ga4gh/ga4gh-server.git

Before installing Python library dependencies, create a virtualenv sandbox to isolate it from the rest of the system, and then activate it:

cd ga4gh-server
virtualenv ga4gh-env
source ga4gh-env/bin/activate

Install Python dependencies:

pip install -r dev-requirements.txt -c constraints.txt

Test and run

Run tests to verify the install:

ga4gh_run_tests

Please refer to the instructions in the GA4GH API Demo for how to access data made available by this server.