22: Deploying to a Public Web Server

Summary:

<h2>Goals</h2>

In this lesson, we finally deploy our application to a public web server. Please note that, while we’ve tried to make these notes complete, they aren’t the full tutorial; that’s in the screencast, which you can access via the button on the left. We recommend that you right-click on the button and choose Save As to download it to your computer. The code used in this lesson has all been checked in to the <a href="http://github.com/chaupt/learning-rails-sample-app/">Github repository</a>.

<h2>Rails deployment overview</h2>

Deploying a Rails application is more complex than deploying a simple web site or PHP application, for two reasons:

<ul>
<li>In general, a separate application server is used to keep the Rails application in memory and execute requests that come in through the HTTP server. This requires more configuration, and the application server must be restarted whenever the code changes.</li>
<li>Static and PHP sites are typically deployed by simply uploading files to the server, with FTP or SFTP. Although there is nothing about Rails that requires it, Rails applications are almost universally deployed by having the server check the code out from a version control system. (This is a better practice for all kinds of sites, but it is less often used with simpler technologies.)</li>
</ul>

The flow of a deployment is thus as follows:

<ol>
<li>Make changes to your code locally</li>
<li>Test the changes</li>
<li>Check the changes in to version control (and push to the main repository, if using Git)</li>
<li>Optionally, display a maintenance page to replace the site while it is being updated</li>
<li>Perform a checkout of the code from the version control system to the web server</li>
<li>Run any migrations that the new code requires</li>
<li>Restart the application server</li>
<li>Disable the maintenance page, if any</li>
</ol>

Although this may sound complicated, with the right tools it is quite simple once you’re all set up.
The key tool is a program called <a href="http://capify.org">Capistrano</a>, which can automate the entire process (starting with step 4). In the previous few lessons, you’ve been through the first three steps; by the end of this lesson, you’ll have been through the whole process.

<h2>Choosing a web and application server configuration</h2>

There’s a huge array of choices to be made when it comes to server software selection and configuration. We’ll describe one of many possible approaches in this lesson. To do your own deployments, you’ll almost surely want to explore other options and learn more about issues specific to your server. Here’s where to find some useful resources on our <a href="http://www.buildingwebapps.com">BuildingWebApps</a> site:

<ul>
<li><a href="/topic/24356-the-linux-open-source-operating-system-for">Linux</a></li>
<li><a href="/topic/24358-hosting-alternatives-for-ruby-on-rails">Rails Hosting</a></li>
<li><a href="/topic/24348-application-servers-for-ruby-on-rails">Rails Application Servers</a></li>
<li><a href="/topic/24383-http-servers-the-front-end-for">HTTP Servers</a></li>
<li><a href="/topic/24400-automation-for-web-application-development-and">Automation</a></li>
</ul>

We also highly recommend the book <a href="http://www.amazon.com/exec/obidos/ASIN/0978739205/buildicom-20">Deploying Rails Applications</a>.

For the past two years or so, the most widely used setup has been <a href="/topic/24384-the-apache-web-server-a-common">Apache 2.2</a> as the front-end server, which processes HTTP requests, serves static files (including JavaScript, CSS, and images, as well as any cached HTML pages), and passes (“proxies”) requests that must be handled by Rails to the application server(s). Many people have started using <a href="/topic/24385-the-nginx-lightweight-http-web-server">Nginx</a> instead of Apache as the front-end server.
Whether Apache or Nginx is the HTTP server, the most widely used application server is <a href="/topic/24349-the-mongrel-application-server-for-ruby">Mongrel</a>, which is installed as a gem. Mongrel handles only one request at a time, so most servers use two or more Mongrels (dozens for a system designed to handle high loads), and another gem called mongrel_cluster manages the Mongrels. Setting this up can be a little complex, and if you’re going to take this route, going with a hosting company that provides a “stock” configuration, and knows how to support it, is usually a good investment.

A relatively new alternative, <a href="http://modrails.org">Passenger</a> (a.k.a. mod_rails), makes all this much simpler by operating as an Apache module. Not only does it eliminate most of the configuration, it also shuts down Rails processes that have been idle for a while, freeing the memory they use. This has the disadvantage of a several-second delay the first time an idle site is accessed, but it is a tremendous advantage for a server that is running lots of low-traffic sites, since they don’t always need to sit in memory.

In this lesson, we’ll use Apache 2.2 with Passenger.

<h2>Set up database configuration for MySQL</h2>

Before we get to the server setup, we need to do a little work on our application. So far, we’ve used <a href="/topic/24373-sqlite">SQLite</a> for our development and test databases, and the production database is set up to use it as well. <a href="/topic/24368-the-mysql-database-server-for-web">MySQL</a> is generally a better choice for production, so let’s change the production configuration. In <code>config/database.yml</code>, replace the section under <code>production</code> with the following:

<pre>
  adapter: mysql
  database: learningrails_production
  username: deploy
  password: secret
  host: localhost
</pre>

The development and test databases can remain set up for SQLite.
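
A quick sanity check of the production section can save a failed first deploy. The following is a minimal sketch, not part of the sample application: the helper name <code>missing_db_settings</code> and the list of required keys are our own assumptions, based only on the five settings shown above. It uses Ruby’s standard YAML library to report any of those settings that are absent or empty.

<pre>
require "yaml"

# Hypothetical helper (not part of the sample app): the settings this
# lesson puts in the production section of config/database.yml.
REQUIRED_KEYS = %w[adapter database username password host]

# Returns the required settings that are missing or empty for one environment.
def missing_db_settings(yaml_text, env = "production")
  config = YAML.load(yaml_text) || {}
  section = config[env] || {}
  REQUIRED_KEYS.select { |key| section[key].to_s.strip.empty? }
end

# The production section from this lesson, written out as a string.
SAMPLE_CONFIG = [
  "production:",
  "  adapter: mysql",
  "  database: learningrails_production",
  "  username: deploy",
  "  password: secret",
  "  host: localhost"
].join("\n")

p missing_db_settings(SAMPLE_CONFIG)   # prints []
</pre>

To check a real file, you could pass <code>File.read("config/database.yml")</code> instead of the sample string, assuming the file contains plain YAML.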
We’ve also added a <a href="http://github.com/mzslater/learning-rails-sample-app/tree/master/db/migrate/011_default_pages.rb">migration</a> that will create all the pages and a couple of sample resource categories and links, so the deployed application should look much like the version you have on your development system. This migration won’t do anything unless there are no pages defined in the database.

Remember that the contents of your local SQLite database aren’t transferred to production, so you’ll need to enter, on the production system, any data you’ve added to the database on your development system (or add a migration to do so, as we’ve done for the basic site contents). This is a bit of a pain for ongoing development: you’ll probably find that you’ll want to copy the database from the production system down to development when you’re doing development work, and sometimes push it back up to production when you’ve added content locally. A GUI tool such as <a href="http://www.navicat.com/">Navicat</a> simplifies this process.

<h2>Install Capistrano</h2>

Capistrano is a powerful utility for automating any kind of web deployment. It is nearly universally used by Rails developers. First, install the Capistrano gem if you don’t already have it:

<pre>
sudo gem install capistrano
</pre>

We’ve used version 2.5 in this lesson. Some details will be different if you’re using an earlier version, and if it is much earlier, it may not work with Git.

Capistrano runs on your development system, not on the server. It performs actions on the server by connecting to it via SSH and issuing commands, just as you would from an SSH terminal session.

To configure the application for Capistrano, the first step is to “Capify” it. From your root application directory, simply enter in a terminal:

<pre>
capify .
</pre>

(That’s “capify”, space, period.)
This adds two files to your application: <code>Capfile</code> at the root of the app, and <code>deploy.rb</code> in the config directory. We don’t need to make any changes to <code>Capfile</code>.

<h2>Set up deploy.rb</h2>

The file <code>deploy.rb</code> is where you tell Capistrano how to access your code repository, as well as other details of your production server(s). The art of configuring Capistrano is mostly a matter of setting up the <code>deploy.rb</code> file.

There’s a <a href="http://github.com/guides/deploying-with-capistrano">guide at github</a> that gives us some guidance about what is needed so Capistrano can fetch our code directly from our github repository. There’s also a lot of tutorial content at <a href="http://capify.org">capify.org</a>.

Here are the lines we need to add to support checking out from our public Git repository:

<pre>
default_run_options[:pty] = true
set :repository, "git://github.com/chaupt/learning-rails-sample-app.git"
set :scm, "git"
set :branch, "master"
set :deploy_via, :remote_cache
</pre>

The first line ensures that password prompts are passed through. The next three lines tell Capistrano where the repository is, that it is a Git repository, and which branch to use. The last line tells Capistrano to keep a cache of the repository on the web server, so it only has to transfer new code from the repository when deploying a new version.

If you are deploying from your own repository, you’ll need to change the repository URL, and if it isn’t a public repository, you’ll want to set up an SSH key pair, with the private key on your server and the public key added to your Github account; see the <a href="http://github.com/guides/home">help docs on Github</a> for details.
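
For a private repository, the repository line would use the SSH form of the URL rather than the public <code>git://</code> form. The fragment below is only a sketch: the account and repository names are placeholders, and the commented-out agent-forwarding line is an optional alternative to keeping a private key on the server, which may or may not suit your setup.

<pre>
# Sketch for a private repository; "your-account/your-app" is a placeholder.
set :repository, "git@github.com:your-account/your-app.git"
set :scm, "git"

# Optional alternative (an assumption about your setup): forward your local
# SSH agent so the server authenticates to Github with your own key.
# ssh_options[:forward_agent] = true
</pre>
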
Then we need to tell Capistrano about our servers:

<pre>
set :application, "learningrails"
set :deploy_to, "/var/www/apps/#{application}"
set :user, "deploy"
set :admin_runner, "deploy"

role :app, "sampleapp.learningrails.com"
role :web, "sampleapp.learningrails.com"
role :db, "sampleapp.learningrails.com", :primary => true
</pre>

The application name is used to name the folder into which the application code is deployed. The second line specifies where this folder is located on the system, with the application name substituted in as a variable. Then we tell Capistrano what user name to use when logging into the server. (It is customary to use a dedicated user name just for deployments.)

By default, Capistrano uses sudo to execute commands on the server. This won’t work on most shared servers, where you don’t have sudo access, so you might need to add the option <code>set :use_sudo, false</code>. If you do so, however, Capistrano might not be able to create the directories for your application. We’ve left sudo enabled, and used the <code>admin_runner</code> variable to set the user name under which Capistrano will use the sudo command when creating directories. This is what worked best for us in our server configuration, but your mileage may vary; if in doubt, check with your hosting company. (Versions of Capistrano prior to 2.4 behave somewhat differently.)

The three “role” lines tell Capistrano about our servers. We’re using a very simple configuration, in which the web (HTTP), application (Rails), and database (MySQL) servers are all on the same machine, but they don’t have to be. You can also have a cluster of servers, with more than one server performing each role, and Capistrano will take care of, for example, checking out a copy of the code to each application server, and modifying the setup on each web server to display a maintenance page.

Finally, you need to tell Capistrano how to start and restart the application server.
There are standard Capistrano tasks for this that are designed for Mongrel and similar application servers; for Passenger, things are simpler, but we need to tell Capistrano just what to do, with these lines of code:

<pre>
namespace :deploy do
  desc "Restart Application"
  task :restart, :roles => :app do
    run "touch #{current_path}/tmp/restart.txt"
  end

  desc "Start Application -- not needed for Passenger"
  task :start, :roles => :app do
    # nothing -- need to override default cap start task when using Passenger
  end
end
</pre>

The namespace block tells Capistrano what name prefix to use for these tasks. Namespaces help avoid unintentional name conflicts in complex setups.

To restart Passenger, you only need to update the modification date of the <code>restart.txt</code> file, which is done with the <code>touch</code> command. Note that you can simply write <code>run</code> and then the command, and Capistrano knows to open an SSH terminal to the app server (as specified on the <code>task</code> line) and issue this command. <code>current_path</code> is a variable that provides the path to the current version of the application.

With Passenger, you don’t need to do anything to start the app server, but we need to define an empty start task to override the one that is part of the default Capistrano tasks.

These tasks give you a hint of what’s possible with Capistrano: you can use it to automate nearly everything you’d want to do on your server. Keep in mind, though, that if you use the <code>set :use_sudo, false</code> setting, you can’t do things that require a privileged user.

We’re done changing code now, so if you’re working in your own repository, commit your changes now. We’ve already made all these changes in the public Github repository.

<h2>Install Passenger</h2>

Now we’re ready to set up the server. There’s a vast array of options here, depending on whether you’re on a shared or dedicated server, what operating system is installed, and so forth.
To avoid a lot of system-specific discussion and keep this lesson to a reasonable length, we’re going to assume you have a Linux server with Apache 2.2, MySQL, and Git installed. If not, see the links at the start of this article for pointers to some resources, or check with your hosting company.

To install Passenger, simply enter in an SSH terminal window connected to your server:

<pre>
sudo gem install passenger
</pre>

And then:

<pre>
passenger-install-apache2-module
</pre>

The Passenger installer will check for any required dependencies and give you specific instructions if there’s anything else you need to set up first.

<h2>Configure Apache</h2>

Now we need to configure Apache. Your system should have a base Apache configuration already, often in <code>/etc/httpd/conf/httpd.conf</code>, though this varies depending on the Linux variant and system configuration.

We need to add a few lines to the Apache configuration file to tell Apache to use the Passenger module:

<pre>
LoadModule passenger_module /usr/lib/ruby/gems/1.8/gems/passenger-2.0.3/ext/apache2/mod_passenger.so
PassengerRoot /usr/lib/ruby/gems/1.8/gems/passenger-2.0.3
PassengerRuby /usr/bin/ruby
</pre>

The Passenger installer will provide these lines for you; the paths and other details may be different for your system configuration.

Then we need a virtual host configuration for our site. We put this in a separate file, which is included by reference at the end of the main Apache config file. Here’s what we used:

<pre>
<VirtualHost *:80>
  ServerName sampleapp.learningrails.com
  DocumentRoot /var/www/apps/learningrails/current/public
  <Directory "/var/www/apps/learningrails/current/public">
    Options FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
  </Directory>
</VirtualHost>
</pre>

The <code>ServerName</code> line identifies the root URL for your site. We’ve used the URL for our demonstration server, but you’ll want to use your own server URL here.
You’ll need to set up a DNS record to point this name to the IP address of your server.

The <code>DocumentRoot</code> line must point to the <code>public</code> directory of your Rails application. We told Capistrano to put the application in <code>/var/www/apps/learningrails</code>, and as we’ll see shortly, <code>current</code> is a symlink that points to the currently active version.

The Directory directives ensure that Apache is allowed to access the public directory. This may or may not be necessary, depending on what the rest of your Apache config file looks like.

Now restart Apache, and make sure there are no errors reported. On our server, this can be done with:

<pre>
sudo /sbin/service httpd restart
</pre>

Once again, the command may be different for your server configuration.

<h2>Set up MySQL</h2>

We have just one more thing we need to do on the server: create the empty database. You can do this with a MySQL GUI application, if it can connect to your server (typically using an SSH tunnel), or you can do it from the command line in an SSH terminal session.

To use the latter approach, first create the database with the following command at the shell prompt:

<pre>
mysqladmin -u root -p create learningrails_production
</pre>

This command assumes that you have a user configured for MySQL named root, and you’ll be prompted for root’s database password after you enter this command.

Then, start the mysql monitor program by entering

<pre>
mysql -u root -p
</pre>

and create the deploy user by entering the following command at the mysql prompt:

<pre>
mysql> grant all privileges on *.* to 'deploy'@'localhost' identified by 'secret';
</pre>

(Note that you don’t enter <code>mysql></code>; that’s the prompt.) The user name (deploy) and password (secret) must correspond to those in the production section of your <code>database.yml</code> file.

<h2>Your first deploy</h2>

Now, on to Capistrano tasks.
First, to get an overview of all the Capistrano commands available, enter (in a terminal on your development system):

<pre>
cap -T
</pre>

Remember, Capistrano always runs on your development system, even though it is issuing commands to your server.

Now there’s some one-time-only setup:

<pre>
cap deploy:setup
</pre>

Capistrano will connect to your server, and will prompt you for the server password if you don’t have an SSH key set up. Capistrano will then create some directories to hold your application code.

To make sure everything is ok, enter:

<pre>
cap deploy:check
</pre>

Hopefully, this tells you that everything looks good; if not, read the error messages carefully and correct any problems. If there are permissions issues, Capistrano may not be able to create the directories it needs. You may need to log into the server and create the <code>/var/www</code> directory manually. For our server, the owner of this directory is set to <code>deploy</code>, and the group to <code>apache</code>.

Now, just enter:

<pre>
cap deploy:cold
</pre>

And you should see lots of lines fly by, as Capistrano tells the server to check out the code from Github and sets up some symlinks. At the end, all the migrations should be run.

If all has gone well, you should now be able to browse to your server and see the application running. (You can visit http://sampleapp.learningrails.com to see the copy that we deployed.)

<h2>Troubleshooting</h2>

There’s a lot that can go wrong in this process, and unless everything works, your app won’t. So watch the Capistrano output carefully while the tasks are running. If there are no errors there, but it hangs on checking out the code, then there’s a problem with the authentication with Github (if you’re using a private repository) or the Git URL.

If you get no response when you try to access the application, make sure your DNS is set up correctly and has propagated to your computer, and that Apache is running.
The Apache error log file may be helpful (the location varies depending on the system configuration; on our system, it is <code>/var/log/httpd/error_log</code>).

If you get a Rails error page (error 500), then check the Rails log on the server for clues (<code>/var/www/apps/learningrails/shared/log/production.log</code>).

Because of the wide range of issues that can come up, it’s unlikely that we’ll be able to answer all your questions here. Here are some resources to try:

<ul>
<li><a href="http://groups.google.com/group/capistrano">Google group for Capistrano</a></li>
<li><a href="http://groups.google.com/group/rubyonrails-deployment">Google group for Rails Deployment</a></li>
<li><a href="http://groups.google.com/group/rubyonrails-talk">Google group for general Rails issues</a></li>
<li><a href="http://groups.google.com/group/phusion-passenger">Google group for Passenger</a></li>
<li><a href="http://railsforum.com">RailsForum</a></li>
<li><a href="http://linuxquestions.org">LinuxQuestions forum</a></li>
</ul>

And, of course, there are always the tutorials, FAQs, and support staff at your hosting company.

<h2>Deploy, and deploy again</h2>

Once this is all set up and working, deploying new versions is gloriously simple. First, be sure to commit your changes and push them to Github. Then, just:

<pre>
cap deploy
</pre>

Or, if you’ve added any migrations:

<pre>
cap deploy:migrations
</pre>

If you want to put up a maintenance page during deployments, first issue this command:

<pre>
cap deploy:web:disable
</pre>

And then do the deployment. When it’s done, just:

<pre>
cap deploy:web:enable
</pre>

<h2>Something wrong? Just roll back</h2>

Another great thing about Capistrano is that it maintains multiple versions of your application on your web server. In the directory <code>/var/www/apps/learningrails</code>, you’ll find three directories: <code>shared</code>, <code>releases</code>, and <code>current</code>.
<ul>
<li> <code>shared</code> is for information that doesn’t change with each deployment. Log files, for example, are stored in <code>shared/log</code>. If you have assets that are uploaded to the server and don’t go in the database or the repository, then they should go here, and you’ll want to add a Capistrano task (which can be automatically executed after each deployment) to symlink to the shared location from the current <code>public</code> directory.</li>
<li> <code>releases</code> has a directory for each deployed version. Capistrano keeps every release until you run <code>cap deploy:cleanup</code>, which by default keeps the five most recent versions.</li>
<li> <code>current</code> isn’t a real directory; it’s a symlink to the current release directory. So you can always refer to the current version as <code>/var/www/apps/learningrails/current</code>, and you’ll really be accessing one of the directories inside <code>releases</code>.</li>
</ul>

So what’s the use of all these releases directories? If you deploy a version and then find that your application has a serious problem, just roll back:

<pre>
cap deploy:rollback
</pre>

This task takes only a moment to execute, because all it has to do is change the <code>current</code> symlink to point to the previous <code>releases</code> directory.

<h2>And there’s more…</h2>

We’ve just scratched the surface in this lesson, as deployment is a very complex topic. There’s much more you can do with Capistrano, and more that you should know to create a secure production deployment. Check out the links early in this lesson page to dive deeper.

Next lesson, we’ll look at how to evaluate the performance of your deployed application and find performance trouble spots using New Relic’s RPM service.
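
As a postscript to the rollback section, the symlink switch can be tried out safely on a scratch directory. The following is a simplified simulation and not Capistrano’s own code: the function names and release timestamps are made up for illustration, it assumes a Unix-like system where symlinks are available, and it only re-points <code>current</code> at the previous entry in <code>releases</code>.

<pre>
require "tmpdir"
require "fileutils"

# Illustrative simulation only; names and timestamps here are hypothetical.

# Points the "current" symlink at the given release directory.
def point_current_at(app_root, release_name)
  current = File.join(app_root, "current")
  FileUtils.rm_f(current)   # remove the old symlink, if any
  File.symlink(File.join(app_root, "releases", release_name), current)
end

# Re-points "current" at the release just before the active one and
# returns the name of the release that is now active.
def roll_back(app_root)
  releases = Dir.children(File.join(app_root, "releases")).sort
  active = File.basename(File.readlink(File.join(app_root, "current")))
  index = releases.index(active)
  raise "no earlier release to roll back to" if index.nil? || index.zero?
  previous = releases[index - 1]
  point_current_at(app_root, previous)
  previous
end

# Builds two fake releases in a temporary directory, then rolls back once.
def demo_rollback
  Dir.mktmpdir do |app_root|
    %w[20090101120000 20090102120000].each do |name|
      FileUtils.mkdir_p(File.join(app_root, "releases", name))
    end
    point_current_at(app_root, "20090102120000")
    roll_back(app_root)
  end
end

puts demo_rollback   # prints 20090101120000
</pre>

Because release directories are named by timestamp, sorting their names puts them in deployment order, which is why the entry before the active one is the rollback target in this sketch.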
