Highly Available Web Pages, Apache and PHP as an Example

Several weeks ago, I discussed how to build highly available file storage and a highly available relational database.  With a robust supply of file system and database services, we can stack more services on top.  Today, I take Apache and PHP as an example of how to build a highly available web server.  As usual, I use FreeBSD for the purpose.

Why Apache and PHP!?

Quite a few people would argue that Apache and PHP are outdated.  I am not going to comment on this.  I hope you will appreciate the shortness of this article because of this choice.  The concepts you acquire from this example can be applied to your favourite platform.

Active / Active Web Servers

The example today is an active / active pair.  The two servers can serve web pages together without error.  In fact, one can have any number of hosts working together.  In contrast, in the previous example of NFS, only one server is active while the other passively receives updates and stands by to take over.

To make this happen, one needs to ensure that the intermediate execution state of one server is saved so that it can be picked up by the other servers.  In this example, we will move this state to the highly available storage.  With role separation, the intermediate data will not be lost no matter what happens to the web servers.  Another benefit is that web servers can be easily added (to handle more load), replaced (for system updates), or removed (for economy) without affecting any user sessions.

Session Data

When running a cluster of web servers, one needs to take care of the session data.  HTTP servers are stateless by themselves.  Cookies (note to Europeans) are saved on the client side, so the servers need not worry about them.  Session storage, provided by most web programming environments, expects the storage on the server to be consistent and persistent for each user session.

Session data is useful for storing information that should not be read or modified by the web users.  This could be some intermediate game state, some login privileges, etc.  It is therefore important to make sure such data is not lost when a user lands on another web server in the cluster.

Packages

At the time of writing, Apache 2.4 and PHP 7.1 are the newest.

To run PHP in the laziest way, install the “mod_php” package.  In addition, install some common PHP modules.  There is a moderate collection called “php71-extensions”.  I would also install “php71-mysqli” and “php71-gd”.  They are for database connections and image processing respectively.

# pkg install mod_php71 php71-extensions php71-mysqli php71-gd

What about Apache?  It is installed automatically as a dependency when you request “mod_php”.

Shared Directories

To build on top of the previous example, I am mounting the NFS share prepared previously at /mnt/nfs.  On the shell, every time you want to mount:

# mkdir /mnt/nfs
# mount 10.65.10.13:/nfs /mnt/nfs
#

Alternatively, in the “/etc/fstab”:

# mkdir /mnt/nfs
# cat >> /etc/fstab << EOF
10.65.10.13:/nfs    /mnt/nfs    nfs    rw    0    0
EOF
# mount /mnt/nfs
#

Configuring the Apache

Modify Apache so that the web page and script locations are in the NFS share.  The file to be modified is “/usr/local/etc/apache24/httpd.conf”.  There are originally two directories, for normal web pages and CGI pages, in “/usr/local/www/apache24”.  We are moving them to “/mnt/nfs”.

# sed -ibak 's|/usr/local/www/apache24|/mnt/nfs|' \
  /usr/local/etc/apache24/httpd.conf
# cp -a /usr/local/www/apache24/cgi-bin /mnt/nfs/
# cp -a /usr/local/www/apache24/data /mnt/nfs/
# cat >> /etc/rc.conf << EOF
apache24_enable="YES"
EOF
#

Also, update “/usr/local/etc/apache24/httpd.conf” so that it handles PHP files.  Modify the following parts manually:

<IfModule dir_module>
    DirectoryIndex index.html index.php
</IfModule>

<FilesMatch "\.php$">
    SetHandler application/x-httpd-php
</FilesMatch>
<FilesMatch "\.phps$">
    SetHandler application/x-httpd-php-source
</FilesMatch>

Configuring the PHP

PHP requires only copying the default configuration, and then modifying the session path.  One can also modify the upload path similarly, but I do not think it is necessary.

# cd /usr/local/etc/
# cp php.ini-production php.ini
# vi php.ini

The line to be added is just as follows:

; http://php.net/session.save-path
;session.save_path = "/tmp"
session.save_path = "/mnt/nfs/tmp"
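
The directory named in “session.save_path” has to exist and be writable by the web server before PHP can store sessions there.  Here is a minimal sketch of preparing it.  The helper name “prepare_session_dir” and the mode 0770 are my own assumptions, not something PHP mandates.  On a stock FreeBSD Apache, the directory would also be handed to the “www” user; that step is shown as a comment only, and depending on how the NFS export maps root it may have to be done on the storage host instead.

```sh
#!/bin/sh
# Hypothetical helper: create a session directory and restrict its permissions.
prepare_session_dir() {
  dir=$1
  mkdir -p "$dir" || return 1
  # Owner and group only; other users should not read session files.
  chmod 0770 "$dir"
}

# Assumed usage, as root, with the NFS share mounted:
#   prepare_session_dir /mnt/nfs/tmp
#   chown www:www /mnt/nfs/tmp
```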

Example Code

For the changes to take effect, restart Apache.  Here is a simple script that prints a counter every time it is loaded.  Write it to the file “/mnt/nfs/data/counter.php”:

<?php
session_start();
if (isset($_SESSION['counter']))
  $_SESSION['counter'] += 1;
else
  $_SESSION['counter'] = 1;
print($_SESSION['counter']);
session_write_close();
?>

Testing

In order to test the service, we first turn on the web servers.  Then, we alternately point a domain name to each of the IP addresses.  The counter code above is executed repeatedly as the mapping changes.  If successful, the counter keeps increasing.

The laziest way to change the domain name mapping is of course editing the host table where the web browser is run.  For BSD and BSD-like systems, edit “/etc/hosts”.  For Windows, edit “C:\Windows\System32\Drivers\etc\hosts”.
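
If you would rather not edit the hosts table, curl can pin a host name to a given address for a single request.  The sketch below assumes the two web servers are 10.65.10.21 and 10.65.10.22 and that the site is called “www.example.com”; all three are placeholders of mine.  The helper “is_counting_up” merely checks that the numbers it is given increase by one each time.

```sh
#!/bin/sh
# HOST and the addresses used below are placeholders, not taken from the setup.
HOST=www.example.com
JAR=${TMPDIR:-/tmp}/counter-cookies.txt

# Fetch the counter page from one specific server, reusing the session cookie.
fetch_counter() {
  # --resolve pins HOST:80 to the address in $1 for this request only;
  # -b / -c read and write the cookie jar so the PHP session is kept.
  curl -s --resolve "$HOST:80:$1" -b "$JAR" -c "$JAR" "http://$HOST/counter.php"
}

# Succeed only if every argument is exactly one more than the previous one.
is_counting_up() {
  prev=""
  for n in "$@"; do
    if [ -n "$prev" ] && ! [ "$n" -eq $((prev + 1)) ] 2>/dev/null; then
      return 1
    fi
    prev=$n
  done
  return 0
}

# Assumed usage, alternating between the two web servers:
#   a=$(fetch_counter 10.65.10.21)
#   b=$(fetch_counter 10.65.10.22)
#   c=$(fetch_counter 10.65.10.21)
#   is_counting_up "$a" "$b" "$c" && echo "session survived the switch"
```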

Upcoming Steps

Once there are multiple web servers ready, one can consider having a load balancer or a content distribution network.  A load balancer service can be found in some value-added cloud service providers.  (Or else, you can set up one yourself.)  A content distribution network can be found anywhere around the globe.  Putting aside their fundamental differences, they both accept multiple web server addresses and route network traffic accordingly.

Troubleshoot

Getting a blank page from PHP:  If you follow my instructions, you have configured PHP in production mode.  Error messages are not displayed but saved to the Apache log, which is “/var/log/httpd-error.log” in our context.  Read it and you will get an idea.

Function “session_start” undefined:  Make sure you have installed the package “php71-session” and restarted Apache.  It should be installed as a dependency when you install the package “php71-extensions”.

Appending Distribution Files after Installing FreeBSD

Previously, I discussed how to install FreeBSD with the installer.  In Question 4, the installer allows administrators to select which distributions to install – 32-bit compatibility libraries, source code, debug symbols, etc.

Sometimes, maybe due to a mistaken omission, or maybe due to a new purpose, more distribution files have to be added.  In the good old days of FreeBSD 4.x, I could easily run the “/stand/install” again and let it reconfigure the system.  The new installer introduced in 9.x is unfamiliar to me, so I have to do it myself.

Thankfully, it is much easier than one might think.

Downloading the Files

Downloading the distribution files is relatively simple with FTP.  There is an FTP client that comes with the default minimal FreeBSD installation.  From there, we can download the distribution files.  For simplicity, I have skipped the directory listing messages.  The filenames will be self-explanatory as you encounter them.

# ftp -a ftp.freebsd.org
Connected to ftp.geo.freebsd.org.
(Output truncated)
220 This is ftp.geo.freebsd.org - hosted at ISC.org
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> cd pub/FreeBSD/releases
ftp> ls
150 Here comes the directory listing.
(Output truncated)
226 Directory send OK.
ftp> cd amd64
ftp> ls
150 Here comes the directory listing.
(Output truncated)
226 Directory send OK.
ftp> cd 11.0-RELEASE
ftp> ls
150 Here comes the directory listing.
(Output truncated)
226 Directory send OK.
ftp> mget kernel-dbg.txz base-dbg.txz
mget kernel-dbg.txz [anpqy?]? a
Prompting off for duration of mget
229 Entering Extended Passive Mode
150 Opening BINARY mode data connection for kernel-dbg.txz
226 Transfer complete
229 Entering Extended Passive Mode for base-dbg.txz
226 Transfer complete
ftp> exit
221 Goodbye

Installing the Files

If you want to preview what files are inside, you can use “tar tf” command directly, such as…

# tar tf kernel-dbg.txz
# tar tf base-dbg.txz

Installing the files is a simple tarball extraction to the root directory.  The “.txz” files are tarballs compressed with XZ, not Bzip2.  For example…

# tar xJf kernel-dbg.txz -C /
# tar xJf base-dbg.txz -C /

Here, “x” stands for extract, “J” stands for XZ, “f” stands for filename, and “C” stands for changing to a given directory (which is the root in our case).  FreeBSD's tar detects the compression automatically when reading an archive, so the “J” may be omitted.

Updating FreeBSD

It is likely the system has been patched since the “release” installation.  To make sure the files you installed match with your updated system, you can consider running the FreeBSD update once.  Please note the commands have to be run on interactive terminals.  Make backups if the system holds files that you cannot lose.

# freebsd-update fetch
# freebsd-update install

Installing without Installer?

Answering the questions of the FreeBSD installer can be boring.  Technically, installing a minimal FreeBSD can be as simple as:

  1. Boot a temporary operating system environment (like live CD)
  2. Partition the drives and install the boot loader (like Question 8 of here)
  3. Download and decompress the distribution files “kernel.txz” and “base.txz”
  4. Configure the essential config files, “/etc/fstab” and “/etc/rc.conf”
  5. Remove any temporary boot media and reboot

Will it work?  Well…

Highly Available MySQL Server

Previously, I have discussed how to set up a highly available block device and also a highly available file system.  In this article, I further demonstrate how to set up a highly available MySQL database service.

Installing the Packages

As usual, one will need to install the package on two hosts and it can be easily done by:

# pkg install mysql57-server

I know there are alternatives.  Forgive my laziness.

Running for the First Time

We start MySQL once so that it generates the file structures.  Try to log in, and then press Ctrl-D to exit.

# service mysql-server onestart
# cat /root/.mysql_secret
# mysql -u root -p
Password: **********
root@localhost [(none)]> ^D
# service mysql-server onestop

It is then discovered (through educated guess) some directories are created in the “/var/db” directory, namely “mysql”, “mysql_tmpdir”, and “mysql_secure”.  Suppose you already have the “/db” mounted (as in the previous article), move them there and make the replacement symbolic links.

# mv /var/db/mysql /db
# mv /var/db/mysql_tmpdir /db
# mv /var/db/mysql_secure /db
# ln -s /db/mysql /var/db/
# ln -s /db/mysql_tmpdir /var/db/
# ln -s /db/mysql_secure /var/db/
# ls -ld /var/db/mysql*
lrwxr-xr-x  1 root  wheel   9 Apr 19 20:37 /var/db/mysql -> /db/mysql
lrwxr-xr-x  1 root  wheel  16 Apr 19 20:37 /var/db/mysql_secure -> /db/mysql_secure
lrwxr-xr-x  1 root  wheel  16 Apr 19 20:37 /var/db/mysql_tmpdir -> /db/mysql_tmpdir

Some would question why not change the configuration to the new paths.  I find it mostly a matter of taste.  If you want to make life easier for those who have memorised the default paths, do make the symbolic links.

Configurations

You will want to modify the configuration file “/usr/local/etc/mysql/my.cnf”.  For your reference, there is a sample file “my.cnf.sample”.  At minimum, you will need to modify the bind address (default 127.0.0.1) so that the service is available not just locally, but to the other computers in the same intranet.
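
As a rough sketch of that minimal change, the helper below rewrites an existing “bind-address” line in place.  The function name is mine, and the address in the usage comment is the service address used by the script in the next section.  It assumes the file already carries an uncommented “bind-address” line, and it refuses to guess otherwise, because the setting has to live in the “[mysqld]” section.

```sh
#!/bin/sh
# Hypothetical helper: point an existing bind-address line at a new address.
set_bind_address() {
  conf=$1
  addr=$2
  if ! grep -q '^[[:space:]]*bind-address' "$conf"; then
    echo "no bind-address line in $conf; edit the [mysqld] section by hand" >&2
    return 1
  fi
  # Keep a .bak copy of the original file next to it.
  sed -i.bak "s|^[[:space:]]*bind-address.*|bind-address = $addr|" "$conf"
}

# Assumed usage:
#   set_bind_address /usr/local/etc/mysql/my.cnf 10.65.10.14
```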

The Script

The script for starting and stopping the MySQL server is simpler than the NFS one and is as follows.  Like last time, automatic switching is skipped due to my conflict of interest.  You will need a mechanism to call “start” and “stop” properly.

#!/bin/sh -x

# Bring the service up on this host: address, HAST, file system, MySQL.
start() {
  ifconfig vtnet1 add 10.65.10.14/24
  hastctl role primary db_block
  # The device node appears only once HAST has taken the primary role.
  while [ ! -e /dev/hast/db_block ]
  do
    sleep 1
  done
  fsck -t ufs /dev/hast/db_block
  mount /dev/hast/db_block /db
  service mysql-server onestart
}

# Tear the service down in the reverse order.
stop() {
  service mysql-server onestop
  umount /db
  hastctl role secondary db_block
  ifconfig vtnet1 delete 10.65.10.14
}

# Succeeds only if every component is up.
status() {
  ifconfig vtnet1 | grep 10.65.10.14 && \
  service mysql-server onestatus && \
  ls /dev/hast/db_block
}

# Succeeds if any component is still present.
residue() {
  ifconfig vtnet1 | grep 10.65.10.14 || \
  service mysql-server onestatus || \
  mount | grep /db || \
  ls /dev/hast/db_block
}

# Exits with success only if nothing is left behind.
clean() {
  residue
  if [ $? -ne 0 ]
  then
    exit 0
  fi
  exit 1
}

# "=" is the portable string comparison for test(1).
if [ "$1" = "start" ]
then
  start
elif [ "$1" = "stop" ]
then
  stop
elif [ "$1" = "status" ]
then
  status
elif [ "$1" = "clean" ]
then
  clean
fi

Troubleshoot

If MySQL fails to start, you can confirm that it is not running with the command “service mysql-server onestatus”.  There are also log files located in the MySQL data directory; in our context, it is “/db/mysql/<hostname>.err”.  Please note the end of the log is most likely a graceful shutdown.  You will need to scroll upwards for the actual reason why the startup failed.

System Performance with FreeBSD (Minecraft Server as Example)

Quite some time ago, we discussed how to compile Minecraft and get it running on FreeBSD.  In this article, we take the server as an example of how we can monitor system performance.

Minecraft Memory Usage

Minecraft is a Java program.  Java programs generally consume more memory than their counterparts written in unmanaged languages.  Thankfully, Java programs run inside their own sandboxes and have their memory usage allocated and constrained.  In the previous article, we set the heap to start at 1024 megabytes and to grow to at most 1024 megabytes:

java -Xmx1024M -Xms1024M -jar spigot*.jar

It is important not to let Minecraft overrun the system memory.  As folklore has it, a Java program running on a dedicated computer should not be given more than 60% of the total memory.  For example, my virtual machine has 2048 megabytes of memory and 60% of it is about 1200 megabytes.  I deducted a further 200 megabytes as a safety margin.
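
The arithmetic in the paragraph above can be sketched in shell.  The function name and the default margin of 200 megabytes are mine; the 60% ratio is the folklore figure, not a rule from the Java documentation.

```sh
#!/bin/sh
# Suggest a heap size in megabytes: 60% of total memory minus a safety margin.
heap_budget_mb() {
  total_mb=$1
  margin_mb=${2:-200}
  echo $(( total_mb * 60 / 100 - margin_mb ))
}

heap_budget_mb 2048   # prints 1028
```

The result, 1028, is close enough to the 1024 megabytes passed to “-Xmx” and “-Xms”.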

General Process Monitoring

FreeBSD provides the top(1) utility to check for system saturation and utilisation.  Generally speaking, a system is considered saturated if the number of threads ready to run is higher than the number of processor cores, and considered fully utilised if the utilisation numbers are near 100%.

[Screenshot 2017-06-06 10.57.56 PM: top(1) output]

In the top part, there are three decimal numbers, representing the load averages over 1, 5, and 15 minutes.  Each is calculated as the average number of threads ready to run and is regarded as the saturation of the processors.  In the third and fourth rows, the overall processor and memory utilisation ratios / rates are shown.

In the picture, the system is not yet saturated, as on average there are about 0.6 threads ready to run.  It is under some computation load, with around 20% of the processing power and 80% of the memory utilised.  This is a healthy situation: the system is being used, with some slack to handle possible usage spikes.
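
The saturation rule described above can be written as a small check.  This is only a sketch: the function name is mine, and the commented usage assumes that “sysctl -n vm.loadavg” prints the three averages between braces, which is how I remember it on FreeBSD.

```sh
#!/bin/sh
# Succeed if the load average exceeds the number of processor cores.
is_saturated() {
  awk -v load="$1" -v ncpu="$2" 'BEGIN { exit !(load > ncpu) }'
}

# Assumed usage on FreeBSD, where vm.loadavg prints like "{ 0.60 0.55 0.50 }":
#   load=$(sysctl -n vm.loadavg | awk '{ print $2 }')
#   ncpu=$(sysctl -n hw.ncpu)
#   is_saturated "$load" "$ncpu" && echo "saturated"
```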

In the bottom part, we have a detailed breakdown per process.  The fields “TIME” and “WCPU” represent the total processor time and the current share of the processor each process is using.  The actual memory usage is shown in the “RES” column.

In the picture, we see the Java process using 44% of the current processing power, and it has accumulated 295 hours.  Although the Java process is instructed to use only 1024 megabytes of memory, it eventually consumes 1368 megabytes, because the limit applies to the Java heap only and the virtual machine needs memory of its own on top of it.  This echoes why the folklore recommends giving only 60% of system memory to a Java virtual machine.

System Input / Output Monitoring

FreeBSD provides the systat(1) utility for system usage statistics.  Since general process monitoring can be done with the top(1) command above, systat(1) is mostly useful for detailed input and output monitoring.  By default, it shows a “pigs” screen that refreshes every 10 seconds.  You can choose what you want to see on the command line, for example, to see network interface usage every second:

systat -ifstat 1

[Screenshot 2017-06-06 11.02.03 PM: systat(1) ifstat page]

The screen works like vi(1).  You can press the colon sign (:) and then enter a command.  The first command you want to try is “help”.  It immediately tells you the list of available pages you can switch to.  You can then use the colon sign again and enter the page you want.

I would tell you vmstat is my favourite page.  I learned it on the very first days I was taught FreeBSD.  It shows a lot of comprehensive information from system utilisation to interrupts and disk accesses.

systat -vmstat 1

[Screenshot 2017-06-06 10.59.11 PM: systat(1) vmstat page]

On the top, we see the system saturation and utilisation as usual.  Immediately under it, we see the detailed breakdown of system events like context switches (Csw), traps (Trp), system calls (Sys), etc.  The system is now having thousands of context switches per second.  It would be no good if it were a system for high performance computing, but for our context of gaming with network messages, it is absolutely normal.

Further below, we see an ASCII art of system utilisation broken down into system (=), interrupts (+), user space applications (>) and niced user applications (-).  The system is using a mild amount of processing power and most of it is for the user space, which is a good thing.

The second bottom section shows the name caches of the virtual file system.  For a gaming system that uses few files, you can expect the hit rate to be near 100%.  Otherwise you may want to pay more attention to the name caches.  In the picture, we see the system rarely searches for a file and those requests are handled by the cache perfectly.

The bottom section shows the disk utilisation.  Utilisation and bandwidth of each of the virtual disks are shown.  In the picture, we see the system rarely accesses the files.

The rightmost column shows the interrupt statistics.  The “timer” interrupts happen almost all the time in order to hint the operating system to context switch and update system clocks.  It used to be 100 or 1000 per processor core; thankfully, with the advent of more advanced system clocks, systems no longer need to tick as frequently as before.  In the picture, the network card (virtio-network) requires quite some interrupt handling.  As long as the network card interrupts do not go to numbers like 50000 or 100000, they are most likely normal.

The list of pages available in the tool, as of today, is:

  • pigs: shows the processes which consume the most processing power
  • vmstat: (as discussed above)
  • swap: shows the system swap situation
  • zarc: shows the ZFS adaptive replacement cache situation
  • iostat: raw disk input and output statistics
  • netstat: network socket statistics, such as buffered bytes for each of the connections
  • sctp: stream control transmission protocol statistics
  • tcp: transmission control protocol statistics
  • ip: internet protocol statistics
  • ip6: internet protocol statistics for IPv6
  • icmp: internet control message statistics such as ping, etc
  • icmp6: internet control message statistics for IPv6
  • ifstat: raw network interface utilisation statistics

Conclusion

In this article, we go through some performance monitoring tools that come with FreeBSD.  The general process information can be listed by the top(1) command, where you can understand the system saturation and utilisation, and also list the resource consuming processes.  More detailed resource utilisations like network, disks, hardware interrupts can be found in the systat(1) command.  If in doubt, the “vmstat” page can be a great starting point to look for congested system resources.

Highly Available Network File System

In the previous article, we discussed the way to create a highly available block device by replication.  We continue and attempt making a network file system (NFS) on top of it.  We first discuss the procedures to start and stop the service.  Then we have the script…  Some parts are deliberately missing due to my conflict of interest.

NFS Configuration

Since it is not our goal here, we only do minimal NFS configuration in this example.  In short, the exports(5) file “/etc/exports” is modified as follows.  This means the directory “/nfs” is shared with the two given IP subnets.

/nfs -network=10.65.10.0/24
/nfs -network=127.0.0.0/24

Unlike the previous setup, we do not use the “/etc/rc.conf” file to start the service.  This is because we like to control when a service is started, instead of starting it blindly just after boot.  In FreeBSD, services can be started with the “onestart” command.

Firewall Configuration

Configuring NFS for a tight firewall is tricky, because it uses random ports.  For convenience, a simple IP address-based whitelist can be implemented.  In this example, we have the server IP 10.65.10.13 (see later), and the client IP 10.65.10.21.  If you simply do not have a firewall, skip this part.  On the server side, the PF can be configured with:

pass in quick on vtnet1 from 10.65.10.21 to 10.65.10.13 keep state

On the client side, the PF can be configured with:

pass in quick on vtnet1 from 10.65.10.13 to 10.65.10.21 keep state

Starting the Service

When we start the service, we want the following to happen:

  1. Acquire the IP address, say 10.65.10.13, regardless of which machine it is running on.
  2. Activate the HAST subsystem so as to take the primary role.
  3. Wait for the HAST device to become available.  If the device is in the secondary role, the device file in “/dev/hast” will not appear, so we sleep for a while.
  4. Run the file system check in case the file system was not cleanly unmounted last time.
  5. Mount the file system for use (in this example, “/nfs”).
  6. Start the NFS-related services in order: the remote procedure call binding daemon, the mount daemon, the network file system daemon, the status daemon, and the lock daemon.
  7. Once step 6 completes, the service is available to the given clients as instructed to the NFS and allowed by the firewall.

For the impatient, one can jump to the second last section for the actual source code.

Stopping the Service

Stopping the service is the reverse of starting, except some steps can be less serious.

  1. Stop the NFS-related services in order: the lock daemon, the status daemon, the network file system daemon, the mount daemon, and finally the remote procedure call binding daemon.
  2. Unmount the file system.
  3. Put the HAST device in the secondary role.
  4. Release the IP address so that neighbours can reuse it.

Also, one can jump to the second last section for the actual source code.

Service Check Script

There are two types of checking.  The first one ensures all the components (like the IP address, mount point service, etc) are present and valid.  The procedure returns success (zero) only when the components are all turned on.  Whenever a component is missing, it will be reported as a failure (non-zero return code).

The second one ensures all the components are simply turned off, so that the service can be started elsewhere.  The procedure returns success (zero) only when all the components are turned off.  Whenever a component is present, it will be reported as a failure (non-zero return code).

What is Missing

Once we master how to start and stop the service on one node, we need a mechanism to automatically start and stop the service as appropriate.  In particular, it is of the utmost importance not to run the service concurrently on two hosts, as this may damage the file system and confuse the TCP/IP network.  This part should be done outside the routine script.

The Script

Finally, the script is as follows…

#!/bin/sh -x

start() {
  ifconfig vtnet1 add 10.65.10.13/24
  hastctl role primary nfs_block
  while [ ! -e /dev/hast/nfs_block ]
  do
    sleep 1
  done
  fsck -t ufs /dev/hast/nfs_block
  mount /dev/hast/nfs_block /nfs
  service rpcbind onestart
  service mountd onestart
  service nfsd onestart
  service statd onestart
  service lockd onestart
}

stop() {
  service lockd onestop
  service statd onestop
  service nfsd onestop
  service mountd onestop
  service rpcbind onestop
  umount /nfs
  hastctl role secondary nfs_block
  ifconfig vtnet1 delete 10.65.10.13
}

status() {
  ifconfig vtnet1 | grep 10.65.10.13 && \
  service rpcbind onestatus && \
  showmount -e | grep /nfs && \
  mount | grep /nfs && \
  ls /dev/hast/nfs_block
}

residue() {
  ifconfig vtnet1 | grep 10.65.10.13 || \
  (service rpcbind onestatus && showmount -e | grep /nfs) || \
  mount | grep /nfs || \
  ls /dev/hast/nfs_block
}

clean() {
  residue
  if [ $? -ne 0 ]
  then
    exit 0
  fi
  exit 1
}

if [ "$1" = "start" ]
then
  start
elif [ "$1" = "stop" ]
then
  stop
elif [ "$1" = "status" ]
then
  status
elif [ "$1" = "clean" ]
then
  clean
fi

Testing

To test, find the designated computer and mount the file system.  Assuming the file system has been running on the host “store1”, make a manual failover to see…  The file client does not need to explicitly remount the file system; it can be remounted automatically.

client# mount 10.65.10.13:/nfs /mnt
client# ls /mnt
.snap
client# touch /mnt/helloworld
store1# ./nfs_service.sh stop
store2# ./nfs_service.sh start
client# ls /mnt
.snap helloworld

Highly Available Storage Target on FreeBSD

To achieve a highly available service, it is vital to have the latest data available for restarting the service elsewhere.  In enterprise environments, multipath SAS drives allow each drive to be accessible from multiple hosts.  What about the rest of us, living in the cloud and / or with inexpensive SATA drives?  Highly Available Storage Target (HAST) is the answer.  It is a relatively new feature, introduced in FreeBSD 8.1.  It is useful for keeping two copies of drive content on two loosely coupled computers.

In this article, I demonstrate how HAST can be set up, without bothering with the actual failover logic.  I assume the two virtual machines are prepared from scratch with some unallocated space.  Alternatively, you can prepare a virtual machine with multiple drives (which is not quite feasible in my setting).

Preparing the Partitions

First, examine the drives.  Here we have 18 (no, indeed 17.9 something) gigabytes of space available.

# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  37748696       - free - (18G)

Then, we can add two more partitions.  Since we have two hosts, we create two partitions so that each host can run a service on one partition.  This is the so-called active-active setup.  I dedicate one to NFS and one to the database.  Once you finish working on one virtual machine, do not forget to perform the same on the other machine.

# gpart add -t freebsd-ufs -l nfs_block -a 1M -s 8960M /dev/vtbd0
vtbd0p4 added.
# gpart add -t freebsd-ufs -l db_block -a 1M -s 8960M /dev/vtbd0
vtbd0p5 added.

This is the result.  Two block devices are created and made available inside /dev/gpt with their appropriate labels.

# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  18350080     4 freebsd-ufs (8.8G)
  33030144  18350080     5 freebsd-ufs (8.8G)
  51380224   1048536       - free -  (512M)
# ls /dev/gpt
db_block nfs_block
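
As a quick sanity check, the sector counts shown by gpart can be reproduced from the sizes given on the command line.  The sketch below assumes 512-byte sectors, which is consistent with the output above; the function name is mine.

```sh
#!/bin/sh
# Convert a size in megabytes (as given to "gpart add -s") to 512-byte sectors.
mb_to_sectors() {
  echo $(( $1 * 1024 * 1024 / 512 ))
}

mb_to_sectors 8960   # prints 18350080
```

That matches the 18350080 sectors reported for each new partition, and 8960 megabytes is 8.75 gigabytes, which gpart rounds to 8.8G.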

HAST Daemon Setup

Here is a sample of defining the HAST configuration, hast.conf(5).  In short, there are two hosts and two resource items.  The host “store1” has its remote partner “store2” and vice versa.  Since we use the short host names “store1” and “store2”, do not forget to update the hosts(5) file to make them resolvable.  Remember to repeat these steps on the other machine.  Thankfully, the HAST configuration need not be customised for each member host.

# sysrc hastd_enable="YES"
# cat > /etc/hast.conf << EOF
resource nfs_block {
  on store1 {
    local /dev/gpt/nfs_block
    remote store2
  }
  on store2 {
    local /dev/gpt/nfs_block
    remote store1
  }
}
resource db_block {
  on store1 {
    local /dev/gpt/db_block
    remote store2
  }
  on store2 {
    local /dev/gpt/db_block
    remote store1
  }
}
EOF
# service hastd start
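
The host name entries mentioned above could be added, on both machines and before starting the daemon, along these lines.  This is a sketch only: the helper name is mine, and so is the pairing of “store1” with 10.65.10.11 and “store2” with 10.65.10.12, so replace the addresses as appropriate.

```sh
#!/bin/sh
# Hypothetical helper: append "address name" to a hosts file unless the name
# is already listed on a non-comment line.
add_host_entry() {
  file=$1
  addr=$2
  name=$3
  if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                       END { exit !found }' "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s\t%s\n' "$addr" "$name" >> "$file"
}

# Assumed usage, as root:
#   add_host_entry /etc/hosts 10.65.10.11 store1
#   add_host_entry /etc/hosts 10.65.10.12 store2
```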

Firewall Rules

If you have a firewall, remember to open port number 8457.  For example, in PF, add these three lines on the two hosts.  Remember to replace the IP addresses as appropriate.

geompeers="{10.65.10.11,10.65.10.12}"
geomports="{8457}"
pass in quick inet proto tcp from $geompeers to any port $geomports keep state

HAST Daemon Status

Once the HAST daemon is started, the status of the blocks can be checked.  Since we have defined two resource items, there are two HAST device statuses reported.  For example, on the host “store1”, it says there are two components for each resource item: one block device, and one remote host.  At first, the resource items are in the “initialisation” state:

store1# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store2
db_block  -        init      /dev/gpt/db_block store2

To turn on the device for operation, use the “role” subcommand on “store1”

store1# hastctl role primary db_block
store1# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store2
db_block  degraded primary   /dev/gpt/db_block store2

Similarly, use the “role” command on “store2”, but this time we set it secondary:

store2# hastctl role secondary db_block
store2# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store1
db_block  -        secondary /dev/gpt/db_block store1

When the synchronisation completes, the status is marked complete:

store2# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store1
db_block  complete secondary /dev/gpt/db_block store1
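
If you do not want to keep re-running the command by hand, the table can be checked by a small filter.  This sketch assumes the tabular output shown above; other FreeBSD versions may print the status differently, and the function name is my own.

```sh
#!/bin/sh
# Read a "hastctl status" table on standard input and succeed if the named
# resource reports "complete" in the Status column.
hast_is_complete() {
  awk -v res="$1" '$1 == res && $2 == "complete" { ok = 1 } END { exit !ok }'
}

# Assumed usage while waiting for the first synchronisation:
#   until hastctl status | hast_is_complete db_block; do sleep 5; done
```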

Formatting the Block Devices

We can format the block device like usual, just that we should format the device under the device directory “/dev/hast” instead of “/dev/gpt”.  Since currently the “db_block” is active on the host “store1”, it has to be executed over there.

store1# newfs -J /dev/hast/db_block
/dev/hast/db_block: 8960.0MB (18350064 sectors) block size 32768, fragment size 4096
using 15 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
super-block backups (for fsck_ffs -b #) at:
192, 1282432, 2564672, 3846912, 5129152, 6411392, 7693632, 8975872, 10258112, 11540352, 12822592, 14104832, 15387072, 16669312, 17951552

One thing to note is that the raw device had 18350080 sectors.  When HAST takes it for service, there are only 18350064 sectors left for payload.  Next, we can mount the file system.  We do not use fstab(5) like before, because the mount should not be executed at every boot.

store1# mkdir /db
store1# mount /dev/hast/db_block /db
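
For the curious, the difference between the two sector counts noted above works out as follows.  The 512-byte sector size is an assumption of mine, consistent with the gpart output earlier.

```sh
#!/bin/sh
# Sectors reported by gpart for the partition and by newfs for the HAST device.
raw_sectors=18350080
hast_sectors=18350064
sector_bytes=512   # assumed sector size

overhead_bytes=$(( (raw_sectors - hast_sectors) * sector_bytes ))
echo "$overhead_bytes"   # prints 8192
```

In other words, HAST keeps 16 sectors, or 8 kilobytes, of the partition away from the payload.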

Switching Over

In order to switch over, the procedure is as follows.

store1# umount /db
store1# hastctl role secondary db_block
store2# hastctl role primary db_block
store2# mkdir /db
store2# mount /dev/hast/db_block /db

What if the backend device in “/dev/gpt” is mounted by mistake?  It will say the following.  The chance of going wrong is really small.

store2# mount /dev/gpt/db_block /db
/dev/gpt/db_block: Operation not permitted

Automatic switching will be discussed in a separate article.

Troubleshooting

Nothing can go really wrong without a serious application on top.  Nevertheless, the following message troubled me for a few hours.

“We act as primary for the resource and not as secondary as secondary”: check the HAST configuration.  Likely a host is configured to have itself as the remote partner.

To be Continued

In the upcoming articles, I will cover how to make use of this highly-available block storage for shared file system, database, and web servers.

Installing FreeBSD from Scratch and Reinstalling the Boot Loader

There are cases where the default image does not suit one's needs.  In this exercise, I practice installing FreeBSD version 11 from scratch.  I go beyond the standard procedure by partitioning the drive manually with commands.  This leaves free space in which I can later create partitions purely for payload.  (If you just want to go automatic, you can refer to the FreeBSD handbook.)

Some errors took place along the way, so I also got to fix the boot loader manually.  If you have tried fixing the boot loader of some other “freedom” operating system, you will appreciate how easy it is!

Inserting the Disc and Booting

Instead of selecting the default boot image, we pick an installation disc.  On Vultr, there are two ways.  The first way is to let the system download the installation disc.  For example, you find a link for the FreeBSD installation disc, copy the URL, and pass it to the interface.  The second way is to reuse the existing library of installation discs.

It takes quite some time for the system to boot.  Depending on your luck, you may or may not see the beastie welcome screen.  This is the so-called boot loader, or simply the loader, of just a few tens of kilobytes.

Screen Shot 2017-04-13 at 9.33.34 pm

Inside the Installer

The system boots and the installer (precisely, “bsdinstall”) executes automatically.  From now on, there are a few keystrokes you need to know.  The action buttons, shown in brackets, are selected with the left and right arrow keys.  To activate the highlighted action button, press the enter key.  The items above the action buttons are selected with the up and down arrow keys.  To toggle an item on or off, press the spacebar.  At any one time, one action button and one selectable item are highlighted.  When there are multiple fields, press tab, not enter, to jump between them.

Question 1 – mode selection: In the screen below, you can press enter to run the installer.  You can alternatively press right arrow to select the shell, then enter to run the shell.  Here we select “install” directly.

Screen Shot 2017-04-13 at 9.34.07 pm

Question 2 – keymap: If you want to select an alternative keymap, use up and down arrow keys, and press spacebar to select.  Then, press enter to confirm.

Screen Shot 2017-04-13 at 9.34.17 pm

Question 3 – hostname: You are going to enter a hostname.  If you are creating a machine to be cloned, you can pick a generic name.

Question 4 – distributions: You are asked which distribution components to select.  Usually I pick “lib32” only.  By default, the installer proposes installing “ports”; I deselect it (with spacebar) most of the time.  The updated ports can be downloaded with the “portsnap” command later.

Partitioning and Formatting the Drive

Question 5 – partition method: You are given several ways to partition.  The “auto” ones are the easiest, but they may generate a layout you do not like.  The “manual” one shows a dialog where you can create the partitions yourself, but cannot control the partition alignment.  So let us select “shell”.

Screen Shot 2017-04-13 at 9.35.40 pm.png

Question 6 – partition: You are given a shell and instructed to type in commands, edit a file, and mount the target file system.  Use the following commands to partition the only virtual hard drive, “vtbd0”, and then install the boot code.

Screen Shot 2017-04-13 at 9.35.50 pm

# gpart show
# gpart create -s gpt /dev/vtbd0
vtbd0 created
# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40  52428720       - free - (25G)

# gpart add -t freebsd-boot -a 512K -s 512K /dev/vtbd0
vtbd0p1 added
# gpart add -t freebsd-swap -a 1M -s 2047M /dev/vtbd0
vtbd0p2 added
# gpart add -t freebsd-ufs -a 1M -s 5120M /dev/vtbd0
vtbd0p3 added
# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  37748696       - free - (18G)
# gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 /dev/vtbd0
bootcode written to /dev/vtbd0
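
The start sectors in the listing above can be checked against the alignment requested with “-a”.  This is a small sketch assuming 512-byte sectors (the listing does not state the sector size, but the sizes gpart prints agree with it); the helper “is_aligned” is made up for this illustration.

```shell
#!/bin/sh
# Check the start sectors from "gpart show" against the requested
# alignment.  Assumption: 512-byte sectors.
sector_bytes=512

# is_aligned START_SECTOR ALIGNMENT_BYTES
is_aligned() {
    [ $(( ($1 * sector_bytes) % $2 )) -eq 0 ]
}

# Columns: start sector, alignment in bytes, partition type.
while read -r start align type; do
    if is_aligned "$start" "$align"; then
        echo "$type: sector $start is aligned to $align bytes"
    else
        echo "$type: sector $start is NOT aligned to $align bytes"
    fi
done <<EOF
1024 524288 freebsd-boot
2048 1048576 freebsd-swap
4194304 1048576 freebsd-ufs
EOF

# Swap starts at 1M and is 2047M long, so it ends on the 2G boundary.
echo "swap ends at $(( (2048 + 4192256) * sector_bytes / 1048576 )) MB"
```

This also shows why the swap partition is 2047M rather than a round 2048M: together with the first megabyte it fills exactly 2G, so the file system partition starts on a round boundary.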

In the previous step, we partitioned the drive into three: a boot partition, a swap partition, and a UNIX file system partition.  We also installed the GPT boot code into the boot partition.  Next, format the last partition, mount it, and define the file system table as the installer instructed, and we are done.  The installer then starts the installation without asking further questions.

# newfs -U /dev/vtbd0p3
(message truncated)

# mount /dev/vtbd0p3 /mnt
# cat >> /tmp/bsdinstall_etc/fstab << EOF
/dev/vtbd0p2 none swap sw 0 0
/dev/vtbd0p3 /    ufs  rw 1 1
EOF

# exit

Screen Shot 2017-04-13 at 9.59.14 pm

Final Touches to the Installation

Question 7 – root password: Pick and enter a password carefully, twice.

Question 8 – network configuration: You are asked which network device you would like to configure.  Select the only virtual network device, “vtnet0”.  Enable IPv4 and DHCP.  Disable IPv6 (unless you know why not).

Question 9 – name resolver configuration: Simply press “ok” for the DNS configuration.  The DNS server setting will be overridden soon.

Question 10 – time zone selection: Select the continent you are in, and then the city.  You are then asked whether the abbreviation is appropriate, and to confirm the system date and time.

Question 11 – services: I would select “local_unbound”, “sshd”, and “ntpd”.

Screen Shot 2017-04-13 at 10.01.51 pm

Question 12 – security: Since version 11, the FreeBSD installer asks if the user wants any additional security measures.  I think most of them can be enabled, except the one that restricts debugging.  (This is because I do debug programs.)

Screen Shot 2017-04-13 at 10.03.21 pm

Question 13 – additional users: This is up to you.  I prefer customisation before user creation.

Question 14 – final configuration: Just skip…

Question 15 – final modification: Just skip…

Question 16 – what next: Instead of rebooting, I prefer going to the live CD mode, logging in, and running “poweroff”.

Remaining Activities

Take a snapshot before booting the system again.  On the first system boot, the SSH daemon generates its host keys.  If you want multiple hosts to have distinct identities, taking the snapshot before the first boot is the laziest and the most correct way.

Last but not least, remove the virtual optical drive image.  Then you are good to boot from the virtual hard drive.

Troubleshooting and Fixing the Boot Loader

Missing boot loader: When generating the screenshots, I forgot to install the boot code.  The boot screen looked like this and was stuck.  This is a sign of a missing boot loader.  I booted from the installation disc again, chose the shell mode, and reran the “gpart bootcode” command.

Screen Shot 2017-04-13 at 10.05.10 pm

# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  37748696       - free - (18G)
# gpart bootcode -b /boot/pmbr -p /boot/gptboot -i 1 /dev/vtbd0
bootcode written to /dev/vtbd0

Damaged file system table: On the next boot attempt, I dropped into single-user mode because of a bad file system table.  This was because I wrote “rw” instead of “sw” for the swap.  I corrected “/etc/fstab” with an editor, and then typed “exit” to continue the boot.

Screen Shot 2017-04-13 at 10.11.27 pm.png
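
A mistake like this can be caught before rebooting.  The following sketch reads a file system table on standard input and flags swap entries whose options do not include “sw”.  The function name “check_swap_options” is hypothetical, and the check covers only this one error, not the whole fstab(5) format.

```shell
#!/bin/sh
# Flag swap entries whose options field does not include "sw".
# Input columns: device mountpoint fstype options dump pass
check_swap_options() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        $3 == "swap" {
            n = split($4, opts, ",")
            ok = 0
            for (i = 1; i <= n; i++) if (opts[i] == "sw") ok = 1
            if (!ok) { print "bad swap options: " $0; bad = 1 }
        }
        END { exit bad + 0 }       # non-zero exit if anything was flagged
    '
}

check_swap_options <<EOF && echo "swap entries look fine"
/dev/vtbd0p2 none swap sw 0 0
/dev/vtbd0p3 /    ufs  rw 1 1
EOF
```

Inside the installer shell it could be fed the file written earlier, for example “check_swap_options < /tmp/bsdinstall_etc/fstab”.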

Security Settings

For your reference, the security options I chose during installation turn out to produce the following settings, so they can be incorporated into other installation tools without actually running “bsdinstall”.

/etc/rc.conf

clear_tmp_enable="YES"
syslogd_flags="-ss"
local_unbound_enable="YES"

/etc/sysctl.conf

security.bsd.see_other_uids=0
security.bsd.see_other_gids=0
security.bsd.unprivileged_read_msgbuf=0
security.bsd.stack_guard_page=1

/etc/resolv.conf

nameserver 127.0.0.1
options edns0
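
As a sketch of how another installation tool might apply these settings, the script below appends them to the files under a target root.  “DESTDIR” and “add_line” are names I made up for the illustration.  “DESTDIR” defaults to a scratch directory so that the script never touches a live system, and the resolver file is left out on purpose because it only makes sense once the local resolver is running.

```shell
#!/bin/sh
# Append the settings listed above to the configuration files under a
# target root.  DESTDIR is a hypothetical variable for this sketch.
DESTDIR=${DESTDIR:-$(mktemp -d)}
mkdir -p "$DESTDIR/etc"

# add_line FILE LINE: append LINE unless that exact line already exists.
add_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

add_line "$DESTDIR/etc/rc.conf" 'clear_tmp_enable="YES"'
add_line "$DESTDIR/etc/rc.conf" 'syslogd_flags="-ss"'
add_line "$DESTDIR/etc/rc.conf" 'local_unbound_enable="YES"'

add_line "$DESTDIR/etc/sysctl.conf" 'security.bsd.see_other_uids=0'
add_line "$DESTDIR/etc/sysctl.conf" 'security.bsd.see_other_gids=0'
add_line "$DESTDIR/etc/sysctl.conf" 'security.bsd.unprivileged_read_msgbuf=0'
add_line "$DESTDIR/etc/sysctl.conf" 'security.bsd.stack_guard_page=1'

echo "settings written under $DESTDIR/etc"
```

Because “add_line” skips lines that are already present, running the script twice leaves the files unchanged.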

To be Continued

In the upcoming articles, I will use the snapshots created here to build a highly available block device, and then highly available file systems and database systems.