Highly Available Storage Target on FreeBSD

To achieve a highly available service, it is vital to have the latest data available for restarting the service elsewhere.  In enterprise environments, multipath SAS drives allow each drive to be accessed from multiple hosts.  What about the rest of us, living in the cloud or on inexpensive SATA drives?  Highly Available Storage Target (HAST) is the answer.  It is a relatively new feature, introduced in FreeBSD 8.1, that keeps two copies of the drive contents on two loosely coupled computers.

In this article, I demonstrate how HAST can be set up, without going into the actual failover logic.  I assume the two virtual machines are prepared from scratch with some unallocated space.  Alternatively, you can prepare a virtual machine with multiple drives (which is not quite feasible in my setting).

Preparing the Partitions

First, examine the drives.  Here we have 18 (well, really 17.9-something) gigabytes of free space available.

# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  37748696       - free - (18G)

Then, we can add two more partitions.  Since we have two hosts, we create two partitions so that each host can run its service on one of them.  This is the so-called active-active setup.  I dedicate one partition to NFS and one to a database.  Once you finish working on one virtual machine, do not forget to perform the same steps on the other.

# gpart add -t freebsd-ufs -l nfs_block -a 1M -s 8960M /dev/vtbd0
vtbd0p4 added.
# gpart add -t freebsd-ufs -l db_block -a 1M -s 8960M /dev/vtbd0
vtbd0p5 added.

This is the result.  Two block devices are created and made available inside /dev/gpt under their respective labels.

# gpart show
=>      40  52428720 vtbd0 GPT (25G)
        40       984       - free - (492K)
      1024      1024     1 freebsd-boot (512K)
      2048   4192256     2 freebsd-swap (2.0G)
   4194304  10485760     3 freebsd-ufs (5.0G)
  14680064  18350080     4 freebsd-ufs (8.8G)
  33030144  18350080     5 freebsd-ufs (8.8G)
  51380224   1048536       - free -  (512M)
# ls /dev/gpt
db_block nfs_block

HAST Daemon Setup

Here is a sample of the HAST configuration, hast.conf(5).  In short, there are two hosts and two resource entries.  The host “store1” has “store2” as its remote partner and vice versa.  Thankfully, the HAST configuration does not need to be customised for each member host, so remember to install the identical file on the other machine as well.  Since we use the short host names “store1” and “store2”, do not forget to update the hosts(5) file to make them resolvable.
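
For example, a minimal hosts(5) sketch; the addresses here are the ones used in the firewall section below, so replace them with your own:

# cat >> /etc/hosts << EOF
10.65.10.11 store1
10.65.10.12 store2
EOF

With name resolution in place, enable, configure, and start the daemon: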

# sysrc hastd_enable="YES"
# cat > /etc/hast.conf << EOF
resource nfs_block {
  on store1 {
    local /dev/gpt/nfs_block
    remote store2
  }
  on store2 {
    local /dev/gpt/nfs_block
    remote store1
  }
}
resource db_block {
  on store1 {
    local /dev/gpt/db_block
    remote store2
  }
  on store2 {
    local /dev/gpt/db_block
    remote store1
  }
}
EOF
# service hastd start
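
To confirm the daemon came up, a quick sanity check through the standard rc interface:

# service hastd status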

Firewall Rules

If you are running a firewall, remember to open TCP port 8457, on which hastd listens.  For example, in PF, add these three lines on both hosts.  Remember to replace the IP addresses as appropriate.

geompeers="{10.65.10.11,10.65.10.12}"
geomports="{8457}"
pass in quick inet proto tcp from $geompeers to any port $geomports keep state
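
After editing, reload the rule set.  This assumes the rules live in the default /etc/pf.conf:

# pfctl -f /etc/pf.conf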

HAST Daemon Status

Once the HAST daemon is started, the status of the blocks can be checked.  Since we have defined two resource entries, two HAST device statuses are reported.  For example, on the host “store1”, each resource has two components: one local block device and one remote host.  At first, the resources are in the “init” state:

store1# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store2
db_block  -        init      /dev/gpt/db_block store2

To turn the device on for operation, use the “role” subcommand on “store1”:

store1# hastctl role primary db_block
store1# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store2
db_block  degraded primary   /dev/gpt/db_block store2

Similarly, use the “role” subcommand on “store2”, but this time setting the role to secondary:

store2# hastctl role secondary db_block
store2# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store1
db_block  -        secondary /dev/gpt/db_block store1

When the synchronisation completes, the status is marked complete:

store2# hastctl status
Name      Status   Role      Components
nfs_block -        init      /dev/gpt/nfs_block store1
db_block  complete secondary /dev/gpt/db_block store1
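
While the initial synchronisation is running, more detail is available through the “list” subcommand, which reports per-resource information including the amount of dirty (not yet synchronised) data:

store1# hastctl list db_block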

Formatting the Block Devices

We can format the block device as usual, except that the device under “/dev/hast” must be formatted instead of the one under “/dev/gpt”.  Since “db_block” is currently active on the host “store1”, the command has to be executed there.

store1# newfs -J /dev/hast/db_block
/dev/hast/db_block: 8960.0MB (18350064 sectors) block size 32768, fragment size 4096
using 15 cylinder groups of 626.09MB, 20035 blks, 80256 inodes.
super-block backups (for fsck_ffs -b #) at:
192, 1282432, 2564672, 3846912, 5129152, 6411392, 7693632, 8975872, 10258112, 11540352, 12822592, 14104832, 15387072, 16669312, 17951552

One thing to note: the raw partition was 18350080 sectors, but once HAST takes it into service, only 18350064 sectors are left for the payload; the small difference holds HAST's own metadata.  Next, we can mount the file system.  We do not put it in fstab(5) as usual, because the mount must not happen automatically at every boot; only the current primary may mount the HAST device.

store1# mkdir /db
store1# mount /dev/hast/db_block /db
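
The mounted file system can now be used like any local UFS volume; for example, a quick write test:

store1# touch /db/hello
store1# ls -l /db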

Switching Over

In order to switch over, the procedure is as follows.

store1# umount /db
store1# hastctl role secondary db_block
store2# hastctl role primary db_block
store2# mkdir /db
store2# mount /dev/hast/db_block /db
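
The same sequence can be wrapped in a small helper; this is a minimal sketch of my own, and the script name and arguments are hypothetical, not part of HAST:

#!/bin/sh
# takeover.sh -- hypothetical helper: promote this host for a HAST
# resource and mount it.  Usage: takeover.sh resource mountpoint
set -e
res="$1"
mnt="$2"
hastctl role primary "$res"    # become primary for the resource
mkdir -p "$mnt"                # make sure the mount point exists
mount "/dev/hast/$res" "$mnt"  # mount the HAST device, never the raw one

For example, run “./takeover.sh db_block /db” on the new primary, after the old primary has unmounted the file system and demoted itself.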

What if, by mistake, the backing device in “/dev/gpt” is used for mounting?  It will say the following, so the chance of getting this wrong is actually quite low.

store2# mount /dev/gpt/db_block /db
/dev/gpt/db_block: Operation not permitted

Automatic switchover will be discussed in a separate article.

Troubleshooting

Not much can really go wrong without a serious application on top.  Nevertheless, the following message troubled me for a few hours.

“We act as primary for the resource and not as secondary”: check the HAST configuration.  Most likely a host is configured to have itself as its own remote partner.
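
After correcting hast.conf on the offending host, restart the daemon and set the role again; a generic recovery sequence, assuming “store2” was the misconfigured host:

store2# service hastd restart
store2# hastctl role secondary db_block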

To be Continued

In upcoming articles, I will cover how to make use of this highly available block storage for a shared file system and a database.
