bookmark_borderUsing ZFS replication features in FreeBSD to improve my offsite backups

Recently I decided to improve the reliability of my file system backups by using the data replication capabilities inherent in the FreeBSD Zettabyte File System (ZFS). ZFS provides a built-in serialization feature that can send a stream representation of a ZFS file system (Which ZFS refers to as a “dataset”) to standard output. Using this technique, it is possible to not only store the dataset(s) on another ZFS storage pool (zpool) connected to the local system, but also to send it over a network to another FreeBSD system. ZFS dataset snapshots serve as the basis for this replication, and the essential ZFS commands used for replicating the data are zfs send and zfs receive.

This post describes how I used this ZFS feature to perform replication of ZFS dataset snapshots from my home FreeBSD server to another FreeBSD machine located offsite. I’ll also discuss how I manage the quantity of snapshots stored locally and offsite, as well as a couple of options for recovering my files should it become necessary.

For purposes of example, I’ll refer to the FreeBSD system hosting the snapshots I want to send as “server”, and the offsite FreeBSD system that I will send snapshots to as “backup”. Unless otherwise noted, all steps were performed as the user root. However a non-root user, “iceflatline”, was created on both machines and is used for many of the commands. The versions for the software used in this post were as follows:

  • FreeBSD 11.0-RELEASE
  • Configure server

    On server I had created a simple mirror vdev for my zpool consisting of (2) two terabyte disks. The mirror and the zpool were created using the following commands:

    As you can see, I created one large ZFS partition (-t freebsd-zfs) on each disk. Specifying the -a option, the gpart utility tries to align the start offset and partition size on the disk to be a multiple of the alignment value. I chose 1 MiB. The advantage to this is that it is a multiple of 4096 (helpful for larger, 4 kiB sector drives), leaving the leftover fraction of a megabyte at the end of the drive. In the future, if I have to replace a failed drive containing a slightly different number of sectors, I’ll have some wiggle room in case the replacement drive is slightly larger in size. After partitioning each drive I created the zpool using these partitions. I elected to use name “pool_0” for this zpool.

    To improve overall performance and usability of any datasets that I create in this zpool, I performed the following configuration changes:

    The zfs command property atime controls whether the access time for files is updated when the files are read. Setting this property to off avoids producing write traffic when reading files, which can result in a gain in file system performance. The lz4 property controls the compression algorithm used for the datasets. lz4 is a high-performance replacement for the older the Lempel Ziv Jeff Bonwick (lzjb) algorithm. It features faster compression and decompression, as well as a generally higher compression ratio than lzjb. The snapdir property controls whether the directory containing my snapshots (pool_0/dataset_0/.zfs) is hidden or visible. I prefer the directory to be visible so I have another way to verify the existence of snapshots. These configuration changes were made at the zpool level so that any datasets I create in this zpool will inherit these settings; however, I could configure each dataset differently if desired.

    The dataset on server that I back up offsite is called “dataset_0”, and was created using the following command:

    To ensure I have still have some headroom if/when the zpool starts to get full, I set the size quota for this dataset to 80% of zpool size (1819 GiB), or 1455 GiB:

    Since ZFS can send a stream representation of a dataset to standard output, it can be piped through secure shell (“SSH”) to securely send it over a network connection. By default, root user privileges are required to send and receive these streams. This requires logging into the receiving system as user root. However, logging in as the user root via a SSH is disabled by default in FreeBSD systems for security reasons. Fortunately, the necessary ZFS commands can be delegated to a non-root user on each system. The minimum delegated ZFS permissions I needed for user iceflatline to successfully send snapshots from server were as follows:

    In this case I delegated the permissions at the zpool level, so any datasets I create in pool_0 will inherit them. Alternatively I could have delegated permissions at the dataset level or a combination of both if desired. There’s a lot of flexibility.

    I’m able to verify which permissions were delegated anytime using the following command as either user root or iceflatline:

    Finally, to avoid having to enter a password each time a backup is performed, I generated a SSH key pair as user iceflatline on server and copied the public key to /usr/home/iceflatline/.ssh/authorized_keys on backup.

    Configure backup

    I configured backup similar to server: a simple mirror vdev, and a zpool named pool_0 with the same configuration as the one in server. I did not create a dataset on this zpool because I will be replicating pool_0/dataset_0 on server directly to pool_0 on backup.

    The minimum delegated ZFS permissions I needed for user iceflatline on backup to successfully receive these snapshots were as follows:

    Using zfs send and receive

    After configuring both machines it was time to test. First, I created a full snapshot of pool_0/dataset_0 on server using the following command as as user iceflatline:

    While not strictly needed in this case, the -r option will recursively create snapshots of any child datasets that I may have created under pool_0/dataset_0.

    Now I can send this newly created snapshot to backup, which was assigned the IP address 192.168.20.6. The following command is performed as user iceflatline:

    The zfs send command creates a data stream representation of the snapshot and writes it to standard output. The standard output is then piped through SSH to securely send the snapshot to backup. The -v option will print information about the size of the stream and the time required to perform the receive operation. The -u option prevents the file system associated with the received data stream (pool_0/dataset_0 in this case) from being mounted. This was desirable as I’m using backup to simply store the dataset_0 snaphots offsite. I don’t need to mount them on that machine. The -d option is used so that all but the pool name (pool_0) of the sent snapshot is appended to pool_0 on backup. Finally, the -F option is useful for destroying snapshots on backup that do not exist on server.

    zfs send can also determine the difference between two snapshots and send only the differences between the two. This saves on disk space as well as network transfer time. For example, if I perform the following command as user iceflatline:

    A second snapshot pool_0/data_0@snap-test-1 is created. This second snapshot contains only the file system changes that occurred in pool_0/dataset_0 between the time I created this snapshot and the previous snapshot, pool_0/dataset_0@snap-test-0. Now, as user iceflatline, I can use zfs send with the -i option and indicate the pair of snapshots to generate an incremental stream containing only the data that has changed:

    Note that sending an incremental stream will only succeed if an initial full snapshot already exists on the receiving side. I’ve also included the -R option with the zfs send command this time. This option will preserve the ZFS properties of any descendant datasets, snaphots, and clones in the stream. If the -F option is specified when this stream is received, any snapshots that exist on the receiving side that do not exist on the sending side are destroyed.

    By the way, I can list all snapshots created of pool_0/dataset_0 using the following command as either user root or iceflatline:

    After testing to make sure that snapshots could be successfully sent to backup, I created an ugly little script that creates a daily snapshot of pool_0/dataset_0 on server; looks for yesterday’s snapshot and, if found, sends an incremental stream containing only the file system data that has changed to backup; looks for any snapshots older than 30 days and deletes them on both server and backup; and finally, logs its output to the file /home/iceflatline/cronlog:

    To use the script, I saved it to /home/iceflatline/bin with the name zfsrep.sh and, as user iceflatline, made it executable:

    Then added the following cron job to the crontab under the user iceflatline account. The script runs every day at 2300 local time:

    The script works is working pretty well for me, but I soon discovered that if it missed a daily snapshot or could not successfully send a daily snapshot to backup, say because either server or backup were offline or the connection between the two was down, then an error would occur the following day when the script attempts to send a new incremental snapshot. This is because backup was missing previous day’s snapshot and so the script could not send an incremental stream. To recover from this error I needed to manually send those missing snapshots. Say, for example, I had the following snapshots on server:

    pool_0/dataset_0@snap-20150620
    pool_0/dataset_0@snap-20150621
    pool_0/dataset_0@snap-20150622

    Now say that the script was not able to create pool_0/dataset_0@snap-20150623 on server because it was offline for some reason. Consequently, it was not able to successfully replicated this snapshot to backup. The next day, when server is back online, the script will successfully create another daily snapshot pool_0/dataset_0@snap-20150624 but will not be able to successfully send it to backup because pool_0/dataset_0@snap-20150623 is missing. To recover from this problem I’ll need to manually perform an incremental zfs send using pool_0/dataset_0@snap-20150622 and pool_0/dataset_0@snap-20150624:

    Now both server and backup have the same snapshots and the script will function normally again.

    File recovery

    Having now a way to reliably replicate the file system offsite on daily basis, what happens if I need to recover some files? Fortunately, there are a couple of options available to me. First, because I chose to make snapshots visible on server, I can easily navigate to /pool_0/dataset_0/.zfs/snapshot and copy any files up to 30 days in the past (given the current retention value in the script). I could also mount pool_0/dataset_0 on backup and copy these same files from there using a utility like scp if desired.

    I could also send snapshot(s) from backup to back to server. To do this I would create a new dataset on pool_0 on server. In this example, the new dataset is named receive:

    Why is creating a new dataset necessary? Because there exists already the dataset pool_0/dataset_0 on server. If I tried to send pool_0/dataset_0@some-snapshot from backup back to server there would be a conflict. I could have avoided this step if I had created a dataset on pool_0 on backup and replicated snapshots of pool_0/dataset_0 to that dataset instead of directly to pool_0.

    Okay, now, as user iceflatline I can send the snapshot(s) I want from backup to server:

    After the stream is fully received I switch to user root and mount the dataset:

    This will result in pool_0/dataset_0@snap-20150620 sent from backup to be mounted read only to pool_0/receive/dataset_0 on server. Now I can navigate to /pool_0/receive/dataset_0 and copy the files I need to recover, or I can clone or clone and promote pool_0/receive/dataset_0@snap-20150629, whatever.

    Conclusion

    Well, that’s it. A long and rambling post on how I’m using the replication features in FreeBSD’s ZFS to improve the reliability and resiliency of my file system backups. So far, it’s working rather well for me, and it’s been a great learning experience. Is it the best or only way? Likely not. Are there better (or at least more professional) utilities or scripts to use? Most assuredly. But for now I’ve met my most important requirement: reliably backing up my data offsite.

    References

    ZFS(8)
    https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/zfs.html

    bookmark_borderHow To Setup And Configure FreeBSD As A Syslog Server

    I’ve grown tired of connecting to each host individually in my network to examine their log files. In addition to logging events locally, I would like these hosts to send their logs to a designated host in my network, resulting in a single location where I can examine and analyze all logs.

    This post describes how to setup and configure a machine running FreeBSD to be a system log or “syslog” server, receiving incoming log events from other hosts in the network. A second machine, also running FreeBSD, will be configured to send its log events to the syslog server.

    For purposes of example, we’ll use the hostname “server” for the machine hosting our our syslog server, and “client” for the other machine – the one sending its log events to the syslog server. All steps involved assume that FreeBSD is installed and operating correctly on both machines. All commands are issued as the root user.

    The versions for the software used in this post were as follows:

    • FreeBSD 11.0-RELEASE

    Let’s get started…

    Configure the syslog server

    First, we need a file in server’s /var/log directory to host the log events coming from client. For our example, we’ll make this file name the same as client’s hostname. While you don’t need to use the .log extension, I find it helpful as it clearly indicates the purpose of the file:

    Next we need to add a couple of options to syslogd, the FreeBSD utility that reads and logs messages to the system console and log files. Use sysrc to add the following line to /etc/rc.conf, substituting the IP network and network mask for your own:

    The -4 (IPv4) option forces syslogd to listen for IPv4 addresses only.

    The -a (allowed_peer) option specifies which clients are allowed to log to this syslog server. This option can take the form of IP address/mask:service, such as “-a 192.168.10.1/24:*” (the `*’ character permits packets sent from any UDP port), or hostname.domain, such as “-a client.home”, or “-a *.home” (assuming that the hostname can be successfully resolved to the correct IP address in the network). Multiple -a options may be also be specified. In this example, allowed_peer will the form of any host within an entire IP network, in this case 192.168.1.0/24.

    Finally, the -v opton indicates verbose logging. If -v is specified once, the event’s numeric facility and priority will be added to the log. If specified more than once, the names of the event’s facility and priority (e.g., “user.notice”) are also added to the log.

    Now we need to add some lines to server’s /etc/syslog.conf file, the configuration file for syslogd. First, the name of server’s hostname, preceeded by a + character, must be added to top of the file – before any existing syslog options (i.e., right before *.err; …, etc.) – so that those existing options will be applied only to log events generated by locally by server. If we did not add this line then all those options would also be applied to the log events that arrive from client. In other words, any options after a +(some_hostname) in the this file will apply until the next +(some_hostname) is parsed:

    Then add following lines to the bottom of /etc/syslog.conf after the last !* , substituting the .home domain for your own:

    The first line specifies that remote log events will be arriving from client. client can be specified using either its hostname or its IP address. Note that when using a hostname the domain name must be appended to it. In either case, the hostname.domain or host ip address is preceded by a + character.

    The second line contains parameters to control the handling of incoming log events from client, specifically a selector field followed by an action field. The syntax of the selector field is facility.level. The facility portion describes which subsystem generated the message, such as the kernel or a daemon, while the level portion describes the severity of the event that occurred. Multiple selector fields can be used for the same action and should be separated using a semicolon (;). In our example we’ll use the * characters in the selector field to match any log events received by client.

    The action field denotes where to send the log message. In our case, log events will be sent to the log file we created previously. Note that spaces are valid field separators in FreeBSD’s /etc/syslog.conf file. However, other nix-like systems still insist on using tabs as field separators. If you are sharing this file between systems, you may want to use only tabs as field separators.

    Managing the log files

    The file /var/log/client.log will grow over time, making it difficult to locate useful event information as well as taking up disk space. FreeBSD mitigates this problem using using newsyslog, a built-in utility that, among other things, periodically rotates and compresses log files. newsyslog is scheduled to run periodically by the system crontab (/etc/crontab). In its default configuration, it runs every hour.

    newsyslog reads from its configuration file, /etc/newsyslog.conf in order to determine which actions to take. This file contains one line for each log file that newsyslog manages. Each line is comprised of various fields which control the log file’s owner and group, permissions, and when the log file should be rotated. In addition there are several optional fields for controlling log file compression and programs that should be signaled when the log file is rotated. Each field is separated with whitespace.

    In order to have newsyslog recognize client’s log file, we’ll place the following line at the botton of /etc/newsyslog.conf:

    In this example, the file permission for /var/log/client.log is set to 640. newsyslog will retain up to five archive files, and rotate the file when its size reaches 100 kB. The * character in the when column instructs newsyslog to ignore a time interval, a specific time, or both and instead only consider the size of the file when determining whether or not to rotate the file. The J flag tells newsyslog compress the rotated log file using bzip2, and the C flag tells newsyslog to create the log file if it does not already exist.

    Finally, let’s restart syslogd and newsyslog on server:

    Configure the client

    Let’s move on now and configure client so that it will send its event logs to server. Open client’s /etc/syslog.conf file and add the following line after the last !*, to instruct client to send log events of any facility and level to server:

    server can be specified using either its hostname, hostname.domain or its IP address, preceded by a @ character.

    Now let’s restart syslogd on client:

    Finally, let’s make sure client is sending its log events to server using the logger utility. Logon to client and issue the follow command:

    Now login to server and and check client’s log file:

    You should see the message you sent using the logger utility:

    Conclusion

    That’s it. In addition to logging events locally, the client host will send its logs to our syslog server, resulting in a single location where log events can be examined and analyzed.

    References

    NEWSYSLOG(8)
    NEWSYSLOG.CONF(5)
    SYSLOGD(8)
    SYSLOG.CONF(5)

    bookmark_borderHow To Create, Configure And Connect To A FreeBSD Instance In Amazon EC2

    (20180108 – The steps in this post were amended to address changes in the Amazon AWS service — iceflatline)

    FreeBSD is an free and open source advanced computer operating system used to power modern servers, desktops and embedded platforms.

    Amazon Elastic Compute Cloud (“EC2”) provides resizable computing capacity in the Amazon Web Services (“AWS”) cloud. Amazon EC2 can be used to launch as many or as few virtual servers as you need, configure security and networking, and manage storage. An Amazon Machine Image (AMI) is a template that contains a software configuration (for example, an operating system, an application server, and applications). From an AMI, you launch an instance; virtual servers that can run applications. Instances feature varying combinations of CPU, memory, storage, and networking capacity, and give you the flexibility to choose the appropriate mix of resources for your applications.

    This post describes how to create and configure a FreeBSD instance in Amazon EC2. Then goes on to explain how to connect to the new instance using SSH from a machine running a BSD, Linux or Windows operating system.

    The steps discussed in this post assume you have an active AWS account. If you do not, you can sign up for one at Amazon Web Services.

    Let’s get started…

    Create and Configure the FreeBSD Instance

    Fire up your web browser and navigate to Amazon Web Services. Login to the AWS Management Console by selecting “AWS Managment Console” from among the options in the drop down list under “My Account” (See Figure 1).

    Screenshot showing how to find the Amazon AWS Management Console

    Figure 1

    Once you’ve successfully logged in, select “EC2″ from among the options listed under the “Services” section (See Figure 2).

    Screenshot showing the EC2 option in the Amazon AWS Management Console

    Figure 2

    Next you’ll choose the Amazon EC2 “region” under which the FreeBSD instance will be created. In this example we’ll select the US West (Oregon) region (See Figure 3).

    Screenshot showing the selection of an Amazon region where the FreeBSD instance will be created

    Figure 3

    Now select “Instances” from among the options under the “Instances” category on the left side of the page. If this is the first time you’ve created an instance in this Amazon EC2 region you’ll be greeted with a message indicating “you do not have any running instances in this region” and a button to launch one (See Figure 4).

    Screenshot showing the Amazon AWS EC2 launch instance screen

    Figure 4

    Select “Launch Instance” and you’ll be greeted with Amazon’s quick start guide for creating a new AMI. Select “AWS Marketplace” from among the choices on the left side of the web page where you will be offered the ability to search for and select an AMI. Simply search for “freebsd” and you will presented with several FreeBSD image options (See Figure 5).

    Screenshot showing search results for a FreeBSD AMI

    Figure 5

    In this example we’ll select the “FreeBSD 11” AMI, where we’ll be presented with some product details, including instance pricing. Select “Continue” where you’ll be asked to choose an instance type. Amazon EC2 provides several instance types optimized to fit different use cases. In this example we’ll use the recommended m4.large instance. (See Figure 6).

    Screenshot showing the selection of a Amazon m4.large instance

    Figure 6

    Select “Next: Configure Instance Details” where you will be presented with a list of default options that can be modified, if desired, to better suite your needs. Hovering your mouse over the “i” icon near an option will describe its purpose in greater detail. One option that may prove helpful is the termination protection. Enabling this option will prevent the instance from being accidentally “terminated” (i.e., deleted). If enabled, you will not be able to delete the instance through the AWS Management Console until this option is once again disabled. For our example, however, we’ll simply retain the default options (See Figure 7).

    Screenshot showing the configuration of the default Amazon EC2 instance options

    Figure 7

    Now select “Next: Add Storage” where you can adjust the size of the default or “root” Elastic Block Store (“EBS”) volume. You can also attach additional EBS volumes to your instance, or edit the settings of the root volume. You can also choose to delete the volume should you decide to terminate the instance. For our example, we’ll retain the 10GB root EBS volume and all default settings (See Figure 8).

    Screenshot showing the Amazon EC2 EBS storage volume configuration options

    Figure 8

    After configuring storage, select “Next: Add Tags” where you be given the option of creating a “Tag” for your instance (See Figure 9). Tags enable you to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. Each tag consists of a key and a value, both of which you can define. Uniquely tagging instances can be beneficial, particularly if you plan on creating many of them. Again, this is an optional step, and since we’re creating a single instance, we’ll forgo tagging and move on to the next step: Configure Security Group.

    Screenshot showing the Amazon EC2 instance tagging option

    Figure 9

    A security group is a set of firewall rules that control the traffic for your instance. For example, if you want to set up a web server and allow traffic to reach your instance, you would add rules that permit unrestricted access to HTTP and HTTPS ports. You can create a new security group or select from an existing one. In this example, we would simply like to connect to the new FreeBSD instance using a secure shell (SSH) so there is no need to create a new rule as one already exists for SSH by default. However, you may wish to filter incoming SSH connections to your FreeBSD instance. If you’d like to connect from any network, then simply retain the select “custom” from among the options in the drop down list under “Source”, else you can limit incoming connections to the IP your currently using or to a custom IP address or IP subnet. For this example, we’ll allow incoming SSH connections on port 22 from anywhere (See Figure 10).

    Screenshot showing the configuration of security group rules in Amazon EC2

    Figure 10

    When complete, select “Review and Launch” where you’ll be given one last opportunity to modify your settings. If everything checks out select “Launch” where a pop up screen will provide the opportunity to select an existing key pair or create a new key pair. A key pair consists of a OpenSSL public key, which Amazon AWS retains and copies to your instance, and a private key that you download and retain. Together, they allow you to connect to your FreeBSD instance securely using SSH. If this this is first time you’ve created an instance you’ll likely not have an existing key pair from which to chose. If this is the case, select “Create a new key pair” from among the options in the drop down list and enter a name for your new key pair. In this example we’ll use the name “ec2-or-freebsd.” Now select “Download Key Pair” and save the file in a secure and accessible location (See Figure 11).

    Screenshot showing the creation of a new key pair in Amazon EC2

    Figure 11

    Next, select “Launch Instances”, followed by “View Instances” and you’ll be taken to a page showing your FreeBSD instance launching. After a minute or two, the “Instance State” will change from “pending” to “running” (See Figure 12). You can stop your instance by selecting “Stop” from among the options in the drop down list under “Actions” located at the top of the page.

    Screenshot showing a running FreeBSD instance in Amazon EC2

    Figure 12

    Finally, let’s get the public IP address of our FreeBSD instance. Select “Connect” at the top of the instance page and make a note of the public IP address assigned to your instance (See Figure 13). Note that the instance will be assigned a new public IP address if you stop it and restart it. If you want to avoid this situation then consider using an Elastic IP address. If you simply reboot the instance from within the operating system it will retain the same public IP addresses.

    Screenshot showing the public IP address assigned to this FreeBSD instance n Amazon EC2

    Figure 13

    Connect to the instance from Windows

    Now that we have our new FreeBSD instance up and running under Amazon EC2 let’s turn our attention to connecting to it using SSH under Windows. Since Windows doesn’t typically support SSH, we’ll need an SSH client. There are many out there to choose from, but the one we’ll use in this example is PuTTY, a free implementation of Telnet and SSH for Windows and Linux/BSD platforms.

    PuTTY does not natively support the private key format *.pem generated by Amazon EC2, so we’ll also need a way to convert this key file to a key format that the PuTTY application can use. For this we’ll use PuTTYgen, a free key generation utility, which can convert keys to *.ppk, the file format required by PuTTY. You can download standalone versions of PuTTY and PuTTYgen, or simply download the Windows installer version of PuTTY, which will also install PuTTYgen, as well as Pageant, an SSH authentication agent for PuTTY.

    Fire up PuTTYgen and select “Load”. Navigate to where you downloaded the ec2-or-freebsd.pem file and select “Open” (Note: you may have to change the search filter from “PuTTY Private Key Files (*.PPK)” to “All Files (*.*)” in order to readily locate the file). Once ec2-or-freebsd.pem has been successfully loaded into PuTTYgen, you can modify the “Key comment” field if desired, as well as add a passphrase to protect your private key. Electing not to means that anyone gaining access to your private key will also quite easily be able to access your FreeBSD instance. Once complete select “Save private key” and select a name (for this example, we’ll use the same name: ec2-or-freebsd) and a location to save the new key file (See Figure 14).

    Screenshot showing the creation of a ppk file in PuTTYgen

    Figure 14

    Exit out of PuTTYgen and fire up PuTTY. Navigate to Connection->SSH->Auth. Under Authentication parameters select the Browse button and select the ec2-or-freebsd.ppk file you saved in the previous step. Navigate back up to Session. You’ll connect as “ec2-user” so prepend this user name to the public IP address assigned to your instance so that the entire field looks like this: “ec2-user@”. If you chose a different SSH port number other than the default 22 when setting up your instance’s security group, ensure that number is reflected in the “Port” field.

    Now select “Open” and the PuTTY client will connect to your FreeBSD instance. If this is the first time you’ve connected to it, you’ll receive a warning concerning the authenticity of the host you’re trying to reach. If you’re sure this is the correct instance and you want to continue connecting, select “Yes” to add the key to PuTTY’s cache and carry on connecting. If you want to carry on connecting just once, without adding the key to the cache, select “No”. You’ll be asked to provide the passphrase (if you created one) for your private key and you’ll be connected to the instance.

    Connect from FreeBSD or Linux

    Connecting to your FreeBSD EC2 instance via SSH is significantly easier in FreeBSD or Linux. Start by checking to see if the .ssh directory exists in your home directory. If it does not, create it and set it’s permissions appropriately:

    Now move the ec2-or-freebsd.pem file you downloaded to ~/.ssh and modify its permissions appropriately:

    As an optional security step you can add a passphrase to your key:

    Now let’s connect to our FreeBSD instance:

    If you chose a different port number than the default when setting up the instance’s security group, then you’ll need to specify that on the command line as well:

    If this is the first time you’ve connected to it, you’ll receive a warning concerning the authenticity of the host you’re trying to reach. If you’re sure this is the correct instance and you want to continue connecting type “yes” at the prompt. The public key of your FreeBSD EC2 instance will be added to ~/.ssh/known_hosts and you will be connected.

    Conclusion

    Well, that’s it. With a little effort you can easily create, configure and connect to your own FreeBSD instance in Amazon EC2. Now that you know that your *.ppk and/or *.pem private key works, you should back it up to offline media such as a flash drive or CD and keep it someplace secure. I also strongly recommend that you create a password for the user root in your FreeBSD instance(s).

    Issues to note

    Amazon does not provide a easy way to verify the key fingerprint – the one listed in the EC2 Management Console. I did manage to find this rather obscure command that will work from FreeBSD and Linux, but I have yet to find an easy way to perform this task under Windows, outside of installing and setting up the the Amazon EC2 command line interface tools.

    References

    http://aws.amazon.com/documentation/ec2/

    http://www.daemonology.net/blog/2017-10-21-FreeBSD-EC2-community-vs-marketplace-AMIs.html

    bookmark_borderHow to Create and Maintain a ZFS Mirror in NAS4Free

    NAS4free is an open source NAS (“Network Attached Storage”) platform based on FreeBSD that supports file sharing across Windows, Apple, and UNIX-like systems. Support for ZFS, Software RAID (0,1,5), disk encryption, S.M.A.R.T, email reports, CIFS FTP, NFS, TFTP, AFP, RSYNC, Unison, iSCSI, HAST, CARP, Bridge, UPnP, and Bittorent, are among its many features – all configurable through its GUI interface. NAS4Free can be installed on Compact Flash or USB flash drive, hard disk or booted into a “LiveCD” environment. NAS4Free code and documentation are released under the Simplified BSD License.

    The ZFS (“Zetabyte File System”) is a combined file system and logical volume manager designed by Sun Microsystems. The features of ZFS include protection against data corruption, support for high storage capacities, snapshots and clones, continuous integrity checking and automatic repair. ZFS is implemented as open-source software, licensed under the Common Development and Distribution License (CDDL).

    This post will describe how to setup a simple, yet resilient, ZFS-based RAID 1 (ZFS mirror) in NAS4Free. In RAID 1, data is written identically to two disk drives, thereby producing a “mirrored” set. If one disk becomes defective, the remaining disk still contains all the data. To help explain the steps involved, we’ll use two new 2TB (Terabyte) SATA 3.0 hard disks, along with the ZFS utilities available within NAS4Free, to create and configure our ZFS mirror. We’ll also discuss a few post-install activities to help maintain your ZFS mirror. All steps involved assume that the two hard drives have been installed correctly and are recognized by the BIOS, and that NAS4Free is installed and operational. The software versions used in this post were as follows:

    • NAS4Free v9.1.0.1 – Sandstorm (revision 636)

    So, let’s get started.

    Adding the Disks

    The first thing we need to do is logically add the two new disks to NAS4Free so the system acknowledges their existence, permitting further configuration on them. Log in to the NAS4Free GUI (“Graphical User Interface”), navigate to Disks->Management, and select the “+” icon. (See Figure 1).

    Screenshot showing the Disk Management page in NAS4Free

    Figure 1

    In the subsequent page you are presented with the configuration screen for adding new disks. Select the first 2TB disk from the drop-down menu under the “Disk” field, and select “unformatted” from among the options in the drop-down menu under the “Preformatted file system” field. The remaining options on this page can retain their default settings. Now select “Add” (See Figure 2).

    Screenshot showing the Disk Management - Add Disk page in NAS4Free

    Figure 2

    Repeat these steps for the second 2TB disk. When complete, select “Apply changes” (See Figure 3).

    Screenshot showing the Disk Management page in NAS4Free indicating that two new disks have been added

    Figure 3

    Note: If you’re adding disks that have previously been formatted using ZFS, NAS4Free will likely not allow you to add these disks as unformatted. You can, however, add them by selecting “zfs storage pool device” under the “Preformatted file system” field and skip the following formatting step.

    Format the Disks

    Now that the disks have been added, we need to format them. Navigate to Disks->Format, and select one of the newly added disks from the drop-down menu under the “Disk” field. Select “ZFS storage pool device” from the drop-down menu under the “File system” field, then select “Format disk” (See Figure 4).

    Screenshot showing a newly added disk being formatted as a ZFS storage pool device in NAS4Free

    Figure 4

    Repeat these steps for the second disk, then navigate back to Disks->Management and ensure that both disks are present and formatted as ZFS storage pool devices (See Figure 5).

    Screenshot showing two newly added disks formatted as a ZFS storage pool device in NAS4Free

    Figure 5

    Create a ZFS Virtual Device

    We’ve added our two 2TB hard disks and formatted them. Now its time to create a ZFS “vdev” or virtual device.

    Unlike traditional file systems, which reside on single devices and require a volume manager to use more than one device, ZFS filesystems are built on top of virtual storage pools called “zpools.” A zpool is constructed of virtual devices, or “vdevs,” which are themselves constructed of block devices: files, hard disk partitions, or entire disks, with the latter being the recommended usage. Block devices within a vdev may be configured in different ways, depending on needs and space available: non-redundantly (similar to RAID 0), as a mirror (RAID 1) of two or more devices, which is the focus of this post, or as a RAID-Z (similar to RAID-5) group of three or more devices.

    In summary then, a vdev represents the disk drives that are used to create a zpool. A zpool can have any number of vdevs at the top of the configuration, known as a “root vdev.” If the top-level virtual devices contain two or more physical devices, the configuration provides data redundancy as mirror or RAID-Z virtual devices.

    To create a virtual device consisting of our newly added hard disks, navigate to Disks->ZFS->Pools->Virtual device, and select the “+” icon. In the subsequent page, enter a name for the new virtual device under the “Name” field (e.g., “vd_1”), and select “Mirror” from among the options under the “Type” field. Now select both hard disks in the “Devices” field by holding the CTRL key and left-clicking each disk. You can also enter a description for the virtual device under the “Description” field, if desired. Select “Save” when complete (See Figure 6).

    Screenshot showing the creation of a ZFS virtual device in NAS4Free

    Figure 6

    Create a ZFS Pool

    Having created our vdev, let’s move on and create a zpool. Navigate to Disks->ZFS->Pools->Management, and select the “+” icon. In the subsequent page, enter a name for the new zpool under the “Name” field (e.g., pool_1). You should see the vdev created previously listed under the “Virtual devices” field. Select the vdev by left-clicking on it. Add a description for the virtual device under the “Description” field if desired. The remaining options can retain their default settings, resulting in the mount point for the zpool becoming /mnt/[your-zpool-name]. Select “Save” when complete (See Figure 7).

    Screenshot showing the creation of a ZFS zpool in NAS4Free

    Figure 7

    Create a ZFS Dataset

    At this point you could start using your entire zpool as storage if desired. However, a significant feature of ZFS is the concept of “datasets.” A dataset is essentially a child filesystem of the parent zpool. Imagine that the zpool is a single hard disk. In a typical hard disk you would create a single, disk-sized partition, and then format that partition with a filesystem. But if later you’d like to add additional filesystems to the disk, you have to erase and redo your partition to create more partitions to contain the new filesystems, or use a tool to actively resize existing partition, and then create the new partitions and filesystems.

    With datasets, all of these partitioning efforts are unnecessary. A ZFS dataset acts like another mounted partition with no locked-in size. The quantity of disk space it takes up is only as much space as you use in populating it, or children datasets of it (of course, it can never be larger than the size of its parent zpool). You don’t have to worry about resizing partitions as ZFS inherently handles all that for you. Additionally, each dataset can have its own special configuration by modifying different behavioral variables. For example, you can determine quota and permissions independently for each dataset. Finally, datasets provide more flexibility if you need to snapshot or clone your filesystems.

    To add a dataset to the zpool, navigate to Disks->ZFS->Datasets->Dataset, and select the “+” icon. Enter a name (e.g., “files”) in the “Name” field (resulting in the mount point for the dataset becoming /mnt/[your-zpool-name]/[your-dataset-name]). Ensure that the zpool created previously is selected from the drop-down list under the “Pool” field. If you’re interested in performing periodic snapshots of the dataset (discussed below), I recommend enabling the “Snapshot Visibilty” option so that the snapshots are added automatically to /mnt/[your-zpool-name]/[your-dataset-name])/.zfs/snapshots. The remaining options can be configured according to your requirements. Select “Add” when complete (See Figure 8).

    Screenshot showing the creation of a ZFS dataset in NAS4Free

    Figure 8

    Wrapping up

    We’ve successfully added two new 2TB hard disks to NAS4Free and formatted them, created a vdev and a zpool, and finally, created a dataset within our zpool. At this point you can start enabling services such as CIFS, NFS, UPnP, etc., to take advantage of your new ZFS mirror storage. Remember, when configuring some of these services to select the correct mount point for your dataset (e.g., /mnt/pool_1/files).

    With the creation and configuration of our ZFS mirror out of the way, let’s move on talk about a few maintenance activities that should prove useful.

      Replacing a defective hard disk

    Occasionally you may have to replace a hard disk in your zpool that has become defective. To perform the replacement, navigate to Disks->ZFS->Pools->Information and note which disk is defective or missing (e.g. ada2). Next, navigate to Disks->ZFS->Pools->Tools and offline the disk if possible by selecting “offline” from the drop-down list under the “Command” field. Ensure that “Device” is selected under the “Option” field and that the correct pool is selected under the “Pool” field. Use the checkbox to select the defective disk under the “Devices” field, then select “Send Command!” (See Figure 9).

    Screenshot showing a defective disk being offlined in NAS4Free

    Figure 9

    Power down NAS4Free, then identify and replace the defective disk with one of equal storage capacity using, if possible, the same SATA port [Pro-tip: Take the time to label your disks correctly (e.g. ada2) when you install them. It will make physically identifying the defective disk much easier!]. Restart NAS4Free and navigate to Disks->ZFS->Pools->Information to verify the device name for the new disk. If you were able to reuse the same SATA port, the device name should be same as the defective disk (e.g. ada2). Navigate to Disks->ZFS->Pools->Tools and replace the disk by selecting “replace” from the drop-down list under the “Command” field. Ensure that “Device” is selected under the “Option” field and that the correct pool is selected under the “Pool” field. Use the checkbox to select the defective disk under the “Devices” field and the new disk from the drop-down list under the “New Device” field, then select the “Send Command!” The replacement disk should resilver fairly quickly. Verify by navigating to Disks->ZFS->Pools->Information

      Creating and managing snapshots

    One of the many great features about using ZFS is its snapshot capability. A snapshot is a read-only reference to the state of a dataset at the moment the snapshot was taken. It is a reference, and not copy, because at the moment it is taken, it takes up no additional space. However, as data within the dataset changes, either because files are modified or deleted, the snapshot consumes disk space by continuing to reference the old data. This behavior allows you to easily recover files if necessary, but in doing so prevents disk space from being freed until the snapshot is deleted.

    To take a snapshot manually, navigate to Disks->ZFS->Snapshots->Snapshot, and select the dataset you want to snapshot (e.g., pool_1/files) from under the “Path” field. Enter a name for the snapshot (e.g., snapshot_1), enable “Recursive” option, then select “Add” (See Figure 10).

    Screenshot showing a ZFS snapshot being manually created in NAS4Free

    Figure 10

    NAS4Free also provides the ability to configure reoccurring snapshots under Disks->ZFS->Snapshots->Auto Snapshot. Here you can schedule a time the system should perform the snapshot and how long it should retain them, resulting in the oldest snapshot being deleted when the deadline is reached.

    You have a couple of options when it comes to “rolling back” to a particular snapshot. In fact, though , rolling back is a slight misnomer, because what you’re really doing is locating the snapshot you’re interested in and copying over the files you’d like to recover. If you selected the option “Snapshot Visibility” when setting up your dataset in NAS4Free (See Disks->ZFS->Datasets->Dataset->Edit), then all snapshots for that dataset will be located in that filesystem under the directory /.zfs/snapshot (e.g., /mnt/pool_1/files/.zfs/snapshot). This allows you to simply navigate to the snapshot directory your interested in and copy files from that directory to the current filesystem.

    Another way you can recover files from snapshots is to clone one to another directory. This approach has the advantage of allowing you to share out the cloned snapshot directory, say using CIFS or NFS, for some period of time until files are recovered. To clone a snapshot, navigate to Disks->ZFS->Snapshots->Snapshot and edit the snapshot you’re interested in cloning by selecting the small wrench icon. Ensure that “Clone” is selected under the “Action” field, then enter a path to the directory where the clone is to reside. Note that this path must be expressed as a relative path. So, for example, pool_1/files/oldfiles would work, but /mnt/pool_1/files/oldfiles would not, nor would /pool_1/files/oldfiles. Also note that the directory where the snapshot will be cloned does not have to be created in advance, rather it will be created automatically for you when you clone the snapshot. Now, select “Execute” when finished and your cloned snapshot will be available for use at the path you specified (e.g. /mnt/pool_1/files/oldfiles) (See Figure 11). Cloned snapshots can be destroyed at anytime by navigating to Disks->ZFS->Snapshots->Clone.

    Screenshot showing a snapshot clone being manually created in NAS4Free

    Figure 11
      Data scrubbing

    Performing a ZFS “scrub” on a regular basis helps to identify data integrity problems, detect silent data corruptions caused by transient hardware issues, and to provide early alerts to disk failures. This operation traverses all the data in the zpool once and verifies that all blocks can be read. Scrubbing proceeds as fast as the vdevs will allow, though the priority of any disk I/O generally remains below that of normal operations. So, while the scrub operation might negatively impact performance slightly, the zpool’s data should remain usable and nearly as responsive while the scrubbing occurs.

    To schedule and manage scrubs on a ZFS zpool in NAS4Free, we’ll set up a cron job to run the zpool scrub command. Navigate to System->Advanced, and select the Cron tab. Ensure that the “Enable” checkbox is selected, then enter the command zpool scrub [your-pool-name] in the “Command” field. Ensure that the command is run as the root user and enter a description for the cron job if desired. Now select when you’d like the command to run in the “Scheduled time” field. If you have consumer-quality drives, consider a weekly scrubbing schedule. If you have data center-quality drives, consider a monthly scrubbing schedule. Also note that depending upon the amount of data in the zpool, a scrub can take a long time. Consequently, you may want to consider scheduling them for evenings or weekends to minimize the impact on performance. When complete, select “Add”, then “Apply changes”. The example shown in Figure 12 shows the command zpool scrub pool_1 will run every Sunday at 1300 local time.

    Screenshot showing ZFS scrubbing being configured as a cron job in NAS4Free

    Figure 12

    Conclusion

    This post described how to create and maintain a simple, yet resilient, ZFS mirror in NAS4Free, an open source NAS implementation based on FreeBSD.

    bookmark_borderHow to Use Portmaster to Update Ports

    (20170315 — The steps in this post were amended to address changes in recent versions of software — iceflatline)

    The Ports Collection is a set of Makefile, patches, and description files stored in /usr/ports. This set of files is used for building and installing applications on FreeBSD, and other BSD-based operating systems.

    This post will describe how to use portmaster, a utility for updating installed ports. portmaster is nothing more than a shell script (albeit a quite elegant and powerful one), written in /bin/sh. It does not depend upon other ports, external databases or languages, rather it’s been written in such a way as to make use of the information about a port’s dependencies, dependents, file locations and other information contained in /var/db/pkg to determine which ports to update.

    The versions of software discussed in this post are as follows:

    • FreeBSD 11.0-RELEASE
    • portmaster-3.17.10

    Okay, let’s get started. All commands are issued as the root user or by simulating the root user by using the command su. Let’s make sure that the Ports Collection is updated to its most current version with the following command:

    If you haven’t installed portmaster yet, let’s do that now. You’ll be prompted with several configuration options. Select any options you’d like and select “OK”:

    Now that the Ports Collection has been updated and portmaster installed, let’s check the installed ports against the updated Ports Collection to see whether any installed ports need to be updated. portmaster provides a way to list ports that need updating using the -L option:

    As you’ll see in the corresponding output of this command that portmaster groups all installed ports into four categories:

    Root ports: port listed under this category have no dependencies, nor are they depended on by other ports.

    Trunk ports: ports listed under this category have no dependencies, but other ports depend upon them.

    Branch ports: ports listed under this category have dependencies and are also depended upon by other ports.

    Leaf ports: Ports listed under this category have dependencies but are not depended upon by other ports.

    Each installed port will be listed in one of these categories along with whether the port has a revised version available:

    Following the list portmaster will present a succinct summary of the status of your ports:

    Before updating a particular port or ports, it’s a good idea to check the notes contained in /usr/ports/UPDATING to see if there are any issues related to updating one or more of them. /usr/ports/UPDATING contains all the last minute notes on all of the ports in the Ports Collection and documents, where applicable, some of the problems you may encounter when updating, and/or additional features or options that may be available. Follow the instructions contained in /usr/ports/UPDATING to update the affected ports. In most every case there will be instructions for how to use portmaster to perform the task. The remaining ports can be updated using the following command:

    The -d option tells portmaster to clean up the installation files (in /usr/ports/distfiles), which will help save some disk space. The -w option tells portmaster to save old shared libraries (in /usr/local/lib/compat/pkg/) before “deinstalling” the existing port, allowing those libraries to potentially be restored if there are any incompatibility issues between the new port and the installed libraries. Adding the -v option will direct portmaster to be a bit more forthcoming about what it’s doing. Finally, the name of the port should be one of the following: the full name of the port directory as specified in /var/db/pkg, for example apache22-2.2.23_3 or the full path to the port in the Ports Collection, for example /usr/ports/www/apache22.

    After entering the command above portmaster will recurse through the port and its dependencies (if any) to handle any configuration options. If configuration options have changed since the last time the port was updated, portmaster will likely prompt for input. However, you can force the configuration dialogs for all ports by adding the force-config option to the command:

    If none of the port’s dependencies require updating, portmaster will simply download the necessary source files and perform the update, otherwise you will be presented a list of ports that will be updated and asked to confirm before portmaster proceeds. You can skip the confirmation step by adding the no-confirm option to the command:

    You can also update all of the outdated ports at once using the following command:

    The -a options tells portmaster to review all installed ports and update them if necessary. Once again, if portmaster is unclear about the configuration options for a particular port, it will prompt for input, otherwise it will present a list of ports that it will update and ask to confirm before proceeding. The force-config and no-confirm options can be used here as well, if desired.

    Adding the -x option will direct portmaster to avoid building or updating ports that match a pattern. For example, the following will update all installed ports except apache22:

    The portmaster utility also provides some other useful functions. For example, portmaster can be used as a port installation tool by executing it as though you were updating a port. portmaster will recognize that it’s a new port and install the port’s dependencies as usual:

    Sometimes it’s helpful to have portmaster figure out what needs to be updated and in what order, but not actually do it. Adding the -n option directs portmaster to run through the configuration, but not actually update or install any ports

    There you have it. The portmaster utility is a simple yet powerful tool for updating your ports. It does not depend on other software or use an external database to track what you have installed, but rather uses the existing ports infrastructure, including what is located in /var/db/pkg. This post covered the basics. The portmaster man page contains a lot more information about portmaster, how it works and what choices are available to you.

    iceflatline