Automatic Local Mirror YUM Configuration

Introduction and Goals
Overview
Local mirror creation
Create the mirrors.fedoraproject.org virtual host
DNS configuration
Improvements and thoughts


Introduction and Goals

The Yellowdog Updater, Modified, or YUM, is the package installation and upgrade tool integrated into Fedora Linux and other distributions. This Guru Labs Guide achieves the following goals:

  • Deploy a local Fedora mirror for core package installation and updates
  • Have local Fedora boxes automatically use the local mirror with ZERO configuration changes.

Because ZERO configuration changes are made on the Fedora box, a box that is moved outside the local network (common with laptops) automatically reverts to the stock behavior of accessing the Fedora Mirror system.

The benefit is greatly reduced Internet bandwidth utilization and install/updates that occur at LAN speeds.

The solution detailed in this Guru Guide is ideal for any organization, such as a company, school/university, hosting company or ISP, that has multiple Fedora users on its network. This solution has been in production for many months here at Guru Labs, a Linux training company, with great results for the Linux instructors and Fedora users who work here.

Also note that this same solution could be trivially adapted for CentOS boxes as well.

Finally, note that the GL250 Linux Systems Administration and GL275 Enterprise Linux Network Services training courses have advanced coverage of all the software in this guide, including topics such as implementing split DNS with views, virtual hosting, rsync and YUM repository management.

Overview

The stock Fedora Linux YUM configuration defines repositories in the /etc/yum.repos.d/ directory. The repo files use the world-wide Fedora Mirror system with a line such as the following:

mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=core-$releasever&arch=$basearch

The mirrorlist program running on mirrors.fedoraproject.org returns a list of mirrors for the repository and YUM selects one. To see the list of mirrors for the Fedora core-6 repository, open the following URL in your web browser:

http://mirrors.fedoraproject.org/mirrorlist?repo=core-6&arch=i386
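As a rough illustration of how that URL is formed, the sketch below expands the two variables by hand. The function name mirrorlist_url is hypothetical; YUM performs this substitution internally from the installed release and architecture.

```shell
# Hypothetical helper: build the core mirrorlist URL that YUM would request.
# $1 = value of $releasever (e.g. 6)
# $2 = value of $basearch   (e.g. i386)
mirrorlist_url() {
    printf 'http://mirrors.fedoraproject.org/mirrorlist?repo=core-%s&arch=%s\n' "$1" "$2"
}

# Print the URL for the core repository of release 6 on i386.
mirrorlist_url 6 i386
```

Passing the printed URL to a command-line HTTP client such as curl shows the same list in a terminal.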

If your first instinct is to use a transparent proxy server to automatically cache updates, be aware that it won't work effectively. The reason is that clients select different mirrors, so a given RPM will be accessed via numerous URIs, each of which is cached independently of the others.

To implement the Guru Labs YUM Automatic Local Mirror solution, perform the following steps:

  • On the DNS server(s) used by your organization, become authoritative for the mirrors.fedoraproject.org host and return an A record pointing to a local IP address.
  • On that local IP address, run a virtual web server to handle requests for the mirrors.fedoraproject.org host.
  • On that virtual web server, install the GPL'd Guru Labs mirrorlist Perl CGI that directs Fedora clients to use a local mirror for locally mirrored repositories and otherwise uses the normal Fedora Mirror system.
  • Create a local mirror by using a cron job to mirror the Fedora repositories you are interested in. This could be done on the same server that is hosting mirrors.fedoraproject.org or on another server.

I recommend implementing these steps in reverse order, so that the mirror and web server are working before DNS begins directing your Fedora users to them.

For the detailed instructions below, the following placeholders will be used:

  • Local mirror and webserver hostname: fileserver.gurulabs.com
  • IP address used for virtual hosting the mirrors.fedoraproject.org site: 192.0.2.5

Where you see those placeholders in the instructions below, replace them with actual values appropriate to your organization. The directions assume you are using Fedora, RHEL, or CentOS for fileserver.gurulabs.com but they could be easily tailored for any Linux or UNIX variant.

Local mirror creation

Decide which Fedora repositories you'd like to mirror locally. Keep in mind that in Fedora Core v6, the core and extras repositories are separate. In Fedora v7 they have been merged. In this example, we'll mirror core-6 and updates-released-fc6. Make sure you have enough disk space. As of May 2007, core-6 and updates-released-fc6 together require 7.7GB. If you have already copied the contents of the Fedora v6 DVD to your filesystem, then depending on your filesystem layout you might be able to use hard links to save space on the core-6 repo. On fileserver.gurulabs.com perform the following:

Setting up the core-6 repo:

Create a directory tree to hold the core-6 repo:

mkdir -p /export/mirror/fedora-linux/6/i386

Copy (or create hard links to) all the RPMs shipped on the Fedora Core v6 DVD into that directory, then run createrepo from within it:

createrepo .

Since the core-6 repo is static, that is all that needs to be done for this repo.
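The hard-link option and a quick sanity check can be sketched as follows. Both function names are hypothetical, and the sketch assumes the DVD copy and the repo directory are on the same filesystem, since hard links cannot cross filesystems. The check relies on createrepo writing its metadata index to repodata/repomd.xml.

```shell
# Hypothetical helpers; the directories passed in are up to you.

# Hard-link every RPM found in directory $1 into repo directory $2.
# Both directories must be on the same filesystem.
link_rpms() {
    ln "$1"/*.rpm "$2"/
}

# Succeed if createrepo has already produced metadata in directory $1.
repo_has_metadata() {
    [ -f "$1/repodata/repomd.xml" ]
}
```

For example, after running link_rpms with the RPM directory of your DVD copy as the first argument and /export/mirror/fedora-linux/6/i386 as the second, run createrepo as shown above and confirm the result with repo_has_metadata /export/mirror/fedora-linux/6/i386.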

Setting up the updates-released-fc6 repo:

Create a directory tree to hold the updates-released-fc6 repo:

mkdir -p /export/mirror/fedora-linux-updates/6/i386

Consult the Fedora Mirror system and choose an rsync-accessible mirror that is "close" to you in Internet proximity (if known) or geographically. Perform the initial population of the directory (all on one line):

rsync -rlpth --progress --exclude=debug/ \
    rsync://fedora-mirror/fedora-linux-updates/6/i386/ /export/mirror/fedora-linux-updates/6/i386/

As of May 2007, this will download ~4.4GB of data. Also you may need to adjust the rsync URL depending on your selected mirror.

Create a cron job that will update the updates-released-fc6 repo on a daily basis:

cd /etc/cron.daily/
wget http://www.gurulabs.com/downloads/mirror-repos.sh
chmod 755 mirror-repos.sh

You must edit mirror-repos.sh and replace fedora-mirror with the hostname of your selected Fedora mirror. Note that mirror-repos.sh sleeps for a random period of up to 60 minutes before running the rsync command, to spread out the load on the mirrors. Finally, you may need to adjust the rsync URL in the script depending on your selected mirror.
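The contents of mirror-repos.sh are not reproduced in this guide. Purely to illustrate the behavior described above, a daily script along the following lines would do something similar. This is a sketch, not the Guru Labs script: the function names are hypothetical, fedora-mirror is the same placeholder used earlier, and the rsync options are those of the manual command minus the interactive ones.

```shell
# Hypothetical sketch of a daily mirror job (NOT the actual mirror-repos.sh).
MIRROR_HOST="fedora-mirror"   # placeholder: replace with your chosen mirror
SRC="rsync://$MIRROR_HOST/fedora-linux-updates/6/i386/"
DEST="/export/mirror/fedora-linux-updates/6/i386/"

# Print a random whole number of seconds between 0 and 3599 (under 60 minutes).
random_delay() {
    awk 'BEGIN { srand(); print int(rand() * 3600) }'
}

# Wait for the random delay, then pull new updates from the mirror.
mirror_updates() {
    sleep "$(random_delay)"
    rsync -rlpt --exclude=debug/ "$SRC" "$DEST"
}
```

A script built from this sketch would end with a call to mirror_updates. The random delay keeps many sites that run the same cron.daily schedule from hitting the mirror at the same moment.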

Adjust the SELinux policy

If your local mirror is using Fedora Core v6, RHEL5, CentOS5 or newer with SELinux in enforcing mode, then adjust the SELinux policy to allow Apache to serve content from that directory tree:

semanage fcontext -a -t httpd_sys_content_t "/export(/.*)?"
chcon -R -t httpd_sys_content_t /export

Adjust the commands to be more specific if you were already using /export/ and it contained other content with different security requirements.
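For instance, a narrower variant that labels only the mirror tree might look like the sketch below. The function names are hypothetical. It uses restorecon, which applies whatever context the policy (including the rule just added with semanage) assigns to each file, instead of setting the type by hand with chcon.

```shell
# Hypothetical helpers for labelling a single directory tree.

# Print the file-context pattern that matches directory $1 and everything under it.
fcontext_pattern() {
    printf '%s(/.*)?\n' "$1"
}

# Record the Apache content type for tree $1 in the policy, then apply it.
label_for_httpd() {
    semanage fcontext -a -t httpd_sys_content_t "$(fcontext_pattern "$1")"
    restorecon -R "$1"
}
```

Calling label_for_httpd /export/mirror as root leaves the rest of /export with its existing labels.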

Set up Apache to serve up the mirrored repositories

The following instructions assume that no other web sites are running on fileserver.gurulabs.com. If there are other web sites running on the machine, adjust the directions as needed so as not to interfere with the existing sites. One best practice we teach at Guru Labs is to create a separate Apache configuration file for each IP address on a web server and to put all Apache directives for all the virtual hosts using that IP address into that configuration file. Assuming that the IP address for fileserver.gurulabs.com is 192.0.2.5, create the file /etc/httpd/conf.d/192.0.2.5.conf with the following content:

NameVirtualHost 192.0.2.5:80

<VirtualHost 192.0.2.5:80>
ServerName fileserver.gurulabs.com
ServerAlias fileserver
Alias /fedora-linux-updates /export/mirror/fedora-linux-updates
<Location /fedora-linux-updates>
Options Indexes
</Location>
Alias /fedora-linux /export/mirror/fedora-linux
<Location /fedora-linux>
Options Indexes
</Location>
</VirtualHost>

Create mirrors.fedoraproject.org virtual host

Recall that Fedora boxes request a list of mirrors from the URL http://mirrors.fedoraproject.org/mirrorlist. Implement a virtual web host for mirrors.fedoraproject.org that you will use to run the Guru Labs mirrorlist replacement. At the bottom of the previously created /etc/httpd/conf.d/192.0.2.5.conf file add the following directives:

<VirtualHost 192.0.2.5:80>
ServerName mirrors.fedoraproject.org
ScriptAlias /mirrorlist /var/www/mirrors.fedoraproject.org/GLmirrorlist
</VirtualHost>

Create the directory to hold the CGI script:

mkdir -p /var/www/mirrors.fedoraproject.org

Download and install the Guru Labs mirrorlist replacement and set the proper SELinux file contexts:

cd /var/www/mirrors.fedoraproject.org
wget http://www.gurulabs.com/downloads/GLmirrorlist
chmod 755 GLmirrorlist
semanage fcontext -a -t httpd_sys_script_exec_t /var/www/mirrors.fedoraproject.org/GLmirrorlist
chcon -t httpd_sys_script_exec_t GLmirrorlist

Install the Perl modules required by the CGI script:

yum install -y perl-Net-DNS perl-libwww-perl

You must edit the Perl script and change the two occurrences of fileserver.gurulabs.com to the actual hostname of your local mirror.
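The decision such a mirrorlist replacement has to make can be sketched in a few lines of shell. This is NOT the GLmirrorlist source, which is a Perl CGI and is not reproduced here. The function name is hypothetical, and the URL paths are assumptions derived from the Alias directives configured earlier for the two repositories mirrored in this guide.

```shell
# Hypothetical sketch of the choice a mirrorlist replacement makes.
# This is NOT the GLmirrorlist script; it only illustrates the idea.
LOCAL_MIRROR="fileserver.gurulabs.com"   # placeholder hostname from this guide

# $1 = repo name, $2 = architecture.
# Print the local URL for a repo/arch pair that is mirrored locally.
# Return 1 for anything else, so that the caller can fall back to the
# real Fedora Mirror system.
local_mirror_url() {
    case "$1/$2" in
        core-6/i386)
            echo "http://$LOCAL_MIRROR/fedora-linux/6/i386/" ;;
        updates-released-fc6/i386)
            echo "http://$LOCAL_MIRROR/fedora-linux-updates/6/i386/" ;;
        *)
            return 1 ;;
    esac
}

# Example: a locally mirrored repo yields a URL on the local mirror.
local_mirror_url core-6 i386 || echo "not mirrored locally"
```

Requests for any repository or architecture that is not listed fall through, which is what lets clients keep using the normal mirrors for everything you have chosen not to mirror.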

DNS Configuration

To complete the configuration, local Fedora clients must resolve the mirrors.fedoraproject.org hostname to the IP address of the virtual host that serves it. To test your local mirror before putting it into production, you can add an entry for mirrors.fedoraproject.org to the /etc/hosts file on your own workstation.
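Another way to test, without touching /etc/hosts or DNS at all, is to send a request straight to the virtual host's IP address with the Host header a Fedora client would use. The sketch below assumes curl is installed; the function names are hypothetical.

```shell
# Hypothetical check of the mirrorlist virtual host.

# Request a mirrorlist from the web server at IP $1 for repo $2 and arch $3,
# sending the Host header that a real Fedora client would send.
fetch_mirrorlist() {
    curl -s -H 'Host: mirrors.fedoraproject.org' \
        "http://$1/mirrorlist?repo=$2&arch=$3"
}

# Succeed if the list returned for repo $2 / arch $3 mentions hostname $4.
uses_local_mirror() {
    fetch_mirrorlist "$1" "$2" "$3" | grep -q -F "$4"
}
```

For example, uses_local_mirror 192.0.2.5 core-6 i386 fileserver.gurulabs.com && echo "local mirror returned" should print the message once the CGI is working.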

The goal of the DNS configuration is to make the DNS server(s) used by your local Fedora boxes authoritative for exactly one host, mirrors.fedoraproject.org, and for no other part of the fedoraproject.org domain. A second goal is that only local DNS clients are affected. If non-local clients don't or can't use your local DNS servers, this second goal is met automatically. If non-local clients do query your DNS servers, it can be met by using "views" in the BIND name server.

In your named.conf file add the following zone entry:

zone "mirrors.fedoraproject.org" IN {
type master;
file "data/mirrors.fedoraproject.org";
};

In the data/mirrors.fedoraproject.org zone file add the following:


; Local clients should see mirrors.fedoraproject.org
; as 192.0.2.5 for Automatic Local Mirror
$TTL 3600
@ IN SOA ns1.gurulabs.com. noc.gurulabs.com. (
    2007022801 ; Serial
    10800 ; Refresh (3 hours)
    60 ; Retry (1 min)
    3600000 ; Expire (1000 hours)
    600 ) ; Negative TTL (10 mins)

@ IN A 192.0.2.5
@ IN NS ns1.gurulabs.com.

You should adjust the IP address in the A record to be the IP address on the Apache server hosting the mirrors.fedoraproject.org virtual host. The NS record(s) and SOA record should be adjusted to reflect correct site specific configuration.
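Once the zone is loaded, the override can be checked from a client. The sketch below assumes the dig utility is available; the function name resolves_to is hypothetical.

```shell
# Hypothetical check that a name resolves to the expected address.
# $1 = hostname, $2 = expected IPv4 address, $3 = optional DNS server to query
resolves_to() {
    answer=$(dig +short "$1" A ${3:+"@$3"})
    [ "$answer" = "$2" ]
}
```

For example, resolves_to mirrors.fedoraproject.org 192.0.2.5 ns1.gurulabs.com && echo "override active" should print the message when queried from inside your network. It is also worth confirming that other names under fedoraproject.org still resolve to their public addresses, since only the single host is meant to be overridden.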

Ideas for improvement and thoughts

It would be nice to avoid the DNS trickery and still have 100% local control of what mirror YUM uses.

Automatically finding a "local" cache/proxy/mirror is a solved problem in the web browser arena, where the WPAD protocol is used. If YUM implemented a WPAD-like technique (call it YMAD for YUM Mirror Auto Discovery), it could auto-detect whether the local site administrators have defined a local mirror, and use it. For example, YUM could use the following algorithm:

$myFQDN = getMyFQDN();
while (hostname.not.a.TLD($myFQDN)) {
   if (hostname.exists(ymad.$myFQDN)) {
       mirrorlist = "http://ymad.$myFQDN/$PathandQuery";
       break;   # stop at the first (most specific) match
   }
   $myFQDN = removeLeftmostSubdomain($myFQDN);
}
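To make the search order concrete, here is a small runnable sketch of the same idea in shell. YMAD does not exist; the ymad. prefix and both function names are hypothetical, and the sketch assumes a system with the getent command for hostname lookups. It stops at the first, most specific name that resolves.

```shell
# Hypothetical sketch of the proposed YMAD lookup (no such feature exists in YUM).

# Print the candidate hostnames for FQDN $1, most specific first,
# stopping before the bare top-level domain.
ymad_candidates() {
    domain=$1
    # A remaining dot means we have not yet reached the TLD.
    while [ "${domain#*.}" != "$domain" ]; do
        echo "ymad.$domain"
        domain=${domain#*.}     # remove the leftmost label
    done
}

# Print the first candidate that resolves; return 1 if none does.
ymad_discover() {
    for name in $(ymad_candidates "$1"); do
        if getent hosts "$name" >/dev/null; then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

For a host named laptop.eng.example.com, the candidates are ymad.laptop.eng.example.com, ymad.eng.example.com and ymad.example.com, in that order.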


The benefit of using a WPAD-like technique is that many organizations have experience implementing it for Firefox and IE web browsers.

Another way YUM could auto-detect and use a local mirror is to use DNS SRV records similar to how Kerberos, Jabber and other client/server network applications use them to automatically detect their local servers.
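As a sketch of what the SRV variant might involve, the helper below turns one SRV answer into a mirror base URL. No SRV service name is defined for YUM; the name _yum-mirror._tcp used in the usage note is invented purely for illustration, as is the function name. The input format is the "priority weight port target" line printed by dig +short for SRV records.

```shell
# Hypothetical: convert one line of `dig +short SRV` output
# ("priority weight port target.") read from stdin into a base URL.
# Only the first line is used; priority and weight are ignored in this sketch.
srv_to_url() {
    read -r _prio _weight port target || return 1
    # Drop the trailing dot of the fully qualified target name.
    echo "http://${target%.}:$port/"
}
```

With a record in place, a lookup such as dig +short SRV _yum-mirror._tcp.example.com | srv_to_url would print the URL of the advertised mirror, and would print nothing if no record exists.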

The GLmirrorlist CGI could be improved to use a configuration file to define local repositories without having to edit the script directly.

As of mid-May 2007, the Fedora Mirror System moved to a new backend called "mirrormanager". If you register your mirror (including a private internal mirror if you can run report_mirror) you can also list local IP networks that should use your mirror. This way the official mirrorlist CGI will only return your mirror for clients on those IP networks. The drawbacks to this approach are:

  • Security concerns about which mirror should really be authoritative for a given IP network. What happens if someone (possibly rogue) lists your IP networks? You have to hope that your users pay attention to YUM GPG messages and don't import bad keys.
    • In other words, there is no 100% local administrative control.
  • If you have RFC1918 addresses internally, use NAT to access the Internet, and your external IP changes, then the "mirrormanager" list-local-IPs approach becomes unworkable. The official mirrorlist will tell someone else to use your internal mirror, which will cause their package installs/updates to fail, and your own clients will no longer be directed to your local mirror.
  • The list of local IP networks must be maintained if/when they change for your organization

Like the DNS trickery, the mirrormanager approach of registering local IP networks is not ideal and has worrisome security issues. The proper solution is to tackle this problem within YUM directly, using a WPAD-like or DNS SRV technique so that it can auto-discover the local mirror. This needs to be turned on by default.