Wednesday, February 23, 2011

PSA: NFS Locking in RHEL6 Needs Reverse Lookup of NFS Server

The RHEL6 version of rpc.statd needs to be able to do a reverse lookup on the IP address of your NFS server (resolve it to a hostname) if you want to lock files mounted from that server. If you don't have a reverse lookup for the IP of your NFS server, just throw its IP and some kind of name for it in /etc/hosts.

Otherwise you get errors in syslog to the effect of:

Feb 23 13:14:57 wardentest3 rpc.statd[1259]: No canonical hostname found for 10.238.3.20
Feb 23 13:14:57 wardentest3 rpc.statd[1259]: STAT_FAIL to wardentest3.geneseo.edu for SM_MON of 10.238.3.20
Feb 23 13:14:57 wardentest3 kernel: lockd: cannot monitor 10.238.3.20
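
A quick way to check from the client whether the server's IP reverse-resolves is sketched below in Python. This is only a rough check: it assumes socket.gethostbyaddr goes through the same system resolver configuration (DNS and /etc/hosts) that rpc.statd ends up using, which I have not verified against the statd source. The function name and the injectable resolver argument are made up for illustration.

```python
import socket


def has_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the hostname the system resolver gives for ip, or None.

    The resolver argument is a hypothetical hook so the check can be
    exercised without touching the network; by default the real
    system resolver is used.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses
        # and are raised when no name can be found for the address.
        return None
    return hostname


if __name__ == "__main__":
    # 10.238.3.20 is the NFS server IP from the syslog lines above.
    name = has_reverse_lookup("10.238.3.20")
    if name is None:
        print("no reverse lookup; add an entry to /etc/hosts")
    else:
        print("resolves to", name)
```

If it prints that there is no reverse lookup, adding the /etc/hosts entry and running it again should show whatever name you picked.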

Thursday, February 10, 2011

Oracle Linux 5.6 Anaconda, HTTP RPM Repo, device-mapper-multipath

Including the "base" RPM channel of Oracle Linux 5.6 (a copy of the Server directory) that is accessed over HTTP causes the last RPM to fail to install during a cobbler-assisted pxeboot install. For me it's device-mapper-multipath-0.4.9-23.0.8.el5.x86_64.rpm. Anaconda says it cannot download/install that package and asks me if I want to try to download/install it again or not.

The line in the Apache logs each time I pick "retry":

137.238.2.94 - - [10/Feb/2011:21:02:19 +0000] "GET /oracle/OracleLinux/OL5/6/base/x86_64/device-mapper-multipath-0.4.9-23.0.8.el5.x86_64.rpm HTTP/1.1" 416 394

A packet capture reveals that Anaconda is issuing a GET request for the RPM in question but only asking for bytes 440-25514. This corresponds to that RPM's header-range entry in the repo metadata:

<rpm:header-range start="440" end="25515"/>


It looks like when Anaconda says "beginning installation" it's querying the header-range of each RPM it's interested in, then when it has resolved dependencies/order of install it downloads the full contents of each RPM.
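
The relationship between the repo metadata and the byte range seen on the wire can be sketched like this. It assumes the header-range end value is exclusive while the last byte in an HTTP Range header is inclusive; that is inferred only from the numbers above (end="25515" in the metadata, 25514 in the capture), not from Anaconda's code. The function name is hypothetical.

```python
def header_range_to_http_range(start, end):
    """Build an HTTP Range header value from a header-range entry.

    Assumption: end in the repo metadata is exclusive, while the
    last-byte position in an HTTP Range header is inclusive.
    """
    return "bytes={}-{}".format(start, end - 1)


# Values from the header-range entry for device-mapper-multipath.
print(header_range_to_http_range(440, 25515))  # bytes=440-25514
```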

In the packet capture the GET request has the Range header set to "bytes=86392-" which I assume means it wants from byte 86392 to the end of the file. The only problem is that the file is only 86392 bytes long, so its last byte is at offset 86391. Anaconda is asking for a byte range that starts past the end of the file, and thus Apache returns the 416 (Requested Range Not Satisfiable) error.
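
To make the off-by-one concrete, here is a minimal sketch of the check a server has to make for a simple single range. It only handles the "bytes=first-" and "bytes=first-last" forms (no suffix ranges or multiple ranges), and it is not Apache's actual implementation; the function name is made up.

```python
import re


def range_is_satisfiable(range_header, file_size):
    """Return True if a single 'bytes=first-[last]' range can be served.

    Byte offsets are zero-based, so the last valid offset in a file is
    file_size - 1. A range whose first byte is at or beyond file_size
    cannot be satisfied, which is the case that produces a 416.
    """
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header)
    if match is None:
        raise ValueError("unsupported Range header: " + range_header)
    first = int(match.group(1))
    return first < file_size


# The request from the packet capture against the 86392-byte RPM.
print(range_is_satisfiable("bytes=86392-", 86392))     # False
# The earlier header-only request against the same file.
print(range_is_satisfiable("bytes=440-25514", 86392))  # True
```

A request for "bytes=86392-" looks like a resume of a download that had in fact already finished, but that is a guess on my part.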

The strange thing is that this is the second time this file was requested during the "fetch and install each RPM" phase. The first time, Anaconda was smart enough to fetch the whole file and presumably install it.

As I mentioned earlier, this problem is only apparent when making the 5.6 "base" repo available to Anaconda over HTTP. I disabled that repo in cobbler and Anaconda was able to use the NFS path to the folder that has the extracted contents of the Oracle Linux 5.6 DVD to install without a problem. This is strange because I created our 5.6 "base" repo by copying all the .rpm files from the 5.6 DVD and running "createrepo ." in the new directory. I even ran diff to make sure the device-mapper-multipath-0.4.9-23.0.8.el5.x86_64.rpm is the same in my repo and on the 5.6 DVD.

I even tried using Oracle Linux 5.5's installer with the 5.6 "base" repo, which should effectively install Oracle Linux 5.6 packages with 5.5's version of Anaconda. That had the same problem. It really looks like there's something about the 5.6 "base" repo that Anaconda doesn't like accessing over HTTP.