天天看點

【分布式存儲】GlusterFS failing to mount at boot with Ubuntu 14.04

GlusterFS failing to mount at boot with Ubuntu 14.04

【分布式存儲】GlusterFS failing to mount at boot with Ubuntu 14.04

Previously I asked about mounting GlusterFS at boot in an Ubuntu 12.04 server  and the answer was that this was buggy in 12.04 and worked in 14.04.  Curious I gave it a try on a virtual machine running on my laptop and in  14.04 it worked. Since this was critical for me, I decided to upgrade  my running servers to 14.04 only to discover that GlusterFS is not  mounting localhost volumes automatically either.

This is a Linode server and fstab looks like this:

# <file system> <mount point>          <type>    <options>                 <dump>  <pass>
proc        /proc                        proc    defaults                       0       0
/dev/xvda   /                            ext4    noatime,errors=remount-ro      0       1
/dev/xvdb   none                         swap    sw                             0       0
/dev/xvdc   /var/lib/glusterfs/brick01   ext4    defaults                       1       2
koraga.int.example.com:/public_uploads /var/www/shared/public/uploads glusterfs defaults,_netdev 0 0
           
The booting process likes like this (around the networking mounting part, which are the only fails):
* Stopping Mount network filesystems                                    [ OK ]
 * Starting set sysctls from /etc/sysctl.conf                            [ OK ]
 * Stopping set sysctls from /etc/sysctl.conf                            [ OK ]
 * Starting configure virtual network devices                            [ OK ]
 * Starting Bridge socket events into upstart                            [ OK ]
 * Starting Waiting for state                                            [fail]
 * Stopping Waiting for state                                            [ OK ]
 * Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
 * Starting Waiting for state                                            [fail]
 * Starting Block the mounting event for glusterfs filesystems until the [fail]k interfaces are running
 * Stopping Waiting for state                                            [ OK ]
 * Starting Signal sysvinit that remote filesystems are mounted          [ OK ]
 * Starting GNU Screen Cleanup                                           [ OK ]
           
I believe the log file

/var/log/glusterfs/var-www-shared-public-uploads.log

contains the main clue to the problem, as it's the only one that is really different between this server, where mounting is not working, and my local virtual server, where it is:
[2014-07-10 05:51:49.762162] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.1 (/usr/sbin/glusterfs --volfile-server=koraga.int.example.com --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-07-10 05:51:49.774248] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-07-10 05:51:49.774278] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-07-10 05:51:49.775573] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 192.168.134.227:24007 failed (Connection refused)
[2014-07-10 05:51:49.775634] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: koraga.int.example.com (No data available)
[2014-07-10 05:51:49.775649] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2014-07-10 05:51:49.776284] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7f6718bf3f83] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7f6718bf7da0] (-->/usr/sbin/glusterfs(+0xcf13) [0x7f67192bbf13]))) 0-: received signum (1), shutting down
[2014-07-10 05:51:49.776314] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.
           
The status of the volume is:
Volume Name: public_uploads
Type: Distribute
Volume ID: 52aa6d85-f4ea-4c39-a2b3-d20d34ab5916
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: koraga.int.example.com:/var/lib/glusterfs/brick01/public_uploads
Options Reconfigured:
auth.allow: 127.0.0.1,192.168.134.227
client.ssl: off
server.ssl: off
nfs.disable: on
           
If I run

mount -a

after booting up, the volume is mounted correctly:
koraga.int.example.com:/public_uploads on /var/www/shared/public/uploads type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)
           
A couple of related log files show this:

/var/log/upstart/mounting-glusterfs-_var_www_shared_public_uploads.log

:
start: Job failed to start
           

/var/log/upstart/wait-for-state-mounting-glusterfs-_var_www_shared_public_uploadsstatic-network-up.log

status: Unknown job: static-network-up
start: Unknown job: static-network-up
           

but on my testing server, it shows exactly the same, so, I don't think this is relevant.

Any ideas what's wrong now?

Update: I tried the change of WAIT_FOR from static-network-up to networking and it still didn't work but all the [fail] messages at boot disappear. These are the contains of the log files under these conditions:

/var/log/glusterfs/var-www-shared-public-uploads.log

contains:
wait-for-state stop/waiting
           

/var/log/upstart/wait-for-state-mounting-glusterfs-_var_www_shared_public_uploadsstatic-network-up.log

start: Job is already running: networking
           

/var/log/glusterfs/var-www-shared-public-uploads.log

[2014-07-11 17:19:38.000207] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.1 (/usr/sbin/glusterfs --volfile-server=koraga.int.example.com --volfile-id=/public_uploads /var/www/shared/public/uploads)
[2014-07-11 17:19:38.029421] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
[2014-07-11 17:19:38.029450] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
[2014-07-11 17:19:38.030288] E [socket.c:2161:socket_connect_finish] 0-glusterfs: connection to 192.168.134.227:24007 failed (Connection refused)
[2014-07-11 17:19:38.030331] E [glusterfsd-mgmt.c:1601:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: koraga.int.example.com (No data available)
[2014-07-11 17:19:38.030345] I [glusterfsd-mgmt.c:1607:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
[2014-07-11 17:19:38.030984] W [glusterfsd.c:1095:cleanup_and_exit] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x23) [0x7fd9495b7f83] (-->/usr/lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_notify+0x90) [0x7fd9495bbda0] (-->/usr/sbin/glusterfs(+0xcf13) [0x7fd949c7ff13]))) 0-: received signum (1), shutting down
[2014-07-11 17:19:38.031013] I [fuse-bridge.c:5475:fini] 0-fuse: Unmounting '/var/www/shared/public/uploads'.
           
Update 2: I also tried this in the upstart file:
start on (started glusterfs-server and mounting TYPE=glusterfs)
           
but the computer failed to boot (don't know why yet).

參考資料:

http://serverfault.com/questions/611462/glusterfs-failing-to-mount-at-boot-with-ubuntu-14-04/

繼續閱讀