
[Repost] Unable to boot RHEL 7.6 on baremetal nodes

Source:

https://access.redhat.com/solutions/3717341


SOLUTION Verified - Updated December 6, 2018, 09:21


Environment

Red Hat OpenStack Platform 13

Red Hat Enterprise Linux 7.6

Issue

When trying to deploy an image built from the RHEL7.6 QCOW2, the image fails to boot.

The closing double quote in the GRUB_CMDLINE_LINUX line is in the wrong place, so everything after crashkernel=auto" is treated as a command instead of a kernel command-line argument.

Resolution

The image can be edited using guestfish to modify /etc/default/grub like so:

sudo guestfish -a rhel-7.6.qcow2
><fs> run
><fs> mount /dev/sda /
><fs> edit /etc/default/grub
           

Move the closing double quote from its position after crashkernel=auto to the end of the whole line.

Change From:

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0
           

Change To:


GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0"
           

After saving the file, exit guestfish:

><fs> exit
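
When the same edit has to be repeated across several images, the quote move can also be scripted instead of being done by hand in the editor. The following is a minimal sketch and not part of the original solution: fix_grub_cmdline is a hypothetical helper name, and it assumes the file has a single GRUB_CMDLINE_LINUX line whose only defect is the misplaced quote (values containing escaped quotes are not handled). It reads a copy of /etc/default/grub on standard input and writes the corrected text to standard output. Getting the file out of and back into the image, for example with guestfish's download and upload commands, is left out.

```shell
# Hypothetical helper: re-quote the GRUB_CMDLINE_LINUX line so that the
# whole value sits inside one pair of double quotes.
fix_grub_cmdline() {
  sed -E \
    -e '/^GRUB_CMDLINE_LINUX=/s/"//g' \
    -e 's/^(GRUB_CMDLINE_LINUX=)(.*)$/\1"\2"/'
}

# Example run against the broken line from the RHEL 7.6 image.
printf '%s\n' 'GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0' \
  | fix_grub_cmdline
```

The first expression strips every double quote from the matching line and the second wraps the whole value in a fresh pair, so running the helper on an already correct file leaves it unchanged.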
           

Root Cause

This issue is caused by the double quotes (") being in the wrong place in the GRUB_CMDLINE_LINUX line of /etc/default/grub. By default, the GRUB_CMDLINE_LINUX line looks like this:

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0
           

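As can be seen, the closing double quote comes directly after crashkernel=auto. Because grub2-mkconfig reads /etc/default/grub as a shell script, the shell treats console=ttyS0,115200n8 as a second variable assignment and then tries to run no_timer_check as a command, with net.ifnames=0 as its argument, instead of treating them as kernel command-line options. The line should be: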

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0"
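
The parsing behaviour behind this failure can be reproduced outside the image. The sketch below is an illustration rather than part of the original solution: it writes the broken line to a scratch file, sources that file in a child sh (on the assumption that this mirrors how grub2-mkconfig reads /etc/default/grub), and prints the resulting exit status and error text. The variable names tmp, err and status are arbitrary.

```shell
# Write the broken line to a scratch file (not the real /etc/default/grub).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0
EOF

# Source the file in a child shell and keep its stderr and exit status.
status=0
err=$(sh -c '. "$1"' sh "$tmp" 2>&1) || status=$?
rm -f "$tmp"

echo "exit status: $status"
echo "stderr: $err"
```

The exit status is 127, the shell's "command not found" code, and the error text names no_timer_check. The exact wording of the message depends on which shell provides sh.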
           

Diagnostic Steps

Working backwards from the Nova logs, the deployment failure appears in nova-conductor.log as a MaxRetriesExceeded error:

req-6f866160-cdf6-4a2a-a6ad-60cf1e541b21 6cc770ec85812266b3f 063b25ed7c094053be7a64c4f3caace0 - default default] 
Failed to compute_task_build_instances: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 74627e47-4c0f-442e-a7e0-76595ef1eb7ee.:
MaxRetriesExceeded: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 74628f44-4c0f-442e-a7e0-76595ef1fd3e.
           

Checking for the request ID in nova-scheduler.log, we can see that nova-scheduler successfully identifies a node:

2018-11-26 02:06:58.244 1 DEBUG nova.scheduler.utils [req-6f866160-cdf6-4a2a-a6ad-60cf1e541b21 6cc770e121914a658ab3c85812266b3f 063b25ed7c094053be7a64c4f3caace0 - default default] Attempting to claim resources in the placement API for instance 74628f44-4c0f-442e-a7e0-76595ef1fd3e claim_resources /usr/lib/python2.7/site-packages/nova/scheduler/utils.py:786

2018-11-26 02:06:58.809 1 DEBUG nova.scheduler.filter_scheduler [req-6f866160-cdf6-4a2a-a6ad-60cf1e541b21 6cc770e121914a658ab3c85812266b3f 063b25ed7c094053be7a64c4f3caace0 - default default] Selected host: (overcloud-controller-0.example.com, b94daa75-82f9-4381-9aa4-52ed77de3431) ram: 768000MB disk: 184320MB io_ops: 0 instances: 0 _consume_selected_host /usr/lib/python2.7/site-packages/nova/scheduler/filter_scheduler.py:325
           

The selected node ID is b94daa75-82f9-4381-9aa4-52ed77de3431.
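
On a busy scheduler log, the node ID can be extracted mechanically rather than read off by eye. This is a rough sketch under stated assumptions: selected_node is a hypothetical helper name, each log entry is assumed to occupy a single line in the actual log file, and the sample entry is an abbreviated copy of the one above.

```shell
# Hypothetical helper: print the node UUID from each "Selected host" entry
# read on standard input.
selected_node() {
  sed -n -E 's/.*Selected host: \([^,]+, ([0-9a-f-]+)\).*/\1/p'
}

# Example using an abbreviated copy of the scheduler entry.
printf '%s\n' 'DEBUG nova.scheduler.filter_scheduler [req-6f866160-cdf6-4a2a-a6ad-60cf1e541b21 - default default] Selected host: (overcloud-controller-0.example.com, b94daa75-82f9-4381-9aa4-52ed77de3431) ram: 768000MB disk: 184320MB io_ops: 0 instances: 0' \
  | selected_node
```

In practice the input would be the scheduler log filtered by request ID, and the printed UUID is the value to search for in the Ironic logs.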

Checking the Ironic logs for that node ID:

2018-11-26 02:25:26.828 1 ERROR ironic.drivers.modules.agent_base_vendor [req-98b72aaf-5185-4ab0-8df2-5c1618629210 - - - - -] Asynchronous exception: Node failed to deploy. Exception: Failed to install a bootloader when deploying node b94daa75-82f9-4381-9aa4-52ed77de3431. Error: {u'message': u'Command execution failed: Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpoRdOMm /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg"\nExit code: 127\nStdout: u\'\'\nStderr: u\'/etc/default/grub: line 6: no_timer_check: command not found\\n\'.', u'code': 500, u'type': u'CommandExecutionError', u'details': u'Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpoRdOMm /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg"\nExit code: 127\nStdout: u\'\'\nStderr: u\'/etc/default/grub: line 6: no_timer_check: command not found\\n\'.'} for node b94daa75-82f9-4381-9aa4-52ed77de3431: InstanceDeployFailure: Failed to install a bootloader when deploying node b94daa75-82f9-4381-9aa4-52ed77de3431. Error: {u'message': u'Command execution failed: Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpoRdOMm /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg"\nExit code: 127\nStdout: u\'\'\nStderr: u\'/etc/default/grub: line 6: no_timer_check: command not found\\n\'.', u'code': 500, u'type': u'CommandExecutionError', u'details': u'Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpoRdOMm /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg"\nExit code: 127\nStdout: u\'\'\nStderr: u\'/etc/default/grub: line 6: no_timer_check: command not found\\n\'.'}
           

The exact problem is that the shell cannot find a command named no_timer_check:

Installing GRUB2 boot loader to device /dev/sda failed with Unexpected error while running command.\nCommand: chroot /tmp/tmpoRdOMm /bin/sh -c "grub2-mkconfig -o /boot/grub2/grub.cfg"\nExit code: 127\nStdout: u\'\'\nStderr: u\'/etc/default/grub: line 6: no_timer_check: command not found\\n\'.'}
           

Looking at the /etc/default/grub file within the QCOW2 image, we can see where this is coming from:

GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0
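
The same inspection can be turned into a quick pass/fail check before an image is uploaded. The sketch below is an illustration, not part of the original solution: cmdline_is_quoted is a hypothetical helper name, and it only tests that the GRUB_CMDLINE_LINUX line is a single double-quoted value with nothing after the closing quote. It reads the file contents on standard input, which could come from a tool such as virt-cat run against the image.

```shell
# Hypothetical check: succeed only if the GRUB_CMDLINE_LINUX line consists of
# one double-quoted value with nothing after the closing quote.
cmdline_is_quoted() {
  grep -Eq '^GRUB_CMDLINE_LINUX="[^"]*"[[:space:]]*$'
}

broken='GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto" console=ttyS0,115200n8 no_timer_check net.ifnames=0'
fixed='GRUB_CMDLINE_LINUX="console=tty0 crashkernel=auto console=ttyS0,115200n8 no_timer_check net.ifnames=0"'

# Prints "misquoted" for the broken line and "ok" for the fixed one.
for line in "$broken" "$fixed"; do
  if printf '%s\n' "$line" | cmdline_is_quoted; then
    echo "ok"
  else
    echo "misquoted"
  fi
done
```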
           
