Uploaded by Emerson D. Camara

Document 2772938.1-Cannot find master domain

03/05/2023, 12:07
Document 2772938.1
Copyright (c) 2023, Oracle. All rights reserved. Oracle Confidential.
OLVM: DataCenter Non-Responsive with Error "Cannot find master domain" (Doc ID 2772938.1)
In this Document
Symptoms
Cause
Solution
References
APPLIES TO:
Linux OS - Version Oracle Linux 7.9 with Unbreakable Enterprise Kernel [5.4.17] and later
Linux x86-64
SYMPTOMS
All hosts in the DC become non-operational:
2021-04-27 22:19:47,441-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE
VDSM xxx1 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb
2021-04-27 22:22:39,857-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE
VDSM xxx2 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb
2021-04-27 22:22:40,406-04 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand] (
ConnectStoragePoolVDSCommandParameters:{hostId='e3768de0-0baa-4576-8ff7-afbcee94f605', vdsId='e3768de0-0baa
execution failed: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master do
2021-04-27 22:22:41,057-04 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE
VDSM xxx3 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb
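To see at a glance which hosts hit this failure, the engine log messages above can be scanned with a short script. The following is a minimal sketch, not an Oracle-provided tool: the function name and the regular expression are assumptions made for illustration, and the spUUID is matched only as far as it appears in the (truncated) excerpt.

```python
import re

# Pattern for the audit-log message shown in the excerpt above.
# The spUUID is truncated in the excerpt, so any run of hex digits
# and dashes is accepted (assumption for illustration).
FAILURE_RE = re.compile(
    r"VDSM (?P<host>\S+) command ConnectStoragePoolVDS failed: "
    r"Cannot find master domain: u?'spUUID=(?P<sp_uuid>[0-9a-fA-F-]+)"
)


def hosts_missing_master_domain(lines):
    """Map each storage pool UUID (as logged) to the sorted hosts that failed."""
    pools = {}
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            pools.setdefault(match.group("sp_uuid"), set()).add(match.group("host"))
    return {sp_uuid: sorted(hosts) for sp_uuid, hosts in pools.items()}


# Message lines copied from the excerpt above (still truncated).
SAMPLE = [
    "VDSM xxx1 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb",
    "VDSM xxx2 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb",
    "VDSM xxx3 command ConnectStoragePoolVDS failed: Cannot find master domain: u'spUUID=7c7903b6-c199-4ef1-97fb",
]

if __name__ == "__main__":
    for sp_uuid, hosts in hosts_missing_master_domain(SAMPLE).items():
        print(sp_uuid, hosts)
```

If every host in the Data Center appears under the same storage pool, the problem is unlikely to be specific to one host.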
DataCenter becomes non-responsive:
2021-04-27 16:17:35,146-04 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-M
SYSTEM_CHANGE_STORAGE_POOL_STATUS_PROBLEMATIC_WITH_ERROR(987), Invalid status on Data Center xxx. Setting Da
System logs on all hosts report offline paths and I/O errors on the master domain:
Apr 27 15:20:04 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 15:20:09 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 1124478346 op 0x1:(WRITE) flag
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 104762199904 op 0x1:(WRITE) fl
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 12758732825 op 0x1:(WRITE) fla
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 26579493016 op 0x1:(WRITE) fla
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 11248957761 op 0x1:(WRITE) fla
Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 9314783624 op 0x0:(READ) flag
Apr 27 15:20:09 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: Disable queueing
...
Apr 27 22:58:48 host2 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy
Apr 27 22:59:05 host2 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 22:59:07 host2 kernel: blk_update_request: I/O error, dev dm-8, sector 264192 op 0x0:(READ) flags 0x0
Apr 27 22:59:08 host2 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy
Apr 27 22:59:10 host2 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 22:59:15 host2 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 22:59:17 host2 kernel: blk_update_request: I/O error, dev dm-8, sector 264192 op 0x0:(READ) flags 0x0
Apr 27 22:59:20 host2 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline
Apr 27 23:02:48 host3 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy
Apr 27 23:02:48 host3 vdsm[5068]: ERROR Unhandled exception in <Task discardable <UpdateVolumes vm=464
2Traceback (most recent call last):#012 File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315,
91, in __call__#012 self._callable()#012 File "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line
riodic.py", line 357, in _execute#012 self._vm.updateDriveVolume(drive)#012 File "/usr/lib/python2.7/site-pa
/lib/python2.7/site-packages/vdsm/virt/vm.py", line 6101, in _getVolumeSize#012 (domainID, volumeID))#012Sto
lume 0281c278-7834-48a0-90b7-110a384b1561
....
Apr 27 23:02:48 host3 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy
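The same pattern can be checked quickly across many hosts by counting the two event types seen in the excerpts above. The following is a minimal sketch under stated assumptions: the function name and regular expressions are illustrative only, and the patterns are written against the line layout shown in this document, which may differ on other kernel or multipath-tools versions.

```python
import re
from collections import Counter

# multipathd reports a failed path as "<WWID>: <path device> - path offline".
PATH_OFFLINE_RE = re.compile(
    r"(?P<host>\S+) multipathd: (?P<wwid>\w+): (?P<path>\w+) - path offline"
)
# The kernel reports failed requests against the multipath (dm-N) device.
IO_ERROR_RE = re.compile(
    r"(?P<host>\S+) kernel: blk_update_request: I/O error, dev (?P<dev>[\w-]+), "
    r"sector \d+ op 0x[0-9a-f]+:\((?P<op>\w+)\)"
)


def summarize(lines):
    """Count path-offline events and I/O errors per host and device."""
    offline = Counter()    # key: (host, WWID, path device)
    io_errors = Counter()  # key: (host, dm device, READ/WRITE)
    for line in lines:
        match = PATH_OFFLINE_RE.search(line)
        if match:
            offline[(match["host"], match["wwid"], match["path"])] += 1
            continue
        match = IO_ERROR_RE.search(line)
        if match:
            io_errors[(match["host"], match["dev"], match["op"])] += 1
    return offline, io_errors


# Sample lines assembled from the excerpts above (messages remain truncated).
SAMPLE = [
    "Apr 27 15:20:04 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline",
    "Apr 27 15:20:09 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline",
    "Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 1124478346 op 0x1:(WRITE) flag",
    "Apr 27 23:02:48 host3 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy",
]

if __name__ == "__main__":
    offline, io_errors = summarize(SAMPLE)
    for key, count in offline.items():
        print("path offline", key, count)
    for key, count in io_errors.items():
        print("I/O error", key, count)
```

When several hosts report the same WWID going offline, as in the excerpts above, that points toward the shared storage rather than an individual host.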
CAUSE
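The I/O errors indicate a problem on the storage side. VDSM is only the victim: because the hosts lose their path to the master domain and reads and writes to it fail, ConnectStoragePoolVDS fails with "Cannot find master domain".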
SOLUTION
Please engage the storage team to check the storage logs for any of the following issues on the storage side:
- Faulty disks
- Network connection problems (switch issue, bad cables, lost iSCSI target, etc.)
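When handing the case to the storage team, it helps to give them the time at which each host first saw the problem, so they can look at the matching window in the storage logs. The following is a minimal sketch, assuming syslog lines in the layout shown under Symptoms; the function name is hypothetical, the year must be supplied because syslog timestamps omit it, and month-name parsing assumes an English locale.

```python
from datetime import datetime


def first_error_per_host(lines, year):
    """Return {host: datetime of its earliest path-offline or I/O error line}."""
    first = {}
    for line in lines:
        if "path offline" not in line and "I/O error" not in line:
            continue
        # Expected layout: month, day, time, host, rest of message.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue
        month, day, clock, host = parts[:4]
        try:
            when = datetime.strptime(
                f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            continue  # not a syslog-style line, skip it
        if host not in first or when < first[host]:
            first[host] = when
    return first


# Sample lines assembled from the excerpts under Symptoms (messages truncated).
SAMPLE = [
    "Apr 27 15:20:09 host1 kernel: blk_update_request: I/O error, dev dm-8, sector 1124478346 op 0x1:(WRITE) flag",
    "Apr 27 15:20:04 host1 multipathd: 3600144f0d329c44b00005ee2686e0003: sdj - path offline",
    "Apr 27 22:58:48 host2 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy",
    "Apr 27 23:02:48 host3 kernel: blk_update_request: I/O error, dev dm-8, sector 0 op 0x0:(READ) flags 0x0 phy",
]

if __name__ == "__main__":
    for host, when in sorted(first_error_per_host(SAMPLE, 2021).items()):
        print(host, when.isoformat())
```

The year 2021 used here is taken from the engine log timestamps in this document.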
REFERENCES
NOTE:2727849.1 - OLVM: Frequent VM Paused With Error "unknown storage error"
https://support.oracle.com/epmos/faces/DocumentDisplay?_adf.ctrl-state=toiftfxm2_4&id=2772938.1