CloudStack KVM agent guid logic #12438

@yadvr

problem

On KVM, the libvirt discoverer creates and uses a guid based on the host's IP address. However, when the IP address of a KVM host is changed or swapped with that of another host, this can cause side effects such as the new host trying to re-add/configure itself against an old DB entry in the cloud.host table. It can cause further issues if the host has local storage, and can put the host in the Alert state. When the host is then force-removed, its VMs, volumes and local storage pool can be deleted, causing data loss.

versions

Reproduced with ACS 4.22 with mixed-arch KVM hosts, in an advanced zone.

The steps to reproduce the bug

In my env, I swapped the IP addresses of two KVM hosts, and removed one of the hosts to add it to another zone while keeping the same mgmt network IP. The zones have two physical networks: cloudbr0 carries guest and public traffic, and cloudbr1 carries mgmt/storage traffic.

First, I observed the host being flaky as it tried to occupy the same host VO/row, so I force-removed it, which destroyed all VMs and volumes on that KVM host. Luckily I had volume snapshots, so I could recover them later on, but I hit an NPE issue that I fixed in a recent PR.

Next, I had to debug the code and errors to understand that the culprit was the guid logic, which creates a UUID based on the IP address string. Essentially this is a design bug; a long-term fix may need to be discussed.
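
To illustrate why an IP-derived guid follows the address rather than the machine, here is a minimal sketch. It is not copied from the CloudStack source: the class `GuidSketch`, the helper `guidFromIp`, and the example IP addresses are all hypothetical, and it assumes the guid is a name-based UUID computed from the IP string via `java.util.UUID.nameUUIDFromBytes`.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class GuidSketch {
    // Hypothetical helper (assumption, not the actual CloudStack method):
    // derives a deterministic, name-based UUID from the IP address string.
    static String guidFromIp(String ip) {
        return UUID.nameUUIDFromBytes(ip.getBytes(StandardCharsets.UTF_8)).toString();
    }

    public static void main(String[] args) {
        // Example addresses, made up for illustration.
        String ipA = "10.0.0.11";
        String ipB = "10.0.0.12";

        // Guids recorded when host A and host B were first added.
        String guidOfHostA = guidFromIp(ipA);
        String guidOfHostB = guidFromIp(ipB);

        // After swapping the IPs, host B now reports from ipA and
        // computes the guid that was originally recorded for host A.
        String guidOfHostBAfterSwap = guidFromIp(ipA);

        System.out.println("distinct before swap: " + !guidOfHostA.equals(guidOfHostB));
        System.out.println("host B now collides with host A's old guid: "
                + guidOfHostBAfterSwap.equals(guidOfHostA));
    }
}
```

Because the UUID is a pure function of the IP string, whichever machine holds a given address computes the guid that was stored for the previous holder of that address. Under this assumption, that is how a host could end up matching an old row in cloud.host.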

What to do about it?

No response
