Thursday, October 1, 2015

AWS Mesosphere DCOS Copy Launch Configuration error

I was doing some experimentation with AWS Mesosphere and found something weird:

a. Create a small AWS Mesosphere cluster (1 master, 2 private slaves, 0 public slaves). Wait for the whole cluster to be created.

b. Go to AWS Launch Configurations, right-click the SlaveLaunchConfig, then choose Copy Launch Configuration. Do not modify anything; just click Create Launch Configuration. A new Launch Configuration will be created as SlaveLaunchConfigXXXXXCopy.

c. Compare the two configurations: everything, even the User Data, is exactly the same.

d. Go to Auto Scaling Groups, select the SlaveServerGroup that uses the SlaveLaunchConfig, and click Edit in the Details tab at the bottom. Then pick the SlaveLaunchConfigXXXXXXXCopy and save.

e. Go to Instances, right-click one of the slaves (one without a public DNS), then choose Instance State, Stop. This effectively terminates the instance because it is part of an Auto Scaling Group. Once it is terminated, wait a few minutes; the Auto Scaling Group will kick in and create a new instance, this time with the new Launch Configuration.

f. The new instance will start, reach the Status Checks, and then be terminated. Auto Scaling will then attempt to create another one, which is shut down and terminated too, again and again.

g. In Auto Scaling Groups, switch back to the original Launch Configuration; the next instance created will be fine.
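One way to look into the termination loop from step f is to read the Auto Scaling Group's activity history, which records a description and cause for each launch and termination. The following is only a minimal sketch: `summarize_activities` is a hypothetical helper, and it assumes the activity dictionaries have the shape returned by boto3's `describe_scaling_activities` (`StatusCode`, `Description`, `Cause`, `StatusMessage`).

```python
def summarize_activities(activities):
    """Return one readable line per scaling activity.

    `activities` is assumed to be the "Activities" list from
    describe_scaling_activities; missing fields are skipped.
    """
    lines = []
    for act in activities:
        parts = [
            act.get("Description", ""),
            act.get("Cause", ""),
            act.get("StatusMessage", ""),
        ]
        detail = " | ".join(p for p in parts if p)
        lines.append(f"[{act.get('StatusCode', 'Unknown')}] {detail}")
    return lines


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# client = boto3.client("autoscaling")
# resp = client.describe_scaling_activities(
#     AutoScalingGroupName="<name of the SlaveServerGroup>")
# print("\n".join(summarize_activities(resp["Activities"])))
```

The cause text may indicate whether the instance was replaced because of a failed health check or for some other reason, which would narrow down where to look next.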

So, in other words, I can only think that something went wrong during the Launch Configuration copy; however, I have checked all the parameters and they all look the same.
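Comparing by eye in the console can miss a field, so it may be worth diffing the two Launch Configurations as returned by the API. This is a rough sketch, not a diagnosis: `diff_launch_configs` is a hypothetical helper, and it assumes the dictionaries come from boto3's `describe_launch_configurations`, with `UserData` base64-encoded. Fields that necessarily differ between an original and its copy (name, ARN, creation time) are ignored.

```python
import base64

# Fields that always differ between a configuration and its copy.
IGNORED_KEYS = {"LaunchConfigurationName", "LaunchConfigurationARN", "CreatedTime"}


def diff_launch_configs(original, copy):
    """Return {field: (original_value, copy_value)} for every differing field."""
    diffs = {}
    for key in sorted(set(original) | set(copy)):
        if key in IGNORED_KEYS:
            continue
        a, b = original.get(key), copy.get(key)
        if key == "UserData":
            # Assumption: UserData is base64-encoded; decode it so the
            # diff shows readable script text.
            a = base64.b64decode(a or "").decode("utf-8", "replace")
            b = base64.b64decode(b or "").decode("utf-8", "replace")
        if a != b:
            diffs[key] = (a, b)
    return diffs


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# client = boto3.client("autoscaling")
# resp = client.describe_launch_configurations(
#     LaunchConfigurationNames=["<original name>", "<copy name>"])
# by_name = {c["LaunchConfigurationName"]: c for c in resp["LaunchConfigurations"]}
# print(diff_launch_configs(by_name["<original name>"], by_name["<copy name>"]))
```

An empty result would suggest the two configurations really are equivalent at the API level; any entry it returns is a field worth checking.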

Can anyone at Mesosphere or AWS shed some light, please?

Thanks!

Guimo



