master and its corresponding replica are running on the same AZ #152
Comments
Hi, can you elaborate on the indications you used to detect the issue?
Oops, sorry, wrong script input. This is the intended one showing the issue:
I do see that
We size our nodes resource-wise so that each node can fit only one pod.
Yes, that is a very good discussion to have.
Regarding a failure of an entire AZ:
Regarding managing the resources during an AZ outage:
Thanks for the detailed explanation. I also read the recommended section, great piece of work! I think you got things right with regard to the
My question concerns our production environment with 3 AZs, where nodes are already allocated. The assumption is one pod per node (only one can fit), and the k8s worker nodes are already available and spread evenly across all AZs. Assume we have a deployment of 3 master pods (like) that is spread correctly across all 3 AZs (thanks to the soft rule on
Thanks for the feedback on our wiki :) We had a few days of discussion about this one; the conclusions we reached were:
Re-attempting to deploy in a way that allows all pods to be created is equivalent to re-attempting to deploy in a way that fits our standards for pod placement. With a hard rule, the problem is easier to detect but harder to mitigate. I believe each strategy has pros and cons.
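To make the soft-versus-hard trade-off above concrete, here is a toy sketch in Python. It is not the operator's scheduling logic or the Kubernetes scheduler; `pick_zone`, its parameters, and the zone names are hypothetical. It assumes one pod per node, and that "hard" and "soft" correspond to a required and a preferred zone anti-affinity rule respectively.

```python
from typing import Dict, Optional, Set


def pick_zone(free_nodes: Dict[str, int],
              occupied_zones: Set[str],
              hard: bool) -> Optional[str]:
    """Pick a zone for a new pod of a shard (toy model, one pod per node).

    free_nodes:     zone -> number of free nodes in that zone.
    occupied_zones: zones already hosting a pod of the same shard.
    hard:           True models a required rule, False a preferred one.
    Returns the chosen zone, or None if the pod would stay unscheduled.
    """
    # Zones that still have capacity, in a stable order.
    candidates = sorted(z for z, n in free_nodes.items() if n > 0)
    # Zones that also respect the spread rule.
    compliant = [z for z in candidates if z not in occupied_zones]
    if compliant:
        return compliant[0]
    if hard or not candidates:
        # Hard rule: the pod is not placed, which is easy to notice
        # but needs new capacity in another zone to fix.
        return None
    # Soft rule: the pod is placed anyway, next to its shard peer.
    return candidates[0]
```

With free capacity only in the zone that already hosts the master, the hard variant returns `None` (the pod is never created), while the soft variant returns that same zone (the pod runs, but the placement standard is silently violated and has to be detected separately).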
OK, thanks. I will go ahead with these assumptions and will update if anything new pops up; otherwise I will close the issue next week.
@NataliAharoniPayu @NataliAharoni99 @DinaYakovlev @voltbit @doronl @drealecs Since this is still in the open state, I am a bit confused. Is Node Aware Replication working yet or not?
Hi @NataliAharoniPayu,
There is a cluster with 3 master and 3 replica nodes:
This is the output of the CLUSTER NODES command:
This is the result of a script that I ran to find out where (in Azure in this case) each pod is running (format: pod:podIP node node-AZ):
Attached are the operator logs.
(BTW, I have a 2nd cluster in another namespace that is running correctly)
operator.log
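For anyone who wants to reproduce the check, below is a minimal Python sketch of how the two outputs above could be cross-referenced. It is not the script used in this report. It assumes the standard `CLUSTER NODES` line layout (node ID, `ip:port@cport`, flags, master ID or `-`, ...) and the `pod:podIP node node-AZ` format described above; the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def parse_cluster_nodes(text: str) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map node ID -> (ip, master ID or None) from CLUSTER NODES output."""
    nodes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete CLUSTER NODES line
        node_id, addr, master_id = fields[0], fields[1], fields[3]
        # addr looks like "ip:port@cport"; keep only the IP.
        ip = addr.split("@")[0].rsplit(":", 1)[0]
        nodes[node_id] = (ip, None if master_id == "-" else master_id)
    return nodes


def parse_pod_zones(text: str) -> Dict[str, str]:
    """Map pod IP -> AZ from lines of the form 'pod:podIP node node-AZ'."""
    zones = {}
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 3 or ":" not in parts[0]:
            continue  # skip lines that do not match the assumed format
        pod_ip = parts[0].split(":", 1)[1]
        zones[pod_ip] = parts[2]
    return zones


def colocated_pairs(cluster_nodes: str,
                    pod_zones: str) -> List[Tuple[str, str, str]]:
    """Return (master IP, replica IP, AZ) for each replica sharing its master's AZ."""
    nodes = parse_cluster_nodes(cluster_nodes)
    zones = parse_pod_zones(pod_zones)
    pairs = []
    for ip, master_id in nodes.values():
        if master_id is None or master_id not in nodes:
            continue  # a master, or a replica whose master is not listed
        master_ip = nodes[master_id][0]
        zone = zones.get(ip)
        if zone is not None and zone == zones.get(master_ip):
            pairs.append((master_ip, ip, zone))
    return pairs
```

Passing the `CLUSTER NODES` output and the pod listing as two strings to `colocated_pairs` returns one tuple per replica that runs in the same AZ as its master; an empty list means no such pair was found among the pods the listing covers.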