I have read all the relevant documentation I could find, but I still have questions.
What I have read:
From http://wiki.apache.org/cassandra/Operations#Moving_nodes:
If you add nodes to your cluster your ring will be unbalanced and only way to get perfect balance is to compute new tokens for every node and assign them to each node manually by using nodetool move command.
From http://www.datastax.com/docs/1.1/operations/cluster_management#adding-capacity-to-an-existing-cluster:
If you need to increase capacity by a non-uniform number of nodes, you must recalculate tokens for the entire cluster, and then use nodetool move to assign the new tokens to the existing nodes. After all nodes are restarted with their new token assignments, run a nodetool cleanup to remove unused keys on all nodes
But I am not clear about the order of these things.
Can you explain how to do this in the following scenario?
> I am using Cassandra 1.1.9, so no virtual nodes are used.
> I have a cluster ring with 5 nodes, each owning 20% of the tokens.
> The tokens are:
> 0
> 34028236692093846346337460743176821145
> 68056473384187692692674921486353642291
> 102084710076281539039012382229530463436
> 136112946768375385385349842972707284582
I want to add 2 additional nodes.
What steps must I follow? I know that I should install and configure Cassandra on the new nodes, use the original 5 as seeds, and calculate the new tokens, but in what order should I use nodetool move to move the data? Is it one node at a time?
When I move the first node, what happens to its data? Is it available at all times?
Should I start the two new nodes before moving the original 5 to their new tokens?
A step-by-step guide would be ideal.
Please note that I need to complete this before upgrading to version 1.2.
The new tokens should be:
> 0
> 24305883351495604533098186245126300818
> 48611766702991209066196372490252601636
> 72917650054486813599294558735378902454
> 97223533405982418132392744980505203272
> 121529416757478022665490931225631504090
> 145835300108973627198589117470757804908
They are calculated as i * (2^127 / 7) for i = 0 through 6, using integer division.
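For anyone who wants to reproduce the list, here is a minimal Python sketch of that calculation. It assumes the RandomPartitioner token space of 2^127; the function name `balanced_tokens` is only for illustration and is not part of Cassandra.

```python
# RandomPartitioner tokens range over 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced tokens for a ring with node_count nodes."""
    step = RING_SIZE // node_count  # integer division, as in the list above
    return [i * step for i in range(node_count)]


for token in balanced_tokens(7):
    print(token)
```

Calling `balanced_tokens(7)` gives the seven tokens listed above.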
What steps do I have to follow? In what order should I move the data using nodetool move?
You should:
> Bootstrap one new node at token 48611766702991209066196372490252601636
> Bootstrap the other new node at token 121529416757478022665490931225631504090
> Move 34028236692093846346337460743176821145 to 24305883351495604533098186245126300818
> Move 68056473384187692692674921486353642291 to 72917650054486813599294558735378902454
> Move 102084710076281539039012382229530463436 to 97223533405982418132392744980505203272
> Move 136112946768375385385349842972707284582 to 145835300108973627198589117470757804908
(I have tried to minimize the amount of data transferred. It may not be optimal, but it is close enough, and since you probably have some data imbalance already it will not make much difference.)
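The order above can be written out as a checklist. The sketch below only prints the steps and does not run anything. The host names `node1` to `node7` and the helper `build_plan` are hypothetical, and the mapping of hosts to tokens is an assumption, so substitute your own addresses. It relies on the `initial_token` setting in cassandra.yaml and on the `nodetool move` and `nodetool cleanup` commands, with cleanup run on all nodes at the end as the DataStax text quoted in the question describes.

```python
# Hypothetical host names; node1 keeps token 0 and is never moved.
BOOTSTRAPS = [
    ("node6", 48611766702991209066196372490252601636),
    ("node7", 121529416757478022665490931225631504090),
]
MOVES = [
    ("node2", 24305883351495604533098186245126300818),
    ("node3", 72917650054486813599294558735378902454),
    ("node4", 97223533405982418132392744980505203272),
    ("node5", 145835300108973627198589117470757804908),
]
ALL_HOSTS = ["node1", "node2", "node3", "node4", "node5", "node6", "node7"]


def build_plan(bootstraps, moves, all_hosts):
    """Return the steps in order: bootstraps, then moves, then cleanup."""
    steps = []
    for host, token in bootstraps:
        steps.append(
            f"{host}: set initial_token: {token} in cassandra.yaml, "
            "start Cassandra and wait for it to finish joining"
        )
    for host, token in moves:
        steps.append(f"nodetool -h {host} move {token}")
    for host in all_hosts:
        steps.append(f"nodetool -h {host} cleanup")
    return steps


for number, step in enumerate(build_plan(BOOTSTRAPS, MOVES, ALL_HOSTS), start=1):
    print(f"{number}. {step}")
```

Run each step only after the previous one has finished, so that only one node is streaming at a time.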
Is it one at a time?
You should bootstrap one node at a time and then move one token at a time. This avoids placing excessive load on the cluster while data is streaming.
What happens with the data when I move the first one? Is it available at all times?
Data is fully available during the move. The moving node takes part in reads and writes for both its old range and its new range, so you can keep reading and writing throughout.
Should I start the two new nodes before moving the original 5 to their new tokens?
It is better to have more nodes in the cluster first. If you move first, some nodes will temporarily hold twice as much data as others.
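To make the "twice as much data" point concrete, here is a rough Python sketch that compares the two orderings. It assumes RandomPartitioner (token space 2^127), evenly distributed keys, and that each node owns the range from the previous token up to its own; replication is ignored. The function `ownership` is illustrative only.

```python
RING_SIZE = 2 ** 127


def ownership(tokens):
    """Map each token to the fraction of the ring its node owns."""
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        span = (token - previous) % RING_SIZE or RING_SIZE
        shares[token] = span / RING_SIZE
    return shares


# Moving first: the 5 original nodes sit on their new tokens, new nodes absent.
move_first = [
    0,
    24305883351495604533098186245126300818,
    72917650054486813599294558735378902454,
    97223533405982418132392744980505203272,
    145835300108973627198589117470757804908,
]

# Bootstrapping first: original tokens unchanged, two new nodes already joined.
bootstrap_first = [
    0,
    34028236692093846346337460743176821145,
    48611766702991209066196372490252601636,
    68056473384187692692674921486353642291,
    102084710076281539039012382229530463436,
    121529416757478022665490931225631504090,
    136112946768375385385349842972707284582,
]

for name, ring in (("move first", move_first), ("bootstrap first", bootstrap_first)):
    shares = ownership(ring)
    print(name, "largest share:", round(max(shares.values()), 4))
```

With these assumptions, moving first leaves two nodes owning 2/7 of the ring each while the others own 1/7, whereas bootstrapping first never pushes any node above the 20% it already had.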