The interconnect can use Ethernet (100 Mbps or greater) or SCI
(Scalable Coherent Interface, a high-speed cluster interconnect protocol). It is
most effective for clusters with medium to large datasets; the recommended configuration
is 1–8 nodes with 16 GB of RAM each.
Because the majority of the data is stored in memory, the cluster must have enough
memory to store as many redundant copies of the full working set as the application
dictates. This number is called the replication factor. With a replication factor of 2,
each piece of data is stored on two separate servers, and you can lose only one server
out of the cluster without losing data.
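The sizing arithmetic above can be sketched in a few lines of Python. This is a rough estimate only: the function names are hypothetical, and the sketch assumes the data is spread evenly across the data nodes and ignores index and bookkeeping overhead.

```python
def memory_per_data_node_gb(working_set_gb, replication_factor, data_nodes):
    """Approximate RAM each data node needs to hold its share of all copies.

    Assumes an even spread of data and no per-row or index overhead.
    """
    if data_nodes < replication_factor:
        # Each copy of a piece of data must live on a separate server.
        raise ValueError("need at least one data node per copy")
    total_gb = working_set_gb * replication_factor
    return total_gb / data_nodes


def worst_case_tolerated_failures(replication_factor):
    """Servers that can always be lost without losing data."""
    return replication_factor - 1


# Two data nodes, replication factor 2: each node holds the full working set.
print(memory_per_data_node_gb(12, 2, 2))   # 12.0
# Four data nodes, replication factor 2: each node holds half of the total.
print(memory_per_data_node_gb(12, 2, 4))   # 6.0
print(worst_case_tolerated_failures(2))    # 1
```

With two data nodes and a replication factor of 2, each node needs room for the entire working set; adding data nodes reduces the per-node requirement but not the number of copies.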
For high availability, at least three physical servers must be used: two data nodes and
a management node. The management node is needed to arbitrate between the two
data nodes if they become disconnected and out of synchronization with each other.
A replication factor of 2 is used, so the two data nodes must each have enough memory
to hold the working set, unless disk storage is used.
Since the Cluster software is simply a storage engine, the cluster is accessed through
a standard MySQL server with tables defined with the NDB backend. The server
accesses the cluster to fulfill requests from the client. The overall architecture is
shown in Figure 4-3.
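From the client's point of view, the only visible difference is the storage engine named when a table is created. The following is a minimal sketch that builds such a statement as a string; the helper name, the table name, and the columns are hypothetical, and it assumes the MySQL server has been started with cluster support so that the `NDBCLUSTER` engine is available.

```python
def ndb_create_table_sql(table, columns):
    """Build a CREATE TABLE statement that stores the table in the cluster.

    `columns` is a list of (name, SQL type) pairs. The names are inserted
    verbatim, so this is for illustration only, not for untrusted input.
    """
    column_sql = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table} ({column_sql}) ENGINE=NDBCLUSTER"


statement = ndb_create_table_sql(
    "sessions",  # hypothetical table name
    [("id", "INT NOT NULL PRIMARY KEY"), ("payload", "VARCHAR(255)")],
)
print(statement)
```

The resulting statement is sent to the MySQL server like any other DDL; queries against the table afterward use ordinary SQL, and the server fetches the rows from the data nodes on the client's behalf.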
Because the mysqld servers only differ from nonclustered servers in their backend,
they can be replicated with binlogs just as nonclustered servers can.