Monday, 7 May 2007

Questions and Answers in the Second Session of the Online Solutions with MySQL Webinar - On Replication

Correction on the INSERT DELAYED
In slide 20 I have mentioned that the INSERT DELAYED statement can increase performance on the slave. This is wrong, since the DELAYED keyword is ignored by the SQL thread on the slave server. The INSERT DELAYED statement can increase the overall performance of an application since the control is returned to the client as soon as the row is queued into the list of inserts to execute. The INSERT DELAYED can be used with MyISAM, MEMORY and ARCHIVE.

Q from Filip: Does master & slave have to be the same db-version, and the same Operative system?
Not necessarily, you can have different versions and operating systems

Q from Danilo: Is there a way to load a backup from the master without locking tables or shutting down the master database?
Using Innodb, you can use mysqldump to produce a consistent backup without locking the tables

Q from Henk: How are primairy keys communicated back from the slave to the master
There is no need: the primary keys generated on the master are pushed "as is" on the slave / they are not regenerated
Q: How does a slave follow up an insert?
It receives the keys generated by the master, so if the slave implements foreign keys (although they would not be necessary), the data on the DB is consistent

Q from Martin: How can mysql prevent binary log corruption when link between master and slave breaks? This has happened to me, which stopped the replication thread indefinitely
The IO thread on the slave stops pumping the log. As soon as the master restart the slave restart from where it stopped

Q from Brendan: What is the latency between a transaction being committed on the master to the same transaction being committed on the slave in a replicated environment?
It depends on several factors - hardware, storage engine, workload etc. From a replication point of view, the transaction is first recorded on the relay log and then it's applied to the slave.

Q from Helen: Are all the DBs on the master server replicated? Can you selectively replicate databases?
You can selectively include or exclude DBs. This works down to the table level

Q from James: Are client-updates prohibited on a slave system when the master is active? If not then how do you prevent inconsistencies?
They are allowed. The slave can be used for any task. it is your responsibility to write only on the master. In order to make the slave read-only, you should work at a user-security level

Q from Jonas: If I understand right, I could configure the master on both servers to address them with a virtual IP, is that correct?
Yes, that is correct, you can do it in order to simplify and speed up the failover

Q from Volker: The HA solution can only work if the binlog positions on Master and HA slave are the same, right?
No, the slave can be a little delayed. For synchronous HA we use others techniques like IO replication (DRBD) or shared disk architecture

Q from Paul: How does master-master replication perform with cross continent latency? eg: UK & USA
MySQL Replication is asynchronous so it will not be affected by latency. The use of separated threads avoid that the geographical position would affect the activity on the slave DB.

Q from Danilo: What does "Non-deterministic writes" mean?
An example of it is an INSERT statement that contains the RAND() function

Q from Dave: Can replication handle 'load data infile' type operations?
Yes, the data loaded is stored in a temporary file in the tmp directory and passed to the slave through the IO thread. The SQL thread will load the data from a file available in the tmp directory of the slave server. The file are security protected from access as other.

Q from Owen: Could you use blackhole on Master 1 and split data afterwards to mutiples partioned DBs
Yes, blackhoe storage is a good way to build a relay before replication split

Q from Andrew: If the update is async, is there a possibility that an update x made after update y in the server is performed y and then x on the slave?
If done in a different transaction that commit in a reverse order it might happen. The locking mechanism of InnoDB and the default isolation level (repeatable read) make replication consistent

Q from Alessandro: I've never heard about BLACKHOLE storage engine. What is it designed for?
It is a storage engine that store nothing in the db but the binlog is generated anyway. It's used to build a relay server, reduce the use of resources and improve speed

Q from John: Can you specify a master db by host name, or must it be an IP address?
You can use either the hostname or the IP address

Q from Clive: Is there any drawback to using dual-master replication (with auto-incr offsets) to allow automatic client failover with complete safety?
Bear in mind that replication is asyncronous. For a complete safety, MySQL cluster might be the solution. or some other HA techniques : IO replication / shared disk

Q from Rob: Does master & slave have to be the same db-version, and the same Operative system. You can always replicate to a newer version right?
You can replicate to newer version and/or different OSs

Q from Rob: Is it possible to share query cache over multiple slaves?
No, the query cache are on a server basis

Q from Daniel: If I save binlogs in a NAS and net connection/write fails, will the server still work?
The master must be able to write to its current log and to generate the next one. Then, once logs are rotated, they can be pushed anywhere for safety. NAS is a solution.

Q from Carsten: Can 2 masters read from the same database files? (Implementing DFS for example) ?
This can be done using cluster-aware file systems, but there are some aspects to be carefully considered, such as the storage engine to use and the query locking.

Q from Phil: What do you think the impact of placing data and logs on different drives is?
This is an important point to optimize the IOs and it is recommended

Q from Manuel: I think multimaster replication writing on both to a table with autoincrement column will cause heavy fragmentation of data and index. How can this affect to query performance?
If load is fairly balanced between nodes, I do not see this as a problem. In other cases, it might be necessary to periodicaly optimize tables.

Q from Domenico: Is ti possible to replicate from more then one masters to fews slaves?
Currently, a slave can be linked only to one master. There are plans to change this and create multisource replication

Q from Phil: Is the use of high performance RAID important for the logs?
Absolutely, performance on the master will get benefit from this

Q from Alessandro: What is vertical partitioning?
It's a way to separate data by columns instead of rows. For example, if a large table has 10 columns, C1, C2 to C10, one can split this table into two tables: the first table will contain, for example, C1, C2 and C3, which are the most accessed columns for read and write. The second table will contain the primary key and the remaining 7 columns, that will be rarely updated. This approach can improve performance significantly, since the tables are smaller, the columns fit easier in the blocks used by the storage engine and in the caching mechanisms.

Q from Ian: When is it a good idea to choose MySQL Cluster rather than using replication for HA?
Replication and Cluster are two completely different technologies. In short terms, Replication is the right choice for scale-out solutions when there are lots of read operations, such as on web search and read. MySQL Cluster is mainly used when the ratio of read and write operations is closed to 1:1 and when queries affect small resultsets with direct key access or simple joins.

Q from Ap: What is the best method to load data into the slave, use mysqldump or use "load data from master"
LOAD DATA FROM MASTER is deprecated, the best solution is to use mysqldump

Q from Manuel: With master/slave replication you mentioned "automatic failover", how can you do that?
You can use clustering and HA software such as Linux HA to control the activity of the master. The software can switch over to the slave in case of issues. The client application will get access to the new master using a virtual IP address.

Q from Danilo: Is it possible connect master and slave with a serial link?
It's possible to connect master and slaves to any TCP/IP connection

Q from Jon: For those people who aren't lucky enough to have access to MySQL Enterprise - are there any other methods of monitoring slave status (e.g. SNMP)?
Yes, there are some monitoring tools that could help: Munin, Cacti and Ganglia might help. You can find information on these tools in the third webinar of the series (check the slides here)

Q from Jonas: If I have a 2-node cluster, with Heartbeat, each node has one IP (let's say IP1 and IP2, and share a VIP), can I configure both servers to be slaves, and the master be the VIP? Is not the active node both master and slave of himself?
Yes, in a sense, you can have a circular master replication and the master is accessible using the VIP

Q from Owen: What is best configuration for HA in a MyISAM environment with high OTW but also high muti joined selects for reporting?
IMO, the multi-joined selects are not affected by the HA solutions provided by MySQL. MyISAM can certainly help (provided that the system is generally safe from index corruption. If the requirement is to have a reporting server with multi-joined selects aside a classic OLTP server, then replication is probably the best solution.

Q from Manuel: When you configure a slave, you didn´t configure the log pos, why?
In the example, we supposed to start from a fresh installation. The log position is important when there the master and the slave must be synchronised while the system is running.

No comments: