In terms of working with highly loaded databases, any badly-designed request can cause significant performance degradation. Depending on the used DBMS features, various methods are used for query optimization.
However, a general rule can be pointed out for all types of DBMS: the fewer server resources are expended when executing a query, the more effective will be the query itself.
To detect bottlenecks, MySQL provides the EXPLAIN command. It will help tell you what exactly the database server does, when you execute the query.
To obtain data on executing a query, it is enough to set EXPLAIN before each targeted query. However, putting EXPLAIN for a large number of constructs can become rather tedious.
To simplify search for a problem query let us use MySQL Profiler in dbForge Studio for MySQL. It provides such commands as a Query Profiling Mode and Generate Execution Plan.
To demonstrate work with execution plan, let’s consider a small sample. Suppose we need to find vendors of PCs, but not laptops, basing on the following data:
CREATE TABLE models (
model_id INT(11) NOT NULL PRIMARY KEY,
model_type VARCHAR(10) NOT NULL
) ENGINE = INNODB;
CREATE TABLE products (
maker_id INT(11) NOT NULL,
model_id INT(11) NOT NULL,
PRIMARY KEY (maker_id, model_id),
CONSTRAINT FK_products_models_model_id
FOREIGN KEY (model_id)
REFERENCES models (model_id)
ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE = INNODB;
INSERT INTO models(model_id, model_type)
VALUES
(1, 'PC'), (2, 'Laptop'), (3, 'Tablet'),
(4, 'Phone'), (5, 'PC'), (6, 'Laptop');
INSERT INTO products(maker_id, model_id)
VALUES
(1, 1), (1, 2), (1, 3), (2, 4),
(4, 4), (2, 5), (3, 5), (3, 6);
This task can be solved quite trivial:
SELECT DISTINCT p.maker_id
FROM products AS p
JOIN models m ON p.model_id = m.model_id
WHERE m.model_type = 'PC'
AND NOT EXISTS (
SELECT p2.maker_id
FROM products AS p2
JOIN models m2 ON p2.model_id = m2.model_id
WHERE m2.model_type = 'Laptop'
AND p2.maker_id = p.maker_id
);
When designing a query, one always ought to pay attention to re-readings from the same tables, since they bear an additional load on the database server disk subsystem:
The query can be rewritten with no re-readings, using aggregating functions:
SELECT p.maker_id
FROM products AS p
JOIN models AS m ON p.model_id = m.model_id
WHERE m.model_type IN ('PC', 'Laptop')
GROUP BY p.maker_id
HAVING COUNT(CASE WHEN m.model_type = 'PC' THEN 1 END) > 0
AND COUNT(CASE WHEN m.model_type = 'Laptop' THEN 1 END) = 0;
However, having a large number of records, JOIN to the models table can still affect performance.
To speed up the query, denormalization of the scheme can be performed by duplicating data from the model_type column of the products table:
UPDATE products AS p
JOIN models AS m ON p.model_id = m.model_id
SET p.model_type = m.model_type;
Also, pay attention to the fact that the model_type column is filtered by, so it is recommended to index it.
At that, selection of columns, that will be included to the primary key, should be approached carefully. Since columns from the primary key are included to the nonclustered index, this can increase database size.
As a result, selection is significantly simplified:
SELECT maker_id
FROM products
GROUP BY maker_id
HAVING COUNT(CASE WHEN model_type = 'PC' THEN 1 END) > 0
AND COUNT(CASE WHEN model_type = 'Laptop' THEN 1 END) = 0;
At this, the plan becomes simple at all:
However, the root of performance problems should not always be searched for in queries – it is very important to correctly configure database server settings. For this, the Server Variables tab can be used.
In this case, all the settings, for convenience, can be grouped according to their target facilities:
There is no perfect advice how to optimize a query. However, using the profiler embedded into dbForge Studio for MySQL, there is a probability to detect all the performance bottlenecks.
Optimal settings for MySQL server can also vary depending on the specificity of performed tasks. Therefore, detailed consideration of the server configuration process will be revealed in a separate article.
Reference: Pinal Dave (https://blog.sqlauthority.com)
2 Comments. Leave new
Thanks for sharing.
Duplicating data is a bad idea though. It solves one problem and creates new ones.