6 Easy Tips to reduce the size of MySQL Database

0
72014

Sooner or later all database administrators face up the problem of huge database disk usage. The bigger the database the more it costs for support and maintenance. This article provides easy tips to reduce the size of MySQL database.

Backup, first but not least

Best practices suggest to backup your database before take any dangerous action. MySQL and MariaDB include the mysqldump utility to simplify the process to create a backup of a database or system of databases.

# Backup Single DB
mysqldump -u [username] -p [databaseName] > [filename]-$(date +%F).sql
# Backup Entire DBMS
mysqldump --all-databases --single-transaction --quick --lock-tables=false > full-backup-$(date +%F).sql -u root -p

If you are using MariaDB, you have the option of  full or incremental backup via mariabackup utility.

$ mariabackup --backup --target-dir=/var/mariadb/backup/ --user=myuser --password=mypassword

Check more MySQL commands on MySQL Cheat Sheet.

List MySQL Table and Index Size

Use the following queries to monitor and evaluation table and index size.

Query below returns the size of each Database in MB.

MariaDB [(none)]> SELECT table_schema "DB Name", Round(Sum(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
 FROM information_schema.tables
 GROUP BY table_schema;
+--------------------+---------------+
| DB Name            | DB Size in MB |
+--------------------+---------------+
| hope               |       75714.0 |
| information_schema |           0.2 |
| mysql              |           1.9 |
| performance_schema |           0.0 |
+--------------------+---------------+

Query below returns the size of each Table in MB.

SELECT table_schema as `Database`,table_name AS `Table`,round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
FROM information_schema.TABLES
ORDER BY (data_length + index_length) DESC
LIMIT 5;       -- adjust it according to your needs
+----------+------------------------+------------+
| Database | Table                  | Size in MB |
+----------+------------------------+------------+
| hope     | eth_products           |   44029.54 |
| hope     | eth_customers          |   28868.08 |
| hope     | eth_emails             |    1423.92 |
| hope     | eth_id                 |    1392.43 |
| mysql    | help_topic             |       1.38 |
+----------+------------------------+------------+

Query below returns index size ordered from the ones using the most.

SELECT table_schema as database_name,table_name,round(index_length/1024/1024,2) as index_size
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
     and table_schema not in ('information_schema', 'sys','performance_schema', 'mysql')
     and table_schema = 'your database name'
ORDER BY index_size desc;
+---------------+------------------------+------------+
| database_name | table_name             | index_size |
+---------------+------------------------+------------+
| hope          | eth_products           |   18561.74 |
| hope          | eth_customers          |   12037.89 |
| hope          | eth_emails             |     638.70 |
| hope          | eth_id                 |     607.20 |
| hope          | eth_temp               |       0.00 |
+---------------+------------------------+------------+

Delete Unwanted Data

The easier and the hardest way to reduce MySQL size is by deleting all the unwanted data.  DB admins usually fill in DB  tables or columns with unnecessary data.  A DB schema reevaluation is essential to identify such cases.

The following query helps determine the last time a table is updated.

SELECT table_schema,table_name,create_time,update_time from information_schema.tables 
WHERE table_schema not in ('information_schema','mysql') 
	and engine is not null and ((update_time < (now() - interval 1 day)) or update_time is NULL) 
LIMIT 5;
+--------------+------------------------+---------------------+---------------------+
| table_schema | table_name             | create_time         | update_time         |
+--------------+------------------------+---------------------+---------------------+
| MariaDB      | eth_table1             | 2019-08-23 20:52:51 | 2019-08-23 22:54:34 |
| MariaDB      | eth_table2             | 2019-08-23 19:20:23 | 2019-08-23 19:20:23 |
| MariaDB      | eth_table3             | 2019-08-23 19:20:29 | 2019-08-23 19:20:29 |
| MariaDB      | eth_table4             | 2019-08-26 19:18:04 | 2019-08-26 21:05:10 |
| MariaDB      | eth_temp               | 2019-08-25 01:52:33 | 2019-08-25 21:16:16 |
+--------------+------------------------+---------------------+---------------------+

When all unused data are unidentified the following commands will help you delete them :

Be extra careful when using delete/drop commands ! Deleted data cannot be recovered!

DELETE FROM table1 / TRUNCATE table1        --Deletes all Records
DELETE FROM table1 WHERE condition          --Deletes records based on a condition
DROP TABLE table                            --Deletes table
DROP DATABASE                               --Deleting database
ALTER TABLE table_name DROP column_name;    --Deletes a column 

Find and Remove Unused Indexes

A general rule of thumb is that the more indexes you have on a table, the slower INSERT, UPDATE, and DELETE operations will be and more disk space will be consumed. It is essential to track down unused indexes that consume disk space and slow down your database.

By default, statistics are not collected. This is to ensure that statistics collection does not cause any extra load on the server unless desired.

To enable statistics dynamically execute the following command :

SET GLOBAL userstat=1;

Now, every query to the database updates the statistic tables. The TABLE_STATISTICS and INDEX_STATISTICS are the most interesting tables. The first table shows the number of rows reads from the table and the number of rows changed in the table. The second table shows statistics on index usage.

SELECT * FROM INFORMATION_SCHEMA.TABLE_STATISTICS;
+--------------+------------------------+-----------+--------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME             | ROWS_READ | ROWS_CHANGED | ROWS_CHANGED_X_INDEXES |
+--------------+------------------------+-----------+--------------+------------------------+
| hope         | eth_table1             |      1004 |            0 |                      0 |
| hope         | eth_table2             |  14343683 |            0 |                      0 |
| hope         | eth_table3             |      1002 |            0 |                      0 |
+--------------+------------------------+-----------+--------------+------------------------+
SELECT * FROM INDEX_STATISTICS;
+--------------+-------------------+------------+-----------+
| TABLE_SCHEMA | TABLE_NAME        | INDEX_NAME | ROWS_READ |
+--------------+-------------------+------------+-----------+
| hope         | eth_table1        | PRIMARY    |         2 |
| hope         | eth_table2        | PRIMARY    |         4 |
+--------------+-------------------+------------+-----------+

With the help of the new tables, you can find all unused indexes in a single query.

SELECT DISTINCT s.TABLE_SCHEMA, s.TABLE_NAME, s.INDEX_NAME
FROM information_schema.statistics `s` LEFT JOIN information_schema.index_statistics INDXS
ON (s.TABLE_SCHEMA = INDXS.TABLE_SCHEMA AND
   s.TABLE_NAME=INDXS.TABLE_NAME AND
   s.INDEX_NAME=INDXS.INDEX_NAME)
WHERE INDXS.TABLE_SCHEMA IS NULL;

Finally, to delete index run this command in mysql client :

DROP INDEX index_name ON table_name;

Do not forget to set userstat=0 when statistics are not required anymore.

Shrink and Optimize MySQL

In general MySQL InnoDB does not release disk space after deleting data rows from the table. It keeps the space to reuse it later.

OPTIMIZE TABLE reorganizes the physical storage of table data and associated index data, to reduce storage space and improve I/O efficiency when accessing the table. The exact changes made to each table depending on the storage engine used by that table.

Optimization is available only when innodb_file_per_table is enabled.  Check your configuration like this :

MariaDB [(none)]> show variables like "innodb_file_per_table";
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| innodb_file_per_table | ON    |
+-----------------------+-------+

You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file.

OPTIMIZE table MyTable;
+-------------------+----------+----------+----------+
| Table             | Op       | Msg_type | Msg_text |
+-------------------+----------+----------+----------+
| testDB.MyTable    | optimize | status   | OK       |
+-------------------+----------+----------+----------+

When innodb_file_per_table is OFF, then all data is going to be stored in ibdata files. If you drop some tables and delete some data, then there are two ways to reclaim that unused disk:

  • Completely dump the database, then drop it and finally reload it
  • Change the storage engine and revert it back to your previous configuration. This will definitively recreate the table and indexes from the start and unused disk space will be reclaimed.
ALTER TABLE my_table ENGINE = MyISAM;
ALTER TABLE my_table ENGINE = InnoDB;

Whenever you run OPTIMIZE or ALTER TABLE , MySQL will create a new data file until the operation is finished. Do not forget to have enough available space for the operations to finish successfully! 

If you want to optimize a 10 GB Table, ensure that there are more than 10GB of free disk space.

Optimize Datatypes for Columns

Selecting the correct datatype for a field in a table may have a significant impact on the disk usage.

A 40 digits hash can be defined with different datatypes. If  hash is defined as constant VA(40) the key length will be 40.

EXPLAIN SELECT * 
FROM eth_myhashes 
WHERE hash='0000032f265716fc8107f49f15336ec291169679'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: customers
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 40
          ref: const
         rows: 1

If a hash is defined as BINARY then the key length is 20, so the disk usage is much less.

Do not forget to convert binary to hex with x.

EXPLAIN SELECT * 
FROM eth_newtable 
WHERE address=x'0000032f265716fc8107f49f15336ec291169679'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: customers
         type: const
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 20
          ref: const
         rows: 1

Advanced Topic: If your database contains hash indexes, you might not need to index full length of the hash but a smaller portion. Prefix index key length is dependent on hashes uniqueness and a complex analysis is required to determine it.

Enable Columns Compression (Only InnoDB)

The row format of a table determines how its rows are physically stored, which in turn can affect the performance of queries, DML operations, and disk usage.

The COMPRESSED row format is similar to the COMPACT row format, but tables using the COMPRESSED row format can store even more data on overflow pages than tables using the COMPACT row format. This results in more efficient data storage than tables using the COMPACT row format, especially for tables containing columns using the VARBINARY, VARCHAR, BLO  and TEXT data types.

ALTER TABLE `myTable`
	ROW_FORMAT=COMPRESSED;

Compress Table (Only MyISAM)

MySQL comes with a built-in tool, myisampack, to “pack”, or compress, MyISAM tables (MySQL’s default table format) so that they use less disk space. The myisampack command works equally well on small and large MyISAM tables, so even if you have a database with a few hundred records, you can use myisampack to compress it and save yourself a little disk space.

The resulting tables are read-only, and usually reduce the size of MySQL about 40% to 70% .

$sudo /etc/init.d/mysql stop                         # Stop DB
$cd /var/lib/mysql/
$myisamchk database.MYI                              # DB is compressed
$myisamchk -rq --sort-index --analyze database.MYI   # Index Regenaration
$mysqladmin flush-tables                             # Close open tables and flush
$sudo /etc/init.d/mysql start                        # Start DB

 

LEAVE A REPLY

Please enter your comment!
Please enter your name here