Fix: Zabbix Docker Stuck After Upgrade With MySQL 8.4

by Felix Dubois

Hey everyone,

If you've run into snags upgrading your Zabbix setup to versions 7.2 or 7.4 using Docker and MySQL 8.4, you're definitely not alone. This guide will walk you through a common issue where the Zabbix Docker containers get stuck during initialization, specifically at the zabbix-docker-server-db-init-1 stage. We'll break down the problem, explore the root cause, provide a temporary fix, and discuss long-term solutions to ensure a smooth Zabbix experience.

The Problem: Stuck in Waiting

After initiating the Docker Compose upgrade, you might find your containers hanging in a perpetual "Waiting" state, like this:

sudo docker compose up -d                                                                                          
[+] Running 7/10                                                                                                                                   
 ✔ Network zabbix-docker_backend                     Created                                                                                  0.0s 
 ✔ Network zabbix-docker_default                     Created                                                                                  0.0s 
 ✔ Network zabbix-docker_frontend                    Created                                                                                  0.0s 
 ✔ Network zabbix-docker_tools_frontend              Created                                                                                  0.0s 
 ✔ Network zabbix-docker_database                    Created                                                                                  0.0s 
 ✔ Container zabbix-docker-db-data-mysql-1           Started                                                                                  0.5s 
 ✔ Container zabbix-docker-mysql-server-1            Started                                                                                  0.5s 
 ⠸ Container zabbix-docker-server-db-init-1          Waiting                                                                                 32.6s 
 ⠙ Container zabbix-docker-zabbix-web-nginx-mysql-1  Starting                                                                                32.5s 
 ⠙ Container zabbix-docker-zabbix-server-1           Starting                                                                                32.5s

Digging into the logs of the zabbix-docker-server-db-init-1 container, you'll likely see a recurring message:

$ sudo docker compose logs --follow --tail=100                                                                       
server-db-init-1  | ** Preparing database                                                                                                          
server-db-init-1  | ** Using MYSQL_USER variable from secret file                                                                                  
server-db-init-1  | ** Using MYSQL_PASSWORD variable from secret file                                                                              
server-db-init-1  | ********************                                                                                                           
server-db-init-1  | * DB_SERVER_HOST: mysql-server                                                                                                 
server-db-init-1  | * DB_SERVER_PORT: 3306                                                                                                         
server-db-init-1  | * DB_SERVER_DBNAME: zabbix                                                                                                     
server-db-init-1  | ********************                                                                                                           
server-db-init-1  | **** MySQL server is not available. Waiting 5 seconds...                                                                       
server-db-init-1  | **** MySQL server is not available. Waiting 5 seconds...                                                                       
server-db-init-1  | **** MySQL server is not available. Waiting 5 seconds... 

Despite the logs indicating the MySQL server is unavailable, you might find that the MySQL server container is actually up and running. So, what's the real issue here? Let's dive deeper.

Unpacking the Root Cause: MySQL 8.4 and the mysql_native_password Plugin

The key to understanding this issue lies in a significant change introduced in MySQL 8.4. This version disables the mysql_native_password authentication plugin by default, and MySQL 9.0 removes it entirely. The plugin is an older authentication method, and MySQL is moving towards more secure alternatives like caching_sha2_password. The problem arises when the accounts used during Zabbix initialization are still configured to authenticate with mysql_native_password. Let's verify this by connecting to the MySQL server container:

sudo docker exec -it zabbix-docker-mysql-server-1 bash       
bash-5.1# mysql                                                                                                                                    
ERROR 1524 (HY000): Plugin 'mysql_native_password' is not loaded

This error confirms that the mysql_native_password plugin is indeed the culprit. The Zabbix initialization script cannot connect to the database because the plugin it's trying to use is disabled.

Why This Happens

So, why is Zabbix trying to use this older plugin?

The authentication plugin selection is typically managed within the mysql.user table. This table stores user credentials and their associated authentication methods. If your Zabbix database was initially set up with users configured to use mysql_native_password, the upgrade to MySQL 8.4 will trigger this incompatibility.

The Catch-22:

The tricky part is that to change the authentication plugin for existing users, you need to connect to the database. However, in this state, the Zabbix initialization process can't connect, creating a classic chicken-and-egg scenario.

A Temporary Workaround: Re-enabling mysql_native_password

While not a long-term solution, a quick way to get your Zabbix containers up and running is to temporarily re-enable the mysql_native_password plugin in your compose_databases.yaml file. This allows the initialization process to complete. Here's how you can modify your Compose file:

--- a/compose_databases.yaml
+++ b/compose_databases.yaml
@@ -3,6 +3,7 @@ services:
   image: "${MYSQL_IMAGE}:${MYSQL_IMAGE_TAG}"
   command:
    - mysqld
+   - --mysql-native-password=ON
    - --skip-mysqlx
    - --character-set-server=utf8mb4
    - --collation-server=utf8mb4_bin

This adds the --mysql-native-password=ON option to the MySQL server's startup command, re-enabling the plugin. After making this change, restart your Docker Compose setup:

sudo docker compose up -d

With the plugin re-enabled, the Zabbix initialization should proceed without issues. Remember, this is a temporary fix. We need to address the underlying issue of users still relying on the deprecated plugin.
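Because the flag is meant to come out again once the accounts are migrated, a small check can remind you that it is still there. The following is a minimal sketch, assuming the file name compose_databases.yaml from the diff above; the function name has_native_password_flag is made up for illustration.

```shell
# Return 0 if the given Compose file still passes --mysql-native-password=ON
# on a line that is not commented out, 1 otherwise.
has_native_password_flag() {
    awk '!/^[ \t]*#/ && /--mysql-native-password=ON/ { found = 1 }
         END { exit !found }' "$1"
}

# Example usage (file name assumed from the diff above):
#   if has_native_password_flag compose_databases.yaml; then
#       echo "Temporary workaround is still enabled"
#   fi
```

The check only looks at the text of the file, so it says nothing about what the running container was started with.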

Examining User Authentication Plugins

Once your Zabbix instance is running, it's crucial to investigate which users are still using the mysql_native_password plugin. You can do this by connecting to the MySQL database and querying the mysql.user table:

mysql> select user,host,plugin from mysql.user;
+------------------+-----------+-----------------------+
| user             | host      | plugin                |
+------------------+-----------+-----------------------+
| root             | %         | mysql_native_password |
| zabbix           | %         | mysql_native_password |
| mysql.infoschema | localhost | caching_sha2_password |
| mysql.session    | localhost | caching_sha2_password |
| mysql.sys        | localhost | caching_sha2_password |
| root             | localhost | mysql_native_password |
+------------------+-----------+-----------------------+
6 rows in set (0.01 sec)

In this example, the root and zabbix users are still using mysql_native_password. We need to migrate these users to a more modern authentication plugin.
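If there are more than a handful of accounts, the same check can be scripted. The sketch below is one possible approach, not part of the Zabbix tooling: it assumes you have run the query above in batch mode (mysql -N -B prints tab-separated rows without a header) and feeds that output to a helper. The function name list_native_users and the file name users.tsv are hypothetical.

```shell
# Print user@host for every account whose plugin is mysql_native_password.
# Reads tab-separated rows (user, host, plugin) from stdin, as produced by:
#   mysql -N -B -e "SELECT user, host, plugin FROM mysql.user"
# A trailing carriage return is stripped in case the output passed through a TTY.
list_native_users() {
    awk -F '\t' '{
        sub(/\r$/, "", $3)
        if ($3 == "mysql_native_password") printf "%s@%s\n", $1, $2
    }'
}

# Example usage, with the query output saved to a file first:
#   list_native_users < users.tsv
```

An empty result means no account is left on the old plugin.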

The Path Forward: Migrating to caching_sha2_password

To ensure long-term compatibility and security, the correct solution is to migrate users from mysql_native_password to caching_sha2_password (or another supported, secure plugin). However, it's crucial to proceed with caution.

Why Caution Is Key:

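Simply changing the plugin without proper testing and consideration can break your Zabbix setup. Every component that connects to the database (the Zabbix server, the web interface, and any proxy that uses its own MySQL database) must ship a client library that supports the new authentication method. Zabbix agents talk to the server or a proxy rather than to the database, so they are not affected. Before making any changes, ensure that your Zabbix version fully supports caching_sha2_password.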

Steps to Migrate (with caution and testing):

  1. Verify Compatibility: Consult the Zabbix documentation for your specific version to confirm support for caching_sha2_password.

  2. Backup: Before making any changes to your database, create a full backup. This will allow you to revert if something goes wrong.

  3. Change the Authentication Plugin: You can change the plugin using the ALTER USER statement in MySQL. For example, to migrate the zabbix user, you would run:

    ALTER USER 'zabbix'@'%' IDENTIFIED WITH caching_sha2_password BY 'your_zabbix_password';
    

    Replace 'your_zabbix_password' with the actual password for the Zabbix user.

    Important: You'll need to do this for each user that is using mysql_native_password.

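  4. Flush Privileges (usually optional): ALTER USER takes effect immediately, so a flush is only required if you edited the grant tables directly with INSERT, UPDATE, or DELETE. If you want to be sure the server has reloaded them, you can run: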

    FLUSH PRIVILEGES;
    
  5. Test Thoroughly: After migrating the users, thoroughly test your Zabbix setup. Ensure that the server and the web interface can connect to the database without issues, and that agents still report data to the server. Monitor for any errors or unexpected behavior.

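  6. Consider auth_socket for root@localhost: For the root@localhost user, you might consider using the auth_socket plugin. It authenticates connections made over the local Unix socket by matching the operating-system user name against the MySQL account name, so the OS root user can log in as root@localhost without a password. The plugin may need to be loaded before it can be used, so make sure you understand the implications before implementing this.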
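Step 3 has to be repeated for every affected account, so it can help to generate the statements rather than type them. The following is a rough sketch under stated assumptions, not an official migration script: the helper name alter_user_sql is hypothetical, and it only escapes the password (backslashes and single quotes), so the user and host values are assumed to contain no quotes.

```shell
# Emit an ALTER USER statement that switches one account to caching_sha2_password.
# Arguments: user, host, password.
# Backslashes and single quotes in the password are doubled so the value
# survives as a MySQL string literal under the default SQL mode.
alter_user_sql() {
    user=$1
    host=$2
    pass=$(printf '%s' "$3" | sed -e 's/\\/\\\\/g' -e "s/'/''/g")
    printf "ALTER USER '%s'@'%s' IDENTIFIED WITH caching_sha2_password BY '%s';\n" \
        "$user" "$host" "$pass"
}

# Example usage (the password is a placeholder):
#   alter_user_sql zabbix '%' 'your_zabbix_password'
```

The generated statement contains the password in clear text, so it is safer to paste it into an interactive mysql session than to save it to a file or leave it in shell history.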

A Note on Zabbix Migration Scripts

As the original poster mentioned, it would be ideal if Zabbix migration scripts automatically handled this plugin migration during upgrades. This would prevent users from encountering this issue in the first place. Hopefully, future versions of Zabbix will include this functionality.

Conclusion: Planning for the Future

The issue of being stuck in the zabbix-docker-server-db-init-1 stage after upgrading to Zabbix 7.2 or 7.4 with MySQL 8.4 highlights the importance of staying ahead of database changes and planning for compatibility. While the temporary workaround of re-enabling mysql_native_password can get you up and running, the long-term solution involves migrating users to more secure authentication plugins like caching_sha2_password. Always remember to test thoroughly and back up your data before making any significant changes to your database.

By understanding the root cause of this issue and taking the necessary steps to address it, you can ensure a smooth and secure Zabbix experience, even as database technologies evolve. Keep monitoring your Zabbix instances, and stay tuned for future updates and best practices!