Happy New Year, Devs!
I hope you’ve had an amazing time with friends and family, and had a chance to recharge your batteries. Now that the holidays are behind us, it’s time to get back in business with some lessons learned and knowledge sharing.
In the 14.2 release, we announced a key update to the default collation for MySQL 8, changing it to utf8mb4_0900_ai_ci
. While MySQL 8.0 and newer already use this collation by default, it’s important to note that it's not included in the mysqldump
output files. That's important If you're migrating to the cloud or upgrading to Sugar 15.0, this discrepancy in collations could cause compatibility issues, impacting your database operations.
While this might not seem immediately necessary for all on-premise customers, consider these points:
- Future-Proofing: If you upgrade to Sugar 15.0, the upgrade process will include this conversion. Running the conversion in advance ensures a smoother, faster upgrade process.
- Controlled Execution: Running the script beforehand allows you to manage and monitor the conversion in a more controlled environment, rather than during the critical upgrade window.
- Performance Impact: Doing the conversion as a separate step allows for better resource allocation and minimizes the performance hit on your system during the upgrade.
Is This Script Required for On-Premise Customers?
While running the script is optional before upgrading, it is mandatory during the upgrade process to Sugar 15.0. Running it in advance will save time and reduce the impact of the upgrade. If performing it during upgrade, the collation conversion could slow down the upgrade substantially, especially for larger databases.
When Should I Convert My On-Premise DB Using This Script?
You should consider running this script before upgrading to Sugar 15.0. This allows the conversion to happen in a controlled, planned manner and avoids delays during the upgrade itself.
The script is designed to be a manual process, giving you the flexibility and control to execute it during a maintenance window or other convenient times for your operations.
Can I Skip This process? Will My Sugar Instance Still Work?
Yes, your Sugar instance will continue to work without running this script. However, the database collation will be converted during the upgrade to Sugar 15.0 automatically.
The script
This script is a separate CLI utility tool that customers who want to prepare their databases for the upgrade and it is not included as part of the Sugar release. A very simple
single-threaded
version of this script is included in the upgrade package and will work for Windows and Linux Sugar installations with MySQL.The multi-process version is only applicable for Linux but will still work on Windows too, but in a single-threded way.
Key Features:
- The goal is to ensure that customers have the clarity and tools to make an informed decision about when and how to apply this script to their environments.
- Supports dry-run testing to validate changes.
- Multi-threaded processing for faster conversions.
- Logs the process, including potential failures and table sizes before/after conversion.
Prerequisites
- PHP 8.0 or higher.
- MySQL 8.0+.
Permissions
- Full database access to execute ALTER TABLE operations.
- CLI access to run the script.
Recommendations
- Run during maintenance windows.
- Back up the database before starting the conversion.
How to Use the Script
- Copy this script and place it on your SugarCRM root directory as collUpdateAlter.php
<?php /* * Your installation or use of this SugarCRM file is subject to the applicable * terms available at * http://support.sugarcrm.com/Resources/Master_Subscription_Agreements/. * If you do not agree to all of the applicable terms or do not have the * authority to bind the entity as an authorized representative, then do not * install or use this SugarCRM file. * * Copyright (C) SugarCRM Inc. All rights reserved. */ if (!str_contains(php_sapi_name(), 'cli')) { die('This is CLI only.'); } require 'vendor/autoload.php'; // Initialize global objects $timedate = TimeDate::getInstance(); $GLOBALS['timedate'] = $timedate; $cfg = new Configurator(); $db = DBManagerFactory::getInstance(); $platform = $db->getConnection()->getDatabasePlatform(); // Configuration const LOG_FILE = 'table_sizes_log.txt'; const DRY_RUN_PREFIX = 'dry_run_before_alter_'; const MAX_TABLE_NAME_LENGTH = 64; $newCollation = 'utf8mb4_0900_ai_ci'; // Collation mapping $collationMap = [ 'utf8mb4_general_ci' => 'utf8mb4_0900_ai_ci', 'utf8mb4_bin' => 'utf8mb4_0900_ai_ci', 'utf8mb4_unicode_ci' => 'utf8mb4_0900_ai_ci', 'utf8mb4_unicode_520_ci' => 'utf8mb4_0900_ai_ci', 'utf8mb4_german2_ci' => 'utf8mb4_de_pb_0900_ai_ci', 'utf8mb4_swedish_ci' => 'utf8mb4_sv_0900_ai_ci', 'utf8mb4_spanish_ci' => 'utf8mb4_es_0900_ai_ci', 'utf8mb4_danish_ci' => 'utf8mb4_da_0900_ai_ci', 'utf8mb4_croatian_ci' => 'utf8mb4_hr_0900_ai_ci', 'utf8mb4_slovak_ci' => 'utf8mb4_sk_0900_ai_ci', 'utf8mb4_slovenian_ci' => 'utf8mb4_sl_0900_ai_ci', 'utf8mb4_polish_ci' => 'utf8mb4_pl_0900_ai_ci', 'utf8mb4_latvian_ci' => 'utf8mb4_lv_0900_ai_ci', 'utf8mb4_lithuanian_ci' => 'utf8mb4_lt_0900_ai_ci', 'utf8mb4_roman_ci' => 'utf8mb4_ro_0900_ai_ci', 'utf8mb4_romanian_ci' => 'utf8mb4_ro_0900_ai_ci', 'utf8mb4_hungarian_ci' => 'utf8mb4_hu_0900_ai_ci', 'utf8mb4_estonian_ci' => 'utf8mb4_et_0900_ai_ci', 'utf8mb4_spanish2_ci' => 'utf8mb4_es_trad_0900_ai_ci', 'utf8mb4_esperanto_ci' => 'utf8mb4_eo_0900_ai_ci', 'utf8mb4_sinhala_ci' => 'utf8mb4_si_0900_ai_ci', ]; // Get configuration and options if (isset($cfg->config['dbconfigoption']['collation']) && isset($collationMap[$cfg->config['dbconfigoption']['collation']])) { $newCollation = $collationMap[$cfg->config['dbconfigoption']['collation']]; } $charset = explode('_', $newCollation)[0]; $options = getopt('', ['max-threads::', 'skip-tables::']); $maxThreads = isset($options['max-threads']) ? (int)$options['max-threads'] : null; $skipTables = isset($options['skip-tables']) ? explode(',', $options['skip-tables']) : []; if ($maxThreads !== null && $maxThreads <= 0) { die("Invalid value for --max-threads. It must be a positive integer.\n"); } // Initialize log file file_put_contents(LOG_FILE, "Collation Change Log\n\n"); // Helper functions function getTableListForConversion($db, $newCollation, $skipTables) { $query = " SELECT DISTINCT t.table_name AS `table` FROM information_schema.tables t LEFT JOIN information_schema.columns c ON t.table_schema = c.table_schema AND t.table_name = c.table_name WHERE t.table_schema = " . $db->quoted(SugarConfig::getInstance()->get('dbconfig.db_name', '')) . " AND t.table_type = 'BASE TABLE' AND ( t.table_collation != " . $db->quoted($newCollation) . " OR c.collation_name != " . $db->quoted($newCollation) . " )"; if (!empty($skipTables)) { $quotedTables = array_map(fn($table) => $db->quoted($table), $skipTables); $query .= " AND t.table_name NOT IN (" . implode(',', $quotedTables) . ")"; } return $db->query($query); } function getFullTableList($db) { return $db->query(" SELECT DISTINCT t.table_name AS `table` FROM information_schema.tables t LEFT JOIN information_schema.columns c ON t.table_schema = c.table_schema AND t.table_name = c.table_name WHERE t.table_schema = " . $db->quoted(SugarConfig::getInstance()->get('dbconfig.db_name', '')) . " AND t.table_type = 'BASE TABLE' "); } function logTableSizeSum($db, $newCollation, $logFile, $message) { $totalSize = 0; $res = getFullTableList($db); while ($row = $db->fetchByAssoc($res)) { $table = $row['table']; $size = getTableSize($db, $table); $totalSize += $size; } $totalSizeMB = round($totalSize / (1024 * 1024), 2); file_put_contents($logFile, "$message: $totalSizeMB MB\n", FILE_APPEND); } function getTableSize($db, $table) { $db->query("ANALYZE TABLE `$table`"); $res = $db->query("SHOW TABLE STATUS LIKE '$table'"); if ($row = $db->fetchByAssoc($res)) { return $row['Data_length'] + $row['Index_length']; } return 0; } function analyzeTables($db, $newCollation) { $res = getTableListForConversion($db, $newCollation, []); while ($row = $db->fetchByAssoc($res)) { $tableName = $row['table']; $db->query("ANALYZE TABLE `$tableName`"); } } function performDryRun($db, $table, $newCollation, $charset) { $fullDryRunTable = $dryRunTable = DRY_RUN_PREFIX . $table; if (strlen($dryRunTable) > MAX_TABLE_NAME_LENGTH) { $dryRunTable = DRY_RUN_PREFIX . md5($table); } // Drop the dry run table if it exists $db->query("DROP TABLE IF EXISTS `$dryRunTable`"); // Create empty copy of the table if (!$db->query("CREATE TABLE `$dryRunTable` LIKE `$table`")) { $error = $db->lastError(); echo "Error creating dry run table for $table: $error\n"; return false; } // Try both ALTER statements $success = true; if (!$db->query("ALTER TABLE `$dryRunTable` COLLATE " . $db->quoted($newCollation))) { $error = $db->lastError(); echo "Dry run failed for table $table ({$fullDryRunTable}) (collation): $error\n"; $success = false; } if (!$db->query("ALTER TABLE `$dryRunTable` CONVERT TO CHARACTER SET " . $db->quoted($charset) . " COLLATE " . $db->quoted($newCollation))) { $error = $db->lastError(); echo "Dry run failed for table $table ({$fullDryRunTable}) (charset conversion): $error\n"; $success = false; } // Clean up $db->query("DROP TABLE IF EXISTS `$dryRunTable`"); return $success; } function alterTable($db, $table, $newCollation, $charset, $chunkNum) { $startTime = microtime(true); $sizeMB = round(getTableSize($db, $table) / (1024 * 1024), 2); echo "Altering table $table (size: $sizeMB MB) in chunk $chunkNum\n"; // Perform the ALTER operations if (!$db->query('ALTER TABLE `' . $table . '` COLLATE ' . $db->quoted($newCollation))) { $error = $db->lastError(); echo "Error altering table $table collation: $error\n"; return false; } if (!$db->query('ALTER TABLE `' . $table . '` CONVERT TO CHARACTER SET ' . $db->quoted($charset) . ' COLLATE ' . $db->quoted($newCollation))) { $error = $db->lastError(); echo "Error converting table $table charset: $error\n"; return false; } $endTime = microtime(true); $timeTaken = round(($endTime - $startTime) / 60, 2); // Time in minutes echo "Table $table (size: $sizeMB MB) was altered in $timeTaken minutes.\n"; return true; } // Main execution flow echo "Starting collation conversion process...\n"; logTableSizeSum($db, $newCollation, LOG_FILE, "Initial table sizes"); analyzeTables($db, $newCollation); logTableSizeSum($db, $newCollation, LOG_FILE, "After first ANALYZE"); // Get tables for conversion $tables = []; $res = getTableListForConversion($db, $newCollation, $skipTables); $totalTables = 0; while ($row = $db->fetchByAssoc($res)) { $size = getTableSize($db, $row['table']); $tables[] = ['name' => $row['table'], 'size' => $size]; $totalTables++; if ($totalTables % 100 === 0) { echo "Analyzed $totalTables tables...\n"; } } if (empty($tables)) { echo "No tables need conversion.\n"; exit(0); } // Sort tables by size (largest first) usort($tables, fn($a, $b) => $b['size'] - $a['size']); // Perform dry run echo "\nPerforming dry run...\n"; $failedTables = []; foreach ($tables as $table) { if (!performDryRun($db, $table['name'], $newCollation, $charset)) { $failedTables[] = $table['name']; } } if (!empty($failedTables)) { echo "\nDry run failed for the following tables:\n"; foreach ($failedTables as $table) { echo "- $table\n"; } echo "\nAborting conversion process.\n"; exit(1); } echo "\nDry run completed successfully. Proceeding with actual conversion.\n\n"; // Determine thread count $totalSizeGB = array_sum(array_column($tables, 'size')) / (1024 * 1024 * 1024); $cpuCount = (int) shell_exec("nproc"); echo "Number of available CPUs: $cpuCount\n"; $threads = match(true) { $totalSizeGB < 10 => 1, $totalSizeGB >= 10 && $totalSizeGB <= 40 => min(2, $cpuCount), default => min(4, $cpuCount) }; if ($maxThreads !== null) { $threads = min($threads, $maxThreads); } echo "Using $threads thread(s).\n"; // Process tables if ($threads === 1) { foreach ($tables as $table) { alterTable($db, $table['name'], $newCollation, $charset, 1); } } else { // Distribute tables among threads $chunks = array_fill(0, $threads, ['tables' => [], 'size' => 0]); foreach ($tables as $table) { usort($chunks, fn($a, $b) => $a['size'] - $b['size']); $chunks[0]['tables'][] = $table['name']; $chunks[0]['size'] += $table['size']; } foreach ($chunks as $i => $chunk) { echo "Chunk " . ($i + 1) . ": " . safeCount($chunk['tables']) . " tables, total size: " . round($chunk['size'] / (1024 * 1024), 2) . " MB\n"; } // Process chunks in parallel $db->disconnect(); $pids = []; for ($i = 0; $i < $threads; $i++) { $pid = pcntl_fork(); if ($pid == -1) { throw new Exception("Could not fork"); } elseif ($pid) { $pids[] = $pid; } else { $db->connect(); foreach ($chunks[$i]['tables'] as $table) { alterTable($db, $table, $newCollation, $charset, $i + 1); } exit(0); } } foreach ($pids as $pid) { pcntl_waitpid($pid, $status); if ($status != 0) { throw new Exception("Child process $pid failed with status code $status"); } echo "\tChild process $pid exited successfully\n"; } $db->connect(); } // Final analysis and logging analyzeTables($db, $newCollation); logTableSizeSum($db, $newCollation, LOG_FILE, "After all ALTER operations"); analyzeTables($db, $newCollation); logTableSizeSum($db, $newCollation, LOG_FILE, "After final ANALYZE"); // Update configuration if needed if (!isset($cfg->config['dbconfigoption']['collation']) || $cfg->config['dbconfigoption']['collation'] != $newCollation) { $cfg->config['dbconfigoption']['collation'] = $newCollation; $cfg->populateFromPost(); $cfg->handleOverride(); } echo "\nCollation conversion completed successfully.\n";
- Run the Script
- php collUpdateAlter.php --max-threads=<number> --skip-tables=<table_names>
--max-threads
: Limits the number of threads used.--skip-tables
: Excludes specified tables from conversion.
- php collUpdateAlter.php --max-threads=<number> --skip-tables=<table_names>
- Monitor Logs
- The script writes logs to
table_sizes_log.txt
.
- The script writes logs to
- Verify Changes
- Check the final log entries for completion.
- Ensure database collation matches the target:
utf8mb4_0900_ai_ci
.
Handling Issues and Errors
- Interrupted Process: Manually resume using the same parameters.
- Out Of Memory Errors: Ensure the system has enough memory or run the script table-by-table for large databases.
- Inconsistent Collation: Validate that all table collations match after completing the script.
Happy conversion folks!