Georgi Sotirov's Blog

My personal and professional life

2019-07-31

New features in MySQL 8.0.17

Last Tuesday (July 27-th), Oracle released MySQL 8.0.7 following the three months release cycle introduced last year with 8 series. This is the next "maintenance" release introducing a few new features as well as depreciating some non-standard ones, so here are my picks starting with the ones I consider important for developers.

Multi-valued indexes

With the help of generated columns (introduced with MySQL 5.7.6) and functional indexes (introduced with MySQL 8.0.13 for which I wrote before) it become possible to index data in complex column values like JSON. But in JSON you could have scalars as well as arrays, so searching into arrays with the help of an index wasn't possible. Multi-valued indexes come to solve this by allowing multiple records in the index to point to the same data record. Such indexes are created like any other functional index and used automatically by the optimizer when possible. Let's see an example - a register for translators with their spoken languages.

CREATE TABLE translators (
  id INT AUTO_INCREMENT,
  jdata JSON,

  PRIMARY KEY(id)
);

  • Case 1 - array of strings
Let's create some data:

INSERT INTO translators (jdata)
VALUES ('{"name": "T1", "langs": ["English", "French", "Spanish"]}'),
       ('{"name": "T2", "langs": ["English", "Spanish"]}'),
       ('{"name": "T3", "langs": ["French", "Spanish"]}');


Then, let's query the translators that speak English using the new MEMBER OF operator:

SELECT id, jdata->>'$.name', jdata->'$.langs'
  FROM translators
 WHERE 'English' MEMBER OF (jdata->'$.langs');


Or the new function JSON_OVERLAPS like this:

SELECT id, jdata->>'$.name', jdata->'$.langs'
  FROM translators
 WHERE JSON_OVERLAPS(jdata->'$.langs', '["English"]');


Both queries are leading to the same result:

+----+------------------+----------------------------------+
| id | jdata->>'$.name' | jdata->'$.langs'                 |
+----+------------------+----------------------------------+
|  1 | T1               | ["English", "French", "Spanish"] |
|  2 | T2               | ["English", "Spanish"]           |
+----+------------------+----------------------------------+
2 rows in set (0.00 sec)


Given the data as expected the queries return T1 and T2, but not T3. However, these queries are doing a full table scan, so their performance would degrade with the accumulation of data in the table.

Execution plan without index

Fortunately, it's now possible to add a multi-valued index like this:

ALTER TABLE translators
  ADD INDEX idx_langs_arr ((CAST(jdata->'$.langs' AS CHAR(8) ARRAY)));

It's a functional index into which it's necessary to use function CAST with the new ARRAY keyword. With the index the execution plan of the SELECT queries above become respectively:
Execution plan of MEMBER OF with indexExecution plan of JSON OVERLAPS with index

  • Case 2 - array of objects
It's slightly different for arrays of objects, but only for the JSONPath expression. Let's create some data (after cleaning Case 1):

INSERT INTO translators (jdata)
VALUES ('{"name": "T1", "langs": [{"lang": "English"}, {"lang": "French"}, {"lang": "Spanish"}]}'),
       ('{"name": "T2", "langs": [{"lang": "English"}, {"lang": "Spanish"}]}'),
       ('{"name": "T3", "langs": [{"lang": "French"}, {"lang": "Spanish"}]}');


Then, let's query the translators that speak English the same two ways:

SELECT id, jdata->>'$.name', jdata->'$.langs[*].lang'
  FROM translators
 WHERE 'English' MEMBER OF (jdata->'$.langs[*].lang');


SELECT id, jdata->>'$.name', jdata->'$.langs[*].lang'
  FROM translators
 WHERE JSON_OVERLAPS(jdata->'$.langs[*].lang', '["English"]');


Just like in the first case the queries use full table scan, which could now easily be changed by adding a multi-valued index like this:

ALTER TABLE translators
  ADD INDEX idx_langs_obj ((CAST(jdata->'$.langs[*].lang' AS CHAR(8) ARRAY)));


Notice the slightly different syntax of the JSONPath expression. For the index to work it's of course necessary to use the same expression in WHERE clause as in the index definition. Developers that prefer to keep data directly into JSON columns should be happy by this new feature, as it makes it possible to index not only scalar variables, but also arrays.

JSON

Apart from the already mentioned new standard operator MEMBER OF for searching values in JSON arrays there are three new functions JSON_OVERLAPS, JSON_SCHEMA_VALID and JSON_SCHEMA_VALIDATION_REPORT.
  • Function JSON_OVERLAPS compares two JSON documents and returns true "if the two document have any key-value pairs or array elements in common". Like MEMBER OF and JSON_CONTAINS function JSON_OVERLAPS may benefit from multi-value indexes.
  • Functions JSON_SCHEMA_VALID and JSON_SCHEMA_VALIDATION_REPORT are added in regards to JSON Schema support. The first one validates a JSON document against a JSON schema and returns true if valid otherwise false, so its usable as a CHECK Constraint. The second one would provide insights into validation errors in the form of JSON document.

Cast to FLOAT, DOUBLE and REAL

Functions CAST and CONVERT now could cast to floating point data types such as FLOAT, DOUBLE and REAL. Let's try it with an incorrect value:

SELECT CAST('1.23.34' AS FLOAT) cast_res, CONVERT('1.23.34', FLOAT) conv_res;

+--------------------+--------------------+
| cast_res           | conv_res           |
+--------------------+--------------------+
| 1.2300000190734863 | 1.2300000190734863 |
+--------------------+--------------------+
1 row in set, 2 warnings (0.00 sec)

There are two warnings, so let's see them with show warnings:

+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '1.23.34' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '1.23.34' |
+---------+------+---------------------------------------------+
2 rows in set (0.00 sec)

Casting to DOUBLE and REAL produced different result and the same warnings:

SELECT CAST('1.23.34' AS DOUBLE) cast_res, CONVERT('1.23.34', DOUBLE) conv_res;
SELECT CAST('1.23.34' AS REAL) cast_res, CONVERT('1.23.34', REAL) conv_res;


+----------+----------+
| cast_res | conv_res |
+----------+----------+
|     1.23 |     1.23 |
+----------+----------+
1 row in set, 2 warnings (0.00 sec)

CLONE command

For many years MySQL administrators had to dump masters, transfer the dump over network and load it into replicas in order to initialize their state (see Copy a Database from one Server to Another). I'm not that much into database administration, but it was cumbersome and annoying procedure especially in case of unrecoverable replication errors where I had to re-initialize the state of replicas, so I considered it a burden. Why I could not just "clone" the master after wiping the replica? Well, the new version makes it possible to easily create new instances or re-initialize existing by implementing native provisioning into the server with the CLONE command. It's made possible by the new MySQL Clone Plugin. You could find more about this by reading the following articles:
It's the focus of this release and I'm sure it's going to be a game changer for organizations using extensively MySQL replication.

Deprecations

This is a list of features that are depreciated with this release and would be removed in future releases:
  • Function FOUND_ROWS and query modifier SQL_CALC_FOUND_ROWS. Documentation suggests using COUNT(*) to find the number of rows.
  • Numeric data type attributes:
    • Integer display width. There is now warning "1681 Integer display width is deprecated and will be removed in a future release." if you try to create tables with INT(11) for example. I personally have such definitions as provided by mysqldump and MySQL Workbench, so I'll have to remove them all, before the statements are rejected with syntax error in future.
    • The ZEROFILL attribute. I never used it anyway.
    • The UNSIGNED attribute for FLOAT, DOUBLE, and DECIMAL data types. I haver used it as well.
    • AUTO_INCREMENT support for FLOAT and DOUBLE data types. Anyone?
  • The syntax  FLOAT(M,D) and DOUBLE(M,D) for specifying number of digits for floating point types.
  • Logical operators && (double ampersand), whish is synonym for AND, || (double pipe), which is synonym for OR and ! (exclamation mark), which is synonym for NOT. I'm not sure I ever used any of these even in daily queries, because I find using ANDOR and NOT a lot more expressive.
  • BINARY keyword for specifying _bin collations. I never used this as well.
Developers should consider getting rid of these nonstandard features to avoid unpleasant surprises in future. I'll personally review my projects and correct as soon as possible.

Bug fixes

In regards to CHECK constraints I found two bugs in the prevoius release and one of them was fixed (see bug #95189 CHECK constraint comparing columns is not always enforced with UPDATE queries). For the full example please, see bug_95189_test_case.sql script, so let's just verify it:

SET binlog_format = 'STATEMENT';
SET binlog_row_image = 'minimal';

UPDATE tst SET end_date = '2019-04-20' WHERE id = 1;
/* Error Code: 3819. Check constraint 'chk_dat' is violated. */

So it's fixed, because as expected the UPDATE query fails even with the specific binlog settings. The other one (see bug #95192 CHECK constraint comparing column with default value is not enforced) would have to wait.

That wraps up my review of new features in MySQL 8.0.17 maintenance release. I hope new releases bring more features for developers.

2019-06-05

TREE explain format in MySQL 8.0.16

I have omitted an interesting new feature in latest MySQL release 8.0.16, which I covered last month. It is mentioned in the release notes (see Functionality Added or Changed and scroll down or search for TREE) and also in the documentation (see Obtaining Execution Plan Information). As it turns out new TREE format for the explain plan was added.

Until now there were two options for format of the explain plans generated with EXPLAIN statement:
  • TRADITIONAL - the old tabular explain plan suitable for console;
  • JSON - the explain plan in JSON format used by MySQL Workbench to display nice graphics in Visual Explain, which I use in presentations.
Whoever is familiar with the traditional tabular output knows that it's not indented and sometimes it's hard to understand the order of operations. Now there is the option to use TREE format, which prints the "generated iterator tree" with the operations indented to right similarly to Oracle execution plans. This should help users to understand better how query execution is actually set up, but it's still experimental, so both the syntax and output may change in future.

Let's take the following two queries that generate the same result - minimal, average and maximum salaries for the departments in the organization based on the DEPT and EMP schema (see my adaptation for MySQL and EMP and DEPT script). The second query is possible only with MySQL 8.0.14 and later (see New development features in MySQL 8.0.14).

Query 1: Using LEFT OUTER JOIN and derived table

SELECT D.dname, DT.min_sal, DT.avg_sal, DT.max_sal
  FROM dept D
       LEFT OUTER JOIN
       (SELECT E.deptno,
               MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         GROUP BY E.deptno
       ) AS DT
       ON DT.deptno = D.deptno;

Query 2: Using LATERAL derived table

SELECT D.dname, LDT.min_sal, LDT.avg_sal, LDT.max_sal
  FROM dept D,
       LATERAL
       (SELECT MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         WHERE E.deptno = D.deptno
       ) AS LDT;


The tabular execution plans of the two queries (with columns partitions, possible_keys and filtered removed to shorten the output) and the extended output are as follows:

Query 1

+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
| id | select_type | table      || type  || key         | key_len | ref               | rows || Extra |
+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
|  1 | PRIMARY     | D          || ALL   || NULL        | NULL    | NULL              |    4 || NULL  |
|  1 | PRIMARY     | <derived2> || ref   || <auto_key0> | 5       | dept_emp.D.deptno |    2 || NULL  |
|  2 | DERIVED     | E          || index || fk_deptno   | 5       | NULL              |   14 || NULL  |
+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
3 rows in set, 1 warning (0.01 sec)


+-------+------+-----------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                             |
+-------+------+-----------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `dept_emp`.`D`.`dname` AS `dname`... left join (
                 /* select#2 */ select `dept_emp`.`E`.`deptno` AS `deptno`,min(`dept_emp`.`E`.`sal`) AS `min_sal`... |
+-------+------+-----------------------------------------------------------------------------------------------------+

1 row in set (0.00 sec)

Query 2

+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
| id | select_type       | table      || type || key       | key_len | ref               | rows || Extra                      |
+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
|  1 | PRIMARY           | D          || ALL  || NULL      | NULL    | NULL              |    4 || Rematerialize (<derived2>) |
|  1 | PRIMARY           | <derived2> || ALL  || NULL      | NULL    | NULL              |    2 || NULL                       |
|  2 | DEPENDENT DERIVED | E          || ref  || fk_deptno | 5       | dept_emp.D.deptno |    1 || NULL                       |
+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
3 rows in set, 2 warnings (0.0006 sec)

+-------+------+-----------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                             |
+-------+------+-----------------------------------------------------------------------------------------------------+
| Note  | 1276 | Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1                       |
| Note  | 1003 | /* select#1 */ select `dept_emp`.`D`.`dname` AS `dname`, ... join lateral (
                 /* select#2 */ select min(`dept_emp`.`E`.`sal`) AS `min_sal`...                                     |
+-------+------+-----------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)


That's not nice, right? The additional information printed with SHOW WARNINGS only marks the involved queries, so you could more easily identify them in the plan (corresponding to first column id). For the second query there is also a note about how the reference from the second query was resolved in the first query.

Now, here is how the new execution plan outputs look like:

Query 1

+-----------------------------------------------------------------+
| EXPLAIN                                                         |
+-----------------------------------------------------------------+
| -> Nested loop left join
    -> Table scan on D
    -> Index lookup on DT using <auto_key0> (deptno=d.deptno)
        -> Materialize
            -> Group aggregate: min(e.sal), avg(e.sal), max(e.sal)
                -> Index scan on E using fk_deptno                |
+-----------------------------------------------------------------+
1 row in set (0.0025 sec)

Query 2

+----------------------------------------------------------------------+
| EXPLAIN                                                              |
+----------------------------------------------------------------------+
| -> Nested loop inner join
    -> Invalidate materialized tables (row from D)
        -> Table scan on D
    -> Table scan on LDT
        -> Materialize (invalidate on row from D)
            -> Aggregate: min(e.sal), avg(e.sal), max(e.sal)
                -> Index lookup on E using fk_deptno (deptno=d.deptno) |
+----------------------------------------------------------------------+
1 row in set, 1 warning (0.0007 sec)
Note (code 1276): Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1


That's nice, right? You see all the operations that would be executed and the indentation helps understand how they're nested or in other words which is executed before others. I will have to see how this new format would evolve, but even now I think it would be quite useful for complex queries, so it's very good they added it.

Thank you MySQL developers!

2019-05-24

Translations of Bugzilla 5.0.5 and 5.0.6

A few days ago I accidentally noticed that there are new versions of Bugzilla namely 5.0.5 (from 2019-01-30) and 5.0.6 (from 2019-02-09), which apparently appeared silently in the beginning of the year. Silently, because at leas I haven't received any notification and apparently I haven't logged in the web interfaces of my installation quite a while to notice the notification on the main page. I would have to check with the developers abut the notifications by e-mail, because they may have figured out something new. You could download the last version from the official page, but I highly recommend you to upgrade with Git - it's lot more easier and faster, especially if you have additional extensions, skins and other changes (see Upgrade below).

About the translation

Anyway, today I managed to update the translation. The changes were not much, because these are pure maintenance releases without significant new functionalities, which to come with new templates. Everything is already pushed to the repository, which I want to reorganized and eventually migrate to Git, but I still do not find the time. Generally the project seems not to develop well in the past 2-3 years, so I do not know it worth it. I'm leaving it as a task eventually for the summar.

Upgrade

This morning, after I prepared the translations, I managed to upgrade my installation, which was absolutely flawless following the instructions for direct upgrade from Git. With 5.0.5 the table bugs_fulltext was converted from MyISAM to InnoDB, and with 5.0.6 the type of the columns type_id in the tables flaginclusions, flagexclusions and flags was changed from  smallint to mediumint (see Integer types), which show allow for many more flagtypes. I got delayed a little only because I had to cleanup some local changes and hide (with .gitignore) some files (logs, patches, archives with translations, and etc.). That reminded me to issue a pull request for one of my local changes - adding of viewport meta tag, which is important nowadays for support of mobile devices.

Download the translations

You could download the translations form the usual places by FTP or HTTPS or download directly  version 5.0.6 and upgrade the translation by just de-archive in the base directory.

Happy using! And of course happy Bulgarian Education and Culture and Slavonic Literature Day (a.k.a Saints Cyril and Methodius' Day)!

2019-04-30

New features in MySQL 8.0.16 ecosystem

Last Thursday (2019-04-25), just before the Orthodox Easter holidays Oracle released version 8.0.16 of the products in the MySQL ecosystem (including server, router, shell, connectors and workbench GUI). The new versions of the server continued the tradition established with MySQL 8 of introducing new features even though they're still labeled "Maintenance Release". On the last Pre-FOSDEM MySQL Day Giuseppe Maxia suggested in a tweet that semantic versioning should be used and I stay behind this as I understand and support semantic versioning.

This would be really helpful, because version 8.0.15 of the server was as a true maintenance release fixing only an important group replication issue. Following the new numbering scheme (see MySQL 8.0: It Goes to 11!) all other products from the ecosystem were bumped to 8.0.15 even without any changes whatsoever. I'm really not sure this is necessary, because as soon as you're using the same major (and minor) versions then there shouldn't be a problem with compatibility. And the same version numbers doesn't really ensure full compatibility between the products as my bugs on MySQL Workbench (e.g. 90620, 92900, 92908 and 94012) show.

MySQL Server

So, what's new in the new release of the server. Here are my picks.

CHECK constraints

Personally, I was waiting for this feature since MySQL 4 when they started adding major new features into the server and it started becoming more like other more advanced databases. Bug #3465 was opened in April 2004 (and was there promised to be fixed in MySQL 5.1), so this SQL feature is finally implemented after more than 15 years.

Until now (as you may have noticed) the CHECK keyword in CREATE TABLE was parsed, but silently ignored. We used workarounds in the past. For example one could ensure only nonnegative values in column with numeric type (integers and fixed/floating point types) by using the nonstandard attribute UNSIGNED (see Numeric Type Attributes). More complex constraints could be implemented through the use of triggers and raising conditions (see SIGNAL syntax) as I've shown in my presentation MySQL 8 for developers on BGOUG Spring 2018 Conference and these would still be required.

MySQL 8.0.16 now supports both table constraints and column constraints (see CHECK constraints) for all storage engines. Table constraints appear anywhere in CREATE TABLE outside column definitions and could refer to one or more columns even with forward references (i.e. for columns defined later in the statement). Column constraints appear within the definition of a column and could refer only to this column. Unfortunately, the expression that specifies the check condition could use only literals, deterministic built-in functions, and operators (i.e. stored routines, UDFs, variables and subqueries are not permitted). It is also not possible to use CHECK constraints on columns with foreign key referential actions (i.e. ON UPDATE, ON DELETE).

The thing to consider with CHECK constraints is that error is raised only if the condition evaluates to FALSE. If it evaluates to UNKNOWN due to NULL values error won't be raised. This is covered very well by Markus Winand in his article The Three-Valued Logic of SQL, so I highly recommended you to read the whole article.

CHECK constraints have a name of up to 64 characters and if not specified the server would generate a name like [table name]_chk_[ordinal number], because the names must be unique per schema. A useful feature is the possibility to create, but not enforce the constraint (i.e. with NOT ENFORCED clause), which is like enable/disable for triggers that is a feature I'd like to see implemented in future releases.

For basic usage and simple examples of CHECK constraints in MySQL, please check the manual and MySQL 8.0.16 Introducing CHECK constraint blog post. There's a more interesting example of CHECK constraint on JSON data in the article MySQL 8.0.16: how to validate JSON values in NoSQL with check constraint.

Here's my first example. I think it's quite common to have two dates in a table (e.g. start and end date) that should represent the beginning and end of something, so they should be in chronological order. Let's try to do this on the emp table from the example DEPT and EMP schema of Oracle, which I adapted for MySQL.

ALTER TABLE emp
  ADD COLUMN retdate DATE AFTER hiredate,
  ADD CONSTRAINT ret_after_hire CHECK (
retdate > hiredate);

However, when I tried to update an employee row that already has value for hire date with the following query (the retirement date presumably coming from wrong user input and/or bad application):

UPDATE emp
   SET retdate = STR_TO_DATE('1019-04-25', '%Y-%m-%d')
 WHERE empno = 7369;

1 row(s) affected Rows matched: 1  Changed: 1  Warnings: 0

the query unexpectedly succeed and there was no "constraint violated" error. After discussing with Frédéric Deschamps on Slack I filed bug #95189. It seems to be related to replication, because on my master MySQL 8.0.16 server on windows the problem isn't reproducible and the query works as expected, but the test I've done was on my MySQL 8.0.16 server running on Slackware64 -current, which is replicating from my main MySQL 5.7.26 server.

Here's another example. A table constraint that ensures hire date is today or in future and salary is positive number.

ALTER TABLE emp
  ADD CONSTRAINT emp_chks CHECK (hiredate >= CURDATE() AND sal > 0);

Such a constraint may seem reasonable (if you consider just inserting data), but of course results in:

Error Code: 3814. An expression of a check constraint 'emp_chks' contains disallowed function: curdate.

because CURDATE is non-deterministic function. Such constraint is possible in PostgreSQL (which also supports usage of stored routines and UDFs), but not in Oracle (that supports usage of storage routines) and MariaDB (which pretty much has the same restrictions as MySQL although not clearly listed). The problems are how do you validate the constraint for existing rows and how do you update rows, because the value of CURDATE changes every day. The solution is to create an additional column preserving the current date when the row was created and then use it to validate the constraint.

If you try to directly create the column and enforce the constraint (i.e. with an ALTER TABLE) you'll of course get:

Error Code: 3819. Check constraint 'emp_chks' is violated.

because check constraints, like other constraints (e.g. primary and foreign keys, unique indexes, NOT NULL), are validated for all the rows when created and should always remain valid. So, let's try another way.

First, initialize the new column from existing data (e.g. based on hiredate column as expressions are possible as default values since 8.0.13):

ALTER TABLE emp
  ADD COLUMN created DATE DEFAULT (hiredate);

Then, modify the column and add the check constraint:

ALTER TABLE emp
  MODIFY COLUMN created DATE NOT NULL DEFAULT (CURDATE()),
  ADD CONSTRAINT emp_chks CHECK (hiredate >= created AND sal > 0);

Now, let's try some queries:

UPDATE emp SET hiredate = '1979-04-25' WHERE empno = 7369;

INSERT INTO emp
  (empno, ename, job, mgr, hiredate, sal)
VALUES
  (9999, 'MULDER', 'INVESTIG.', 7839, '2019-04-25', 4242);

As expected, the UPDATE query produces error:

Error Code: 3819. Check constraint 'emp_chks' is violated.

but the INSERT query succeeds, which is unexpected. Apparently, the value of created is not initialized with the default value when the CHECK constraint is validated. I tried the same (although slightly different syntax) on Oracle XE 18 and it worked as expected - both the UPDATE and INSERT queries violated the CHECK constraint.

There is a new table CHECK_CONSTRAINTS in INFORMATION_SCHEMA that provides information on defined check constraints in all sachems. Additional information about table name and whether a CHECK constraint is enforced or not could be retrieved from table TABLE_CONSTRAINTS by filtering the data by the new value CHECK of column CONSTRAINT_TYPE.

Spatial data

After ST_Distance since 8.0.14, now ST_Length function also supports optional second argument unit, so it's possible to calculate lengths in the different supported units as defined in INFORMATION_SCHEMA.ST_UNITS_OF_MEASURE table.

Server does all upgrades

The server is already capable of upgrading everything - the mysql schema, data dictionary and system tables as well as PERFORMANCE_SCHEMA, INFORMATION_SCHEMA, sys and user schemes if necessary (see What the MySQL Upgrade Process Upgrades), so mysql_upgrade command is going into retirement. It's an important administrative feature, because this would make upgrades easier and more convenient. In the past I regularly forgot to execute the command, which led to strange problems later.

In regard to upgrades I had this strange problem that the server wouldn't start with the error message [ERROR] [MY-013384] [Server] Could not create server upgrade info file at '/var/lib/mysql/data/' although the permissions were OK. I was able to find the gist mysql will not start after 8.0.15 to 8.0.16 update on Ubuntu 16.04 LTS with Google and after creating the file mysql_upgrade_info in /var/lib/mysql/data/ and changing its ownership to mysql:mysql the server was able to start successfully. I was upgrading from 8.0.14 in fact, but I think it may be a bug. Perhaps the server expects the file to have been created.

Another interesting new feature in the option --validate-config for validation of server configuration much like system administrators are used to do so with other servers (e.g. Apache). It's really important especially for production environments where unexpected downtime could be extremely unpleasant. Read more in Server Configuration Validation.

System users

MySQL accounts are now categorized and thus distinguished by system (who posses the SYSTEM_USER privilege) and regular users (who does not). This allows for better separation of roles as only system users could perform certain administrative operations on system accounts as previously any user having the appropriate privileges could for example delete any account or kill connections belonging to any user. It's also possible to revoke global privileges partially for certain schemas (i.e. like an exceptions) through the new system variable partial_revokes, which previously required granting privileges separately for each existing schemes and adding grants for each new schema.

Security

The new things are support for TLS 1.3, possibility to update SSL certificates without restarting the running server (see ALTER INSTANCE RELOAD TLS) and information about the certificates in PERFORMANCE_SCHEMA's keyring_keys table).

MySQL Router

The router now has an HTTP component that allows it to expose simple web-interface and REST APIs for the purpose to ensure better observability through integration with external tools for monitoring and management. Other notable changes include dynamic change between single and multi master modes and improved logging. There is now the WITH_ROUTER CMake option to build router together with the MySQL server, which defaults to ON and which I decided to switch OFF, because I plan to continue building router as a separate package.

MySQL Shell

The shell comes with the new Shell Reporting Framework (see Reporting with MySQL Shell), which allows registration, displaying and monitoring of custom reports. I'm eager to try it and I'll write a separate post on this later. See the article MySQL Shell 8.0.16: User Defined Reports by Jesper Wisborg Krogh. It's now also possible to execute SQL without changing the mode, as the \sql command now doesn't change the mode if provided with an SQL statement, which I think is pretty useful.

MySQL Workbench

MySQL Workbench now supports window functions in SELECT (see bug #90620), expressions in DEFAULT (see bug #92900) and as key parts (see bug #92908) as well as the LATERAL keyword (see bug #94012) - all of these implemented into previous 8.0.x versions of the server. Unfortunately, I do not see any support for CHECK constraints (see bug #95143 that was already verified), so again the same version numbers mean nothing.

2019-03-30

Finishing with Dir.bg's mail

From the beginning of last year I had problems with the mail of Dir.bg, which I'm using from 20 or so years. The service was unavailable by IMAP and I was loosing messages. After I didn't mange to get in touch with support I wrote in Linux-bg.org's forums with the hope somebody to know something about. Leaving aside the needless and pompous comments, one of the users responded and my problem was solved (I supposed it was an administrator). Since then there were several short term outages of the service, but last week I noticed that the mail is again not available most of the time. Since yesterday unfortunately again I do not have any access neither through IMAP nor through the web interface.
Now this forces me to think seriously for migrating completely to Gmail. This includes:
  • copying about 1 GB mails, which shouldn't be a problem considering the capacity of today's connections as long as the service becomes available again to which I strongly hope, because I have some important messages.
  • redirection of N in count services to my mail address in Gmail. This would be a tough task, because I'm not even sure I'll be able to enumerate all although I keep the mails with registration confirmation, but for now I'm considering keeping the address in Dir.bg until I transfer everything. I have already started, because I didn't receive the mails from BGOUG.
  • updating web pages in Internet, source code and others.
I'm really not sure why it happened like this and I'm sorry, because the service was useful and comfortable. I used it without problems quite some time (even since Gmail did not existed and there was Yahoo). Apparently however recently the service is not supported well and I think they have problems with spam and other attacks.

That's it. I maybe post something short, when I'm done with the migration.

2019-02-10

First migrations from CVS to Git

Last May I had the good intention to start gradually migrating my old CVS repositories to Git, but I only managed to begin the process this weekend. Before plunging deeper I decided to migrate some smaller repositories with linear history, so I could exercise, verify and master the process. As a result some small projects already landed on my GitHub profile (see guthub.com/gdsotirov). I have read much about migrating from CVS and SVN to Mercurial and Git in relation to an internal project in the company I'm working for. I have also already experimented migrating from CVS to SVN in the past and from CVS to Hg/Git more recently, so I considered myself prepared, but still there are some things to consider when doing such migration.

Repository preparation

Revision log messages. You may have a convention for writing revision log messages in CVS, but I did not have one for my personal projects, so I have written comments on single line and on multiple lines as unnumbered list like this:

* Add this
* Modify that

I had to modify these comments in CVS, before the migration, because these simply do not look good in TortoiseGit and GitHub. I also aligned some similar comments on files committed in few minutes time span, so that the migration could group them in a single revision. Depending on whether you had a convention for the format of the revision log messages in CVS you may need to make other adjustments. The easiest way in such case may be to modify directly the revision control files with perl, sed, awk, etc. while of course paying caution not to ruin your history (i.e. such operations should not be performed on "live" production repository).

Authors. You should be able to identify all your authors, because in CVS repositories they are normally UNIX login names and the users may no longer exist in the system. I've seen even cases where the same username was reused by different developers having the same given and family names. It's hard to identify and fix such revisions unless you're aware of developers' internship in the company, but what if their time in the company overlaps?
Hopefully, I do not have this problem with my personal repositories, but still I had to rewrite my username as full name and email address (see more about this in chapter Step One: The Author Map of Eric Raymond's DVCS migration HOWTO). It's very important to rewrite the authors, because otherwise the commits won't be properly identified in GitHub (see Contributions that are counted). If you need to fix usernames it's preferable to do it in the CVS repository before the conversion. And the easiest way to map usernames to authors is to rewrite the converted Git repository as I write about below.

Tools

In such a process one of the most important things is to chose the right tools, because you simply cannot go without them. Last year at work we evaluated several options and finally choose cvs2git, which is part of cvs2svn project. The command is written in Python and supports various options. For me it is important to make the best possible migration, so I had to make the following modifications in the example options file:

KeywordHandlingPropertySetter('expanded'),

from cvs2svn_lib.keyword_expander import _KeywordExpander
# Ensure dates are expanded as YYYY/MM/DD as in CVS 1.11
_KeywordExpander.use_old_date_format()


The first line ensures that CVS keywords (e.g. $Id$, $Date$, etc.) are expanded in text files (see topic cvs2svn changes 'date' string in source codes). The following three lines ensure that dates in keywords are expanded as in CVS 1.11 or in YYYY/DD/MM format, instead of the default format YYYY-MM-DD as in CVS 1.12 and SVN.

Migration

The way to covert a CVS repository to Git is described in cvs2git's documentation (see chapter Usage), so just read it. I wrote myself a shell script to execute the different steps defined there. Initially I used cvs2git only with command line options, but because of my requirements for keywords expansion (see above), I had to improve the script to generate options file and use it instead. Like this I retained the way I call the script by just providing path to the CVS repository.

For now I haven't had problems with the migration - the history was represented properly in Git. However, with many tags and branches over different files things could get really messy as I found out with the internal project at work. After the migration I check the result and if necessary fix more revision log messages. Then, I just rewrite authors with the script given in Changing author info and finally import the repository to GitHub (e.g. I use git daemon --verbose --export-all --base-path="/path/to/repo" to serve the converted repository). After the import to GitHub I have to add LICENSE and README.md files, but this is quite trivial directly through the web interface.

And that's all. I have more repositories to migrate and I really hope I'll be able to finish the task until the end of the year, but it would all depends on how quickly I'll be able to prepare the repositories, because after this it's the jobs of the tools.

2019-01-31

My MySQL Bugs

Since I started using MySQL 3.23.x during my university years I was trying to solve all my problems myself like with all the other free and open source software I was using. At first it was more about learning (or lack of knowledge if you want), but then the problems become more complex and harder to solve. Well, I was successful most of the times, so I didn't consider reporting bugs or asking for support. However MySQL ecosystem grew large since then, the software become lot more complex and the source code base increased significantly, so nowadays it's native to think one could manage to fix all the problems he encounters without going deep in product's internals or relaying on professional support. The first is possible only for MySQL developers well familiar with the code base and large organizations like Facebook, Booking, etc., which have dedicated teams and regularly contribute patches. The second depends on you and of course the company providing the support.

I haven't reported much bugs for MySQL - just 21, but I now consider reporting everything I found which seems like a bug, because this is the right thing to do. I may as well solve the problem myself, but someone else could be stuck by the same and should at least be aware that it's a known (and eventually already resolved) problem. With each one of these I learned a little something, so here below I make a review of my bugs so far describing their fate in chronological order.
  1. Bug 20098 (my first one) was reported on 2006-05-26 and is with status "No Feedback" as of 2006-07-01. It was about Query browser searching for a file that is not present (preferences.glade). I have forgot about this one and found out it was closed later, because no feedback was provided, which is fun as I have provided the requested information...
  2. Bug 69459 was reported 2013-06-13 and is in status "Can't repeat" as of 2013-12-24. It was about MySQL Workbench crashing on entering dot in a SQL string when automatic code completion is enabled. I reported the bug for version 5.2.47 CE, but I was suggested to try 6.0.7, which worked, so the bug didn't receive more attention since 5.2 series didn't receive more updates.
  3. Bug 73076 (my first confirmed and fixed one) was reported on 2014-06-22 and is with status "Closed" as of 2014-08-26. It was about MySQL Workbench actually committing modifications in a record set when you only apply them. It was a nasty one making my daily work with Workbench difficult as I had to be careful not to apply (pardon commit) too early. The problem appeared in 6.0.x, 6.1.x and was finally fixed with the release of 6.2.2 on 2014-09-05 (see release notes) after being verified by support.
  4. Bug 73079 was reported on 2014-06-23 and is with status "Closed" as of 2014-09-15. It was about MySQL Workbench not refreshing the cell after setting value to NULL. It affected 6.1.x and early 6.2.x versions. Although it wasn't officially confirmed for me it was a regression, but more importantly it was fixed with the release of 6.2.3 on 2014-09-23 (see release notes).
  5. Bug 73708 was reported on 2014-08-25 and is with status "Closed" as of 2014-12-01. I was about MySQL Workbench models being completely mangled. I forgot this one for a few months and provided the requested feedback later to realize the problem was already fixed with the release of 6.2.4 on 2014-11-20 although nothing was mentioned in the release notes.
  6. Bug 73770 (my oldest one still open) was reported on 2014-08-29 and is still with status "Verified" as of 2018-02-05. It's a feature request for MySQL synchronizer between model and database to display the actual differences. Whoever used this functionality should know that applying changes in the model to the database could be quite annoying, because there's always some changes you do not expect and furthermore you do not understand. You only see the update script generated, but it would be nice to see the actual difference, which in some cases may not even be difference (see 90772 below).
  7. Bug 82202 was reported on 2016-07-12 and is with status "Closed" as of 2018-01-31. It was about impossibility to link Connector/ODBC 5.3.6 to the shared libmysqlclient.so due to undefined symbols (e.g. my_malloc, my_free, etc.). The problem was quickly explained, but then closed as "Not A Bug". It was reopened after a Fedora maintainer reported the same problem and finally fixed with release of Connector/ODBC 5.3.10 on 2018-01-30 (see release notes).
  8. Bug 84951 (my first suggested patch) was reported on 2017-02-10 and is with status "Duplicate" as of 2017-02-12. It was about problem building MySQL Workbench 6.3.9, because of compile errors in jsonparser.cpp and jsonview.cpp for which I suggested a patch. The bug was made duplicate of 84886, which also contained a patch provided two days earlier, but it was closed on 2018-05-14 as "Won't fix", because they "no longer support 32bit systems" (!?).
  9. Bug 89608 was reported on 2018-02-09 and is still with status "Verified" as of 2018-02-13. It is a feature request about having password requirements messages in MySQL Workbench match the server configuration, so they're more useful. I provided sample code about how it could be done and even wrote a SQL function myself to test the idea.
  10. Bug 89615 was reported on 2018-02-10 and is with status "Unsupported" as of 2018-08-23. It was about failure to build MySQL Shell 1.0.10, because compilation fails for undefined vio* functions. It was related to 82202 and although confirmed it was later closed with the explanation that "MySQL Shell 1.0 is no longer supported and version 8.0 should be used instead" (!?) to which I reacted a bit harsh, but I really do not understand such answers.
  11. Bug 90619 was reported on 2018-04-25 and it is with status "Duplicate" of 79315 as of 2018-04-25. It was about MySQL Installer not offering upgrade from MySQL 5.7 to 8.0, which was explained as "not a bug", because it was "working as expected", so apparently MySQL Installer doesn't support upgrade between major releases. I hope this changes in future.
  12. Bug 90620 (my first of a series) was reported on 2018-04-25 and is still with status "Verified" as of 2018-04-26. It is about MySQL Workbench 8.0.11's SQL editor displaying error on SELECT query with window functions. You wouldn't expect MySQL Workbench having the same version as the server to lack support for new server functionalities, but more about this later.
  13. Bug 90727 was reported on 2018-05-03 and is with status "Closed" as of 2018-06-01. It was about impossibility to link Connector/C++ 1.1.11 due to missing mysql_sys and mysql_strings libraries. It turned out to be a regression from some Solaris specific modifications not made Solaris specific. I got in touch with the developer and it was declared fixed in 1.1.12, which got released on 2019-01-28 (see release notes), so for 1.1.11 I had to patch it myself.
  14. Bug 90772 was reported on 2018-05-06 (St George's Day) and is still with status "Verified" as of 2018-05-09. It is about MySQL Workbench synchronizer making difference between single quotes escapes (e.g. the statement COMMENT 'Currency\' symbol' is different from the statement COMMENT 'Currency''s symbol'). It is because MySQL Workbench allows backslash escape and the server accepts it execution, but then rewrites it with single quote escape. Thus, on the next synchronization the difference remains (e.g. like you haven't synchronized at all).
  15. Bug 90876 was reported on 2018-05-15 and is with status "Closed" as of 2018-06-21. I was about MySQL Shell 8.0.11 giving error 5115 on adding documents to collection in Server 5.7, but what it was actually about was automatically generated identifiers as I explained in details in my article MySQL Shell 8 and automatically generated document IDs. This one resulted in documentation update and since nobody else experienced this problem I guess it's fine, but for me such problems break backwards compatibility.
  16. Bug 91841 was reported on 2018-07-31 and is with status "Closed" as of 2019-01-25. It was about impossibility to build MySQL Connector/ODBC 5.3.11, because of compilation errors. I was able to fix the compilation errors myself (see patch) until waiting for help and explanation. And the explanation was that it was duplicate of an internal bug, "which explains the reasons for the failure", but which I'm not able to read. The funny thing about this one was that there were more than 400 lines wiped out from the beginning of a header most probably related to copyright text updates. It was fixed with the release of 5.3.12 on 2019-01-28 (see release notes).
  17. Bug 92898 was reported on 2018-10-23 and is with status "Not a Bug" as of 2018-10-23. I consider this as my one and only "support request" and I was pleased, that it was explained the same day.  It is about hitting an incompatible change during upgrade from 8.0.12 to 8.0.13, because I still had same old views using ASC or DESC qualifiers for GROUP BY clause. In this regards I'd like to thank MySQL for the detailed release notes they publish although as in this case even if you read them carefully you may omit something by considering it doesn't concern you. For this reason it's highly recommended to use MySQL Shell's Upgrade Checker Utility, which could tell you about compatibility errors and issues in advance.
  18. Bug 92900 was reported on 2018-10-23 and is with status "Verified" as of 2018-10-23. It is about MySQL Workbench 8.0.13 lacking support for expressions in DEFAULT although the feature is supported by the same version of the server.
  19. Bug 92908 was reported on 2018-10-23 and is with status "Verified" as of 2018-10-23. Is is about MySQL Workbench 8.0.13 lacking support for expressions as key parts although the feature is supported by the same version of the server.
  20. Bug 93835 was reported on 2019-01-07 and is with status "Verified" as of 2019-01-10. It's a feature request for having MySQL Workbench display a warning on using keyword as identifier (e.g. table and column names). I opened it after Frédéric Descamps suggested so in a comment on LinkedIn. Fingers crossed it's implemented.
  21. Bug 94012 was reported on 2019-01-23 and is with status "Verified" as of 2019-01-23. It is about MySQL Workbench 8.0.14 not recognizing the new LATERAL keyword and giving error after it in SQL editor. It's by now the last one of the series on MySQL Workbench not supporting the features of the server even though they're the same version. The bug was nominated as MySQL Bug of the Day by Valerii Kravchuk on the same day.
I hope I haven't bored you with all the information above. It's not always easy to properly explain a problem, but I have quite some experience being on the "other side", so I think I know what information is required for a problem to be investigated thoroughly. Still reporting bugs properly is probably an art and I'm looking forward to Valerii Kravchuk's presentation How to create a useful MySQL bug report at FOSDEM on Saturday.

Some of my bugs are pure technical ones (e.g. 82202, 84951, 89615, 90727 and 91841) for which I reported all the details and still the explanations on some of them seem pretty strange to me (e.g. no longer support 32bit systems, MySQL Shell 1.0 is no longer supported and version 8.0 should be used instead) even though the problem and what should be fixed was quite clear. I do not understand how could you not release a new maintenance version when there are open bugs for the last such release in the series? What vendors should understand is that as a package maintainer for me it's more complex to support patches for problems not fixed upstream.

I've made a series of bugs about unsupported sever features in MySQL Workbench (e.g. 90620, 92900, 92908 and 94012), which made me think about the meaning of version number alignment that was done with the release of MySQL 8.0.11 as General Availability release (see MySQL 8.0: It Goes to 11!). As I understand the post it was done, so that we use the "correct versions" of the all the products. It's generally a good idea, but shouldn't the "correct version" support the same feature set of the server? Users should be presumably less confused if so, but so far this is clearly not the case with Workbench. I hope the situation improves in future.

I'd like to complete these post with some final thoughts. As a developer I'm aware that could never be a software completely free of bugs. However, as a user I expect that when a bug is properly reported and the problem is clear the solution to not take that much time, but of course what really matters are priorities and they are normally as higher as the number of affects users is bigger.