Georgi Sotirov's Blog

My personal and professional life

2020-01-17

MySQL 8.0.19 suite released

MySQL 8.0.19 suite including Community Server, Router and Shell, was released on Monday (01/13). The focus of these releases is the new InnoDB ReplicaSet functionality that simplifies management of classic MySQL replication with single primary and multiple secondaries in the same way as setting up Group Replication with InnoDB Cluster. You could read more about it in the article The all new MySQL InnoDB ReplicaSet. Apart from this there are some other improvements for which you could read below.

SQL syntax

The server now understands the more general and standard ALTER TABLE … DROP/ALTER CONSTRAINT syntax for which I tweeted earlier. It's a feature request since MySQL 4.1 from more than 15 years ago. You could find more in Foreign Keys and Other Constraints section of ALTER TABLE article in documentation.

Table value constructors are another improvement towards the standard allowing generation of rows with the VALUES statement (not to be confused with VALUES keyword for INSERT statements). Due to name conflict with the VALUES function it's necessary to use the verbose form with ROW constructor. This basically means the standalone statement VALUES ROW(1), ROW(2), ROW(3) is now properly understood by the server. VALUES statement could be used also in unions, joins, as derived table and other cases where you could employ SELECT. See the VALUES statement article in documentation for more details.

The new TABLE statement implements another standard feature - explicit table clauses. It could be used like SELECT, because TABLE t is just like SELECT * FROM t, so it could be used in unions, with INTO, as IN subquery and elsewhere where SELECT could be employed.

An interesting new addition is the LIMIT clause for recursive CTEs, which allows the debugging of expressions generating too many results. The idea is simply to limit the result from a recursive CTE to some reasonable number (e.g. 10, 100, etc.), so you could see the generated output and be able to understand how and why it's generated like this. This new features is covered in the article A new, simple way to figure out why your recursive CTE is running away and in Limiting Common Table Expression Recursion section of WITH (Common Table Expressions) in documentation.

It's is now possible to use an alias for the new row and optionally columns with INSERT ... ON DUPLICATE KEY UPDATE statement following the VALUES or SET clause with the use of AS keyword. I guess this is the first step to removing VALUES function (see above) for better standards support. You could find more in the article for the ON DUPLICATE KEY UPDATE Statement.

Datatype changes

The TIMESTAMP and DATETIME types could now include timezone information as offset (e.g. +02:00 or -08:00) appended to the value. The details are in the article The DATE, DATETIME, and TIMESTAMP Types in documentation.

The YEAR(4) data type (i.e. with explicit display width) and the (undocumented) UNSIGNED attribute for YEAR are now deprecated and would be removed in future. This follows other deprecations of display widths and other type attributes from 8.0.17 for which I wrote before, so developers should check and correct any SQL scripts using deprecated types and attributes. With this release statements that print data type definitions (e.g. SHOW CREATE TABLE) no longer show display width for integers with exception for TINYINT(1) and ZEROFILL attribute.

Other changes in server

User-defined functions (UDFs) can now determine character set and collation of string argument as I noticed earlier. You could find more about this in the article A Tale of UDFs with Character Sets.

There are now new standard INFORMATION_SCHEMA views related to roles like ADMINISTRABLE_ROLE_AUTHORIZATION, APPLICABLE_ROLES, ENABLED_ROLES, ROLE_COLUMN_GRANTS, ROLE_ROUTINE_GRANTS and ROLE_TABLE_GRANTS.

InnoDB engine now supports efficient sampling of data for histogram statistics. Sampling no longer requires full table scan, which could be costly for large tables, but samples data pages from the clustered index instead. I should try this on first opportunity.

Function JSON_SCHEMA_VALID now provides detailed information about the reasons for validation failures by using SHOW WARNINGS.

Other notable changes is the possibility to lock user accounts after FAILED_LOGIN_ATTEMPTS login attempts for PASSWORD_LOCK_TIME days (see Failed-Login Tracking and Temporary Account Locking), compression in X protocol (see Connection Compression with X Plugin) and many more as you could find from the release notes.

Bugs fixed in server

The previously reported problem with EXPLAIN ANALYZE not executing and reporting on subqueries in the select list has been fixed (see bug 97296).

Workbench

As I read in the release notes of the new version that "The following MySQL server language features are now supported: value references in INSERT statements, new options (TLS version, cipher suite, compression, and privilege check) in CHANGE MASTER TO replication statements, random passwords in CREATE USER and ALTER USER statements, EXPLAIN ANALYZE, and binary collation names.", I was hopeful that the application is finally catching up with the server, but that was until I reviewed the new SQL syntax (see above).

Two of the bugs I reported previously were fixed - 97416 (for EXPLAIN ANALYZE) and 97281 (for ANALYZE TABLE ... UPDATE HISTOGRAM), but I opened two new for TABLE (see bug 98263) and VALUES (see bug 98266) statements. Such problems should be obvious, so it still strange to me how they continue appearing.

There is a new menu item "Copy Row (with names, tab separated)" for copying headers and rows of queried data with tabs as separators that makes it easy to move results into Excel or LibreOffice Calc.

Happy using and experimenting with the new MySQL releases!

2020-01-09

Using xrdp on various Linuxes and a problem on Fedora

I've been using various Linuxes in virtual machines since more than 15 years in order to keep my skills sharp. It's simply not enough to know just one Linux distribution, but you have to know at least the major ones (e.g. Debian, RPM or based on another package management system ones). So apart from Slackware, I also regularly use Debian, Ubuntu, Fedora, OpenSuSE, CentOS and more recently Oracle Linux (for Oracle XE). Over the years, I have also experimented with Gentoo, FreeBSD, OpenBSD, NetBSD and others.
What I immediately need in any of these is secure shell (SSH) of course, so I could type commands remotely. However, I also like to work and play with GUI software, so I also need remote graphics access. Over the years I used VNC and NoMachine, but I never liked any of these for various reasons. What I have always missed in Linux was an easy to use access like RDP with real session management and all the other goodies that is lightweight and robust (e.g. NoMachine also manages sessions, but encodes video of the desktop, which makes the picture blurred and frequently doesn't work on my Linux virtual machines after upgrades).
So, I discovered xrdp, which is "an open source RDP server". It's easy to install, configure and run on Debian, Ubuntu and OpenSuSE and it just works. The installation for Fedora requires some additional steps, which are described in README.Fedora by xrdp package's maintainer Bojan Smojver. By default, xrdp on Fedora works only with VNC sessions, but I already said I do not like it, so I prefer using Xorg like with my other Linuxes. I'll summarize the steps here:
  1. Install necessary packages: dnf install xrdp xorgrdp
  2. Edit /etc/xrdp/xrdp.ini and enable Xorg sessions;
  3. Enable access by anybody or just specific users, e.g. echo "allowed_users = anybody" >> /etc/X11/Xwrapper.config (see also bug 1450720);
  4. Enable port 3389 on the firewall, e.g. firewall-cmd --zone=dmz --add-port=3389/tcp
  5. Point your RDP client to the Linux machine and have fun.
Well, until yesterday. After upgrading my Fedora virtual machine to Fedora 31 the RDP connection stopped working with Xorg sessions although VNC sessions were working fine. The xrdp log at /var/log/xrdp-session.log was suggesting that the Xorg session is crashing just after it's started, so I was getting a blank screen.

[20200109-10:12:53] [INFO ] setpriv --no-new-privs Xorg :10 -auth .Xauthority -config xrdp/xorg.conf -noreset -nolisten tcp -logfile .xorgxrdp.%s.log
[20200109-10:12:53] [CORE ] waiting for window manager (pid 4902) to exit
[20200109-10:12:57] [CORE ] window manager (pid 4902) did exit, cleaning up session
[20200109-10:12:57] [INFO ] calling auth_stop_session and auth_end from pid 4877

Investigating the reason, I noticed the following in the journal (e.g. with journalctl -r):

Jan 09 10:12:55 fedora systemd[4881]: Failed to start GNOME Shell on X11.
Jan 09 10:12:55 fedora systemd[4881]: gnome-shell-x11.service: Failed with result 'protocol'.
Jan 09 10:12:55 fedora gnome-shell[5157]:   Current serial number in output stream:  359
Jan 09 10:12:55 fedora gnome-shell[5157]:   Serial number of failed request:  358
Jan 09 10:12:55 fedora gnome-shell[5157]:   Minor opcode of failed request:  7 (RRSetScreenSize)
Jan 09 10:12:55 fedora gnome-shell[5157]:   Major opcode of failed request:  139 (RANDR)
Jan 09 10:12:55 fedora gnome-shell[5157]: X Error of failed request:  BadMatch (invalid parameter attributes)
Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRScreenSetSize: not allowing resize
Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRScreenSetSize: width 1688 height 932 mmWidth 447 mmHeight 247
Jan 09 10:12:55 fedora gnome-shell[5157]: Xlib:  extension "DPMS" missing on display ":10.0".
Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRGetInfo:

So, gnome-shell was failing actually. Initially, I found bug 1575792, which seemed similar, but it wasn't resolved. Finally, the problem turned out to be the latest version of xorgxrdp 0.2.12 (see issue #156). I had to simply downgrade to xorgxrdp 0.2.11, by forcefully installing the RHEL 8's xorgxrdp package like this:

rpm --nodeps -i xorgxrdp-0.2.11-1.el8.x86_64.rpm

And things are again back to normal.

Update 2020-01-14: Of course it's good to eclude package xorgxrdp from upgrades (see Excluding Packages From Transactions) or simply echo "excludepkgs=xorgxrdp" >> /etc/dnf/dnf.conf as one line command.

Happy using your favorite Linux distro!

2019-12-25

Bugzilla in Bulgarian on GitHub

Exactly 14 years ago on this day I committed the localization of Bugzilla 2.18.1 in Mozilla-BG project repository on OpenFMI. I have considered using Bugzilla as bug tracker for my home server since the spring of 2005 and when I was reviewing the project I noticed that there is no Bulgarian localization, so I decided to contribute. The translation was done in the summer of 2005 and in the autumn I began searching for a way to publish it. Initially, I considered SourceForge, but finally decided to become part of Mozilla-BG project after I was invited by Ognyan Kulev.
The localization was moved to BitBucket in 2012 after OpenFMI was decommissioned, so the repository was converted from Subversion (SVN) to Mercurial (Hg). It was then when some problems with the SVN repository become evident, but I found no time to fix these until now. Well, it was about time to migrate the project to Git and publish it on GitHub and I accomplished it last week. The Bugzilla-BG project is now on GitHub with all its history since December 2005. And ever since you could try the localization at Sotirov-BG.Net Bugzilla.
The localization went a long way from CVS on my home server initially to SVN, then Hg and finally ended up in Git. My only project that went through this path :-)

Have fun!
Merry Christmas!

2019-12-05

Picture-in-picture mode for videos in Firefox 71

Firefox 71 arrived two days ago (see release notes) and I already absolutely love the new picture-in-picture mode for playing videos while you're browsing. It's a real deal for me, because this is the usual way I'm browsing today - having a video playing in the background and listening to it while surfing websites. The only drawback was that I had to switch to the video tab from time to time, when I'd like to actually see what's playing. This sometimes involved searching for the tab, because I usually have many tabs open in different groups (organized with Panorama View extension).
Now my online experience is much more pleasant, because I could just start a video, pop it up into a separate small window wherever on the screen I like and then continue surfing or do something else. Like this I could still keep an eye on the video even when I go out of Firefox. It's super useful even when I'm working on something, because I normally use many online resources. The feature is only available on Windows for now, but Mozilla promises to make it available for MacOS and Linux with the next release due in a month or so.
Thank you Mozilla for constantly making my online experience better!

2019-10-18

New features in MySQL 8.0.18

On Monday, Oracle released MySQL 8.0.18 (see also release notes) and unlike with previous releases there was lot of information about what would be included into this one. I personally attended several presentations and read some posts in social media. Like I already wrote the major new features in this release are hash joins and EXPLAIN ANALYZE, which aim to respectively improve the performance of some queries and give better insight about how optimizer plan compares to actual execution.

Hash join optimization

Before 8.0.18 the only type of join done by MySQL was nested loop (a loop within a loop, an inner loop within an outer one). And there were two algorithms implementing it:
  • Nested-Loop Join - it's the simplest where each row from the outer loop is passed to the inner loop for processing. The obvious drawback of this one is that the inner table needs to be read many times.
  • Block Nested Loop (BNL) - uses buffering of rows from the outer table to reduce the number of times the inner table is read. In this case instead of passing just one row to the inner loop multiple rows from the buffer could be passed at once. This greatly reduces the number of time the inner table needs to be read. Of course, this requires more memory (see join_buffer_size). This algorithm is used for range, index and ALL join types.
There is also Batched Key Access (BKA) algorithm, which uses buffering like BNL, but puts keys for the rows in the buffer and passes them in a batch to the database engine for index lookups. The keys are then used to fetch the rows from the inner table. This algorithm can be used when there is index access to the inner table.

Here come hash joins. The idea is to build a hash table for the values from the outer table, which normally is the smallest. Then, read the inner table and find matches in the hash table. Like this both tables could be read at best just once. Hash joins are best for large results sets where indexes cannot be used. Due to their nature they are useful only for equality joins. Since 8.0.18 MySQL would choose hash join before BNL for any query that uses equality join condition (e.g. ref or eq_ref join types) and uses no indexes. Let's try it out.

For the example, I added a new table called job_sal with salary ranges per job to the dept_emp schema and I also generated 1 million more employees. Now let's say you want to find all the employees whose salary is outside the range. I would come up with a query like this:

SELECT E.ename, E.sal, JS.sal_min, JS.sal_max
  FROM emp     E,

       job_sal JS
 WHERE E.job = JS.job

   AND E.sal NOT BETWEEN JS.sal_min AND JS.sal_max;


Which given the absence of indexes would be executed using BNL algorithm in MySQL prior to 8.0.18, but in the new release the same query would benefit from the hash join optimization as visualized by the following execution plan in tree format:

+----------------------------------------------------------------------------------------+
| EXPLAIN                                                                                |
+----------------------------------------------------------------------------------------+
| -> Filter: (E.sal not between JS.sal_min and JS.sal_max)  (cost=499211.71 rows=442901)
    -> Inner hash join (E.job = JS.job)  (cost=499211.71 rows=442901)
        -> Table scan on E  (cost=1962.39 rows=996514)
        -> Hash
            -> Table scan on JS  (cost=0.75 rows=5)
 |
+----------------------------------------------------------------------------------------+
1 row in set (0.0023 sec)


Visual Explain Plan
The plan reveals that the smaller table JS would be hashed and the join would be done using the hash table. Unfortunately, the hash join optimization is not visible in traditional explain plan, JSON and thus into MySQL Workbench. The visual explain plan (look right) would be thus misleading showing BNL as the join algorithm. This is understandable, because the hash join optimization is possible only with the new iterator executor, which actually generates the explain plan in TREE format. This executor is not able to explain some queries, so you may see only the message "not executable by iterator executor". However, I really hope this is improved in future releases, because the explain plan should be consistent between formats. I reported a feature request as bug 97280.

I tested the performance of the query for 1M employees and its execution time was 0.89 sec. Disabling the hash join optimization with NO_HASH_JOIN hint increased the execution time to 1.26 sec. Hash join for sure would be much more beneficial when outer table has more rows.

EXPLAIN ANALYZE

This new feature also comes on top of TREE explain plan and it represents kind of a profiling tool, because apart from information about how optimizer plans to execute the query (see above) there's also information from its actual execution. This information includes:
  • the time for returning the first row (in ms);
  • the time for returning all rows (in ms);
  • the number of rows read;
  • the number of loops.
It makes it possible to compare how optimizer estimations compare to the real execution of the query. Let's try it with the previous query. I have intentionally shortened the output.

+-------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                               |
+-------------------------------------------------------------------------------------------------------+
| -> Filter: (E.sal not between JS.sal_min and JS.sal_max)
       (cost=502072.82 rows=442901) (actual time=0.372..747.742 rows=915768 loops=1)
    -> Inner hash join (E.job = JS.job)
         (cost=502072.82 rows=442901) (actual time=0.355..575.011 rows=1000014 loops=1)
        -> Table scan on E  (cost=2534.62 rows=996514) (actual time=0.185..353.877 rows=1000014 loops=1)
        -> Hash
            -> Table scan on JS  (cost=0.75 rows=5) (actual time=0.133..0.144 rows=5 loops=1)
 |
+-------------------------------------------------------------------------------------------------------+
1 row in set (0.7754 sec)


If it looks familiar to you check PostgreSQL's EXPLAIN ANALYZE. Please, note that times are milliseconds, not seconds! I think it's easy to note that number of rows estimated by the optimizer for the scan of table E differs, because table's statistics are not accurate. I have added 1M rows to table emp, so it's necessary to increase the number of sampling pages (see innodb_stats_persistent_sample_pages) and run ANALYZE TABLE again. After doing it the optimizer estimation and actual number of rows match. However, the optimizer is also wrong about the rows involved into the upper level operations - the hash join and filter, but this cannot be fixed with index statistics. According to Norvald H. Ryeng (see his article MySQL EXPLAIN ANALYZE) both "estimated and actual numbers" of rows "are averages over all loop iterations", but I have just one loop for all the operations in the plan.

In any case EXPLAIN ANALYZE is a nice addition to the instrumentation of the optimizer. Unfortunately, neither TREE format nor EXPLAIN ANALYZE are available currently even in the latest MySQL Workbench, so another feature request from me as bug 97282.

MySQL is OpenSSL only

With this version the support for YaSSL and WolfSSL libraries are removed, so MySQL could be compiled only with OpenSSL. I have personally always built MySQL with OpenSSL (i.e. using option -DWITH_SSL=system), because this is what comes with Slackware, but of course the more important is that "the MySQL/OpenSSL combination is very well tested and production proven" as Georgi Kodinov explained in his post MySQL is OpenSSL-only now !.

Spatial

Function ST_Distance now accepts SRS arguments of all geometry types and not only arguments of types Point and Point, or Point and MultiPoint as before.
Happy using MySQL!

Update 2019-11-09: That timings are in milliseconds wasn't specified in the manual, so I filed bug 97492 as I also think that it would be good to print the unit next to timings.

2019-09-29

New usages of TREE explain plan in MySQL

As I wrote back in June (see TREE explain format in MySQL 8.0.16) a new execution plan format was introduced - TREE. It was a representation of the operations taken by the optimizer to satisfy a query in a nested form like a tree. The new format was defined experimental, so I was waiting to see how it would evolve in next versions. Well, last Friday as I was browsing the manual I noticed it's already updated for two new features coming in MySQL 8.0.18:
  • HASH joins would be visible only in TREE format. I find this rather strange, because you would expect to see this operation no matter the format of the plan, so I hope it's extended in future.
  • EXPLAIN ANALYZE would actually run the query and provide timing information about it's execution. I would also include additional iterator-based information about how the optimizer estimations compares to the actual execution.
I'm already eager to try the new version, which I presume should be released somewhere mid-October. Stay tuned as I'll write about these new features when they arrive and I'm able to play with them.

2019-07-31

New features in MySQL 8.0.17

Last Tuesday (July 27-th), Oracle released MySQL 8.0.7 following the three months release cycle introduced last year with 8 series. This is the next "maintenance" release introducing a few new features as well as depreciating some non-standard ones, so here are my picks starting with the ones I consider important for developers.

Multi-valued indexes

With the help of generated columns (introduced with MySQL 5.7.6) and functional indexes (introduced with MySQL 8.0.13 for which I wrote before) it become possible to index data in complex column values like JSON. But in JSON you could have scalars as well as arrays, so searching into arrays with the help of an index wasn't possible. Multi-valued indexes come to solve this by allowing multiple records in the index to point to the same data record. Such indexes are created like any other functional index and used automatically by the optimizer when possible. Let's see an example - a register for translators with their spoken languages.

CREATE TABLE translators (
  id INT AUTO_INCREMENT,
  jdata JSON,

  PRIMARY KEY(id)
);

  • Case 1 - array of strings
Let's create some data:

INSERT INTO translators (jdata)
VALUES ('{"name": "T1", "langs": ["English", "French", "Spanish"]}'),
       ('{"name": "T2", "langs": ["English", "Spanish"]}'),
       ('{"name": "T3", "langs": ["French", "Spanish"]}');


Then, let's query the translators that speak English using the new MEMBER OF operator:

SELECT id, jdata->>'$.name', jdata->'$.langs'
  FROM translators
 WHERE 'English' MEMBER OF (jdata->'$.langs');


Or the new function JSON_OVERLAPS like this:

SELECT id, jdata->>'$.name', jdata->'$.langs'
  FROM translators
 WHERE JSON_OVERLAPS(jdata->'$.langs', '["English"]');


Both queries are leading to the same result:

+----+------------------+----------------------------------+
| id | jdata->>'$.name' | jdata->'$.langs'                 |
+----+------------------+----------------------------------+
|  1 | T1               | ["English", "French", "Spanish"] |
|  2 | T2               | ["English", "Spanish"]           |
+----+------------------+----------------------------------+
2 rows in set (0.00 sec)


Given the data as expected the queries return T1 and T2, but not T3. However, these queries are doing a full table scan, so their performance would degrade with the accumulation of data in the table.

Execution plan without index

Fortunately, it's now possible to add a multi-valued index like this:

ALTER TABLE translators
  ADD INDEX idx_langs_arr ((CAST(jdata->'$.langs' AS CHAR(8) ARRAY)));

It's a functional index into which it's necessary to use function CAST with the new ARRAY keyword. With the index the execution plan of the SELECT queries above become respectively:
Execution plan of MEMBER OF with indexExecution plan of JSON OVERLAPS with index

  • Case 2 - array of objects
It's slightly different for arrays of objects, but only for the JSONPath expression. Let's create some data (after cleaning Case 1):

INSERT INTO translators (jdata)
VALUES ('{"name": "T1", "langs": [{"lang": "English"}, {"lang": "French"}, {"lang": "Spanish"}]}'),
       ('{"name": "T2", "langs": [{"lang": "English"}, {"lang": "Spanish"}]}'),
       ('{"name": "T3", "langs": [{"lang": "French"}, {"lang": "Spanish"}]}');


Then, let's query the translators that speak English the same two ways:

SELECT id, jdata->>'$.name', jdata->'$.langs[*].lang'
  FROM translators
 WHERE 'English' MEMBER OF (jdata->'$.langs[*].lang');


SELECT id, jdata->>'$.name', jdata->'$.langs[*].lang'
  FROM translators
 WHERE JSON_OVERLAPS(jdata->'$.langs[*].lang', '["English"]');


Just like in the first case the queries use full table scan, which could now easily be changed by adding a multi-valued index like this:

ALTER TABLE translators
  ADD INDEX idx_langs_obj ((CAST(jdata->'$.langs[*].lang' AS CHAR(8) ARRAY)));


Notice the slightly different syntax of the JSONPath expression. For the index to work it's of course necessary to use the same expression in WHERE clause as in the index definition. Developers that prefer to keep data directly into JSON columns should be happy by this new feature, as it makes it possible to index not only scalar variables, but also arrays.

JSON

Apart from the already mentioned new standard operator MEMBER OF for searching values in JSON arrays there are three new functions JSON_OVERLAPS, JSON_SCHEMA_VALID and JSON_SCHEMA_VALIDATION_REPORT.
  • Function JSON_OVERLAPS compares two JSON documents and returns true "if the two document have any key-value pairs or array elements in common". Like MEMBER OF and JSON_CONTAINS function JSON_OVERLAPS may benefit from multi-value indexes.
  • Functions JSON_SCHEMA_VALID and JSON_SCHEMA_VALIDATION_REPORT are added in regards to JSON Schema support. The first one validates a JSON document against a JSON schema and returns true if valid otherwise false, so its usable as a CHECK Constraint. The second one would provide insights into validation errors in the form of JSON document.

Cast to FLOAT, DOUBLE and REAL

Functions CAST and CONVERT now could cast to floating point data types such as FLOAT, DOUBLE and REAL. Let's try it with an incorrect value:

SELECT CAST('1.23.34' AS FLOAT) cast_res, CONVERT('1.23.34', FLOAT) conv_res;

+--------------------+--------------------+
| cast_res           | conv_res           |
+--------------------+--------------------+
| 1.2300000190734863 | 1.2300000190734863 |
+--------------------+--------------------+
1 row in set, 2 warnings (0.00 sec)

There are two warnings, so let's see them with show warnings:

+---------+------+---------------------------------------------+
| Level   | Code | Message                                     |
+---------+------+---------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '1.23.34' |
| Warning | 1292 | Truncated incorrect DOUBLE value: '1.23.34' |
+---------+------+---------------------------------------------+
2 rows in set (0.00 sec)

Casting to DOUBLE and REAL produced different result and the same warnings:

SELECT CAST('1.23.34' AS DOUBLE) cast_res, CONVERT('1.23.34', DOUBLE) conv_res;
SELECT CAST('1.23.34' AS REAL) cast_res, CONVERT('1.23.34', REAL) conv_res;


+----------+----------+
| cast_res | conv_res |
+----------+----------+
|     1.23 |     1.23 |
+----------+----------+
1 row in set, 2 warnings (0.00 sec)

CLONE command

For many years MySQL administrators had to dump masters, transfer the dump over network and load it into replicas in order to initialize their state (see Copy a Database from one Server to Another). I'm not that much into database administration, but it was cumbersome and annoying procedure especially in case of unrecoverable replication errors where I had to re-initialize the state of replicas, so I considered it a burden. Why I could not just "clone" the master after wiping the replica? Well, the new version makes it possible to easily create new instances or re-initialize existing by implementing native provisioning into the server with the CLONE command. It's made possible by the new MySQL Clone Plugin. You could find more about this by reading the following articles:
It's the focus of this release and I'm sure it's going to be a game changer for organizations using extensively MySQL replication.

Deprecations

This is a list of features that are depreciated with this release and would be removed in future releases:
  • Function FOUND_ROWS and query modifier SQL_CALC_FOUND_ROWS. Documentation suggests using COUNT(*) to find the number of rows.
  • Numeric data type attributes:
    • Integer display width. There is now warning "1681 Integer display width is deprecated and will be removed in a future release." if you try to create tables with INT(11) for example. I personally have such definitions as provided by mysqldump and MySQL Workbench, so I'll have to remove them all, before the statements are rejected with syntax error in future.
    • The ZEROFILL attribute. I never used it anyway.
    • The UNSIGNED attribute for FLOAT, DOUBLE, and DECIMAL data types. I haver used it as well.
    • AUTO_INCREMENT support for FLOAT and DOUBLE data types. Anyone?
  • The syntax  FLOAT(M,D) and DOUBLE(M,D) for specifying number of digits for floating point types.
  • Logical operators && (double ampersand), whish is synonym for AND, || (double pipe), which is synonym for OR and ! (exclamation mark), which is synonym for NOT. I'm not sure I ever used any of these even in daily queries, because I find using ANDOR and NOT a lot more expressive.
  • BINARY keyword for specifying _bin collations. I never used this as well.
Developers should consider getting rid of these nonstandard features to avoid unpleasant surprises in future. I'll personally review my projects and correct as soon as possible.

Bug fixes

In regards to CHECK constraints I found two bugs in the prevoius release and one of them was fixed (see bug #95189 CHECK constraint comparing columns is not always enforced with UPDATE queries). For the full example please, see bug_95189_test_case.sql script, so let's just verify it:

SET binlog_format = 'STATEMENT';
SET binlog_row_image = 'minimal';

UPDATE tst SET end_date = '2019-04-20' WHERE id = 1;
/* Error Code: 3819. Check constraint 'chk_dat' is violated. */

So it's fixed, because as expected the UPDATE query fails even with the specific binlog settings. The other one (see bug #95192 CHECK constraint comparing column with default value is not enforced) would have to wait.

That wraps up my review of new features in MySQL 8.0.17 maintenance release. I hope new releases bring more features for developers.

2019-06-05

TREE explain format in MySQL 8.0.16

I have omitted an interesting new feature in latest MySQL release 8.0.16, which I covered last month. It is mentioned in the release notes (see Functionality Added or Changed and scroll down or search for TREE) and also in the documentation (see Obtaining Execution Plan Information). As it turns out new TREE format for the explain plan was added.

Until now there were two options for format of the explain plans generated with EXPLAIN statement:
  • TRADITIONAL - the old tabular explain plan suitable for console;
  • JSON - the explain plan in JSON format used by MySQL Workbench to display nice graphics in Visual Explain, which I use in presentations.
Whoever is familiar with the traditional tabular output knows that it's not indented and sometimes it's hard to understand the order of operations. Now there is the option to use TREE format, which prints the "generated iterator tree" with the operations indented to right similarly to Oracle execution plans. This should help users to understand better how query execution is actually set up, but it's still experimental, so both the syntax and output may change in future.

Let's take the following two queries that generate the same result - minimal, average and maximum salaries for the departments in the organization based on the DEPT and EMP schema (see my adaptation for MySQL and EMP and DEPT script). The second query is possible only with MySQL 8.0.14 and later (see New development features in MySQL 8.0.14).

Query 1: Using LEFT OUTER JOIN and derived table

SELECT D.dname, DT.min_sal, DT.avg_sal, DT.max_sal
  FROM dept D
       LEFT OUTER JOIN
       (SELECT E.deptno,
               MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         GROUP BY E.deptno
       ) AS DT
       ON DT.deptno = D.deptno;

Query 2: Using LATERAL derived table

SELECT D.dname, LDT.min_sal, LDT.avg_sal, LDT.max_sal
  FROM dept D,
       LATERAL
       (SELECT MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         WHERE E.deptno = D.deptno
       ) AS LDT;


The tabular execution plans of the two queries (with columns partitions, possible_keys and filtered removed to shorten the output) and the extended output are as follows:

Query 1

+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
| id | select_type | table      || type  || key         | key_len | ref               | rows || Extra |
+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
|  1 | PRIMARY     | D          || ALL   || NULL        | NULL    | NULL              |    4 || NULL  |
|  1 | PRIMARY     | <derived2> || ref   || <auto_key0> | 5       | dept_emp.D.deptno |    2 || NULL  |
|  2 | DERIVED     | E          || index || fk_deptno   | 5       | NULL              |   14 || NULL  |
+----+-------------+------------++-------++-------------+---------+-------------------+------++-------+
3 rows in set, 1 warning (0.01 sec)


+-------+------+-----------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                             |
+-------+------+-----------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `dept_emp`.`D`.`dname` AS `dname`... left join (
                 /* select#2 */ select `dept_emp`.`E`.`deptno` AS `deptno`,min(`dept_emp`.`E`.`sal`) AS `min_sal`... |
+-------+------+-----------------------------------------------------------------------------------------------------+

1 row in set (0.00 sec)

Query 2

+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
| id | select_type       | table      || type || key       | key_len | ref               | rows || Extra                      |
+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
|  1 | PRIMARY           | D          || ALL  || NULL      | NULL    | NULL              |    4 || Rematerialize (<derived2>) |
|  1 | PRIMARY           | <derived2> || ALL  || NULL      | NULL    | NULL              |    2 || NULL                       |
|  2 | DEPENDENT DERIVED | E          || ref  || fk_deptno | 5       | dept_emp.D.deptno |    1 || NULL                       |
+----+-------------------+------------++------++-----------+---------+-------------------+------++----------------------------+
3 rows in set, 2 warnings (0.0006 sec)

+-------+------+-----------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                             |
+-------+------+-----------------------------------------------------------------------------------------------------+
| Note  | 1276 | Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1                       |
| Note  | 1003 | /* select#1 */ select `dept_emp`.`D`.`dname` AS `dname`, ... join lateral (
                 /* select#2 */ select min(`dept_emp`.`E`.`sal`) AS `min_sal`...                                     |
+-------+------+-----------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)


That's not nice, right? The additional information printed with SHOW WARNINGS only marks the involved queries, so you could more easily identify them in the plan (corresponding to first column id). For the second query there is also a note about how the reference from the second query was resolved in the first query.

Now, here is how the new execution plan outputs look like:

Query 1

+-----------------------------------------------------------------+
| EXPLAIN                                                         |
+-----------------------------------------------------------------+
| -> Nested loop left join
    -> Table scan on D
    -> Index lookup on DT using <auto_key0> (deptno=d.deptno)
        -> Materialize
            -> Group aggregate: min(e.sal), avg(e.sal), max(e.sal)
                -> Index scan on E using fk_deptno                |
+-----------------------------------------------------------------+
1 row in set (0.0025 sec)

Query 2

+----------------------------------------------------------------------+
| EXPLAIN                                                              |
+----------------------------------------------------------------------+
| -> Nested loop inner join
    -> Invalidate materialized tables (row from D)
        -> Table scan on D
    -> Table scan on LDT
        -> Materialize (invalidate on row from D)
            -> Aggregate: min(e.sal), avg(e.sal), max(e.sal)
                -> Index lookup on E using fk_deptno (deptno=d.deptno) |
+----------------------------------------------------------------------+
1 row in set, 1 warning (0.0007 sec)
Note (code 1276): Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1


That's nice, right? You see all the operations that would be executed and the indentation helps understand how they're nested or in other words which is executed before others. I will have to see how this new format would evolve, but even now I think it would be quite useful for complex queries, so it's very good they added it.

Thank you MySQL developers!

2019-05-24

Translations of Bugzilla 5.0.5 and 5.0.6

A few days ago I accidentally noticed that there are new versions of Bugzilla namely 5.0.5 (from 2019-01-30) and 5.0.6 (from 2019-02-09), which apparently appeared silently in the beginning of the year. Silently, because at leas I haven't received any notification and apparently I haven't logged in the web interfaces of my installation quite a while to notice the notification on the main page. I would have to check with the developers abut the notifications by e-mail, because they may have figured out something new. You could download the last version from the official page, but I highly recommend you to upgrade with Git - it's lot more easier and faster, especially if you have additional extensions, skins and other changes (see Upgrade below).

About the translation

Anyway, today I managed to update the translation. The changes were not much, because these are pure maintenance releases without significant new functionalities, which to come with new templates. Everything is already pushed to the repository, which I want to reorganized and eventually migrate to Git, but I still do not find the time. Generally the project seems not to develop well in the past 2-3 years, so I do not know it worth it. I'm leaving it as a task eventually for the summar.

Upgrade

This morning, after I prepared the translations, I managed to upgrade my installation, which was absolutely flawless following the instructions for direct upgrade from Git. With 5.0.5 the table bugs_fulltext was converted from MyISAM to InnoDB, and with 5.0.6 the type of the columns type_id in the tables flaginclusions, flagexclusions and flags was changed from  smallint to mediumint (see Integer types), which show allow for many more flagtypes. I got delayed a little only because I had to cleanup some local changes and hide (with .gitignore) some files (logs, patches, archives with translations, and etc.). That reminded me to issue a pull request for one of my local changes - adding of viewport meta tag, which is important nowadays for support of mobile devices.

Download the translations

You could download the translations form the usual places by FTP or HTTPS or download directly  version 5.0.6 and upgrade the translation by just de-archive in the base directory.

Happy using! And of course happy Bulgarian Education and Culture and Slavonic Literature Day (a.k.a Saints Cyril and Methodius' Day)!

2019-04-30

New features in MySQL 8.0.16 ecosystem

Last Thursday (2019-04-25), just before the Orthodox Easter holidays Oracle released version 8.0.16 of the products in the MySQL ecosystem (including server, router, shell, connectors and workbench GUI). The new versions of the server continued the tradition established with MySQL 8 of introducing new features even though they're still labeled "Maintenance Release". On the last Pre-FOSDEM MySQL Day Giuseppe Maxia suggested in a tweet that semantic versioning should be used and I stay behind this as I understand and support semantic versioning.

This would be really helpful, because version 8.0.15 of the server was as a true maintenance release fixing only an important group replication issue. Following the new numbering scheme (see MySQL 8.0: It Goes to 11!) all other products from the ecosystem were bumped to 8.0.15 even without any changes whatsoever. I'm really not sure this is necessary, because as soon as you're using the same major (and minor) versions then there shouldn't be a problem with compatibility. And the same version numbers doesn't really ensure full compatibility between the products as my bugs on MySQL Workbench (e.g. 90620, 92900, 92908 and 94012) show.

MySQL Server

So, what's new in the new release of the server. Here are my picks.

CHECK constraints

Personally, I was waiting for this feature since MySQL 4 when they started adding major new features into the server and it started becoming more like other more advanced databases. Bug #3465 was opened in April 2004 (and was there promised to be fixed in MySQL 5.1), so this SQL feature is finally implemented after more than 15 years.

Until now (as you may have noticed) the CHECK keyword in CREATE TABLE was parsed, but silently ignored. We used workarounds in the past. For example one could ensure only nonnegative values in column with numeric type (integers and fixed/floating point types) by using the nonstandard attribute UNSIGNED (see Numeric Type Attributes). More complex constraints could be implemented through the use of triggers and raising conditions (see SIGNAL syntax) as I've shown in my presentation MySQL 8 for developers on BGOUG Spring 2018 Conference and these would still be required.

MySQL 8.0.16 now supports both table constraints and column constraints (see CHECK constraints) for all storage engines. Table constraints appear anywhere in CREATE TABLE outside column definitions and could refer to one or more columns even with forward references (i.e. for columns defined later in the statement). Column constraints appear within the definition of a column and could refer only to this column. Unfortunately, the expression that specifies the check condition could use only literals, deterministic built-in functions, and operators (i.e. stored routines, UDFs, variables and subqueries are not permitted). It is also not possible to use CHECK constraints on columns with foreign key referential actions (i.e. ON UPDATE, ON DELETE).

The thing to consider with CHECK constraints is that error is raised only if the condition evaluates to FALSE. If it evaluates to UNKNOWN due to NULL values error won't be raised. This is covered very well by Markus Winand in his article The Three-Valued Logic of SQL, so I highly recommended you to read the whole article.

CHECK constraints have a name of up to 64 characters and if not specified the server would generate a name like [table name]_chk_[ordinal number], because the names must be unique per schema. A useful feature is the possibility to create, but not enforce the constraint (i.e. with NOT ENFORCED clause), which is like enable/disable for triggers that is a feature I'd like to see implemented in future releases.

For basic usage and simple examples of CHECK constraints in MySQL, please check the manual and MySQL 8.0.16 Introducing CHECK constraint blog post. There's a more interesting example of CHECK constraint on JSON data in the article MySQL 8.0.16: how to validate JSON values in NoSQL with check constraint.

Here's my first example. I think it's quite common to have two dates in a table (e.g. start and end date) that should represent the beginning and end of something, so they should be in chronological order. Let's try to do this on the emp table from the example DEPT and EMP schema of Oracle, which I adapted for MySQL.

ALTER TABLE emp
  ADD COLUMN retdate DATE AFTER hiredate,
  ADD CONSTRAINT ret_after_hire CHECK (
retdate > hiredate);

However, when I tried to update an employee row that already has value for hire date with the following query (the retirement date presumably coming from wrong user input and/or bad application):

UPDATE emp
   SET retdate = STR_TO_DATE('1019-04-25', '%Y-%m-%d')
 WHERE empno = 7369;

1 row(s) affected Rows matched: 1  Changed: 1  Warnings: 0

the query unexpectedly succeed and there was no "constraint violated" error. After discussing with Frédéric Deschamps on Slack I filed bug #95189. It seems to be related to replication, because on my master MySQL 8.0.16 server on windows the problem isn't reproducible and the query works as expected, but the test I've done was on my MySQL 8.0.16 server running on Slackware64 -current, which is replicating from my main MySQL 5.7.26 server.

Here's another example. A table constraint that ensures hire date is today or in future and salary is positive number.

ALTER TABLE emp
  ADD CONSTRAINT emp_chks CHECK (hiredate >= CURDATE() AND sal > 0);

Such a constraint may seem reasonable (if you consider just inserting data), but of course results in:

Error Code: 3814. An expression of a check constraint 'emp_chks' contains disallowed function: curdate.

because CURDATE is non-deterministic function. Such constraint is possible in PostgreSQL (which also supports usage of stored routines and UDFs), but not in Oracle (that supports usage of storage routines) and MariaDB (which pretty much has the same restrictions as MySQL although not clearly listed). The problems are how do you validate the constraint for existing rows and how do you update rows, because the value of CURDATE changes every day. The solution is to create an additional column preserving the current date when the row was created and then use it to validate the constraint.

If you try to directly create the column and enforce the constraint (i.e. with an ALTER TABLE) you'll of course get:

Error Code: 3819. Check constraint 'emp_chks' is violated.

because check constraints, like other constraints (e.g. primary and foreign keys, unique indexes, NOT NULL), are validated for all the rows when created and should always remain valid. So, let's try another way.

First, initialize the new column from existing data (e.g. based on hiredate column as expressions are possible as default values since 8.0.13):

ALTER TABLE emp
  ADD COLUMN created DATE DEFAULT (hiredate);

Then, modify the column and add the check constraint:

ALTER TABLE emp
  MODIFY COLUMN created DATE NOT NULL DEFAULT (CURDATE()),
  ADD CONSTRAINT emp_chks CHECK (hiredate >= created AND sal > 0);

Now, let's try some queries:

UPDATE emp SET hiredate = '1979-04-25' WHERE empno = 7369;

INSERT INTO emp
  (empno, ename, job, mgr, hiredate, sal)
VALUES
  (9999, 'MULDER', 'INVESTIG.', 7839, '2019-04-25', 4242);

As expected, the UPDATE query produces error:

Error Code: 3819. Check constraint 'emp_chks' is violated.

but the INSERT query succeeds, which is unexpected. Apparently, the value of created is not initialized with the default value when the CHECK constraint is validated. I tried the same (although slightly different syntax) on Oracle XE 18 and it worked as expected - both the UPDATE and INSERT queries violated the CHECK constraint.

There is a new table CHECK_CONSTRAINTS in INFORMATION_SCHEMA that provides information on defined check constraints in all sachems. Additional information about table name and whether a CHECK constraint is enforced or not could be retrieved from table TABLE_CONSTRAINTS by filtering the data by the new value CHECK of column CONSTRAINT_TYPE.

Spatial data

After ST_Distance since 8.0.14, now ST_Length function also supports optional second argument unit, so it's possible to calculate lengths in the different supported units as defined in INFORMATION_SCHEMA.ST_UNITS_OF_MEASURE table.

Server does all upgrades

The server is already capable of upgrading everything - the mysql schema, data dictionary and system tables as well as PERFORMANCE_SCHEMA, INFORMATION_SCHEMA, sys and user schemes if necessary (see What the MySQL Upgrade Process Upgrades), so mysql_upgrade command is going into retirement. It's an important administrative feature, because this would make upgrades easier and more convenient. In the past I regularly forgot to execute the command, which led to strange problems later.

In regard to upgrades I had this strange problem that the server wouldn't start with the error message [ERROR] [MY-013384] [Server] Could not create server upgrade info file at '/var/lib/mysql/data/' although the permissions were OK. I was able to find the gist mysql will not start after 8.0.15 to 8.0.16 update on Ubuntu 16.04 LTS with Google and after creating the file mysql_upgrade_info in /var/lib/mysql/data/ and changing its ownership to mysql:mysql the server was able to start successfully. I was upgrading from 8.0.14 in fact, but I think it may be a bug. Perhaps the server expects the file to have been created.

Another interesting new feature in the option --validate-config for validation of server configuration much like system administrators are used to do so with other servers (e.g. Apache). It's really important especially for production environments where unexpected downtime could be extremely unpleasant. Read more in Server Configuration Validation.

System users

MySQL accounts are now categorized and thus distinguished by system (who posses the SYSTEM_USER privilege) and regular users (who does not). This allows for better separation of roles as only system users could perform certain administrative operations on system accounts as previously any user having the appropriate privileges could for example delete any account or kill connections belonging to any user. It's also possible to revoke global privileges partially for certain schemas (i.e. like an exceptions) through the new system variable partial_revokes, which previously required granting privileges separately for each existing schemes and adding grants for each new schema.

Security

The new things are support for TLS 1.3, possibility to update SSL certificates without restarting the running server (see ALTER INSTANCE RELOAD TLS) and information about the certificates in PERFORMANCE_SCHEMA's keyring_keys table).

MySQL Router

The router now has an HTTP component that allows it to expose simple web-interface and REST APIs for the purpose to ensure better observability through integration with external tools for monitoring and management. Other notable changes include dynamic change between single and multi master modes and improved logging. There is now the WITH_ROUTER CMake option to build router together with the MySQL server, which defaults to ON and which I decided to switch OFF, because I plan to continue building router as a separate package.

MySQL Shell

The shell comes with the new Shell Reporting Framework (see Reporting with MySQL Shell), which allows registration, displaying and monitoring of custom reports. I'm eager to try it and I'll write a separate post on this later. See the article MySQL Shell 8.0.16: User Defined Reports by Jesper Wisborg Krogh. It's now also possible to execute SQL without changing the mode, as the \sql command now doesn't change the mode if provided with an SQL statement, which I think is pretty useful.

MySQL Workbench

MySQL Workbench now supports window functions in SELECT (see bug #90620), expressions in DEFAULT (see bug #92900) and as key parts (see bug #92908) as well as the LATERAL keyword (see bug #94012) - all of these implemented into previous 8.0.x versions of the server. Unfortunately, I do not see any support for CHECK constraints (see bug #95143 that was already verified), so again the same version numbers mean nothing.