Georgi Sotirov's Blog

My personal and professional life

2019-02-10

First migrations from CVS to Git

Last May I had the good intention to start gradually migrating my old CVS repositories to Git, but I only managed to begin the process this weekend. Before plunging deeper I decided to migrate some smaller repositories with linear history, so I could exercise, verify and master the process. As a result some small projects already landed on my GitHub profile (see guthub.com/gdsotirov). I have read much about migrating from CVS and SVN to Mercurial and Git in relation to an internal project in the company I'm working for. I have also already experimented migrating from CVS to SVN in the past and from CVS to Hg/Git more recently, so I considered myself prepared, but still there are some things to consider when doing such migration.

Repository preparation

Revision log messages. You may have a convention for writing revision log messages in CVS, but I did not have one for my personal projects, so I have written comments on single line and on multiple lines as unnumbered list like this:

* Add this
* Modify that

I had to modify these comments in CVS, before the migration, because these simply do not look good in TortoiseGit and GitHub. I also aligned some similar comments on files committed in few minutes time span, so that the migration could group them in a single revision. Depending on whether you had a convention for the format of the revision log messages in CVS you may need to make other adjustments. The easiest way in such case may be to modify directly the revision control files with perl, sed, awk, etc. while of course paying caution not to ruin your history (i.e. such operations should not be performed on "live" production repository).

Authors. You should be able to identify all your authors, because in CVS repositories they are normally UNIX login names and the users may no longer exist in the system. I've seen even cases where the same username was reused by different developers having the same given and family names. It's hard to identify and fix such revisions unless you're aware of developers' internship in the company, but what if their time in the company overlaps?
Hopefully, I do not have this problem with my personal repositories, but still I had to rewrite my username as full name and email address (see more about this in chapter Step One: The Author Map of Eric Raymond's DVCS migration HOWTO). It's very important to rewrite the authors, because otherwise the commits won't be properly identified in GitHub (see Contributions that are counted). If you need to fix usernames it's preferable to do it in the CVS repository before the conversion. And the easiest way to map usernames to authors is to rewrite the converted Git repository as I write about below.

Tools

In such a process one of the most important things is to chose the right tools, because you simply cannot go without them. Last year at work we evaluated several options and finally choose cvs2git, which is part of cvs2svn project. The command is written in Python and supports various options. For me it is important to make the best possible migration, so I had to make the following modifications in the example options file:

KeywordHandlingPropertySetter('expanded'),

from cvs2svn_lib.keyword_expander import _KeywordExpander
# Ensure dates are expanded as YYYY/MM/DD as in CVS 1.11
_KeywordExpander.use_old_date_format()


The first line ensures that CVS keywords (e.g. $Id$, $Date$, etc.) are expanded in text files (see topic cvs2svn changes 'date' string in source codes). The following three lines ensure that dates in keywords are expanded as in CVS 1.11 or in YYYY/DD/MM format, instead of the default format YYYY-MM-DD as in CVS 1.12 and SVN.

Migration

The way to covert a CVS repository to Git is described in cvs2git's documentation (see chapter Usage), so just read it. I wrote myself a shell script to execute the different steps defined there. Initially I used cvs2git only with command line options, but because of my requirements for keywords expansion (see above), I had to improve the script to generate options file and use it instead. Like this I retained the way I call the script by just providing path to the CVS repository.

For now I haven't had problems with the migration - the history was represented properly in Git. However, with many tags and branches over different files things could get really messy as I found out with the internal project at work. After the migration I check the result and if necessary fix more revision log messages. Then, I just rewrite authors with the script given in Changing author info and finally import the repository to GitHub (e.g. I use git daemon --verbose --export-all --base-path="/path/to/repo" to serve the converted repository). After the import to GitHub I have to add LICENSE and README.md files, but this is quite trivial directly through the web interface.

And that's all. I have more repositories to migrate and I really hope I'll be able to finish the task until the end of the year, but it would all depends on how quickly I'll be able to prepare the repositories, because after this it's the jobs of the tools.

2019-01-31

My MySQL Bugs

Since I started using MySQL 3.23.x during my university years I was trying to solve all my problems myself like with all the other free and open source software I was using. At first it was more about learning (or lack of knowledge if you want), but then the problems become more complex and harder to solve. Well, I was successful most of the times, so I didn't consider reporting bugs or asking for support. However MySQL ecosystem grew large since then, the software become lot more complex and the source code base increased significantly, so nowadays it's native to think one could manage to fix all the problems he encounters without going deep in product's internals or relaying on professional support. The first is possible only for MySQL developers well familiar with the code base and large organizations like Facebook, Booking, etc., which have dedicated teams and regularly contribute patches. The second depends on you and of course the company providing the support.

I haven't reported much bugs for MySQL - just 21, but I now consider reporting everything I found which seems like a bug, because this is the right thing to do. I may as well solve the problem myself, but someone else could be stuck by the same and should at least be aware that it's a known (and eventually already resolved) problem. With each one of these I learned a little something, so here below I make a review of my bugs so far describing their fate in chronological order.
  1. Bug 20098 (my first one) was reported on 2006-05-26 and is with status "No Feedback" as of 2006-07-01. It was about Query browser searching for a file that is not present (preferences.glade). I have forgot about this one and found out it was closed later, because no feedback was provided, which is fun as I have provided the requested information...
  2. Bug 69459 was reported 2013-06-13 and is in status "Can't repeat" as of 2013-12-24. It was about MySQL Workbench crashing on entering dot in a SQL string when automatic code completion is enabled. I reported the bug for version 5.2.47 CE, but I was suggested to try 6.0.7, which worked, so the bug didn't receive more attention since 5.2 series didn't receive more updates.
  3. Bug 73076 (my first confirmed and fixed one) was reported on 2014-06-22 and is with status "Closed" as of 2014-08-26. It was about MySQL Workbench actually committing modifications in a record set when you only apply them. It was a nasty one making my daily work with Workbench difficult as I had to be careful not to apply (pardon commit) too early. The problem appeared in 6.0.x, 6.1.x and was finally fixed with the release of 6.2.2 on 2014-09-05 (see release notes) after being verified by support.
  4. Bug 73079 was reported on 2014-06-23 and is with status "Closed" as of 2014-09-15. It was about MySQL Workbench not refreshing the cell after setting value to NULL. It affected 6.1.x and early 6.2.x versions. Although it wasn't officially confirmed for me it was a regression, but more importantly it was fixed with the release of 6.2.3 on 2014-09-23 (see release notes).
  5. Bug 73708 was reported on 2014-08-25 and is with status "Closed" as of 2014-12-01. I was about MySQL Workbench models being completely mangled. I forgot this one for a few months and provided the requested feedback later to realize the problem was already fixed with the release of 6.2.4 on 2014-11-20 although nothing was mentioned in the release notes.
  6. Bug 73770 (my oldest one still open) was reported on 2014-08-29 and is still with status "Verified" as of 2018-02-05. It's a feature request for MySQL synchronizer between model and database to display the actual differences. Whoever used this functionality should know that applying changes in the model to the database could be quite annoying, because there's always some changes you do not expect and furthermore you do not understand. You only see the update script generated, but it would be nice to see the actual difference, which in some cases may not even be difference (see 90772 below).
  7. Bug 82202 was reported on 2016-07-12 and is with status "Closed" as of 2018-01-31. It was about impossibility to link Connector/ODBC 5.3.6 to the shared libmysqlclient.so due to undefined symbols (e.g. my_malloc, my_free, etc.). The problem was quickly explained, but then closed as "Not A Bug". It was reopened after a Fedora maintainer reported the same problem and finally fixed with release of Connector/ODBC 5.3.10 on 2018-01-30 (see release notes).
  8. Bug 84951 (my first suggested patch) was reported on 2017-02-10 and is with status "Duplicate" as of 2017-02-12. It was about problem building MySQL Workbench 6.3.9, because of compile errors in jsonparser.cpp and jsonview.cpp for which I suggested a patch. The bug was made duplicate of 84886, which also contained a patch provided two days earlier, but it was closed on 2018-05-14 as "Won't fix", because they "no longer support 32bit systems" (!?).
  9. Bug 89608 was reported on 2018-02-09 and is still with status "Verified" as of 2018-02-13. It is a feature request about having password requirements messages in MySQL Workbench match the server configuration, so they're more useful. I provided sample code about how it could be done and even wrote a SQL function myself to test the idea.
  10. Bug 89615 was reported on 2018-02-10 and is with status "Unsupported" as of 2018-08-23. It was about failure to build MySQL Shell 1.0.10, because compilation fails for undefined vio* functions. It was related to 82202 and although confirmed it was later closed with the explanation that "MySQL Shell 1.0 is no longer supported and version 8.0 should be used instead" (!?) to which I reacted a bit harsh, but I really do not understand such answers.
  11. Bug 90619 was reported on 2018-04-25 and it is with status "Duplicate" of 79315 as of 2018-04-25. It was about MySQL Installer not offering upgrade from MySQL 5.7 to 8.0, which was explained as "not a bug", because it was "working as expected", so apparently MySQL Installer doesn't support upgrade between major releases. I hope this changes in future.
  12. Bug 90620 (my first of a series) was reported on 2018-04-25 and is still with status "Verified" as of 2018-04-26. It is about MySQL Workbench 8.0.11's SQL editor displaying error on SELECT query with window functions. You wouldn't expect MySQL Workbench having the same version as the server to lack support for new server functionalities, but more about this later.
  13. Bug 90727 was reported on 2018-05-03 and is with status "Closed" as of 2018-06-01. It was about impossibility to link Connector/C++ 1.1.11 due to missing mysql_sys and mysql_strings libraries. It turned out to be a regression from some Solaris specific modifications not made Solaris specific. I got in touch with the developer and it was declared fixed in 1.1.12, which got released on 2019-01-28 (see release notes), so for 1.1.11 I had to patch it myself.
  14. Bug 90772 was reported on 2018-05-06 (St George's Day) and is still with status "Verified" as of 2018-05-09. It is about MySQL Workbench synchronizer making difference between single quotes escapes (e.g. the statement COMMENT 'Currency\' symbol' is different from the statement COMMENT 'Currency''s symbol'). It is because MySQL Workbench allows backslash escape and the server accepts it execution, but then rewrites it with single quote escape. Thus, on the next synchronization the difference remains (e.g. like you haven't synchronized at all).
  15. Bug 90876 was reported on 2018-05-15 and is with status "Closed" as of 2018-06-21. I was about MySQL Shell 8.0.11 giving error 5115 on adding documents to collection in Server 5.7, but what it was actually about was automatically generated identifiers as I explained in details in my article MySQL Shell 8 and automatically generated document IDs. This one resulted in documentation update and since nobody else experienced this problem I guess it's fine, but for me such problems break backwards compatibility.
  16. Bug 91841 was reported on 2018-07-31 and is with status "Closed" as of 2019-01-25. It was about impossibility to build MySQL Connector/ODBC 5.3.11, because of compilation errors. I was able to fix the compilation errors myself (see patch) until waiting for help and explanation. And the explanation was that it was duplicate of an internal bug, "which explains the reasons for the failure", but which I'm not able to read. The funny thing about this one was that there were more than 400 lines wiped out from the beginning of a header most probably related to copyright text updates. It was fixed with the release of 5.3.12 on 2019-01-28 (see release notes).
  17. Bug 92898 was reported on 2018-10-23 and is with status "Not a Bug" as of 2018-10-23. I consider this as my one and only "support request" and I was pleased, that it was explained the same day.  It is about hitting an incompatible change during upgrade from 8.0.12 to 8.0.13, because I still had same old views using ASC or DESC qualifiers for GROUP BY clause. In this regards I'd like to thank MySQL for the detailed release notes they publish although as in this case even if you read them carefully you may omit something by considering it doesn't concern you. For this reason it's highly recommended to use MySQL Shell's Upgrade Checker Utility, which could tell you about compatibility errors and issues in advance.
  18. Bug 92900 was reported on 2018-10-23 and is with status "Verified" as of 2018-10-23. It is about MySQL Workbench 8.0.13 lacking support for expressions in DEFAULT although the feature is supported by the same version of the server.
  19. Bug 92908 was reported on 2018-10-23 and is with status "Verified" as of 2018-10-23. Is is about MySQL Workbench 8.0.13 lacking support for expressions as key parts although the feature is supported by the same version of the server.
  20. Bug 93835 was reported on 2019-01-07 and is with status "Verified" as of 2019-01-10. It's a feature request for having MySQL Workbench display a warning on using keyword as identifier (e.g. table and column names). I opened it after Frédéric Descamps suggested so in a comment on LinkedIn. Fingers crossed it's implemented.
  21. Bug 94012 was reported on 2019-01-23 and is with status "Verified" as of 2019-01-23. It is about MySQL Workbench 8.0.14 not recognizing the new LATERAL keyword and giving error after it in SQL editor. It's by now the last one of the series on MySQL Workbench not supporting the features of the server even though they're the same version. The bug was nominated as MySQL Bug of the Day by Valerii Kravchuk on the same day.
I hope I haven't bored you with all the information above. It's not always easy to properly explain a problem, but I have quite some experience being on the "other side", so I think I know what information is required for a problem to be investigated thoroughly. Still reporting bugs properly is probably an art and I'm looking forward to Valerii Kravchuk's presentation How to create a useful MySQL bug report at FOSDEM on Saturday.

Some of my bugs are pure technical ones (e.g. 82202, 84951, 89615, 90727 and 91841) for which I reported all the details and still the explanations on some of them seem pretty strange to me (e.g. no longer support 32bit systems, MySQL Shell 1.0 is no longer supported and version 8.0 should be used instead) even though the problem and what should be fixed was quite clear. I do not understand how could you not release a new maintenance version when there are open bugs for the last such release in the series? What vendors should understand is that as a package maintainer for me it's more complex to support patches for problems not fixed upstream.

I've made a series of bugs about unsupported sever features in MySQL Workbench (e.g. 90620, 92900, 92908 and 94012), which made me think about the meaning of version number alignment that was done with the release of MySQL 8.0.11 as General Availability release (see MySQL 8.0: It Goes to 11!). As I understand the post it was done, so that we use the "correct versions" of the all the products. It's generally a good idea, but shouldn't the "correct version" support the same feature set of the server? Users should be presumably less confused if so, but so far this is clearly not the case with Workbench. I hope the situation improves in future.

I'd like to complete these post with some final thoughts. As a developer I'm aware that could never be a software completely free of bugs. However, as a user I expect that when a bug is properly reported and the problem is clear the solution to not take that much time, but of course what really matters are priorities and they are normally as higher as the number of affects users is bigger.

2019-01-23

New development features in MySQL 8.0.14

With the release of MySQL 8.0.14 Oracle keeps the already established practice of introducing new development features even with maintenance releases, which usually include only small improvements and mostly bug fixes. I took a look at 8.0.14's release notes, the post The MySQL 8.0.14 Maintenance Release is Generally Available by and of course the manual, experimented and here below are my picks related to development.

Lateral derived tables

Before MySQL 8.0.14 it wasn't possible to refer to (depend on) columns of preceding tables in the same FROM clause. Now this restriction is removed with the addition of LATERAL keyword (see Lateral derived tables). The LATERAL keyword means that the derived table depends on the previous table on its left side. You could have more than one LATERAL derived table in a query and each one would depend only on the previous table or derived table much like with other join types. Lateral derived tables are so called "for each" loop of SQL and this makes possible some operations that were otherwise not possible or less efficient.
Here's an example. Let's say you want to calculate the minimum, average and maximum salaries for each department in the organization. Previously you would have to write it like this:
SELECT D.dname, DT.min_sal, DT.avg_sal, DT.max_sal
  FROM dept D,
       (SELECT E.deptno, MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         GROUP BY E.deptno
       ) AS DT
 WHERE DT.deptno = D.deptno;

So use a derived table DT to calculate min/avg/max salary for all departments from emp table and then join with the dept table to produce the following result:
+------------+---------+
| dname      | max_sal |
+------------+---------+
| ACCOUNTING | 5000.00 |
| RESEARCH   | 3000.00 |
| SALES      | 2850.00 |
+------------+---------+
3 rows in set (0.0008 sec)

The derived table is completely independent of the other joined table as it could generate result by itself (i.e. it doesn't depend on the column values of the other table). The explain plan for this query is given on the right and it confirms that the result set of the derived table is first materialized, so it could be joined to the other table.
Another approach would be to use sub queries in the SELECT clause like this:
SELECT D.dname,
       (SELECT MIN(E.sal) FROM emp E WHERE E.deptno = D.deptno) AS min_sal,
       (SELECT AVG(E.sal) FROM emp E WHERE E.deptno = D.deptno) AS avg_sal,
       (SELECT MAX(E.sal) FROM emp E WHERE E.deptno = D.deptno) AS max_sal
  FROM dept D;

which won't be efficient (imagine a sales table and thousands of salesman if you want to evaluate their salaries) as three queries would have to do the job of one and for each row from the main table. Having to select just one column in SELECT sub queries is a drawback and MySQL server generates error Error Code: 1241. Operand should contain 1 column(s) if you try to select more.
If you try to connect the derived table to the other table with a query like the following one:
SELECT D.dname, DT.min_sal, DT.avg_sal, DT.max_sal
  FROM dept D,
       (SELECT MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal
          FROM emp E
         WHERE E.deptno = D.deptno
       ) AS DT;

you'll get Error Code: 1054. Unknown column 'D.deptno' in 'where clause', because D table is not known to the derived table. The query is illegal in SQL-92, but in SQL-1999 it becomes legal if the derived table it's proceeded by the LATERAL keyword:
SELECT D.dname, LDT.min_sal, LDT.avg_sal, LDT.max_sal
  FROM dept D,
       LATERAL
       (SELECT

MIN(E.sal) min_sal, AVG(E.sal) avg_sal, MAX(E.sal) max_sal 
          FROM emp E
         WHERE E.deptno = D.deptno
       ) AS LDT;

and produces the following result:
+------------+---------+
| dname      | max_sal |
+------------+---------+
| ACCOUNTING | 5000.00 |
| RESEARCH   | 3000.00 |
| SALES      | 2850.00 |
| OPERATIONS |    NULL |
| REPAIRS    |    NULL |
+------------+---------+
5 rows in set (0.0006 sec)

As clear from the explain plan graph on the right in this case there's no grouping, but MySQL gives higher query cost, because access to the derived table is by full table scan. The more interesting information is however into the tabular explain plan (column partitions is intentionally hidden):

+----+-------------------+------------++------+---------------+-----------+---------+-------------------+------+----------+----------------------------+
| id | select_type       | table      || type | possible_keys | key       | key_len | ref               | rows | filtered | Extra                      |
+----+-------------------+------------++------+---------------+-----------+---------+-------------------+------+----------+----------------------------+
|  1 | PRIMARY           | D          || ALL  | NULL          | NULL      | NULL    | NULL              |    4 |      100 | Rematerialize (<derived2>) |
|  1 | PRIMARY           | <derived2> || ALL  | NULL          | NULL      | NULL    | NULL              |    2 |      100 | NULL                       |
|  2 | DEPENDENT DERIVED | E          || ref  | fk_deptno     | fk_deptno | 5       | dept_emp.D.deptno |    4 |      100 | NULL                       |
+----+-------------------+------------++------+---------------+-----------+---------+-------------------+------+----------+----------------------------+
3 rows in set, 2 warnings (0.0010 sec)
Note (code 1276): Field or reference 'dept_emp.D.deptno' of SELECT #2 was resolved in SELECT #1
Note (code 1003): /* select#1 */ select `dept_emp`.`d`.`dname` AS `dname`,`ldt`.`min_sal` AS `min_sal`,`ldt`.`avg_sal` AS `avg_sal`,`ldt`.`max_sal` AS `max_sal` from `dept_emp`.`dept` `d` join lateral (/* select#2 */ select min(`dept_emp`.`e`.`sal`) AS `min_sal`,avg(`dept_emp`.`e`.`sal`) AS `avg_sal`,max(`dept_emp`.`e`.`sal`) AS `max_sal` from `dept_emp`.`emp` `e` where (`dept_emp`.`e`.`deptno` = `dept_emp`.`d`.`deptno`)) `ldt`

There are two new information and an additional note. The plan clearly shows that the derived table E (derived2) is DEPENDENT on the other table and that its rematerialized for each row of D (see EXPLAIN extra information). This is the reason for which lateral derived tables are also known as the "for each" loop of SQL. The note informs about how the external reference in the derived table was resolved. If you compare the results you should also notice two additional rows (i.e. departments without employees), which means LATERAL keyword has actually made left join.

Of course MySQL Workbench (even upgraded to 8.0.14 as well) is again not familiar with the new syntax (see my previous post New development features in MySQL 8.0.13), because it doesn't properly color the new keyword and shows error in SQL editor just after it. I reported this as bug 94012, but I have no much hope as 90620, 92900 and 92908 were verified yet still open. Version numbers doesn't mean much nowadays :-)

Please, note that it's possible even to connect to the outer table if the derived table is in a sub query (see an example in WL#461).

JSON aggregation functions can now be used as window functions

It is now possible to use JSON_ARRAYAGG and JSON_OBJECTAGG aggregate functions as window functions by using the OVER clause (see Window Function Concepts and Syntax). This makes all (except COUNT(DISTINCT) and GROUP_CONCAT) of the aggregate functions possible for use as window functions after bitwise AND/OR/XOR functions were made so with MySQL 8.0.12. Here's an example:

SELECT E.ename, E.sal,
       AVG(E.sal) OVER dw AS avg_sal,
       JSON_OBJECTAGG(D.dname, E.sal) OVER dw AS dept_sal
  FROM emp  E,
       dept D
 WHERE E.deptno = D.deptno
WINDOW dw AS (PARTITION BY D.deptno);

+--------+---------+-------------+------------------------+
| ename  | sal     | avg_sal     | dept_sal               |
+--------+---------+-------------+------------------------+
| CLARK  | 2450.00 | 2916.666667 | {"ACCOUNTING": 1300.0} |
| KING   | 5000.00 | 2916.666667 | {"ACCOUNTING": 1300.0} |
| MILLER | 1300.00 | 2916.666667 | {"ACCOUNTING": 1300.0} |
| SMITH  |  800.00 | 2175.000000 | {"RESEARCH": 3000.0}   |
| JONES  | 2975.00 | 2175.000000 | {"RESEARCH": 3000.0}   |
| SCOTT  | 3000.00 | 2175.000000 | {"RESEARCH": 3000.0}   |
| ADAMS  | 1100.00 | 2175.000000 | {"RESEARCH": 3000.0}   |
| FORD   | 3000.00 | 2175.000000 | {"RESEARCH": 3000.0}   |
| ALLEN  | 1600.00 | 1566.666667 | {"SALES": 950.0}       |
| WARD   | 1250.00 | 1566.666667 | {"SALES": 950.0}       |
| MARTIN | 1250.00 | 1566.666667 | {"SALES": 950.0}       |
| BLAKE  | 2850.00 | 1566.666667 | {"SALES": 950.0}       |
| TURNER | 1500.00 | 1566.666667 | {"SALES": 950.0}       |
| JAMES  |  950.00 | 1566.666667 | {"SALES": 950.0}       |
+--------+---------+-------------+------------------------+
14 rows in set (0.0021 sec)

It's important to note that MySQL does not permit duplicate keys in JSON data type, so in a window without ordering function JSON_OBJECTAGG would return the last value for the key, which may not be deterministic.

X Protocol improvements

According to the release notes data is now always converted to utf8mb4 character set (using the utf8mb4_general_ci collation). The other notable improvement is the support for "prepare functionality". The release notes provide no reference about this new functionality, but has mentioned WL#9270 in his article, so I believe this is about preparing CRUD operations (see Preparing CRUD Statements). A simple example in JavaScript would be the following:


MySQL Shell 8.0.14
Copyright (c) 2016, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.
 MySQL  JS > \connect user@localhost
Creating a session to 'user@localhost'
Your MySQL connection id is 22 (X protocol)
Server version: 8.0.14 MySQL Community Server - GPL
 MySQL  localhost:33060+ ssl  JS > \use test
Default schema `test` accessible through db.
 MySQL  localhost:33060+ ssl  test  JS > var usr = db.createCollection('users')
 MySQL  localhost:33060+ ssl  test  JS > usr.add({name:"User1",age:15})
Query OK, 1 item affected (0.0108 sec)
 MySQL  localhost:33060+ ssl  test  JS > usr.add({name:"User2",age:17})
Query OK, 1 item affected (0.0138 sec)
 MySQL  localhost:33060+ ssl  test  JS > usr.add({name:"User3",age:20})
Query OK, 1 item affected (0.0105 sec)
 MySQL  localhost:33060+ ssl  test  JS > usr.add({name:"User4",age:19})
Query OK, 1 item affected (0.0137 sec)
 MySQL  localhost:33060+ ssl  test  JS > usr.add({name:"User5",age:16})
Query OK, 1 item affected (0.0118 sec)
 MySQL  localhost:33060+ ssl  test  JS > usr.find()
[
    {"_id": "00005c46e7e5000000000000000a","age": 15,"name": "User1"},
    {"_id": "00005c46e7e5000000000000000e","age": 17,"name": "User2"},
    {"_id": "00005c46e7e5000000000000000f","age": 20,"name": "User3"},
    {"_id": "00005c46e7e50000000000000010","age": 19,"name": "User4"},
    {"_id": "00005c46e7e50000000000000011","age": 16,"name": "User5"}
]
5 documents in set (0.0006 sec)
 MySQL  localhost:33060+ ssl  test  JS > var fcmd = usr.find('age >= :page')
 MySQL  localhost:33060+ ssl  test  JS > fcmd.bind('page', 18)
[
    {"_id": "00005c46e7e5000000000000000f","age": 20,"name": "User3"},
    {"_id": "00005c46e7e50000000000000010","age": 19,"name": "User4"}
]
2 documents in set (0.0006 sec)
 MySQL  localhost:33060+ ssl  test  JS > fcmd.bind('page', 16)
[
    {"_id": "00005c46e7e5000000000000000e","age": 17,"name": "User2"},
    {"_id": "00005c46e7e5000000000000000f","age": 20,"name": "User3"},
    {"_id": "00005c46e7e50000000000000010","age": 19,"name": "User4"},
    {"_id": "00005c46e7e50000000000000011","age": 16,"name": "User5"}
]
4 documents in set (0.0003 sec)

 MySQL  localhost:33060+ ssl  test  JS > fcmd.bind('page', 17)
[
    {"_id": "00005c46e7e5000000000000000e","age": 17,"name": "User2"},
    {"_id": "00005c46e7e5000000000000000f","age": 20,"name": "User3"},
    {"_id": "00005c46e7e50000000000000010","age": 19,"name": "User4"}
]
3 documents in set (0.0004 sec)

The example creates a collection of users with their names and ages, then prints the whole collection. The interesting part starts with the line highlighted in yellow. It prepares a statement using a named parameter (anonymous parameters with ? are not supported by the X protocol), but does not execute it. The execution happens after a value is bind to the parameter and this could be done many times producing different results. It's interesting that in the general log the first bind actually executes a query, then there's prepare and then follow executes:

Query    SELECT doc FROM `test`.`users` -> usr.find()
Query    SELECT doc FROM `test`.`users` WHERE (JSON_EXTRACT(doc,'$.age') >= 18) -> fcmd.bind('page', 18)
Prepare    SELECT doc FROM `test`.`users` WHERE (JSON_EXTRACT(doc,'$.age') >= ?)
Execute    SELECT doc FROM `test`.`users` WHERE (JSON_EXTRACT(doc,'$.age') >= 16) -> fcmd.bind('page', 16)
Execute    SELECT doc FROM `test`.`users` WHERE (JSON_EXTRACT(doc,'$.age') >= 17) -> fcmd.bind('page', 17)

Using prepared statements for repeatedly executed statements could lead do performance improvements, because of the saved parse time, so this is something you should consider if you need to improve the performance of your applications and scripts.

Spatial improvements

The function ST_Distance now accepts an optional third parameter specifying the unit for the returned value. The possible values are defined in INFORMATION_SCHEMA.ST_UNITS_OF_MEASURE table along with conversion factor relative to the basic unit metre, which is the default. Here is an example for calculation of the distance between Sofia and Sydney in kilometers and nautical miles in SRID 4326:

SELECT ST_Distance(ST_PointFromText('POINT( 42.69751 23.32415)', 4326),
                   ST_PointFromText('POINT(-33.86667 151.20000)', 4326)) / 1000 dist_km;

+--------------------+
| dist_km            |
+--------------------+
| 15431.933058990671 |
+--------------------+
1 row in set (0.0023 sec)

SELECT ST_Distance(ST_PointFromText('POINT( 42.69751 23.32415)', 4326),
                   ST_PointFromText('POINT(-33.86667 151.20000)', 4326), 'nautical mile') dist_nm;

+------------------+
| dist_nm          |
+------------------+
| 8332.57724567531 |
+------------------+
1 row in set (0.0008 sec)

This wraps up my review. There is of course lot more in MySQL 8.0.14 not only for developers, so I encourage to explore and find out more. The references at the beginning of this article are a good starting point.

2019-01-10

The things I learned about myself in 2018

For me 2018 was a year of hopes, some of which didn't realized. I was hoping for some changes, but at the end they didn't happen however there's nothing tragic in this as I only learned more about myself. In this article I take a look back of what I learned about myself in the past year.

I do not have enough experience with open source

I was told this somewhere in beginning of March and I do not disagree at all. I have contributed small patches, translations and bugs to various open source projects in the past about 20 years, I have published some of my personal projects under free licenses, but I wish I could have done more. It's true I could have contributed more, but unfortunately, I simply haven't found the time for this. I'm still a strong believer in free and open source software (FOSS), which I used since my university years, I still use and I would continue using for the time being. However, nowadays I'm not that picky that FOSS should also be free as in "free beer", which brings me closer to GNU's philosophy.

I do not read enough/your books

I was suggested to read more books by some. Yes, perhaps I could read more, but I'm still reading. And I may not be reading the same as you and you're not the one to tell me what to read just like I'm not telling you. Apart from books I'm also reading some magazines like I've been since my teenage years. And some online resources as well. With this post I'd like to thank my family and friends who gave me books as gifts last year, so there are currently about 10 books on my reading list for 2019. Sorry, the list is private ;-)

I do not have enough experience with this or that technology

I learned this in the summer. I still consider myself a curious person about technology, but I simply cannot have experience with everything out there and I believe nobody could. For me it's more interesting how a person perceives technology and how it deals with changes. Two decades ago software was quite different as well as the tools and technologies used to implement it. The peace of changes has increased as well as the persons involved in developing software. There are many persons that started software development one, two, three or more decades ago that easily adapt to new tools and technologies, because they have the basic understanding about how software is build and works. And more importantly they know they have to learn constantly and be involved. This means reading the manuals, following development progress, discussing, reporting bugs, making feature request, contributing as much as possible and generally getting involved with the community.

I'm not agile

This was something I was told before the Christmas holidays. Yes, perhaps I'm not that agile after all. I know and understand agile values and principles, but at the same time I do not share them completely.
  • I agree that individuals and interactions are more important than processes and tools, but only the later could make the former simple and efficient. Without clear processes and proper tools even best individuals would fail to interact in a way that benefits software development. I believe that such individuals would first create the necessary processes and tools.
  • I agree that working software is more important than comprehensive documentation, but I've seen software written with scarce or no documentation at all. After some time (e.g. 5, 10, 15 years) when such software stops working, has to be upgraded or has to be explained to client it first has to be reverse engineered as most probably original developers have long left the company. So, in short term and for small projects I completely support this, but in the long term and for large projects it won't help anyone. The problem is that if not written on time documentation would most probably never be written.
  • I agree that customer collaboration is more important than contract negotiation, but not all customers could collaborate effectively. Some customers do not even know what exactly they want, they do not read documents (even refuse to do so when asked) hence they do not know the software they're using and they do not properly transfer knowledge over time. These are only my humble observations, but I believe that customer collaboration should be open in terms that it should be accessible to all the involved parties now and in future. I've seen cases where software was implemented in "close collaboration" between a developer and a person at the customer through personal messages, which afterwards nobody knows anything about and doesn't want to.
  • I agree that responding to change is more important than following a plan, but responding to constant changes means no project plan and scope at all. Projects that start as a bubble and constantly change their form easily become malformed and reach out of control. Such projects usually fail. I've seen some never-ending projects in my practice and it's one of the things I truly dislike.
These and perhaps other reasons make organizations choose hybrid approaches to software development as they find agile too extreme and inefficient in large organizations and I have worked in such organizations in the past more than 15 years, so perhaps my views are twisted from this perspective.

I'm an 'advanced user'

This was written to me by a support representative. It was after asking (and insisting) for some simple (in my view) improvements in the software provided and supported by a big vendor for whom this person was working. He wrote it in the sense that it's only me that is asking for these improvements, which made it automatically clear to me that they won't be implemented. Well, if this defines me as "advanced user", so be it, but I'm quite disappointed, because I was expecting more from a SOHO device using free and open source software. I may write a separate article on this later.

I'm myself

Last, but not least I'm still myself with all the pros and cons as a person and professional. I do not pretend to be everything to everyone, I do not pretend to know everything and I do not pretend to be the most experienced out there. Everything I learned about myself in 2018 was based on some experience and I'm sure it would help me become a better person and professional in future.

2018-12-29

Functional dependencies recognition in groupping by MySQL

If you've used MySQL in the past you should remember it was possible by default to aggregate in queries without a GROUP BY clause. So you could write a query like this one

SELECT D.dname, COUNT(E.empno) cnt
  FROM dept D,
       emp  E
 WHERE D.deptno = E.deptno;

most probably with the idea to count the employees in each department. However, the result you'll get (without error) would be:

+------------+-----+
| dname      | cnt |
+------------+-----+
| ACCOUNTING |  14 |
+------------+-----+

which is simply incorrect and most probably not what you actually want. Such query would result in an error in other databases (e.g. Oracle would give ORA-00937: not a single-group group function, PostgreSQL would say SQL state: 42803 ERROR: column "d.dname" must appear in the GROUP BY clause or be used in an aggregate function, etc.).

The reason for this behavior was the default SQL mode of the server and more precisely the absence of ONLY_FULL_GROUP_BY in the list with modes. There are some applications (like Bugzilla and Cacti) that did not allow me to enable this mode on my server although it was long suggested by some (e.g. I remember the presentations of Ronald Bradford from BGOUG Autumn 2010). I remember having problems with these applications, so I put a comment in the defaults file to not enable ONLY_FULL_GROUP_BY. I must admit I haven't checked recently, so the situation could have improved, but with Bugzilla's current state it's doubtful.

The default SQL modes changed much with the latest MySQL releases:
The only difference in 8.0 is in fact the absence of  NO_AUTO_CREATE_USER, because it was removed after being deprecated in previous releases. It's highly recommended to stick to the defaults especially when starting new application, so you do not have to fix queries and data later.

When ONLY_FULL_GROUP_BY was enabled with MySQL 5.7.5 (released on 2014-09-25) Oracle took care to implement a sophisticated functional dependencies as Guilhem Bichot wrote in December 2014. The effort was praised by Markus Winand in his great article One Giant Leap For SQL: MySQL 8.0 Released from April 2018, who wrote that MySQL now has "the most complete functional dependencies checking among the major SQL databases". Let's explore this, but before it you may refer to Wikipedia for a quick review of the definition of functional dependency. To put it simply Y is functionally dependent on X if the values of X unique determine the values of Y. For example the column values in a table are functionally dependent of the primary key, because its value uniquely determines all other values in a row.
Take a look at the following example with the primary key. It's the query from above, but with a GROUP BY clause:

SELECT D.dname, COUNT(E.empno) cnt
  FROM dept D,
       emp  E
 WHERE D.deptno = E.deptno
 GROUP BY D.deptno;

Is this query valid? The grouping is by D.deptno, but the column list in SELECT consist only of D.dname and aggregation on E.empno, so it's wrong, right? Not in MySQL 5.7 and above with enabled ONLY_FULL_GROUP_BY, so the old rule to put everything from GROUP BY into SELECT or use aggregation doesn't apply (i.e. no more SQL-92 restrictions). The result is:

+------------+-----+
| dname      | cnt |
+------------+-----+
| ACCOUNTING |   3 |
| RESEARCH   |   5 |
| SALES      |   6 |
+------------+-----+

Why so? MySQL properly determines that D.dname is functionally dependent on D.deptno, because this is the primary key for dept table. Same query works properly also in PostgreSQL, but not in Oracle and other major SQL databases. MySQL could properly determine functional dependencies also on unique keys, equalities in the WHERE clause or [LEFT] JOIN ON clauses and from selected expressions in a view’s or derived table’s body. It's quite exciting, right? Be sure to use it next time when you consider writing a GROUP BY query.

For more details, please refer to chapter Detection of Functional Dependence in the manual.

2018-11-28

Execution plan change in MySQL 8.0

About 6 weeks ago at work I had to write a new report query, because an application page was loading slowly (about 25 seconds). It turned out there is a problem in the application, because it was first executing a relatively slow query (taking about 6 seconds) and then a relatively fast query for each returned row from the first query. That second query was taking between 0.3 and 0.5 seconds per execution, but it was executed about 60 times, so the delay accumulated and explained the slow page load.
I decided to provide a single query that produces the whole result set, so I had to join 6 tables. One table (msg_progs) had just 13 rows, another two had just 64 rows (bldenv and prjprocstat) and another one just about 100 rows (p) and these four had no tendency of growing much more. There was one table with several tens of thousands of rows (m) and another with several hundreds thousands of rows (mtp), which for sure were going to grow on a daily basis. To improve the selectivity in the second largest table (m) I created a multi-column index on three columns involved in the query, but the problem was that sill much of its data was selected. Anyway, I ended up with a query that was executing for 6-7 seconds, which was much more acceptable from the initial 25 seconds. I'm not able to provide all the details, but I have captured the execution plan and I consider it safe to publish.
Query explain plan in MySQL 5.7.20
Remark: MySQL Workbench nicely visualizes the execution plans graphically (notice the lines connecting the different operations - the higher the number of rows the thicker the line). Textual execution plans are thing of the past, right?

That was until 3 weeks ago when we upgraded to MySQL 8.0.13 (from 5.7.20). The same query and in the same database started running for 24 seconds. When I checked the execution plan it has changed (see below).
Original explain plan in MySQL 8.0
I think you could easily notice that the join order has changed. Previously the second largest table was considered in the first nested loop, then the largest table and finally the smaller tables. Now it's the opposite as the optimizer first considers the smaller tables and only then the largest ones. I could have used a hint to fix the join order (e.g. JOIN_ORDER), but I noticed that if I re-create the multi-column index on just two of the columns (that are more selective in fact) the optimizer chooses a different plan, but the query execution time is still about 14 seconds. I finally decided to remove one of the tables in the joins (the smallest msg_progs with just 13 rows), which made the optimizer choose a different execution plan (here below) and the query now runs for about 4 seconds.
Finaly explain plan in MySQL 8.0
I'm not sure where this difference come from apart from the server, but I cannot blame without a proof (e.g. a reference in the manual that explains the difference), which unfortunately I haven't had the time to search. If you know the reason for the change of execution plans in MySQL 8.0 please, let me know.
The situation reminded me of Oracle database where after each major release we had queries with modified execution plans and had to use query hints or other tricks to make them fast again. It appears to be no different with MySQL nowadays.
Update 2018-12-17: I made a premature conclusion above, because the change of the plan happened (or was at least noticed) after we upgraded the server. However, as it turned out the real reason was the number of rows into tables m and mtp for which the server was choosing a different execution plan. We found it out experimenting with different in volume data sets.

2018-11-27

Game of authorities

In the management theory and practice there are cases where the use of authority is necessary (e.g. for enforcing deadlines, for resolving conflicts, etc.). However, while I understand the need, I really do not understand the managers that have to use authority when left with no other argument. I had situations in the past when authority was necessary, but I was always reluctant and I have never used it without a strong argument (which has been kindly acknowledged). What I mean is that I cannot simply enforce something onto someone if I do not have a good reason for this and if I have not justified it in front of the involved parties.
The point is that sometimes the arguments of the parties in a discussion may seem equally important and valid, so arbitration and use of authority may be inevitable. One way to solve this is to base on previous facts like previous written discussions, agreements, procedures and practices about the same or similar matters. What if the other party simply doesn't accept the additional arguments? Well, I really do not know, but I think it speaks enough of the other party. In enterprise communities without good documentation management and constant change of authorities knowledge is easily lost over time. New rules and procedures are created every day, because nobody is aware or willing to accept what was previously defined. That is how legends are created.

2018-11-15

MySQL Community Data Masking and De-Identification

Oracle recently introduced a Data Masking functionality in MySQL, but it's available only in the Enterprise edition (see MySQL Enterprise Data Masking and De-Identification in the manual). I'm not a customer, but I'm acquainted with the concept as well as stored routines, so I decided to try emulating what MySQL has implemented for their Enterprise edition of the database.

I started by reading the manual, then reviewed formats of payment card and social security numbers to finally came up with a simple implementation this evening. I have implemented only the general and special purpose masking functions and the functions for generation of random data (e.g. e-mail, primary account number, social security number (SSN) and US phone number). I have an additional function gen_rnd_string for generation of a random string with given length (only alphabetic characters by default). I've made some test and the results look promising, but I have to experiment more and perhaps generate better data that could eventually pass verifications, because an application using randomly generated data could easily fail if it cannot validate the data. It remains me to implement the function for generation of random data using dictionaries for which I think I could use LOAD DATA INFILE and CREATE TEMPORARY TABLE for loading dictionary.

It's just my simple idea of how one could implement data masking and de-identification even in MySQL Community edition and as stored routines, because MySQL's implementation if based on user-defined functions (UDF). I wonder why they chose this approach and here is my take on the possible answers:
  • Visibility: A user-defined function has global visibility, so it could be called without proceeding schema name just like any other built-in function. My implementation has to be loaded into a schema and thus each function would have to be called with schema name in front.
  • Default arguments: MySQL still doesn't support default arguments for stored routines, but this is possible for user-defined functions as these could be written only "in C or C++ (or another language that can use C calling conventions)". The general purpose masking functions mask_inner and mask_outer accept an optional argument for the masking character. In a stored procedures implementation this could be only a user-defined variable. I'll implement this later.
  • Hiding implementation details: User-defined functions confine MySQL's implementation into the shared library binary. The source code of the stored routines could be easily reviewed by a user with enough permissions (e.g. an administrator). To my knowledge MySQL doesn't offer obfuscation utility like Oracle (see Obfuscating PL/SQL Source Code). A side point to this is that the manual also doesn't provide enough details on the possible values by the data generation functions.
  • Security: This could be related to how the server is handling the memory for UDF and stored routines calls, but that's too much internals.
  • Performance: User-defined functions by presumption should have better performance compared to stored routines, because the former are function calls in natively compiled shared library and the later are interpreted by the server on each invocation.
I'm eager to discuss this tomorrow on BGOUG Autumn Conference 2018 where there would be a presentation exactly on this topic. I'll update this post or write another one afterwards.

Update 2018-11-16 19:01 EET: The BGOUG conference confirmed my presumptions about the reasons to implement data masking as UDF functions. One of the reasons cited was exactly "performance".

Update 2018-11-18 19:23 EET: After playing a bit more and trying to implement the dictionary functions today, I hit two restrictions I haven't though about before. The first restriction is the impossibility to use prepared statements into a function (i.e. from chapter Prepared SQL Statement Syntax in the manual "SQL syntax for prepared statements can be used within stored procedures, but not in stored functions or triggers"). The second restriction is the impossibility to use LOAD DATA INFILE as prepared statement (see section SQL Syntax Allowed in Prepared Statements in the manual). The first restrictions means that dictionary functions could be implemented only as procedures. The second restrictions means that it's impossible to load dictionary from file, so implementation of function gen_dictionary_load cannot be completed - it could create the temporary table for the dictionary, but it cannot load it. I still created the procedures and was able go generate random US cities. I also benchmarked the performance with gen_rnd_pan function as it does both random number generation and Luhn number calculation. On my MySQL 5.7 server running on Intel Pentium G3420 @ 3.2 GHz with 8 GB DDR3 RAM I ran SELECT BENCHMARK(count, gen_rnd_pan()) and the results for different execution counts were the following:
  • 1000 times - 0.28 seconds;
  • 10000 times - 3.13 seconds;
  • 100000 times - 31.41 seconds;
  • 1000000 times - 278.19 seconds.
Unless MySQL removes the restrictions for me the topic is concluded. I could try to implement the data masking functions as UDF like MySQL if I have the time and if I have the real need.

2018-10-23

New development features in MySQL 8.0.13

Ah, what a week already and it's not over! First, Oracle released Oracle XE 18c (as I wrote on Sunday) and yesterday (October 22-nd) MySQL 8.0.13 arrived along with (as already usual) the whole feet (Router, Shell, Workbench and C++/J/ODBC/NET/Python/Node.js Connectors). I was eager to read the change log and as the refereed article from MySQL's Server Blog also suggests there are some interesting new features for designers and developers. Here below are my picks and few words on upgrading from 8.0.12.

SQL Default Values now could be also function or expression

It's now possible to have default values as function or expression. This is an important new feature. Before only literal values were allowed, which was quite limiting. The removal of this limitation means that it is now possible:
  • to generate UUID values by default (by using DEFAULT (uuid_to_bin(uuid())) for example);
  • to generate default values for geometry data types (by using DEFAULT (POINT(0,0)) or DEFAULT (ST_PointFromText('POINT(42.69751 23.32415)', 4326)) for example);
  • to generate default values for JSON columns (by using DEFAULT (JSON_ARRAY()) or DEFAULT ('[]') for example); and
  • other complex values (like DEFAULT (CURRENT_DATE + INTERVAL 1 DAY) or DEFAULT (PI() * POW(r, 2)) for example) using function or functions in an expression.
Remark: Please, note the parentheses surrounding the expression in DEFAULT (see Handling of Explicit Defaults as of MySQL 8.0.13 in the manual). MySQL Workbench 8.0.13 again has some problems parsing the new syntax (see screenshot below), so I opened bug 92900 and it already got verified:

MySQL Workbench 8.0.13 erorr on default expression

I tried the new functionality with the following:

CREATE TABLE def_expr (
  id        INT           NOT NULL AUTO_INCREMENT,
  uuid_def  BINARY(16)    DEFAULT (uuid_to_bin(uuid())),
  geo_def   POINT         DEFAULT (Point(0,0)),
  geo_def2  GEOMETRY      DEFAULT (ST_PointFromText('POINT(42.69751 23.32415)', 4326)),
  json_def  JSON          DEFAULT (JSON_ARRAY()),
  json_def2 JSON          DEFAULT ('[]') /* this works too */,
  tomorrow  DATE          DEFAULT (CURDATE() + INTERVAL 1 DAY),
  radius    INT           DEFAULT (FLOOR(1 + (RAND() * 10))),
  area      DECIMAL(10,3) DEFAULT (ROUND(PI() * POW(radius, 2), 3)),

  PRIMARY KEY (id),
  UNIQUE INDEX id_UNIQUE (id ASC) VISIBLE
);



Then inserted two dummy lines and got the following results:

INSERT INTO def_expr VALUES (); /* x 2 */

SELECT id,
       bin_to_uuid(uuid_def) uuid_def,
       ST_AsText(geo_def)    geo_def,
       ST_AsText(geo_def2)   geo_def2,
       json_def, json_def2,
       tomorrow, radius, area
  FROM def_expr;


+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
| id | uuid_def                             | geo_def    | geo_def2                 | json_def | json_def2 | tomorrow   | radius | area    |
+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
|  1 | a2747ee0-d6ee-11e8-b02e-02004c4f4f50 | POINT(0 0) | POINT(42.69751 23.32415) | []       | []        | 2018-10-24 |      2 |  12.566 |
|  2 | a2ff920b-d6ee-11e8-b02e-02004c4f4f50 | POINT(0 0) | POINT(42.69751 23.32415) | []       | []        | 2018-10-24 |     10 | 314.159 |
+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
2 rows in set (0.0021 sec)


That's cool, right? I'm eager to use it in my projects.

SQL Functional Indexes


It's now possible to have functional indexes (see Functional Key Parts in the manual), which means that the index is build from expression values, instead of column values or column prefix values. In other words you could index values not stored in the table, which I think could be quite powerful in some cases. In composite indexes with multiple key parts you could have mix of functional and nonfunctional (regular) key parts. Functional key parts are implemented as hidden virtual generated columns (see CREATE TABLE and Generated Columns in the manual) available since MySQL 5.7.6 (from 2015-03-09). This also means that functional key parts have the same restrictions (i.e. deterministic built-in functions and operators are permitted, but subqueries, parameters, variables, stored and user-defined functions are not permitted).
To try the new functionality I though that IP addresses are usually stored as string in databases and then parsed, but the address could be represented as an (unsigned) integer value, which requires just 4 bytes. So I used the following:

CREATE TABLE func_index (
  id        INT        NOT NULL AUTO_INCREMENT,
  ipaddr4   INT UNSIGNED /* 4 bytes */,

  PRIMARY KEY (id),
  UNIQUE INDEX id_UNIQUE (id ASC) VISIBLE,
  INDEX func_idx ((INET_NTOA(ipaddr4)))
);

Remark: Please, note the parentheses surrounding the expression. Without them there would be цоде 1064 (syntax error). Not surprisingly MySQL Workbench 8.0.13 wasn't ready for the new syntax, so I opened bug 92908 and it also already got verified.

MySQL Workbench 8.0.13 erorr on functional index

Then inserted a line and explained two SELECT queries below:

INSERT INTO func_index (ipaddr4) VALUES (INET_ATON('192.168.1.1'));

SELECT * FROM func_index WHERE ipaddr4 = INET_ATON('192.168.1.1');
SELECT * FROM func_index WHERE INET_NTOA(ipaddr4) = '192.168.1.1';

The first one would result in full table scan, while the second one would read the functional index func_idx. In order for the query to use the index it's important that the expression in the WHERE clause matches index's specification.

Option to Disallow Tables without a Primary Key

There is now the system variable sql_require_primary_key, which prevents the creation or altering of tables without primary key if set. In such case CREATE or ALTER queries would now fail with error code 3750 (see also SQL state ER_TABLE_WITHOUT_PK).
I tried this new possibility by setting the global variable in a session like this:

SET GLOBAL sql_require_primary_key = ON;

but I was able to create table without primary key. I have to check it again tomorrow and eventually report a bug. Setting the variable in my.ini and restarting the server worked just fine:

SQL> CREATE TABLE tab_no_pk (test INT);
ERROR: 3750: Unable to create a table without PK, when system variable 'sql_require_primary_key' is set. Add a PK to the table or unset this variable to avoid this message. Note that tables without PK can cause performance problems in row-based replication, so please consult your DBA before changing this setting.

I'll finally have a break from developers not familiar with relational concepts :-)

Geometry transformation

It's now possible to transform geometry from one spatial reference system (SRS) to another using built-in function ST_Transform, which accepts as second argument the target SRS. I'm not that much into GIS, so I didn't tested this one, but it seemed worth mentioning.

A few words on upgrading (from 8.0.12)

Upgrading my Windows installation was flawless with MySQL Installer for Windows. However, I had problem with my Linux installation, which consist of more databases. The server wasn't able to start and in the error log I found errors like the following:

2018-10-23T12:47:49.627732Z 1 [ERROR] [MY-013235] [Server] Error in parsing View 'db'.'aView' during upgrade. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc' at line 1

I opened bug 92898 for this and the problem already got explained by support with an incompatible change (see SQL Syntax Notes from the release notes). I'm pretty sure I've read about ASC or DESC qualifiers for GROUP BY clauses being deprecated and I really do not remember why I used such syntax, but it got me off guard, so I had to update view definitions.

There's a lot more in MySQL 8.0.13 and I strongly suggest you review The MySQL 8.0.13 Maintenance Release is Generally Available article and release notes, then explore for yourself.