Georgi Sotirov's Blog

My personal and professional life

2020-10-20

MySQL 8.0.22 released

Yesterday, Oracle released MySQL 8.0.22, so it came as a kind of birthday present to me :-) This is of course the next purely maintenance release in the 8.0 series, so there are no big new features, but I normally checked the release notes and the official blog post. There are many small changes and bug fixes, but the list below are the ones that caught my attention:

  • Derived condition pushdown optimization for eligible queries - it could improve the performance of some queries, by rewriting the following query
    • SELECT * FROM (SELECT i, j FROM t1) as dt WHERE i > 10; into
    • SELECT * FROM (SELECT i, j FROM t1 WHERE i > 10) as dt;
  • Parenthesized Query Expressions enables you writing statements like ( query_expression ) [order_by_clause] [limit_clause] [into_clause]. Just a few examples from the manual to get the idea:
    • (SELECT 1 AS result UNION SELECT 2);
    • (SELECT 1 AS result UNION SELECT 2) LIMIT 1;
    • (SELECT 1 AS result UNION SELECT 2) LIMIT 1 OFFSET 1;
    • (SELECT 1 AS result UNION SELECT 2) ORDER BY result DESC LIMIT 1;
    • (SELECT * FROM t1 ORDER BY a) UNION (SELECT * FROM t2 ORDER BY b) ORDER BY z;
  • CAST() or CONVERT() to YEAR - e.g. SELECT CAST('1945ad' AS YEAR) yr; or SELECT CONVERT('1945ad', YEAR) yr; both produce 1945.
  • Retrieve a stored TIMESTAMP value in UTC - please, note "stored" here, because you cannot convert CURRENT_TIMESTAMP to another time zone like this (i.e the statement SELECT CAST(CURRENT_TIMESTAMP() AT TIME ZONE 'UTC' AS DATETIME) utc_tz; results into Error Code: 3998. Cannot cast value to TIMESTAMP WITH TIME ZONE). Given that chapter The DATE, DATETIME, and TIMESTAMP Types in the manual stipulates that "MySQL converts TIMESTAMP values from the current time zone to UTC for storage, and back from UTC to the current time zone for retrieval." it should mean the value is retrieved as it's stored, which may be what you want if you'd like to offset the conversion out of the MySQL server in case its time zone support is not suitable for your needs.
  • Read only option for schema provides a way to completely prohibit writes to entities in a schema. It's done with ALTER DATABASE statement (e.g. ALTER DATABASE test READ ONLY = 0) with the purpose to simplify online data migrations.
  • Support for SRV DNS records in the client (see Connecting to the Server Using DNS SRV Records).

I had problems building the new version on Slackware -current with Protobuf library 3.2.1, because apparently the minimum requirement on the library version was changed without notice (see bug 101230) and furthermore is not properly detected and reported by the build system leading to compilation errors afterwards (see bug 101231).

2020-07-20

MySQL 8.0.21 released

On Monday (July 13-th) Oracle released MySQL 8.0.21 server and related products, which is the next maintenance release from the 8.0 series that does not contain many new features for developers. The ones I spotted are the following.

New JSON function JSON_VALUE

There is new, SQL 2016 standard function JSON_VALUE, which is the same as calling CAST(JSON_UNQUOTE( JSON_EXTRACT(json_doc, path) ) AS type). It makes code shorter and clearer as well as more standards compliant, but the direct effect is that it simplifies creation of indexes on JSON fields by removing the need for generated columns.

Safer LOAD DATA LOCAL INFILE on clients

The problem with this command was that the server could request any file (see security considerations), so now there is the option load-data-local-dir with which clients could define what is allowed. I hope support for the command would now be integrated in Workbench, so that remote data loads are faster.

MySQL Shell

There is now support for JSON schema validation through the X DevAPI with the validation JSON object to ensure a certain structure of the documents being created or updated in a collection. See more in JSON Schema Validation article of X DevAPI's User Guide or in Shell's release notes.

MySQL Workbench

The new version comes with syntax support for various 8.0.19 and 8.0.20 server language features. This fixes bugs 98263 and 98266 which I reported.

There are of course many other new features and improvements as well as bug fixes for which you could read in details into the release notes and if you care only for the highlights then check out the article in the MySQL Server Blog.

2020-05-23

Happy 25th Birthday MySQL!

Today, May 23-rd 2020, MySQL turns 25 years old. I first got acquainted with it during my university years as I was searching for a free and open RDBMS system to learn and experiment with SQL. And I used it ever since then. I thus followed the evolvement of "The world's most popular open source database" over the past about 20 years since 3.2x versions. It's a big anniversary. So what does MySQL mean for me?
  • First and foremost, MySQL is the relational (and also non-relational) database of choice for both my personal and professional projects. For me it's easier to think new database schemes in the terms of MySQL and architect new systems on top of MySQL as a database system;
  • MySQL is getting better and better with the years, which keeps me interested. I still remember the arrival of:
    • 4.0 (with support for unions, etc.),
    • then 4.1 (sub-queries and prepared statements, etc.),
    • then 5.0 (stored routines, triggers and views, distributed transactions, etc.),
    • then 5.1 (table partitioning, XML and event scheduler, etc.),
    • then 5.5 (SIGNAL and RESIGNAL, PERFORMANCE_SCHEMA),
    • then 5.6 (Full-text for InnoDB, etc.),
    • then 5.7 (JSON, named triggers, generated columns, optimizer hints, etc.),
    • and more recently 8.0 (CTEs, window functions, regular expressions improvements, GROUPING function, document store, CHECK constraints, LATERAL, functional key parts and defaults and many, many more);
  • I'm building MySQL server (and other products) for Slackware Linux since 2005, because at first I needed newer versions than the ones provided by the official repository and then, because Slackware switched to MariaDB in 2013, but I choose to stick with MySQL. This doesn't mean I do not use MariaDB at all and in some aspects I believe it's ahead;
  • I'm reporting bugs for MySQL and related products, so I could get problems resolved (see the story of some of my bug reports);
  • I'm blogging about MySQL for the past 2 years since the release of MySQL 8.0, which helps me get familiar with the new features;
  • I'm also presenting about MySQL from time to time, which helps me engage with the community, learn from and exchange ideas with the more experienced then me;
  • Last, but definitely not least, there is a great and very strong community around MySQL, with which I got the chance to engage more in the past 3 years. This is what makes a project like MySQL strong over so many years.
To put it simply for me MySQL is a lot more than a database. MySQL has been with me for a significant part of my life so far and it makes me happy to see the database continues to evolve. Still I believe that improved SQL standard support and new development features should have arrived earlier, but better later than never. Also the development could be more open, but I'm confident that the future is open and bright.

Here's to the next 25 years and beyond! Happy birthday MySQL! Cheers to all that use and love MySQL!

    2020-05-03

    MySQL 8.0.20 released

    On Monday (April 27-th) Oracle released MySQL 8.0.20 Server and related products. You could check the release notes and official blog post. The focus of this release is binary log transaction compression to save disk space and network bandwidth, so there are no much new features for developers. Still here are the tings I noticed in the fore mentioned resources.
    • New index-level optimizer hints for existing index hints (mind the difference!) that include the following.
      • GROUP_INDEX/NO_GROUP_INDEX as equivalent to FORCE INDEX FOR GROUP BY/IGNORE INDEX FOR GROUP BY index hints;
      • INDEX/NO_INDEX as equivalent to FORCE INDEX/IGNORE INDEX index hints;
      • JOIN_INDEX/NO_JOIN_INDEX as equivalent to FORCE INDEX FOR JOIN/IGNORE INDEX FOR JOIN index hints;
      • ORDER_INDEX/NO_ORDER_INDEX as equivalent to FORCE INDEX FOR ORDER BY/IGNORE INDEX FOR ORDER BY index hints;
    • Hash join optimizations now supports also outer, anti and semi join types, which means that for the hash join to work it is no longer necessary to have at least one equi-join condition.
    • There are several improvements to EXPLAIN statement:
      • with FORMAT=TREE includes new inversion information for windowing functions. Unfortunately, I found no further information on this;
      • with FORMAT=TREE provides more information about evaluated window functions matching that supplied for regular aggregates. You should now see lines like "Window aggregate with buffering";
      • for ANALYZE there is now the possibility to stop the execution with KILL QUERY or CTRL+C key combination on the command line;
      • ANALYZE should now support the FORMAT option with the only possible value of TREE as reflected in the release notes and the manual, but for me it results in error.

        mysql> EXPLAIN ANALYZE FORMAT=TREE SELECT 1+1;
        ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FORMAT=TREE SELECT 1+1' at line 1


        Anyway, I hope this is the first step in implementing support for this command in Workbench in future (see bug 97282).
    There are of course several deprecations that include:
    Nothing changed in MySQL 8.0.20 Workbench, but there is still release and release notes.

    In waiting for the next release you could update your code to avoid surprises from removed previously deprecated features.

    2020-02-29

    Migrations to Git and GitHub continue

    Last year I started migrating from CVS to Git and GitHub (see First migrations from CVS to Git), which is something I was willing to start back in 2018 (see Migration to Git and GitHub), but planned since even longer ago. Well, it proved to be a tough task, because I'm still migrating as many of my CVS repositories required corrections, so the history could be properly transferred in Git. During the migration I was surprised how negligent I was to my sources, because I found some uncommitted changes from as far back as 10 years ago, corrupted revision control files and various history inconsistencies. After migrating some projects, I immediately started working on updates and fixes, so I already continue my work on GitHub. I also enabled continuous integration for some projects using Travis CI and GitHub Actions.

    I'm currently migrating my Slackware package's build scripts, which is a collection of over 300 Shell scripts and related files organized as separate repositories in dedicated directories, but under the same root. In CVS this was completely right, but it does not produce good history in Git, because the version tags or branches (e.g. FFmpeg-3_4_7 or MySQL-5_5) are specific only to some build script and used only by the files in its respective directory. I thus decided to migrate these separately and then combine them with another repository with sub modules. There were many problems with this migration as well.

    These were some the problems:
    • tags or branches with similar names (e.g. TEST-123 and TEST_123). Those are easy to correct - I just delete the wrong tag/branch;
    • misplaced tags or branches. Since in CVS I was tagging and branching as necessary some tags and branches were put on different files at different time, which in Git history resulted in commits with the message "This commit was manufactured by cvs2git to create tag 'TEST-123'" or "This commit was manufactured by cvs2git to create branch 'TEST-123'" with files being added, deleted or modified to adjust history for the tag or branch respectively. I'm fixing such problems by reordering the problematic commits, which is easy by just changing the time in the revision control file in the repository, but requires time to review and understand the reason;
    • unused files not deleted in version history. This also produced the commits with messages mentioned in the previous bullet point. I'm fixing these by deleting the unused files with past date, so such files are not considered by consecutive commits;
    • files belonging to a branch, but committed to main trunk instead. It's similar to the above. In some cases, there are files that should exist only on some branch (e.g. like patches to fix issues with specific software versions), but were committed to main trunk. As in CVS there is no big difference between tags and branches it was enough just to tag the changes for checking out properly. I'm moving these from main trunk into a branch, which means deleting revisions from main trunk and moving them on the branch;
    • spelling errors. Since I mostly committed to CVS on the command line and I did not use spell checker there were many commit messages with spelling errors that irritate my eyes. I'm correcting them after initial migration to Git, so I could check all messages at once.
    Two utilities proved to be of great help after migration - licensee and github-linguist. I use the first to check whether license is detected properly. I had some problems (see issues 361 and 392) with some of my projects, so I now check before pushing to GitHub. The second is useful for fine tunning languages detection (e.g. I'd like to see just SQL and not PLSQL, PLpgSQL, SQLPL or TSQL that linguist would detect although all my published SQL sources are for MySQL. And for me SQL is code, not data).

    Anyway, this post is to note that my migrations to Git continue. I've made great progress last year by migrating 28 repositories. And this year by now I migrated 114 more. So hopefully by the end of the year I'll be free from using CVS (or at least only by exception). Initially I planned migrating everything, but now I consider skipping some low value projects or examples, which if necessary, I could migrate later.

    2020-01-17

    MySQL 8.0.19 suite released

    MySQL 8.0.19 suite including Community Server, Router and Shell, was released on Monday (01/13). The focus of these releases is the new InnoDB ReplicaSet functionality that simplifies management of classic MySQL replication with single primary and multiple secondaries in the same way as setting up Group Replication with InnoDB Cluster. You could read more about it in the article The all new MySQL InnoDB ReplicaSet. Apart from this there are some other improvements for which you could read below.

    SQL syntax

    The server now understands the more general and standard ALTER TABLE … DROP/ALTER CONSTRAINT syntax for which I tweeted earlier. It's a feature request since MySQL 4.1 from more than 15 years ago. You could find more in Foreign Keys and Other Constraints section of ALTER TABLE article in documentation.

    Table value constructors are another improvement towards the standard allowing generation of rows with the VALUES statement (not to be confused with VALUES keyword for INSERT statements). Due to name conflict with the VALUES function it's necessary to use the verbose form with ROW constructor. This basically means the standalone statement VALUES ROW(1), ROW(2), ROW(3) is now properly understood by the server. VALUES statement could be used also in unions, joins, as derived table and other cases where you could employ SELECT. See the VALUES statement article in documentation for more details.

    The new TABLE statement implements another standard feature - explicit table clauses. It could be used like SELECT, because TABLE t is just like SELECT * FROM t, so it could be used in unions, with INTO, as IN subquery and elsewhere where SELECT could be employed.

    An interesting new addition is the LIMIT clause for recursive CTEs, which allows the debugging of expressions generating too many results. The idea is simply to limit the result from a recursive CTE to some reasonable number (e.g. 10, 100, etc.), so you could see the generated output and be able to understand how and why it's generated like this. This new features is covered in the article A new, simple way to figure out why your recursive CTE is running away and in Limiting Common Table Expression Recursion section of WITH (Common Table Expressions) in documentation.

    It's is now possible to use an alias for the new row and optionally columns with INSERT ... ON DUPLICATE KEY UPDATE statement following the VALUES or SET clause with the use of AS keyword. I guess this is the first step to removing VALUES function (see above) for better standards support. You could find more in the article for the ON DUPLICATE KEY UPDATE Statement.

    Datatype changes

    The TIMESTAMP and DATETIME types could now include timezone information as offset (e.g. +02:00 or -08:00) appended to the value. The details are in the article The DATE, DATETIME, and TIMESTAMP Types in documentation.

    The YEAR(4) data type (i.e. with explicit display width) and the (undocumented) UNSIGNED attribute for YEAR are now deprecated and would be removed in future. This follows other deprecations of display widths and other type attributes from 8.0.17 for which I wrote before, so developers should check and correct any SQL scripts using deprecated types and attributes. With this release statements that print data type definitions (e.g. SHOW CREATE TABLE) no longer show display width for integers with exception for TINYINT(1) and ZEROFILL attribute.

    Other changes in server

    User-defined functions (UDFs) can now determine character set and collation of string argument as I noticed earlier. You could find more about this in the article A Tale of UDFs with Character Sets.

    There are now new standard INFORMATION_SCHEMA views related to roles like ADMINISTRABLE_ROLE_AUTHORIZATION, APPLICABLE_ROLES, ENABLED_ROLES, ROLE_COLUMN_GRANTS, ROLE_ROUTINE_GRANTS and ROLE_TABLE_GRANTS.

    InnoDB engine now supports efficient sampling of data for histogram statistics. Sampling no longer requires full table scan, which could be costly for large tables, but samples data pages from the clustered index instead. I should try this on first opportunity.

    Function JSON_SCHEMA_VALID now provides detailed information about the reasons for validation failures by using SHOW WARNINGS.

    Other notable changes is the possibility to lock user accounts after FAILED_LOGIN_ATTEMPTS login attempts for PASSWORD_LOCK_TIME days (see Failed-Login Tracking and Temporary Account Locking), compression in X protocol (see Connection Compression with X Plugin) and many more as you could find from the release notes.

    Bugs fixed in server

    The previously reported problem with EXPLAIN ANALYZE not executing and reporting on subqueries in the select list has been fixed (see bug 97296).

    Workbench

    As I read in the release notes of the new version that "The following MySQL server language features are now supported: value references in INSERT statements, new options (TLS version, cipher suite, compression, and privilege check) in CHANGE MASTER TO replication statements, random passwords in CREATE USER and ALTER USER statements, EXPLAIN ANALYZE, and binary collation names.", I was hopeful that the application is finally catching up with the server, but that was until I reviewed the new SQL syntax (see above).

    Two of the bugs I reported previously were fixed - 97416 (for EXPLAIN ANALYZE) and 97281 (for ANALYZE TABLE ... UPDATE HISTOGRAM), but I opened two new for TABLE (see bug 98263) and VALUES (see bug 98266) statements. Such problems should be obvious, so it still strange to me how they continue appearing.

    There is a new menu item "Copy Row (with names, tab separated)" for copying headers and rows of queried data with tabs as separators that makes it easy to move results into Excel or LibreOffice Calc.

    Happy using and experimenting with the new MySQL releases!

    2020-01-09

    Using xrdp on various Linuxes and a problem on Fedora

    I've been using various Linuxes in virtual machines since more than 15 years in order to keep my skills sharp. It's simply not enough to know just one Linux distribution, but you have to know at least the major ones (e.g. Debian, RPM or based on another package management system ones). So apart from Slackware, I also regularly use Debian, Ubuntu, Fedora, OpenSuSE, CentOS and more recently Oracle Linux (for Oracle XE). Over the years, I have also experimented with Gentoo, FreeBSD, OpenBSD, NetBSD and others.
    What I immediately need in any of these is secure shell (SSH) of course, so I could type commands remotely. However, I also like to work and play with GUI software, so I also need remote graphics access. Over the years I used VNC and NoMachine, but I never liked any of these for various reasons. What I have always missed in Linux was an easy to use access like RDP with real session management and all the other goodies that is lightweight and robust (e.g. NoMachine also manages sessions, but encodes video of the desktop, which makes the picture blurred and frequently doesn't work on my Linux virtual machines after upgrades).
    So, I discovered xrdp, which is "an open source RDP server". It's easy to install, configure and run on Debian, Ubuntu and OpenSuSE and it just works. The installation for Fedora requires some additional steps, which are described in README.Fedora by xrdp package's maintainer Bojan Smojver. By default, xrdp on Fedora works only with VNC sessions, but I already said I do not like it, so I prefer using Xorg like with my other Linuxes. I'll summarize the steps here:
    1. Install necessary packages: dnf install xrdp xorgrdp
    2. Edit /etc/xrdp/xrdp.ini and enable Xorg sessions;
    3. Enable access by anybody or just specific users, e.g. echo "allowed_users = anybody" >> /etc/X11/Xwrapper.config (see also bug 1450720);
    4. Enable port 3389 on the firewall, e.g. firewall-cmd --zone=dmz --add-port=3389/tcp
    5. Point your RDP client to the Linux machine and have fun.
    Well, until yesterday. After upgrading my Fedora virtual machine to Fedora 31 the RDP connection stopped working with Xorg sessions although VNC sessions were working fine. The xrdp log at /var/log/xrdp-session.log was suggesting that the Xorg session is crashing just after it's started, so I was getting a blank screen.

    [20200109-10:12:53] [INFO ] setpriv --no-new-privs Xorg :10 -auth .Xauthority -config xrdp/xorg.conf -noreset -nolisten tcp -logfile .xorgxrdp.%s.log
    [20200109-10:12:53] [CORE ] waiting for window manager (pid 4902) to exit
    [20200109-10:12:57] [CORE ] window manager (pid 4902) did exit, cleaning up session
    [20200109-10:12:57] [INFO ] calling auth_stop_session and auth_end from pid 4877

    Investigating the reason, I noticed the following in the journal (e.g. with journalctl -r):

    Jan 09 10:12:55 fedora systemd[4881]: Failed to start GNOME Shell on X11.
    Jan 09 10:12:55 fedora systemd[4881]: gnome-shell-x11.service: Failed with result 'protocol'.
    Jan 09 10:12:55 fedora gnome-shell[5157]:   Current serial number in output stream:  359
    Jan 09 10:12:55 fedora gnome-shell[5157]:   Serial number of failed request:  358
    Jan 09 10:12:55 fedora gnome-shell[5157]:   Minor opcode of failed request:  7 (RRSetScreenSize)
    Jan 09 10:12:55 fedora gnome-shell[5157]:   Major opcode of failed request:  139 (RANDR)
    Jan 09 10:12:55 fedora gnome-shell[5157]: X Error of failed request:  BadMatch (invalid parameter attributes)
    Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRScreenSetSize: not allowing resize
    Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRScreenSetSize: width 1688 height 932 mmWidth 447 mmHeight 247
    Jan 09 10:12:55 fedora gnome-shell[5157]: Xlib:  extension "DPMS" missing on display ":10.0".
    Jan 09 10:12:55 fedora xrdp-sesman[4870]: rdpRRGetInfo:

    So, gnome-shell was failing actually. Initially, I found bug 1575792, which seemed similar, but it wasn't resolved. Finally, the problem turned out to be the latest version of xorgxrdp 0.2.12 (see issue #156). I had to simply downgrade to xorgxrdp 0.2.11, by forcefully installing the RHEL 8's xorgxrdp package like this:

    rpm --nodeps -i xorgxrdp-0.2.11-1.el8.x86_64.rpm

    And things are again back to normal.

    Update 2020-01-14: Of course it's good to eclude package xorgxrdp from upgrades (see Excluding Packages From Transactions) or simply echo "excludepkgs=xorgxrdp" >> /etc/dnf/dnf.conf as one line command.

    Happy using your favorite Linux distro!

    2019-12-25

    Bugzilla in Bulgarian on GitHub

    Exactly 14 years ago on this day I committed the localization of Bugzilla 2.18.1 in Mozilla-BG project repository on OpenFMI. I have considered using Bugzilla as bug tracker for my home server since the spring of 2005 and when I was reviewing the project I noticed that there is no Bulgarian localization, so I decided to contribute. The translation was done in the summer of 2005 and in the autumn I began searching for a way to publish it. Initially, I considered SourceForge, but finally decided to become part of Mozilla-BG project after I was invited by Ognyan Kulev.
    The localization was moved to BitBucket in 2012 after OpenFMI was decommissioned, so the repository was converted from Subversion (SVN) to Mercurial (Hg). It was then when some problems with the SVN repository become evident, but I found no time to fix these until now. Well, it was about time to migrate the project to Git and publish it on GitHub and I accomplished it last week. The Bugzilla-BG project is now on GitHub with all its history since December 2005. And ever since you could try the localization at Sotirov-BG.Net Bugzilla.
    The localization went a long way from CVS on my home server initially to SVN, then Hg and finally ended up in Git. My only project that went through this path :-)

    Have fun!
    Merry Christmas!

    2019-12-05

    Picture-in-picture mode for videos in Firefox 71

    Firefox 71 arrived two days ago (see release notes) and I already absolutely love the new picture-in-picture mode for playing videos while you're browsing. It's a real deal for me, because this is the usual way I'm browsing today - having a video playing in the background and listening to it while surfing websites. The only drawback was that I had to switch to the video tab from time to time, when I'd like to actually see what's playing. This sometimes involved searching for the tab, because I usually have many tabs open in different groups (organized with Panorama View extension).
    Now my online experience is much more pleasant, because I could just start a video, pop it up into a separate small window wherever on the screen I like and then continue surfing or do something else. Like this I could still keep an eye on the video even when I go out of Firefox. It's super useful even when I'm working on something, because I normally use many online resources. The feature is only available on Windows for now, but Mozilla promises to make it available for MacOS and Linux with the next release due in a month or so.
    Thank you Mozilla for constantly making my online experience better!

    2019-10-18

    New features in MySQL 8.0.18

    On Monday, Oracle released MySQL 8.0.18 (see also release notes) and unlike with previous releases there was lot of information about what would be included into this one. I personally attended several presentations and read some posts in social media. Like I already wrote the major new features in this release are hash joins and EXPLAIN ANALYZE, which aim to respectively improve the performance of some queries and give better insight about how optimizer plan compares to actual execution.

    Hash join optimization

    Before 8.0.18 the only type of join done by MySQL was nested loop (a loop within a loop, an inner loop within an outer one). And there were two algorithms implementing it:
    • Nested-Loop Join - it's the simplest where each row from the outer loop is passed to the inner loop for processing. The obvious drawback of this one is that the inner table needs to be read many times.
    • Block Nested Loop (BNL) - uses buffering of rows from the outer table to reduce the number of times the inner table is read. In this case instead of passing just one row to the inner loop multiple rows from the buffer could be passed at once. This greatly reduces the number of time the inner table needs to be read. Of course, this requires more memory (see join_buffer_size). This algorithm is used for range, index and ALL join types.
    There is also Batched Key Access (BKA) algorithm, which uses buffering like BNL, but puts keys for the rows in the buffer and passes them in a batch to the database engine for index lookups. The keys are then used to fetch the rows from the inner table. This algorithm can be used when there is index access to the inner table.

    Here come hash joins. The idea is to build a hash table for the values from the outer table, which normally is the smallest. Then, read the inner table and find matches in the hash table. Like this both tables could be read at best just once. Hash joins are best for large results sets where indexes cannot be used. Due to their nature they are useful only for equality joins. Since 8.0.18 MySQL would choose hash join before BNL for any query that uses equality join condition (e.g. ref or eq_ref join types) and uses no indexes. Let's try it out.

    For the example, I added a new table called job_sal with salary ranges per job to the dept_emp schema and I also generated 1 million more employees. Now let's say you want to find all the employees whose salary is outside the range. I would come up with a query like this:

    SELECT E.ename, E.sal, JS.sal_min, JS.sal_max
      FROM emp     E,

           job_sal JS
     WHERE E.job = JS.job

       AND E.sal NOT BETWEEN JS.sal_min AND JS.sal_max;


    Which given the absence of indexes would be executed using BNL algorithm in MySQL prior to 8.0.18, but in the new release the same query would benefit from the hash join optimization as visualized by the following execution plan in tree format:

    +----------------------------------------------------------------------------------------+
    | EXPLAIN                                                                                |
    +----------------------------------------------------------------------------------------+
    | -> Filter: (E.sal not between JS.sal_min and JS.sal_max)  (cost=499211.71 rows=442901)
        -> Inner hash join (E.job = JS.job)  (cost=499211.71 rows=442901)
            -> Table scan on E  (cost=1962.39 rows=996514)
            -> Hash
                -> Table scan on JS  (cost=0.75 rows=5)
     |
    +----------------------------------------------------------------------------------------+
    1 row in set (0.0023 sec)


    Visual Explain Plan
    The plan reveals that the smaller table JS would be hashed and the join would be done using the hash table. Unfortunately, the hash join optimization is not visible in traditional explain plan, JSON and thus into MySQL Workbench. The visual explain plan (look right) would be thus misleading showing BNL as the join algorithm. This is understandable, because the hash join optimization is possible only with the new iterator executor, which actually generates the explain plan in TREE format. This executor is not able to explain some queries, so you may see only the message "not executable by iterator executor". However, I really hope this is improved in future releases, because the explain plan should be consistent between formats. I reported a feature request as bug 97280.

    I tested the performance of the query for 1M employees and its execution time was 0.89 sec. Disabling the hash join optimization with NO_HASH_JOIN hint increased the execution time to 1.26 sec. Hash join for sure would be much more beneficial when outer table has more rows.

    EXPLAIN ANALYZE

    This new feature also comes on top of TREE explain plan and it represents kind of a profiling tool, because apart from information about how optimizer plans to execute the query (see above) there's also information from its actual execution. This information includes:
    • the time for returning the first row (in ms);
    • the time for returning all rows (in ms);
    • the number of rows read;
    • the number of loops.
    It makes it possible to compare how optimizer estimations compare to the real execution of the query. Let's try it with the previous query. I have intentionally shortened the output.

    +-------------------------------------------------------------------------------------------------------+
    | EXPLAIN                                                                                               |
    +-------------------------------------------------------------------------------------------------------+
    | -> Filter: (E.sal not between JS.sal_min and JS.sal_max)
           (cost=502072.82 rows=442901) (actual time=0.372..747.742 rows=915768 loops=1)
        -> Inner hash join (E.job = JS.job)
             (cost=502072.82 rows=442901) (actual time=0.355..575.011 rows=1000014 loops=1)
            -> Table scan on E  (cost=2534.62 rows=996514) (actual time=0.185..353.877 rows=1000014 loops=1)
            -> Hash
                -> Table scan on JS  (cost=0.75 rows=5) (actual time=0.133..0.144 rows=5 loops=1)
     |
    +-------------------------------------------------------------------------------------------------------+
    1 row in set (0.7754 sec)


    If it looks familiar to you check PostgreSQL's EXPLAIN ANALYZE. Please, note that times are milliseconds, not seconds! I think it's easy to note that number of rows estimated by the optimizer for the scan of table E differs, because table's statistics are not accurate. I have added 1M rows to table emp, so it's necessary to increase the number of sampling pages (see innodb_stats_persistent_sample_pages) and run ANALYZE TABLE again. After doing it the optimizer estimation and actual number of rows match. However, the optimizer is also wrong about the rows involved into the upper level operations - the hash join and filter, but this cannot be fixed with index statistics. According to Norvald H. Ryeng (see his article MySQL EXPLAIN ANALYZE) both "estimated and actual numbers" of rows "are averages over all loop iterations", but I have just one loop for all the operations in the plan.

    In any case EXPLAIN ANALYZE is a nice addition to the instrumentation of the optimizer. Unfortunately, neither TREE format nor EXPLAIN ANALYZE are available currently even in the latest MySQL Workbench, so another feature request from me as bug 97282.

    MySQL is OpenSSL only

    With this version the support for YaSSL and WolfSSL libraries are removed, so MySQL could be compiled only with OpenSSL. I have personally always built MySQL with OpenSSL (i.e. using option -DWITH_SSL=system), because this is what comes with Slackware, but of course the more important is that "the MySQL/OpenSSL combination is very well tested and production proven" as Georgi Kodinov explained in his post MySQL is OpenSSL-only now !.

    Spatial

    Function ST_Distance now accepts SRS arguments of all geometry types and not only arguments of types Point and Point, or Point and MultiPoint as before.
    Happy using MySQL!

    Update 2019-11-09: That timings are in milliseconds wasn't specified in the manual, so I filed bug 97492 as I also think that it would be good to print the unit next to timings.