Georgi Sotirov's Blog

My personal and professional life

2018-11-28

Execution plan change in MySQL 8.0

About 6 weeks ago at work I had to write a new report query, because an application page was loading slowly (about 25 seconds). It turned out there is a problem in the application, because it was first executing a relatively slow query (taking about 6 seconds) and then a relatively fast query for each returned row from the first query. That second query was taking between 0.3 and 0.5 seconds per execution, but it was executed about 60 times, so the delay accumulated and explained the slow page load.
I decided to provide a single query that produces the whole result set, so I had to join 6 tables. One table (msg_progs) had just 13 rows, another two had just 64 rows (bldenv and prjprocstat) and another one just about 100 rows (p) and these four had no tendency of growing much more. There was one table with several tens of thousands of rows (m) and another with several hundreds thousands of rows (mtp), which for sure were going to grow on a daily basis. To improve the selectivity in the second largest table (m) I created a multi-column index on three columns involved in the query, but the problem was that sill much of its data was selected. Anyway, I ended up with a query that was executing for 6-7 seconds, which was much more acceptable from the initial 25 seconds. I'm not able to provide all the details, but I have captured the execution plan and I consider it safe to publish.
Query explain plan in MySQL 5.7.20
Remark: MySQL Workbench nicely visualizes the execution plans graphically (notice the lines connecting the different operations - the higher the number of rows the thicker the line). Textual execution plans are thing of the past, right?

That was until 3 weeks ago when we upgraded to MySQL 8.0.13 (from 5.7.20). The same query and in the same database started running for 24 seconds. When I checked the execution plan it has changed (see below).
Original explain plan in MySQL 8.0
I think you could easily notice that the join order has changed. Previously the second largest table was considered in the first nested loop, then the largest table and finally the smaller tables. Now it's the opposite as the optimizer first considers the smaller tables and only then the largest ones. I could have used a hint to fix the join order (e.g. JOIN_ORDER), but I noticed that if I re-create the multi-column index on just two of the columns (that are more selective in fact) the optimizer chooses a different plan, but the query execution time is still about 14 seconds. I finally decided to remove one of the tables in the joins (the smallest msg_progs with just 13 rows), which made the optimizer choose a different execution plan (here below) and the query now runs for about 4 seconds.
Finaly explain plan in MySQL 8.0
I'm not sure where this difference come from apart from the server, but I cannot blame without a proof (e.g. a reference in the manual that explains the difference), which unfortunately I haven't had the time to search. If you know the reason for the change of execution plans in MySQL 8.0 please, let me know.
The situation reminded me of Oracle database where after each major release we had queries with modified execution plans and had to use query hints or other tricks to make them fast again. It appears to be no different with MySQL nowadays.

2018-11-27

Game of authorities

In the management theory and practice there are cases where the use of authority is necessary (e.g. for enforcing deadlines, for resolving conflicts, etc.). However, while I understand the need, I really do not understand the managers that have to use authority when left with no other argument. I had situations in the past when authority was necessary, but I was always reluctant and I have never used it without a strong argument (which has been kindly acknowledged). What I mean is that I cannot simply enforce something onto someone if I do not have a good reason for this and if I have not justified it in front of the involved parties.
The point is that sometimes the arguments of the parties in a discussion may seem equally important and valid, so arbitration and use of authority may be inevitable. One way to solve this is to base on previous facts like previous written discussions, agreements, procedures and practices about the same or similar matters. What if the other party simply doesn't accept the additional arguments? Well, I really do not know, but I think it speaks enough of the other party. In enterprise communities without good documentation management and constant change of authorities knowledge is easily lost over time. New rules and procedures are created every day, because nobody is aware or willing to accept what was previously defined. That is how legends are created.

2018-11-15

MySQL Community Data Masking and De-Identification

Oracle recently introduced a Data Masking functionality in MySQL, but it's available only in the Enterprise edition (see MySQL Enterprise Data Masking and De-Identification in the manual). I'm not a customer, but I'm acquainted with the concept as well as stored routines, so I decided to try emulating what MySQL has implemented for their Enterprise edition of the database.

I started by reading the manual, then reviewed formats of payment card and social security numbers to finally came up with a simple implementation this evening. I have implemented only the general and special purpose masking functions and the functions for generation of random data (e.g. e-mail, primary account number, social security number (SSN) and US phone number). I have an additional function gen_rnd_string for generation of a random string with given length (only alphabetic characters by default). I've made some test and the results look promising, but I have to experiment more and perhaps generate better data that could eventually pass verifications, because an application using randomly generated data could easily fail if it cannot validate the data. It remains me to implement the function for generation of random data using dictionaries for which I think I could use LOAD DATA INFILE and CREATE TEMPORARY TABLE for loading dictionary.

It's just my simple idea of how one could implement data masking and de-identification even in MySQL Community edition and as stored routines, because MySQL's implementation if based on user-defined functions (UDF). I wonder why they chose this approach and here is my take on the possible answers:
  • Visibility: A user-defined function has global visibility, so it could be called without proceeding schema name just like any other built-in function. My implementation has to be loaded into a schema and thus each function would have to be called with schema name in front.
  • Default arguments: MySQL still doesn't support default arguments for stored routines, but this is possible for user-defined functions as these could be written only "in C or C++ (or another language that can use C calling conventions)". The general purpose masking functions mask_inner and mask_outer accept an optional argument for the masking character. In a stored procedures implementation this could be only a user-defined variable. I'll implement this later.
  • Hiding implementation details: User-defined functions confine MySQL's implementation into the shared library binary. The source code of the stored routines could be easily reviewed by a user with enough permissions (e.g. an administrator). To my knowledge MySQL doesn't offer obfuscation utility like Oracle (see Obfuscating PL/SQL Source Code). A side point to this is that the manual also doesn't provide enough details on the possible values by the data generation functions.
  • Security: This could be related to how the server is handling the memory for UDF and stored routines calls, but that's too much internals.
  • Performance: User-defined functions by presumption should have better performance compared to stored routines, because the former are function calls in natively compiled shared library and the later are interpreted by the server on each invocation.
I'm eager to discuss this tomorrow on BGOUG Autumn Conference 2018 where there would be a presentation exactly on this topic. I'll update this post or write another one afterwards.

Update 2018-11-16 19:01 EET: The BGOUG conference confirmed my presumptions about the reasons to implement data masking as UDF functions. One of the reasons cited was exactly "performance".

Update 2018-11-18 19:23 EET: After playing a bit more and trying to implement the dictionary functions today, I hit two restrictions I haven't though about before. The first restriction is the impossibility to use prepared statements into a function (i.e. from chapter Prepared SQL Statement Syntax in the manual "SQL syntax for prepared statements can be used within stored procedures, but not in stored functions or triggers"). The second restriction is the impossibility to use LOAD DATA INFILE as prepared statement (see section SQL Syntax Allowed in Prepared Statements in the manual). The first restrictions means that dictionary functions could be implemented only as procedures. The second restrictions means that it's impossible to load dictionary from file, so implementation of function gen_dictionary_load cannot be completed - it could create the temporary table for the dictionary, but it cannot load it. I still created the procedures and was able go generate random US cities. I also benchmarked the performance with gen_rnd_pan function as it does both random number generation and Luhn number calculation. On my MySQL 5.7 server running on Intel Pentium G3420 @ 3.2 GHz with 8 GB DDR3 RAM I ran SELECT BENCHMARK(count, gen_rnd_pan()) and the results for different execution counts were the following:
  • 1000 times - 0.28 seconds;
  • 10000 times - 3.13 seconds;
  • 100000 times - 31.41 seconds;
  • 1000000 times - 278.19 seconds.
Unless MySQL removes the restrictions for me the topic is concluded. I could try to implement the data masking functions as UDF like MySQL if I have the time and if I have the real need.

2018-10-23

New development features in MySQL 8.0.13

Ah, what a week already and it's not over! First, Oracle released Oracle XE 18c (as I wrote on Sunday) and yesterday (October 22-nd) MySQL 8.0.13 arrived along with (as already usual) the whole feet (Router, Shell, Workbench and C++/J/ODBC/NET/Python/Node.js Connectors). I was eager to read the change log and as the refereed article from MySQL's Server Blog also suggests there are some interesting new features for designers and developers. Here below are my picks and few words on upgrading from 8.0.12.

SQL Default Values now could be also function or expression

It's now possible to have default values as function or expression. This is an important new feature. Before only literal values were allowed, which was quite limiting. The removal of this limitation means that it is now possible:
  • to generate UUID values by default (by using DEFAULT (uuid_to_bin(uuid())) for example);
  • to generate default values for geometry data types (by using DEFAULT (POINT(0,0)) or DEFAULT (ST_PointFromText('POINT(42.69751 23.32415)', 4326)) for example);
  • to generate default values for JSON columns (by using DEFAULT (JSON_ARRAY()) or DEFAULT ('[]') for example); and
  • other complex values (like DEFAULT (CURRENT_DATE + INTERVAL 1 DAY) or DEFAULT (PI() * POW(r, 2)) for example) using function or functions in an expression.
Remark: Please, note the parentheses surrounding the expression in DEFAULT (see Handling of Explicit Defaults as of MySQL 8.0.13 in the manual). MySQL Workbench 8.0.13 again has some problems parsing the new syntax (see screenshot below), so I opened bug 92900 and it already got verified:

MySQL Workbench 8.0.13 erorr on default expression

I tried the new functionality with the following:

CREATE TABLE def_expr (
  id        INT           NOT NULL AUTO_INCREMENT,
  uuid_def  BINARY(16)    DEFAULT (uuid_to_bin(uuid())),
  geo_def   POINT         DEFAULT (Point(0,0)),
  geo_def2  GEOMETRY      DEFAULT (ST_PointFromText('POINT(42.69751 23.32415)', 4326)),
  json_def  JSON          DEFAULT (JSON_ARRAY()),
  json_def2 JSON          DEFAULT ('[]') /* this works too */,
  tomorrow  DATE          DEFAULT (CURDATE() + INTERVAL 1 DAY),
  radius    INT           DEFAULT (FLOOR(1 + (RAND() * 10))),
  area      DECIMAL(10,3) DEFAULT (ROUND(PI() * POW(radius, 2), 3)),

  PRIMARY KEY (id),
  UNIQUE INDEX id_UNIQUE (id ASC) VISIBLE
);



Then inserted two dummy lines and got the following results:

INSERT INTO def_expr VALUES (); /* x 2 */

SELECT id,
       bin_to_uuid(uuid_def) uuid_def,
       ST_AsText(geo_def)    geo_def,
       ST_AsText(geo_def2)   geo_def2,
       json_def, json_def2,
       tomorrow, radius, area
  FROM def_expr;


+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
| id | uuid_def                             | geo_def    | geo_def2                 | json_def | json_def2 | tomorrow   | radius | area    |
+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
|  1 | a2747ee0-d6ee-11e8-b02e-02004c4f4f50 | POINT(0 0) | POINT(42.69751 23.32415) | []       | []        | 2018-10-24 |      2 |  12.566 |
|  2 | a2ff920b-d6ee-11e8-b02e-02004c4f4f50 | POINT(0 0) | POINT(42.69751 23.32415) | []       | []        | 2018-10-24 |     10 | 314.159 |
+----+--------------------------------------+------------+--------------------------+----------+-----------+------------+--------+---------+
2 rows in set (0.0021 sec)


That's cool, right? I'm eager to use it in my projects.

SQL Functional Indexes


It's now possible to have functional indexes (see Functional Key Parts in the manual), which means that the index is build from expression values, instead of column values or column prefix values. In other words you could index values not stored in the table, which I think could be quite powerful in some cases. In composite indexes with multiple key parts you could have mix of functional and nonfunctional (regular) key parts. Functional key parts are implemented as hidden virtual generated columns (see CREATE TABLE and Generated Columns in the manual) available since MySQL 5.7.6 (from 2015-03-09). This also means that functional key parts have the same restrictions (i.e. deterministic built-in functions and operators are permitted, but subqueries, parameters, variables, stored and user-defined functions are not permitted).
To try the new functionality I though that IP addresses are usually stored as string in databases and then parsed, but the address could be represented as an (unsigned) integer value, which requires just 4 bytes. So I used the following:

CREATE TABLE func_index (
  id        INT        NOT NULL AUTO_INCREMENT,
  ipaddr4   INT UNSIGNED /* 4 bytes */,

  PRIMARY KEY (id),
  UNIQUE INDEX id_UNIQUE (id ASC) VISIBLE,
  INDEX func_idx ((INET_NTOA(ipaddr4)))
);

Remark: Please, note the parentheses surrounding the expression. Without them there would be цоде 1064 (syntax error). Not surprisingly MySQL Workbench 8.0.13 wasn't ready for the new syntax, so I opened bug 92908 and it also already got verified.

MySQL Workbench 8.0.13 erorr on functional index

Then inserted a line and explained two SELECT queries below:

INSERT INTO func_index (ipaddr4) VALUES (INET_ATON('192.168.1.1'));

SELECT * FROM func_index WHERE ipaddr4 = INET_ATON('192.168.1.1');
SELECT * FROM func_index WHERE INET_NTOA(ipaddr4) = '192.168.1.1';

The first one would result in full table scan, while the second one would read the functional index func_idx. In order for the query to use the index it's important that the expression in the WHERE clause matches index's specification.

Option to Disallow Tables without a Primary Key

There is now the system variable sql_require_primary_key, which prevents the creation or altering of tables without primary key if set. In such case CREATE or ALTER queries would now fail with error code 3750 (see also SQL state ER_TABLE_WITHOUT_PK).
I tried this new possibility by setting the global variable in a session like this:

SET GLOBAL sql_require_primary_key = ON;

but I was able to create table without primary key. I have to check it again tomorrow and eventually report a bug. Setting the variable in my.ini and restarting the server worked just fine:

SQL> CREATE TABLE tab_no_pk (test INT);
ERROR: 3750: Unable to create a table without PK, when system variable 'sql_require_primary_key' is set. Add a PK to the table or unset this variable to avoid this message. Note that tables without PK can cause performance problems in row-based replication, so please consult your DBA before changing this setting.

I'll finally have a break from developers not familiar with relational concepts :-)

Geometry transformation

It's now possible to transform geometry from one spatial reference system (SRS) to another using built-in function ST_Transform, which accepts as second argument the target SRS. I'm not that much into GIS, so I didn't tested this one, but it seemed worth mentioning.

A few words on upgrading (from 8.0.12)

Upgrading my Windows installation was flawless with MySQL Installer for Windows. However, I had problem with my Linux installation, which consist of more databases. The server wasn't able to start and in the error log I found errors like the following:

2018-10-23T12:47:49.627732Z 1 [ERROR] [MY-013235] [Server] Error in parsing View 'db'.'aView' during upgrade. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc' at line 1

I opened bug 92898 for this and the problem already got explained by support with an incompatible change (see SQL Syntax Notes from the release notes). I'm pretty sure I've read about ASC or DESC qualifiers for GROUP BY clauses being deprecated and I really do not remember why I used such syntax, but it got me off guard, so I had to update view definitions.

There's a lot more in MySQL 8.0.13 and I strongly suggest you review The MySQL 8.0.13 Maintenance Release is Generally Available article and release notes, then explore for yourself.

2018-10-21

First impressions from Oracle XE 18c trying to install it on Fedora 28/29 and Oracle Linux 7

Oracle released Oracle Database Express Edition (or better known as XE) on October 19, 2018. This date also happens to be my birthday, so what a birthday surprise! Let's get the present open and try it!
It was a long wait for Oracle XE 18c since the previous version (based on 11g Release 2) appeared more than 7 years ago (on September 24, 2011) and a release based on Oracle 12c never appeared. I was running XE on Fedora since 11g appeared in 2011, but with the new Fedora releases this become impossible. The database was behaving strange and was not usable after reboot, but I never found the time to investigate deeper these issues. I was thus considering trying Oracle Linux instead (see Which OS is “the best” for Oracle? for an expert opinion).

Fedora 28/29

Since I do not have enough space on the virtual machine I downloaded the RPMs on a server and tried to install them from an NFS mount following the instructions in the Quick Start:

[root@fedora ~]# yum -y localinstall /mnt/vm/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm
Last metadata expiration check: 2:01:29 ago on Sun 21 Oct 2018 11:07:06 AM EEST.
Error:
 Problem: conflicting requests
  - nothing provides compat-libcap1 needed by oracle-database-preinstall-18c-1.0-1.el7.x86_64



Hmm... I'm just missing a required package compat-libcap1, but why it's not installed automatically?

[root@fedora ~]# yum install compat-libcap1
Last metadata expiration check: 1:24:06 ago on Sun 21 Oct 2018 11:07:06 AM EEST.
No match for argument: compat-libcap1
Error: Unable to find a match


That's strange, but Fedora doesn't provide this package. I found it in CentOS repository (direct download link) and after installing compat-libcap1 the preinstall RPM installed successfully. However, the database installation failed:

[root@fedora ~]# yum -y localinstall /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm
Last metadata expiration check: 2:16:29 ago on Sun 21 Oct 2018 11:07:06 AM EEST.
Dependencies resolved.
==========================================================
 Package                Arch   Version Repository    Size
==========================================================
Installing:
 oracle-database-xe-18c x86_64 1.0-1   @commandline 2.4 G

Transaction Summary
==========================================================
Install  1 Package

Total size: 2.4 G
Installed size: 5.2 G
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Error: Transaction check error:
  package oracle-database-xe-18c-1.0-1.x86_64 does not verify: no digest
  installing package oracle-database-xe-18c-1.0-1.x86_64 needs 2472MB on the / filesystem


Error Summary
-------------
Disk Requirements:
   At least 2472MB more space needed on the / filesystem.


Apparently the disk space requirements have increased, because previously I was able to install XE 11gR2 on the same virtual machine. I had to enlarge the disk and extend the volume. However, the problem Error: Transaction check error: package oracle-database-xe-18c-1.0-1.x86_64 does not verify: no digest remains, so I wrote in the forum (see thread 14969247).

To overcome this I used the following command. But to make it work I had to install also libnsl (i.e. dnf install libnsl) and increase the disk space once more. Apparently, the installation of the RPM requires about 5.2 GB.

[root@fedora ~]# rpm -i --nodigest /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm
warning: /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID ec551f03: NOKEY
[INFO] Executing post installation scripts...
[INFO] Oracle home installed successfully and ready to be configured.
To configure Oracle Database XE, optionally modify the parameters in '/etc/sysconfig/oracle-xe-18c.conf' and then execute '/etc/init.d/oracle-xe-18c configure' as root.


Then, it's time to configure. For the creation of the database 4.5 GB more disk space is necessary.

[root@fedora ~]# /etc/init.d/oracle-xe-18c configure
Specify a password to be used for database accounts. Oracle recommends that the password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9]. Note that the same password will be used for SYS, SYSTEM and PDBADMIN accounts:
Confirm the password:
Configuring Oracle Listener.
Listener configuration succeeded.
Configuring Oracle Database XE.
[WARNING] [DBT-11209] Current available memory is less than the required available memory (796MB) for creating the database.
   CAUSE: Following nodes do not have required available memory :
 Node:fedora            Available memory:471.8672MB (483192.0KB)

Enter SYS user password:
*************
Enter SYSTEM user password:
*************
Enter PDBADMIN User Password:
**************
Prepare for db operation
7% complete
Copying database files
29% complete
Creating and starting Oracle instance
30% complete
31% complete
34% complete
38% complete
41% complete
43% complete
Completing Database Creation
47% complete
50% complete
Creating Pluggable Databases
54% complete
71% complete
Executing Post Configuration Actions
93% complete
Running Custom Scripts
100% complete
Database creation complete. For details check the logfiles at:
 /opt/oracle/cfgtoollogs/dbca/XE.
Database Information:
Global Database Name:XE
System Identifier(SID):XE
Look at the log file "/opt/oracle/cfgtoollogs/dbca/XE/XE.log" for further details.

Connect to Oracle Database using one of the connect strings:
     Pluggable database: fedora:1539/XEPDB1
     Multitenant container database: fedora:1539
Use https://localhost:5500/em to access Oracle Enterprise Manager for Oracle Database XE


Note: If you receive  Listener configuration failed. Check log '/opt/oracle/cfgtoollogs/netca/netca_configure_out.log' and into the log there is the line No valid IP Address returned for the host fedora simply set the fully qualified domain name (FQDN) into /etc/hostname and start configuration again.

The same procedure applies also to Fedora 29.

Oracle Linux Server 7.5

The installation of preinstall RPM happens smoothly:


[root@oracle ~]# yum -y localinstall /mnt/vm/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm
Loaded plugins: ulninfo
Examining /mnt/vm/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm: oracle-database-preinstall-18c-1.0-1.el7.x86_64
Marking /mnt/vm/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package oracle-database-preinstall-18c.x86_64 0:1.0-1.el6 will be installed
...
--> Processing Dependency: compat-libcap1 for package: oracle-database-preinstall-18c-1.0-1.el6.x86_64
...
Dependencies Resolved

====================================================================
 Package                         Arch   Version     Repository Size
====================================================================
Installing:                                                  
 oracle-database-preinstall-18c  x86_64 1.0-1.el7   /oracle... 55 k
Installing for dependencies:                                 
...                                                          
 compat-libcap1                  x86_64 1.10-7.el7  ol7_latest 17 k
...
Install  1 Package (+37 Dependent packages)
...
Installed:
  oracle-database-preinstall-18c.x86_64 0:1.0-1.el7

Dependency Installed:
compat-libcap1.x86_64 0:1.10-7.el7
...
Complete!


The dependency on compat-libcap1 was easily resolved as well as 36 other dependencies, because my installation of Oracle Linux is new and I do not have many packages installed yet. The installation of the RPM with the database also happened without problems:

[root@oracle ~]# yum -y localinstall /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm
Loaded plugins: ulninfo
Examining /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm: oracle-database-xe-18c-1.0-1.x86_64
Marking /mnt/vm/oracle-database-xe-18c-1.0-1.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package oracle-database-xe-18c.x86_64 0:1.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

========================================================
 Package                 Arch   Version Repository Size
========================================================
Installing:
 oracle-database-xe-18c  x86_64 1.0-1   /oracle... 5.2 G

Transaction Summary
========================================================
Install  1 Package

Total size: 5.2 G
Installed size: 5.2 G
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : oracle-database-xe-18c-1.0-1.x86_64  1/1
[INFO] Executing post installation scripts...
[INFO] Oracle home installed successfully and ready to be configured.
To configure Oracle Database XE, optionally modify the parameters in '/etc/sysconfig/oracle-xe-18c.conf' and then execute '/etc/init.d/oracle-xe-18c configure' as root.
  Verifying  : oracle-database-xe-18c-1.0-1.x86_64  1/1

Installed:
  oracle-database-xe-18c.x86_64 0:1.0-1

Complete!


And apparently about 5.2 GB is the space really needed for the software itself, because before the installation I had about this space more on the volume. Let's try to configure the database then:

[root@oracle ~]# /etc/init.d/oracle-xe-18c configure
The location '/opt/oracle' specified for database files has insufficient space.
Database creation needs at least '4.5GB' disk space.
Specify a different database file destination that has enough space in the configuration file '/etc/sysconfig/oracle-xe-18c.conf'.


Aha... so the disk space requirements are actually pretty much the same as for Standard Edition on Linux. After resizing the virtual disk and enlarging the volume the configuration passes successfully for about 10 minutes:

[root@oracle ~]# /etc/init.d/oracle-xe-18c configure
Specify a password to be used for database accounts. Oracle recommends that the password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9]. Note that the same password will be used for SYS, SYSTEM and PDBADMIN accounts:
Confirm the password:
Configuring Oracle Listener.
Listener configuration succeeded.
Configuring Oracle Database XE.
Enter SYS user password:
*************
Enter SYSTEM user password:
***************
Enter PDBADMIN User Password:
***********
Prepare for db operation
7% complete
Copying database files
29% complete
Creating and starting Oracle instance
30% complete
31% complete
34% complete
38% complete
41% complete
43% complete
Completing Database Creation
47% complete
50% complete
Creating Pluggable Databases
54% complete
71% complete
Executing Post Configuration Actions
93% complete
Running Custom Scripts
100% complete
Database creation complete. For details check the logfiles at:
 /opt/oracle/cfgtoollogs/dbca/XE.
Database Information:
Global Database Name:XE
System Identifier(SID):XE
Look at the log file "/opt/oracle/cfgtoollogs/dbca/XE/XE.log" for further details.

Connect to Oracle Database using one of the connect strings:
     Pluggable database: oracle.sotirov-bg.net/XEPDB1
     Multitenant container database: oracle.sotirov-bg.net
Use https://localhost:5500/em to access Oracle Enterprise Manager for Oracle Database XE


Even after successful installation and configuration there are few manual steps necessary:
  •  Set environment variables. For this I just added the following to /etc/profile:
export ORACLE_HOME=/opt/oracle/product/18c/dbhomeXE
export ORACLE_SID=XE
pathmunge $ORACLE_HOME/bin after

  • Enable database start at system boot. For this I used the default script at /etc/init.d/oracle-xe-18c and executed one of the following:
chkconfig --add oracle-xe-18c; chkconfig oracle-xe-18c on
systemctl enable oracle-xe-18c
  • Enable access over the network. For this it's necessary to allow the default listener port 1521 on the firewall:
firewall-cmd --permanent --zone=public --add-port=1521/tcp
systemctl reload firewalld

Time to connect to the database:

[root@oracle ~]# sqlplus sys/password@//localhost:1521/XE as sysdba

SQL*Plus: Release 18.0.0.0.0 - Production on Sun Oct 21 19:16:35 2018
Version 18.4.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

Connected to:
Oracle Database 18c Express Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL>


Hurray! Now that the database is installed and accessible it's time to load some data and start experimenting.

2018-08-22

LIbreOffice 6.1 and named parameters substitution

I've recently upgraded to LibreOffice 6.1 eager to try to new features and improvements. However, I soon noticed that all my sub forms in Base are broken. Those were sub forms executing SQL statements like:

SELECT id, name FROM persons

So I could get additional information from related tables by mapping master to slave fields. Apparently in earlier versions Base was rewriting the query, but this is no more the case. You'll get errors like:
  • The data content could not be loaded.
  • SQL Status: 07009 Invalid descriptor index.
  • You tried to set a parameter at position '1' but there is/are only '0' parameter(s) allowed. One reason may be that the property "ParameterNameSubstitution" is not set to TRUE in the data source.
I quickly opened the release notes and found what I've omitted previously:

Base
  • The ODBC, JDBC and Firebird SDBC drivers/bridges previously applied named parameter substitution (for ODBC and JDBC: when enabled) to all SQL commands, including those that were tagged in the GUI as "execute SQL command directly", thereby not preserving the SQL command exactly as typed. Named parameter substitution now is applied only to SQL queries that are generated from a LibreOffice SQL parse tree, leaving "execute SQL command directly" commands completely untouched.
So the solution is to rewrite the sub form queries like this (which is the unnamed syntax used by MySQL):

SELECT name FROM persons WHERE id = ?

Or in other words write the named parameter explicitly in the query.

2018-07-11

Why loading data with MySQL Workbench is so slow?

I recently started a small personal project for statistics on 24 Hours of Le Mans endurance race of which I'm a fan. I started by gathering data and for loading them started using data import functionality in MySQL Workbench. The functionality is accessible from the context menu of a table - you click with the right button on a table and then on the item "Table Data Import Wizard". It's a pretty neat feature, because it allows you:
  • first, to select a local file;
  • then, to chose destination - either an existing table (and truncate it if necessary) or a new one;
  • then, to configure settings (e.g. CVS options like line and column separators and enclosure character), file encoding and column mappings;
  • finally, to start the import and watch for errors (a message is printed for each line).
However, with the growing number of lines in the CSV file, I noticed that the data import becomes slower and slower (I didn't pay much attention initially), so for about 4500 lines it now takes about 5 minutes (!) in a local network on a server that is not that much loaded with work. I cannot imagine if I have to load hundreds of thousands or even millions of lines like this.
I tried the same with LOAD DATA INFILE and the same data was loaded in about 1 second, even though I have some triggers on the table checking newly inserted data. The difference is quite obvious.
So why is loading data in MySQL Workbench so slow compared to LOAD DATA INFILE statement?
By checking the general log the difference becomes immediately apparent - MySQL Workbench generates an INSERT statement for each line of the input file an executes these statements consecutively one by one. That's quite expensive and time consuming especially on the network (see Optimizing INSERT Statements).
But why it is so? Two possible explanations:
  1. LOAD DATA INFILE statement requires the file to be available to the server (i.e. on a local file system accessible to the server). However, with LOCAL keyword the client could transfer the file to the server, so that the server could read it. The overhead in this case would be the transfer of the file over the network. This however requires that both server and client have local data loading capabilities enabled (for server see local_infile system variable and for client it's a compile time ENABLED_LOCAL_INFILE CMake option). So although I have this capability enabled on the server apparently it's disabled in the MySQL client used by Workbench, because trying to execute LOAD DATA INFILE with the LOCAL keyword gives Error Code: 1148. The used command is not allowed with this MySQL version (see more in Security Issues with LOAD DATA LOCAL).
  2. Using INSERT statement with multiple VALUES lists to insert many rows at a time would not provide Workbench with the possibility to give information for each processed line from the input file.
If I'm right for the above, then I would expect the data import log in MySQL Workbench to be more advanced. First of all, no line information is printed, so the messages "Data import" (successful?) and various possible errors are useless. I had to copy the log and index the errors in a text editor in order to identify the problem in the data file. Then, I'd like to have information only for errors (e.g. some kind of filtering). There is a progress bar, but without any numbers - I think a message like "Processing line X of NB_LNS_IN_FILE" would be quite useful, but this would require counting the lines in advance. There is also no option to save the import log and copying it requires scrolling (i.e. no Ctrol+A or Select everything) exists as an option.
In any case, I find data import functionality in MySQL Workbench to be quite useful, but be ware if you need to load large amounts of data (i.e. several thousand lines or more). In case you need to load such amounts of data it's highly preferable to transfer the data file locally and use LOAD DATA INFILE, which according to the manual "reads rows from a text file into a table at a very high speed".

2018-06-12

Is it time for Slackware 15?

Last Slackware release (14.2) appeared almost 2 years ago and upgraded many packages, but things are moving faster nowadays, so I just wonder isn't it time for a new release? I'm following -current, so I already see some notable changes and upgrades accumulating there, including:
  • Base system:
    • Linux Kernel 4.14 LTS;
    • GNU C Library 2.27;
  • Servers:
    • Postfix 3.3 (replacing Sendmail as MTA) and Dovecot 2.3 (replacing imapd as IMAP daemon);
    • Apache 2.4.33 (with HTTP/2 support);
    • ProFTPD 1.3.6 (with mod_sftp and mod_facl support);
    • Samba 4.6;
    • MariaDB 10.3.7 (very nice given what I recently wrote about new functionalities);
    • Bind 9.11, dhcp 4.3.4, ntp 4.2.8p11 (now running as ntp:ntp) and more;
  • Development:
    • GNU's Compiler Collection 7.3 and LLVM 6.0;
    • Perl 5.26, PHP 7.2, Python 3.6, Ruby 2.5 and Rust 1.26;
    • Git 2.17 and Mercurial 4.5;
  • GUI:
    • X.Org 1.20;
    • KDE 4.14.36 (I was hoping for Plasma 5.x though);
    • SeaMonkey 2.49, Firefox 59.0 and Thunderbird 52.8;
    • GTK+ applications such as Pidgin 2.13, Gimp 2.10, etc.
There are 66 new packages in total among them FFmpeg 3.4, new libraries like opus, lame, libbluray, speex, id3lib, libwebp and various Python modules (e.g. docutils, idna and six), which I already have on SlackPack. I noted also SDL2 library, which I do not have and which is nice, because some packages already need it.
There are many new packages I was waiting like Postfix, Dovecot, Python 3, MariaDB 10.3, GCC 7.3, which should make a great new release whichever and whenever it's going to be. The only disappointment for me is that KDE is still 4.14 and Qt 4.8, but some applications already need KDE 5 and Qt 5 (e.g. like KDESvn, cppcheck's GUI, etc.).
Package pkgtools is already 15.0 in -current, so is it time for Slackware 15?

2018-05-30

First attempts with in-place upgrade from MySQL 5.7 to 8.0

I have previously written, that in-place upgrade is not possible from MySQL 8.0.4 to 8.0.11, but it's officially supported between GA releases (but not through MySQL Installer - see #90619 and more likely #79315). So today I've made my first attempts to migrate my MySQL 5.7 server to MySQL 8.0 by copying data and configuration in a Slackware virtual machine. Let's start...
  • First attempt:
2018-05-30T16:42:28.485076Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.11) starting as process 18078
2018-05-30T16:42:28.727435Z 1 [ERROR] [MY-012526] [InnoDB] InnoDB: Upgrade after a crash is not supported. This redo log was created with MySQL 5.7.22. Please follow the instructions at http://dev.mysql.com/doc/refman/8.0/en/upgrading.html
2018-05-30T16:42:28.727509Z 1 [ERROR] [MY-012930] [InnoDB] InnoDB: Plugin initialization aborted with error Generic error.
2018-05-30T16:42:29.228125Z 1 [ERROR] [MY-011013] [Server] Failed to initialize DD Storage Engine.
2018-05-30T16:42:29.228520Z 0 [ERROR] [MY-010020] [Server] Data Dictionary initialization failed.
2018-05-30T16:42:29.228554Z 0 [ERROR] [MY-010119] [Server] Aborting
2018-05-30T16:42:29.235976Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.11)  Source distribution.


Well, I really took my chances by just copying the data directory of the running MySQL 5.7 server :-)

Solution: Make a clean copy of the data directory when the server is stopped.
  • Second attempt:
2018-05-30T16:53:06.475573Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.11) starting as process 18236
2018-05-30T16:53:08.332393Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001287 - 'validate password plugin' is deprecated and will be removed in a future release. Please use validate_password component instead
InnoDB MEMCACHED: Memcached uses atomic increment
 InnoDB_Memcached: Unable to open table 'innodb_memcache/containers'
 InnoDB_Memcached: Please create config table'containers' in database 'innodb_memcache' by running 'innodb_memcached_config.sql. error Table not found'
Failed to initialize instance. Error code: 13
2018-05-30T16:53:13.256478Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-05-30T16:53:13.257272Z 0 [ERROR] [MY-010262] [Server] Can't start server: Bind on TCP/IP port: Address already in use
2018-05-30T16:53:13.257283Z 0 [ERROR] [MY-010257] [Server] Do you already have another mysqld server running on port: 3306 ?

2018-05-30T16:53:13.257298Z 0 [ERROR] [MY-010119] [Server] Aborting
 InnoDB_Memcached: Memcached plugin is still initializing. Can't shut down it.
2018-05-30T16:53:29.161133Z 0 [Warning] [MY-011068] [Server] The syntax 'validate password plugin' is deprecated and will be removed in a future release. Please use validate_password component instead.
2018-05-30T16:53:30.304586Z 0 [System] [MY-010910] [Server] /usr/sbin/mysqld: Shutdown complete (mysqld 8.0.11)  Source distribution.


Sure, I have another MySQL 8.0 server listening on port 3306, so let's fix the configuration.

Solution: Change port in configuration file.
  • Third attempt:
2018-05-30T17:04:57.144549Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.11) starting as process 18571
2018-05-30T17:04:57.523116Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001287 - 'validate password plugin' is deprecated and will be removed in a future release. Please use validate_password component instead
2018-05-30T17:04:57.630947Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-05-30T17:04:57.633668Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001146 - Table 'mysql.component' doesn't exist
2018-05-30T17:04:57.633699Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-003543 - The mysql.component table is missing or has an incorrect definition.
2018-05-30T17:04:57.633776Z 0 [ERROR] [MY-011071] [Server] unknown option '--secure-auth'
2018-05-30T17:04:57.633787Z 0 [Warning] [MY-010952] [Server] The privilege system failed to initialize correctly. If you have upgraded your server, make sure you're executing mysql_upgrade to correct the issue.
2018-05-30T17:04:57.633840Z 0 [ERROR] [MY-010119] [Server] Aborting
2018-05-30T17:04:58.636827Z 0 [Warning] [MY-011068] [Server] The syntax 'validate password plugin' is deprecated and will be removed in a future release. Please use validate_password component instead.


Yeah, data dictionary is not yet upgraded, but for this I need a running server and there is also error about unknown configuration option.

Solution: Remove secure-auth option from configuration file as it was removed in 8.0.3 (see also its release notes).
  • Fourth attempt:
2018-05-30T17:06:21.153201Z 0 [ERROR] [MY-011071] [Server] unknown variable 'max_tmp_tables=64'

Solution: Same case as above. Remove max_tmp_tables option from configuration file.
  • Fifth attempt:
2018-05-30T17:10:36.750614Z 0 [ERROR] [MY-011071] [Server] unknown variable 'query_cache_type=1'

Yep, I forgot that query cache is removed too...

Solution: Remove query_cache_type and query_cache_size options from configuration file as Query Cache was removed in MySQL 8.
  • Sixth attempt:
2018-05-30T17:11:18.941821Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.11) starting as process 18803
2018-05-30T17:11:19.333818Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
2018-05-30T17:11:19.335972Z 0 [ERROR] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-001146 - Table 'mysql.component' doesn't exist
2018-05-30T17:11:19.336010Z 0 [Warning] [MY-013129] [Server] A message intended for a client cannot be sent there as no client-session is attached. Therefore, we're sending the information to the error-log instead: MY-003543 - The mysql.component table is missing or has an incorrect definition.
2018-05-30T17:11:19.348988Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.user].
2018-05-30T17:11:19.349014Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.db].
2018-05-30T17:11:19.349020Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.tables_priv].
2018-05-30T17:11:19.349026Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.columns_priv].
2018-05-30T17:11:19.349031Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.procs_priv].
2018-05-30T17:11:19.349035Z 0 [Warning] [MY-010929] [Server] Storage engine 'MyISAM' does not support system tables. [mysql.proxies_priv].

2018-05-30T17:11:19.352944Z 0 [ERROR] [MY-013143] [Server] Column count of mysql.user is wrong. Expected 49, found 45. The table is probably corrupted
2018-05-30T17:11:19.352980Z 0 [Warning] [MY-010966] [Server] ACL table mysql.role_edges missing. Some operations may fail.
2018-05-30T17:11:19.352987Z 0 [Warning] [MY-010966] [Server] ACL table mysql.default_roles missing. Some operations may fail.
2018-05-30T17:11:19.352992Z 0 [Warning] [MY-010966] [Server] ACL table mysql.global_grants missing. Some operations may fail.
2018-05-30T17:11:19.352997Z 0 [Warning] [MY-010966] [Server] ACL table mysql.password_history missing. Some operations may fail.

2018-05-30T17:11:19.353420Z 0 [ERROR] [MY-010965] [Server] Missing system table mysql.global_grants; please run mysql_upgrade to create it.
2018-05-30T17:11:19.365069Z 0 [Warning] [MY-010727] [Server] System table 'func' is expected to be transactional.
2018-05-30T17:11:19.366847Z 0 [Warning] [MY-010405] [Repl] Info table is not ready to be used. Table 'mysql.slave_master_info' cannot be opened.

2018-05-30T17:11:19.366874Z 0 [ERROR] [MY-010422] [Repl] Error in checking mysql.slave_master_info repository info type of TABLE.
2018-05-30T17:11:19.368238Z 0 [ERROR] [MY-010415] [Repl] Error creating master info: Error checking repositories.
2018-05-30T17:11:19.368273Z 0 [ERROR] [MY-010426] [Repl] Slave: Failed to initialize the master info structure for channel ''; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.

2018-05-30T17:11:19.368280Z 0 [ERROR] [MY-010529] [Repl] Failed to create or recover replication info repositories.
2018-05-30T17:11:19.373821Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.11'  socket: '/var/lib/mysql/mysqld57.sock'  port: 3316  Source distribution.


Success! The server is already running, but apparently needs upgrade as system tables are either missing or still in MyISAM. However, should not this be done automatically? At least I'm getting this impression when reading Step 8 for In-Place Upgrade in the manual. Anyway, step 9 is to run mysql_upgrade, so after running the command the errors in the log are gone.

Of course, I'm was playing here, but at least I now have a configuration file ready for 8.0 and have validated the upgrading procedure for my use case. However, I have to check also my applications compatibility, so the real upgrade would happen later.

P.S. A nice way to check for configuration problems is described in How to check MySQL Config files by Peter Zaitsev.

2018-05-28

MariaDB 10.3 adds even more development features than MySQL 8.0

Last Friday (May 25-th) MariaDB 10.3.7 was released as stable (General Availability). I was finally been able to check the release notes and (not surprisingly) I found some features that we're still missing in MySQL (including the released a month ago 8.0.11 GA for which I wrote). I'd like to highlight the following notable changes related to development (of which I'm interested the most):
Wow. They've been working hard, that's for sure. I'll have yet to test all these new features myself, but I'm quite excited to see them available in "one of the most popular database servers". Unfortunately, neither one of these is available in MySQL. Here below is my list of what's still missing in MySQL (and MariaDB):

Features MariaDB MySQL Standard
SQL Modify table data based on subquery on the same table 10.3.2 (2017-11-09) n/a
EXCEPT or MINUS and INTERSECT set operators 10.3.0 (2017-04-16) n/a SQL:2003
Temporal queries (AS OF) 10.3.4 (2018-01-18) n/a SQL:2011
DDL CHECK constraints 10.2.1 (2016-07-04) n/a SQL:1999
Partitioning of tables with foreign keys n/a n/a
Temporal tables (WITH SYSTEM VERSIONING) 10.3.4 (2018-01-18) n/a SQL:2011
CREATE OR REPLACE for tables, triggers and stored routines 10.0.8 (2014-02-10), 10.1.4 (2015-04-13) and 10.1.3 (2015-03-02) n/a
Enable/disable for triggers n/a n/a
Sequences 10.3.0 (2017-04-16) n/a SQL:2003
A working schema rename (i.e. RENAME {DATABASE | SCHEMA}) statement n/a n/a
Views Saving of the exact view statement in the server n/a n/a n/a
Fully updatable views with INSTEAD OF triggers n/a n/a SQL:2008
Materialized views n/a n/a
Stored routines Default arguments n/a n/a
FOR loop 10.3.3 (2017-12-23) n/a
Anonymous code blocks (or compound statements outside routines) 10.1.1 (2014-10-17) n/a
Records, column and row anchored types 10.3.0 (2017-04-16) n/a
Stored modules or packages 10.3.5 (2018-02-26) n/a
Statement and schema level triggers n/a n/a
Debugging facilities n/a n/a n/a
External routines in JavaScript, Perl, PHP, etc. n/a n/a

It's quite apparent from the above table that MySQL has more gray spots than MariaDB. I've added also the related SQL standard where applicable, so in this regards MariaDB has broader standards support. It seems to me than in its latest stable version MariaDB has implemented more "Oracle like" features, than MySQL which is developed by Oracle.
So is it worth it changing the database? I believe I'll continue using both MySQL and MariaDB (which is default database in most Linux distributions) and the reasons for this are that these are becoming more and more different with the time. There are development features in MySQL (e.g. JSON data type, the Document store), which do not exist in MariaDB and vice versa. I'm still quite used using MySQL Workbench for daily work with MySQL databases, but the support for MariaDB in it is fading and unless MySQL adds the same SQL features, I doubt the new development functionalities would be usable from the Workbench, which impacts my work.