Vitalii Diravka (JIRA)
2018-12-10 17:25:01 UTC
Vitalii Diravka created DRILL-6893:
--------------------------------------
Summary: Invalid output for star and semi-join queries for RDBS Storage Plugin
Key: DRILL-6893
URL: https://issues.apache.org/jira/browse/DRILL-6893
Project: Apache Drill
Issue Type: Bug
Components: Storage - JDBC
Affects Versions: 1.14.0
Environment: mysql-5.7.23-0ubuntu0.18.04.1
mysql-connector-java-5.1.39-bin.jar
Reporter: Vitalii Diravka
Fix For: Future
Invalid output for star and semi-join queries for RDBS Storage Plugin:
{code}
0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ;
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| PersonID | OrderID | ItemID | GroupID | PersonID0 | OrderID0 | ItemID0 | GroupID0 |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| 10 | 10 | 10 | 10 | null | null | null | null |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
1 row selected (1.402 seconds)
0: jdbc:drill:zk=local> select * from sys.version;
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
| version | commit_id | commit_message | commit_time | build_email | build_time |
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
| 1.15.0-SNAPSHOT | 100a68b314230d4cf327477f7a10f9c650720513 | DRILL-540: Allow querying hive views in Drill | 30.11.2018 @ 10:50:46 EET | ***@gmail.com | 10.12.2018 @ 15:46:54 EET |
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
1 row selected (0.302 seconds)
{code}
{code}
0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID`;
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| PersonID | OrderID | ItemID | GroupID | PersonID0 | OrderID0 | ItemID0 | GroupID0 |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| 10 | 10 | 10 | 10 | null | null | null | null |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
1 row selected (1.344 seconds)
0: jdbc:drill:zk=local> select * from sys.version;
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
| version | commit_id | commit_message | commit_time | build_email | build_time |
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
| 1.11.0 | 4220fb2fffbc81883df3e5fea575fa0a584852b3 | [maven-release-plugin] prepare release drill-1.11.0 | 24.07.2017 @ 16:47:07 EEST | ***@gmail.com | 06.12.2018 @ 14:36:39 EET |
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
1 row selected (0.271 seconds)
{code}
But the same query in MySQL:
{code}
mysql> select * from `mscIdentities3` t1 join `mscIdentities3` t2 on t1.`PersonId` = t2.`PersonID` where t1.`PersonID` = 10;
+----------+---------+--------+---------+----------+---------+--------+---------+
| PersonID | OrderID | ItemID | GroupID | PersonID | OrderID | ItemID | GroupID |
+----------+---------+--------+---------+----------+---------+--------+---------+
| 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
+----------+---------+--------+---------+----------+---------+--------+---------+
1 row in set (1.27 sec)
{code}
Drill physical plan for above query is:
{code}
00-00 Screen : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {31.5 rows, 136.5 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 735
00-01 Project(PersonID=[$0], OrderID=[$1], ItemID=[$2], GroupID=[$3], PersonID0=[$4], OrderID0=[$5], ItemID0=[$6], GroupID0=[$7]) : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {30.0 rows, 135.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 734
00-02 Jdbc(sql=[SELECT * FROM (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ]) : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {15.0 rows, 15.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 695
{code}
It is clear from above plan that Drill submits modified query to MySQL DB. But even that query works fine in MySQL:
{code}
mysql> SELECT * FROM (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ;
+----------+---------+--------+---------+----------+---------+--------+---------+
| PersonID | OrderID | ItemID | GroupID | PersonID | OrderID | ItemID | GroupID |
+----------+---------+--------+---------+----------+---------+--------+---------+
| 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
+----------+---------+--------+---------+----------+---------+--------+---------+
1 row in set (1.26 sec)
{code}
Most likely Drill interprets the obtained Result Set incorrectly in the Project operator.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
--------------------------------------
Summary: Invalid output for star and semi-join queries for RDBS Storage Plugin
Key: DRILL-6893
URL: https://issues.apache.org/jira/browse/DRILL-6893
Project: Apache Drill
Issue Type: Bug
Components: Storage - JDBC
Affects Versions: 1.14.0
Environment: mysql-5.7.23-0ubuntu0.18.04.1
mysql-connector-java-5.1.39-bin.jar
Reporter: Vitalii Diravka
Fix For: Future
Invalid output for star and semi-join queries for RDBS Storage Plugin:
{code}
0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ;
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| PersonID | OrderID | ItemID | GroupID | PersonID0 | OrderID0 | ItemID0 | GroupID0 |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| 10 | 10 | 10 | 10 | null | null | null | null |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
1 row selected (1.402 seconds)
0: jdbc:drill:zk=local> select * from sys.version;
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
| version | commit_id | commit_message | commit_time | build_email | build_time |
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
| 1.15.0-SNAPSHOT | 100a68b314230d4cf327477f7a10f9c650720513 | DRILL-540: Allow querying hive views in Drill | 30.11.2018 @ 10:50:46 EET | ***@gmail.com | 10.12.2018 @ 15:46:54 EET |
+------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
1 row selected (0.302 seconds)
{code}
{code}
0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID`;
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| PersonID | OrderID | ItemID | GroupID | PersonID0 | OrderID0 | ItemID0 | GroupID0 |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
| 10 | 10 | 10 | 10 | null | null | null | null |
+-----------+----------+---------+----------+------------+-----------+----------+-----------+
1 row selected (1.344 seconds)
0: jdbc:drill:zk=local> select * from sys.version;
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
| version | commit_id | commit_message | commit_time | build_email | build_time |
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
| 1.11.0 | 4220fb2fffbc81883df3e5fea575fa0a584852b3 | [maven-release-plugin] prepare release drill-1.11.0 | 24.07.2017 @ 16:47:07 EEST | ***@gmail.com | 06.12.2018 @ 14:36:39 EET |
+----------+-------------------------------------------+------------------------------------------------------+-----------------------------+----------------------------+----------------------------+
1 row selected (0.271 seconds)
{code}
But the same query in MySQL:
{code}
mysql> select * from `mscIdentities3` t1 join `mscIdentities3` t2 on t1.`PersonId` = t2.`PersonID` where t1.`PersonID` = 10;
+----------+---------+--------+---------+----------+---------+--------+---------+
| PersonID | OrderID | ItemID | GroupID | PersonID | OrderID | ItemID | GroupID |
+----------+---------+--------+---------+----------+---------+--------+---------+
| 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
+----------+---------+--------+---------+----------+---------+--------+---------+
1 row in set (1.27 sec)
{code}
Drill physical plan for above query is:
{code}
00-00 Screen : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {31.5 rows, 136.5 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 735
00-01 Project(PersonID=[$0], OrderID=[$1], ItemID=[$2], GroupID=[$3], PersonID0=[$4], OrderID0=[$5], ItemID0=[$6], GroupID0=[$7]) : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {30.0 rows, 135.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 734
00-02 Jdbc(sql=[SELECT * FROM (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ]) : rowType = RecordType(INTEGER PersonID, INTEGER OrderID, INTEGER ItemID, INTEGER GroupID, INTEGER PersonID0, INTEGER OrderID0, INTEGER ItemID0, INTEGER GroupID0): rowcount = 15.0, cumulative cost = {15.0 rows, 15.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 695
{code}
It is clear from above plan that Drill submits modified query to MySQL DB. But even that query works fine in MySQL:
{code}
mysql> SELECT * FROM (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN (SELECT * FROM `testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` ON `t`.`PersonID` = `t0`.`PersonID` ;
+----------+---------+--------+---------+----------+---------+--------+---------+
| PersonID | OrderID | ItemID | GroupID | PersonID | OrderID | ItemID | GroupID |
+----------+---------+--------+---------+----------+---------+--------+---------+
| 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
+----------+---------+--------+---------+----------+---------+--------+---------+
1 row in set (1.26 sec)
{code}
Most likely Drill interprets the obtained Result Set incorrectly in the Project operator.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)