Big Data Homework: Spark SQL
1. Add a custom command to Spark SQL
• SHOW VERSION; • should display the current Spark version and Java version
Answer
Clone the Spark source tree locally, then open
sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4
and add the new statement and its keyword to the grammar.
Then regenerate the parser with the Maven ANTLR plugin (run the antlr4 goal from the IDE's Maven panel).
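A minimal sketch of the grammar change, assuming the new statement is spelled SHOW SPARK_VERSION (the keyword is whatever you choose to register; the session log at the end of this section uses MYVERSION). Add an alternative to the statement rule:
| SHOW SPARK_VERSION                                                #showSparkVersion
define the token:
SPARK_VERSION: 'SPARK_VERSION';
and list SPARK_VERSION in the ansiNonReserved and nonReserved keyword lists. The #showSparkVersion label is what makes ANTLR generate ShowSparkVersionContext and the visitShowSparkVersion hook used in the next step.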
In
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala
add the following visitor to SparkSqlAstBuilder:
override def visitShowSparkVersion(ctx: ShowSparkVersionContext): LogicalPlan = withOrigin(ctx) {
ShowSparkVersionCommand()
}
Create a new file
sql/core/src/main/scala/org/apache/spark/sql/execution/command/ShowSparkVersionCommand.scala
with the following content:
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.spark.sql.execution.command
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
import org.apache.spark.sql.types.StringType
case class ShowSparkVersionCommand() extends LeafRunnableCommand {

  // Schema of the command's result: a single string column.
  override val output: Seq[Attribute] =
    Seq(AttributeReference("spark_version", StringType, nullable = true)())

  // Executed when the command runs: read the version string from the
  // SPARK_VERSION environment variable and return it as a single row.
  override def run(sparkSession: SparkSession): Seq[Row] = {
    val outputString = System.getenv("SPARK_VERSION")
    Seq(Row(outputString))
  }
}
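The assignment also asks for the Java version. A variant of the command (a sketch, not the code that produced the session below) could drop the environment variable and read both versions from the running process, using the same imports as above:
  // Two output columns instead of one.
  override val output: Seq[Attribute] = Seq(
    AttributeReference("spark_version", StringType, nullable = false)(),
    AttributeReference("java_version", StringType, nullable = false)())

  // org.apache.spark.SPARK_VERSION is the version string Spark was built with;
  // java.version is a standard JVM system property.
  override def run(sparkSession: SparkSession): Seq[Row] =
    Seq(Row(org.apache.spark.SPARK_VERSION, System.getProperty("java.version")))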
In the Spark source folder, build the packages:
build/sbt package -Phive -Phive-thriftserver
After the build succeeds, launch the SQL CLI with the SPARK_VERSION environment variable set (quote the value, since it contains parentheses and a space):
SPARK_VERSION='3.1.2(custom cmd)' bin/spark-sql
Then query:
show spark_version;
which outputs 3.1.2(custom cmd). (In the session log below the command is show MYVERSION; the keyword simply has to match what was added to the grammar.)
(base) xukaixuan@xukaixuandeMacBook-Pro spark % SPARK_VERSION=3.1.2(CUSTOM CMD) bin/spark-sql
21/09/04 23:41:40 WARN Utils: Your hostname, xukaixuandeMacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.31.85 instead (on interface en0)
21/09/04 23:41:40 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/09/04 23:41:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/09/04 23:41:43 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/09/04 23:41:43 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/09/04 23:41:45 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
21/09/04 23:41:45 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore xukaixuan@127.0.0.1
Spark master: local[*], Application Id: local-1630770102074
spark-sql> show MYVERSION;
3.1.2(CUSTOM CMD)
Time taken: 2.148 seconds, Fetched 1 row(s)
2. Build SQL that satisfies the following requirements
Verify with set spark.sql.planChangeLog.level=WARN;
Build one SQL statement that applies all three of these optimizer rules: CombineFilters, CollapseProject, BooleanSimplification
Build one SQL statement that applies all five of these optimizer rules: ConstantFolding, PushDownPredicates, ReplaceDistinctWithAggregate, ReplaceExceptWithAntiJoin, FoldablePropagation
Answer
Setup:
In spark-sql, create the table with
CREATE TEMPORARY TABLE finance USING org.apache.spark.sql.json OPTIONS (path 'finances-small.json');
where finances-small.json contains data like:
{"ID":1,"Account":{"Number":"123-ABC-789","FirstName":"Jay","LastName":"Smith"},"Date":"1/1/2015","Amount":1.23,"Description":"Drug Store"},
{"ID":2,"Account":{"Number":"456-DEF-456","FirstName":"Sally","LastName":"Fuller"},"Date":"1/3/2015","Amount":200.00,"Description":"Electronics"},
{"ID":3,"Account":{"Number":"333-XYZ-999","FirstName":"Brad","LastName":"Turner"},"Date":"1/4/2015","Amount":106.00,"Description":"Gas},
{"ID":4,"Account":{"Number":"987-CBA-321","FirstName":"Justin","LastName":"Pihony"},"Date":"1/4/2015","Amount":0.00,"Description":"Drug Store"},
...
In the SQL CLI, set: set spark.sql.planChangeLog.level=WARN;
Then run the SQL:
CREATE TEMPORARY TABLE finance USING org.apache.spark.sql.json OPTIONS (path 'finances-small.json');
-- 1. Build one SQL statement that applies all three of the following optimizer rules:
-- CombineFilters
-- CollapseProject
-- BooleanSimplification
select A + B + 1, ID, (case when true then "1" when false then "2" else "3" end) as c
from ( select Amount -1 AS A, Amount + 2 AS B, ID, `Date` from ( select * FROM finance where ID < 30 ) WHERE ID > 5 )
WHERE ID < 20
The resulting log (excerpts below) shows the rules being applied; in particular, the nested WHERE clauses end up as adjacent Filters that CombineFilters merges:
===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.CollapseProject ===
!Project [((A#47 + B#48) + cast(1 as double)) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49] Project [(((Amount#18 - cast(1 as double)) + (Amount#18 + cast(2 as double))) + cast(1 as double)) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49]
!+- Project [(Amount#18 - cast(1 as double)) AS A#47, (Amount#18 + cast(2 as double)) AS B#48, ID#21L]
+- Filter ((ID#21L > cast(5 as bigint)) AND (ID#21L < cast(20 as bigint)))
! +- Filter ((ID#21L > cast(5 as bigint)) AND (ID#21L < cast(20 as bigint)))
+- Project [Amount#18, ID#21L]
! +- Project [Amount#18, ID#21L]
+- Filter (ID#21L < cast(30 as bigint))
! +- Filter (ID#21L < cast(30 as bigint))
+- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
! +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals ===
!Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49] Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49]
+- Filter ((ID#21L > 5) AND (ID#21L < 20))
+- Filter ((ID#21L > 5) AND (ID#21L < 20))
+- Project [Amount#18, ID#21L]
+- Project [Amount#18, ID#21L]
+- Filter (ID#21L < 30)
+- Filter (ID#21L < 30)
+- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
+- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49] Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49]
!+- Filter ((ID#21L > 5) AND (ID#21L < 20))
+- Project [Amount#18, ID#21L]
! +- Project [Amount#18, ID#21L]
+- Filter ((ID#21L < 30) AND ((ID#21L > 5) AND (ID#21L < 20)))
! +- Filter (ID#21L < 30)
+- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
! +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
Then run:
-- 2. Build one SQL statement that applies all five of the following optimizer rules:
-- ConstantFolding
-- PushDownPredicates
-- ReplaceDistinctWithAggregate
-- ReplaceExceptWithAntiJoin
-- FoldablePropagation
select A, ID
from ( select distinct ID, `Date`, Amount + 0.2 - 0.1 AS A, Amount + 2 AS B from finance WHERE ID > 5 )
WHERE ID < 20
except DISTINCT select Amount + 0.2 as A, ID from finance WHERE ID > 25
The resulting log is below. The last rule, FoldablePropagation, was not observed in the log; this is to be improved later (see the sketch after the log):
===================================
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantFolding ===
Aggregate [ID#11L, Date#9, A#30, B#31], [A#30, ID#11L] Aggregate [ID#11L, Date#9, A#30, B#31], [A#30, ID#11L]
!+- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Project [ID#11L, Date#9, ((Amount#8 + 0.2) - 0.1) AS A#30, (Amount#8 + 2.0) AS B#31]
! +- Filter ((ID#11L > cast(5 as bigint)) AND (ID#11L < cast(20 as bigint))) +- Filter ((ID#11L > 5) AND (ID#11L < 20))
+- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================
===================================
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
Project [A#30, ID#11L] Project [A#30, ID#11L]
!+- Filter (ID#11L < cast(20 as bigint)) +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
! +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
! +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Filter ((ID#11L > cast(5 as bigint)) AND (ID#11L < cast(20 as bigint)))
! +- Filter (ID#11L > cast(5 as bigint)) +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
! +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================
===================================
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate ===
Project [A#30, ID#11L] Project [A#30, ID#11L]
+- Filter (ID#11L < cast(20 as bigint)) +- Filter (ID#11L < cast(20 as bigint))
! +- Distinct +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
+- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
+- Filter (ID#11L > cast(5 as bigint)) +- Filter (ID#11L > cast(5 as bigint))
+- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin ===
!Except false Distinct
Project [A#30, ID#11L] Project [A#30, ID#11L]
+- Filter (ID#11L < cast(20 as bigint)) +- Filter (ID#11L < cast(20 as bigint))
! +- Distinct +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
+- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
+- Filter (ID#11L > cast(5 as bigint)) +- Filter (ID#11L > cast(5 as bigint))
+- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================
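A possible follow-up for the missing FoldablePropagation rule (a sketch, not verified against this dataset): FoldablePropagation replaces references to aliases of foldable (constant) expressions with the constants themselves, so ordering by a literal alias should trigger it, for example:
select ID, 'flag' as tag, Amount + 0.2 - 0.1 as A
from finance
where ID > 5
order by tag, ID;
Here the ORDER BY reference to tag can be rewritten to the literal 'flag' by FoldablePropagation.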
3. Exercise: implement a custom optimizer rule
Step 1: implement the custom rule:
case class MyPushDown(spark: SparkSession) extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform { …. }
}
Step 2: create your own Extension and inject the rule:
class MySparkSessionExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectOptimizerRule { session =>
      new MyPushDown(session)
    }
  }
}
Step 3: submit it via spark.sql.extensions:
bin/spark-sql --jars my.jar --conf spark.sql.extensions=com.jikeshijian.MySparkSessionExtension
Answer
Create a new project, com.xkx.sql.extension, with the following pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xkx.sql.extension</groupId>
<artifactId>custom-spark-extention</artifactId>
<version>1.0-SNAPSHOT</version>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<!-- must be a 2.12.x Scala to match the _2.12 Spark artifacts below -->
<scala.version>2.12.10</scala.version>
<spark.version>2.4.5</spark.version>
<hadoop.version>2.9.2</hadoop.version>
<encoding>UTF-8</encoding>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.9.7</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.44</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.12</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.github.scopt/scopt -->
<dependency>
<groupId>com.github.scopt</groupId>
<artifactId>scopt_2.12</artifactId>
<version>3.5.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scalatest/scalatest -->
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.12</artifactId>
<version>3.2.0</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<pluginManagement>
<plugins>
<!-- plugin for compiling Scala -->
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
</plugin>
<!-- plugin for compiling Java -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
</plugin>
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<executions>
<execution>
<id>scala-compile-first</id>
<phase>process-resources</phase>
<goals>
<goal>add-source</goal>
<goal>compile</goal>
</goals>
</execution>
<execution>
<id>scala-test-compile</id>
<phase>process-test-resources</phase>
<goals>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<!-- plugin for building the jar -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
Under the source directory src/main/scala, create two new files:
MyPushDown.scala
MySparkSessionExtension.scala
├───src
│ ├───main
│ │ ├───java
│ │ ├───resources
│ │ └───scala
│ └───test
│ └───java
MyPushDown.scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, Sort}
import org.apache.spark.sql.catalyst.rules._

// A demo optimizer rule: it removes Sort nodes from the logical plan
// and logs whenever it is invoked.
case class MyPushDown(spark: SparkSession) extends Rule[LogicalPlan] {

  // Helper that strips a top-level Sort, also looking through Projects
  // (kept for reference; apply below matches Sort directly).
  private def removeTopLevelSort(plan: LogicalPlan): LogicalPlan = {
    plan match {
      case Sort(_, _, child) => child
      case Project(fields, child) => Project(fields, removeTopLevelSort(child))
      case other => other
    }
  }

  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    // Drop any Sort node and keep its child.
    case Sort(_, _, child) => {
      print("custom MyPushDown")
      child
    }
    // For every other node, log that the rule was visited and leave it unchanged.
    case other => {
      print("custom MyPushDown")
      logWarning(s"Optimization batch is excluded from the MyPushDown optimizer")
      other
    }
  }
}
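Note that SELECT * FROM t1 (used in the test below) only exercises the catch-all branch; to see the Sort case fire and actually remove a sort, the query needs a top-level ORDER BY, for example:
SELECT * FROM t1 ORDER BY v;
Since the rule silently drops the Sort, the results come back unordered, which is why this is only a demonstration rule.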
MySparkSessionExtension.scala
import org.apache.spark.sql.SparkSessionExtensions

// Entry point referenced by spark.sql.extensions: injects MyPushDown into the optimizer.
class MySparkSessionExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectOptimizerRule { session =>
      new MyPushDown(session)
    }
  }
}
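For local experiments, the same extension can also be registered programmatically instead of via --conf. A minimal sketch (assuming both classes above are on the classpath):
// Build a local SparkSession with the extension applied directly.
val spark = org.apache.spark.sql.SparkSession.builder()
  .master("local[*]")
  .appName("my-pushdown-test")
  .withExtensions(new MySparkSessionExtension)
  .getOrCreate()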
Then package it with mvn package. After it finishes, run:
spark-sql --jars target/custom-spark-extention-1.0-SNAPSHOT.jar --conf spark.sql.extensions=MySparkSessionExtension
Once the SQL console is up, run:
set spark.sql.planChangeLog.level=WARN;
create temporary view t1 as select * from values
("one", 1),
("two", 2),
("three", 3),
("one", NULL)
as t1(k, v);
SELECT * FROM t1;
The custom optimizer rule shows up in the log as: MyPushDown: Optimization batch is excluded from the MyPushDown optimizer
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation ===
!Project [k#12, v#13] LocalRelation [k#12, v#13]
!+- LocalRelation [k#12, v#13]
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Result of Batch LocalRelation early ===
!Project [k#12, v#13] LocalRelation [k#12, v#13]
!+- Project [cast(k#14 as string) AS k#12, cast(v#15 as int) AS v#13]
! +- Project [k#14, v#15]
! +- LocalRelation [k#14, v#15]
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Pullup Correlated Expressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Subquery has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Replace Operators has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Aggregate has no effect.
custom MyPushDown21/09/05 18:45:45 WARN MyPushDown: Optimization batch is excluded from the MyPushDown optimizer
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Operator Optimization before Inferring Filters has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Infer Filters has no effect.
custom MyPushDown21/09/05 18:45:45 WARN MyPushDown: Optimization batch is excluded from the MyPushDown optimizer
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Operator Optimization after Inferring Filters has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Push extra predicate through join has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Early Filter and Projection Push-Down has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Join Reorder has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Eliminate Sorts has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Decimal Optimizations has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Distinct Aggregate Rewrite has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Object Expressions Optimization has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch LocalRelation has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Check Cartesian Products has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch RewriteSubquery has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch NormalizeFloatingNumbers has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch ReplaceUpdateFieldsExpression has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Optimize Metadata Only Query has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch PartitionPruning has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Pushdown Filters from PartitionPruning has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Cleanup filters that cannot be pushed down has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Extract Python UDFs has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch User Provided Optimizers has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 157
Total time: 0.0271689 seconds
Total number of effective runs: 5
Total time of effective runs: 0.0231883 seconds
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Preparations has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch CleanExpressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 4.4E-6 seconds
Total number of effective runs: 0
Total time of effective runs: 0.0 seconds
21/09/05 18:45:45 WARN PlanChangeLogger: Batch CleanExpressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 8.8E-6 seconds
Total number of effective runs: 0
Total time of effective runs: 0.0 seconds
one 1
two 2
three 3
one NULL
Time taken: 0.142 seconds, Fetched 4 row(s)
spark-sql>