
Big Data Assignment: Spark SQL

Clarke

1. Add a custom command to Spark SQL

• SHOW VERSION;
• Displays the current Spark version and Java version

Answer

  1. Clone the Spark repository to a local machine.

  2. Open sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 and add the new statement and keywords one by one (the grammar rule name must correspond to the visitShowSparkVersion visitor added below).


  3. Using the Maven ANTLR plugin, run the antlr4 goal (double-click it in the IDE's Maven panel) to regenerate the parser.

  4. In sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala (the SparkSqlAstBuilder class),


insert the following code:


  override def visitShowSparkVersion(ctx: ShowSparkVersionContext): LogicalPlan = withOrigin(ctx) {
    ShowSparkVersionCommand()
  }


  5. Create a new file sql/core/src/main/scala/org/apache/spark/sql/execution/command/ShowSparkVersionCommand.scala


with the following content:


/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.spark.sql.execution.command

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
import org.apache.spark.sql.types.StringType

case class ShowSparkVersionCommand() extends LeafRunnableCommand {

  override val output: Seq[Attribute] =
    Seq(AttributeReference("spark_version", StringType, nullable = true)())

  override def run(sparkSession: SparkSession): Seq[Row] = {
    val outputString = System.getenv("SPARK_VERSION")
    Seq(Row(outputString))
  }
}
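The assignment also asks for the Java version, which the command above does not return (it only echoes the SPARK_VERSION environment variable). A hypothetical variant, not the code built in this walkthrough, could report both values using Spark's built-in version constant and the JVM's java.version system property:

// Hypothetical variant that returns both versions; assumes the same package
// and imports as ShowSparkVersionCommand above.
case class ShowVersionsCommand() extends LeafRunnableCommand {

  override val output: Seq[Attribute] = Seq(
    AttributeReference("spark_version", StringType, nullable = true)(),
    AttributeReference("java_version", StringType, nullable = true)())

  override def run(sparkSession: SparkSession): Seq[Row] = {
    // org.apache.spark.SPARK_VERSION is the version Spark was built with;
    // java.version is the version of the running JVM.
    Seq(Row(org.apache.spark.SPARK_VERSION, System.getProperty("java.version")))
  }
}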


  6. In the Spark folder, run build/sbt package -Phive -Phive-thriftserver

  7. After the build and packaging succeed, cd bin and launch spark-sql with the environment variable SPARK_VERSION set to 3.1.2(custom cmd).


  8. Query:


 show spark_version;


This will output 3.1.2(CUSTOM CMD), the value of the SPARK_VERSION environment variable:


 (base) xukaixuan@xukaixuandeMacBook-Pro spark % SPARK_VERSION=3.1.2(CUSTOM CMD) bin/spark-sql
21/09/04 23:41:40 WARN Utils: Your hostname, xukaixuandeMacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.31.85 instead (on interface en0)
21/09/04 23:41:40 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/09/04 23:41:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/09/04 23:41:43 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/09/04 23:41:43 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/09/04 23:41:45 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
21/09/04 23:41:45 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore xukaixuan@127.0.0.1
Spark master: local[*], Application Id: local-1630770102074
spark-sql> show MYVERSION;
3.1.2(CUSTOM CMD)
Time taken: 2.148 seconds, Fetched 1 row(s)


2. Build SQL statements that satisfy the following requirements

Check the applied rules via set spark.sql.planChangeLog.level=WARN;


  1. Build one SQL statement that applies all three of the following optimization rules: CombineFilters, CollapseProject, BooleanSimplification

  2. Build one SQL statement that applies all five of the following optimization rules: ConstantFolding, PushDownPredicates, ReplaceDistinctWithAggregate, ReplaceExceptWithAntiJoin, FoldablePropagation

Answer

Preparation:


Create a temporary table in spark-sql:


CREATE TEMPORARY TABLE finance USING org.apache.spark.sql.json  OPTIONS (path 'finances-small.json');


where the data in finances-small.json is:


{"ID":1,"Account":{"Number":"123-ABC-789","FirstName":"Jay","LastName":"Smith"},"Date":"1/1/2015","Amount":1.23,"Description":"Drug Store"},{"ID":2,"Account":{"Number":"456-DEF-456","FirstName":"Sally","LastName":"Fuller"},"Date":"1/3/2015","Amount":200.00,"Description":"Electronics"},{"ID":3,"Account":{"Number":"333-XYZ-999","FirstName":"Brad","LastName":"Turner"},"Date":"1/4/2015","Amount":106.00,"Description":"Gas},{"ID":4,"Account":{"Number":"987-CBA-321","FirstName":"Justin","LastName":"Pihony"},"Date":"1/4/2015","Amount":0.00,"Description":"Drug Store"},...
复制代码


In the SQL command line, run: set spark.sql.planChangeLog.level=WARN;


  1. Run the SQL:


CREATE TEMPORARY TABLE finance USING org.apache.spark.sql.json  OPTIONS (path 'finances-small.json');
-- 1. Build one SQL statement that applies all three of the following optimization rules:
-- CombineFilters
-- CollapseProject
-- BooleanSimplification
select A + B + 1, ID, (case when true then "1" when false then "2" else "3" end) as c
from (
  select Amount - 1 AS A, Amount + 2 AS B, ID, `Date`
  from (
    select * FROM finance where ID < 30
  ) WHERE ID > 5
) WHERE ID < 20


This produces the log below, in which CombineFilters has been applied:


===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.CollapseProject ===
!Project [((A#47 + B#48) + cast(1 as double)) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49]   Project [(((Amount#18 - cast(1 as double)) + (Amount#18 + cast(2 as double))) + cast(1 as double)) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49]
!+- Project [(Amount#18 - cast(1 as double)) AS A#47, (Amount#18 + cast(2 as double)) AS B#48, ID#21L]             +- Filter ((ID#21L > cast(5 as bigint)) AND (ID#21L < cast(20 as bigint)))
!   +- Filter ((ID#21L > cast(5 as bigint)) AND (ID#21L < cast(20 as bigint)))                +- Project [Amount#18, ID#21L]
!      +- Project [Amount#18, ID#21L]                   +- Filter (ID#21L < cast(30 as bigint))
!         +- Filter (ID#21L < cast(30 as bigint))                      +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
!            +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.SimplifyConditionals ===
!Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 WHEN false THEN 2 ELSE 3 END AS c#49] Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49]
 +- Filter ((ID#21L > 5) AND (ID#21L < 20)) +- Filter ((ID#21L > 5) AND (ID#21L < 20))
 +- Project [Amount#18, ID#21L] +- Project [Amount#18, ID#21L]
 +- Filter (ID#21L < 30) +- Filter (ID#21L < 30)
 +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
===================================
21/09/05 14:23:17 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
 Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49] Project [(((Amount#18 - 1.0) + (Amount#18 + 2.0)) + 1.0) AS ((A + B) + CAST(1 AS DOUBLE))#50, ID#21L, CASE WHEN true THEN 1 ELSE 3 END AS c#49]
!+- Filter ((ID#21L > 5) AND (ID#21L < 20)) +- Project [Amount#18, ID#21L]
! +- Project [Amount#18, ID#21L] +- Filter ((ID#21L < 30) AND ((ID#21L > 5) AND (ID#21L < 20)))
! +- Filter (ID#21L < 30) +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
! +- Relation[Account#17,Amount#18,Date#19,Description#20,ID#21L,_corrupt_record#22] json
===================================
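Besides reading the WARN log, the effect of these rules can be double-checked by comparing the analyzed plan with the optimized plan. A small sketch, assuming the same finance temporary table has been registered in a spark-shell session:

// Hypothetical check from spark-shell: inspect the plan before and after optimization.
val df = spark.sql("""
  select A + B + 1, ID, (case when true then "1" when false then "2" else "3" end) as c
  from (
    select Amount - 1 AS A, Amount + 2 AS B, ID, `Date`
    from (select * FROM finance where ID < 30) WHERE ID > 5
  ) WHERE ID < 20
""")

println(df.queryExecution.analyzed)      // nested Filters and Projects, CASE WHEN still intact
println(df.queryExecution.optimizedPlan) // filters combined, projections collapsed, CASE simplified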


  2. Run:


-- 2. Build one SQL statement that applies all five of the following optimization rules:
-- ConstantFolding
-- PushDownPredicates
-- ReplaceDistinctWithAggregate
-- ReplaceExceptWithAntiJoin
-- FoldablePropagation
select A, ID from (
  select distinct ID, `Date`, Amount + 0.2 - 0.1 AS A, Amount + 2 AS B
  from finance WHERE ID > 5
) WHERE ID < 20
except DISTINCT
select Amount + 0.2 as A, ID from finance WHERE ID > 25


This produces the log below. The last rule, FoldablePropagation, was not found in the log and will be addressed later (it typically fires when a foldable alias, such as a literal column, is referenced in a later clause like ORDER BY):


===================================
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ConstantFolding ===
 Aggregate [ID#11L, Date#9, A#30, B#31], [A#30, ID#11L]                                                                                  Aggregate [ID#11L, Date#9, A#30, B#31], [A#30, ID#11L]
!+- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]   +- Project [ID#11L, Date#9, ((Amount#8 + 0.2) - 0.1) AS A#30, (Amount#8 + 2.0) AS B#31]
!   +- Filter ((ID#11L > cast(5 as bigint)) AND (ID#11L < cast(20 as bigint)))                                                              +- Filter ((ID#11L > 5) AND (ID#11L < 20))
       +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================
===================================
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.PushDownPredicates ===
 Project [A#30, ID#11L] Project [A#30, ID#11L]
!+- Filter (ID#11L < cast(20 as bigint)) +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
! +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
! +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Filter ((ID#11L > cast(5 as bigint)) AND (ID#11L < cast(20 as bigint)))
! +- Filter (ID#11L > cast(5 as bigint)) +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
! +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================
===================================
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ReplaceDistinctWithAggregate ===
 Project [A#30, ID#11L] Project [A#30, ID#11L]
 +- Filter (ID#11L < cast(20 as bigint)) +- Filter (ID#11L < cast(20 as bigint))
! +- Distinct +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
 +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
 +- Filter (ID#11L > cast(5 as bigint)) +- Filter (ID#11L > cast(5 as bigint))
 +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
21/09/05 15:20:11 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ReplaceExceptWithAntiJoin ===
!Except false Distinct
 Project [A#30, ID#11L] Project [A#30, ID#11L]
 +- Filter (ID#11L < cast(20 as bigint)) +- Filter (ID#11L < cast(20 as bigint))
! +- Distinct +- Aggregate [ID#11L, Date#9, A#30, B#31], [ID#11L, Date#9, A#30, B#31]
 +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31] +- Project [ID#11L, Date#9, ((Amount#8 + cast(0.2 as double)) - cast(0.1 as double)) AS A#30, (Amount#8 + cast(2 as double)) AS B#31]
 +- Filter (ID#11L > cast(5 as bigint)) +- Filter (ID#11L > cast(5 as bigint))
 +- Relation[Account#7,Amount#8,Date#9,Description#10,ID#11L,_corrupt_record#12] json
===================================

Spark-Catalyst Optimizer summary

3. Exercise: implement a custom optimization rule

Step 1: implement the custom rule:

case class MyPushDown(spark: SparkSession) extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transform { …. }
}

Step 2: create your own Extension and inject the rule:

class MySparkSessionExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectOptimizerRule { session =>
      new MyPushDown(session)
    }
  }
}

Step 3: submit it via spark.sql.extensions:

bin/spark-sql --jars my.jar --conf spark.sql.extensions=com.jikeshijian.MySparkSessionExtension

Answer

  1. Create a new project com.xkx.sql.extension with the following pom.xml:


<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.xkx.sql.extension</groupId>
  <artifactId>custom-spark-extention</artifactId>
  <version>1.0-SNAPSHOT</version>

  <properties>
    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
    <scala.version>2.13.5</scala.version>
    <spark.version>2.4.5</spark.version>
    <hadoop.version>2.9.2</hadoop.version>
    <encoding>UTF-8</encoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <dependency>
      <groupId>joda-time</groupId>
      <artifactId>joda-time</artifactId>
      <version>2.9.7</version>
    </dependency>
    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>5.1.44</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.12</artifactId>
      <version>${spark.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.github.scopt/scopt -->
    <dependency>
      <groupId>com.github.scopt</groupId>
      <artifactId>scopt_2.12</artifactId>
      <version>3.5.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.scalatest/scalatest -->
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_2.12</artifactId>
      <version>3.2.0</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <!-- Plugin for compiling Scala -->
        <plugin>
          <groupId>net.alchim31.maven</groupId>
          <artifactId>scala-maven-plugin</artifactId>
          <version>3.2.2</version>
        </plugin>
        <!-- Plugin for compiling Java -->
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.5.1</version>
        </plugin>
      </plugins>
    </pluginManagement>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <executions>
          <execution>
            <id>scala-compile-first</id>
            <phase>process-resources</phase>
            <goals>
              <goal>add-source</goal>
              <goal>compile</goal>
            </goals>
          </execution>
          <execution>
            <id>scala-test-compile</id>
            <phase>process-test-resources</phase>
            <goals>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <executions>
          <execution>
            <phase>compile</phase>
            <goals>
              <goal>compile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <!-- Plugin for building the jar -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>


Under the source directory src\main\scala, create the following files:


  1. MyPushDown.scala

  2. MySparkSessionExtension.scala


├───src
│   ├───main
│   │   ├───java
│   │   ├───resources
│   │   └───scala
│   └───test
│       └───java


  1. MyPushDown.scala


import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.expressions.SubqueryExpression
import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, Sort}
import org.apache.spark.sql.catalyst.rules._

case class MyPushDown(spark: SparkSession) extends Rule[LogicalPlan] {

  // Helper that strips a Sort at the top of the plan (kept for reference; not used by apply).
  private def removeTopLevelSort(plan: LogicalPlan): LogicalPlan = {
    plan match {
      case Sort(_, _, child) => child
      case Project(fields, child) => Project(fields, removeTopLevelSort(child))
      case other => other
    }
  }

  // Remove every Sort node the rule encounters; log a warning for all other nodes.
  def apply(plan: LogicalPlan): LogicalPlan = plan transform {
    case Sort(_, _, child) => {
      print("custom MyPushDown")
      child
    }
    case other => {
      print("custom MyPushDown")
      logWarning(s"Optimization batch is excluded from the MyPushDown optimizer")
      other
    }
  }
}


  2. MySparkSessionExtension.scala


import org.apache.spark.sql.SparkSessionExtensions

class MySparkSessionExtension extends (SparkSessionExtensions => Unit) {

  // Register MyPushDown as an extra optimizer rule for every new session.
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectOptimizerRule { session =>
      new MyPushDown(session)
    }
  }
}
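As an alternative to the spark.sql.extensions configuration used below, the extension can also be registered programmatically when building a SparkSession. A minimal sketch, for example in a test or a small driver program, assuming both classes are on the classpath:

import org.apache.spark.sql.SparkSession

// Hypothetical programmatic registration instead of --conf spark.sql.extensions=...
val spark = SparkSession.builder()
  .master("local[*]")
  .withExtensions(new MySparkSessionExtension)  // an instance of SparkSessionExtensions => Unit
  .getOrCreate()

spark.sql("set spark.sql.planChangeLog.level=WARN")
spark.range(10).orderBy("id").show()  // the ORDER BY gives MyPushDown a Sort node to remove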


Then build the package with mvn package.


When that completes, run:


spark-sql --jars target/custom-spark-extention-1.0-SNAPSHOT.jar  --conf spark.sql.extensions=MySparkSessionExtension


After the SQL console has started, run:


set spark.sql.planChangeLog.level=WARN;
create temporary view t1 as select * from values ("one", 1), ("two", 2), ("three", 3), ("one", NULL) as t1(k, v);

SELECT * FROM t1;


The custom optimization rule shows up in the log as: MyPushDown: Optimization batch is excluded from the MyPushDown optimizer. (Since the query has no ORDER BY, only the catch-all branch of the rule fires; a query such as SELECT * FROM t1 ORDER BY v would exercise the Sort-removal branch.)


=== Applying Rule org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation ===
!Project [k#12, v#13]            LocalRelation [k#12, v#13]
!+- LocalRelation [k#12, v#13]
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Result of Batch LocalRelation early ===
!Project [k#12, v#13] LocalRelation [k#12, v#13]
!+- Project [cast(k#14 as string) AS k#12, cast(v#15 as int) AS v#13]
! +- Project [k#14, v#15]
! +- LocalRelation [k#14, v#15]
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Pullup Correlated Expressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Subquery has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Replace Operators has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Aggregate has no effect.
custom MyPushDown21/09/05 18:45:45 WARN MyPushDown: Optimization batch is excluded from the MyPushDown optimizer
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Operator Optimization before Inferring Filters has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Infer Filters has no effect.
custom MyPushDown21/09/05 18:45:45 WARN MyPushDown: Optimization batch is excluded from the MyPushDown optimizer
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Operator Optimization after Inferring Filters has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Push extra predicate through join has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Early Filter and Projection Push-Down has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Join Reorder has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Eliminate Sorts has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Decimal Optimizations has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Distinct Aggregate Rewrite has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Object Expressions Optimization has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch LocalRelation has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Check Cartesian Products has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch RewriteSubquery has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch NormalizeFloatingNumbers has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch ReplaceUpdateFieldsExpression has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Optimize Metadata Only Query has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch PartitionPruning has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Pushdown Filters from PartitionPruning has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Cleanup filters that cannot be pushed down has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Extract Python UDFs has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch User Provided Optimizers has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 157
Total time: 0.0271689 seconds
Total number of effective runs: 5
Total time of effective runs: 0.0231883 seconds
21/09/05 18:45:45 WARN PlanChangeLogger: Batch Preparations has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger: Batch CleanExpressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 4.4E-6 seconds
Total number of effective runs: 0
Total time of effective runs: 0.0 seconds
21/09/05 18:45:45 WARN PlanChangeLogger: Batch CleanExpressions has no effect.
21/09/05 18:45:45 WARN PlanChangeLogger:
=== Metrics of Executed Rules ===
Total number of runs: 1
Total time: 8.8E-6 seconds
Total number of effective runs: 0
Total time of effective runs: 0.0 seconds
one	1
two	2
three	3
one	NULL
Time taken: 0.142 seconds, Fetched 4 row(s)
spark-sql>
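As a final sanity check, the rule can also be applied directly to an analyzed logical plan without launching spark-sql. A rough sketch, assuming MyPushDown is on the classpath of a small local Spark application:

import org.apache.spark.sql.SparkSession

object MyPushDownCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Build a plan that ends in a Sort, then apply the rule by hand.
    val sorted = Seq(("one", 1), ("two", 2), ("three", 3)).toDF("k", "v").orderBy("v")
    val analyzed = sorted.queryExecution.analyzed

    println(analyzed)                     // plan with a Sort node on top
    println(MyPushDown(spark)(analyzed))  // the same plan with the Sort removed

    spark.stop()
  }
}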


