news 2025/11/15 8:03:43

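Spark: Submitting Applications with spark-submit

Spark supports three cluster managers:

- Standalone: the cluster manager bundled with Spark, which makes it easy to set up a cluster.
- Apache Mesos: a general-purpose cluster manager that can also run Hadoop MapReduce and service applications.
- Hadoop YARN: the resource manager in Hadoop 2.

Notes:

1. When the cluster is not particularly large and there is no need to run MapReduce and Spark side by side, Standalone mode is the most efficient choice.
2. Spark schedules resources at two levels: across applications, through the cluster manager, and within an application, when a single SparkContext runs multiple jobs.

Running Spark on YARN

cluster mode:

    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn \
      --deploy-mode cluster \
      --driver-memory 4g \
      --executor-memory 2g \
      --executor-cores 1 \
      lib/spark-examples*.jar \
      10

client mode:

    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn \
      --deploy-mode client \
      --driver-memory 4g \
      --executor-memory 2g \
      --executor-cores 1 \
      lib/spark-examples*.jar \
      10

spark-submit parameter reference

| Parameter | Description |
| --- | --- |
| --master | Master address, that is, where the job is submitted for execution, for example spark://host:port, yarn, local. See the Master URL table below for the accepted values. |
| --deploy-mode | Whether to launch the driver locally (client) or on the cluster (cluster). Default: client. |
| --class | Main class of the application (Java and Scala applications only). |
| --name | Name of the application. |
| --jars | Comma-separated list of local jars to include on the driver and executor classpaths. |
| --packages | Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. |
| --exclude-packages | Packages to exclude in order to avoid dependency conflicts. |
| --repositories | Additional remote repositories. |
| --conf PROP=VALUE | Sets a Spark configuration property, for example --conf spark.executor.extraJavaOptions="-XX:MaxPermSize=256m". |
| --properties-file | Configuration file to load. Default: conf/spark-defaults.conf. |
| --driver-memory | Driver memory. Default: 1G. |
| --driver-java-options | Extra Java options to pass to the driver. |
| --driver-library-path | Extra library path entries to pass to the driver. |
| --driver-class-path | Extra classpath entries to pass to the driver. |
| --driver-cores | Number of cores for the driver. Default: 1. Used on YARN or standalone, in cluster deploy mode. |
| --executor-memory | Memory per executor. Default: 1G. |
| --total-executor-cores | Total number of cores across all executors. Used only on Mesos or standalone. |
| --num-executors | Number of executors to launch. Default: 2. Used on YARN. |
| --executor-cores | Number of cores per executor. Used on YARN or standalone. |

Master URL values

| Master URL | Meaning |
| --- | --- |
| local | Run the Spark application locally with one worker thread. |
| local[K] | Run the Spark application locally with K worker threads. |
| local[*] | Run the Spark application locally with as many worker threads as the machine has logical cores. |
| spark://HOST:PORT | Connect to a Spark Standalone cluster and run the application there. |
| mesos://HOST:PORT | Connect to a Mesos cluster and run the application there. |
| yarn-client | Connect to a YARN cluster in client mode. The cluster location is taken from the HADOOP_CONF_DIR environment variable. The driver runs on the client. |
| yarn-cluster | Connect to a YARN cluster in cluster mode. The cluster location is taken from the HADOOP_CONF_DIR environment variable. The driver also runs inside the cluster. |

The yarn-client and yarn-cluster values are deprecated since Spark 2.0. Newer versions use --master yarn together with --deploy-mode client or --deploy-mode cluster, as in the commands in this article.

Client, cluster, and local modes

[Figure: typical client mode] In client mode, the Spark driver runs on the machine from which the job was submitted.

[Figure: cluster mode] In cluster mode, the Spark driver runs on YARN.

Comparison of the three modes

| | YARN Cluster | YARN Client | Spark Standalone |
| --- | --- | --- | --- |
| Where the driver runs | Application Master | Client | Client |
| Who requests resources | Application Master | Application Master | Client |
| Who starts executor processes | YARN NodeManager | YARN NodeManager | Spark Slave |
| Long-running processes | 1. YARN ResourceManager 2. NodeManager | 1. YARN ResourceManager 2. NodeManager | 1. Spark Master 2. Spark Worker |
| Supports Spark Shell | No | Yes | Yes |

spark-submit examples

    # Run application locally on 8 cores
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master local[8] \
      /path/to/examples.jar \
      100

    # Run on a Spark standalone cluster in client deploy mode
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://207.184.161.138:7077 \
      --executor-memory 20G \
      --total-executor-cores 100 \
      /path/to/examples.jar \
      1000

    # Run on a Spark standalone cluster in cluster deploy mode with supervise
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://207.184.161.138:7077 \
      --deploy-mode cluster \
      --supervise \
      --executor-memory 20G \
      --total-executor-cores 100 \
      /path/to/examples.jar \
      1000

    # Run on a YARN cluster (--deploy-mode can be client for client mode)
    export HADOOP_CONF_DIR=XXX
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master yarn \
      --deploy-mode cluster \
      --executor-memory 20G \
      --num-executors 50 \
      /path/to/examples.jar \
      1000

    # Run on a Mesos cluster in cluster deploy mode with supervise
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master mesos://207.184.161.138:7077 \
      --deploy-mode cluster \
      --supervise \
      --executor-memory 20G \
      --total-executor-cores 100 \
      http://path/to/examples.jar \
      1000

    # Run a Python application on a Spark standalone cluster
    ./bin/spark-submit \
      --master spark://207.184.161.138:7077 \
      examples/src/main/python/pi.py \
      1000

A complete example

The command below submits a streaming job to a YARN queue. Two of its options deserve a second look before reuse: the spark-submit help lists --supervise for standalone and Mesos cluster mode rather than for YARN, and if dynamic allocation is enabled, spark.dynamicAllocation.maxExecutors=9 is lower than the 20 executors requested with --num-executors.

    spark-submit \
      --master yarn \
      --queue root.sparkstreaming \
      --deploy-mode cluster \
      --supervise \
      --name spark-job \
      --num-executors 20 \
      --executor-cores 2 \
      --executor-memory 4g \
      --conf spark.dynamicAllocation.maxExecutors=9 \
      --files commons.xml \
      --class com.***.realtime.helper.HelperHandle \
      BSS-ONSS-Spark-Realtime-1.0-SNAPSHOT.jar 500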
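
The local, YARN client, and YARN cluster submissions above differ mainly in the values passed to --master and --deploy-mode. As a rough sketch of that mapping, the helper below prints the spark-submit command for a chosen mode instead of running it, so the command can be reviewed first. The function name build_submit_cmd and its mode keywords (local, yarn-client, yarn-cluster) are hypothetical names chosen for this illustration and are not part of Spark; only the --master, --deploy-mode, and --class flags come from the article.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): print, but do not run, the
# spark-submit command for a given mode.
# Usage: build_submit_cmd MODE MAIN_CLASS APP_JAR [APP_ARGS...]
build_submit_cmd() {
    mode="$1"
    main_class="$2"
    app_jar="$3"
    shift 3   # whatever remains is passed through to the application

    # Rebuild the positional parameters as the full spark-submit argument list.
    case "$mode" in
        local)
            set -- --master "local[*]" --class "$main_class" "$app_jar" "$@" ;;
        yarn-client)
            set -- --master yarn --deploy-mode client --class "$main_class" "$app_jar" "$@" ;;
        yarn-cluster)
            set -- --master yarn --deploy-mode cluster --class "$main_class" "$app_jar" "$@" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac

    printf '%s\n' "spark-submit $*"
}

# Example: show the command for a YARN cluster-mode run of SparkPi.
build_submit_cmd yarn-cluster org.apache.spark.examples.SparkPi /path/to/examples.jar 1000
```

Printing the command rather than executing it keeps the sketch runnable on a machine without Spark installed. Resource options such as --executor-memory or --num-executors are left out on purpose and would need to be added for a real job.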
