安哥网络 posted on 2017-03-22 12:30:57

Spark (3): Installation and Configuration

This post follows on from HDP 2.4 Installation (5): Cluster and Component Installation. The Spark version installed and configured here is 1.6. On top of the already-installed HBase and Hadoop clusters, Spark is installed automatically through Ambari and runs in Hadoop YARN mode.

Contents:
- Spark cluster installation
- Parameter configuration
- Testing and verification

Spark cluster installation:

In the Ambari services panel, choose "Add Service", as shown:
http://static.yjs001.cn/uploadpic/4/19/41913f987907b0d94288e630e114dfca.jpg
In the pop-up dialog, check the Spark service, as shown:
http://static.yjs001.cn/uploadpic/1/27/12774b94bb7e93d3fed5c939d81748f5.jpg
Click "Next" to assign host nodes. Since the Hadoop and HBase clusters were installed earlier, simply follow the wizard to place the Spark History Server, then assign the clients, as shown below:
http://static.yjs001.cn/uploadpic/d/59/d592fe43cd70c80f3b1dae6be1349a6c.jpg
Deploy the installation; when everything is healthy it looks like this:
http://static.yjs001.cn/uploadpic/5/05/5054b9a20c8f5993449533b2971c0def.jpg

Parameter configuration:

After the installation completes, restart HDFS and YARN and check the Spark service. The Spark Thrift Server fails to start, with the following log:
http://static.yjs001.cn/uploadpic/c/de/cdec0645add3fc3c328197dda5c76203.jpg

16/08/30 14:13:25 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (512 MB per container)
16/08/30 14:13:25 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (1024+384 MB) is above the max threshold (512 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
    at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:284)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:140)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:56)
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:76)
    at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Solution: adjust the related YARN parameters yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb.

yarn.nodemanager.resource.memory-mb is the total physical memory YARN may use on a node; the default is 8192 MB. Note that my hdp2-3 machine has 4 GB of RAM and this value had been set to 512 MB, so I raised it to the size shown in the image below. yarn.scheduler.maximum-allocation-mb is the maximum physical memory a single container may request; its default is also 8192 MB. The failed request above needed 1024 MB of executor memory plus 384 MB of overhead, i.e. 1408 MB, far above the 512 MB cap.
http://static.yjs001.cn/uploadpic/c/27/c27860806f0e8e207710b75802f19742.jpg
Save the configuration and restart the services that depend on it; once healthy it looks like this:
http://static.yjs001.cn/uploadpic/4/18/418f45f8b656389fbbeb600cea5f6a67.jpg

Testing and verification:

On any machine with the Spark client installed (hdp4), change to the bin directory under the Spark installation directory.
Command: ./spark-sql
SQL command: show databases;
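The error above can be reproduced with a little arithmetic. The following is a minimal sketch (not Spark's actual code) of the check that fails in Client.verifyClusterResources, assuming the Spark 1.6 defaults of a 384 MB overhead floor and a 10% overhead fraction:

```python
# Sketch of the memory check behind the error above: YARN rejects any container
# larger than yarn.scheduler.maximum-allocation-mb, and Spark on YARN asks for
# executor memory plus an overhead of max(384 MB, 10% of executor memory).

MIN_OVERHEAD_MB = 384      # floor for spark.yarn.executor.memoryOverhead
OVERHEAD_FRACTION = 0.10   # default overhead fraction in Spark 1.6

def required_container_mb(executor_memory_mb: int) -> int:
    """Total memory Spark requests from YARN for one executor container."""
    overhead = max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FRACTION))
    return executor_memory_mb + overhead

def fits_in_cluster(executor_memory_mb: int, max_allocation_mb: int) -> bool:
    """True if the requested container fits under the YARN per-container cap."""
    return required_container_mb(executor_memory_mb) <= max_allocation_mb

# The failing configuration from the log: 1024 MB executors, 512 MB cap.
print(required_container_mb(1024))   # 1024 + 384 = 1408
print(fits_in_cluster(1024, 512))    # False -> the IllegalArgumentException
print(fits_in_cluster(1024, 4096))   # True once the YARN limits are raised
```

This makes the fix obvious: either raise the YARN caps (as done here) or shrink the executor memory request below the cap minus overhead.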
The output is shown below:
http://static.yjs001.cn/uploadpic/0/ec/0ec7ec1d2b16ffd35e995f99607c0317.jpg
Check the history, which looks like this:
http://static.yjs001.cn/uploadpic/1/79/17992e5544e34777f5cfb1d885fdaf51.jpg
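For reference, outside Ambari the two YARN memory settings adjusted earlier live in yarn-site.xml. A fragment with example values sized for a 4 GB node might look like this (the values are illustrative; on an Ambari-managed cluster, change them through the YARN configuration page rather than editing the file by hand):

```xml
<!-- yarn-site.xml: example values for a node with 4 GB of RAM -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>3072</value>  <!-- total physical memory YARN may use on this node -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>3072</value>  <!-- max memory a single container may request -->
</property>
```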
Source: http://www.yjs001.cn/bigdata/spark/40552107538437498103.html