Sunday, December 7, 2014

Apache Spark Day One - Installation

Install Spark on Mac OS X 10.9

1. Install Scala
On Mac OS, this is pretty easy with brew:

brew install scala
(My initial installation failed; installing hadoop fixed it.)

2. Go to the Spark directory and build it:
./sbt/sbt clean assembly

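3. Go to the Mac settings and enable Remote Login (System Preferences > Sharing); the cluster start scripts use ssh to launch the workers.
If all steps completed successfully, start the master with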
./sbin/start-master.sh

The GUI is then available at http://localhost:8080
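
To confirm the master is up without opening a browser, you can poll the GUI address from the shell. This is only a sketch: it assumes curl is installed, and the function name master_ui_status is made up for this post rather than being part of Spark.

```shell
# Hypothetical helper (not part of Spark): prints "up" if the URL answers
# with a successful HTTP status, and "down" otherwise.
master_ui_status() {
  url="${1:-http://localhost:8080}"
  if curl -sf --max-time 5 -o /dev/null "$url"; then
    echo "up"
  else
    echo "down"
  fi
}

# Check the default address used in this post
master_ui_status
```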

4. Configuration
 - Create conf/spark-env.sh in the Spark directory.
   To try 4 slaves, add "export SPARK_WORKER_INSTANCES=4" at the end of the file to start 4 workers.
   After restarting, check the GUI; it should show the workers (you might be prompted for ssh credentials for each worker).
  

./sbin/stop-all.sh #stop the master and all slaves
./sbin/start-all.sh #start the master and all slaves

 - Configure logging in conf/log4j.properties
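
The configuration step can be sketched as a short shell snippet. It assumes you run it from the Spark directory (or that SPARK_HOME points at it) and that your Spark version ships the conf/*.template files; CONF_DIR is just a variable name used here.

```shell
# Assumption: SPARK_HOME is the Spark directory; defaults to the current one.
SPARK_HOME="${SPARK_HOME:-$PWD}"
CONF_DIR="$SPARK_HOME/conf"
mkdir -p "$CONF_DIR"  # no-op inside a real Spark directory

# Start each config file from its shipped template, if present and not yet copied.
for name in spark-env.sh log4j.properties; do
  if [ ! -f "$CONF_DIR/$name" ] && [ -f "$CONF_DIR/$name.template" ]; then
    cp "$CONF_DIR/$name.template" "$CONF_DIR/$name"
  fi
done

# Append the worker setting only if it is not already there.
if ! grep -q '^export SPARK_WORKER_INSTANCES=' "$CONF_DIR/spark-env.sh" 2>/dev/null; then
  echo 'export SPARK_WORKER_INSTANCES=4' >> "$CONF_DIR/spark-env.sh"
fi
```

The grep guard keeps the snippet safe to run more than once. The workers only pick up the change after the cluster is stopped and started again.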

5. Now everything is set up; test a Python script with
./bin/spark-submit test_pythonscript.py
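
If you do not have a script at hand, a minimal one can be created as below. The file name matches the command above; the application name InstallCheck and the toy job (summing 1 to 100) are only assumptions for illustration, not anything Spark requires.

```shell
# Write a tiny PySpark job to test_pythonscript.py
cat > test_pythonscript.py <<'EOF'
from pyspark import SparkContext

# "InstallCheck" is an arbitrary application name
sc = SparkContext(appName="InstallCheck")
# Distribute the numbers 1..100 and add them up on the workers
total = sc.parallelize(range(1, 101)).sum()
print("sum of 1..100 = %d" % total)
sc.stop()
EOF

# Submit it only when run from the Spark directory
if [ -x ./bin/spark-submit ]; then
  ./bin/spark-submit test_pythonscript.py
fi
```

Without a --master option the job may simply run locally; to send it to the standalone cluster, pass --master with the spark:// URL shown at the top of the GUI.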
