1. Install Scala
On Mac OS, this is easy with Homebrew:
brew install scala
(My initial installation failed; installing Hadoop fixed it.)
2. Download Spark from https://spark.apache.org/downloads.html
Go to the Spark directory and build it:
./sbt/sbt clean assembly
3. In the Mac System Preferences, under Sharing, enable Remote Login (the start scripts use ssh to launch the workers).
If all the steps above succeeded, start the master with:
./sbin/start-master.sh
The GUI is then available at http://localhost:8080
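To check the GUI from a script rather than the browser, something like the following Python sketch may help. It assumes the master web UI is at the default http://localhost:8080; the function name master_ui_is_up is made up for this note and is not part of Spark.

```python
import urllib.error
import urllib.request


def master_ui_is_up(url="http://localhost:8080", timeout=3.0):
    """Return True if an HTTP GET to the master GUI answers with status 200.

    The default URL is an assumption (the standard master GUI address);
    change it if your master runs elsewhere.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error, etc.: treat as "not up".
        return False


if __name__ == "__main__":
    print("master GUI reachable:", master_ui_is_up())
```

This only tells you that something answers on that port; the worker list still has to be read in the GUI itself.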
4. Configuration
- Create conf/spark-env.sh in the Spark directory (for example by copying conf/spark-env.sh.template)
To try 4 workers, add this line at the end of the file:
export SPARK_WORKER_INSTANCES=4
Then restart the cluster so the setting takes effect:
./sbin/stop-all.sh # stop the master and all workers
./sbin/start-all.sh # start the master and all workers
start-all.sh may ask for your ssh password or key passphrase once per worker. Afterwards, check the GUI; it should show the 4 workers.
- Configure logging in conf/log4j.properties
5. Everything is now set up. Test a Python script with:
./bin/spark-submit test_pythonscript.py
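If you do not have a script to test with yet, here is a minimal sketch of what test_pythonscript.py could contain. It is an assumption, not the original script: it counts words as a Spark job and compares the result with the same counts computed in plain Python. The sample input LINES and the helper names are hypothetical.

```python
# test_pythonscript.py: a small smoke test for the Spark setup (illustrative sketch).
from collections import Counter
from operator import add

try:
    from pyspark import SparkContext
except ImportError:  # e.g. started with plain `python` instead of spark-submit
    SparkContext = None

# Hypothetical sample input; any list of strings works.
LINES = ["spark is set up", "spark works", "the master is up"]


def split_words(line):
    """Lowercase a line and split it on whitespace."""
    return line.lower().split()


def expected_counts(lines):
    """Word counts computed without Spark, used as the reference answer."""
    return dict(Counter(w for line in lines for w in split_words(line)))


def spark_counts(sc, lines):
    """The same word counts computed as a Spark job."""
    pairs = sc.parallelize(lines).flatMap(split_words).map(lambda w: (w, 1))
    return dict(pairs.reduceByKey(add).collect())


def main():
    if SparkContext is None:
        print("pyspark is not importable; run this with ./bin/spark-submit")
        return
    sc = SparkContext(appName="test_pythonscript")
    try:
        result = spark_counts(sc, LINES)
        assert result == expected_counts(LINES), result
        print("OK: Spark word counts match the reference:", result)
    finally:
        sc.stop()


if __name__ == "__main__":
    main()
```

Depending on your configuration, spark-submit may run the script in local mode. To send it to the standalone master, pass the master URL shown at the top of the GUI, e.g. ./bin/spark-submit --master spark://<host>:7077 test_pythonscript.py (7077 is the default master port).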