For a constant Learner & Data Analysts

This blog is for those who see their career and passion in data analysis and data science work. The focus is on concepts first, followed by examples in Excel, SAS, R and Python. Happy Learning DataOps :)


Tuesday, August 27, 2019

Apache Spark in Google Colaboratory

These are from my learning notes!

1.1    Setting up Spark on Google Colab


Google Colaboratory (Colab) is a good cloud platform for someone starting to learn Python. Notebooks run in the browser and are saved to your Google account, so you can get back to what you practiced from any machine.

It can also be used to learn Spark. Follow the steps below. The commands refer to a specific Spark release, so check which version is currently available and adjust the .tgz file name and the matching paths as needed.

1.1.1    Install Java, Spark, and Findspark

!apt-get install openjdk-8-jdk-headless -qq > /dev/null
!wget -q http://apache.osuosl.org/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz
!tar xf spark-2.4.3-bin-hadoop2.7.tgz
!pip install -q findspark
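
Since the version number appears in both the download URL and the archive name, one way to keep them in sync is to build both from a single pair of version strings. The sketch below is only an illustration: the helper name `spark_tarball` is hypothetical, and the base URL assumes the Apache archive layout (`archive.apache.org/dist/spark`), which is not the mirror used in the commands above.

```python
# Assumption: the Apache archive keeps releases under
#   <base>/spark-<version>/spark-<version>-bin-hadoop<hadoop>.tgz
ARCHIVE_BASE = "https://archive.apache.org/dist/spark"

def spark_tarball(spark_version, hadoop_version, base=ARCHIVE_BASE):
    """Return (directory_name, file_name, url) for a Spark binary release."""
    dir_name = f"spark-{spark_version}-bin-hadoop{hadoop_version}"
    file_name = f"{dir_name}.tgz"
    url = f"{base}/spark-{spark_version}/{file_name}"
    return dir_name, file_name, url

dir_name, file_name, url = spark_tarball("2.4.3", "2.7")
print(url)
```

In a Colab cell, Python variables can be passed to shell commands with braces, for example `!wget -q {url}` followed by `!tar xf {file_name}`.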

1.1.2    Set Environment Variables

import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["SPARK_HOME"] = "/content/spark-2.4.3-bin-hadoop2.7"
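
A wrong path here usually shows up only later, as an error when the session starts, so it can help to check the directories right away. A minimal sketch, assuming the two variables above have been set; the helper name `missing_dirs` is hypothetical.

```python
import os

def missing_dirs(var_names, environ=os.environ):
    """Return the names of variables that are unset or do not point to a directory."""
    missing = []
    for name in var_names:
        path = environ.get(name)
        if not path or not os.path.isdir(path):
            missing.append(name)
    return missing

problems = missing_dirs(["JAVA_HOME", "SPARK_HOME"])
if problems:
    print("Check these variables:", ", ".join(problems))
else:
    print("JAVA_HOME and SPARK_HOME both point to existing directories")
```

If a variable is reported, the usual cause is a Spark version in the path that does not match the archive that was unpacked.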

1.1.3    Start a SparkSession

import findspark
findspark.init()
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").getOrCreate()
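
To confirm the session came up as expected, the Spark version and master URL can be read back from it. A small sketch: `session_summary` is a hypothetical helper, while `spark.version` and `spark.sparkContext.master` are standard PySpark attributes.

```python
def session_summary(spark):
    """Collect a few identifying attributes from a SparkSession."""
    return {
        "version": spark.version,             # the Spark release in use
        "master": spark.sparkContext.master,  # "local[*]" for the builder call above
    }

# In the notebook, after the session above has been created:
# print(session_summary(spark))
```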

1.1.4    Use Spark!

df = spark.createDataFrame([{"winner": "Humanity"} for x in range(100)])

df.show(2)
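
Spark 2.x may print a deprecation warning when a schema is inferred from a list of dictionaries. One alternative, sketched here under that assumption, is to pass plain tuples together with an explicit list of column names. The helper `build_rows` is hypothetical; the commented lines assume the `spark` session from section 1.1.3.

```python
def build_rows(n, value="Humanity"):
    """Return n one-column rows as tuples, ready for createDataFrame."""
    return [(value,) for _ in range(n)]

rows = build_rows(100)
columns = ["winner"]

# In the notebook, with the session from section 1.1.3:
# df = spark.createDataFrame(rows, schema=columns)
# df.show(2)
```

Naming the columns explicitly also keeps the column order fixed, which matters once the rows have more than one field.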


Posted by Ashutosh at 12:42 AM
