# spark-scala-examples

**Repository Path**: doubo151/spark-scala-examples

## Basic Information

- **Project Name**: spark-scala-examples
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2021-03-13
- **Last Updated**: 2023-05-12

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

Explanations of all the Spark SQL, RDD, DataFrame, and Dataset examples in this project are available at https://sparkbyexamples.com/. All of these examples are written in Scala and tested in our development environment.

# Table of Contents (Spark Examples in Scala)

## Spark RDD Examples

- [Create a Spark RDD using Parallelize](https://sparkbyexamples.com/apache-spark-rdd/how-to-create-an-rdd-using-parallelize/)
- [Spark – Read multiple text files into single RDD?](https://sparkbyexamples.com/apache-spark-rdd/spark-read-multiple-text-files-into-a-single-rdd/)
- [Spark load CSV file into RDD](https://sparkbyexamples.com/apache-spark-rdd/spark-load-csv-file-into-rdd/)
- [Different ways to create Spark RDD](https://sparkbyexamples.com/apache-spark-rdd/different-ways-to-create-spark-rdd/)
- [Spark – How to create an empty RDD?](https://sparkbyexamples.com/apache-spark-rdd/spark-how-to-create-an-empty-rdd/)
- [Spark RDD Transformations with examples](https://sparkbyexamples.com/apache-spark-rdd/spark-rdd-transformations/)
- [Spark RDD Actions with examples](https://sparkbyexamples.com/apache-spark-rdd/spark-rdd-actions/)
- [Spark Pair RDD Functions](https://sparkbyexamples.com/apache-spark-rdd/spark-pair-rdd-functions/)
- [Spark Repartition() vs Coalesce()](https://sparkbyexamples.com/spark/spark-repartition-vs-coalesce/)
- [Spark Shuffle Partitions](https://sparkbyexamples.com/spark/spark-shuffle-partitions/)
- [Spark Persistence Storage Levels](https://sparkbyexamples.com/spark/spark-persistence-storage-levels/)
- [Spark RDD Cache and Persist with Example](https://sparkbyexamples.com/apache-spark-rdd/spark-rdd-cache-and-persist-example/)
- [Spark Broadcast Variables](https://sparkbyexamples.com/spark/spark-broadcast-variables/)
- [Spark Accumulators Explained](https://sparkbyexamples.com/spark/spark-accumulators/)
- [Convert Spark RDD to DataFrame | Dataset](https://sparkbyexamples.com/apache-spark-rdd/convert-spark-rdd-to-dataframe-dataset/)

## Spark SQL Tutorial

- [Spark Create DataFrame with Examples](https://sparkbyexamples.com/spark/different-ways-to-create-a-spark-dataframe/)
- [Spark DataFrame withColumn](https://sparkbyexamples.com/spark/spark-dataframe-withcolumn/)
- [Ways to Rename column on Spark DataFrame](https://sparkbyexamples.com/spark/rename-a-column-on-spark-dataframes/)
- [Spark – How to Drop a DataFrame/Dataset column](https://sparkbyexamples.com/spark/spark-drop-column-from-dataframe-dataset/)
- [Working with Spark DataFrame Where Filter](https://sparkbyexamples.com/spark/spark-dataframe-where-filter/)
- [Spark SQL “case when” and “when otherwise”](https://sparkbyexamples.com/spark/spark-case-when-otherwise-example/)
- [Collect() – Retrieve data from Spark RDD/DataFrame](https://sparkbyexamples.com/spark/spark-dataframe-collect/)
- [Spark – How to remove duplicate rows](https://sparkbyexamples.com/spark/spark-remove-duplicate-rows/)
- [How to Pivot and Unpivot a Spark DataFrame](https://sparkbyexamples.com/spark/how-to-pivot-table-and-unpivot-a-spark-dataframe/)
- [Spark SQL Data Types with Examples](https://sparkbyexamples.com/spark/spark-sql-dataframe-data-types/)
- [Spark SQL StructType & StructField with examples](https://sparkbyexamples.com/spark/spark-sql-structtype-on-dataframe/)
- [Spark schema – explained with examples](https://sparkbyexamples.com/spark/spark-schema-explained-with-examples/)
- [Spark Groupby Example with DataFrame](https://sparkbyexamples.com/spark/using-groupby-on-dataframe/)
- [Spark – How to Sort DataFrame column explained](https://sparkbyexamples.com/spark/spark-how-to-sort-dataframe-column-explained/)
- [Spark SQL Join Types with examples](https://sparkbyexamples.com/spark/spark-sql-dataframe-join/)
- [Spark DataFrame Union and UnionAll](https://sparkbyexamples.com/spark/spark-dataframe-union-and-union-all/)
- [Spark map vs mapPartitions transformation](https://sparkbyexamples.com/spark/spark-map-vs-mappartitions-transformation/)
- [Spark foreachPartition vs foreach | what to use?](https://sparkbyexamples.com/spark/spark-foreachpartition-vs-foreach-explained/)
- [Spark DataFrame Cache and Persist Explained](https://sparkbyexamples.com/spark/spark-dataframe-cache-and-persist-explained/)
- [Spark SQL UDF (User Defined Functions)](https://sparkbyexamples.com/spark/spark-sql-udf/)
- [Spark SQL DataFrame Array (ArrayType) Column](https://sparkbyexamples.com/spark/spark-array-arraytype-dataframe-column/)
- [Working with Spark DataFrame Map (MapType) column](https://sparkbyexamples.com/spark/spark-dataframe-map-maptype-column/)
- [Spark SQL – Flatten Nested Struct column](https://sparkbyexamples.com/spark/spark-flatten-nested-struct-column/)
- [Spark – Flatten nested array to single array column](https://sparkbyexamples.com/spark/spark-flatten-nested-array-column-to-single-column/)
- [Spark explode array and map columns to rows](https://sparkbyexamples.com/spark/explode-spark-array-and-map-dataframe-column/)

## Spark SQL Functions

- [Spark SQL String Functions Explained](https://sparkbyexamples.com/spark/usage-of-spark-sql-string-functions/)
- [Spark SQL Date and Time Functions](https://sparkbyexamples.com/spark/spark-sql-date-and-time-functions/)
- [Spark SQL Array functions – complete list](https://sparkbyexamples.com/spark/spark-sql-array-functions/)
- [Spark SQL Map functions – complete list](https://sparkbyexamples.com/spark/spark-sql-map-functions/)
- [Spark SQL Sort functions – complete list](https://sparkbyexamples.com/spark/spark-sql-sort-functions/)
- [Spark SQL Aggregate Functions](https://sparkbyexamples.com/spark/spark-sql-aggregate-functions/)
- [Spark Window Functions with Examples](https://sparkbyexamples.com/spark/spark-sql-window-functions/)

## Spark Data Source API

- [Spark Read CSV file into DataFrame](https://sparkbyexamples.com/spark/spark-read-csv-file-into-dataframe/)
- [Spark Read and Write JSON file into DataFrame](https://sparkbyexamples.com/spark/spark-read-and-write-json-file/)
- [Spark Read and Write Apache Parquet](https://sparkbyexamples.com/spark/spark-read-write-dataframe-parquet-example/)
- [Spark Read XML file using Databricks API](https://sparkbyexamples.com/spark/spark-read-write-xml/)
- [Read & Write Avro files using Spark DataFrame](https://sparkbyexamples.com/spark/read-write-avro-file-spark-dataframe/)
- [Using Avro Data Files From Spark SQL 2.3.x or earlier](https://sparkbyexamples.com/spark/using-avro-data-files-from-spark-sql-2-3-x/)
- [Spark Read from & Write to HBase table | Example](https://sparkbyexamples.com/spark/spark-read-write-using-hbase-spark-connector/)
- [Create Spark DataFrame from HBase using Hortonworks](https://sparkbyexamples.com/spark/create-spark-dataframe-from-hbase-using-hortonworks/)
- [Spark Read ORC file into DataFrame](https://sparkbyexamples.com/spark/spark-read-orc-file-into-dataframe/)
- [Spark 3.0 Read Binary File into DataFrame](https://sparkbyexamples.com/spark/spark-read-binary-file-into-dataframe/)

## Spark Streaming & Kafka

- [Spark Streaming – Different Output modes explained](https://sparkbyexamples.com/spark/spark-streaming-outputmode/)
- [Spark Streaming files from a directory](https://sparkbyexamples.com/spark/spark-streaming-read-json-files-from-directory/)
- [Spark Streaming – Reading data from TCP Socket](https://sparkbyexamples.com/spark/spark-streaming-from-tcp-socket/)
- [Spark Streaming with Kafka Example](https://sparkbyexamples.com/spark/spark-streaming-with-kafka/)
- [Spark Streaming – Kafka messages in Avro format](https://sparkbyexamples.com/spark/spark-streaming-consume-and-produce-kafka-messages-in-avro-format/)
- [Spark SQL Batch Processing – Produce and Consume Apache Kafka Topic](https://sparkbyexamples.com/spark/spark-batch-processing-produce-consume-kafka-topic/)
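## Quick Taste

As a taste of the RDD topics linked above (`parallelize`, transformations, actions), here is a minimal sketch, not taken from this repository, assuming a local `SparkSession` with the `spark-sql` dependency on the classpath:

```scala
import org.apache.spark.sql.SparkSession

object RddSketch extends App {
  // local[*] runs Spark on all local cores; app name is arbitrary.
  val spark = SparkSession.builder().master("local[*]").appName("RddSketch").getOrCreate()
  val sc = spark.sparkContext

  // Create an RDD from a local collection (see "Create a Spark RDD using Parallelize").
  val rdd = sc.parallelize(Seq(1, 2, 3, 4, 5))

  // A transformation (lazy) followed by an action (triggers execution).
  val doubled = rdd.map(_ * 2)
  println(doubled.reduce(_ + _)) // sum of the doubled values

  spark.stop()
}
```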
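The DataFrame articles in the Spark SQL Tutorial section follow the same pattern: build a small DataFrame, derive columns, and aggregate. A minimal sketch (the sample rows and column names here are hypothetical placeholders):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DataFrameSketch extends App {
  val spark = SparkSession.builder().master("local[*]").appName("DataFrameSketch").getOrCreate()
  import spark.implicits._ // enables toDF on local Seqs

  // Hypothetical sample data, similar in shape to the linked articles' datasets.
  val df = Seq(("James", "Sales", 3000), ("Anna", "Sales", 4600), ("Robert", "IT", 4100))
    .toDF("name", "dept", "salary")

  // withColumn adds a derived column; groupBy + agg summarizes per department
  // (see "Spark DataFrame withColumn" and "Spark Groupby Example with DataFrame").
  df.withColumn("bonus", col("salary") * 0.1)
    .groupBy("dept")
    .agg(sum("salary").as("total_salary"))
    .show()

  spark.stop()
}
```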
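For the Data Source API section, reading and writing files reduces to `spark.read` / `df.write` with format-specific options. A sketch assuming a hypothetical `people.csv` input path:

```scala
import org.apache.spark.sql.SparkSession

object CsvSketch extends App {
  val spark = SparkSession.builder().master("local[*]").appName("CsvSketch").getOrCreate()

  // "people.csv" is a placeholder path; header/inferSchema are the common
  // options shown in "Spark Read CSV file into DataFrame".
  val df = spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("people.csv")

  df.printSchema()

  // Write the same data back out as Parquet (see "Spark Read and Write Apache Parquet").
  df.write.mode("overwrite").parquet("people.parquet")

  spark.stop()
}
```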
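The Spark Streaming & Kafka articles build on Structured Streaming's `readStream`/`writeStream` API. A sketch of the TCP-socket word count (start a server with `nc -lk 9999` first); host, port, and output mode here are illustrative choices:

```scala
import org.apache.spark.sql.SparkSession

object SocketStreamSketch extends App {
  val spark = SparkSession.builder().master("local[*]").appName("SocketStreamSketch").getOrCreate()
  import spark.implicits._

  // Read lines from a TCP socket (see "Spark Streaming – Reading data from TCP Socket").
  val lines = spark.readStream
    .format("socket")
    .option("host", "localhost")
    .option("port", 9999)
    .load()

  // Split lines into words and count occurrences of each word.
  val counts = lines.as[String]
    .flatMap(_.split(" "))
    .groupBy("value")
    .count()

  // "complete" mode re-emits the full aggregate on every trigger
  // (see "Spark Streaming – Different Output modes explained").
  val query = counts.writeStream
    .outputMode("complete")
    .format("console")
    .start()

  query.awaitTermination()
}
```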