How to work with Matrix Multiplication in Apache Spark

0 votes
I am trying to perform matrix multiplication using Apache Spark and Java, where I need to create RDD that can represent matrix in Apache Spark. Can anyone help me with this?
Jul 31, 2019 in Apache Spark by Amrita

edited Jul 31, 2019 4,192 views

1 answer to this question.

0 votes

Hey,

You can follow this below solution for your above query:

  • IndexRowMatrix: Can be created directly from a RDD[IndexedRow] where IndexedRow consist of row index and org.apache.spark.mllib.linalg.Vector​

import org.apache.spark.mllib.linalg.{Vectors, Matrices}
import org.apache.spark.mllib.linalg.distributed.{IndexedRowMatrix,
  IndexedRow}

val rows =  sc.parallelize(Seq(
  (0L, Array(1.0, 0.0, 0.0)),
  (0L, Array(0.0, 1.0, 0.0)),
  (0L, Array(0.0, 0.0, 1.0)))
).map{case (i, xs) => IndexedRow(i, Vectors.dense(xs))}

val indexedRowMatrix = new IndexedRowMatrix(rows)

RowMatrix: Similar to IndexedRowMatrix but without meaningful row indices. Can be created directly from RDD[org.apache.spark.mllib.linalg.Vector]

import org.apache.spark.mllib.linalg.distributed.RowMatrix

val rowMatrix = new RowMatrix(rows.map(_.vector))     

BlockMatrix: Can be created from RDD[((Int, Int), Matrix)] where the first element of the tuple contains coordinates of the block and the second one is a local org.apache.spark.mllib.linalg.Matrix

val eye = Matrices.sparse(
  3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1, 1, 1))

val blocks = sc.parallelize(Seq(
   ((0, 0), eye), ((1, 1), eye), ((2, 2), eye)))

val blockMatrix = new BlockMatrix(blocks, 3, 3, 9, 9)

Hope it helps.

answered Jul 31, 2019 by Gitika
• 65,870 points

Related Questions In Apache Spark

+2 votes
14 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

answered Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar 67,715 views
+1 vote
8 answers

How to print the contents of RDD in Apache Spark?

Save it to a text file: line.saveAsTextFile("alicia.txt") Print contains ...READ MORE

answered Dec 10, 2018 in Apache Spark by Akshay
42,434 views
0 votes
1 answer

How to work with multidimensional arrays in Scala?

Hi, Here is an example you can follow: scala> ...READ MORE

answered Jul 30, 2019 in Apache Spark by Gitika
• 65,870 points
243 views
0 votes
5 answers

How to change the spark Session configuration in Pyspark?

You aren't actually overwriting anything with this ...READ MORE

answered Dec 13, 2020 in Apache Spark by Gitika
• 65,870 points
53,764 views
+1 vote
2 answers
+1 vote
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
7,094 views
0 votes
1 answer

hadoop.mapred vs hadoop.mapreduce?

org.apache.hadoop.mapred is the Old API  org.apache.hadoop.mapreduce is the ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 11,380 points
1,139 views
+2 votes
11 answers

hadoop fs -put command?

Hi, You can create one directory in HDFS ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
51,322 views
0 votes
1 answer

How to check if a particular keyword exists in Apache Spark?

Hey, You can try this code to get ...READ MORE

answered Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
1,069 views
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, There are few methods provided by the ...READ MORE

answered Jul 22, 2019 in Apache Spark by Gitika
• 65,870 points
1,798 views