How to work with Matrix Multiplication in Apache Spark?

I am trying to perform matrix multiplication using Apache Spark and Java, and I need to create an RDD that can represent a matrix in Apache Spark. Can anyone help me with this?
Jul 31, 2019 in Apache Spark by Amrita

edited Jul 31, 2019

1 answer to this question.


Hey,

You can represent a distributed matrix in Spark using one of the following MLlib types:

  • IndexedRowMatrix: can be created directly from an RDD[IndexedRow], where an IndexedRow consists of a row index and an org.apache.spark.mllib.linalg.Vector

import org.apache.spark.mllib.linalg.{Vectors, Matrices}
import org.apache.spark.mllib.linalg.distributed.{IndexedRowMatrix,
  IndexedRow}

val rows = sc.parallelize(Seq(
  (0L, Array(1.0, 0.0, 0.0)),
  (1L, Array(0.0, 1.0, 0.0)),
  (2L, Array(0.0, 0.0, 1.0)))
).map { case (i, xs) => IndexedRow(i, Vectors.dense(xs)) }

val indexedRowMatrix = new IndexedRowMatrix(rows)

  • RowMatrix: similar to IndexedRowMatrix, but without meaningful row indices; can be created directly from an RDD[org.apache.spark.mllib.linalg.Vector]

import org.apache.spark.mllib.linalg.distributed.RowMatrix

val rowMatrix = new RowMatrix(rows.map(_.vector))     
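A RowMatrix (and an IndexedRowMatrix) can be multiplied on the right by a local matrix via its multiply method. As a minimal sketch, reusing the rowMatrix defined above (the local matrix here is just an illustrative example; note that Matrices.dense takes its values in column-major order):

```scala
// A local 3 x 2 matrix, values given column-major.
val local = Matrices.dense(3, 2, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

// Distributed (3 x 3) times local (3 x 2) gives a distributed 3 x 2 RowMatrix.
val product = rowMatrix.multiply(local)
```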

  • BlockMatrix: can be created from an RDD[((Int, Int), Matrix)], where the first element of the tuple contains the coordinates of the block and the second is a local org.apache.spark.mllib.linalg.Matrix

val eye = Matrices.sparse(
  3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 1.0, 1.0))

import org.apache.spark.mllib.linalg.distributed.BlockMatrix

val blocks = sc.parallelize(Seq(
  ((0, 0), eye), ((1, 1), eye), ((2, 2), eye)))

val blockMatrix = new BlockMatrix(blocks, 3, 3, 9, 9)
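For multiplying two distributed matrices, BlockMatrix is the type to use, since it is the only distributed matrix in MLlib that provides a multiply method taking another distributed matrix; an IndexedRowMatrix can be converted first with toBlockMatrix. A minimal sketch, reusing the blockMatrix and indexedRowMatrix defined above (the block sizes are illustrative):

```scala
// Distributed-by-distributed multiplication; the column blocks of the
// left matrix must line up with the row blocks of the right matrix.
val squared = blockMatrix.multiply(blockMatrix)

// Converting an IndexedRowMatrix into 3 x 3 blocks before multiplying.
val asBlocks = indexedRowMatrix.toBlockMatrix(3, 3)
val gram = asBlocks.multiply(asBlocks.transpose)
```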

Hope it helps.

answered Jul 31, 2019 by Gitika
