How to work with Matrix Multiplication in Apache Spark?

I am trying to perform matrix multiplication using Apache Spark and Java, and I need to create an RDD that can represent a matrix. Can anyone help me with this?
Jul 31 in Apache Spark by Amrita

edited Jul 31 48 views

1 answer to this question.

Hey,

You can represent a matrix with one of the distributed matrix types in Spark MLlib, each of which is built from an RDD:

  • IndexedRowMatrix: Can be created directly from an RDD[IndexedRow], where each IndexedRow consists of a row index and an org.apache.spark.mllib.linalg.Vector.

import org.apache.spark.mllib.linalg.{Vectors, Matrices}
import org.apache.spark.mllib.linalg.distributed.{IndexedRowMatrix, IndexedRow}

// Each tuple is (row index, row values); every row needs its own index
val rows = sc.parallelize(Seq(
  (0L, Array(1.0, 0.0, 0.0)),
  (1L, Array(0.0, 1.0, 0.0)),
  (2L, Array(0.0, 0.0, 1.0)))
).map { case (i, xs) => IndexedRow(i, Vectors.dense(xs)) }

val indexedRowMatrix = new IndexedRowMatrix(rows)
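
Since the question is about multiplication, here is a minimal sketch of multiplying an IndexedRowMatrix by a small local matrix with its multiply method. The matrix values, the names a, b, product and result, and the local[*] master are assumptions made for illustration; in spark-shell the existing context is reused.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

// Reuses a running context if there is one; the local master is an assumption
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("indexed-row-multiply").setMaster("local[*]"))

// A is a 2x3 distributed matrix: [[1, 2, 3], [4, 5, 6]]
val a = new IndexedRowMatrix(sc.parallelize(Seq(
  IndexedRow(0L, Vectors.dense(1.0, 2.0, 3.0)),
  IndexedRow(1L, Vectors.dense(4.0, 5.0, 6.0)))))

// B is a 3x2 local matrix given in column-major order: [[1, 0], [0, 1], [1, 1]]
val b = Matrices.dense(3, 2, Array(1.0, 0.0, 1.0, 0.0, 1.0, 1.0))

// Multiplies every distributed row of A by the local matrix B
val product = a.multiply(b)

// Collect to the driver only because this example is tiny
val result = product.rows.collect()
  .sortBy(_.index)
  .map(_.vector.toArray.toSeq)
  .toSeq

// Rows of A * B: [4.0, 5.0] and [10.0, 11.0]
result.foreach(println)
```

This form only works when the right-hand matrix is small enough to be held locally. From Java the same classes are used, with a JavaRDD passed to the constructor through its rdd() method.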

  • RowMatrix: Similar to IndexedRowMatrix but without meaningful row indices. Can be created directly from an RDD[org.apache.spark.mllib.linalg.Vector].

import org.apache.spark.mllib.linalg.distributed.RowMatrix

val rowMatrix = new RowMatrix(rows.map(_.vector))

  • BlockMatrix: Can be created from an RDD[((Int, Int), Matrix)], where the first element of the tuple contains the coordinates of the block and the second is a local org.apache.spark.mllib.linalg.Matrix.

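import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// 3x3 identity block in compressed sparse column (CSC) format
val eye = Matrices.sparse(
  3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 1.0, 1.0))

// Three identity blocks placed on the diagonal
val blocks = sc.parallelize(Seq(
  ((0, 0), eye), ((1, 1), eye), ((2, 2), eye)))

// Each block is 3x3; the full matrix is 9x9
val blockMatrix = new BlockMatrix(blocks, 3, 3, 9, 9)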
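
When both operands are distributed, BlockMatrix is the type that offers a multiply method taking another BlockMatrix. The following is a rough sketch under assumed values: the 2x2 blocks m and n, the names a, b, product and local, and the local[*] master are all illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrices, Matrix}
import org.apache.spark.mllib.linalg.distributed.BlockMatrix

// Reuses a running context if there is one; the local master is an assumption
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("block-multiply").setMaster("local[*]"))

// Local 2x2 blocks, given in column-major order
val m: Matrix = Matrices.dense(2, 2, Array(1.0, 3.0, 2.0, 4.0)) // [[1, 2], [3, 4]]
val n: Matrix = Matrices.dense(2, 2, Array(5.0, 7.0, 6.0, 8.0)) // [[5, 6], [7, 8]]

// Two 4x4 block-diagonal matrices split into 2x2 blocks; missing blocks are zero
val a = new BlockMatrix(sc.parallelize(Seq(((0, 0), m), ((1, 1), m))), 2, 2, 4L, 4L)
val b = new BlockMatrix(sc.parallelize(Seq(((0, 0), n), ((1, 1), n))), 2, 2, 4L, 4L)

// Checks block sizes and indices before doing any work
a.validate()
b.validate()

// Distributed product; a's colsPerBlock must equal b's rowsPerBlock
val product = a.multiply(b)

// Bring the result to the driver only because it is tiny
val local = product.toLocalMatrix()
println(local)
```

An IndexedRowMatrix can be converted with toBlockMatrix(rowsPerBlock, colsPerBlock) before multiplying, which avoids building the blocks by hand.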

Hope it helps.

answered Jul 31 by Gitika
