Hey,
Here are the steps you can follow to complete the task:
- First, load the file from HDFS as an RDD in Spark using the code below:
numsAsText = sc.textFile("hdfs://hadoop1.knowbigdata.com/user/student/sgiri/mynumbersfile.txt")
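- Second, define a function that converts each line to an integer and squares it:
def toSqInt(line):
    v = int(line)
    return v * v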
- Third, run the function on the Spark RDD as a transformation:
nums = numsAsText.map(toSqInt)
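- Fourth, run the summation as a reduce action. Note that reduce needs a function of two arguments, so Python's built-in sum will not work here:
total = nums.reduce(lambda a, b: a + b)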
- Fifth, compute the square root, for which we need to import math:
import math
print(math.sqrt(total))
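If you want to sanity-check the logic before running it on the cluster, the same map, reduce, and square-root steps can be sketched in plain Python. This is only a local stand-in, not Spark code: the helper root_of_sum_of_squares and the sample input are made up for illustration, and functools.reduce plays the role of the RDD's reduce.

```python
from functools import reduce
import math

def toSqInt(line):
    # Convert one line of text to an integer and square it.
    v = int(line)
    return v * v

def root_of_sum_of_squares(lines):
    # Hypothetical local helper mirroring the Spark steps:
    # map each line to its square, reduce by addition, take the square root.
    squares = map(toSqInt, lines)
    total = reduce(lambda a, b: a + b, squares)
    return math.sqrt(total)

# Made-up sample input standing in for the lines of the HDFS file.
sample_lines = ["3", "4"]
print(root_of_sum_of_squares(sample_lines))  # 5.0
```

Since 3*3 + 4*4 = 25, the result is 5.0. Once this looks right, the Spark version should behave the same way on the real file.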
Hope this helps.