Since you mentioned that you are a beginner in this area, start by going through the existing modules. You are trying to implement the K-means algorithm, so first learn the mathematical concept behind it: the algorithm repeatedly assigns each point to its nearest centroid and then moves each centroid to the mean of the points assigned to it. Once the concept is clear, analyze the code of an existing implementation. These steps will help you write your own K-means module in Python or any other language.
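To make the concept concrete, here is a minimal sketch of the standard K-means loop (Lloyd's algorithm) in plain Python. It is only an illustration for learning, not the code of any existing module: the function names `kmeans` and `squared_distance`, the `seed` and `max_iters` parameters, and the sample points are all my own assumptions.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster `points` (a list of tuples) into `k` groups.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid that points[i] belongs to.
    """
    rng = random.Random(seed)
    # Assumption: initialise centroids by picking k distinct input points.
    centroids = rng.sample(points, k)
    assignments = None

    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop once no point changes cluster.
        if new_assignments == assignments:
            break
        assignments = new_assignments

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )

    return centroids, assignments


if __name__ == "__main__":
    # Hypothetical sample data: two well-separated groups of 2-D points.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids)
    print(labels)
```

After you understand this loop, reading the source of an existing implementation should be much easier, because such implementations typically add smarter initialisation and performance optimisations on top of the same two steps.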
Hope this helps!