You can do this by enabling WebHDFS. Follow the steps below.
Open the HDFS configuration file (hdfs-site.xml).
Set the dfs.webhdfs.enabled property to true.
Restart the HDFS daemons so the change takes effect.
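You can now access HDFS through the WebHDFS REST API.
For example, you can read your video file with the curl command. Every WebHDFS request needs an op parameter; op=OPEN reads a file, and -L makes curl follow the NameNode's redirect to the DataNode that holds the data.
$ curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN"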
A video file is just a sequence of bytes, so you can read the whole file into a variable as a byte array and process it from there.
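Reading the file into a variable could look roughly like the sketch below. It uses only the Python standard library and the WebHDFS op=OPEN call. The function names webhdfs_open_url and read_hdfs_file are made up for illustration, and the host, port, path and user are placeholders you would replace with your own cluster's values.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build the WebHDFS URL that reads a file (op=OPEN).

    Hypothetical helper: host, port and user depend on your cluster.
    """
    params = {"op": "OPEN"}
    if user:
        # user.name is the WebHDFS parameter for simple (non-Kerberos) auth.
        params["user.name"] = user
    # Percent-encode the path but keep the slashes between directories.
    path = quote(hdfs_path.lstrip("/"))
    return f"http://{host}:{port}/webhdfs/v1/{path}?{urlencode(params)}"


def read_hdfs_file(host, port, hdfs_path, user=None):
    """Return the whole file as a bytes object."""
    url = webhdfs_open_url(host, port, hdfs_path, user)
    # urlopen follows the NameNode's redirect to a DataNode automatically.
    with urlopen(url) as response:
        return response.read()
```

Assuming a NameNode reachable as "namenode" on HTTP port 9870 (the usual default in Hadoop 3; Hadoop 2 used 50070), video_bytes = read_hdfs_file("namenode", 9870, "/videos/sample.mp4") would hold the raw bytes of the file. This loads the entire video into memory, so for very large files you may prefer to read the response in chunks.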
Hope this helps!