The best thing about the Million Song Dataset is that you can download a 1GB (about 10,000 songs), 10GB, 50GB, or roughly 300GB version to your Hadoop cluster and run whatever tests you want. I love using it and have learned a lot from working with this dataset.
To start, you can download the portion of the dataset for any single letter from A to Z; these portions range from about 1GB to 20GB each.
The following blog post shows how to download the 1GB dataset and run Pig scripts on it.
To learn more, consider joining the Big Data Hadoop Course.