I need to sort an RDD. The sort has to be on multiple fields of my record, so I need a custom Comparator.
I see that I cannot use
sortBy, as it accepts only a single key. I chanced upon http://codingjunkie.net/spark-secondary-sort/ and so used
repartitionAndSortWithinPartitions to achieve the same result.
Why can't sortBy accept a custom Comparator and sort with it? Why do I have to repartition just in order to use a custom Comparator?
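
To make the requirement concrete, here is a minimal sketch of the kind of multi-field ordering I mean, written in plain Java without Spark. The `MyRecord` class and its `name` and `age` fields are hypothetical stand-ins for my actual record, and the ordering (name ascending, then age descending) is only an example.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MultiFieldSort {
    // Hypothetical record type used only for illustration.
    static final class MyRecord implements Serializable {
        final String name;
        final int age;

        MyRecord(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Compare by name ascending; break ties by age descending.
    static final Comparator<MyRecord> BY_NAME_THEN_AGE_DESC =
        Comparator.comparing((MyRecord r) -> r.name)
                  .thenComparing(Comparator.comparingInt((MyRecord r) -> r.age).reversed());

    // Return a sorted copy so the input list is left untouched.
    static List<MyRecord> sorted(List<MyRecord> input) {
        List<MyRecord> copy = new ArrayList<>(input);
        copy.sort(BY_NAME_THEN_AGE_DESC);
        return copy;
    }

    public static void main(String[] args) {
        List<MyRecord> records = Arrays.asList(
            new MyRecord("b", 1),
            new MyRecord("a", 20),
            new MyRecord("a", 30));
        for (MyRecord r : sorted(records)) {
            System.out.println(r.name + " " + r.age);
        }
    }
}
```

This is the ordering I would like to apply across the whole RDD, not just to a local list.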