Right now it seems like the hashCode for Products of Arity N is the same regardless of the actual class implementing it & that it doesn't distribute very well:
case class Point(x : Int, y : Int)
case class Point2(x : Int, y : Int)
(1,2).hashCode == Point(1,2).hashCode == Point2(1,2).hashCode == 3405
Welcome to Scala version 2.8.0.r20331-b20100101020206 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_15).
List((1,2), (2,1), Point(1,2), Point(2,1), Point2(1,2), Point2(2,1)).map(_.hashCode)
res0: List[Int] = List(3405, 3445, 3405, 3445, 3405, 3445)
In a little more depth: I think this behavior is much better than in 2.7.5 where the ordering of the product elements did not affect the hashCode. (Point(x,y) would clash with Point(y,x))
res0: List[Int] = List(1838716253, 1838716293, -1163646902, -1163646862, -270917823, -270917783)
Digging a little further on what this means for a distribution =>
Keep in mind that a big problem is that java's Integer.hashCode == the number being hashed, which is going to cause problems.
Using this code to get rudimentary collision distribution:
val tot = 500
val ideal = tot * tot * 1.0
val hashes = (for(x <- -(tot/2) until (tot/2); y <- -(tot/2) until (tot/2)) yield (x,y).hashCode ).toList
val hset = Set() ++ hashes
val hsSize = hset.size
val hlSize = hashes.removeDuplicates.size
println((ideal, hsSize, hsSize/ideal, hlSize, hlSize/ideal))
Here is the 2.7.5 result (takes a few minutes on a Core i7 920, removeDuplicates is terribly slow):
and Here is the 2.8 nightly result (surprisingly the same distribution, that takes only a few seconds):
Both cases are going to collide 90+% of the time (notice that changing tot to 100 reduces the collision % quite significantly so I suspect there is a quadratic relationship at play here)
Running with a tot =100 gives:
Still a 58% collide rate, but a lot better than the larger case.
Finally for the 3 Tuple cases with tot = 100
Perhaps I should look into a better case than just integers to see what the collision rates actually are like.