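I have a List<CustomPoint> points;
containing close to a million objects.
From this list I want to get the list of objects that occur exactly twice. What is the fastest way to do this? I would also be interested in a non-LINQ option, since I may have to do this in C++ as well.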
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }
}

public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint x, CustomPoint y)
    {
        return ((x.X == y.X) && (y.Y == x.Y));
    }

    public int GetHashCode(CustomPoint obj)
    {
        int hash = 0;
        hash ^= obj.X.GetHashCode();
        hash ^= obj.Y.GetHashCode();
        return hash;
    }
}
Based on this answer, I tried:
points.GroupBy(x => x).Where(x => x.Count() == 2).Select(x => x.Key).ToList();
But this gives zero objects in the new list.
Can someone guide me?
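To make your code work, you need to pass an instance of PointComparer as the second argument to GroupBy. Without it, GroupBy compares CustomPoint objects by reference, so every object ends up in its own group.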
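As a minimal sketch of that fix: the classes below mirror the ones in the question so the snippet compiles on its own, while the wrapper class DuplicateFinder and the method FindPointsOccurringTwice are names made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Same shape as the question's classes, repeated so the sketch is self-contained.
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }
    public CustomPoint(double x, double y) { X = x; Y = y; }
}

public class PointComparer : IEqualityComparer<CustomPoint>
{
    public bool Equals(CustomPoint a, CustomPoint b) => a.X == b.X && a.Y == b.Y;
    public int GetHashCode(CustomPoint p) => p.X.GetHashCode() ^ p.Y.GetHashCode();
}

// Hypothetical helper, not part of the original question.
public static class DuplicateFinder
{
    public static List<CustomPoint> FindPointsOccurringTwice(List<CustomPoint> points)
    {
        return points
            .GroupBy(p => p, new PointComparer()) // group by coordinates, not by reference
            .Where(g => g.Count() == 2)           // keep groups with exactly two members
            .Select(g => g.Key)                   // one representative point per group
            .ToList();
    }
}
```

Calling DuplicateFinder.FindPointsOccurringTwice(points) should then return one CustomPoint for each coordinate pair that appears exactly twice.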
You should implement Equals and GetHashCode in the class itself rather than in PointComparer.
This approach works for me:
// Pairs a point with the number of times it has been seen.
public class PointCount
{
    public CustomPoint Point { get; set; }
    public int Count { get; set; }
}

// Returns the points whose occurrence count equals the requested count.
private static IEnumerable<CustomPoint> GetPointsByCount(Dictionary<int, PointCount> pointcount, int count)
{
    return pointcount
        .Where(p => p.Value.Count == count)
        .Select(p => p.Value.Point);
}

// Counts occurrences in a single pass over the list.
// Note: entries are keyed by hash code, so two different points
// that happen to share a hash code are counted as the same point.
private static Dictionary<int, PointCount> GetPointCount(List<CustomPoint> pointList)
{
    var allPoints = new Dictionary<int, PointCount>();
    foreach (var point in pointList)
    {
        int hash = point.GetHashCode();
        if (allPoints.ContainsKey(hash))
        {
            allPoints[hash].Count++;
        }
        else
        {
            allPoints.Add(hash, new PointCount { Point = point, Count = 1 });
        }
    }
    return allPoints;
}
Call it like this:
static void Main(string[] args)
{
    List<CustomPoint> list1 = CreateCustomPointList();
    var doubles = GetPointsByCount(GetPointCount(list1), 2);
    Console.WriteLine("Doubles:");
    foreach (var point in doubles)
    {
        Console.WriteLine("X: {0}, Y: {1}", point.X, point.Y);
    }
}

private static List<CustomPoint> CreateCustomPointList()
{
    var result = new List<CustomPoint>();
    for (int i = 0; i < 5; i++)
    {
        for (int j = 0; j < 5; j++)
        {
            result.Add(new CustomPoint(i, j));
        }
    }
    result.Add(new CustomPoint(1, 3));
    result.Add(new CustomPoint(3, 3));
    result.Add(new CustomPoint(0, 2));
    return result;
}
The CustomPoint implementation:
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }

    public CustomPoint(double x, double y)
    {
        this.X = x;
        this.Y = y;
    }

    public override bool Equals(object obj)
    {
        var other = obj as CustomPoint;
        if (other == null)
        {
            // null or an object of another type is never equal to a point
            return false;
        }
        return ((this.X == other.X) && (this.Y == other.Y));
    }

    public override int GetHashCode()
    {
        // unchecked: the multiplication is meant to wrap around on overflow
        unchecked
        {
            int hash = 23;
            hash = hash * 31 + this.X.GetHashCode();
            hash = hash * 31 + this.Y.GetHashCode();
            return hash;
        }
    }
}
It prints:
Doubles:
X: 0, Y: 2
X: 1, Y: 3
X: 3, Y: 3
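As you can see in GetPointCount(), I create one dictionary entry for each unique CustomPoint (by hash). For a new point I insert a PointCount object that holds a reference to the CustomPoint, with Count starting at 1; every time the same point is encountered again, Count is incremented.
Finally, in GetPointsByCount I return the CustomPoints in the dictionary for which PointCount.Count == count, which in your case is 2.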
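A possible variation, sketched here rather than taken from the code above, is to key the dictionary by the point itself instead of by its hash code. Dictionary then uses both GetHashCode and Equals, so two different points that share a hash code are still counted separately. The class PointCounter and the method GetPointsOccurring are hypothetical names, and the sketch assumes the points are not modified while they are used as keys.

```csharp
using System.Collections.Generic;

// CustomPoint with value equality, repeated so the sketch is self-contained.
public class CustomPoint
{
    public double X { get; set; }
    public double Y { get; set; }
    public CustomPoint(double x, double y) { X = x; Y = y; }

    public override bool Equals(object obj)
    {
        var other = obj as CustomPoint;
        return other != null && X == other.X && Y == other.Y;
    }

    public override int GetHashCode()
    {
        unchecked // let the arithmetic wrap around on overflow
        {
            int hash = 23;
            hash = hash * 31 + X.GetHashCode();
            hash = hash * 31 + Y.GetHashCode();
            return hash;
        }
    }
}

// Hypothetical helper: counts points in one pass without LINQ.
public static class PointCounter
{
    public static List<CustomPoint> GetPointsOccurring(IEnumerable<CustomPoint> points, int count)
    {
        var counts = new Dictionary<CustomPoint, int>();
        foreach (var p in points)
        {
            // seen is 0 when the point is not in the dictionary yet
            counts.TryGetValue(p, out int seen);
            counts[p] = seen + 1;
        }

        var result = new List<CustomPoint>();
        foreach (var pair in counts)
        {
            if (pair.Value == count)
            {
                result.Add(pair.Key);
            }
        }
        return result;
    }
}
```

PointCounter.GetPointsOccurring(list1, 2) would play the role of GetPointsByCount(GetPointCount(list1), 2). Because it is a plain loop over a hash map, the same idea should carry over to C++ with std::unordered_map and a custom hash and equality functor.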
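Also note that I updated the GetHashCode() method, because yours returns the same value for the points (1,2) and (2,1). Feel free to restore your own hash method if you really need it. You will have to test the hash function, though, because it is hard to hash two numbers uniquely into one. How well that works depends on the range of numbers used, so you should implement a hash function that fits your own needs.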
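To make the difference between the two hash functions concrete, here is a small sketch that puts them side by side. HashDemo, XorHash and CombinedHash are illustrative names only; the bodies follow the question's XOR version and the 23/31 version from this answer.

```csharp
// Hypothetical helper for comparing the two hashing schemes.
public static class HashDemo
{
    // The question's scheme: XOR of the two coordinate hashes.
    public static int XorHash(double x, double y)
    {
        return x.GetHashCode() ^ y.GetHashCode();
    }

    // This answer's scheme: order-sensitive multiply-and-add.
    public static int CombinedHash(double x, double y)
    {
        unchecked // let the arithmetic wrap around on overflow
        {
            int hash = 23;
            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + y.GetHashCode();
            return hash;
        }
    }
}
```

Because XOR is commutative, XorHash(1, 2) always equals XorHash(2, 1), and XorHash(a, a) is 0 for every a, so all points on the diagonal collide. CombinedHash depends on the order of the coordinates and gives different values for (1,2) and (2,1), although it can still collide for other pairs.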