Feature request #10947
Use Filter() instead of STIntersects() for better performance in MSSQL
|Pull Request or Patch supplied:||No||Resolution:|
|Easy fix?:||No||Copied to github as #:||19300|
I've noticed that "STIntersects()" is used to query the currently visible elements - I think you should consider using "Filter()" to get (much) better performance.
[mssql] Use Filter instead of STIntersects to improve query performance
...and refine validity test from 57dc3c7
Using Filter is more performant since it does a bounding box only
check when an appropriate spatial index is available. This matches
the behavior with other providers, where the provider filter only
does a bounding box check and callers must perform the actual
intersection check if required.
While Filter also avoids SQL server closing the iterator upon
encountering invalid geometries, we can't rely on this because SQL
server will transparently fall back to STIntersects if it
decides there's no suitable spatial indexes available, and then
throw an exception on invalid geometries. So we still require
the STISValid check when using Filter.
The extent calculation from 57dc3c7 has been refined to
avoid the very expensive STMakeValid call. Instead we use a (ugly!)
workaround to skip invalid geometries using STIsValid only
inside the min/max x/y aggregates. Note that we can't just
dump a WHERE STIsValid clause in to the statement because SQL
server still throws the exception. Gotta love it!
#3 Updated by Markus Woehling almost 6 years ago
I've done some performance tests with interesting results ;-)
I used "SELECT <id> FROM <table> WHERE ..." with four different conditions:
1) WHERE <geo>.STIntersects(geometry::Parse(<wkt>)) = 1
2) WHERE <geo>.STIntersects(@geometryParameter) = 1
3) WHERE <geo>.Filter(geometry::Parse(<wkt>)) = 1
4) WHERE <geo>.Filter(@geometryParameter) = 1
I used four different tables with 100,000 to 1,000,000 rows (three tables with points, one with linestrings).
The tables have quite different "fragmentation", because some of them have rows with "wrong" coordinates (e.g. points at with x=0, y=0).
The results on SQL 2008 are;
Case 4) always gives us the best performance, case 1) always the worst (factor 3 to 10!).
Case 2) and 3) are in the middle, sometimes 2) is better, sometimes 3).
The results on SQL 2012 quite the same, but in some cases 1) is even 100 times slower than 4), because it's 10 times slower than on SQL 2008. Maybe this depends on my hardware, SQL 2008 and SQL 2012 are running on different machines.
So I think you should always use Filter() and you should try to use a parameter for the geometry.
I also tried to pass the wkt-string as parameter [WHERE <geo>.STIntersects(geometry::Parse(@wkt)) = 1]: This makes no difference.
I used .NET for the tests, so this is the SQL statement for case 4):
declare @p3 sys.geometry
exec sp_executesql N'SELECT <id> FROM <table> WHERE <geo>.Filter(@box) = 1',N'@box [geometry]',@[email protected]
Hope that helps.
#6 Updated by Markus Woehling over 3 years ago
I think 4) is more complex to implement than 3).
Because 3) would also improve performance a lot, I want to propose that implementation of 3) would be enough for now.
There are even some corner cases where the SQL query optimizer does NOT use a spatial index if STIntersects() is used instead of Filter().
In this cases 3) would help a lot (not using a spatial index degrades performance massively, of course).