Bug report #4604
WFS option "Only request features overlapping the current view extent" disappeared from QGIS master
|Category:||Web Services clients/WFS|
|Affected QGIS version:||master||Regression?:||No|
|Operating System:||Easy fix?:||No|
|Pull Request or Patch supplied:||Yes||Resolution:||fixed/implemented|
|Crashes QGIS or corrupts data:||No||Copied to github as #:||14509|
Still available in 1.7.2
#1 Updated by Bill Clay almost 8 years ago
Looks like this was intentionally disabled by developer "MD" in providers/qgswfssourceselect.cpp at lines 54 and 266-274. Perhaps this was done pending implementation of feature request #4280? For my work, it's better to have that capability (even with the flaw described at feature request #4280) than not have it at all.
#3 Updated by Bill Clay almost 8 years ago
- File incremental-wsf.patch added
The attached patch reimplements the disappeared capability described above, thanks to Marco Hugentobler, who supplied much of the approach and code.
- The former per-WFS-source "only request features overlapping ..." checkbox has become a per-layer "cache features" checkbox. The new checkbox is the inverse of the old option. It is checked by default; the entire WFS layer is fetched and cached at layer instantiation unless user UNchecks it.
- If "cache features" is NOT checked, WFS layers will be incrementally fetched to cover new areas exposed by pans and zooms. These fetches will cover the combined extent of the previous WFS layer extent and the new canvas extent. (This causes large WFS fetches on large diagonal pans.)
- The incremental fetches described above work with and without "on the fly" canvas reprojection.
- If "cache features" is NOT checked and the layer contains no features that intersect the canvas extent at layer instantiation, layer creation fails. Subsequent pans and zooms will NOT retrieve features, even if they intersect the newly-rendered areas. (This bug appears to independent of this patch.)
This patch has NOT BEEN TESTED with editable WFS layers; I do not have access to an updateable WFS repository.
I intend to address the issues in items 2 and 4 above ASAP, but the restored functionality may be useful to others even with these limitations.
#5 Updated by Bill Clay almost 8 years ago
- File incremental-wsf-rev2.patch added
The attached patch completely replaces the patch attached at update nr. 3. It contains the following improvements:
- correctly creates layers even if their initially retrieved extents contain no features (e.g., due to BBOX or FILTER)
- eliminates double fetches in some cases and removes unreliable logic for mismatchs between project and saved layer extents by refactoring initialization logic (DescribeFeatureType) out of feature retrieval logic (GetFeature)
- limits incremental growth of fetched extents when "Cache Features" is not selected
When "Cache Features" is NOT selected, the following flaws are present:
A. per OGC WFS spec, BBOX and FILTER are mutually exclusive; BBOX must be merged into the filter, server's capabilities permitting
B. when fetched extents are incrementally expanded, the previously fetched area should not be re-fetched
I hope to correct these flaws within year 2011.
This code has not been tested with editable WFS providers!
#6 Updated by Marco Hugentobler almost 8 years ago
- Affected QGIS version set to master
- Status changed from Open to Closed
- Crashes QGIS or corrupts data set to No
Patch is pushed to master (9957ee58f05892c3251216dc5dee421405a6a412). Thank you very much for these nice improvements!
The approach to read the feature type from describeFeatureType is better than the current one. However, in some situations, there is no feature type declared (e.g.Giovannis test layer for WFS-T). If the feature type is still WKBUnknown after describeFeatureType, it would be good to additionally get the type from the first fetched feature.
In future, it would be cool if there was a possibility to limit memory for really large layers (e.g. fetch layer in several parts if too big, keep already fetched features in cache)
#7 Updated by Paolo Cavallini almost 8 years ago
In a large, real use case, option #3 is most probably the best, as it will happen often that a layer is quite large, with many large features. Avoiding refetching (based on an ID/PK) will greatly increase the usability of the plugin.
Probably the local cache could be stored in SpatiaLite, to take advantage of spatial and alphanumeric indexing, etc.
#8 Updated by Bill Clay almost 8 years ago
- File incremental-wsf-rev3.patch added
The attached patch is intended to replace my 2nd patch attached at update 5 above.
Although this patch has no known bugs, it has two undesirable (but infrequent) interactions. Perhaps a slightly-improved version of the 2nd patch is what should be released, even though it will usually run slower. Please let me know which the team prefers.
The major enhancement of this patch over patch nr. 2:
- "Incremental get:" when "cache all features" is not selected, the WFS provider fetches features only for the NEW map canvas areas exposed by pans and zoom-outs, adding them to cached features for the areas previously fetched. This improves response time for areas with many features. A user who pans and zooms in a localized area rapidly builds a cache that avoids delays for new WFS feature retrieval.
- The map canvas extent is occasionally expanded or shifted slightly without the user (or this patch) explicitly changing it. This can trigger a WFS "incremental get," with resulting delay. This is not a bug, but it may surprise the user.
- Testing revealed one WFS provider (ArcGIS on Windows) that does not process filters with BBOXes as produced by this patch. Rather than returning an exception, the service returns ALL features contained in the layer. This causes slow response on each "incremental" fetch, but QGIS continues to function correctly. I have found no simple & efficient method to reliably detect WFS providers that have this problem (I omitted a moderately complex and inefficient one from this patch). Work-around: user deletes the layer and recreates it with the "cache all features" option.
Does anyone know a reliable method to identify WFS providers that do NOT support filter syntax v. 1.0.0, but that do not signal an
exception when they receive it? (Methods that do NOT work: "version=1.0.0" in the GetCapabilities URL; looking at version number in the WFS_Capabilities xsi:schemaLocation attribute -- it contains http://schemas.opengis.net/filter/1.0.0/filter.xsd!)
Responses to updates 6 & 7 above:
Marco, I encountered the layer creation problem with Giovanni's TinyOWS test server and added the fix you suggested.
Paolo, I'm willing to take a swing at a SpatiaLite local cache, but I'd need some guidance on what you and Marco have in mind. It presents some interesting design challenges that the team has probably already resolved in similar contexts. (I've also never used SpatiaLite, but I have used SQLite a bit in Python.)
#9 Updated by Marco Hugentobler almost 8 years ago
Normally, a WFS server lists BBOX as a spatial capability (see below). Is this also the case for ArcGIS WFS?
#10 Updated by Bill Clay almost 8 years ago
Marco, my nr. 3 patch uses its "incremental get" method only with WFS providers that advertise filtering capability with BBOX and logical operators. The ArcGIS system in question advertises all three in response to WFS GetCapabilities, both version 1.0.0 or 1.1.0. Unfortunately, it does not provide the service it advertises, at least with a version 1.0.0 GetFeatures. It ignores my <filter> and provides ALL the features in the queried layer.
How do you want to handle that? I need direction from someone who knows better than I QGIS architecture, practice, and plans regarding the 2nd bullet item in update 8 above.
Meanwhile, I have resolved the first "undesirable interaction" listed in update 8 with the patch attached to issue #4734, which does not touch WFS code.
#12 Updated by Jeremy Palmer almost 8 years ago
I Just came across this now. Very interesting indeed. This work is quite funny because I was just talking to colleagues before Xmas about options for improving the WFS provider, including the possibility of using a spatialite cache!
To give you some background we (LINZ) provide our data via WFS to the public using geoserver on http://data.linz.govt.nz. Many of our vector layers have millions of features in them (e.g http://data.linz.govt.nz/#/layer/772-nz-primary-parcels/webservices) and we would like users to be able to add these layers to a "fresh" QGIS canvas and then zoom in to their smaller area of interest - much like you can do with shapefiles or PostGIS layers by adding the layer, canceling the rendering, and then zooming in. This way a general user would not have to understand caching options to get going.
Not that I've looked at the full WFS code that much, but it seems that this might be possible to do if the WFS server has the necessary capabilities (as you have already worked on). Our early thinking before Xmas is quite close to what you have in patch 3, but has a few subtitle differences:
- Don't load the WFS data on the construction of the object
- QgsWFSProvider::nextFeature would be changed. This method will do 2 things, first check a cache to see if the current select extents are available. If the extent is not it will get the data incrementally using the QgsWFSData reader (which is setup to get the faetures in batches), put the data into the cache and finally returning false once the reader has run out features. Also once the reader has finished it will set the cache to flag the extents as complete.
- The cache would have an index of extents that had been visited and would dissolved extents that are superseded by larger covering extents. Ideally the cache is configurable to be memory based (map of features by fid with a spatial index, discarding features once memory limits are reached) or a spatialite database backend storing the features which are indexed by fid and geometry.
- The cache is invalided if the WFS filter changes.
BTW. I'm more that willing to help with testing or any of the WFS stuff you are working on. I've already testing out your WFS changes so far, and it's a big improvement with large WFS datasets!
#13 Updated by aperi2007 - almost 8 years ago
QgsWFSProvider::nextFeature would be changed. This method will do 2 things, first check
a cache to see if the current select extents are available. If the extent is not it will
get the data incrementally using the QgsWFSData reader (which is setup to get the
faetures in batches), put the data into the cache and finally returning false once the
reader has run out features. Also once the reader has finished it will set the cache to
flag the extents as complete.
Hi, i like give my opinion on the cache system for a WFS server.
I guess that unfortunately, use only the extent is not sufficient to understand if the WFS client has cached all the feature of that extent.
Infact there is some WFS Server (tinyows for example) that allow to limit the maximun features send.
So if an extent there is really 10.000 feature, but the WFS Server is set to send maximum 5000 features. You wrongly understand that the extend is fully received, instead there are again 5000 feature you don't know there are in that extent.
To avoid this , when the user do a zoom into a new extent that is inside the old extent, the WFS client should again send a request to the Server WFS.
So the server WFS send the feature available in this new extent. Again it send max 5.000 features.
The cache system should analyze every feature received to understand if it is a new feature or a just cached feature. To do this I guess the better method is use the MD5 of the feature: If a new feature received has the same MD5 of a just cached feature (this kind of search are simple in a spatialite) otherwise add to the cache this new feature received.
#14 Updated by Jeremy Palmer almost 8 years ago
Hmm this is bad news. I didn't realise that WFS servers could do that. Is this advertised in the server capabilities, or as a vendor extension?
If this really is the case, then caching is almost a waste of time because each extent refresh will cause a server request, causing server processing and full transfer of the GML payload.
If only tinyows does this then we could test for the server type and disable caching. I know the WFS ogr driver does server testing like this.
#16 Updated by Bill Clay almost 8 years ago
Sorry, all, for the long silence. I'm in the US on personal business and vacation Jan. 17 - Mar. 19, and will not have much time to work on this. However, my nr. 3 patch restores the desired "fetch map extent only" functionality despite its limitations (fairly minor, I believe). Since Marco Hugentobler is the WFS original developer, and surely knows this code MUCH better than I, I guess he's the one to decide whether it's worth releasing my patch as it currently exists.
The following is a list of the remaining open issues, as I understand them:
- There appear to be some WFS servers that advertise WFS v. 1.0 filter capabilities, but then ignore the filter and send the whole layer. (I have found only one such instance, but it may be common to ArcGIS of certain versions.)
- I understand TinyOWS can be configured NOT to provide its internal FeatureIDs; Andrea Peri's test server is the only example I know. Andrea suggests computing our own internal "FeatureID" by running an MD5 digest on the feature text. This would be expensive, but we could add logic to do it only when required. If Tiny OWS is the only package that can be configured NOT provide Feature IDs, I'm not sure it's worth the extra logic. Maybe we should simply require that a WFS service provide Feature IDs if one wants to use this feature.
- I opened issue #4734 for the "undesired canvas movement" problem (first "undesirable interaction" in my update nr. 8 above) and supplied a patch. Alexander Bruy closed that issue yesterday as "fixed." HOWEVER, looking at the text of his commit, it appears he fixed a similar issue on "map scale" edits, but the commit does not seem to include my patch for "map center" edits. Unless I've missed something (perhaps "map scale" and "map center" are in a union and logic that fixes one fixes both?), #4734 should be reopened and my patch (or an improved version like Alexander's "map scale" edit patch) actually applied to master. I have not (yet) contacted Alexander about this, nor have I looked deeper than the text of the commit posted yesterday, so this issue may really be fixed.
Between preparing for my trip and actually leaving, I have not had the opportunity to examine Jeremy Palmer's suggestions in any depth nor responded to Andrea Peri's concerns regarding Tiny OWS's omitting Feature IDs. I apologize to both, and hope to be able to respond more constructively within two weeks. I would like to continue to improve the WFS caching code once I return to Italy, but if someone is able to do so sooner or better, please don't hesitate!
#26 Updated by Nicolas Boisteault about 4 years ago
This feature is not present in 2.8.3 neither but is back in 2.10.1.
Though I looked in Fiddler and saw that there is no BBOX parameter added to the request with this option checked.
The strangest is that I have to cancel once to get the right query sent.
I'll test on master and report back.
#27 Updated by Nicolas Boisteault about 4 years ago
- File WFS_issue.jpg added
- To uncheck "Cache Feature", click add button and cancel first query. A second query with bbox is sent after.
- To check "Only request features overlapping the current view extent", click add button and cancel first query. A second query with bbox is sent after.
In both cases if I move the view extent there are no new request sent. I can't see any other difference between these two methods.
In screenshot attached I put in red two '&&'. Maybe BBOX parameter should be between those.