[Divisions] Mismatch between OSM's and Overture's division feature type #482

@danabauer

Description

Discussed in https://github.com/orgs/OvertureMaps/discussions/252

Originally posted by bocalml November 18, 2024

Category Feedback

Querying the division feature type with the overturemaps package for a given bbox returns the following data:

import geopandas as gpd
import overturemaps

# bbox is given as (xmin, ymin, xmax, ymax), i.e. (west, south, east, north)
batch = overturemaps.record_batch_reader(
    overture_type="division", bbox=(1.0, 11.0, 2.0, 12.0)
)
gdf = gpd.GeoDataFrame.from_arrow(batch)
places_of_interest = ["isolated_dwelling", "village", "hamlet", "town", "city"]
# `local_type` column is of list[tuple[str, str]] type; only the second element of the first tuple is of interest
gdf = gdf.loc[gdf["local_type"].apply(lambda x: x[0][1] in places_of_interest)]
gdf["local_type"].value_counts()

>>> local_type
>>> [(en, village)]    100
>>> [(en, hamlet)]      29
>>> [(en, town)]         1
>>> Name: count, dtype: int64
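
The filter above indexes `x[0][1]` directly, so it assumes every row carries a non-empty `local_type`. A minimal sketch of a more defensive extraction follows, on the assumption (not verified for this bbox) that the column may contain `None` or empty lists. The helper names `first_local_type` and `is_place_of_interest` are hypothetical and not part of the overturemaps package.

```python
from typing import Optional, Sequence, Tuple

PLACES_OF_INTEREST = {"isolated_dwelling", "village", "hamlet", "town", "city"}


def first_local_type(value: Optional[Sequence[Tuple[str, str]]]) -> Optional[str]:
    """Return the type string of the first (language, type) pair, or None.

    Assumes `value` is either None or a sequence of 2-tuples, as described
    in the comment on the `local_type` column above.
    """
    if value is None or len(value) == 0:
        return None
    return value[0][1]


def is_place_of_interest(value: Optional[Sequence[Tuple[str, str]]]) -> bool:
    """True when the first local type is one of the place types of interest."""
    return first_local_type(value) in PLACES_OF_INTEREST
```

With these helpers the filter could be written as `gdf.loc[gdf["local_type"].apply(is_place_of_interest)]`, which skips rows without a usable `local_type` instead of raising.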

In contrast, using Overpass Turbo to retrieve the data (which I assume is the source Overture's division feature type pulls from) with

[out:json][timeout:25];
(
  node["place"](11.0, 1.0, 12.0, 2.0);
);
out body;
>;
out skel qt;

and processing this data returns a larger number of places:

gdf_osm = gpd.read_file("osm_benin_bbox.geojson")
gdf_osm = gdf_osm.loc[gdf_osm["place"].isin(places_of_interest)]
gdf_osm["place"].value_counts()

>>> place
>>> village    103
>>> hamlet      49
>>> town         1
>>> Name: count, dtype: int64
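
To make the gap explicit, the two sets of counts reported above can be compared per place type. This is only a sketch using the numbers from this bbox; the helper name `count_gap` is hypothetical, and the comparison says nothing about which individual features are missing.

```python
from collections import Counter

# Counts copied from the two outputs above (bbox 1.0, 11.0, 2.0, 12.0).
overture_counts = Counter({"village": 100, "hamlet": 29, "town": 1})
osm_counts = Counter({"village": 103, "hamlet": 49, "town": 1})


def count_gap(osm, overture):
    """Return OSM count minus Overture count for every type seen in either source."""
    keys = sorted(set(osm) | set(overture))
    return {key: osm.get(key, 0) - overture.get(key, 0) for key in keys}


gap = count_gap(osm_counts, overture_counts)
print(gap)  # {'hamlet': 20, 'town': 0, 'village': 3}
```

For this bbox, OSM has 20 more hamlets and 3 more villages than Overture, while the town counts agree.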

A couple of questions:

  • Why is there a mismatch between the OSM and Overture data? My understanding was that Overture uses the OSM and geoBoundaries datasets for this feature type.
  • Is some filtering or preprocessing applied to the OSM data on your side that would explain this? I couldn't find an explanation in the docs.
  • Will Overture ever match the OSM data exactly, regardless of other datasets that may be added to it? (Clearly it won't if filtering or preprocessing is indeed happening on your side.)
  • I also noticed that the isolated_dwelling type barely exists in Overture's division feature type (there are ~1.1k rows), while the number is significantly larger in OSM.
  • (I couldn't find a Division category, so I opened this discussion under Base.)

Dependency with other categories, if any.

No response