Location Deduper
Table of contents
- What does deduping do?
- How can you avoid or block deduping?
- Avoiding deduping
- Blocking deduping
- How to split a deduped set of POIs?
What does deduping do?
RouteYou groups all information about the same location on a single location page. For example, many people and organizations have a story about the Grand Place in Brussels.
However, people create several POIs (points of interest) for the same place and do not position them on exactly the same spot. The deduping software, DEDUP, therefore applies several logical rules to decide whether or not to merge several POIs into the same location.
Aspects taken into account
- Distance
- Semantic distance of the types
- That is: are the POI types similar or dissimilar?
- e.g. the semantic distance between a church and a cathedral is smaller than the semantic distance between a church and a pub
- Name
- Description
- Size of the object in the real world (e.g. a cathedral is generally larger than 100 m, while a chapel is smaller than 10 m). This is taken into account in combination with the distance between the two objects.
- ...
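One way to picture the semantic distance between POI types is as the number of steps between two types in a type tree. The sketch below is only an illustration: the taxonomy in `PARENT` and the function names are hypothetical and are not RouteYou's actual type tree or DEDUP's real method.

```python
# Hypothetical POI-type taxonomy (child -> parent); NOT RouteYou's real type tree.
PARENT = {
    "place": None,
    "religious building": "place",
    "food and drink": "place",
    "cathedral": "religious building",
    "church": "religious building",
    "chapel": "religious building",
    "pub": "food and drink",
}


def ancestors(poi_type):
    """Return the chain from poi_type up to the root, inclusive."""
    chain = []
    while poi_type is not None:
        chain.append(poi_type)
        poi_type = PARENT[poi_type]
    return chain


def semantic_distance(a, b):
    """Count the tree edges between two types via their closest shared ancestor."""
    chain_a = ancestors(a)
    chain_b = ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return steps_a + chain_b.index(node)
    raise ValueError("types do not share a root")


print(semantic_distance("church", "cathedral"))  # 2
print(semantic_distance("church", "pub"))        # 4
```

In this toy tree, a church and a cathedral share a direct parent, so they end up closer together than a church and a pub.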
An example
Case 1 and case 2 each describe a pair of POIs.
- Case 1:
- POI 1:
- Name: St. Baafs Cathedral
- Type: Cathedral
- POI 2:
- Name: Cathedral Sint Bavo
- Type: Basilica
- POI 1 - POI 2:
- Real-world distance between the two POIs: 50 m
- Semantic distance between Basilica-Cathedral (Small)
- Case 2:
- POI 1:
- Name: St. Baafs
- Type: Cathedral
- POI 2:
- Name: St. Baafs
- Type: Pub
- POI 1 - POI 2:
- Real-world distance between the two POIs: 45 m
- Semantic distance between Pub-Cathedral (Big)
Conclusion: In case 1, you want the two POIs to be merged (deduped). In case 2, you do not want them to be deduped, because we probably think of the two places as very different.
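The reasoning behind the two cases can be sketched as a small decision rule. This is a simplified illustration under stated assumptions, not the actual DEDUP logic: the `PoiPair` class, the `should_merge` function, and the 100 m footprint used for a cathedral are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoiPair:
    distance_m: float        # real-world distance between the two POIs
    semantic_distance: str   # "null", "small" or "big"
    largest_size_m: float    # assumed typical size of the larger object


def should_merge(pair: PoiPair) -> bool:
    """Toy rule: merge only if the types are close AND the POIs lie within
    the footprint of the larger object. Both conditions are assumptions."""
    if pair.semantic_distance == "big":
        return False
    return pair.distance_m <= pair.largest_size_m


# Case 1: cathedral and basilica, 50 m apart, small semantic distance.
case_1 = PoiPair(distance_m=50, semantic_distance="small", largest_size_m=100)
# Case 2: cathedral and pub, 45 m apart, big semantic distance.
case_2 = PoiPair(distance_m=45, semantic_distance="big", largest_size_m=100)

print(should_merge(case_1))  # True
print(should_merge(case_2))  # False
```

Note that case 2 is rejected even though its POIs are closer together than those in case 1: the type difference outweighs the short distance.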
A good check
A good way to decide whether you should have one place or two (deduping, or not deduping and keeping the split) is the following test:
"Let's meet at POI 1 and POI 2. Would you find each other, or would you wait and not find each other?"
In the example above, we would not find each other in case 2. In case 1 we would: although it is a big place, we would keep walking around until we meet.
How can you avoid or block deduping?
Why would you like to avoid deduping?
See the example above and the one below.
- Case 1:
- POI 1:
- Name: Golden Gate Bridge
- Type: Bridge
- POI 2:
- Name: Golden Gate
- Type: Bridge
- POI 1 - POI 2:
- Real world distance between the two POIs: 250 m
- Semantic distance between Bridge-Bridge (Null)
- Case 2:
- POI 1:
- Name: Foot-Bridge
- Type: Bridge
- POI 2:
- Name: Foot-Bridge
- Type: Bridge
- POI 1 - POI 2:
- Real world distance between the two POIs: 25 m, but they give access to a different area or road leading to a different area
- Semantic distance between Bridge-Bridge (Null)
In case 2, you probably want to keep the POIs at separate locations.
Avoiding deduping
You can help the DEDUP software NOT to dedupe by:
- giving Bridge 1 and Bridge 2 different names (be creative with the title, e.g. add the street name if the streets differ)
- If you call both of them Bridge, the software will merge them because there is no other information telling it NOT to merge them.
- providing the correct POI type
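To see why distinct names help, consider a rough measure of how similar two titles look. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in; how DEDUP actually compares names is not documented here, and the street names "Mill Street" and "Church Lane" are made up for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Two POIs with the same generic title look identical by name.
identical = name_similarity("Foot-Bridge", "Foot-Bridge")

# Adding the (hypothetical) street name to each title makes them differ.
distinct = name_similarity("Foot-Bridge Mill Street", "Foot-Bridge Church Lane")

print(identical)             # 1.0
print(distinct < identical)  # True
```

The more the two titles differ, the less a name-based rule has to suggest that they describe the same place.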
Blocking deduping
In some cases you want to make absolutely sure that POIs are NOT merged (deduped). This is the backdoor trick, but only RouteYou can do this for you.
If a POI is connected to the group Dedupe-blocker (https://www.routeyou.com/nl/group/view/71595/dedupe-blocker), it will not be deduped.
How to split a deduped set of POIs?
For the moment, only RouteYou can do this.