We Rebuilt Osclass Geolocation After Real Customer Feedback

Recently we received feedback from multiple customers about poor location data for the Czech Republic and Egypt, so I analyzed how we build country, region, and city datasets and improved the rules for easier installation and better everyday UX.

27. May 2026

Recently we received feedback from multiple customers who had a bad experience with location data for the Czech Republic and Egypt, so I decided to analyze the current logic for building location datasets and improve it to provide easier installation, a better overall user experience, and better data quality. This was not about adding more rows for marketing numbers. It was about making country, region, and city trees trustworthy when someone publishes a listing or filters by location.

Updated packages are available on osclass-classifieds.com/geolocation and in the Osclass backoffice under Market > Locations. It is the same dataset with refreshed generation rules and cleaner SQL output.

After the refresh, the package covers 252 countries, 3871 regions, and about 2.3 million cities, with coordinates for map and distance features where the source data provides them.

What customers actually reported

Support tickets were very specific. Czech sites showed regions sometimes in English, sometimes in Czech, and sometimes with half-transliterated labels that do not match official admin names. Egypt and other Arab markets had regions in Arabic, but cities still appeared as Latin transliterations in the dropdown. The UAE had cases where the native label picked the wrong script (a Russian or Greek variant instead of Arabic). Algeria had many empty native values. The Indonesia SQL was huge, much bigger than the United States file, which made import slow on shared hosting.

[Image: Locations management in backoffice]

These are not cosmetic issues. A wrong location tree breaks trust in search filters, hurts SEO location pages, and creates extra admin work when moderators fix listings manually.

Database columns we focused on

Location import still targets the standard Osclass tables: t_country, t_region, t_city. The main visible fields remain s_name, s_slug, b_active, plus phone and currency at the country level. Coordinates stay on city rows (d_coord_lat, d_coord_long).

Important change: we now use s_name_native on country, region, and city, but only for countries where a non-Latin script is part of normal local usage. If a native value does not exist or would be duplicate noise, the column is omitted from the generated SQL for that row. The primary s_name stays readable for the admin UI and Latin-friendly workflows.
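
The "omit the column when there is nothing useful to store" rule can be sketched in a few lines of Python. This is only an illustration, not the actual generator: the helper names (sql_literal, add_native, build_insert) are hypothetical, the quoting is deliberately simplified, and "duplicate noise" is assumed here to mean a native value equal to s_name ignoring case.

```python
from typing import Optional


def sql_literal(value) -> str:
    """Render a Python value as a SQL literal (simplified quoting, illustration only)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def add_native(row: dict, native: Optional[str]) -> dict:
    """Return a copy of row with s_name_native added only when it carries information."""
    out = dict(row)
    cleaned = (native or "").strip()
    # Assumption: a native value identical to s_name (ignoring case) is duplicate noise.
    if cleaned and cleaned.casefold() != str(row["s_name"]).strip().casefold():
        out["s_name_native"] = cleaned
    return out


def build_insert(table: str, row: dict) -> str:
    """Build one INSERT statement that lists only the columns present in row."""
    columns = ", ".join(row)
    values = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


city = {"s_name": "Cairo", "s_slug": "cairo", "b_active": 1}
print(build_insert("t_city", add_native(city, "القاهرة")))  # includes s_name_native
print(build_insert("t_city", add_native(city, None)))       # column omitted
```

Because the column list is built per row, a row without a native value simply produces a shorter statement instead of an empty string in s_name_native.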

Data quality improvements in practice

1) Name consolidation (Czechia and similar cases)

Before the refresh, the same country could mix admin labels from different naming layers. A user sees "Zlín" in the region list but expects "Zlínský kraj". That happens when a global region source wins over local admin naming.

We changed generation so region names are resolved per country from authoritative admin records first, with a fallback only when needed. Result: more consistent local naming in the region dropdown and fewer random English leftovers in an otherwise local tree.
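
As a rough sketch of that "admin records first, fallback second" order, assuming both sources are simple lookups keyed by some region identifier (the function name, the keys, and the sample data below are hypothetical, not the real generator's structures):

```python
from typing import Mapping, Optional


def resolve_region_name(
    region_key: str,
    admin_names: Mapping[str, str],
    global_names: Mapping[str, str],
) -> Optional[str]:
    """Return the local admin name if present, otherwise the global fallback."""
    for source in (admin_names, global_names):
        name = (source.get(region_key) or "").strip()
        if name:
            return name
    return None


admin = {"zlin": "Zlínský kraj"}
fallback = {"zlin": "Zlín", "praha": "Prague"}
print(resolve_region_name("zlin", admin, fallback))   # Zlínský kraj
print(resolve_region_name("praha", admin, fallback))  # Prague
```

The fallback is consulted only when the admin source has no usable value, which is what keeps a global label from overriding a local one.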

2) Native names with script-aware rules

Earlier logic treated the "first non-Latin alternate" as native. That fails in multilingual datasets. An example pattern we fixed: an Arab country row contains Arabic, Russian, Greek, and other exonyms in the same alternate list. Picking the first match creates a wrong native label.

New rule: for selected countries, native extraction prefers the expected script family for that country (Arabic for the Gulf/Levant/North Africa set, Cyrillic for Russia/Belarus/Ukraine/Bulgaria, CJK for China/Japan/Taiwan/Hong Kong/Macau, and so on). If the preferred script exists, we use it. If not, we keep the primary name only and avoid a fake native column.
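
One way to picture script-aware selection is with Python's standard unicodedata module, which exposes each character's Unicode name (for example "ARABIC LETTER DAL"). The sketch below is an assumption-heavy illustration: the PREFERRED_SCRIPTS mapping is a small hypothetical excerpt, and the "at least half of the letters" threshold is an arbitrary choice, not the rule used by the real generator.

```python
import unicodedata
from typing import Iterable, Optional, Tuple

# Hypothetical excerpt: ISO country code -> accepted Unicode character-name prefixes.
PREFERRED_SCRIPTS = {
    "AE": ("ARABIC",),
    "EG": ("ARABIC",),
    "RU": ("CYRILLIC",),
    "BY": ("CYRILLIC",),
    "CN": ("CJK",),
    "JP": ("CJK", "HIRAGANA", "KATAKANA"),
}


def script_share(text: str, prefixes: Tuple[str, ...]) -> float:
    """Fraction of letters in text whose Unicode name starts with one of the prefixes."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(1 for ch in letters if unicodedata.name(ch, "").startswith(prefixes))
    return hits / len(letters)


def pick_native(country_code: str, alternates: Iterable[str]) -> Optional[str]:
    """Return the first alternate written mostly in the country's preferred script."""
    prefixes = PREFERRED_SCRIPTS.get(country_code)
    if not prefixes:
        return None  # country not in the selected set: no native column
    for name in alternates:
        if script_share(name, prefixes) >= 0.5:  # assumed threshold
            return name
    return None  # preferred script missing: keep the primary name only


print(pick_native("AE", ["Дубай", "Ντουμπάι", "دبي"]))  # دبي
print(pick_native("AE", ["Дубай", "Ντουμπάι"]))         # None
```

With the old "first non-Latin alternate" behavior, the first call would have returned the Cyrillic exonym. Here it is skipped because it does not match the script expected for AE.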

3) Indonesia size optimization

Indonesia was the extreme outlier. Too many micro-places with near-zero population were included, so the SQL grew far beyond a practical install size.

We applied stronger population filtering for large countries and a country-specific threshold for Indonesia (minimum population 1 instead of the global 5). This keeps meaningful places while cutting noise. In the live package, Indonesia dropped from the multi-megabyte class to a practical size for normal hosting.

4) Cleaner SQL for import reliability

We fixed output cases where a country had no regions or cities but the SQL still printed empty statement tails (lines containing only ";"). Import tools and manual review are cleaner now. The batch refresh list is also synced with the current country registry, so deprecated country codes still get a refreshed country row instead of staying stale forever.
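
The empty-tail fix boils down to "never print a terminator for a block that has no statement". A minimal sketch, assuming the generator already holds the statements for each level as strings (render_block and render_country_file are hypothetical names, and the sample INSERT is abbreviated):

```python
from typing import Iterable


def render_block(statements: Iterable[str]) -> str:
    """Join non-empty statements, each terminated by exactly one semicolon."""
    cleaned = (s.strip().rstrip(";").strip() for s in statements)
    return "\n".join(s + ";" for s in cleaned if s)


def render_country_file(country_sql: str, region_sql: Iterable[str], city_sql: Iterable[str]) -> str:
    """Emit country, region, and city blocks in order, skipping blocks with no content."""
    blocks = [render_block(group) for group in ([country_sql], region_sql, city_sql)]
    return "\n\n".join(block for block in blocks if block) + "\n"


# A country with no regions or cities yields only its country row, with no ";" tail.
print(render_country_file("INSERT INTO t_country (s_name) VALUES ('Antarctica');", [], [";", ""]))
```

Filtering before joining means an empty region or city list contributes nothing to the file, not even a blank separator.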

5) Performance for maintainers

Regeneration on shared hosting was too slow for small countries because preprocessing ran globally on each run. We moved the heavy steps to selected-country scope and added a quiet batch mode. Practical effect: single-country refresh is usable again for support and iterative fixes.

Before vs after examples

Market	Level	Before (typical issue)	After (expected behavior)	Why it helps
Czechia	Region	Mixed EN/CZ admin labels (e.g. city-style name in region list)	Consistent local admin region naming	Users recognize official kraj names in filters
Egypt	Region	Arabic region, Latin-only city feel	Arabic native on region/city where available	Better local UX for Arabic interface sites
UAE	City	Native sometimes Russian/Greek exonym	Arabic native preferred for AE	Avoids wrong script in bilingual markets
Russia / Belarus	Country, region, city	Native column missing or inconsistent	Cyrillic s_name_native aligned on all levels	Consistent azbuka display in RU/BY/Ukraine-style sites
Algeria / Morocco	City	Empty native or wrong script pick	Arabic first, Tifinagh where relevant	More complete Maghreb localization
Indonesia	City volume	Very large SQL, slow import	Reduced low-value micro-places	Faster install, lighter DB, same practical coverage

Who benefits most

Local-language classifieds - Arabic, Cyrillic, CJK, Greek, Thai, Hebrew markets get meaningful native labels.
Admins on shared hosting - smaller country SQL where possible, especially large Asian countries.
SEO-focused sites - cleaner location structure reduces low-quality location pages built from messy free text.
Support teams - fewer "wrong region/city" tickets after import.

How to install updated locations safely

Download only the countries you serve from the Geolocation page or backoffice Market > Locations.
Import via Tools > Import SQL in oc-admin.
Verify the tree in International > Locations (country, region, city samples).
Test the listing form and location filter on the frontend in your active theme.

If you maintain custom cities manually, document them before a full re-import of the same country. Replacement imports can overwrite generated rows depending on your workflow.

What we validated after release

We ran structural SQL checks on randomly selected refreshed country packages: one country statement per file, valid statement order (country, then region, then city), no orphan semicolon blocks, balanced statement structure. The sample batch passed. This does not replace testing on your own server, but it caught real regressions from earlier generator output.

We also reviewed known edge cases from tickets: Czech region naming consistency, Arabic script preference, Cyrillic consistency for Russia/Belarus/Ukraine, and Indonesia file size. Those were direct drivers of this refresh.
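
If you want to run a similar sanity check on a downloaded package before importing it, a rough line-based version could look like the sketch below. It is not the checker we used: check_package is a hypothetical helper, it assumes each INSERT starts on its own line, it matches table names by suffix so prefixed names such as oc_t_city also count, and it does not attempt the "balanced statement structure" check.

```python
import re
from typing import List

TABLE_ORDER = ["t_country", "t_region", "t_city"]
INSERT_RE = re.compile(r"^\s*INSERT\s+INTO\s+`?(\w+)`?", re.IGNORECASE)


def check_package(sql: str) -> List[str]:
    """Return a list of structural problems found in one country SQL file."""
    problems = []
    country_statements = 0
    highest_rank = 0
    for number, line in enumerate(sql.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.strip(";") == "":
            problems.append(f"line {number}: orphan semicolon")
            continue
        match = INSERT_RE.match(stripped)
        if not match:
            continue  # continuation line of a multi-line statement
        table = match.group(1).lower()
        rank = next((i for i, name in enumerate(TABLE_ORDER) if table.endswith(name)), None)
        if rank is None:
            continue  # not one of the three location tables
        if rank == 0:
            country_statements += 1
        if rank < highest_rank:
            problems.append(f"line {number}: {table} appears after a later level")
        highest_rank = max(highest_rank, rank)
    if country_statements != 1:
        problems.append(f"expected 1 country statement, found {country_statements}")
    return problems
```

An empty result means the file passed these three checks only. It says nothing about the data itself, so the frontend test in your own theme is still needed.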

Practical recommendation from production experience

Import the geography you actually operate in. A full-world import is rarely needed and often hurts performance on small servers. A better tree in 5-20 countries beats a bloated global tree nobody uses.

For multilingual sites, keep theme/language packs aligned with your location strategy: the native column helps display, but UI translation and slug strategy still matter for SEO URLs.

Closing note

This refresh came from real customer pain, not from an abstract "data update day". We optimized generation rules, consolidated naming behavior, improved native script handling, reduced noisy city volume in Indonesia, and made SQL imports more predictable. If you still see a wrong location label after import, send the country code + level (country/region/city) + an example label. That helps us tune rules with evidence, not guesses.

I will keep monitoring feedback from Czech, Arab, and Cyrillic markets specifically, because those were the strongest signals that the previous logic was good enough for import, but not good enough for daily user experience 🙂

About the Author

Oliver Bk

My passion is building classifieds marketplaces, automating workflows, and turning messy data into useful products. From PHP, HTML, CSS, and JavaScript to Python, crawlers, imports, and SEO, I enjoy solving technical challenges and sharing lessons learned from real-world projects. Most ideas start with a problem, a cup of coffee, and a curiosity to see how far automation can go.
