Listings Crawler Plugin

Downloads 0-50
Version 1.0.1
Osclass version 8.0+
Last update Feb 2026
No. updates 2

The Listings Crawler Plugin is a powerful automation tool for Osclass that enables you to extract and store classified listings from external websites directly into your marketplace database or JSON files. Save countless hours of manual data entry by automatically crawling product listings, real estate ads, vehicle classifieds, or any structured content from the web and preparing it for import into your Osclass site.

This plugin requires a solid understanding of HTML and CSS to properly configure and build crawlers. Ensure that your server can access the target URL, as some sites may have security measures that prevent crawling.

Note: the plugin does not import listings into Osclass. Its purpose is to prepare data in JSON files or database storage for further processing.

Key Features

Intelligent Content Extraction

  • CSS Selector-Based Mapping: Precisely target and extract any data field using CSS selectors with visual page structure analysis
  • Dual Crawling Modes:
    • Follow Links Mode: Extract from search/listing pages, then crawl individual item detail pages separately
    • Direct Extraction Mode: Grab all data directly from a single page when complete details are available
  • Multi-Source Support: Use comma-separated selectors for fallback options (OR logic) to ensure data capture
  • Smart Image Handling: Automatically detects and extracts multiple images per listing (both src and data-src attributes)
  • URL Analysis Tool: Built-in analyzer to examine page structure, test URL accessibility, and identify CSS selectors before crawling
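
The image handling described above can be illustrated with a minimal Python sketch. It is not the plugin's code (the plugin itself runs on PHP); the names `ImageCollector` and `extract_images` are hypothetical, and preferring data-src over src is an assumption about how lazy-loaded images are usually best resolved.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect image URLs from <img> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # Assumption: lazy-loaded images keep the real URL in data-src,
        # so it is preferred over a possible placeholder in src.
        url = attr_map.get("data-src") or attr_map.get("src")
        if url and url not in self.urls:
            self.urls.append(url)


def extract_images(html):
    """Return all distinct image URLs found in the given HTML fragment."""
    collector = ImageCollector()
    collector.feed(html)
    return collector.urls
```

Calling `extract_images` on a listing's HTML returns the image URLs in document order, with duplicates dropped.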

Advanced Field Mapping (25+ Data Fields)

Extract and map comprehensive listing data including:

  • Core Information: Title, description, locale/language
  • Pricing Details: Price amount, currency with customizable delimiter parsing
  • Visual Content: Multiple images with thumbnail and full-size support
  • Contact Information: Name, email, phone with privacy visibility controls
  • Location Data: Country, region, city, city area, ZIP code, full address
  • Categorization: Automatic category assignment or default values
  • Temporal Data: Publish date, expiration date with flexible format support
  • Unique Identification: Automatic URL-based unique ID generation for deduplication
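
As an illustration of the delimiter-based price parsing mentioned above, here is a small Python sketch. The function name `split_price`, the default space delimiter, and the treatment of commas as thousands separators are all assumptions made for the example, not documented plugin behavior.

```python
from decimal import Decimal, InvalidOperation


def split_price(raw, delimiter=" "):
    """Split a combined string such as '1200.50 EUR' into (amount, currency).

    Assumption: commas are thousands separators and the first numeric
    token is the amount; the first non-numeric token is the currency.
    """
    amount, currency = None, None
    for part in raw.strip().split(delimiter):
        if not part:
            continue
        try:
            value = Decimal(part.replace(",", ""))
        except InvalidOperation:
            if currency is None:
                currency = part
        else:
            if amount is None:
                amount = value
    return amount, currency
```

For example, `split_price("USD|99", delimiter="|")` yields the amount 99 and the currency "USD", regardless of which side of the delimiter the amount appears on.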

Flexible Configuration Options

  • Static Default Values: Set fallback values for any field using simple double-quote syntax (e.g., "For Sale", "US", or a quoted email address)
  • Category & Location Intelligence: Automatically assign categories by name or ID, or use default mappings
  • Currency & Price Parsing: Built-in delimiter support for accurate price extraction from combined price/currency strings
  • Item Wrapper Filtering: Ignore specific page elements (like related items or user boxes) to avoid data conflicts
  • Relative Selector Evaluation: All selectors evaluated relative to item wrapper for precise targeting
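
The interplay of comma-separated fallback selectors (OR logic) and double-quoted static defaults can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions: `resolve_field` and the `lookup` callable (which maps one selector to its extracted text) are hypothetical names, and the plugin's real parsing rules may differ.

```python
def resolve_field(mapping, lookup):
    """Resolve one field mapping to a value.

    mapping: either a double-quoted static value such as '"For Sale"',
             or comma-separated selectors tried in order (OR logic).
    lookup:  hypothetical callable returning the text for one selector,
             or None/empty when nothing matched.
    """
    mapping = mapping.strip()
    # A value wrapped in double quotes is treated as a static default.
    if len(mapping) >= 2 and mapping.startswith('"') and mapping.endswith('"'):
        return mapping[1:-1]
    # Otherwise try each selector until one yields non-empty text.
    for selector in mapping.split(","):
        value = lookup(selector.strip())
        if value and value.strip():
            return value.strip()
    return None
```

With `lookup` backed by a dictionary of already extracted texts, `resolve_field("h1.title, .item-title", texts.get)` returns the first non-empty match, while `resolve_field('"US"', texts.get)` always returns "US".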

Automated Workflow & Scheduling

  • Cron Integration: Schedule automatic crawling runs via Osclass cron jobs for hands-free operation
  • Smart Deduplication: Update existing listings or skip duplicates based on unique URL-generated identifiers
  • Full Refresh Mode: Option to completely replace old data with fresh crawls on each run
  • Configurable Limits: Control items per run (recommended: up to 50) and total storage capacity (auto-cleanup of oldest items)
  • Update vs. Skip Logic: Choose whether to update existing items with new data or skip them entirely
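
The update-versus-skip choice and the storage limit can be pictured with a short Python sketch. It assumes items are kept in a dictionary keyed by unique ID, with insertion order standing in for age; the function `merge_items` and its parameters are hypothetical, and the plugin may determine "oldest" differently (for example by fetch date).

```python
def merge_items(stored, crawled, update_existing=True, max_items=1000):
    """Merge freshly crawled items into stored items.

    stored:  dict of unique_id -> item, oldest entries first (assumption).
    crawled: list of item dicts, each with a 'unique_id' key.
    """
    for item in crawled:
        uid = item["unique_id"]
        if uid in stored:
            if not update_existing:
                continue  # skip mode: leave the existing item untouched
            stored[uid] = {**stored[uid], **item}  # update mode
        else:
            stored[uid] = item
    # Retention: drop the oldest entries once the limit is exceeded.
    while len(stored) > max_items:
        del stored[next(iter(stored))]
    return stored
```

In update mode an existing item keeps its position in the dictionary, so in this sketch an updated item is still treated as old when the limit is enforced.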

Server-Friendly Operation

  • Adjustable Request Delays: Set pause intervals (100-2000ms recommended) between HTTP calls to respect target servers
  • Custom User Agents: Configure browser mimicking to improve compatibility and avoid blocks
  • Custom Headers: Add authentication or special headers via JSON configuration for protected content
  • Rate Limiting: Built-in protections to prevent server overload on both source and target systems
  • Server Accessibility Testing: Verify your server can access target URLs before setting up crawlers
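
A polite fetch loop with a custom user agent, JSON-configured headers, and a pause between requests might look like the Python sketch below. The helper names `build_request` and `fetch_all`, the 500 ms default delay, and the 15 second timeout are assumptions chosen for illustration; only standard-library calls are used.

```python
import json
import time
import urllib.request


def build_request(url, user_agent, headers_json="{}"):
    """Create a request with a custom user agent and extra JSON headers."""
    headers = json.loads(headers_json)  # e.g. '{"Authorization": "Bearer ..."}'
    headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers)


def fetch_all(urls, user_agent, headers_json="{}", delay_ms=500):
    """Fetch URLs one by one, pausing between calls to spare the target."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_ms / 1000)  # delay between HTTP calls
        request = build_request(url, user_agent, headers_json)
        with urllib.request.urlopen(request, timeout=15) as response:
            pages.append(response.read())
    return pages
```

Keeping request construction separate from the fetch loop makes the header logic easy to check without touching the network.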

Flexible Storage Options

  • Database Storage: Store crawled items in dedicated database table (t_crw_item) for structured access
  • JSON File Storage: Alternative file-based storage in oc-content/uploads/crawler for portability
  • API Access: Secure API key system for programmatic data extraction and integration
  • Retention Management: Automatic cleanup of oldest items when storage limits are reached
  • Structured Data View: Browse extracted listings with searchable table including ID, title, category, contact info, images, and fetch date

Contact Data Management

  • Smart Email Generation: Automatically generate random emails from extracted domains (e.g., extract @gmail.com and create a random address at that domain)
  • Privacy Controls: Configure email and phone visibility (public/private, yes/no, 1/0)
  • Fallback Contacts: Set default contact information when none is found on source pages
  • Email Domain Extraction: Intelligent domain detection for generating valid email addresses
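
One way to picture the domain extraction and random email generation is the following Python sketch. The regular expression, the "user" prefix, and the function name `generate_email` are illustrative assumptions; the plugin's actual address format is not documented here.

```python
import re
import secrets

# Matches the domain part after an "@" sign, e.g. "@gmail.com".
DOMAIN_RE = re.compile(r"@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def generate_email(text, fallback=None):
    """Build a random address on the first domain found in text.

    Returns the fallback contact when no domain can be extracted.
    """
    match = DOMAIN_RE.search(text)
    if not match:
        return fallback
    local_part = "user" + secrets.token_hex(4)  # hypothetical naming scheme
    return f"{local_part}@{match.group(1).lower()}"
```

The fallback argument mirrors the "Fallback Contacts" option: when the source page exposes no domain, a configured default address is used instead.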

Built-in Analysis & Testing Tools

  • URL Analysis Feature: Test any URL before crawling to verify:
    • Server accessibility (HTTP status codes, error detection)
    • Response data quality (text extraction validation)
    • Complete page structure with CSS selector hierarchy
    • Element counts and direct text content
    • Selector identification for easy mapping
  • Visual Selector Browser: See all available CSS selectors on target pages with counts
  • Fetch Validation: Detect 4xx/5xx errors, "Forbidden" responses, and security blocks before setup
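
The fetch validation idea (rejecting 4xx/5xx responses and pages that only contain a block message) can be sketched in Python as below. The function `classify_fetch` is hypothetical, and the marker list is an assumption: only "Forbidden" is named in the description, while "access denied" and "captcha" are added as plausible examples.

```python
# Assumption: besides "forbidden", these phrases hint at a security block.
BLOCK_MARKERS = ("forbidden", "access denied", "captcha")


def classify_fetch(status_code, body):
    """Return (ok, reason) for a fetched page."""
    if 400 <= status_code < 600:
        return False, f"HTTP {status_code}"
    lowered = body.lower()
    for marker in BLOCK_MARKERS:
        if marker in lowered:
            return False, f"blocked: {marker}"
    if not body.strip():
        return False, "empty response"
    return True, "ok"
```

A plain substring check like this can produce false positives (a listing titled "Forbidden Planet", for instance), so a real check would likely need to be narrower.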

Data Quality & Validation

  • HTML Sanitization: Automatic HTML cleanup and conversion to clean text
  • Date Format Parsing: Supports Y-m-d and Y-m-d H:i:s timestamp formats
  • Unique ID Generation: MD5 hash-based unique identifiers from URLs for reliable deduplication
  • Multi-Image Support: Extract all available images (12+ images per listing supported)
  • Locale Support: Multi-language listing extraction capability
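
Date parsing and URL-based unique IDs can be illustrated with a brief Python sketch. The PHP formats Y-m-d and Y-m-d H:i:s correspond to the strptime patterns used here; the fallback to the current timestamp follows the product description, while the shape of the sanitized (non-MD5) identifier is purely an assumption.

```python
import hashlib
import re
from datetime import datetime

# Python equivalents of the PHP formats "Y-m-d H:i:s" and "Y-m-d".
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")


def parse_date(raw, now=None):
    """Parse a supported date string, else fall back to the current time."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    return now or datetime.now()


def unique_id(url, use_md5=True):
    """Derive a stable identifier from the item URL for deduplication."""
    if use_md5:
        return hashlib.md5(url.encode("utf-8")).hexdigest()
    # Assumption: the sanitized form keeps only lowercase letters and digits.
    return re.sub(r"[^a-z0-9]+", "-", url.lower()).strip("-")
```

Because the identifier depends only on the URL, crawling the same item twice yields the same ID, which is what makes update-or-skip deduplication possible.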

Perfect For

  • Marketplace Aggregators: Build comprehensive classified platforms by extracting from multiple sources
  • Content Migration: Transfer listings from old platforms to prepare for Osclass import
  • Competitor Analysis: Monitor and extract competitor listings for market research
  • Real Estate Portals: Aggregate property listings from various sources into centralized database
  • Automotive Marketplaces: Extract vehicle listings automatically (cars, motorcycles, etc.)
  • Job Boards: Crawl and aggregate job postings from multiple sites
  • Price Comparison Sites: Extract product data with pricing for comparison engines
  • Data Warehousing: Store classified ad data for analytics and business intelligence

How It Works

  1. Configure Crawler: Set up crawler with target URL and extraction mode (follow links or direct)
  2. Map Fields: Define CSS selectors for each data field you want to extract (title, price, images, etc.)
  3. Test & Analyze: Use built-in URL analyzer to verify server can access target and identify selectors
  4. Set Schedule: Configure cron for automatic runs or execute manually from admin panel
  5. Monitor Extraction: View extracted items in structured table with all data fields
  6. Access Data: Use stored database records or JSON files for import into Osclass listings

Technical Specifications

  • Supported Formats: HTML pages with structured content accessible via HTTP/HTTPS
  • Selector Engine: Standard CSS selectors (the :has and :not pseudo-classes and the ~ and + sibling combinators are not supported)
  • Selector Keywords: "this" keyword to reference item wrapper element itself
  • Data Validation: Automatic HTML sanitization and text conversion for clean data
  • Date Parsing: Supports Y-m-d and Y-m-d H:i:s formats, falls back to current timestamp
  • Image Formats: Extracts both src and data-src attributes (lazy loading support)
  • Locale Support: Multi-language listing extraction with locale field mapping
  • Storage Formats: MySQL database tables or JSON file storage
  • API Integration: RESTful API with key-based authentication
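
The HTML sanitization and text conversion step can be approximated with Python's standard HTML parser, as in this sketch. The class `TextExtractor` and function `html_to_text` are hypothetical names, and skipping script and style content is an assumption about what "clean text" should exclude.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gather visible text while ignoring script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags and collapse whitespace into single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

Text chunks are joined with spaces so that content separated only by tags such as a line break does not run together; the trade-off is that inline tags inside a word would also introduce a space.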

Use Cases

  1. Automated Extraction: Schedule nightly crawls to keep your database fresh with new listings
  2. Bulk Data Migration: One-time extraction of large listing databases from external sources
  3. Competitive Intelligence: Track changes and updates from competitor websites over time
  4. Multi-Source Aggregation: Combine listings from dozens of sources into single database
  5. Data Enrichment: Supplement existing data with additional fields from external sources
  6. Market Research: Collect pricing and inventory data for analysis
  7. Archive & Backup: Create regular snapshots of external listing data

Crawler Management Interface

  • Multiple Crawlers: Create and manage unlimited crawlers for different sources
  • Crawler List View: See all configured crawlers with ID, name, URL, and status
  • Individual Settings: Each crawler has independent configuration for fields, limits, and scheduling
  • Items Overview: Browse all extracted items across all crawlers in unified table
  • Detail View: Examine complete extracted data for individual items including all 25+ fields
  • Easy Editing: Modify crawler configuration anytime without losing extracted data

Requirements

  • Osclass 8.x or higher
  • PHP 7.4 or newer with cURL support
  • MySQL database access for database storage mode
  • Server with cron job capability (for automated scheduled runs)
  • Write permissions to oc-content/uploads/crawler (for JSON storage mode)
  • Target websites must be accessible from your server (no JavaScript-required pages or Cloudflare-protected sites)
  • Server must support outbound HTTP/HTTPS requests

Limitations & Important Notes

  • JavaScript-rendered content cannot be crawled (server-side HTML only)
  • Sites with Cloudflare protection or similar security services may block requests
  • Advanced CSS selectors (the :has and :not pseudo-classes and the ~ and + sibling combinators) are not supported
  • Plugin extracts and stores data only - separate import step required to create actual Osclass listings
  • Recommended maximum of 50 items per crawl run for optimal performance
  • URL must be accessible from server backend (client-side JavaScript cannot be executed)

Transform your Osclass marketplace into a powerful data aggregator with the Listings Crawler Plugin. Extract thousands of listings from any website and store them ready for import with just a few CSS selectors.

Product description last updated on 28 April 2026

Product features and functionality

Basic documentation included
Requires PHP skills
Coding skills recommended
Recommended for advanced Osclass users
Advanced installation (requires more skills)
No dependency on 3rd party services
MB Themes
Premium developer
221 products

Product support includes

Direct support from Adrian Brezak, founder of MB Themes and developer maintaining these products in production
12 months access to support and latest updates
Support can be extended anytime for 35% of base price (+12 months)
Availability of seller to answer questions
Answer technical queries about product features
Assistance with reported bugs or issues
Help with installation in case of problems
Product in English language (other locales provided by community)
Proven support scale: 9,200 resolved tickets and 47,000 support messages
Long-term maintenance track record: 2,200+ updates released across products
Updates are based on customer support cases, Osclass core changes, PHP/MySQL updates, and real-world usage feedback

Support does not include

Customization service, custom work or feature requests
Support on free/gratis plugins delivered with premium themes
Installation service
Translation and localization services

Support quality, trust and engineering proof

Seller updated this product 2 times
Seller rating is 4.7 of 5 - Excellent (583 reviews)
Average response time to support tickets is 1 hour 23 mins
Member since 2017

Support available in:

English
Czech
Slovak
This product is not compatible with WordPress. All our themes and plugins work exclusively with Osclass.

Frequently asked questions

Question: What does Listings Crawler Plugin do in a real classifieds workflow?

Answer: The Listings Crawler Plugin is a powerful automation tool for Osclass that enables you to extract and store classified listings from external websites directly into your marketplace database or JSON files.

Question: When is Listings Crawler Plugin the right choice?

Answer: It is useful for teams that prefer a tested implementation path over ad-hoc custom development. The setup details for Listings Crawler Plugin may differ in production.

Question: Which setup step is most important for Listings Crawler Plugin?

Answer: Enable core options first, then validate the main user flow and the admin settings save cycle before enabling advanced features.

Question: How should compatibility be checked for Listings Crawler Plugin?

Answer: Validate plugin behavior after Osclass core updates and PHP upgrades, then review changelog-dependent configuration changes.

Question: Can Listings Crawler Plugin affect page performance?

Answer: Monitor load time and database queries on pages affected by plugin hooks, then optimize configuration based on real traffic patterns.

Question: What are common issues with Listings Crawler Plugin?

Answer: Common causes are missing prerequisites, cached outdated settings, and conflicts with custom forms or third-party overrides.

Question: What is the recommended migration path for Listings Crawler Plugin?

Answer: Update in controlled steps, retest the primary business flow, and keep a rollback package ready before production deployment.

Changelog - Product updates history

1.0.1
  • Fixed labels in settings.
  • Corrected unique ID generation when follow links is set to off.
  • The MD5 hash of the unique ID can now be turned off so that the sanitized URL is used instead (constant in index).
  • Many minor improvements.
1.0.0
  • Initial plugin release

Verified & Genuine Reviews

All reviews on OsclassPoint come from real customers who have purchased the product. Only verified buyers can leave a rating or review.

To maintain quality and accuracy, every review is moderated before being published.

No reviews have been added yet.
Price: 49.99 EUR
Created by best developers
Regular updates and bug fixes
Premium support services

MB Themes
I am Adrian Brezak, founder of MB Themes and developer of Osclass plugins and themes for classifieds platforms. I focus on maintaining and improving compatibility, payment integrations, SEO features, performance, spam protection, and marketplace monetization across releases. 9,200+ support tickets resolved · 47,000+ customer messages handled. 2,200+ product updates and compatibility fixes.

Product technical details

0-50 downloads
2 updates
2058 views
Product version: 1.0.1
Last update: 3 months ago
Osclass support: 8.0+
Product rating: 0 of 5 - No reviews
Published on: 11. Feb 2026
Folder name: crawler