Hijacked! The Problem with Job Scraping

A reader wrote to me about a problem she was having with jobs scraped from her company website:

Do you have any tips on getting old, filled job posts off the internet? Job Boards scrape jobs and it’s frustrating when you get a phone call or resume from someone who saw a job that’s been filled for months on a job board.

I reached out to a couple of popular job aggregators to get their thoughts on this vexing issue that many employers face. Adzuna offered the following:

Unfortunately for this person, the responsibility ultimately falls with the job board themselves to be diligent and responsible and to refresh their feed and scrapes regularly. What we are seeing is that more and more job sites are adopting a consistent schema for job ads. This means that we're seeing an increasing number of job ads using the 'valid from - valid to' + 'expiry date' markup. So expired ads can be quickly identified by 3rd party sites through scripts and automation. This is not a silver bullet, but it suggests that if you markup your jobs in the same way, this will happen less often.

The other thing is to be vigilant. We deal with people taking jobs off Adzuna's site and marking it with the wrong company on a regular basis, so start with politely informing them that they've done this and ask them to refresh their feed and scrapes. If it keeps happening, or you get no response, you can take more serious action. It's unlikely that any job board is keeping expired ads on their site deliberately as it's a terrible candidate experience. 

~Andrew Hunter, Adzuna
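
To make the markup idea a little more concrete, here is a minimal sketch of what an expiry date in a job ad can look like. It assumes the schema.org JobPosting vocabulary, where `datePosted` and `validThrough` play roughly the role of the "valid from" and "expiry date" fields Andrew mentions. The function names and sample values are hypothetical, and your ATS or career site may already produce this markup for you.

```python
import json
from datetime import date


def job_posting_jsonld(title, company, description, posted, valid_through):
    """Build a schema.org JobPosting dict with an explicit expiry date.

    The helper name and argument list are illustrative only.
    """
    return {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": title,
        "description": description,
        "datePosted": posted.isoformat(),
        # validThrough tells crawlers when the ad should stop being shown
        "validThrough": valid_through.isoformat(),
        "hiringOrganization": {"@type": "Organization", "name": company},
    }


def is_expired(posting, today):
    """Return True when the posting's validThrough date is before today."""
    return date.fromisoformat(posting["validThrough"]) < today


def as_script_tag(posting):
    """Wrap the posting in the script tag that would sit in the job page's HTML."""
    body = json.dumps(posting, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical sample job, for illustration only
sample = job_posting_jsonld(
    "Payroll Analyst",
    "Example Co",
    "Process payroll for 500 employees.",
    date(2024, 1, 10),
    date(2024, 2, 10),
)
print(as_script_tag(sample))
```

The `is_expired` check is the kind of simple test a third-party site could run in its own scripts, which is why an explicit expiry date helps even though, as Andrew says, it is not a silver bullet.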

As he says, you can try contacting the job board in question, but those efforts may not always pay off. There has always been a lag in job listings once they have been distributed across multiple providers. My friends over at Talent.com also had some thoughts.

Their Director of Client Services, Robert Boersma, told me over email:

  • First it is important to understand the type of website in which this content is being displayed – is it a job board or an aggregator? Job boards like Monster or CareerBuilder do durational based postings – think of it like Craigslist for jobs. Aggregators (like Indeed or Talent.com) are search engines, and are likely the ones scraping your website and pulling job content. Very useful for getting free, organic candidate flow, but sometimes can lead to a delay time in jobs that are filled still being displayed. If this is a trend you see, reach out directly to those websites to explore your options with them.

  • If your ATS system enables you to generate an XML feed of your job content, you can provide this feed to job search websites, which will put the content that gets indexed to these sites into your control. This will also help you build a direct relationship with these sites and have better control of your content that is scraped online.

  • Understand that job content is often shared between both job boards and job aggregators (often called “cross-posting”, or “backfilling”). This process relies on refresh rates varying between 1-12 hours. There can be scenarios where a job, or set of jobs falls between the cracks in this process and jobs can stay live for far too long outside this window.

  • Using a plugin like “link redirect trace” can help you understand this job sharing process, and find the root scrape of your jobs – and therefore where to address your request to remove expired jobs.

  • Ultimately, the online job search world is one where hundreds of websites are continuously scraping and sharing jobs in order to give candidates the best possible chance of finding your roles, wherever they’re looking for work. Typically this works well and the online world will follow closely behind your ATS as jobs are added and removed, but technical errors can happen (as we all know too well). The best way to prevent expired jobs is to have a contact within websites that are indexing you, and to control the content that is shared via an XML feed, leaving less to interpretation from the indexation crawler of a job search website.
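
The XML feed Robert describes can be sketched in a few lines. This is only an illustration: every job site publishes its own feed specification, so the element names (`jobs`, `job`, `id`, `title`, `url`), the `status` field and the `build_job_feed` function are all assumptions rather than any particular aggregator's format. The point is that a feed built from your own records only ever contains roles that are still open.

```python
import xml.etree.ElementTree as ET


def build_job_feed(jobs):
    """Return an XML string listing only the jobs that are still open.

    Element names here are hypothetical; check the target site's feed spec.
    """
    root = ET.Element("jobs")
    for job in jobs:
        if job["status"] != "open":
            continue  # filled or closed roles are left out of the feed
        node = ET.SubElement(root, "job")
        ET.SubElement(node, "id").text = job["id"]
        ET.SubElement(node, "title").text = job["title"]
        ET.SubElement(node, "url").text = job["url"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical records, standing in for an export from your ATS
records = [
    {"id": "101", "title": "Payroll Analyst",
     "url": "https://example.com/jobs/101", "status": "open"},
    {"id": "102", "title": "Office Manager",
     "url": "https://example.com/jobs/102", "status": "filled"},
]
print(build_job_feed(records))
```

Because the filled role never makes it into the feed, a site that reads the feed on its next refresh has a clear signal to drop it, rather than relying on a crawler to notice that the page has changed.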


One other tip I might add is to sign up for email alerts on Google for Jobs. For instance, if you google a phrase like “company name + jobs”, you can sign up for alerts via the tab on the right. That way you can at least keep track of where your jobs are appearing online at any given time.
