
Content Scraping and Your Nonprofit—Useful Tools to Protect Your Content

Wednesday, March 6, 2013 - 2:52 pm
Sophia Guevara

I recently had an eye-opening experience that made me realize how many organizations promote their content but do nothing to ensure that it is being used appropriately. While some nonprofits allow others to reuse their content with proper attribution, not everyone who reuses content does so appropriately. Specifically, I am referring to content scraping: copying a site's content and republishing it elsewhere without permission. Sometimes the owners of sites that host scraped content are looking to make money by serving ads to visitors who unknowingly land on the copycat page. The site owner then profits through ad partner programs that pay for ad impressions or clicks. In other cases, site owners hope to improve their ranking in search engine results by copying and serving up content from a popular site that is considered authoritative.

How can you protect your organization’s online content? Here are some tools that your organization can use:

1. Has your foundation posted snapshots of public events, gatherings, or conferences? It is a good idea to check once in a while that your logo and other images are being used appropriately, using tools like TinEye or Google Images. You can search by uploading an image or by entering the link to the hosted image. TinEye currently has more than two billion indexed images that it checks against to see if there are any matches.

2. Copyscape is a great tool for seeing if there are copies of your content on the Web. There is a free option as well as a couple of fee-based products. In the search box, enter the address of one of your web pages to see if there are any close matches. If there is a hit, Copyscape will provide the address of the copied page and tell you how many words match the original page.

If you find a site that is making use of your organization’s content inappropriately, you may want to contact the webmaster. If that information isn’t visible on the site, try conducting a WHOIS search to see if you can track down the contact information for the person responsible for the domain/site. If the registrant hasn’t elected to keep their information private, you should be able to find what you need.

3. If your organization uses Google’s Webmaster Tools, you can see who is linking to your content. The report shows how often a domain has linked to your content and which pages it linked to. By hovering over the domain name, you can see a snapshot of the site linking to your content.
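
If you are curious how a word-match count like the one described in tip 2 might work, here is a minimal Python sketch. It is not Copyscape's actual method, which is not described here; it is a simple illustration that assumes two pages count as overlapping when they share runs of five consecutive words. The function names and the run length of five are my own assumptions, chosen only for the example.

```python
import re


def words(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_runs(text, size=5):
    """Return the set of overlapping runs of `size` consecutive words."""
    w = words(text)
    return {tuple(w[i:i + size]) for i in range(len(w) - size + 1)}


def matching_word_count(original, suspect, size=5):
    """Count words in `original` that sit inside a run also found in `suspect`.

    Illustrative only: the run length of 5 is an assumption, not a
    setting taken from any real plagiarism-checking service.
    """
    w = words(original)
    shared = word_runs(suspect, size)
    covered = set()
    for i in range(len(w) - size + 1):
        if tuple(w[i:i + size]) in shared:
            # Mark every word position in the shared run as matched.
            covered.update(range(i, i + size))
    return len(covered)


original = "Our foundation awarded ten grants to local libraries this spring."
suspect = "Breaking news: our foundation awarded ten grants to local libraries, sources say."
print(matching_word_count(original, suspect))  # prints 8
```

In this example, eight of the ten words in the original sentence appear in the same order on the suspect page, which is the kind of signal that would prompt a closer look. Requiring runs of several words, rather than counting individual shared words, keeps common words such as "the" and "to" from producing false matches.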

In conclusion, it is important to protect your organization’s content from those who copy it to drive more visitors to their own sites and generate ad revenue. If you would like to research this topic further, contact your foundation’s librarian.

Sophia Guevara is the chair of the Consortium of Foundation Libraries affinity group.
