Resource Location: https://www.import.io/
Import.io is by far the most robust and user-friendly of the scraping products I looked at. It is designed for marketing research, and it has the friendly graphical user interface and the customizability expected for that market. It is also a commercial product, with all that entails financially.
The system is web based, so no installation is required. You can run multiple “extractions” simultaneously, and adding a “feed” requires only a URL. Once an extraction is complete, you can preview the content within the system like a news feed.
Additionally, you can customize the data tables from your extraction by eliminating and adding columns of data. You can even visually scroll through websites to pick out specific information that the initial extraction missed and add it to your custom tables.
Import.io provides some very detailed metadata from its scrapes, including account info, post date/time, post text, hashtags, and links to images (but not copies of those images). You can also opt to include the stats for retweets/reblogs, likes, and faves. It is even possible, with some fiddling, to get the comments and responses to a specific “feed.”
There had to be a con to this scraping tool eventually. This is a commercial product and, as such, there is a fee. Import.io charges $299/month for the full version of their product. However, they do offer a promotional “limited” version of the software for free. Limitations include a max of 500 queries/extractions per month, and in the case of Twitter/Tumblr feeds they limit the crawl to the most recent 50 posts.
The free version would be more than sufficient for those who archive as they go, from the start of a project until its end. For those, like myself, who are coming to the scraping/archiving party late, the full version would be necessary to dig far enough back to get the whole of an account. Fortunately, Import.io does offer an educational discount: 75% off an annual plan or 25% off a monthly plan (roughly $750/year or $225/month).
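The monthly figure can be checked against the $299/month list price quoted above. The short sketch below does that arithmetic. The annual list price it derives is only inferred from the "roughly $750/year" figure, assuming the annual discount means 75% off; it is not a price quoted by Import.io.

```python
# Figures taken from the review text.
MONTHLY_LIST_PRICE = 299.00       # full version, per month
MONTHLY_DISCOUNT = 0.25           # educational discount on a monthly plan
DISCOUNTED_ANNUAL_PRICE = 750.00  # "roughly $750/year"

# Assumption: the annual educational discount is 75% off the annual list price.
ANNUAL_DISCOUNT = 0.75

# Monthly plan after the educational discount.
discounted_monthly = MONTHLY_LIST_PRICE * (1 - MONTHLY_DISCOUNT)

# Annual list price implied by the discounted annual figure (an inference).
implied_annual_list = DISCOUNTED_ANNUAL_PRICE / (1 - ANNUAL_DISCOUNT)

# Cost of staying on the discounted monthly plan for twelve months.
full_year_at_monthly_rate = discounted_monthly * 12

print(f"Discounted monthly plan: ${discounted_monthly:.2f}/month")
print(f"Implied annual list price: ${implied_annual_list:.2f}/year")
print(f"Twelve months at the discounted monthly rate: ${full_year_at_monthly_rate:.2f}")
```

On these numbers, the annual plan is the cheaper route for anyone who expects to need the full version for more than about three months.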
As a commercial marketing tool, this system provides proprietary access to the data through a really slick and easy-to-use graphical interface to its database system. It also offers an option to export your data tables to a CSV (comma-separated values) file that you can use in Excel or import into a MySQL database. To see an example of an exported extraction click here.
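For anyone who would rather script the database import than do it by hand, here is a minimal sketch of loading an exported CSV into a SQL table. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The column names and sample rows are hypothetical, not Import.io's actual export format, and the helper names are my own.

```python
import csv
import io
import sqlite3


def quote_identifier(name):
    """Quote a table or column name so it is safe to place in SQLite SQL."""
    return '"' + name.replace('"', '""') + '"'


def load_csv(conn, csv_file, table):
    """Create `table` from the CSV header row and insert each data row as text."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(quote_identifier(h) + " TEXT" for h in header)
    conn.execute(f"CREATE TABLE {quote_identifier(table)} ({columns})")

    placeholders = ", ".join("?" for _ in header)
    # Keep only rows whose field count matches the header.
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        f"INSERT INTO {quote_identifier(table)} VALUES ({placeholders})", rows
    )
    conn.commit()
    return len(rows)


# Hypothetical sample standing in for an exported extraction.
SAMPLE_EXPORT = """account,post_date,post_text,image_link,hashtags
example_user,2015-03-01 10:15,"Hello, world",http://example.com/a.jpg,#intro
example_user,2015-03-02 08:30,Second post,,#update
"""

conn = sqlite3.connect(":memory:")
inserted = load_csv(conn, io.StringIO(SAMPLE_EXPORT), "extraction")
texts = [
    row[0]
    for row in conn.execute("SELECT post_text FROM extraction ORDER BY post_date")
]
print(inserted, texts)
```

To read a real export, replace the `io.StringIO(...)` argument with a file opened as `open(path, newline="", encoding="utf-8")`. Every column is stored as text here; converting dates or counts to proper types is left for after you have seen what the export actually contains.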