Web Scraping with Google Spreadsheets and XPath

bochi · on Sept 25, 2011

tl;dw(atch): Use the ImportXML function and XPath: https://docs.google.com/support/bin/answer.py?answer=155184
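
For anyone who wants to see the XPath idea outside of a spreadsheet, here is a rough Python sketch. In a sheet the call looks something like `=ImportXML("http://example.com", "//a/@href")`. The sketch below is not how ImportXML works internally; it only mimics the "document plus XPath query gives back matching values" pattern using the standard library's `xml.etree.ElementTree`, which supports only a limited XPath subset. The `SAMPLE` markup and the helpers `select_text` and `select_attr` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup standing in for a fetched page.
SAMPLE = """<html><body>
<ul>
<li><a href="http://example.com/a">First</a></li>
<li><a href="http://example.com/b">Second</a></li>
</ul>
</body></html>"""


def select_text(document, path):
    """Return the stripped text of every element matching an ElementTree path."""
    root = ET.fromstring(document)
    return [(node.text or "").strip() for node in root.findall(path)]


def select_attr(document, path, attr):
    """Return one attribute from every matching element.

    ElementTree cannot select attributes directly (no '//a/@href'),
    so the attribute is read from each matched element instead.
    """
    root = ET.fromstring(document)
    return [node.get(attr) for node in root.findall(path) if node.get(attr) is not None]


if __name__ == "__main__":
    print(select_text(SAMPLE, ".//a"))
    print(select_attr(SAMPLE, ".//a", "href"))
```

Real pages are rarely well-formed XML, so a sketch like this would likely need a tolerant HTML parser before it could be pointed at a live site.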

thedjpetersen · on Sept 25, 2011

I remember trying to use Google Apps Script to build a Wikipedia game by scraping the random page. It would determine whether the page was about a person and, if so, add that person as a character; the rest would later be used as items. I didn't finish it, but the idea was fun.

https://docs.google.com/spreadsheet/ccc?key=0AlQOPbxFfjKhdHd...

iamchrisle · on Sept 25, 2011

ImportXML Cookbook: http://www.seerinteractive.com/blog/importxml-cookbook/2011/...

I would love it if anyone could add more to it... I'm trying to start a big collection :)

madiator · on Sept 25, 2011

Unfortunately the audio was bad, so I couldn't continue listening after a minute. It would be great if the author could fix it.

cschep · on Sept 25, 2011

Possibly the worst abuse of "mobile" web hijacking. Can't even scroll without "paging" over? Boo!

wslh · on Sept 25, 2011

You can look at alternative articles such as: http://blog.ouseful.info/2008/10/14/data-scraping-wikipedia-...

elchief · on Oct 2, 2011

Which browser? It's fine on Opera Mini.

stfu · on Sept 25, 2011

Love that blog. Glad to see that it made it to HN. His RapidMiner tutorials are excellent as well.