Hacker News
Web Scraping with Google Spreadsheets and XPath (vancouverdata.blogspot.com)
56 points by wslh on Sept 25, 2011 | 8 comments


tl;dw(atch): Use the ImportXML function and XPath: https://docs.google.com/support/bin/answer.py?answer=155184
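
For anyone who hasn't used it: ImportXML takes a URL and an XPath query, i.e. `=ImportXML(url, xpath_query)`, and fills cells with whatever the query matches. Here is a rough offline sketch of the same idea in Python, not how Sheets does it internally. The sample markup and the helper names `import_xml` and `import_attr` are made up for illustration, and ElementTree only supports a subset of XPath, so attribute values are read with `.get()` instead of an `@href` step.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup standing in for a fetched page (assumption:
# it is well-formed XML, which real-world HTML often is not).
SAMPLE = """<html><body>
<ul>
  <li><a href="http://example.com/a">First</a></li>
  <li><a href="http://example.com/b">Second</a></li>
</ul>
</body></html>"""


def import_xml(markup, query):
    """Return the text of every element matching the XPath-style query."""
    root = ET.fromstring(markup)
    return [el.text for el in root.findall(query)]


def import_attr(markup, query, attr):
    """Return one attribute from every element matching the query."""
    root = ET.fromstring(markup)
    return [el.get(attr) for el in root.findall(query)]


# Link text of every <a> inside a list item.
titles = import_xml(SAMPLE, ".//li/a")

# href of every <a> that has one.
links = import_attr(SAMPLE, ".//a[@href]", "href")

print(titles)
print(links)
```

Real pages usually need a forgiving HTML parser before any XPath step, which is part of what the spreadsheet function saves you from.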


I remember trying to use Google Apps Script to build a Wikipedia game by scraping the random page. It would determine whether the page was about a person; if so, it added that person as a character, and the rest of the pages would later be used as items. Didn't finish it, but the idea was fun.

https://docs.google.com/spreadsheet/ccc?key=0AlQOPbxFfjKhdHd...


ImportXML Cookbook: http://www.seerinteractive.com/blog/importxml-cookbook/2011/...

I would love it if anyone could add more to it... I'm trying to start a big collection :)


Unfortunately the audio was bad, so I couldn't continue listening after a minute. Would be great if the author could fix it.


Possibly the worst abuse of "mobile" web hijacking. Can't even scroll without "paging" over? Boo!


You can look at alternative articles such as: http://blog.ouseful.info/2008/10/14/data-scraping-wikipedia-...


Which browser? It's fine on Opera Mini.


Love that blog. Glad to see that it made it to HN. His RapidMiner tuts are excellent as well.



