The applications of Google spreadsheet functions for external data are virtually endless, and my previous article on how
=IMPORTXML() can help automate the collection of web data is just one example. Since imagination has no limits, I want to show another possibility: building… a proxy. Yes, a proxy inside a Google spreadsheet. Let's see.
According to Google help, the function
=IMPORTDATA() retrieves information from a CSV or TSV file, but in practice it can be pointed at any web page. So, if I am prevented from visiting a certain web site, say
www.example.com, in theory I could open a spreadsheet, get the source code of its web page with
=IMPORTDATA("www.example.com"), copy and paste the returned string into a Notepad window, save it as an HTML file and finally open it in the browser. If I need another page, I have to repeat all these steps, which is clearly a bit clumsy.
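One practical detail of the manual approach: since the function is meant for CSV data, the page source may arrive spread over several cells rather than as a single string. Here is a minimal sketch of how the pieces could be glued back together, assuming the text was split at commas (columns) and line breaks (rows) and that shorter rows were padded with empty cells. The helper name cellsToSource is hypothetical and is not part of the app described below.

```javascript
// Hypothetical helper: rebuild a text from a grid of sheet cells.
// Assumption: the sheet split the source at commas (columns) and
// line breaks (rows), padding shorter rows with empty cells.
function cellsToSource(rows) {
  return rows
    .map((cells) => {
      let end = cells.length;
      // Drop the empty padding cells at the end of the row.
      while (end > 0 && cells[end - 1] === "") end -= 1;
      // Put back the commas that were used as column separators.
      return cells.slice(0, end).join(",");
    })
    .join("\n"); // Put back the line breaks between rows.
}

// Example grid, as it could look after importing a tiny page.
const grid = [
  ["<p>Hello", " world</p>", ""],
  ["<p>Bye</p>", "", ""],
];
console.log(cellsToSource(grid));
```

Any whitespace trimmed or quoting altered during the import would not be restored by this sketch.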
I gave it no further thought until reading about text mirror on ghacks.net convinced me that automating the steps sketched above would yield a better tool.
So I have built a small proxy app. It works by combining two ad hoc elements: a spreadsheet, which stores user commands and retrieves the HTML source of requested pages, and a bookmarklet, which creates a minimal browsing interface layer in front of the sheet window and intercepts and processes changes in the underlying sheet cells. To begin, the user has to open the spreadsheet and run the bookmarklet. Then, for each user input (a URL request, a move back or forward in the navigation history, a click on any link in the currently served page), a form is submitted to the sheet transparently, forcing the HTML source code of the requested page to be first retrieved in the sheet and then shown on the proxy interface layer. Every time the spreadsheet is opened or refreshed, the history of the previous session is deleted by a tiny Google Apps Script. Here is a short demo video.
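To make the navigation part more concrete, here is a minimal sketch in JavaScript of two things such an interface layer needs: turning a link clicked in the served page into an absolute URL, and keeping a back/forward history. This is not the actual bookmarklet code; the names resolveLink and NavigationHistory are hypothetical, and the form submission to the sheet is left out.

```javascript
// Illustrative sketch only: these names are not taken from the bookmarklet.

// Resolve a link found in the served page against that page's own URL,
// so that a relative href still points at the original site.
function resolveLink(href, pageUrl) {
  return new URL(href, pageUrl).href;
}

// Minimal back/forward history, similar in spirit to a browser's.
class NavigationHistory {
  constructor() {
    this.entries = [];
    this.index = -1; // -1 means nothing has been visited yet
  }

  // Visiting a new page discards any "forward" entries.
  visit(url) {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
    return url;
  }

  // Step back if possible; otherwise stay on the first page.
  back() {
    if (this.index > 0) this.index -= 1;
    return this.current();
  }

  // Step forward if possible; otherwise stay on the last page.
  forward() {
    if (this.index < this.entries.length - 1) this.index += 1;
    return this.current();
  }

  current() {
    return this.index >= 0 ? this.entries[this.index] : null;
  }
}

// Usage: a click on a relative link in the current page.
const nav = new NavigationHistory();
nav.visit("http://www.example.com/news/index.html");
const next = resolveLink("page2.html", nav.current());
nav.visit(next); // "http://www.example.com/news/page2.html"
nav.back();      // back to the index page
```

In the real app, each URL returned by visit, back or forward would be the value submitted to the sheet, which then retrieves the page source.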
Indeed, the main merit of this proxy is that it is practically invisible, except to Google, which obviously logs all the activity in the sheet. Among other well-known proxies, some, like the one that anybody can set up by following this Digital Inspiron tutorial, are subject to URL filtering, while most of them, like text tunnel, rely on web sites which can be blocked at any time. By contrast, this proxy app works over the HTTPS protocol, so it exposes only the private and protected URL of the sheet on the
docs.google domain, both in the browser history and to a potential firewall device. The
docs.google domain can be blocked too, of course, but it is widely accessed all over the world for legitimate uses, and there is no way (or, at least, no simple way) for an outsider to understand what people are doing inside their own documents.
I must give a disclaimer. I consider this app just a proof of concept. It was nice for me to test it for some occasional browsing and use it in some emergency situations, but even though I think it is harmless, I cannot recommend it to other people because I do not know the Google TOS in detail. In any case, remember that Google Docs has a blocking system which prevents misuse of external data functions.
For testing purposes, the spreadsheet and bookmarklet links are just below. I have checked that the app works on Firefox 9 and Iron (Chrome) 14, with no problems on most of the sites I have visited, but since the bookmarklet doesn’t work in an isolated iframe and its HTML parser is not very sophisticated, some unexpected accidents could occur on non-standard pages. If it hangs while loading a page, refreshing the page and reloading the bookmarklet is enough to start a new session.
Proxy6109 (spreadsheet): to copy into your Google Docs page
P.S.: IF YOU DON’T MAKE A COPY OF THE SPREADSHEET, IT CANNOT WORK PROPERLY when several people open it at the same time. Since my Google spreadsheet is receiving massive traffic, from now on I am forced to disable form submission to prevent concurrent editing. After opening the spreadsheet, go to
File > Make a copy and then check
Form > Accepting responses to start using it.
Proxy6109 (bookmarklet): to drag onto your browser's bookmarks toolbar.