While trying to scrape some data from a website, I chanced upon the importXML function in Google Spreadsheets, which is pretty neat: it lets you import the XML or HTML of a webpage and then parse the data with an XPath query.
Here is an example:
Using the importXML function, I parsed all the links in the Google search results for “analytics consultant in India”.
The importXML function works as follows (from the support page here):
Functions:
=importXML("URL","query")
- URL – the URL of the XML or HTML file
- query – the XPath query to run on the data given at the URL. For example, "//a/@href" returns a list of the href attributes of all <a> tags in the document (i.e. all of the URLs the document links to). For more information about XPath, please visit http://www.w3schools.com/xpath/
- Example: =importXml("www.google.com", "//a/@href"). This returns all of the href attributes (the link URLs) in all the <a> tags on the www.google.com home page.
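To make the "//a/@href" query more concrete, here is a rough Python sketch of the same idea outside the spreadsheet. It is only an illustration, not how importXML is implemented: the sample page, the `SAMPLE_PAGE` name and the `extract_hrefs` helper are all made up for this example, and it parses a small well-formed snippet rather than fetching a live URL. Python's built-in ElementTree supports only a subset of XPath and cannot select attributes directly, so the sketch matches the <a> elements that have an href and then reads the attribute itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page used in place of a live URL.
SAMPLE_PAGE = """<html>
  <body>
    <a href="http://example.com/one">One</a>
    <p>Some text <a href="http://example.com/two">Two</a></p>
    <a name="anchor-without-href">No link</a>
  </body>
</html>"""


def extract_hrefs(markup):
    """Return the href of every <a> element, roughly what "//a/@href" selects."""
    root = ET.fromstring(markup)
    # ".//a[@href]" matches every <a> descendant that carries an href attribute;
    # the attribute value is then read in Python, since ElementTree's XPath
    # subset has no "/@href" step.
    return [a.get("href") for a in root.findall(".//a[@href]")]


links = extract_hrefs(SAMPLE_PAGE)
print(links)
```

The <a> tag without an href is skipped, just as the XPath query would return nothing for it. Real web pages are often not well-formed XML, so a stricter parser like this one may reject them where the spreadsheet function would still return results.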
You can see it here:
http://spreadsheets.google.com/pub?key=pS9vSxWuwOllXHdueY0TDdg
or using the Embed function