Suppose I use the following code to parse a website:
library(XML)
url.df_1 = htmlTreeParse("http://www.appannie.com/app/android/com.king.candycrushsaga/", useInternalNodes = TRUE)
If I run the code below,
xpathSApply(url.df_1, "//div[@class='app_content_section']/h3", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
I get the following:
[1] "Description" "What's new"
[3] "Permissions" "More Apps by King.com All Apps »"
[5] "Customers Also Viewed" "Customers Also Installed"
Now, I am only interested in the "Customers Also Installed" section. However, when I run the following code,
xpathSApply(url.df_1, "//div[@class='app_content_section']/ul/li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
it returns all the apps listed under "More Apps by King.com", "Customers Also Viewed", and "Customers Also Installed".
So I tried,
xpathSApply(url.df_1, "//div[h3='Customers Also Installed']", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))
But it didn't work. So I tried:
xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]", xmlValue)
But this doesn't work either. The output should look like this:
[,1]
[1,] "Christmas Candy Free\n Daniel Development\n "
[2,] "/app/android/xmas.candy.free/"
[,2]
[1,] "Jewel Candy Maker\n Nutty Apps\n "
[2,] "/app/android/com.candy.maker.jewel.nuttyapps/"
[,3]
[1,] "Pogz 2\n Terry Paton\n "
[2,] "/app/android/com.terrypaton.unity.pogz2/"
Any guidance would be greatly appreciated!
This is an option (you are really close):
xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]/*/li/a", xmlGetAttr, 'href')
[1] "/app/android/xmas.candy.free/"
[2] "/app/android/com.candy.maker.jewel.nuttyapps/"
[3] "/app/android/com.terrypaton.unity.pogz2/"
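If you also want the app name next to each link, as in the desired output from the question, the same idea can be sketched as follows. This is only a minimal example run on a small hypothetical HTML fragment, not the real page: the markup, the "Viewed One" entry, and the names `html`, `doc`, `xp`, and `res` are all assumptions made for illustration. It assumes each section is a div whose h3 heading is followed directly by a ul list.

```r
library(XML)

# Hypothetical stand-in for the page: two sections with the same layout.
html <- '<html><body>
<div class="app_content_section"><h3>Customers Also Viewed</h3>
<ul><li><a href="/app/android/viewed.one/">Viewed One</a></li></ul></div>
<div class="app_content_section"><h3>Customers Also Installed</h3>
<ul><li><a href="/app/android/xmas.candy.free/">Christmas Candy Free</a></li>
<li><a href="/app/android/com.terrypaton.unity.pogz2/">Pogz 2</a></li></ul></div>
</body></html>'
doc <- htmlParse(html, asText = TRUE)

# Pick the div by its heading text, then step down to the links inside it.
xp <- "//div[h3='Customers Also Installed']/ul/li/a"

# For each <a> node, return its text and its href; xmlGetAttr returns NULL
# when the attribute is absent instead of raising an error.
res <- xpathSApply(doc, xp, function(a) c(xmlValue(a), xmlGetAttr(a, "href")))
print(res)
```

The key point is that the function is applied to the a nodes rather than to the div. In the earlier attempts the XPath stops at the div, which has no href attribute, so `xmlAttrs(x)[["href"]]` probably fails there (indexing a named vector by a missing name raises "subscript out of bounds"). I have not checked this against the live page, so treat it as a likely explanation rather than a confirmed one.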