R: the right XPath expression when using the XML package and xpathSApply

Suppose I use the following code to parse a web page:

library(XML)
url.df_1 = htmlTreeParse("http://www.appannie.com/app/android/com.king.candycrushsaga/", useInternalNodes = T)

If I run the code below,

xpathSApply(url.df_1, "//div[@class='app_content_section']/h3", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))

I get the following:

[1] "Description" "What's new" 
[3] "Permissions" "More Apps by King.com All Apps »"
[5] "Customers Also Viewed" "Customers Also Installed"

Now, I am only interested in the "Customers Also Installed" section. But when I run the following code,

xpathSApply(url.df_1, "//div[@class='app_content_section']/ul/li/a", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))

it spits out all of the apps listed under "More Apps by King.com", "Customers Also Viewed", and "Customers Also Installed".

So I tried,

xpathSApply(url.df_1, "//div[h3='Customers Also Installed']", function(x) c(xmlValue(x), xmlAttrs(x)[["href"]]))

But that didn't work. So I tried:

xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]", xmlValue)

But this doesn't work either. The output should look like this:

[,1] 
[1,] "Christmas Candy Free\n Daniel Development\n "
[2,] "/app/android/xmas.candy.free/"
[,2]
[1,] "Jewel Candy Maker\n Nutty Apps\n "
[2,] "/app/android/com.candy.maker.jewel.nuttyapps/"
[,3]
[1,] "Pogz 2\n Terry Paton\n "
[2,] "/app/android/com.terrypaton.unity.pogz2/"

Any guidance would be greatly appreciated!

This is an option (you are really close):

xpathSApply(url.df_1, "//div[contains(.,'Customers Also Installed')]/*/li/a", xmlGetAttr, 'href')

[1] "/app/android/xmas.candy.free/"
[2] "/app/android/com.candy.maker.jewel.nuttyapps/"
[3] "/app/android/com.terrypaton.unity.pogz2/"
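
To get both the link text and the href in the matrix shape the question asks for, something along these lines may work. This is only a sketch: the HTML string below is a made-up stand-in for the page (its structure is assumed from the XPath expressions above, and the live page may differ), and `doc` and `links` are illustrative names. It selects the `div` by its `h3` heading rather than with `contains(., ...)`, since `contains(., ...)` also matches any ancestor `div` whose text includes the heading.

```r
library(XML)

# Hypothetical stand-in for the page: two sections with the same class,
# each holding an <h3> heading and a <ul> of links.
html <- '
<html><body>
  <div class="app_content_section">
    <h3>Customers Also Viewed</h3>
    <ul><li><a href="/app/android/viewed.one/">Viewed One</a></li></ul>
  </div>
  <div class="app_content_section">
    <h3>Customers Also Installed</h3>
    <ul>
      <li><a href="/app/android/xmas.candy.free/">Christmas Candy Free</a></li>
      <li><a href="/app/android/com.terrypaton.unity.pogz2/">Pogz 2</a></li>
    </ul>
  </div>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Pick the div whose <h3> child has the wanted heading, then step down
# to its links and return the text and href of each one.
links <- xpathSApply(
  doc,
  "//div[h3='Customers Also Installed']/ul/li/a",
  function(x) c(xmlValue(x), xmlGetAttr(x, "href"))
)

print(links)
```

Each column of `links` is one app, with the link text in row 1 and the href in row 2. If the heading on the real page carries extra whitespace, a predicate such as `normalize-space(h3)='Customers Also Installed'` might be needed instead of the exact comparison.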

