HANA and R with RServe

Hi,

I have posted this question in Blag's blog post (http://scn.sap.com/community/developer-center/hana/blog/2013/02/18/when-sap-hana-met-r--whats-new), but I guess this forum is the more appropriate place to ask these questions (of course I am grateful for Blag's replies!):

I have a very fundamental question concerning R and HANA: I have been working with an AWS HANA instance, and I was not allowed to install Rserve (or R) on this instance. In a productive HANA environment, is it possible to install R on the same machine as HANA? Or is there some restriction from SAP's side concerning what may run on that appliance machine?

With my AWS instance, HANA and R had to communicate over a TCP connection. From what I have seen in my tests, both HANA and R are incredibly efficient, but the internet connection, especially the transfer of "big data", slows computations down enormously (to my understanding, this would not be much different even with a super-fast Ethernet connection).

I was just wondering how HANA/R scales and performs on truly big data sets. I have done my tests mainly on several dozen MB of data on AWS. However, would that also work for, say, 10-100 GB? My question is: would you recommend HANA over any other DB in this case, when doing analytics with R? My concern is that the performance gain from using an in-memory, column-based DB (rather than a disk-based one) seems small when we have to forward all the data over a TCP connection to R for analysis, which looks like the true bottleneck here. Also, running HANA on a separate instance from the one where R is located does not seem to fit the general HANA philosophy of moving application logic to the DB in order to get the most out of the in-memory performance advantages and real-time capabilities.
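
To put rough numbers on the transfer concern, here is a back-of-envelope sketch in Python. The link speeds are illustrative assumptions, not measurements from my AWS setup, and the function name transfer_seconds is made up for this example. It ignores protocol overhead and the cost of serializing data between HANA and R, so it only gives a lower bound:

```python
# Back-of-envelope estimate: time to ship a data set over a network link.
# All link speeds below are illustrative assumptions, not measurements.

def transfer_seconds(size_gb: float, link_gbit_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Return the seconds needed to move size_gb gigabytes over a link.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually achieved (1.0 means the full nominal rate).
    """
    size_gbit = size_gb * 8  # 1 byte = 8 bits
    return size_gbit / (link_gbit_per_s * efficiency)


if __name__ == "__main__":
    # Hypothetical link speeds in Gbit/s.
    links = {"100 Mbit/s": 0.1, "1 Gbit/s": 1.0, "10 Gbit/s": 10.0}
    for size_gb in (10, 100):
        for name, rate in links.items():
            secs = transfer_seconds(size_gb, rate)
            print(f"{size_gb:>4} GB over {name:<10}: {secs:8.0f} s")
```

Under these assumptions, 100 GB over a 1 Gbit/s link needs at least 800 seconds (about 13 minutes) for the transfer alone, before R has done any analysis.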

The second thing I noticed is that PAL seems to be an alternative for common data mining tasks, which I have tried for the same algorithm (i.e. k-means). It worked very nicely. To my understanding, PAL is implemented in C/C++. Is it possible to extend this by adding functions of my own in C/C++, running on the HANA instance, i.e. machine learning algorithms etc.?

Did I get something wrong here?

Thank you in advance,

regards,

Georg

PS. My background is in data mining rather than SAP BI/BW, so I may have missed some facts that are obvious to SAP veterans.
