Channel: SCN : Discussion List - SAP HANA Developer Center

Categorizing multiple texts with TM_CATEGORIZE_KNN in one call


Hi guys,

 

I have a working application right now, but I want to make it more efficient. I am developing a Java dynamic web project with SAP HANA. In this project, I call a procedure that performs kNN text classification using TM_CATEGORIZE_KNN from text mining. The procedure is defined as follows:

 

CREATE PROCEDURE "SYSTEM"."KNN_TEXT_CLASSIFICATION_MAINCATEGORY"
(IN textInput NCLOB) AS
BEGIN
    SELECT T.CATEGORY_VALUE, T.NEIGHBOR_COUNT, T.SCORE
    FROM TM_CATEGORIZE_KNN(
        DOCUMENT :textInput
        MIME TYPE 'text/plain'
        SEARCH NEAREST NEIGHBORS 5 "text"
        FROM "SYSTEM"."aveaLabelledData"
        RETURN TOP 1
        "main_category"
        FROM "SYSTEM"."aveaLabelledData"
    ) AS T;
END;

So every time I call this procedure, I can categorize only one text (textInput). However, I have a table (named SYSTEM/AVEA_UNLABELLEDDATA_VIEW) that contains hundreds of rows, and all of these records need to be categorized by the TM_CATEGORIZE_KNN function. That's why I iterate through all rows in that table and call the KNN_TEXT_CLASSIFICATION_MAINCATEGORY procedure for each row from my code, as follows:

 

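// Uses java.sql.CallableStatement, java.sql.PreparedStatement and java.sql.ResultSet.
String selectSql = "SELECT \"id\", \"text\" FROM \"_SYS_BIC\".\"SYSTEM/AVEA_UNLABELLEDDATA_VIEW\"";
String callSql = "CALL \"SYSTEM\".\"KNN_TEXT_CLASSIFICATION_MAINCATEGORY\"(?)";

// try-with-resources closes the statements and result sets when the loop is done.
try (PreparedStatement select = connection.prepareStatement(selectSql);
     CallableStatement call = connection.prepareCall(callSql);
     ResultSet rs = select.executeQuery()) {
    while (rs.next()) {
        double id = rs.getDouble(1);
        String text = rs.getString(2);
        // Bind the text as a parameter so that quotes inside it cannot break the CALL.
        call.setString(1, text);
        // One procedure call per row: this is the part that is slow.
        try (ResultSet rs2 = call.executeQuery()) {
            // rs2 holds CATEGORY_VALUE, NEIGHBOR_COUNT and SCORE for this row.
        }
    }
}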

It actually works without any error. But since there are hundreds of rows to categorize, my code ends up making hundreds of calls to the procedure, which makes things quite slow.
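
A side note on per-row calls from JDBC: how the text reaches the procedure matters once a row contains an apostrophe. The sketch below is only an illustration, not part of the application. The class name KnnCallText, its helper methods and the sample text are made up for this example; only the procedure name comes from the post. It compares a CALL string with the text pasted into a SQL literal against a CALL string with a ? placeholder, where the text would be supplied separately through setString.

```java
/** Illustrative helper (hypothetical, not from the post): two ways to build the CALL text. */
final class KnnCallText {
    // Procedure name as given in the post.
    static final String PROCEDURE = "\"SYSTEM\".\"KNN_TEXT_CLASSIFICATION_MAINCATEGORY\"";

    /** Pastes the text straight into a SQL string literal; fragile if the text contains quotes. */
    static String concatenatedCall(String text) {
        return "CALL " + PROCEDURE + "('" + text + "')";
    }

    /** Call text with a placeholder; the text is bound later, e.g. with setString(1, text). */
    static String parameterizedCall() {
        return "CALL " + PROCEDURE + "(?)";
    }

    /** Counts single quotes; an odd count means a string literal is left unbalanced. */
    static int countSingleQuotes(String sql) {
        int count = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == '\'') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "customer can't connect"; // hypothetical sample row
        String concatenated = concatenatedCall(text);
        System.out.println(concatenated);
        System.out.println("single quotes: " + countSingleQuotes(concatenated));
        System.out.println(parameterizedCall());
    }
}
```

With the placeholder form, the statement text is the same for every row, so one prepared statement can be reused across the loop. That removes the quoting problem, but it still means one round trip per row.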

 

Is there a way to categorize all the rows in one call? In other words, how should I change my procedure so that it categorizes all the records, not just one?

 

Thanks in advance,

Inanc

