Channel: SCN : Discussion List - SAP HANA Developer Center

Better compression due to column-based storage grouped by native datatype?


Hi HANA devs,

 

Let's assume two column-based table definitions (c40 = 40 characters wide, n3 = 3 digits wide, uVal = value, kept unique by a unique index):

 

Table T1:

Name (c40) | City (c40) | Age (n3)
Smith      | Boston     | 38
Kline      | Boston     | 41

 

Table T2:

Name (c40) | Size (n3)
Smith      | 190
Kline      | 176

 

 

Does HANA store these two tables internally like this?

 

Table T1_Col_1

Key | uVal
1   | Smith
2   | Kline

 

Table T1_Col_2

Key | uVal
1   | 38
2   | 41

 

Table T2_Col_1

Key | uVal
1   | Smith
2   | Kline

 

Table T2_Col_2

Key | uVal
1   | 190
2   | 176

--> This assumes a "per table" columnar design, which compresses well if the original tables contain enough duplicate values to squeeze out.
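
To make the "per table" idea concrete, here is a minimal Python sketch of what I mean: every column of a table gets its own key/value list of unique values, and the rows only hold keys. This is just an illustration of the layout I am asking about, not a claim about how HANA really works; encode_column is a hypothetical helper name, and the keys start at 1 as in the tables above.

```python
def encode_column(values):
    """Dictionary-encode one column.

    Returns (dictionary, value_ids): 'dictionary' lists the unique values,
    where the value at index i has key i + 1; 'value_ids' holds one key per row.
    """
    dictionary = []   # unique values in order of first appearance
    key_of = {}       # value -> key (1-based)
    value_ids = []
    for v in values:
        if v not in key_of:
            dictionary.append(v)
            key_of[v] = len(dictionary)
        value_ids.append(key_of[v])
    return dictionary, value_ids


# Table T1 from the example above, held column by column (assumed layout).
t1 = {
    "Name": ["Smith", "Kline"],
    "City": ["Boston", "Boston"],
    "Age": [38, 41],
}

# Each column is encoded independently of every other column and table.
encoded_t1 = {col: encode_column(vals) for col, vals in t1.items()}

for col, (dictionary, value_ids) in encoded_t1.items():
    print(col, dictionary, value_ids)
```

In this sketch the duplicate "Boston" is stored only once, but only because both occurrences sit in the same column of the same table.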

 

Several years before HANA, we emulated a "columnar" design in a row-based DB by normalizing all DB tables into a small number of technical key/value tables, grouping the values purely by technical datatype. With this approach ("store together what is technically the same type"), the two tables could be stored like this:

 

Table T_c40

Key | uVal_c40
1   | Smith
2   | Kline
3   | Boston

 

Table T_n3

Key | uVal_n3
1   | 38
2   | 41
3   | 190
4   | 176

--> This assumes a "per database" columnar design, which allows many more potential duplicates to be squeezed out.
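
As a rough sketch of this "per database" variant in Python: one shared key/value list per technical datatype, used by all columns of all tables. Again, this only illustrates the design we emulated, under my own assumptions; TypeDictionaries, key_for and encode_table are hypothetical names, and the type tags "c40" and "n3" are the ones from the example.

```python
from collections import defaultdict


class TypeDictionaries:
    """One shared dictionary of unique values per technical datatype."""

    def __init__(self):
        self.values = defaultdict(list)  # type tag -> unique values (key = index + 1)
        self.keys = defaultdict(dict)    # type tag -> {value: key}

    def key_for(self, type_tag, value):
        """Return the key of 'value' in its type dictionary, adding it if new."""
        keys = self.keys[type_tag]
        if value not in keys:
            self.values[type_tag].append(value)
            keys[value] = len(self.values[type_tag])
        return keys[value]


def encode_table(store, table):
    """Replace every value of a table by its key in the shared dictionaries."""
    return {
        col: [store.key_for(type_tag, v) for v in vals]
        for col, (type_tag, vals) in table.items()
    }


store = TypeDictionaries()

# Column name -> (type tag, values), taken from T1 and T2 above.
t1 = {"Name": ("c40", ["Smith", "Kline"]),
      "City": ("c40", ["Boston", "Boston"]),
      "Age": ("n3", [38, 41])}
t2 = {"Name": ("c40", ["Smith", "Kline"]),
      "Size": ("n3", [190, 176])}

t1_refs = encode_table(store, t1)
t2_refs = encode_table(store, t2)

print(store.values["c40"])  # shared by T1.Name, T1.City and T2.Name
print(store.values["n3"])   # shared by T1.Age and T2.Size
```

Here "Smith" and "Kline" are stored once for both tables, which is the extra deduplication the "per table" layout cannot get; the price is that every lookup goes through a dictionary shared by the whole database.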

 

We also tried a design with just one data storage table (besides the index tables that link the T_uVal entries to their original tables), like this:

Table T_uVal

Key | uVal
1   | Smith
2   | Kline
3   | Boston
4   | 38
5   | 41
6   | 190
7   | 176

--> Now we have all the "really unique" values in one big table.

 

How is this done internally in HANA right now?

 

Thanks for clarification,

Matthias

