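Posted by: GzLi (笑梨), Board: DataMining
Subject: [Collection] How are materialized views in a data warehouse implemented?
Posted from: 南京大学小百合站 (Nanjing University Lily BBS) (Sat Sep 21 12:46:30 2002), on-site post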

netsaint (大圣) wrote on Tue Sep 10 16:13:29 2002:

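Many articles on this are all talk. How is it actually implemented?

Do you define your own format, or rely on an existing one?


Pointers and resources are welcome.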


shg_w@yahoo.com.cn



fervvac (高远) wrote on Tue Sep 10 22:46:30 2002:

What is your question?

To compute the MV, you simply modify existing cube computation algorithms to
skip those cuboids that do not need to be materialized. The modification is
straightforward for both top-down and bottom-up algorithms.
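
A minimal Python sketch of "skip the cuboids you do not need" may make this concrete. It is not one of the top-down or bottom-up algorithms mentioned above: for simplicity it aggregates every wanted cuboid directly from the base rows and shares no work between cuboids. The function name compute_cuboids, the wanted set, and the sample rows are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def compute_cuboids(rows, dims, measure, wanted):
    """Sum `measure` for each group-by set in `wanted`, skipping all others."""
    result = {}
    # Walk the full lattice of 2^n group-by sets, from finest to coarsest.
    for k in range(len(dims), -1, -1):
        for group_by in combinations(dims, k):
            if group_by not in wanted:
                continue  # this cuboid is not materialized
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group_by)
                totals[key] += row[measure]
            result[group_by] = dict(totals)
    return result


# Hypothetical base data with three dimensions and one measure.
rows = [
    {"product": "pen", "city": "Nanjing", "year": 2001, "sales": 10},
    {"product": "pen", "city": "Beijing", "year": 2002, "sales": 5},
    {"product": "ink", "city": "Nanjing", "year": 2002, "sales": 7},
]
dims = ("product", "city", "year")
# Only three of the eight possible cuboids are materialized.
wanted = {("product",), ("city", "year"), ()}
views = compute_cuboids(rows, dims, "sales", wanted)
```

Here views holds exactly the three requested cuboids; the empty tuple () is the grand total. A real implementation would compute coarser cuboids from finer ones instead of rescanning the base rows each time.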

If you are asking about the physical storage format, I think the commercial
systems still store the data in relational tables.

Another line of approaches uses a multidimensional model for the
cube/cuboids and stores the precomputed results in chunks.



netsaint (大圣) wrote on Wed Sep 11 09:28:32 2002:

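I mean the physical storage scheme. Many systems currently use chunks, but that is rather hard to implement.

If the data is stored directly in relational tables, each combination of dimensions has its own value, so it is hard to settle on one uniform set of columns for the table.

I was thinking of storing it (the aggregated data) in a temporary file, such as a .txt file, but that seems even more troublesome.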





fervvac (高远) wrote on Wed Sep 11 14:33:12 2002:

For any serious implementation, it is a bad idea to store data in text 
format. 

First of all, you need to know which type of system you have, ROLAP or MOLAP.
Chunk is only used in MOLAP, and relations are only used in ROLAP.

The basic method to store data in ROLAP is to store them in relational table(s).
You can either create a single table with all dimension attributes plus the
measure attribute, or use multiple tables, one for each cuboid. So there is no
fundamental difficulty there.
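
The two relational layouts can be sketched with Python's standard sqlite3 module. This is only an illustration under assumptions: the table names (sales, cube, cuboid_city), the columns, and the sample data are hypothetical, and using NULL to mean "all values of this dimension" is one common convention, not something the post specifies. It breaks down if a dimension can legitimately be NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("pen", "Nanjing", 10), ("pen", "Beijing", 5), ("ink", "Nanjing", 7)],
)

# Layout 1: one table with every dimension column plus the measure.
# A dimension that has been aggregated away is stored as NULL ("ALL").
conn.execute("CREATE TABLE cube (product TEXT, city TEXT, amount INTEGER)")
conn.execute(
    "INSERT INTO cube SELECT product, city, SUM(amount) FROM sales GROUP BY product, city"
)
conn.execute(
    "INSERT INTO cube SELECT product, NULL, SUM(amount) FROM sales GROUP BY product"
)
conn.execute("INSERT INTO cube SELECT NULL, NULL, SUM(amount) FROM sales")

# Layout 2: a separate table per cuboid, each with only the columns it needs.
conn.execute(
    "CREATE TABLE cuboid_city AS "
    "SELECT city, SUM(amount) AS amount FROM sales GROUP BY city"
)

pen_total = conn.execute(
    "SELECT amount FROM cube WHERE product = 'pen' AND city IS NULL"
).fetchone()[0]
grand_total = conn.execute(
    "SELECT amount FROM cube WHERE product IS NULL AND city IS NULL"
).fetchone()[0]
nanjing_total = conn.execute(
    "SELECT amount FROM cuboid_city WHERE city = 'Nanjing'"
).fetchone()[0]
```

Layout 1 keeps one uniform schema for all dimension combinations; layout 2 avoids the NULL convention at the cost of more tables.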

For chunks, I am not sure of the real implementation of commercial systems.
For most prototype systems, the problem is that an "extra-long" integer type
with efficient arithmetic is needed. The space trade-off is another issue.
Moreover, an appropriate compression method has to be chosen.
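
A rough Python sketch of chunked addressing, under several assumptions: cells are addressed by integer coordinates, chunks and the cells inside them are numbered in row-major order, and sparse chunks are "compressed" simply by storing only non-empty cells in a dictionary. The names chunk_address, put, get and the sizes are hypothetical. The last lines show one plausible reading of the "extra-long" integer remark: with many dimensions, the number of cells (and so the largest cell index) no longer fits in a 64-bit integer.

```python
from math import prod


def chunk_address(coords, dim_sizes, chunk_sizes):
    """Map cell coordinates to (chunk_id, offset_within_chunk), both row-major."""
    # Number of chunks along each dimension (ceiling division).
    chunks_per_dim = [-(-n // c) for n, c in zip(dim_sizes, chunk_sizes)]
    chunk_id = 0
    offset = 0
    for x, c, n_chunks in zip(coords, chunk_sizes, chunks_per_dim):
        chunk_id = chunk_id * n_chunks + x // c  # which chunk along this dimension
        offset = offset * c + x % c              # position inside that chunk
    return chunk_id, offset


dim_sizes = (10, 8, 6)    # hypothetical dimension cardinalities
chunk_sizes = (4, 4, 3)   # hypothetical chunk shape
store = {}                # chunk_id -> {offset: value}, non-empty cells only


def put(coords, value):
    chunk_id, offset = chunk_address(coords, dim_sizes, chunk_sizes)
    store.setdefault(chunk_id, {})[offset] = value


def get(coords):
    chunk_id, offset = chunk_address(coords, dim_sizes, chunk_sizes)
    return store.get(chunk_id, {}).get(offset)


put((5, 2, 4), 42)
address = chunk_address((5, 2, 4), dim_sizes, chunk_sizes)

# Twelve dimensions of 1000 values each give 10**36 cells, far beyond 2**64.
needs_long_integers = prod([1000] * 12) > 2**64
```

Python integers are arbitrary precision, so the overflow does not bite here; in C or C++ the same index arithmetic would need a wider integer type or a different addressing scheme.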
 


netsaint (大圣) wrote on Wed Sep 11 21:34:38 2002:

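A typical ROLAP system does indeed store this kind of data in relational tables.

My purpose in putting the data in temporary files is this:

Keep the files resident in memory, so that each request first queries the in-memory files.

If the result is not found there, go to the database and search the relational tables.

Is this feasible?

How should the format of this temporary file be standardized, and how should the data in it be read (it seems to require reading and processing strings)?

My head is spinning.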
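
The lookup order described in this post (memory first, then the relational tables) can be sketched in Python as a small bounded cache in front of a database. Everything here is an assumption for illustration: the class name AggregateCache, the sales table, the per-product total as the cached aggregate, and the least-recently-used eviction policy, which is used only because memory is limited.

```python
import sqlite3
from collections import OrderedDict


class AggregateCache:
    """Answer aggregate queries from memory first, then from the database."""

    def __init__(self, conn, capacity):
        self.conn = conn
        self.capacity = capacity
        self.entries = OrderedDict()  # product -> total, oldest entry first
        self.hits = 0
        self.misses = 0

    def total_for(self, product):
        if product in self.entries:
            self.hits += 1
            self.entries.move_to_end(product)  # mark as recently used
            return self.entries[product]
        # Not in memory: fall back to the relational table.
        self.misses += 1
        row = self.conn.execute(
            "SELECT SUM(amount) FROM sales WHERE product = ?", (product,)
        ).fetchone()
        value = row[0]
        self.entries[product] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [("pen", 10), ("pen", 5), ("ink", 7)]
)

cache = AggregateCache(conn, capacity=2)
first = cache.total_for("pen")   # miss: computed from the table
second = cache.total_for("pen")  # hit: served from memory
```

Keeping the cache as an in-process dictionary avoids the string parsing that a text file would require.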









fervvac (高远) wrote on Thu Sep 12 11:52:42 2002:

1. MV/cube is usually much larger than available memory, thus you cannot 
   put it entirely in the memory.

   To quickly return a precomputed result from the MV/cube, you need to index
   it. Either a traditional index or one of the newer ones can be used. Related
   techniques include UB-tree, cubetree, etc.
   Ross has an interesting paper in SSDBM (2000?) about actively caching
   part of the cube in memory.

2. Not sure why you think it is difficult to store your result in a file. This
   method, although slower, only requires basic knowledge of C/C++. Find any
   C/C++ book and read the I/O part. By the way, you need to map dimension
   values to integers first.
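
Point 2 can be sketched as follows, in Python rather than the C/C++ the post suggests. Dimension values are first mapped to integer codes, then each cell is written as a fixed-size binary record, so reading it back needs no string parsing. The record layout (two 32-bit codes and one 64-bit measure, little-endian), the file name cuboid.bin, and the function names are hypothetical.

```python
import os
import struct
import tempfile

# One record: two int32 dimension codes and one int64 measure, no padding.
RECORD = struct.Struct("<iiq")


def encode(value, dictionary):
    """Return the integer code for value, assigning the next free code if new."""
    return dictionary.setdefault(value, len(dictionary))


def write_cuboid(path, cells, product_codes, city_codes):
    with open(path, "wb") as f:
        for (product, city), total in cells.items():
            f.write(
                RECORD.pack(
                    encode(product, product_codes), encode(city, city_codes), total
                )
            )


def read_cuboid(path):
    records = []
    with open(path, "rb") as f:
        while True:
            raw = f.read(RECORD.size)
            if len(raw) < RECORD.size:
                break  # end of file
            records.append(RECORD.unpack(raw))
    return records


cells = {("pen", "Nanjing"): 10, ("pen", "Beijing"): 5, ("ink", "Nanjing"): 7}
product_codes, city_codes = {}, {}
path = os.path.join(tempfile.mkdtemp(), "cuboid.bin")
write_cuboid(path, cells, product_codes, city_codes)
records = read_cuboid(path)
```

The two dictionaries would also have to be saved somewhere so that the codes can be turned back into dimension values later; that step is left out here.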



netsaint (大圣) wrote on Sat Sep 14 09:22:56 2002:

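I have been reading Ross's paper recently.

He uses a two-level storage scheme.

The first level holds the "high-value" tuples, that is, the coarse-grained data, and the second level holds the most detailed data. This is an in-memory data structure he defined recently, and it is also array-based.

But I don't know how he implements the array approach.

Does it rely on existing middleware for storage?

Otherwise, how would a multidimensional array be implemented?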





fervvac (高远) wrote on Sun Sep 15 03:25:57 2002:

That's the SSDBM paper.

As I remember, the in-memory data structure is just hash tables.

As for the multidimensional array: linearize it.
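
"Linearize it" can be read as mapping the coordinates of a multidimensional cell to a single integer index, which then serves as a position in a flat array or as a hash-table key. A minimal Python sketch, assuming row-major order and dimension values already mapped to integers; the function names and sizes are hypothetical, and this is not taken from the paper.

```python
def linearize(coords, dim_sizes):
    """Map multidimensional coordinates to one row-major index."""
    index = 0
    for x, n in zip(coords, dim_sizes):
        if not 0 <= x < n:
            raise IndexError(f"coordinate {x} outside dimension of size {n}")
        index = index * n + x
    return index


def delinearize(index, dim_sizes):
    """Inverse of linearize: recover the coordinates from the index."""
    coords = []
    for n in reversed(dim_sizes):
        index, x = divmod(index, n)
        coords.append(x)
    return tuple(reversed(coords))


dim_sizes = (3, 4, 5)  # hypothetical cardinalities of three dimensions
cells = {}             # hash table keyed by linear index, non-empty cells only

position = linearize((2, 1, 3), dim_sizes)
cells[position] = 99
```

With a dense cube, the same index can address a flat array of prod(dim_sizes) entries instead of a hash table.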
