📄 readme.intarray
字号:
This is an implementation of RD-tree data structure using GiST interfaceof PostgreSQL. It has built-in lossy compression.Current implementation provides index support for one-dimensional array ofintegers: gist__int_ops, suitable for small and medium size of arrays (used bydefault), and gist__intbig_ops for indexing large arrays (we use superimposedsignature with length of 4096 bits to represent sets). There is also anon-default gin__int_ops for GIN indexes on integer arrays.All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov(oleg@sai.msu.su). See http://www.sai.msu.su/~megera/postgres/gistfor additional information. Andrey Oktyabrski did a great work on adding new functions and operations.FUNCTIONS: int icount(int[]) - the number of elements in intarraytest=# select icount('{1,2,3}'::int[]); icount -------- 3(1 row) int[] sort(int[], 'asc' | 'desc') - sort intarraytest=# select sort('{1,2,3}'::int[],'desc'); sort --------- {3,2,1}(1 row) int[] sort(int[]) - sort in ascending order int[] sort_asc(int[]),sort_desc(int[]) - shortcuts for sort int[] uniq(int[]) - returns unique elementstest=# select uniq(sort('{1,2,3,2,1}'::int[])); uniq --------- {1,2,3}(1 row) int idx(int[], int item) - returns index of first intarray matching element to item, or '0' if matching failed.test=# select idx('{1,2,3,2,1}'::int[],2); idx ----- 2(1 row) int[] subarray(int[],int START [, int LEN]) - returns part of intarray starting from element number START (from 1) and length LEN. test=# select subarray('{1,2,3,2,1}'::int[],2,3); subarray ---------- {2,3,2}(1 row) int[] intset(int4) - casting int4 to int[]test=# select intset(1); intset -------- {1}(1 row)OPERATIONS: int[] && int[] - overlap - returns TRUE if arrays have at least one common element int[] @> int[] - contains - returns TRUE if left array contains right array int[] <@ int[] - contained - returns TRUE if left array is contained in right array # int[] - returns the number of elements in array int[] + int - push element to array ( add to end of array) int[] + int[] - merge of arrays (right array added to the end of left one) int[] - int - remove entries matched by right argument from array int[] - int[] - remove right array from left int[] | int - returns intarray - union of arguments int[] | int[] - returns intarray as a union of two arrays int[] & int[] - returns intersection of arrays int[] @@ query_int - returns TRUE if array satisfies query (like '1&(2|3)') query_int ~~ int[] - returns TRUE if array satisfies query (commutator of @@)(Before PostgreSQL 8.2, the containment operators @> and <@ wererespectively called @ and ~. These names are still available, but aredeprecated and will eventually be retired. Notice that the old namesare reversed from the convention formerly followed by the core geometricdatatypes!)CHANGES:August 6, 2002 1. Reworked patch from Andrey Oktyabrski (ano@spider.ru) with functions: icount, sort, sort_asc, uniq, idx, subarray operations: #, +, -, |, &October 1, 2001 1. Change search method in array to binarySeptember 28, 2001 1. gist__int_ops now is without lossy 2. add sort entry in picksplitSeptember 21, 2001 1. Added support for boolean query (indexable operator @@, looks like a @@ '1|(2&3)', perfomance is better in any case ) 2. Done some small optimizationsMarch 19, 2001 1. Added support for toastable keys 2. Improved split algorithm for intbig (selection speedup is about 30%)INSTALLATION: gmake gmake install -- load functions psql <database> < _int.sql REGRESSION TEST: gmake installcheckEXAMPLE USAGE: create table message (mid int not null,sections int[]); create table message_section_map (mid int not null,sid int not null); -- create indicesCREATE unique index message_key on message ( mid );CREATE unique index message_section_map_key2 on message_section_map (sid, mid );CREATE INDEX message_rdtree_idx on message using gist ( sections gist__int_ops); -- select some messages with section in 1 OR 2 - OVERLAP operator select message.mid from message where message.sections && '{1,2}'; -- select messages contains in sections 1 AND 2 - CONTAINS operator select message.mid from message where message.sections @> '{1,2}'; -- the same, CONTAINED operator select message.mid from message where '{1,2}' <@ message.sections;BENCHMARK: subdirectory bench contains benchmark suite. cd ./bench 1. createdb TEST 2. psql TEST < ../_int.sql 3. ./create_test.pl | psql TEST 4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries with/without RD-Tree. Run script without arguments to see availbale options. a)test without RD-Tree (OR) ./bench.pl -d TEST -c -s 1,2 -v b)test with RD-Tree ./bench.pl -d TEST -c -s 1,2 -v -rBENCHMARKS:Size of table <message>: 200000Size of table <message_section_map>: 269133 Distribution of messages by sections:section 0: 74377 messagessection 1: 16284 messagessection 50: 1229 messagessection 99: 683 messagesold - without RD-Tree support,new - with RD-Tree+----------+---------------+----------------+|Search set|OR, time in sec|AND, time in sec|| +-------+-------+--------+-------+| | old | new | old | new |+----------+-------+-------+--------+-------+| 1| 0.625| 0.101| -| -|+----------+-------+-------+--------+-------+| 99| 0.018| 0.017| -| -|+----------+-------+-------+--------+-------+| 1,2| 0.766| 0.133| 0.628| 0.045|+----------+-------+-------+--------+-------+| 1,2,50,65| 0.794| 0.141| 0.030| 0.006|+----------+-------+-------+--------+-------+
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -