📄 exam3.txt
Part1:
Question1: State where the beginning and end of the transaction should occur by referring to the number of the step as well as whether the boundary is before the step or after the step referenced. Fully justify your answer.
Answer:
First: The transaction begins before step 2 and ends after step 3.
Reason: Step 1 only presents information to the user and performs no database operation. The query work begins with step 2, so the transaction should begin before step 2; after step 3, the last database operation completes, so the transaction should end there.
Second: Set the isolation level with "SET TRANSACTION ISOLATION LEVEL READ COMMITTED".
Reason: Many read and write operations may hit the database at the same time. READ COMMITTED prevents dirty reads while allowing more concurrency than the stricter levels, so it is the appropriate choice here.
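As a minimal sketch (the actual SQL for steps 2 and 3 is not given in the question, so they appear only as comments), the boundaries and isolation level chosen above would look like this in PostgreSQL:

begin;
set transaction isolation level read committed;
-- step 2: the query operation runs here
-- step 3: the last database operation runs here
commit;

Note that "set transaction" must be issued before any query in the transaction, which is why it comes immediately after "begin".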
Question2: State what isolation level the transaction should be run at. Fully justify your answer in terms of performance in a concurrent system as well as semantic correctness.
Answer:
First: The transaction begins before step 1 and ends after step 3.
Reason: The querying begins at step 1, so the transaction should begin before that step. Step 3 may write to the database, so the transaction must end after that step.
Second: Set the isolation level with "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE".
Reason: Both read and write operations take place in this transaction. To avoid write-read and read-write dependencies as well as the phantom problem, SERIALIZABLE is the right isolation level.
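A sketch of this second transaction (again, the SQL for the individual steps is not given in the question, so they appear only as comments):

begin;
set transaction isolation level serializable;
-- step 1: the query begins here
-- step 2: intermediate work runs here
-- step 3: the write to the database happens here
commit;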
___________________________________________________________
Part2:
Question1: State which index supported by PostgreSQL will improve the performance of each query. You must state what column or set of columns you would index and what index you would use. Fully justify your choice by referring to the characteristics of the index that improve the performance of each query.
Answer:
1. Query 1:
A hash index on encrypted_name, for the following query:
select *
from census
where encrypted_name='9946561'
Reason: PostgreSQL is currently using a sequential scan on the census table for this query. Because the condition is an equality selection on encrypted_name, a hash index lets PostgreSQL go directly to the bucket holding the matching key instead of scanning every row.
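Such an index could be created as follows (the index name is illustrative):

create index census_encrypted_name_hash
on census using hash (encrypted_name);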
2. Query 2:
A B-tree index on family_income, for the following query:
select *
from census
where family_income between 10500 and 55000;
Reason: PostgreSQL is currently using a sequential scan on the census table that visits the actual rows, with a total run time of 1500.89 msec. Because the condition is a range selection on family_income, a B-tree index keeps the keys in sorted order, so PostgreSQL can find the lower bound of the range and scan forward through only the qualifying keys.
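Such an index could be created as follows (the index name is illustrative; B-tree is PostgreSQL's default index type, so "using btree" may be omitted):

create index census_family_income_btree
on census using btree (family_income);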
Question2: Are any of the indexes you suggest above candidates for clustering? Fully justify your answer.
Answer:
The second index, the B-tree on family_income, is a candidate for clustering. A clustering index stores the tuples on disk in index order, so the rows satisfying a range query sit on adjacent disk pages and can be fetched with far fewer page reads. The hash index from Query 1 is not a candidate, because a hash function scatters keys and imposes no physical order that a query could exploit.
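Assuming a B-tree index on family_income already exists (the name below is illustrative), PostgreSQL can physically reorder the table along it with the cluster command:

cluster census using census_family_income_btree;

PostgreSQL does not maintain this order as rows change, so the command must be re-run after heavy updates.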
Question3: The query plans listed above show that a large number of records are touched by the queries. Based on this you can deduce that the table is very large, i.e. it occupies a large number of disk blocks. If you were to assume that the database was small, i.e. it occupies one or two disk blocks, would you still index the table? Fully justify your answer.
Answer:
If the table is small, there is no need for an index. An index pays off only when the number of tuples satisfying a condition is small compared to the total number of tuples, so that it saves many page fetches relative to a sequential scan. A table that occupies only one or two disk blocks can be read entirely with one or two page fetches anyway, while an index lookup itself costs at least one page fetch for the index plus one for the data, and the index adds maintenance overhead on every insert, update, and delete. So for such a small table, indexing would not be worthwhile.