http:^^www.cs.cornell.edu^info^courses^current^cs537^project1.html

MIME-Version: 1.0
Server: CERN/3.0
Date: Sunday, 24-Nov-96 22:43:15 GMT
Content-Type: text/html
Content-Length: 10366
Last-Modified: Monday, 16-Sep-96 20:58:14 GMT

<html><head><title>CS 537 - Possible Term Projects</title></head><body>
<h3> Term Projects for CS537</h3>
All CS537 projects will involve a significant amount of coding in C++.
If this is something you are not familiar with, you should start early on the project.
The projects will be either on PREDATOR or on MINIBASE.
A few projects will be stand-alone and require neither system.
I would rather you did the PREDATOR or the stand-alone projects, since you will learn the most from these.
Those of you who have done CS432 or an equivalent introductory database course should not do the MINIBASE projects.
In all the projects, there will be an emphasis on coding according to established conventions, documenting the code, and code stability
(i.e., I would rather have you write code that does a few things well than code that does many things in a very unstable fashion).
You should choose your project by the 8th of October, but I encourage you to do so earlier and start on your project.
There should not be several groups working on the same project topic.
Also, where possible, try to match the project to your interests and your background.
Many projects need 2 persons, and if you can find your own groups, that is ideal.
If you want me to put you in a group with someone else, I can do that.
In either case, early into the project, you will need to identify what each person is going to do,
and you will be graded individually on the net result.
<h2>PREDATOR Projects</h2>
PREDATOR is a client-server DBMS built by me as a research prototype.
The main research purpose of the system is to explore techniques to support a large number of data types in an extensible manner
(meaning that it should be possible to extend the system to support fields of a new type, like video or image,
without changing the system significantly).
Part of PREDATOR is a relational database system supporting a subset of SQL.
Most of the course projects will involve either extending and enhancing the SQL functionality
or adding support for a new data type.
<ul>
<li> Add the OPT++ relational query optimizer (2 persons)
<p>
Currently, PREDATOR does not have a serious query optimizer.
Instead, it just uses the join order provided in the query.
This project involves incorporating the OPT++ optimizer into PREDATOR.
OPT++ is an independent library which can be used to customize and design a query optimizer.
Work will involve finding out about OPT++, integrating it with PREDATOR query evaluation,
and demonstrating query optimization on simple join queries.
<p>
<li> Develop a query plan visualization tool and a query execution visualization tool (2 persons)
<p>
The purpose of this project is to build a graphical tool that displays a query plan (the result of query optimization)
and also displays the execution of that plan (possibly by displaying how the computation is proceeding).
<p>
<li> Add path and function indexes (2 persons)
<p>
This project involves getting a good understanding of the way indexes are used in query processing.
Path indexes are complex indexes, which can be implemented on top of the existing simple index functionality in PREDATOR.
They are very important in object-oriented and object-relational database systems.
In this project, you will need to provide fully working path index capability
(specifying an index in SQL, recording its presence in the catalogs, using the index if applicable in query optimization,
and actually retrieving from the index at run-time).
This project will give you a very good grasp of the internals of query processing engines.
<p>
<li> Evaluate the Wisconsin benchmark (2 persons)
<p>
The Wisconsin benchmark is an industry standard DBMS benchmark that is used to measure the performance of a relational DBMS.
The project has two parts: first, implement the GroupBy and OrderBy features that are currently not supported in PREDATOR.
Second, execute the benchmark, and try to enhance the performance to whatever extent is possible.
This is invaluable experience if you plan to work on performance-related issues in a real DBMS.
<p>
<li> Evaluate the TPC-D benchmark (2 persons)
<p>
The TPC-D benchmark is an advanced query processing benchmark, and some of the functionality for this benchmark is not yet in PREDATOR.
This project will involve a balance of adding some functionality (so that some of the benchmark queries run)
and improving the performance of those queries.
Again, like the previous project, this is good exposure to practical benchmarks that people care about.
<p>
<li> Implement an image data type (RIVL) (2 persons)
<p>
PREDATOR already has a very elementary image data type.
This project will implement a large part of the support for images found in RIVL (Brian Smith's multi-media system),
with operations on an image such as rotate, clip, and overlap.
<p>
<li> Implement an image data type (Vision) (2 persons)
<p>
PREDATOR already has a very elementary image data type.
This project will involve interacting with Ramin Zabih to incorporate his feature extraction algorithms,
and using the extracted image features to index the image data.
<p>
<li> Implement a video data type (1 or 2 persons)
<p>
Add a video data type with support for the various operations defined in RIVL (Brian Smith's multi-media system).
<p>
<li> Implement an audio data type (1 or 2 persons)
<p>
This requires some knowledge of audio data and the likely operations on audio.
Audio data needs to be added as a data type, along with manipulation functions.
<p>
<li> Implement a text document data type (1 or 2 persons)
<p>
Add a document data type, along with NLP operations on the document (based on Claire Cardie's work).
This will require interaction with the NLP people.
<p>
<li> Implement a data type for chemical molecules (1 or 2 persons)
<p>
Pharmaceutical companies have huge databases of chemical molecular structures,
and much of their research involves searching these databases for 3-D spatial matches of molecules.
This project will try to support a molecule structure as a data type in the database.
Operations on the molecule will be based on research that Paul Schuh and others have done in this area.
The project will involve interactions with that group.
<p>
<li> Build a C++ language embedding for PREDATOR (2 persons)
<p>
Any commercial SQL system allows queries to be embedded inside a host language (like C, C++, COBOL, etc.).
This project will build a C++ embedding of PREDATOR SQL.
<p>
<li> Integrate external databases into PREDATOR
<p>
PREDATOR has an extensibility mechanism that allows new query processing engines to be incorporated into the system.
This project will extend this mechanism to integrate external databases (for instance, an Informix server) into PREDATOR.
<p>
<li> Ensure multi-user functionality (1 person)
<p>
PREDATOR is a client-server system, implemented with a multi-threaded server.
However, the multi-user nature of the system has not been tested, and there are several problems.
This project will fix all the problems and demonstrate multi-user capability.
<p>
<li> Build a full SQL-92 compliant parser, up to the level of type checking and transformations (1 or 2 persons)
<p>
The SQL currently supported by PREDATOR is a small subset of the ANSI standard.
This project will make sure that the ANSI standard SQL-92 is implemented to the extent of parsing and type checking.
If 2 persons work on this, some query transformations will also be required in this project.
<p>
<li> Implement materialized views/query caching with indexed retrieval
<p>
For quite a while, researchers have suggested that the results of queries can be cached for later use in executing another query.
This project will provide some portion of this functionality.
Since this is an ongoing topic of research, this project must go along with a paper survey of this topic.
<p>
</ul>
<hr>
<h2>MINIBASE Projects</h2>
<ul>
<li> Implement B+ trees
<p>
<li> Implement Hybrid-Hash Join and Sort-Merge Join
<p>
<li> Implement two new buffer replacement policies
<p>
<li> Implement R-tree Indexes
<p>
<li> Implement Hash-based and Sort-based Aggregation
<p>
</ul>
<hr>
<h2>Other Projects</h2>
<ul>
<li> Build a "data-mining" DBMS (2 persons; up to 2 such groups)
<p>
Data mining is an exciting new area that blends AI with databases.
The idea is that there is information, or there are patterns, hidden in a database that are not very evident.
For instance, various statistical patterns or empirical cause-effect rules can be extracted from medical databases.
This project must go along with a data mining paper survey.
The purpose of the project is to implement some of the algorithms suggested in the literature and see how they perform.
<p>
<li> Implement data clustering algorithms (2 persons)
<p>
This is another aspect of data mining (see above).
Here we are looking to classify a large amount of data into a few groups or clusters based on some properties.
The point is to do this efficiently.
Several algorithms have been proposed in the literature.
The project will involve implementing a few of them.
<p>
<li> Build a standalone System-R optimizer and a randomized optimizer (2 persons)
<p>
Query optimization is (and has been) a very important topic in database query processing.
In this project, you will build a stand-alone query optimizer using two or three different approaches.
The purpose is to compare the alternatives suggested in the literature.
This project must go along with a paper survey on query optimization.
<p>
<li> Build a simple OLAP (On-Line Analytical Processing) system (2 persons)
<p>
OLAP is a very refined form of query processing that involves a large amount of precomputation of answers.
The queries typically involve several aggregates, and the answers are presented graphically.
<p>
<li> Build OLE-DB database components (2 persons)
<p>
OLE-DB is a protocol that Microsoft has developed on top of OLE/COM to allow multiple databases to connect and interoperate.
This project will involve using this protocol to build OLE-DB compliant database components,
and will be built on NT using Visual C++.
Since you will not have a lot of help on Visual C++ from me or the TA,
you should already be familiar with this environment if you plan to take on this project.
<p>
</ul>
<hr><HR></body></html>
