📄 symboltable.otx

📁 inger小型c编译器源码
💻 OTX
📖 第 1 页 / 共 2 页
字号:
12 下一页
% symboltable.tex
% Practical Compiler Construction
% Chapter: Symbol table

\chapter{Symbol table}
    
	\section{Introduction to identification}
	%Introduce the concept of identification

%	\begin{quote}
%	    \textbf{Why is identification required? Give an example of a flawed program.}
%	\end{quote}

	At compile time we need to keep track of the different symbols 
	(functions and variables) declared in the program source code. 
	Identification is required because every symbol should be identified
	with a unique name so we can find the symbols back later. Below an
	example of a flawed program.
	
	\begin{example} An incorrect program
	    \begin{quote}
	        \begin{verbatim}
		    2 * 4;
		    myfunction( 3 );
		\end{verbatim}
	    \end{quote}
	\end{example}
	
	Without identification it is not possible to do assignments, or define 
	functions which makes most of the code more or less useless. In the 
	above example we have a mathematical production but the result is lost 
	because no assignment to a uniquely identified symbol is done. We also 
	call a function myfunction which is never declared and identified as 
	being a function, and therefore not available cause we could not 
	possibly know to what myfunction is referring.\\
	
%	\begin{quote}
%		\textbf{What is scoping? What benefits does scoping have? Give an example program
%		that makes good use of scoping. Show that identification uses scopes by 
%		using identical identifiers in different scopes.}
%	\end{quote}	

	What actually is scoping? In Webster's Revised Unabridged Dictionary (1913) a scope
	is defined as:
	
	\begin{quote}
		\emph{Room or opportunity for free outlook or aim; space for action; 
		amplitude of opportunity; free course or vent; liberty; range of view, 
		intent, or action.}
	\end{quote}

	When talking about scoping in a programming context it means that certain 
	actions like variable assignments, or symbol lookups only affect symbols 
	in a certain range. Lets illustrate this with an example in which the start 
	of a scope is indicated by a {, and a scope end by }.

	\begin{example}
	    A simple scope example
	    \begin{verbatim}				    
		/* begin scope 0 */
		    
		type_x a = 4
		type_y b = 3
		
		main( )
		/* begin scope 1 */
		{
		  type_x a = 0
		
		  /* begin scope 2 */
		  {				
		    type_x a = 5
		    print a				
		  }
		  /* end of scope 2 */
		  print b
		}
		/* end of scope 1 */
		/* end of scope 0 */
	    \end{verbatim}
	\end{example}
	
	This example shows a program with 3 scopes containing variable declarations,
	whenever a global or local symbol is encountered it should be stored into 
	a symbol table. This symbol table will remain available until the compiler 
	is finished.\\
	
	In earlier stages of research we thought local symbols should only be present 
	in the symbol table when the function it is declared in is being processed. 
	This does not seem to be a good idea, because later on in the semantic stage 
	we need access to all symbols.Using identification in scopes enables us to 
	declare a variable or symbol multiple times in the same program but in different 
	scopes, this implicates that a symbol is unique in its scope and not 
	necessarily in the program, this way we do not run out of useful variable 
	names within a program.	
	
	\section{Identification using a symbol table}
	% Explain why a symbol table is required, and how it is used for identification
	
%	\begin{quote}
%		\emph{Argue that symbols must be stored for later reference (for identification
%		purposes in later stages). This shows why we require a data structure to store 
%		symbols in.}
%	\end{quote}	
	
	Symbols collected during the first pass of compilation must be stored for later 
	reference in the semantic stage. In order to have easy access to the symbols in 
	later stages it is important to define a clear data structure in which we store 
	the necessary information about the symbols. This data structure is called a 
	symbol table and can be implemented in a variety of ways (arrays, linked lists, 
	hash tables, binary search trees). Later on in this chapter we will discuss what 
	data structure would be best for us.\\
	
%	\begin{quote}
%		\emph{Using a very simple symbol table structure (i.e. an array), show how the 
%		ST is filled from the AST. Give an algorithm and an example.}
%	\end{quote}
	
	To show how the symbol table is filled from the Abstract Syntax Tree we will use 
	a very simple table structure as an example; an array. Below a sequence of actions 
	displays which steps are necessary to build the symbol table, a simple data structure, 
	from the AST.
	
	\begin{enumerate}
		\item	Start walking at the root node of the AST.
		\item	For each scope entered during parsing, add one to the symbol 
			table's scope counter.
		\item	For each declaration instruction parsed, we extract:
		\begin{itemize}
			\item[-] Name of the symbol
			\item[-] Symbol type
			\item[-] Indirection level
			\item[-] Modifiers
		\end{itemize}
	
		\item	For each function found, we extract:
		\begin{itemize}
			\item[-] process its function header, and store the same information of 
	  		possible parameters as we do with variable declarations.
	  	\end{itemize}
	  \end{enumerate}
	
	Below an array, representing a simple symbol table to illustrate the idea of the table.\\
	
		PICTURE HERE\\

	\section{Symbol table data structures}
	%An overview of some available data structures, and how to make the right choice
	
%	\begin{quote}
%		\emph{Dynamic vs. static symbol table. Why? What are the other options, 
%		and why are they wrong? Why is an AST necessary at all? Why are multiple 
%		passes through the AST necessary?}
%	\end{quote}
	
	There are two possible types of symbol tables, dynamic or static symbol tables. What 
	exactly are the differences between the two types, and why is one better than the other? 
	A dynamic symbol table will grow and shrink during the compilation pass and will not be 
	available anymore after the pass is finished. This is not the desired behaviour of the 
	symbol table since we will apply the multi pass principle which means not all the required 
	compilation actions (lexing, parsing, semantic checking and code generation) will take 
	place in one pass. Below an example of the collection of symbols evolving during pass 1.


	\begin{example}
	  \begin{verbatim}
		 1 module example;
		 2 
		 3 int v1, v2;
		 4 
		 5 f: int v1, v2 -> int
		 6 {
		 7     return (v1 + v2);
		 8 }
		 9  
		10 g: int v3 -> int
		11 {
		12     return  (v1 + v3);
		13 }

	\end{verbatim}
	
	The collection of symbols is extended during the process.\\
	At line 4 the symbol table is a collection of variables: T = \{ v1, v2 \}.\\
	At line 7 the symbol table also contains the local variables v1 and v2:\\ T = 
	\{ v1, v2, v1, v2 \}.\\
	At line 9 the symbol table is back to its global form: T = { v1, v2 }.\\
	At line 12 the symbol table is expanded with the local variable v3:\\ 
	T = \{ v1, v2, v3 \}.
	\end{example}

	When using a static symbol table, the table will not shrink but only grow. Instead 
	of building the symbol table during pass 1 which happens when using a dynamic table, 
	we will construct the symbol table from the AST. The AST will be available after 
	parsing the source code.

	\section{Data structure selection}
	% Several aspects on the selection of a suitable data structure for the symbol table.

	\begin{enumerate}
		\item	Criteria
		\item	Several data structures compared using these criteria.
		\item	Select a data structure.
	\end{enumerate}

	\subsection{Criteria}
	In order to choose the right data structure for implementing the symbol table we
12 下一页
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -