_store.DeSerialize (in);
}</pre>
</td></tr>
</table>
<!-- End Code -->
<p>The symbol table consists of a dictionary that maps strings to integers plus a variable that contains the current id. And the simplest way to walk the symbol table is indeed in this order. To walk the standard map we will use its iterator. First we have to store the count of elements, so that we know how many to read during deserialization. Then we will iterate over the whole map and store pairs: string, id. Notice that the iterator for <var>std::map</var> points to a <var>std::pair</var> which has <var>first</var> and <var>second</var> data members. According to our previous discussion, we store the integer id as a <var>long</var>.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
<td class=codeTable>
<pre>void SymbolTable::Serialize (Serializer & out) const
{
    out.PutLong (_dictionary.size ());
    std::map&lt;std::string, int&gt;::const_iterator it;
    for (it = _dictionary.begin (); it != _dictionary.end (); ++it)
    {
        out.PutString (it->first);
        out.PutLong (it->second);
    }
    out.PutLong (_id);
}</pre>
</td></tr>
</table>
<!-- End Code -->
<p>The deserializer must read the data in the same order in which it was serialized: first the dictionary, then the current id. When deserializing the map, we first read its size. Then we simply read pairs of strings and longs and add them to the map. Here we treat the map as an associative array. Notice that we first clear the existing dictionary. We have to do it; otherwise we could get conflicts, with the same id corresponding to different strings.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
<td class=codeTable>
<pre>void SymbolTable::DeSerialize (DeSerializer & in)
{
    _dictionary.clear ();
    int len = in.GetLong ();
    for (int i = 0; i &lt; len; ++i)
    {
        std::string str = in.GetString ();
        int id = in.GetLong ();
        _dictionary [str] = id;
    }
    _id = in.GetLong ();
}</pre>
</td></tr>
</table>
<!-- End Code -->
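<p>The count-then-pairs layout described above can be tried out without the rest of the calculator. The following is a minimal sketch, not the book's <var>Serializer</var> classes: the free functions <var>PutLong</var>, <var>GetLong</var>, <var>PutString</var>, <var>GetString</var>, <var>SaveDictionary</var>, <var>LoadDictionary</var> and <var>RoundTripDemo</var> are hypothetical stand-ins, and an in-memory <var>std::stringstream</var> is assumed in place of a file.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, int> Dictionary;

// Hypothetical helper: write a long as raw bytes in native byte order.
void PutLong (std::ostream & out, long l)
{
    out.write (reinterpret_cast<char const *> (&l), sizeof (long));
}

// Hypothetical helper: read back a long written by PutLong.
long GetLong (std::istream & in)
{
    long l = 0;
    in.read (reinterpret_cast<char *> (&l), sizeof (long));
    if (in.fail ())
        throw "file read failed";
    return l;
}

// A string is stored as its length followed by its characters.
void PutString (std::ostream & out, std::string const & str)
{
    PutLong (out, static_cast<long> (str.size ()));
    out.write (str.data (), static_cast<std::streamsize> (str.size ()));
}

std::string GetString (std::istream & in)
{
    long len = GetLong (in);
    if (len < 0)
        throw "data corruption";
    std::string str (static_cast<std::string::size_type> (len), '\0');
    if (len > 0)
        in.read (&str [0], len);
    if (in.fail ())
        throw "file read failed";
    return str;
}

// First the count, then the pairs: string, id.
void SaveDictionary (std::ostream & out, Dictionary const & dict)
{
    PutLong (out, static_cast<long> (dict.size ()));
    for (Dictionary::const_iterator it = dict.begin (); it != dict.end (); ++it)
    {
        PutString (out, it->first);
        PutLong (out, it->second);
    }
}

// Read the count, then exactly that many pairs, in the same order.
Dictionary LoadDictionary (std::istream & in)
{
    Dictionary dict;
    long count = GetLong (in);
    for (long i = 0; i < count; ++i)
    {
        std::string str = GetString (in);
        int id = static_cast<int> (GetLong (in));
        dict [str] = id;
    }
    return dict;
}

// Save a small dictionary to memory and check that it loads back unchanged.
void RoundTripDemo ()
{
    Dictionary dict;
    dict ["x"] = 0;
    dict ["pi"] = 1;
    std::stringstream buf (std::ios_base::in | std::ios_base::out | std::ios_base::binary);
    SaveDictionary (buf, dict);
    Dictionary copy = LoadDictionary (buf);
    assert (copy == dict);
}
```

<p>Reading the string and the id in two separate statements matters: if both calls appeared in one expression, the compiler would be free to evaluate them in either order.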
<p>Notice that for every serialization procedure we immediately write its counterpart--the deserialization procedure. This way we make sure that the two match.
<p>The serialization of the store is also very simple. First we store the size and then a series of pairs (double, bool).
<!-- Code --><table width=100% cellspacing=10><tr> <td class=codetable>
<pre>void Store::Serialize (Serializer & out) const
{
    int len = _aCell.size ();
    out.PutLong (len);
    for (int i = 0; i &lt; len; ++i)
    {
        out.PutDouble (_aCell [i]);
        out.PutBool (_aIsInit [i]);
    }
}</pre>
</td></tr></table><!-- End Code -->
<p>When deserializing the store, we first clear the previous values, read the size and then read the pairs (double, bool) one by one. We have a few options when filling the two vectors with new values. One would be to push them back, one by one. Since we know the number of entries up front, we could also reserve space in the vectors beforehand by calling the method <var>reserve</var>. Here I decided to <var>resize</var> the vectors instead and then treat them as arrays. The resizing fills the vector of doubles with zeroes and the vector of <var>bool</var> with <var>false</var> (these are the default values for these types).
<!-- Sidebar -->
<table width=100% border=0 cellpadding=5><tr>
<td width=10>
<td bgcolor="#cccccc" class=sidebar>
<p>There is an important difference between <var>reserve</var> and <var>resize</var>. Most standard containers have either one or both of these methods. <var>Reserve</var> makes sure that there will be no re-allocation when elements are added, e.g., using <var>push_back</var>, up to the reserved <i>capacity</i>. This is a good optimization, in case we know the required capacity up front. In the case of a vector, the absence of re-allocation also means that iterators, pointers or references to the elements of the vector won't be suddenly invalidated by internal reallocation.
<p><var>Reserve</var>, however, does not change the <i>size</i> of the container. <var>Resize</var> does. When you resize a container to a larger size, new elements are added to it. (Consequently, you can't call the one-argument <var>resize</var> on containers that store objects with no default constructors or default values; for those you have to pass the fill value as the second argument.)
<ul>
<li>reserve--changes capacity but not size
<li>resize--changes size
</ul>
<p>You can enquire about the current capacity of the container by calling its <var>capacity</var> method. And, of course, you get its size by calling <var>size</var>.
</table>
<!-- End Sidebar -->
<!-- Code --><table width=100% cellspacing=10><tr> <td class=codetable>
<pre>void Store::DeSerialize (DeSerializer & in)
{
    _aCell.clear ();
    _aIsInit.clear ();
    int len = in.GetLong ();
    _aCell.resize (len);
    _aIsInit.resize (len);
    for (int i = 0; i &lt; len; ++i)
    {
        _aCell [i] = in.GetDouble ();
        _aIsInit [i] = in.GetBool ();
    }
}</pre>
</td></tr></table><!-- End Code -->
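<p>The difference between <var>reserve</var> and <var>resize</var> can be checked directly. This is a small sketch with two hypothetical helper functions, <var>FillWithReserve</var> and <var>FillWithResize</var>, that build the same vector in the two ways discussed above; the values stored are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reserve capacity first, then grow the vector with push_back.
std::vector<double> FillWithReserve (std::size_t n)
{
    std::vector<double> v;
    v.reserve (n);
    // Capacity has grown, but the vector is still empty.
    assert (v.size () == 0 && v.capacity () >= n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back (0.5 * i);   // no re-allocation for the first n elements
    return v;
}

// Resize first, then treat the vector as an array.
std::vector<double> FillWithResize (std::size_t n)
{
    std::vector<double> v;
    v.resize (n);
    // The vector now holds n elements, each equal to 0.0.
    assert (v.size () == n);
    for (std::size_t i = 0; i < n; ++i)
        v [i] = 0.5 * i;         // indexing is valid because the elements exist
    return v;
}
```

<p>Both functions return equal vectors. Indexing with <var>v [i]</var> after only a <var>reserve</var> would be an error, since no elements exist yet.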
<p>Finally, let's have a look at the implementation of the deserializer stream. It is a pretty thin layer on top of the input stream.
<!-- Code -->
<table width="100%" cellspacing=10><tr>
<td class=codeTable>
<pre>#include &lt;fstream&gt;
#include &lt;string&gt;
using std::ios_base;
// Bit patterns that represent the two bool values in the file
const long TruePattern = 0xfab1fab2;
const long FalsePattern = 0xbad1bad2;
class DeSerializer
{
public:
    DeSerializer (std::string const & nameFile)
        : _stream (nameFile.c_str (), ios_base::in | ios_base::binary)
    {
        if (!_stream.is_open ())
            throw "couldn't open file";
    }
    long GetLong ()
    {
        if (_stream.eof())
            throw "unexpected end of file";
        long l;
        _stream.read (reinterpret_cast&lt;char *&gt; (&l), sizeof (long));
        // fail () also reports a short read, which bad () alone would miss
        if (_stream.fail())
            throw "file read failed";
        return l;
    }
    double GetDouble ()
    {
        double d;
        if (_stream.eof())
            throw "unexpected end of file";
        _stream.read (reinterpret_cast&lt;char *&gt; (&d), sizeof (double));
        if (_stream.fail())
            throw "file read failed";
        return d;
    }
    std::string GetString ()
    {
        long len = GetLong ();
        if (len &lt; 0)
            throw "data corruption";
        std::string str;
        str.resize (len);
        if (len > 0)
            _stream.read (&str [0], len);
        if (_stream.fail())
            throw "file read failed";
        return str;
    }
    bool GetBool ()
    {
        long b = GetLong (); // throws if the read fails
        if (b == TruePattern)
            return true;
        else if (b == FalsePattern)
            return false;
        else
            throw "data corruption";
    }
private:
    std::ifstream _stream;
};</pre>
</td></tr>
</table>
<!-- End Code -->
<p>Several interesting things happen here. First of all: What are these strange flags that we pass to the <var>ifstream</var> constructor? The first one, <var>ios_base::in</var>, means that we are opening the file for input. The second one, <var>ios_base::binary</var>, says that we don't want any carriage return-linefeed translations.
<!-- Sidebar -->
<table width=100% border=0 cellpadding=5><tr>
<td width=10>
<td bgcolor="#cccccc" class=sidebar>
What is this carriage return-linefeed nonsense? It's one of the biggest blunders of the DOS file system, which was unfortunately inherited by all flavors of Windows. The creators of DOS decided that the system should convert the single character '\n' into a pair '\r', '\n'. The reasoning was that, when you print a file, the printer interprets carriage return, '\r', as the command to go back to the beginning of the current line, and line feed, '\n', as the command to move down to the next line (not necessarily to its beginning). So, to go to the beginning of the next line, a printer requires two characters. Nowadays, when we use laser printers that understand PostScript and print WYSIWYG documents, this whole idea seems rather odd. Even more so if you consider that an older operating system, Unix, found a way of dealing with this problem without involving low level file system services.
<p>Anyway, if all you want is to store bytes of data in a file, you have to remember to open it in the "binary" mode, otherwise you might get unexpected results. By the way, the default is text mode, which does the unfortunate character translation. There is no separate flag for it: text mode is simply what you get when you don't specify <var>ios_base::binary</var>.
</table>
<!-- End Sidebar -->
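<p>To see that binary mode leaves the bytes alone, one can write a few characters, including '\n' and '\r', and read them back. This is only a sketch: <var>BinaryRoundTrip</var> is a hypothetical helper, and it assumes the program is allowed to create (and then delete) a scratch file under the name passed to it.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Write bytes to a file in binary mode, read them back in binary mode,
// delete the file, and return what was read.
std::string BinaryRoundTrip (std::string const & fileName, std::string const & bytes)
{
    {
        std::ofstream out (fileName.c_str (), std::ios_base::out | std::ios_base::binary);
        if (!out.is_open ())
            throw "couldn't open file for writing";
        out.write (bytes.data (), static_cast<std::streamsize> (bytes.size ()));
    }   // the output file is closed here
    std::ifstream in (fileName.c_str (), std::ios_base::in | std::ios_base::binary);
    if (!in.is_open ())
        throw "couldn't open file for reading";
    std::string result;
    char c;
    while (in.get (c))   // get does not skip or translate any characters
        result += c;
    in.close ();
    std::remove (fileName.c_str ());
    return result;
}
```

<p>With both streams in binary mode the string that comes back has the same length and the same bytes as the one that went in, on any platform. If either stream were opened in text mode, the result on Windows could differ.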
<p>Another interesting point is that the method <var>ifstream::read</var> reads data to a character buffer--it expects <var>char *</var> as its first argument. When we want to read a long, we can't just pass the address of a long to it--the compiler won't implicitly convert a <var>long *</var> to a <var>char *</var>. This is one of those cases when we <i>have to</i> force the compiler to trust us. We want to split the long into its constituent bytes (we're ignoring here the big endian/little endian problem). A reasonably clean way to do it is to use the <var>reinterpret_cast</var>. We are essentially telling the compiler to "reinterpret" a chunk of memory occupied by the long as a series of chars. We can tell how many chars a long contains by applying to it the operator <var>sizeof</var>.
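<p>The same reinterpretation can be shown without any stream at all. In this sketch the hypothetical functions <var>ToBytes</var> and <var>FromBytes</var> split a long into <var>sizeof (long)</var> chars and put it back together; like the deserializer, they use the native byte order and ignore the endianness question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// View the memory of a long as a series of chars and copy them out.
std::vector<char> ToBytes (long value)
{
    char const * p = reinterpret_cast<char const *> (&value);
    return std::vector<char> (p, p + sizeof (long));
}

// Rebuild a long by writing the chars back into its memory.
long FromBytes (std::vector<char> const & bytes)
{
    assert (bytes.size () == sizeof (long));
    long value = 0;
    char * p = reinterpret_cast<char *> (&value);
    for (std::size_t i = 0; i < sizeof (long); ++i)
        p [i] = bytes [i];
    return value;
}
```

<p>The bytes produced this way are only meaningful to a program that uses the same size and byte order for a long.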
<p>This is a good place to explain the various types of casts. You use
<ul>
<li>const_cast--to remove the const attribute
<li>static_cast--to convert related types
<li>reinterpret_cast--to convert unrelated types
</ul>
<p>(There is also a dynamic_cast, which we won't discuss here.)
<p>Here's an example of const_cast:
<!-- Code --><table width=100% cellspacing=10><tr> <td class=codetable>
<pre>char buf [] = "No modify!";
char const * str = buf;
char * tmp = const_cast&lt;char *&gt; (str);
tmp [0] = 'D'; // well defined only because buf itself is not const</pre>
</td></tr></table><!-- End Code -->
<p>To understand static_cast, think of it as the inverse of implicit conversion. Whenever type T can be implicitly converted to type U (in other words, T is-a U), you can use static_cast to perform the conversion the other way. For instance, a char can be implicitly converted to an int:
<!-- Code --><table width=100% cellspacing=10><tr> <td class=codetable>
<pre>char c = '\n';
int i = c; // implicit conversion</pre>