Saturday, July 12, 2008

SERIALIZATION

What is serialization in .NET and what are the ways to control serialization?

Serialization is the process of converting an object into a stream of bytes. On the other hand Deserialization is the process of creating an object from a stream of bytes. Serialization/Deserialization is used to transport or to persist objects. Serialization can be defined as the process of storing the state of an object to a storage medium. During this process, the public and private fields of the object and the name of the class, including the assembly are converted to a stream of bytes. Which is then written to a data stream. Upon the object's subsequent deserialized, an exact clone of the original object is created.

Binary serialization preserves Type fidelity, which is useful for preserving the state of an object between different invocations of an application. For example: An object can be shared between different applications by serializing it to the clipboard.

You can serialize an object to a stream, disk, memory, over a network, and so forth. Remoting uses serialization to pass objects "By Value" from one computer or application domain to another. XML serialization serializes only public properties and fields and does not preserve Type fidelity. This is useful when you want to provide or consume data without restricting the application that uses the data.

As XML is an open standard, it is an attractive choice for sharing data across the Web. SOAP is also an open standard, which makes it an attractive choice too. There are two separate mechanisms provided by the .NET class library - XmlSerializer and SoapFormatter/BinaryFormatter. Microsoft uses XmlSerializer for Web Services, and uses SoapFormatter/BinaryFormatter for remoting. Both are available for use in your own code.

What is serialization?

Serialization is the process of converting an object into a stream of bytes. Deserialization is the opposite process of creating an object from a stream of bytes. Serialization / Deserialization is mostly used to transport objects (e.g. during remoting), or to persist
objects (e.g. to a file or database).

What is a formatter?

A formatter is an object that is responsible for encoding and serializing data into messages on one end, and deserializing and decoding messages into data on the other end.

Does the .NET Framework have in-built support for serialization?

There are two separate mechanisms provided by the .NET class library - XmlSerializer and SoapFormatter/BinaryFormatter. Microsoft uses XmlSerializer for Web Services, and uses SoapFormatter/BinaryFormatter for remoting. Both are available for use in your own code.

Can I customise the serialization process?

Yes.

XmlSerializer supports a range of attributes that can be used to configure serialization for a particular class. For example, a field or property can be marked with the [XmlIgnore] attribute to exclude it from serialization. Another example is the [XmlElement]
attribute, which can be used to specify the XML element name to be used for a particular property or field.

Serialization via SoapFormatter/BinaryFormatter can also be controlled to some extent by attributes. For example, the [NonSerialized] attribute is the equivalent of XmlSerializer's [XmlIgnore] attribute. Ultimate control of the serialization process can be acheived by implementing the the ISerializable interface on the class whose instances are to be serialized.

Why is XmlSerializer so slow?

There is a once-per-process-per-type overhead with XmlSerializer. So the first time you serialize or deserialize an object of a given type in an application, there is a significant delay. This normally doesn't matter, but it may mean, for example, that XmlSerializer is a poor choice for loading configuration settings during startup of a GUI application.

Why do I get errors when I try to serialize a Hashtable?

XmlSerializer will refuse to serialize instances of any class that implements IDictionary, e.g. Hashtable. SoapFormatter and BinaryFormatter do not have this restriction.

What is serialization in .NET? What are the ways to control serialization?

Serialization is the process of converting an object into a stream of bytes. Deserialization is the opposite process of creating an object from a stream of bytes. Serialization/Deserialization is mostly used to transport objects (e.g. during remoting), or to persist objects (e.g. to a file or database).Serialization can be defined as the process of storing the state of an object to a storage medium. During this process, the public and private fields of the object and the name of the class, including the assembly containing the class, are converted to a stream of bytes, which is then written to a data stream. When the object is subsequently deserialized, an exact clone of the original object is created.

• Binary serialization preserves type fidelity, which is useful for preserving the state of an object between different invocations of an application. For example, you can share an object between different applications by serializing it to the clipboard. You can serialize an object to a stream, disk, memory, over the network, and so forth. Remoting uses serialization to pass objects "by value" from one computer or application domain to another.

• XML serialization serializes only public properties and fields and does not preserve type fidelity. This is useful when you want to provide or consume data without restricting the application that uses the data. Because XML is an open standard, it is an attractive choice for sharing data across the Web. SOAP is an open standard, which makes it an attractive choice.
There are two separate mechanisms provided by the .NET class library - XmlSerializer and SoapFormatter/BinaryFormatter. Microsoft uses XmlSerializer for Web Services, and uses SoapFormatter/BinaryFormatter for remoting. Both are available for use in your own code.

Why do I get errors when I try to serialize a Hashtable?

XmlSerializer will refuse to serialize instances of any class that implements IDictionary, e.g. Hashtable. SoapFormatter and BinaryFormatter do not have this restriction.

Serialize and MarshalByRef?

When a client application makes use of a marshal-by-reference object, it can activate the object using server-activation or client-activation. This is the main advantage of using MarshalByRef.

Let's kick off with the first part: creating a class library which can host the ASP.NET runtime to execute (in our case simple) .aspx pages. The .NET Framework provides a mechanism which allows us to use the ASP.NET runtime without requiring the ISAPI extension for IIS.

So, let's open up an instance of Visual Studio .NET 2003 and create a new type class library project in C# (the code for VB.NET is available as well): delete the Class1.cs file and add a new Class file called Host.cs. The first thing we need to do is to inherit this class from the MarshalByRef class. This is required because the ASP.NET runtime is using application domains to run web applications. By inheriting from the MarshalByRef class, we're enabling access to our object across application domain boundaries. If you've ever written a .NET Remoting application yourself, I'm sure you have used this class before.

.NET Marshalling
Thus .NET runtime automatically generates code to translate calls between managed code and unmanaged code. While transferring calls between these two codes, .NET handles the data type conversion also. This technique of automatically binding with the server data type to the client data type is known as marshalling. Marshaling occurs between managed heap and unmanaged heap. For example, Fig.4 shows a call from the .NET client to a COM component. This sample call passes a .NET string from the client. The RCW converts this .NET data type into the COM compatible data type. In this case COM compatible data type is BSTR. Thus the RCW converts the .NET string into COM compatible BSTR. This BSTR will be passed to the object and the required calls will be made. The results will be returned to back to the RCW. The RCW converts this COM compatible result to .NET native data type.

Fig.4 Sample diagram for marshalling
Logically the marshalling can be classified into 2 types.
1. Interop marshalling
2. COM marshalling
If a call occurs between managed code and unmanaged code with in the same apartment, Interop marshaler will play the role. It marshals data between managed code and unmanaged code.
In some scenarios COM component may be running in different apartment threads. In those cases i.e., calling between managed code and unmanaged code in different apartments or process, both Interop marshaler and COM marshaler are involved.
Interop marshaler
When the server object is created in the same apartment of client, all data marshaling is handled by Interop marshaling.

Fig.5 Sample diagram for same apartment marshalling
COM marshaler
COM marshaling involved whenever the calls between managed code and unmanaged code are in different apartments. For eg., when a .NET client (with the default apartment settings) communicates with a COM component (whichever developed in VB6.0), the communication occurs through proxy and stub because both the objects will be running in different apartment threads. (The default apartment settings of .NET objects are STA and the components which are developed by VB6.0 are STA). Between these two different apartments COM marshaling will occurs and with in the apartment Interop marshaling will occurs. Fig.6 shows this kind of marshaling.
This kind of different apartment communication will impact the performance. The apartment settings of the managed client can be changed by changing the STAThreadAttribute / MTAThreadAttribute / Thread.ApartmentState property. Both the codes can run in a same apartment, by making the managed code’s thread to STA. (If the COM component is set as MTA, then cross marshaling will occurs.)

Fig.6 Sample diagram for cross apartment marshalling
In the above scenario, the call with in different apartments will occur by COM marshaling and the call between managed and unmanaged code will occur by Interop marshaling.

Write syntax to serialize class using XML Serializer?

The xml serializer is the simplest possible serializer. It generates an xml document from the sax events. This serializer is used for serializing to any XML document format, ie. SVG, WML, VRML, et. al.

Name : xml
Class: org.apache.cocoon.serialization.XMLSerializer
Cacheable: yes.


...
src="org.apache.cocoon.serialization.XMLSerializer"
mime-type="text/xml"
logger="sitemap.serializer.xml"
pool-grow="4" pool-max="32" pool-min="4">

...

...

No comments: