You may have noticed that when XmlWriter or XmlTextWriter writes to a StringBuilder or StringWriter, the XML declaration on the first line of the output reports UTF-16 as the encoding:
<?xml version="1.0" encoding="utf-16"?>
This happens even if you explicitly set the Encoding property in the XmlWriterSettings to something different, such as UTF-8:
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = System.Text.Encoding.UTF8;
XmlWriter writer = XmlWriter.Create(sb, settings);
The problem occurs because the StringWriter defaults to UTF-16. (It’s not clear from the example above, but the XmlWriter class uses a StringWriter to output the XML to the specified StringBuilder.)
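To see where the UTF-16 comes from, a minimal sketch along these lines can help. The class name EncodingProbe and its method names are made up for illustration and are not part of the original sample; the sketch assumes a plain console program.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class, used only for this illustration.
public static class EncodingProbe
{
    // Returns the name of the encoding that a plain StringWriter reports.
    public static string StringWriterEncodingName()
    {
        using (StringWriter sw = new StringWriter())
        {
            return sw.Encoding.WebName;
        }
    }

    // Writes an empty Root element into a StringBuilder while requesting
    // UTF-8 in the settings, then returns the XML text that was produced.
    public static string WriteWithUtf8Settings()
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = Encoding.UTF8;
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("Root");
            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(StringWriterEncodingName());
        Console.WriteLine(WriteWithUtf8Settings());
    }
}
```

The first line printed should be utf-16, and the declaration in the second line should still read encoding="utf-16" even though the settings asked for UTF-8.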
The key is to change the StringWriter Encoding, but unfortunately you cannot set the Encoding property directly. Instead, you must create your own class that derives from StringWriter and override the Encoding property as follows:
using System.IO;
using System.Text;

public class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding m_Encoding;

    public StringWriterWithEncoding( StringBuilder sb, Encoding encoding )
        : base( sb )
    {
        this.m_Encoding = encoding;
    }

    // Report the encoding passed to the constructor instead of the UTF-16 default.
    public override Encoding Encoding
    {
        get { return this.m_Encoding; }
    }
}
And here is a simple example of how you would use this new class. Note that you don’t have to explicitly set the Encoding in the XmlWriterSettings because the XmlWriter takes the encoding for the XML declaration from the StringWriter’s Encoding property. However, you may wish to set the XmlWriterSettings.CloseOutput property to true so that the StringWriter is closed automatically with the XmlWriter.
StringBuilder sb = new StringBuilder();
StringWriterWithEncoding stringWriter = new StringWriterWithEncoding( sb, Encoding.UTF8 );

XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.CloseOutput = true;

XmlWriter writer = XmlWriter.Create( stringWriter, settings );
writer.WriteStartElement( "Root" );
writer.WriteElementString( "Test", "Value" );
writer.WriteEndElement();
writer.Close();

string xml = sb.ToString();
Console.WriteLine( xml );
Here is the output from this sample program:
<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Test>Value</Test>
</Root>
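One caveat worth keeping in mind: the result is still an ordinary .NET string in memory, so the declaration only matches the data once the text is actually encoded as UTF-8 on its way to a file or stream. Below is a rough sketch of that last step. It repeats the StringWriterWithEncoding class so that it compiles on its own; the Utf8XmlDemo class and its method names are placeholders invented for this example.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Same class as in the article, repeated so the sketch is self-contained.
public class StringWriterWithEncoding : StringWriter
{
    private readonly Encoding m_Encoding;

    public StringWriterWithEncoding(StringBuilder sb, Encoding encoding)
        : base(sb)
    {
        m_Encoding = encoding;
    }

    public override Encoding Encoding
    {
        get { return m_Encoding; }
    }
}

// Hypothetical demo class, used only for this illustration.
public static class Utf8XmlDemo
{
    // Builds a small document whose declaration names UTF-8.
    public static string BuildXml()
    {
        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.CloseOutput = true;
        StringWriterWithEncoding target = new StringWriterWithEncoding(sb, Encoding.UTF8);
        using (XmlWriter writer = XmlWriter.Create(target, settings))
        {
            writer.WriteStartElement("Root");
            writer.WriteElementString("Test", "Value");
            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    // Encodes the text as UTF-8 bytes; passing false omits the byte order mark.
    public static byte[] ToUtf8Bytes(string xml)
    {
        return new UTF8Encoding(false).GetBytes(xml);
    }

    public static void Main()
    {
        string xml = BuildXml();
        byte[] bytes = ToUtf8Bytes(xml);
        Console.WriteLine(xml);
        Console.WriteLine(bytes.Length + " bytes when encoded as UTF-8");
    }
}
```

The byte array (or a call such as File.WriteAllText with a UTF8Encoding argument) is what should be persisted or sent, so that the bytes agree with what the declaration claims.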
Isn’t it easier just to use:
writer.WriteProcessingInstruction("xml", "version='1.0' encoding='UTF-8'");