So, you know everything about text, right? – part XIII

In the previous post, I mentioned that I’d dedicate a post to the topic of formatting. And I think the best way to start the discussion is by looking at the ToString instance method. ToString is a public virtual method introduced by the Object class. In practice, this means that it can be called on any instance of any type. By convention, it returns a string which represents the current object, formatted according to the calling thread’s culture. Here’s an example that illustrates the use of this method for printing the value of a double:

Thread.CurrentThread.CurrentCulture = new CultureInfo( "en-US" );
var strEn = 10.5.ToString( );//10.5
Thread.CurrentThread.CurrentCulture = new CultureInfo( "pt-PT" );
var strPt = 10.5.ToString( );//10,5

As you can see from the comments, changing the calling thread’s current culture results in two different strings. This happens because the Double type overrides the virtual ToString method inherited from Object so that it can provide a reasonable description of its “content”. If it didn’t, then the returned string would only reflect the name of the type:

//no override, so ToString returns
//the typename: ConsoleApplication1.Student
class Student{}

Btw, it’s a good idea to override the ToString method whenever you create a new type. Overriding this method is also important when you debug an application because the ToString method will also be called by VS when you put the cursor over an instance of that type or when you add an instance to the watch window (in other words, the ToString method can be overridden to improve your debugging experience in VS).
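Here’s a minimal sketch of what such an override might look like. Notice that the Name and Number properties and the StudentDemo helper are my own assumptions, added only for illustration:

```csharp
using System;

// Hypothetical Student type: Name and Number are illustrative members.
public class Student
{
    public string Name { get; set; }
    public int Number { get; set; }

    // Replaces the default type-name output with something readable.
    public override string ToString()
    {
        return string.Format("{0} (#{1})", Name, Number);
    }
}

public static class StudentDemo
{
    // Builds a sample instance and returns its string representation.
    public static string Describe()
    {
        var student = new Student { Name = "Ana", Number = 42 };
        return student.ToString();
    }
}
```

With this override in place, StudentDemo.Describe() returns the student’s name and number instead of the type name.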

The main problem associated with the “inherited” ToString method is that there’s no way for the caller to customize the culture used internally by the method. Yes, you can change the thread’s culture, but that is often overkill when all you need is a string in a specific culture. And that’s why the framework introduced the IFormattable interface. This interface has a single method, which looks like this:

public interface IFormattable{
    string ToString(string format, IFormatProvider formatProvider);
}

By default, this interface is implemented by most of the base types exposed by the .NET framework. Even enums have it implemented by default. As you can see, the ToString method expects two parameters:

  • The first is a string which defines the way that the object should be formatted.
  • The second is an instance of the IFormatProvider type that is responsible for passing specific culture info to the method.

If the type that implements the IFormattable interface doesn’t support the format string received, then it should throw a FormatException. Several of the base types that implement this interface accept more than one format string. For instance, take a look at the following example:

var date = DateTime.Now;
Console.WriteLine(date.ToString("d", new CultureInfo( "pt-PT" )));//16-05-2011
Console.WriteLine(date.ToString("D", new CultureInfo( "pt-PT" )));//segunda-feira, 16 de Maio de 2011

As you can see, “d” formats the current date in the short date form while “D” uses the long date form. DateTime supports other format strings too: for instance, you can use “U” for getting a string with the current date converted to universal time in the full date and time format. There are also some format strings which can be used with different types of objects. For instance, you can use “G” for getting a string for an enum value or a number (Int32, Decimal, Double, etc.) in the general form:

Console.WriteLine(1.ToString("G", new CultureInfo( "pt-PT" )));//1
Console.WriteLine((10.0).ToString("G", new CultureInfo( "pt-PT" )));//10
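Since the previous snippet only shows numbers, here’s a small sketch of the same idea applied to an enum. I’m using the DayOfWeek enum from the base class library, and the EnumFormatDemo helper is just an illustrative name of mine:

```csharp
using System;

public static class EnumFormatDemo
{
    // "G" returns the general form of an enum value (its name, when one exists).
    public static string General(DayOfWeek day)
    {
        return day.ToString("G");
    }

    // "D" returns the underlying numeric value as a decimal string.
    public static string Numeric(DayOfWeek day)
    {
        return day.ToString("D");
    }
}
```

For DayOfWeek.Monday, General returns "Monday" while Numeric returns "1".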

By default, all objects should serialize themselves in the so-called general form. The general form is just a string which represents the most commonly used format of an instance. As you’ve probably deduced from the previous paragraphs, the general form string should be returned when you pass the “G” format string or null (it’s also a good practice to return the general form string from the override of the ToString method inherited from Object).
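To make these conventions a little more concrete, here’s a rough sketch of a custom type that implements IFormattable. The Point type and its “X” format string are hypothetical and exist only for illustration; the handling of “G”, null and unsupported format strings follows the rules described above:

```csharp
using System;
using System.Globalization;

// Hypothetical type used only for illustration.
public struct Point : IFormattable
{
    public double X;
    public double Y;

    public Point(double x, double y) { X = x; Y = y; }

    // The override inherited from Object returns the general form.
    public override string ToString()
    {
        return ToString("G", null);
    }

    public string ToString(string format, IFormatProvider formatProvider)
    {
        // null (or empty) means "general form".
        if (string.IsNullOrEmpty(format)) format = "G";
        // A null provider falls back to the calling thread's culture.
        if (formatProvider == null) formatProvider = CultureInfo.CurrentCulture;

        switch (format)
        {
            case "G": // general form: both coordinates
                return "(" + X.ToString("G", formatProvider) + " ; "
                           + Y.ToString("G", formatProvider) + ")";
            case "X": // hypothetical custom format: only the X coordinate
                return X.ToString("G", formatProvider);
            default:
                throw new FormatException("Unsupported format string: " + format);
        }
    }
}
```

Notice that the type delegates the culture-sensitive work to Double, so it never needs to inspect the provider itself.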

Notice that the format string is only responsible for influencing the way that data is presented. For instance, if you’ve got an integer that needs to be represented, that integer can be a quantity. But it could also represent a value in currency. And that’s what the format string parameter does: it specifies the type of information returned in the string.

But that’s only half of the story. For instance, in the next example, I’m saying that money should be represented as currency (notice the “C” format string). Since the currency symbol changes from culture to culture, the ToString method can also receive a second parameter with culture-specific info. And that’s what the IFormatProvider parameter does: it can return an object that knows how to format a value according to a specific culture. If you want, you can simply pass null for this parameter. By doing that, you’re saying that all formatting should be done according to the calling thread’s culture (since this is a common scenario, it’s usual for a type to expose an overload of the ToString method which only receives a format string).

var money = 10;
Console.OutputEncoding = Encoding.GetEncoding( 1250 );//change output encoding
Console.WriteLine(money.ToString("C", new CultureInfo( "pt-PT" )));//10,00 €
Console.WriteLine(money.ToString("C", new CultureInfo( "en-US" )));//$10.00

As a side note, I had to change the default encoding used in the console output so that I could get a euro symbol printed…
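If you want to check the null provider behavior mentioned before, here’s a small sketch that compares the three ways of asking for the calling thread’s culture. The NullProviderDemo helper is an illustrative name of mine and it restores the original culture before returning:

```csharp
using System;
using System.Globalization;
using System.Threading;

public static class NullProviderDemo
{
    // Formats the same value three ways under the given culture
    // and reports whether the three strings match.
    public static bool AllMatch(string cultureName)
    {
        var previous = Thread.CurrentThread.CurrentCulture;
        try
        {
            Thread.CurrentThread.CurrentCulture = new CultureInfo(cultureName);
            var money = 10;
            var withNull = money.ToString("C", null);      // null provider
            var formatOnly = money.ToString("C");          // format-only overload
            var explicitCulture = money.ToString("C", CultureInfo.CurrentCulture);
            return withNull == formatOnly && formatOnly == explicitCulture;
        }
        finally
        {
            // Leave the thread as we found it.
            Thread.CurrentThread.CurrentCulture = previous;
        }
    }
}
```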

The CultureInfo type is one of the few types that implement the IFormatProvider interface:

public interface IFormatProvider{
    object GetFormat(Type formatType);
}

You can build an instance of the CultureInfo type for any of the existing major cultures. The easiest way to do that is to pass a string which identifies that culture. It’s that easy! CultureInfo’s implementation of IFormatProvider is rather simple: it will only respond to the NumberFormatInfo or DateTimeFormatInfo types (and this is because currently the framework will only format numbers and dates).

Each of these types (NumberFormatInfo and DateTimeFormatInfo) exposes several interesting properties which are used by ToString for formatting the values (ex.: NumberFormatInfo exposes a CurrencySymbol property which identifies the symbol used for the currency associated with the current culture). As you’re probably expecting, the values returned by these properties depend on the culture specified during the instantiation of the CultureInfo object: internally, the constructor relies on an internal culture table which has all the required info for correctly formatting numbers and dates for most of the existing cultures.
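Here’s a quick sketch of what calling GetFormat by hand looks like. You’ll rarely need to do this yourself (ToString does it for you), and the FormatProviderDemo helper is just an illustrative name of mine:

```csharp
using System;
using System.Globalization;

public static class FormatProviderDemo
{
    // Asks the culture for its NumberFormatInfo and reads the currency symbol.
    public static string CurrencySymbolFor(string cultureName)
    {
        IFormatProvider provider = new CultureInfo(cultureName);
        var numberInfo = (NumberFormatInfo)provider.GetFormat(typeof(NumberFormatInfo));
        return numberInfo.CurrencySymbol;
    }

    // CultureInfo returns null for format types it doesn't know about.
    public static bool KnowsAbout(Type formatType)
    {
        return CultureInfo.InvariantCulture.GetFormat(formatType) != null;
    }
}
```

CurrencySymbolFor("en-US") returns "$", while KnowsAbout(typeof(string)) returns false because CultureInfo only answers for the two format info types.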

And I guess this is all for now. In the next post, we’ll keep looking at formatting and see how we can influence the way values get formatted. Stay tuned for more.


~ by Luis Abreu on May 16, 2011.
