Print unicode windows console

Output Unicode to console Using C++, in Windows

I’m still learning C++, so bear with me and my sloppy code. The compiler I use is Dev C++. I want to be able to output Unicode characters to the console using cout. Whenever I try things like:

It outputs strange characters to the console, like µA■Gg. Why does it do that, and how can I get it to display ĐĄßĞĝ? Or is this not possible on Windows?

5 Answers

What about std::wcout?

This is the standard wide-characters output stream.

Still, as Adrian pointed out, this doesn’t address the fact that cmd, by default, doesn’t handle Unicode output. This can be addressed by manually configuring the console, as described in Adrian’s answer:

  • Starting cmd with the /u argument;
  • Calling chcp 65001 to change the output format;
  • And setting a unicode font in the console (like Lucida Console Unicode).

You can also try _setmode(_fileno(stdout), _O_U16TEXT);, which requires fcntl.h and io.h (as described in this answer, and documented in this blog post).
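A minimal sketch of the _setmode approach, assuming an MSVC-style C runtime on Windows. The helper names enable_wide_stdout and print_sample are made up for illustration, and the sample text is written with \u escapes so the result does not depend on how the source file is saved. On other platforms the helper does nothing.

```cpp
#include <stdio.h>
#include <iostream>
#ifdef _WIN32
#include <fcntl.h>  // _O_U16TEXT
#include <io.h>     // _setmode
#endif

// Hypothetical helper: puts stdout into UTF-16 text mode on Windows.
// Returns true only if the mode was changed; elsewhere it does nothing.
bool enable_wide_stdout() {
#ifdef _WIN32
    return _setmode(_fileno(stdout), _O_U16TEXT) != -1;
#else
    return false;
#endif
}

// Hypothetical helper: prints the sample text from the question via wcout,
// but only when the wide mode could actually be enabled.
void print_sample() {
    if (enable_wide_stdout()) {
        std::wcout << L"\u0110\u0104\u00DF\u011E\u011D" << std::endl;
    }
}
```

Call print_sample() once from main() before anything else is written to stdout. After the mode switch, only wide output (wcout, wprintf) should be used on that stream.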

I’m not sure Windows XP will fully support what you need. There are three things you have to do to enable Unicode with a command console:

  1. Start the command window with cmd /u. The /u says your programs will output Unicode.
  2. Use chcp 65001 to indicate you want to use UTF-8 instead of one of the code pages.
  3. Select a font with more glyph coverage. The command windows in newer versions of Windows offer Lucida Console Unicode. My XP box has a subset of that called Lucida Console. It doesn’t have a very extensive repertoire, but it should be sufficient if you’re just trying to display some accented characters.

You can use the open-source library to portably print Unicode text, including on Windows, for example:

This requires compiling with the /utf-8 compiler option in MSVC.

I don’t recommend using wcout because it is non-portable, for example:

will print the ĐĄßĞĝ part incorrectly on macOS or Linux (https://godbolt.org/z/z81jbb):

and doesn’t even work on Windows without changing the code page:

Disclaimer: I’m the author of .

Properly print utf8 characters in windows console

This is the way I try to do it:

The effect is that only US-ASCII characters are displayed. No errors are shown. The source file is encoded in UTF-8.

So, what am I doing wrong here?

  • this also doesn’t work. Effect is just the same. My font is of course Lucida Console.

ok, something begins to work, but the output is: ańbcdefghijklmno÷pqrs▀tuŘvwxyz .

7 Answers

By default, the wide print functions on Windows do not handle characters outside the ASCII range.


There are a few ways to get Unicode data to the Windows console.

  1. Use the console API directly: WriteConsoleW. You’ll have to ensure you’re actually writing to a console and use other means when the output goes to something else.

  2. Set the mode of the standard output file descriptors to one of the ‘Unicode’ modes, _O_U16TEXT or _O_U8TEXT. This causes the wide character output functions to correctly output Unicode data to the Windows console. If they’re used on file descriptors that don’t represent a console, then they cause the output stream of bytes to be UTF-16 and UTF-8 respectively. N.B. after setting these modes, the non-wide character functions on the corresponding stream are unusable and result in a crash. You must use only the wide character functions.

  3. UTF-8 text can be printed directly to the console by setting the console output codepage to CP_UTF8, if you use the right functions. Most of the higher level functions such as basic_ostream<char>::operator<< don’t work this way, but you can either use lower level functions or implement your own ostream that works around the problem the standard functions have.

The problem with the third method is this:

Unlike most operating systems, the console on Windows is not simply another file that accepts a stream of bytes. It’s a special device created and owned by the program and accessed via its own unique WIN32 API. The issue is that when the console is written to, the API sees exactly the extent of the data passed in that use of its API, and the conversion from narrow characters to wide characters occurs without considering that the data may be incomplete. When a multibyte character is passed using more than one call to the console API, each separately passed piece is seen as an illegal encoding, and is treated as such.
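One way to work around the split-character problem, sketched here under assumptions rather than taken from the answer: before each write, pass on only the bytes that end on a complete UTF-8 sequence and hold the rest back for the next write. The function name complete_utf8_prefix is hypothetical, and it only inspects the tail of the buffer; it does not validate the whole string.

```cpp
#include <cstddef>
#include <string>

// Returns how many leading bytes of `buf` can be written now without cutting
// a multibyte UTF-8 character in half. Any bytes past that point are the start
// of a character whose remaining bytes have not arrived yet.
std::size_t complete_utf8_prefix(const std::string& buf) {
    const std::size_t n = buf.size();
    std::size_t i = n;
    std::size_t back = 0;
    // Step back over at most three trailing continuation bytes (10xxxxxx).
    while (i > 0 && back < 3 &&
           (static_cast<unsigned char>(buf[i - 1]) & 0xC0) == 0x80) {
        --i;
        ++back;
    }
    if (i == 0) return n;  // no lead byte in sight: pass the data through as-is

    const unsigned char lead = static_cast<unsigned char>(buf[i - 1]);
    std::size_t need = 0;  // total length announced by the lead byte
    if ((lead & 0x80) == 0x00)      need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else return n;  // malformed tail: let the consumer deal with it

    const std::size_t have = n - (i - 1);  // bytes present for the last character
    return have < need ? i - 1 : n;
}
```

A custom stream buffer could call this in its flush routine, write the first complete_utf8_prefix(buf) bytes, and keep the remainder at the front of the buffer until more data arrives.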

It ought to be easy enough to work around this, but the CRT team at Microsoft views it as not their problem whereas whatever team works on the console probably doesn’t care.

How do I print UTF-8 from c++ console application on Windows

For a C++ console application compiled with Visual Studio 2008 on English Windows (XP, Vista or 7), is it possible to print out to the console and correctly display UTF-8 encoded Japanese using cout or wcout?

8 Answers

This should work:

I don’t know if it affects anything, but the source file is saved as Unicode (UTF-8 with signature), Codepage 65001, under FILE -> Advanced Save Options.

Project -> Properties -> Configuration Properties -> General -> Character Set is set to Use Unicode Character Set.

Some say you need to change console font to Lucida Console, but on my side it is displayed with both Consolas and Lucida Console.

The Windows console uses the OEM code page by default to display output.

To change the code page to UTF-8, enter chcp 65001 in the console, or try to change the code page programmatically with SetConsoleOutputCP.


Note that you probably have to change the font of the console to one that has glyphs in the unicode range.
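The programmatic route could look roughly like the sketch below. GetConsoleOutputCP, SetConsoleOutputCP and CP_UTF8 are real Win32 names; the guard class Utf8ConsoleOutput is a made-up wrapper that restores the previous code page when it goes out of scope, which is a design choice of this example and not something the answer prescribes. On non-Windows builds it compiles to a no-op.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical RAII guard: switches the console output code page to UTF-8
// (65001) and restores the previous code page on destruction.
class Utf8ConsoleOutput {
public:
    Utf8ConsoleOutput() {
#ifdef _WIN32
        previous_ = GetConsoleOutputCP();
        changed_ = SetConsoleOutputCP(CP_UTF8) != 0;
#endif
    }
    ~Utf8ConsoleOutput() {
#ifdef _WIN32
        if (changed_) SetConsoleOutputCP(previous_);
#endif
    }
    Utf8ConsoleOutput(const Utf8ConsoleOutput&) = delete;
    Utf8ConsoleOutput& operator=(const Utf8ConsoleOutput&) = delete;

    // True only if the code page was actually switched.
    bool changed() const { return changed_; }

private:
    unsigned int previous_ = 0;
    bool changed_ = false;
};
```

Creating one Utf8ConsoleOutput object at the top of main() has the same effect as running chcp 65001 for the lifetime of the program. It does not change the console font.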

Here’s an article from MVP Michael Kaplan on how to correctly output UTF-16 through the console. You could convert your UTF-8 to UTF-16 and output that.
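The UTF-8 to UTF-16 conversion step can be sketched by hand as below; on Windows the MultiByteToWideChar API is the usual alternative. The function name utf8_to_utf16 is hypothetical, and this sketch makes simplifying assumptions: malformed or truncated bytes become U+FFFD, while overlong forms and encoded surrogates are not rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Decodes UTF-8 bytes into UTF-16 code units.
// Malformed or truncated sequences are replaced with U+FFFD, one per bad byte.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (b0 < 0x80)                { len = 1; cp = b0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }

        // Check that every expected continuation byte is present and valid.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok || cp > 0x10FFFF) {
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;  // skip one byte and resynchronise
            continue;
        }
        if (cp >= 0x10000) {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
        i += len;
    }
    return out;
}
```

On Windows, wchar_t is a 16-bit type, so the resulting code units can be copied into a std::wstring and handed to a wide output function.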

How do I print Unicode to the output console in C with Visual Studio?

As the question says, what do I have to do in order to print Unicode characters to the output console? And what settings do I have to use? Right now I have this code:

and it prints: Text is the ?.

I’ve tried changing the output console’s font to MS Mincho, Lucida Console and a bunch of others, but they still don’t display the Japanese character.

So, what do I have to do?

3 Answers

This is code that works for me (VS2017) — project with Unicode enabled

This is console

After copying it to the Notepad++ I see the proper string

the 来. Testing unicode — English — Ελληνικά — Español.

OS — Windows 7 English, Console font — Lucida Console

Edits based on comments

I tried to fix the above code to work with VS2019 on Windows 10, and the best I could come up with is this

When I run it as is, I see

When it is run with the console set to the Lucida Console font and UTF-8 encoding, I see

As for the 来 character being shown as an empty rectangle, I suppose that is a limitation of the font, which does not contain all the Unicode glyphs

When text is copied from the last console to Notepad++ all characters are shown correctly

A question mark usually means Windows was unable to convert the character to the destination codepage. In the console a hollow square means the Unicode character was received correctly but it could not be displayed because the console font does not support it or it is a complex script requiring Uniscribe which the console does not handle. You can copy the square and paste it in Notepad/Wordpad and it should display correctly.

The WriteConsoleW Windows function can display Unicode characters and works all the way back to Windows NT. It can only write to the console so you must use WriteFile instead when the output is redirected. GetConsoleMode fails on redirected handles.
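A rough sketch of that console-or-redirect split, using only the Win32 calls named above plus WideCharToMultiByte. The function name write_console_text is hypothetical, and writing UTF-8 bytes in the redirected case is an assumption of this example; the answer does not say which encoding to use there. On non-Windows builds the function is a stub that returns false.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes wide text to stdout.
// Real console  -> WriteConsoleW (UTF-16 goes straight to the console).
// Redirected    -> convert to UTF-8 and use WriteFile (assumed encoding).
bool write_console_text(const std::wstring& text) {
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    if (out == INVALID_HANDLE_VALUE || out == nullptr) return false;
    if (text.empty()) return true;

    DWORD mode = 0;
    DWORD written = 0;
    if (GetConsoleMode(out, &mode)) {
        // GetConsoleMode succeeded, so the handle is an actual console.
        return WriteConsoleW(out, text.data(),
                             static_cast<DWORD>(text.size()),
                             &written, nullptr) != 0;
    }

    // GetConsoleMode failed: output is redirected to a file or pipe.
    const int bytes = WideCharToMultiByte(CP_UTF8, 0, text.data(),
                                          static_cast<int>(text.size()),
                                          nullptr, 0, nullptr, nullptr);
    if (bytes <= 0) return false;
    std::string utf8(static_cast<std::size_t>(bytes), '\0');
    WideCharToMultiByte(CP_UTF8, 0, text.data(),
                        static_cast<int>(text.size()),
                        &utf8[0], bytes, nullptr, nullptr);
    return WriteFile(out, utf8.data(), static_cast<DWORD>(bytes),
                     &written, nullptr) != 0;
#else
    (void)text;  // not applicable outside Windows
    return false;
#endif
}
```

Because this bypasses the C runtime’s buffering, any pending stdio or iostream output should be flushed before calling it, otherwise the text may appear out of order.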

You don’t say which VS version you are using and things have changed over the years but Unicode output has been decent since VS2005 if you call _setmode(_fileno(stdout), _O_U16TEXT); early in main():

The character ‘来’ may not be in your system’s code page. You need to save the characters as UTF-8.

How to print a unicode string in python in Windows console [duplicate]

I’m working on a Python application that can print text in multiple languages to the console on multiple platforms. The program works well on all UNIX platforms, but on Windows there are errors when printing Unicode strings on the command line.


There’s already a relevant thread regarding this: ( Windows cmd encoding change causes Python crash ) but I couldn’t find my specific answer there.

For example, for the following Asian text, in Linux, I can run:

But in windows I get:

I succeeded in displaying the correct text in a message box by doing something like this:

But, I want to be able to do it in windows console, and preferably — without requiring too much configuration outside my python code (because my application will be distributed to many hosts).

Is this possible?

Edit: If it’s not possible — I would be happy to accept some other suggestions of writing a console application in windows that displays unicode, e.g. a python implementation of an alternative windows console

5 Answers

There’s a WriteConsoleW solution that provides a unicode argv and stdout (print) but not stdin: Windows cmd encoding change causes Python crash

The only thing I modified is sys.argv to keep it unicode. The original version utf-8 encoded it for some reason.

Use a different console program. The following works in mintty, the default terminal emulator in Cygwin.

There are other console alternatives available for Windows but I have not assessed their Unicode support.

This simply comes from the fact that the cmd and PowerShell consoles do not support variable-width fonts, and the fixed-width fonts do not include Chinese script. Cygwin is in the same situation.
PuTTY is more advanced: it supports variable-width fonts with Cyrillic, Vietnamese and Arabic scripts, but no Chinese so far.

Can you try using the program iconv on Windows, and piping your Python output through it? It’d go something like this:

You might have to do a little work to get iconv on Windows—it’s part of Cygwin but you may be able to build it separately somehow if needed.

The question is answered in the PrintFails article.

By default, the console in Microsoft Windows only displays 256 characters (cp437, or Code page 437, the original IBM PC 1981 extended ASCII character set).

For Russia this means CP866; other countries use their own code pages too. So to read Python output correctly in the Windows console, Windows must be configured with a native code page that can display the printed symbols.

I suggest you always print Unicode text without encoding it yourself, to ensure maximum compatibility across platforms.

If you try to print a character the console cannot represent, you will get a UnicodeEncodeError or see distorted text.

In some cases, if Python fails to determine the output encoding correctly, you might try setting the PYTHONIOENCODING environment variable. Note, however, that this probably won’t work for your example, as your console is unable to present Asian text in its current configuration.

To reconfigure the console, use Control Panel -> Language and Regional settings -> Advanced (tab) -> Non Unicode programs language (section). Note that I translated the menu names from Russian.

See also answers for the very similar question.
