Setting UTF-8 on Linux

Contents

How to Change or Set System Locales in Linux
How to View System Locale in Linux
How to Set System Locale in Linux
Localizing Ubuntu Server 18.04 LTS
Current language settings
Listing the available locales
Adding a new locale
Detailed information about locales
Default locale
Quick localization
Removing unneeded locales
Translations for system programs
Localization for the current session
Font and keyboard configuration files
Configuring the font and keyboard
How to Convert Files to UTF-8 Encoding in Linux
Convert Files from UTF-8 to ASCII Encoding
Convert Multiple Files to UTF-8 Encoding
How to set up a clean UTF-8 environment in Linux
Choosing an encoding
Locales: installing
Locales: generation
Locales: configuration
A Warning about Non-Interactive Processes
Locales: check
Setting up the terminal emulator
Testing the terminal emulator
Screen
Irssi

How to Change or Set System Locales in Linux

A locale is a set of environment variables that defines the language, country, and character encoding settings (or any other special variant preferences) for your applications and shell session on a Linux system. These environment variables are used by system libraries and locale-aware applications on the system.

Locale affects things such as the time/date format, the first day of the week, numbers, currency and many other values formatted in accordance with the language or region/country you set on a Linux system.

In this article, we will show how to view the current system locale and how to set the system locale in Linux.

How to View System Locale in Linux

To view information about the currently installed locale, use the locale or localectl utility, for example locale or localectl status.
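The two utilities named above can be tried as in the following minimal sketch. It assumes a glibc-based system; localectl exists only on systemd-based distributions, so that call is guarded, and the fallback message is our own wording.

```shell
# Print the locale settings of the current shell session.
locale

# On systemd-based systems, localectl reports the system-wide locale.
# The guard keeps the snippet usable where localectl is absent.
if command -v localectl >/dev/null 2>&1; then
    localectl --no-pager status || echo "localectl could not query the system locale"
fi
```

The first command describes the current session, while localectl describes what the system hands to new sessions, so the two can differ.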

You can view more information about a single variable, for example LC_TIME, which stores the time and date format, with locale -k LC_TIME.
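As a small illustration, assuming the glibc locale utility, the -k option prints every keyword of a category together with its value:

```shell
# -k prints each keyword of the LC_TIME category with its value,
# for example the date format (d_fmt) and the weekday names (day).
locale -k LC_TIME
```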

To display a list of all available locales, run locale -a.
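A minimal sketch of that listing, again assuming the glibc locale utility:

```shell
# List every locale that is currently available on the system.
# The built-in C and POSIX locales are part of this list.
locale -a
```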

How to Set System Locale in Linux

If you want to change or set the system locale, use the update-locale program. The LANG variable allows you to set the locale for the entire system.

For example, sudo update-locale LANG=en_IN.UTF-8 LANGUAGE sets LANG to en_IN.UTF-8 and removes definitions for LANGUAGE.

To configure a specific locale parameter, set the appropriate variable. For instance, sudo update-locale LC_TIME=en_IN.UTF-8 changes only the time and date format.

You can find global locale settings in the following files:

/etc/default/locale – on Ubuntu/Debian
/etc/locale.conf – on CentOS/RHEL

These files can also be edited manually using any of your favorite command line editors such as Vim or Nano, to configure your system locale.
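Before editing, it can help to look at what the files currently contain. The sketch below is only an illustration: show_locale_files is a hypothetical helper name, and it simply prints whichever of the two files exists on your distribution.

```shell
# Hypothetical helper: print each readable file given as an argument.
show_locale_files() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '== %s\n' "$f"
            cat "$f"
        fi
    done
}

# Debian/Ubuntu use the first path, CentOS/RHEL the second.
show_locale_files /etc/default/locale /etc/locale.conf
```

Each file holds plain VARIABLE=value lines, which is why a text editor is enough to change them.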

To set the locale for a single user, open the ~/.bash_profile file and add lines that set and export the LANG variable.
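The lines for ~/.bash_profile might look like the sketch below. The value en_IN.utf8 is an assumption chosen to match the en_IN locale used earlier; use any locale that locale -a lists on your system.

```shell
# Lines for ~/.bash_profile: set the locale for this user's login shells.
# en_IN.utf8 is an example value, not a requirement.
LANG="en_IN.utf8"
export LANG
```

The setting takes effect at the next login, or immediately after sourcing the file.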

For more information, see the locale, update-locale and localectl man pages.

That’s all! In this short article, we have explained how to view and set the system locale in Linux.


Localizing Ubuntu Server 18.04 LTS

A locale in Linux determines which language and which character set (encoding) the user sees in the terminal. Let's see how to check the current language and encoding settings, how to list all available locales, and how to change the language and encoding for the current session or permanently.

For those who do not want to read the whole article: in most cases, localizing the console only requires reconfiguring the locales package with sudo dpkg-reconfigure locales.

The selected locales are generated first (choose them on the first screen), and then the default locale is set (choose it on the second screen).

Current language settings

To see the current locale environment, run locale.

Listing the available locales

To list all installed languages and encodings, run locale -a.

Only the system locale C.UTF-8 is present, which is always available. We need to add two more locales: en_US.UTF-8 and ru_RU.UTF-8.

Adding a new locale

The list of all supported locales (those available for installation) is in the file /usr/share/i18n/SUPPORTED.

Generate the locales we need, en_US.UTF-8 and ru_RU.UTF-8, with sudo locale-gen en_US.UTF-8 ru_RU.UTF-8.

The second way to install locales is to uncomment the relevant lines in the file /etc/locale.gen and then simply run locale-gen without naming any locales.
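The uncommenting step can be sketched with sed. To keep the example safe to try, it works on a throwaway sample that imitates the layout of /etc/locale.gen; the sample contents and the use of GNU sed (-i, -E) are assumptions.

```shell
# Work on a throwaway sample so nothing under /etc is touched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# de_DE.UTF-8 UTF-8
# en_US.UTF-8 UTF-8
# ru_RU.UTF-8 UTF-8
EOF

# Remove the leading "# " from the two wanted locales only.
sed -i -E 's/^# *((en_US|ru_RU)\.UTF-8 UTF-8)$/\1/' "$sample"
cat "$sample"
```

On a real system the same sed expression would be applied to /etc/locale.gen as root, followed by locale-gen.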

Detailed information about locales

For more detail about the locales installed on the system, run locale -a -v.

Some locales are stored in the archive /usr/lib/locale/locale-archive, and some in directories inside /usr/lib/locale/.

Default locale

Now that we have the locales we need, all that remains is to set the default locale with update-locale, assigning the chosen locale to LANG.

This command writes the corresponding LANG line to the file /etc/default/locale.

After that, log out and log back in, then check the locale environment again with locale.

Now everything is correct, so we write this information to the file /etc/default/locale.

Quick localization

So far we have done everything by hand, but there is a shortcut: simply reconfigure the locales package with sudo dpkg-reconfigure locales. The selected locales are generated first (choose them on the first screen), and then the default locale is set (choose it on the second screen).

Removing unneeded locales

Once a locale has been installed (generated), it is placed in the archive /usr/lib/locale/locale-archive. The archive is a memory-mapped file that contains all of the system's locales and is used by every localized program. To list the locales in the archive, run localedef --list-archive.

A given locale can be removed from the archive with localedef --delete-from-archive, for example sudo localedef --delete-from-archive ru_UA.utf8.

Note the spelling of the locale name: ru_UA.utf8, not ru_UA.UTF-8. If the name is given incorrectly, the locale is not removed from the archive.

If locale-gen was invoked with the --no-archive option, delete the corresponding directory in /usr/lib/locale instead.
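Since the archive lists names in the ru_UA.utf8 spelling, a small helper can convert the more familiar ru_UA.UTF-8 form before calling localedef. This is only a sketch: archive_name is a hypothetical function, and it assumes the name contains a codeset after a dot and that lowercasing the codeset and dropping hyphens is the whole conversion.

```shell
# Hypothetical helper: turn "ru_UA.UTF-8" into the "ru_UA.utf8" spelling.
archive_name() {
    base=${1%%.*}       # part before the first dot, e.g. ru_UA
    codeset=${1#*.}     # part after the first dot, e.g. UTF-8
    # Lowercase the codeset and drop hyphens.
    codeset=$(printf '%s' "$codeset" | tr '[:upper:]' '[:lower:]' | tr -d -- '-')
    printf '%s.%s\n' "$base" "$codeset"
}

archive_name ru_UA.UTF-8
```

Comparing the result with the output of localedef --list-archive before deleting anything is a sensible extra check.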

Translations for system programs

To receive messages from the core system programs in Russian, install the translations for them.

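Localization for the current session

It is enough to set the LANG environment variable temporarily in the current terminal session, for example with export LANG=ru_RU.UTF-8.

Alternatively, pass LANG to one specific program by prefixing the command with the assignment, that is, LANG=ru_RU.UTF-8 followed by the program name.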
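Both variants can be sketched as follows. The example uses C.UTF-8 and C rather than ru_RU.UTF-8 so that it also runs where the Russian locale has not been generated; the date call relies on GNU date and is only there to show the effect. Note that LC_ALL and LC_TIME, when set, take precedence over LANG, which is why they are cleared first.

```shell
# LC_ALL and LC_TIME override LANG, so clear them for this demonstration.
unset LC_ALL LC_TIME

# Session-wide: every command started from this shell inherits the value.
export LANG=C.UTF-8

# Per-command: only this single invocation of date sees LANG=C.
day=$(LANG=C date -d '2024-01-01' +%A)
echo "$day"

# The shell's own LANG is unchanged by the per-command assignment.
echo "$LANG"
```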

Font and keyboard configuration files

The settings are stored in the configuration files /etc/default/console-setup and /etc/default/keyboard.

These are system-wide settings; a user can create personal ones in files of their own.

Configuring the font and keyboard

To generate the configuration files /etc/default/console-setup and /etc/default/keyboard, you can run sudo dpkg-reconfigure console-setup and sudo dpkg-reconfigure keyboard-configuration.

Once the configuration files have been generated, run setupcon without arguments or reboot the system.


How to Convert Files to UTF-8 Encoding in Linux

In this guide, we will describe what character encoding is and cover a few examples of converting files from one character encoding to another using a command line tool. Finally, we will look at how to convert several files from any character set (charset) to UTF-8 encoding in Linux.

As you probably know, a computer does not store letters, numbers or anything else that humans perceive; it stores only bits. A bit has only two possible values: 0 or 1, true or false, yes or no. Everything else, such as letters, numbers and images, must be represented in bits for a computer to process it.

In simple terms, character encoding is a way of telling a computer how to interpret raw zeroes and ones as actual characters, where each character is represented by a number. When we type text in a file, the words and sentences we form are made up of different characters, and characters are organized into a charset.

There are various encoding schemes, such as ASCII, ANSI and Unicode. In ASCII, for example, the capital letter A is stored as the number 65.

In Linux, the iconv command line tool is used to convert text from one form of encoding to another.

You can check the encoding of a file using the file command with the -i or --mime flag, which prints the MIME type string, for example file -i input.file.


The general form of an iconv call is iconv -f from-encoding -t to-encoding inputfile -o outputfile.

Here -f or --from-code names the input encoding and -t or --to-code names the output encoding.
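A minimal sketch of such a call, assuming the glibc iconv program. The sample file is generated on the fly, and the file names come from mktemp, so nothing here refers to files from the article.

```shell
# Build a one-line ISO-8859-1 sample: "caf" plus byte 0xE9 (e-acute).
src=$(mktemp)
dst=$(mktemp)
printf 'caf\351\n' > "$src"

# -f names the input encoding, -t the output encoding, -o the output file.
iconv -f ISO-8859-1 -t UTF-8 -o "$dst" "$src"

# Show the resulting bytes: in UTF-8 the e-acute takes two bytes, c3 a9.
od -An -tx1 "$dst"
```

Dumping the bytes with od avoids depending on how the terminal itself renders the character.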

To list all known coded character sets, run iconv -l.
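For instance, again assuming the glibc iconv program:

```shell
# Print every encoding name (and alias) that this iconv build recognizes.
iconv -l
```

The list is long, so piping it through grep with part of an encoding name is a common way to check whether a particular charset is supported.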


Convert Files from UTF-8 to ASCII Encoding

Next, we will learn how to convert from one encoding scheme to another. For example, iconv -f ISO-8859-1 -t UTF-8//TRANSLIT input.file -o out.file converts a file from ISO-8859-1 to UTF-8 encoding.

Consider a file named input.file that contains non-ASCII characters.

Let us start by checking the encoding of the characters in the file and then viewing the file contents. Next, we convert all the characters to ASCII encoding with iconv -f UTF-8 -t ASCII//TRANSLIT input.file -o out.file.

After running the iconv command, we check the contents of the output file and the new encoding of its characters.


Note: If the string //IGNORE is added to the target encoding, characters that can’t be converted are discarded and an error is displayed after conversion.

If the string //TRANSLIT is added to the target encoding (for example ASCII//TRANSLIT), characters are transliterated when needed and possible. This means that when a character can’t be represented in the target character set, it can be approximated by one or more similar-looking characters.

Consequently, any character that can’t be transliterated and is not in target character set is replaced with a question mark (?) in the output.
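The transliteration behavior can be observed with a one-line input. This is a sketch that assumes the glibc iconv program; whether the accented letter comes out as a plain letter or as a question mark may depend on the active locale, so no particular output is promised here.

```shell
# UTF-8 input: "caf" followed by e-acute (bytes c3 a9).
ascii=$(printf 'caf\303\251\n' | iconv -f UTF-8 -t ASCII//TRANSLIT)

# The accented letter is replaced by an ASCII approximation,
# or by "?" when no approximation is available.
echo "$ascii"
```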

Convert Multiple Files to UTF-8 Encoding

Coming back to our main topic, to convert multiple or all files in a directory to UTF-8 encoding, you can write a small shell script called encoding.sh that loops over the files and runs iconv on each one.

Save the file, then make the script executable. Run it from the directory where your files (*.txt) are located.

Important: You can also use this script for general conversion of multiple files from one encoding to another; simply change the values of the FROM_ENCODING and TO_ENCODING variables, and remember to adjust the .utf8.converted suffix of the output file name.
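One possible shape for encoding.sh is sketched below. The variable names FROM_ENCODING and TO_ENCODING and the .utf8.converted suffix follow the text above; the ISO-8859-1 input encoding, the convert_dir function and the optional directory argument are assumptions added for illustration.

```shell
#!/bin/bash
# Sketch of encoding.sh: convert every *.txt file in a directory to UTF-8.
FROM_ENCODING="ISO-8859-1"   # assumed input encoding; adjust to your files
TO_ENCODING="UTF-8"

# Hypothetical helper: convert all *.txt files in the given directory.
convert_dir() {
    dir=${1:-.}
    for file in "$dir"/*.txt; do
        [ -e "$file" ] || continue   # the glob matched nothing
        iconv -f "$FROM_ENCODING" -t "$TO_ENCODING" \
            -o "${file%.txt}.utf8.converted" "$file"
    done
}

# Default to the current directory, as described above.
convert_dir "${1:-.}"
```

The original files are left untouched; each converted copy is written next to its source, so the result can be inspected before anything is replaced.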

For more information, look through the iconv man page.

To sum up, understanding character encoding and how to convert from one encoding scheme to another is useful knowledge for every computer user, and especially for programmers who deal with text.


How to set up a clean UTF-8 environment in Linux

Many people have problems with handling non-ASCII characters in their programs, or even getting their IRC client or text editor to display them correctly.

To work efficiently with text data, your environment has to be set up properly: it is much easier to debug an encoding problem if you can trust your terminal to display valid UTF-8 correctly.

I will show you how to set up such a clean environment on Debian Lenny, but most things work independently of the distribution, and parts of it even work on other Unix-flavored operating systems like MacOS X.

Choosing an encoding

In the end the used character encoding doesn’t matter much, as long as it’s a Unicode encoding, i.e. one which can be used to encode all Unicode characters.

UTF-8 is usually a good choice because it efficiently encodes ASCII data too, and the character data I typically deal with still has a high percentage of ASCII chars. It is also used in many places, and thus one can often avoid conversions.

Whatever you do, choose one encoding and stick to it for your whole system. On Linux that means text files, file names, locales and all text-based applications (mutt, slrn, vim, irssi, ...).

For the rest of this article I assume UTF-8, but it should work very similarly for other character encodings.

Locales: installing

Check that you have the locales package installed. On Debian you can do that with dpkg -l locales.

The last line of the output is the important one: if it starts with ii, the package is installed, and everything is fine. If not, install it as root with your package manager, for example apt-get install locales.

If you get a dialog asking for details, read on to the next section.

Locales: generation

Make sure that a UTF-8 locale is generated on your system. As root, run dpkg-reconfigure locales.

You’ll see a long list of locales, and you can navigate that list with the up/down arrow keys. Pressing the space bar toggles the locale under the cursor. Make sure to select at least one UTF-8 locale; en_US.UTF-8, for example, is usually supported very well. (The first part of the locale name stands for the language, the second for the country or dialect, and the third for the character encoding.)

In the next step you have the option to make one of the previously selected locales the default. Picking a UTF-8 locale as the default is usually a good idea, though it might change how some programs work, and thus shouldn’t be done on servers hosting sensitive applications.

Locales: configuration

If you chose a default locale in the previous step, log out completely and then log in again. In any case you can configure your per-user environment with environment variables.

The following variables can affect programs: LANG, LANGUAGE, LC_CTYPE, LC_NUMERIC, LC_TIME, LC_COLLATE, LC_MONETARY, LC_MESSAGES, LC_PAPER, LC_NAME, LC_ADDRESS, LC_TELEPHONE, LC_MEASUREMENT, LC_IDENTIFICATION.

Most of the time it works to set all of these to the same value. Instead of setting all LC_ variables separately, you can set LC_ALL. If you use bash as your shell, you can export these variables from your ~/.bashrc.
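The ~/.bashrc lines might look like the following sketch. The locale en_US.UTF-8 is an assumption that matches the example used in the generation step; substitute whichever UTF-8 locale you generated.

```shell
# Lines for ~/.bashrc: use one UTF-8 locale for everything.
# en_US.UTF-8 is an example; it must be one of the generated locales.
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
```

If the named locale has not been generated, bash prints a warning when these lines run, which is a useful hint to revisit the previous section.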

To make these changes active in the current shell, source the file with source ~/.bashrc.

All newly started interactive bash processes will respect these settings.

You must restart long-running programs for these changes to take effect.

A Warning about Non-Interactive Processes

There are certain processes that don’t get those environment variables, typically because they are started by some sort of daemon in the background.

Those include processes started from cron, at, init scripts, or indirectly spawned from init scripts, like through a web server.

You might need to take additional steps to ensure that those programs get the proper environment variables.
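One such step, sketched here under the assumption that the job is launched through a command line you control, is to pass the variables explicitly with env. The sh -c call is only a stand-in for the real program, and en_US.UTF-8 is again an example value.

```shell
# Pass the locale explicitly instead of relying on a login shell.
# "sh -c ..." stands in for the real job; replace it with your command.
seen=$(env LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 sh -c 'printf "%s" "$LC_ALL"')
echo "child process sees LC_ALL=$seen"
```

The same prefix can be placed in front of a command in a crontab entry or inside an init script.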

Locales: check

Run the locale program. Every variable in its output should show the UTF-8 locale you configured.

If not, you’ve made a mistake in one of the previous steps, and need to recheck what you did.
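A quick scripted version of this check is sketched below. It relies on locale charmap, which prints the character encoding of the active locale; the two messages are our own wording.

```shell
# locale charmap prints the character encoding of the active locale.
charmap=$(locale charmap)

if [ "$charmap" = "UTF-8" ]; then
    echo "OK: a UTF-8 locale is active"
else
    echo "Not UTF-8: $charmap"
fi
```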

Setting up the terminal emulator

How you set up UTF-8 support in your terminal emulator strongly depends on which one you use. If you use xterm, you can start it as xterm -en utf-8; konsole and the Gnome Terminal can be configured in their respective configuration menus.

Testing the terminal emulator

To test whether your terminal emulator works, print a non-ASCII character such as the Euro sign from your shell.
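The original test line is not reproduced here; as a stand-in, the following sketch writes the UTF-8 bytes of the Euro sign directly with printf, which avoids depending on how the snippet itself was pasted.

```shell
# Emit the three UTF-8 bytes of the Euro sign (U+20AC): e2 82 ac.
printf '\342\202\254\n'
```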

This should print a Euro sign € on the console. If it prints a single question mark instead, your fonts might not contain it. Try installing additional fonts. If multiple different (nonsensical) characters are shown, the wrong character encoding is configured. Keep trying :-).

If you use SSH to log in to another machine, repeat the previous steps, making sure that the locale is set correctly, and that you can view a non-ASCII character like the Euro sign.

Screen

The screen program can work with UTF-8 if you tell it to.

The easiest (and sometimes the only) way is to start it with the -U option, as in screen -U, and to pass -U again when reattaching (screen -Urd or so).

Inside a running screen you can try Ctrl+a :utf8 on. If that doesn’t work, exit your screen and start a new one with -U.

Irssi

There’s a complete guide for setting up irssi to use UTF-8, which partially overlaps with this one.
