Text editor windows unix line endings

How to disable Unix line endings support in Windows 10 Notepad

In Windows 10 build 17661, Microsoft introduced a small improvement to Notepad that system administrators, software developers, and anyone who works across operating systems will appreciate. The built-in Windows 10 text editor can now display text documents created on Unix systems.

But couldn't Notepad already read such files, you might ask?

It could, but it displayed them incorrectly. Notepad only handled files that marked the end of each line in the Windows EOL format, namely a carriage return (CR) followed by a line feed (LF). Unix systems mark the end of a line with LF alone, which made text documents created on Linux and Mac OS practically impossible to read and edit in Windows.

With Unix line endings support, Notepad now shows such files in a perfectly readable form. There is, however, some chance that the change will break your scripts in some way.

If that happens, you can turn off Unix line endings support with a simple registry tweak.

Run regedit to open the Registry Editor and expand the following key:

HKEY_CURRENT_USER\Software\Microsoft\Notepad

If the last subkey does not exist, create it manually.

In it, create two 32-bit DWORD values, fWindowsOnlyEOL and fPasteOriginalEOL. Set the first value to 1.

Leave the second at its default, which is 0.

For the changes to take effect, sign out of your account and sign back in.

If the Notepad subkey already exists, set both values as shown above and sign in again. Unix line endings support will then be disabled.

If you later want to restore the default settings, simply clear the contents of the Notepad subkey.
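As an alternative to clicking through regedit, the same two values could be written to a .reg file and imported by double-clicking it. The following is only a sketch: the helper name write_notepad_reg and the output file name are made up for illustration, and it assumes the older REGEDIT4 text format, which current versions of regedit still import. Signing out and back in is still needed afterwards.

```sh
# Sketch: write a .reg file holding the two Notepad values described above.
# write_notepad_reg is a hypothetical helper name, not a standard tool.
write_notepad_reg() {
  # printf reuses the format for every argument, so each line ends in CR LF,
  # the usual line ending for Windows text files.
  printf '%s\r\n' \
    'REGEDIT4' \
    '' \
    '[HKEY_CURRENT_USER\Software\Microsoft\Notepad]' \
    '"fWindowsOnlyEOL"=dword:00000001' \
    '"fPasteOriginalEOL"=dword:00000000' \
    > "$1"
}

# Example: write_notepad_reg notepad-eol.reg
```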

Converting from Windows-style to UNIX-style line endings

The Problem

In a plain text file, to tell the computer that a line of text doesn’t continue forever, the end of each line is marked by a sequence of one or more invisible characters, called control characters. While there are many control characters for different purposes, the relevant ones for line endings are the carriage return (CR) and line feed (LF) characters.

Unfortunately, the programmers of different operating systems have represented line endings using different sequences:

  • All versions of Microsoft Windows represent line endings as CR followed by LF.
  • UNIX and UNIX-like operating systems (including Mac OS X) represent line endings as LF alone.

Therefore, a text file prepared in a Windows environment will, when copied to a UNIX-like environment such as a NeSI cluster, have an unnecessary carriage return character at the end of each line. To make matters worse, this character will normally be invisible, though in some text editors it will show up as ^M or similar.

Many programs, including the Slurm and LoadLeveler batch queue schedulers, will give errors when given a file containing carriage return characters as input.

Therefore, you will need to convert any such file so it has only UNIX-style line endings before using it on a NeSI cluster.

The Symptoms

In the Slurm job scheduler

If you submit (using sbatch ) a Slurm submission script with Windows-style line endings, you will likely receive an error saying that the batch script contains DOS line breaks (\r\n) instead of the expected UNIX line breaks (\n).

In other programs

Some UNIX or Linux programs tolerate Windows-style line endings, while others give errors. The text of the error varies widely from program to program, but typical behaviours include the following:

  • Explicitly stating the problem with line endings
  • Complaining more vaguely that the input data is incomplete or corrupt or that there are problems reading it
  • Failing in a more serious way such as a segmentation fault

Checking a file’s line ending format

If you have what you think is a text file on the cluster but you don’t know whether its line endings are in the correct format or not, you can inspect it with the file utility, for example by running file foo.txt .

Depending on the contents of foo.txt , the output of this command may vary, but if the output has «CR» or «CRLF» in it, you will need to convert foo.txt to UNIX format line endings if you want to use it on the cluster.
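If you would rather have a yes/no answer inside a script, the check can be approximated with grep. This is a minimal sketch rather than a replacement for file: has_crlf is a hypothetical helper name, and it only looks for lines that end in a carriage return.

```sh
# Sketch: succeed if the file has at least one line ending in CR LF.
# has_crlf is a hypothetical helper name, not a standard command.
has_crlf() {
  cr=$(printf '\r')          # a literal carriage return character
  grep -q "${cr}\$" "$1"     # match any line whose last character is CR
}

# Example usage on a scratch file with Windows-style line endings:
sample="${TMPDIR:-/tmp}/crlf-sample.$$"
printf 'line one\r\nline two\r\n' > "$sample"
if has_crlf "$sample"; then
  echo "needs conversion"
else
  echo "already UNIX format"
fi
rm -f "$sample"
```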

How to Convert

Converting using Notepad++

In the Windows text editing program Notepad++ (not to be confused with ordinary Notepad), there is a function to prepare text files with UNIX-style line endings.

To write your file in this way, while you have the file open, go to the Edit menu, select the «EOL Conversion» submenu, and from the options that come up select «UNIX/OSX Format». The next time you save the file, its line endings will, all going well, be saved with UNIX-style line endings.

You can check what format line endings you are currently editing in by looking in the status bar at the bottom of the window. Between the range box (a box containing Ln, Col and Sel entries) and the text encoding box (which will contain UTF-8, ANSI, or some other technical string) will be a box containing the current line ending format.

  • In most cases, this box will contain the text «DOS\Windows».
  • In a few cases, such as the file having been prepared on a UNIX or Linux machine or a Mac, it will contain the text «UNIX».
  • It is possible, though highly unlikely by now, that the file may have old-style (pre-OSX) Mac line endings, in which case the box will contain the text «Macintosh».

Please note that if you change a file’s line ending style, you must save your changes before copying the file anywhere, including to a cluster.

Converting using dos2unix

Suppose, though, that you’ve copied a text file to the cluster already, and you realise you need to convert it to UNIX format. How do you do that?

Simple: Use the program dos2unix .

Just give the name of your file to dos2unix as an argument (for example, dos2unix foo.txt ), and it will convert the file’s line endings to UNIX format in place.

There are other options in the rare case that you don’t want to just modify your existing file; run man dos2unix for details.
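One such option is writing the converted text to a new file and leaving the original alone, which dos2unix does with -n. The sketch below wraps that in a hypothetical helper, crlf_to_lf_copy, and falls back to sed when dos2unix is not installed. The fallback is a stand-in, not what dos2unix does internally.

```sh
# Sketch: convert CR LF to LF into a new file, leaving the input untouched.
# crlf_to_lf_copy is a hypothetical helper name.
crlf_to_lf_copy() {
  if command -v dos2unix >/dev/null 2>&1; then
    dos2unix -n "$1" "$2"            # -n: new-file mode, input is not modified
  else
    cr=$(printf '\r')
    sed "s/${cr}\$//" "$1" > "$2"    # fallback: drop one trailing CR per line
  fi
}

# Example: crlf_to_lf_copy job.sl job.unix.sl
```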

Configure Visual Studio to use UNIX line endings

We would like to use Visual Studio 2005 to work on a local copy of an SVN repository. This local copy has been checked out by Mac OS X (and updates and commits will only be made under Mac OS X, so no problem there), and as a consequence the line endings are UNIX-style.

We fear that Visual Studio will introduce Windows-style line endings. Is it possible to force Visual Studio to use UNIX line endings?


Warning: This solution no longer works for Visual Studio 2017 and later. Instead, both of the answers by jcox and Munther Jaber are needed. I have combined them into one answer.

As OP states «File > Advanced Save Options», select Unix Line Endings.

This will only affect new files that are created. Fixing any that were previously created can be done file by file, or you can search for tools that will fix them in bulk.
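Since the checkout in the question lives on Mac OS X, a bulk fix could also be done from a shell there. The following is a rough sketch only: strip_cr_tree is a hypothetical helper, the '*.cs' pattern is just an example, and it assumes file names without embedded newlines. Try it on a copy of the working tree first.

```sh
# Sketch: strip a trailing CR from every line of each matching file under a directory.
# strip_cr_tree is a hypothetical helper: strip_cr_tree <directory> <name pattern>
strip_cr_tree() {
  cr=$(printf '\r')
  tmp="${TMPDIR:-/tmp}/strip-cr.$$"
  find "$1" -type f -name "$2" | while IFS= read -r f; do
    # Convert into a scratch file, then copy the result back over the original
    # so the original file keeps its permissions.
    sed "s/${cr}\$//" "$f" > "$tmp" && cat "$tmp" > "$f"
  done
  rm -f "$tmp"
}

# Example: strip_cr_tree ./src '*.cs'
```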


Here are some options available for Visual Studio Community 2017

  1. «File > Advanced Save Options» has been removed by Microsoft due to «uncommon use», whatever that means: https://developercommunity.visualstudio.com/content/problem/8290/file-advanced-save-options-option-is-missed.html You can add it back by going to «Tools > Customize», opening the «Commands» tab, selecting «File» in the drop-down next to «Menu Bar», and then choosing «Add Command» > «File» > «Advanced Save Options...». You can then reorder it in the File menu by using «Move Down».

I don’t know if you will then have to set the advanced save options for each and every file, but it might prevent the issue I was having where my Visual Studio kept adding CRLF line endings into my files that were uniformly LF.

But I took it one step further and I added an extension called «Line Endings Unifier» by going to «Tools>Extensions and Updates>Online» and then searching for «line endings» in the search bar to the right. I will use this to automatically force all of my scripts to save with uniform line endings of my choice, but you can do more with it. https://marketplace.visualstudio.com/items?itemName=JakubBielawa.LineEndingsUnifier

strip’em is another solution that does something similar to Line Endings Unifier. http://www.grebulon.com/software/stripem.php

I am not sure how they differ or the advantages/disadvantages of either. I’m mainly using Line Endings Unifier just because it was in the Visual Studio Marketplace. I think I’ve used all of these methods in the past, but my memory is fuzzy.

How to find out line-endings in a text file?

I’m trying to use something in bash to show me the line endings in a file printed rather than interpreted. The file is a dump from SSIS/SQL Server being read in by a Linux machine for processing.

Are there any switches within vi , less , more , etc?

In addition to seeing the line-endings, I need to know what type of line end it is ( CRLF or LF ). How do I find that out?


You can use the file utility to give you an indication of the type of line endings.

To convert from «DOS» to Unix, run dos2unix on the file.

To convert from Unix to «DOS», run unix2dos on it instead.

Converting an already converted file has no effect so it’s safe to run blindly (i.e. without testing the format first) although the usual disclaimers apply, as always.
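The claim that a second conversion changes nothing can be checked on a scratch file. The sketch below uses two small sed-based stand-ins, to_unix and to_dos (hypothetical names, not the real dos2unix and unix2dos), so it runs even where those tools are not installed.

```sh
# Sketch: sed-based stand-ins for the two conversions; both print to stdout.
cr=$(printf '\r')
to_unix() { sed "s/${cr}\$//" "$1"; }                        # CR LF -> LF
to_dos()  { sed "s/${cr}\$//" "$1" | sed "s/\$/${cr}/"; }    # normalise, then LF -> CR LF

# Convert a DOS-style scratch file twice and compare the two results.
work="${TMPDIR:-/tmp}/eol-demo.$$"
printf 'one\r\ntwo\r\n' > "$work.dos"
to_unix "$work.dos"  > "$work.once"
to_unix "$work.once" > "$work.twice"
cmp -s "$work.once" "$work.twice" && echo "second conversion changed nothing"
rm -f "$work.dos" "$work.once" "$work.twice"
```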

simple cat -e works just fine.

This displays Unix line endings ( \n or LF) as $ and Windows line endings ( \r\n or CRLF) as ^M$ .
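A quick way to see both markers side by side is to pipe a two-line sample through cat -e. The sample text here is made up for illustration.

```sh
# One LF-terminated line and one CR LF-terminated line, shown with cat -e.
printf 'unix line\nwindows line\r\n' | cat -e
# prints:
# unix line$
# windows line^M$
```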

In vi , use :set list to see line-endings.

Use :set nolist to go back to normal.

While I don’t think you can see \n or \r\n in vi , you can see which type of file it is (UNIX, DOS, etc.) to infer which line endings it has.

Alternatively, from bash you can use od -t c or just od -c to display the returns.
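As a small illustration of the od approach (the sample bytes are made up), a CR LF line ending appears in the dump as \r followed by \n, while a plain UNIX line ending appears as \n alone:

```sh
# Dump a two-line sample character by character.
# The first line ends in CR LF, the second in LF only.
printf 'a\r\nb\n' | od -c
```

The exact column spacing of the dump can differ between od implementations, so look for the \r entries rather than relying on the layout.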

In the bash shell, try cat -v . This should display carriage-returns for windows files.

(This worked for me in rxvt via Cygwin on Windows XP).

Editor’s note: cat -v visualizes \r (CR) chars. as ^M . Thus, line-ending \r\n sequences will display as ^M at the end of each output line. cat -e will additionally visualize \n , namely as $ . ( cat -et will additionally visualize tab chars. as ^I .)

To show CR as ^M in less use less -u or type -u once less is open.

Try file then file -k then dos2unix -ih

file will usually be enough. But for tough cases try file -k or dos2unix -ih .

Try file -k

Short version: file -k somefile.txt will tell you.

  • It will output with CRLF line terminators for DOS/Windows line endings.
  • It will output with CR line terminators for old Mac line endings.
  • And for Linux/Unix «LF» line endings it will just output text . (So if it does not explicitly mention any kind of line endings then this implicitly means: «LF line endings».)

Long version see below.


Real world example: Certificate Encoding

I sometimes have to check this for PEM certificate files.

The trouble with regular file is this: Sometimes it’s trying to be too smart/too specific.

Let’s try a little quiz: I’ve got some files. And one of these files has different line endings. Which one?

(By the way: this is what one of my typical «certificate work» directories looks like.)

Let’s try regular file :

Huh. It’s not telling me the line endings. And I already knew that those were cert files. I didn’t need «file» to tell me that.

What else can you try?

You might try dos2unix with the --info switch like this:

So that tells you that: yup, «0.example.end.cer» must be the odd man out. But what kind of line endings are there? Do you know the dos2unix output format by heart? (I don’t.)

But fortunately there’s the --keep-going (or -k for short) option in file :

Excellent! Now we know that our odd file has DOS ( CRLF ) line endings. (And the other files have Unix ( LF ) line endings. This is not explicit in this output. It’s implicit. It’s just the way file expects a «regular» text file to be.)

(If you wanna share my mnemonic: «L» is for «Linux» and for «LF».)

Now let’s convert the culprit and try again:

Good. Now all certs have Unix line endings.

Try dos2unix -ih

I didn’t know this when I was writing the example above but:

Actually it turns out that dos2unix will give you a header line if you use -ih (short for --info=h ) like so:

And another «actually» moment: The header format is really easy to remember: Here’s two mnemonics:

  1. It’s DUMB (left to right: d for Dos, u for Unix, m for Mac, b for BOM).
  2. And also: «DUM» is just the alphabetical ordering of D, U and M.
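To make the first three columns concrete, the counts can be approximated with standard tools. This is a sketch under assumptions, not how dos2unix itself is implemented: eol_counts is a hypothetical helper that prints the DOS, UNIX and MAC counts for one file in that order, and it ignores the BOM column.

```sh
# Sketch: print "DOS UNIX MAC" line-break counts for one file.
# eol_counts is a hypothetical helper name.
eol_counts() {
  # Keep only the CR and LF bytes and rewrite them as the letters c and l.
  marks=$(tr -cd '\r\n' < "$1" | tr '\r\n' 'cl')
  # Every "cl" pair is one CR LF (DOS) line break.
  dos=$(printf '%s\n' "$marks" | awk '{ n += gsub(/cl/, "") } END { print n + 0 }')
  lf_total=$(printf '%s' "$marks" | tr -cd 'l' | wc -c)
  cr_total=$(printf '%s' "$marks" | tr -cd 'c' | wc -c)
  # LFs and CRs that are not part of a CR LF pair count as UNIX and MAC breaks.
  echo "$dos $(($lf_total - $dos)) $(($cr_total - $dos))"
}

# Example: eol_counts 0.example.end.cer
```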


You can use xxd to show a hex dump of the file, and hunt through for «0d0a» or «0a» chars.

You can use cat -v as @warriorpostman suggests.

You may use the command todos filename to convert to DOS endings, and fromdos filename to convert to UNIX line endings. To install the package on Ubuntu, type sudo apt-get install tofrodos .

You can use vim -b filename to edit a file in binary mode, which shows each carriage return as ^M while every LF still starts a new line, so a ^M at the end of each line indicates Windows CRLF line endings. By LF I mean \n and by CR I mean \r . Note that when you use the -b option the file will always be edited in UNIX mode by default as indicated by [unix] in the status line, meaning that if you add new lines they will end with LF, not CRLF. If you use normal vim without -b on a file with CRLF line endings, you should see [dos] shown in the status line and inserted lines will have CRLF as end of line. The vim documentation for the fileformats setting explains the complexities.

Also, I don’t have enough points to comment on the Notepad++ answer, but if you use Notepad++ on Windows, use the View / Show Symbol / Show End of Line menu to display CR and LF. In this case LF is shown whereas for vim the LF is indicated by a new line.

I dump my output to a text file. I then open it in Notepad++ and click the show all characters button. Not very elegant but it works.
