Linux Tutorial — 3. More About Files
More About Files!
Kinda boring but essential knowledge.
Introduction
After the previous section I’m sure you’re keen and eager to get stuck into some more commands and start doing some actual playing about with the system. We will get to that shortly but first we need to cover some theory so that when we do start playing with the system you can fully understand why it is behaving the way it is and how you can take the commands you learn even further. That is what this section and the next intend to do. After that it will start getting interesting, I promise.
Everything is a File
OK, the first thing we need to appreciate with Linux is that under the hood, everything is actually a file. A text file is a file, a directory is a file, your keyboard is a file (one that the system reads from only), your monitor is a file (one that the system writes to only) etc. To begin with, this won’t affect what we do too much but keep it in mind as it helps with understanding the behaviour of Linux as we manage files and directories.
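If you'd like to see this for yourself, here is a small sketch. It assumes a Linux system where /dev/null exists; the names notes and folder are made up for the demonstration. The first character of each ls -l line tells you what kind of file you are looking at.

```shell
# Scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
touch "$workdir/notes"
mkdir "$workdir/folder"

# The first character of an `ls -l` line is the file type:
#   -  regular file    d  directory    c  character device
regular_type=$(ls -ld "$workdir/notes" | cut -c1)
directory_type=$(ls -ld "$workdir/folder" | cut -c1)
device_type=$(ls -ld /dev/null | cut -c1)

echo "notes:     $regular_type"
echo "folder:    $directory_type"
echo "/dev/null: $device_type"

rm -r "$workdir"
```

All three are reached through a path, and all three show up in a listing; only the type character differs.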
Linux is an Extensionless System
This one can sometimes be hard to get your head around but as you work through the sections it will start to make more sense. A file extension is normally a set of 2 to 4 characters after a full stop at the end of a file name, which denotes what type of file it is. The following are common extensions:
- file.exe — an executable file, or program.
- file.txt — a plain text file.
- file.png, file.gif, file.jpg — an image.
In other systems such as Windows the extension is important and the system uses it to determine what type of file it is. Under Linux the system actually ignores the extension and looks inside the file to determine what type of file it is. So for instance I could have a file myself.png which is a picture of me. I could rename the file to myself.txt or just myself and Linux would still happily treat the file as an image file. As such it can sometimes be hard to know for certain what type of file a particular file is. Luckily there is a command called file which we can use to find this out.
The file command takes a single command line argument, and its usage is file [path]. Now you may be wondering why I describe that argument as a path instead of a file. If you remember from the previous section, whenever we specify a file or directory on the command line it is actually a path. Also because directories (as mentioned above) are actually just a special type of file, it would be more accurate to say that a path is a means to get to a particular location in the system and that location is a file.
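Here is a quick sketch of file in action. It assumes the file command is installed (it is on most distributions) and that your version supports the -b and --mime-type options; the misleading name myself.png is chosen purely for illustration. We put plain text into a file with an image extension and ask file what it really is.

```shell
workdir=$(mktemp -d)

# Plain text content stored under a deliberately misleading name.
printf 'Just some plain text.\n' > "$workdir/myself.png"

# --mime-type prints the detected type; -b leaves the file name out of the output.
detected=$(file -b --mime-type "$workdir/myself.png")
echo "$detected"

rm -r "$workdir"
```

The answer should come from the content, not the name, so this should report a text type rather than an image.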
Linux is Case Sensitive
This is very important and a common source of problems for people new to Linux. Other systems such as Windows are case insensitive when it comes to referring to files. Linux is not like this. As such it is possible to have two or more files and directories with the same name but letters of different case.
- ls Documents
- FILE1.txt File1.txt file1.TXT
- ...
- file Documents/file1.txt
- Documents/file1.txt: ERROR: cannot open ‘file1.txt’ (No such file or directory)
Linux sees these all as distinct and separate files.
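A minimal sketch to try this out, assuming a typical Linux file system (which is case sensitive). The file names match the listing above and live in a scratch directory.

```shell
workdir=$(mktemp -d)
touch "$workdir/FILE1.txt" "$workdir/File1.txt" "$workdir/file1.TXT"

# Three names that differ only by case give three separate files.
file_count=$(ls "$workdir" | wc -l | tr -d ' ')
echo "files created: $file_count"

# file1.txt (all lower case) was never created, so looking for it fails.
if [ -e "$workdir/file1.txt" ]; then
    lowercase_exists=yes
else
    lowercase_exists=no
fi
echo "file1.txt exists: $lowercase_exists"

rm -r "$workdir"
```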
Also be aware of case sensitivity when dealing with command line options. For instance with the command ls there are two options, -s and -S, which do different things. A common mistake is to see an option which is upper case but enter it as lower case and wonder why the output doesn’t match your expectation.
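To see the difference between the two ls options, here is a small sketch using two scratch files of different sizes (the names a_small and b_large are made up). Lower case -s shows the size of each file in blocks, while upper case -S sorts the listing by size.

```shell
workdir=$(mktemp -d)
printf 'x' > "$workdir/a_small"
dd if=/dev/zero of="$workdir/b_large" bs=1000 count=20 2>/dev/null

ls -s "$workdir"    # lower case: each name is prefixed with its size in blocks
ls -S "$workdir"    # upper case: sorted by size, largest first

# Alphabetically a_small comes first; sorted by size, b_large does.
largest_first=$(ls -S "$workdir" | head -n 1)
echo "first entry with -S: $largest_first"

rm -r "$workdir"
```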
Spaces in names
Spaces in file and directory names are perfectly valid but we need to be a little careful with them. As you would remember, a space on the command line is how we separate items. They are how we know what is the program name and can identify each command line argument. If we wanted to move into a directory called Holiday Photos for example the following would not work.
- ls Documents
- FILE1.txt File1.txt file1.TXT Holiday Photos
- ...
- cd Holiday Photos
- bash: cd: Holiday: No such file or directory
What happens is that Holiday Photos is seen as two command line arguments. cd moves into whichever directory is specified by the first command line argument only. To get around this we need to identify to the terminal that we wish Holiday Photos to be seen as a single command line argument. There are two ways to go about this, and both are equally valid.
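You can watch the splitting happen with a tiny sketch. count_args is a hypothetical helper defined here just for the demonstration; it prints how many command line arguments it was given.

```shell
# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

# The space separates Holiday and Photos into two distinct arguments.
unquoted=$(count_args Holiday Photos)
echo "unquoted: $unquoted argument(s)"
```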
Quotes
The first approach involves using quotes around the entire item. You may use either single or double quotes (later on we will see that there is a subtle difference between the two but for now that difference is not a problem). Anything inside quotes is considered a single item.
- cd 'Holiday Photos'
- pwd
- /home/ryan/Documents/Holiday Photos
Escape Characters
Another method is to use what is called an escape character, which is a backslash ( \ ). What the backslash does is escape (or nullify) the special meaning of the next character.
- cd Holiday\ Photos
- pwd
- /home/ryan/Documents/Holiday Photos
In the above example the space between Holiday and Photos would normally have a special meaning which is to separate them as distinct command line arguments. Because we placed a backslash in front of it, that special meaning was removed.
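Here is a short sketch comparing the approaches side by side. count_args is a hypothetical helper that prints how many arguments it received, and the Holiday Photos directory is created in a scratch location so it is safe to run anywhere.

```shell
# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

single_quoted=$(count_args 'Holiday Photos')
double_quoted=$(count_args "Holiday Photos")
escaped=$(count_args Holiday\ Photos)
echo "single quotes: $single_quoted, double quotes: $double_quoted, backslash: $escaped"

# The same idea with cd, run in a subshell so our own location is unchanged.
workdir=$(mktemp -d)
mkdir "$workdir/Holiday Photos"
landed=$(cd "$workdir"/Holiday\ Photos && basename "$PWD")
echo "moved into: $landed"

rm -r "$workdir"
```

Each form hands the name over as one argument, which is why cd finds the directory.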
In the previous section we learnt about something called Tab Completion. If you use that before encountering the space in the directory name then the terminal will automatically escape any spaces in the name for you.
Hidden Files and Directories
Linux actually has a very simple and elegant mechanism for specifying that a file or directory is hidden. If the file or directory’s name begins with a . (full stop) then it is considered to be hidden. You don’t even need a special command or action to make a file hidden. Files and directories may be hidden for a variety of reasons. Configuration files for a particular user (which are normally stored in their home directory) are hidden for instance so that they don’t get in the way of the user doing their everyday tasks.
To make a file or directory hidden all you need to do is create the file or directory with its name beginning with a . or rename it to be as such. Likewise you may rename a hidden file to remove the . and it will become unhidden. The command ls which we have seen in the previous section will not list hidden files and directories by default. We may modify it by including the command line option -a so that it does show hidden files and directories.
- ls Documents
- FILE1.txt File1.txt file1.TXT
- ...
- ls -a Documents
- . .. FILE1.txt File1.txt file1.TXT .hidden .file.txt
- ...
In the above example you will see that when we listed all items in the Documents directory the first two items were . and .. (a single and a double full stop). If you’re unsure what these are then you may wish to have a read over our previous section on Paths.
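A small sketch of hiding and unhiding, using made-up file names in a scratch directory. It also uses the related option -A, which lists hidden items but leaves out . and ..

```shell
workdir=$(mktemp -d)
touch "$workdir/visible.txt" "$workdir/.hidden"

plain=$(ls "$workdir")           # hidden items are left out
all=$(ls -a "$workdir")          # everything, including . and ..
almost_all=$(ls -A "$workdir")   # hidden items, but without . and ..

echo "ls:    $plain"
echo "ls -a:"; echo "$all"
echo "ls -A:"; echo "$almost_all"

# Renaming to remove the leading . makes the file show up again.
mv "$workdir/.hidden" "$workdir/unhidden"
after_rename=$(ls "$workdir")
echo "after rename:"; echo "$after_rename"

rm -r "$workdir"
```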
Summary
Activities
Right, now let’s put this stuff into practice. Have a go at the following:
- Try running the command file giving it a few different entries. Make sure you use a variety of absolute and relative paths when doing this.
- Now issue a command that will list the contents of your home directory including hidden files and directories.
Why does Linux use the file extension to decide the default program for opening a file, even though it is supposed to be independent of file extensions?
I have a text file named abc.text whose contents are: Hi I’m a text file.
If I double click to open this file, then the file is opened in the gedit editor.
Whereas, if I rename the file to abc.html (without changing any of its contents) then by default it opens in Chrome.
This sort of behavior is acceptable on a Windows machine, since Windows uses file extensions to identify file types. But as far as I’ve read, Linux doesn’t need file extensions.
So why does changing the file extension in Linux change the default program that opens it?
3 Answers
Linux doesn’t use file extensions to decide how to open a file, but Linux uses file extensions to decide how to open a file.
The problem here is that “Linux” can designate different parts of the operating system, and “opening a file” can mean different things too.
A difference between Linux and Windows is how they treat application files vs data files. On Windows, the line between the two is blurred; there are a few types of executable files, and they are determined by their extension ( .exe , .bat , etc.), but in most contexts you can “execute” any file (e.g. by clicking in Explorer), and this executes the executable that is associated with that file type, where the file type is entirely determined by the extension (so executing a .doc file might start c:\Program Files\something or other\winword.exe , executing a .py file might start a Python interpreter, etc.).
On Linux, there is a notion of executable file which is independent of the file name. Executables generally have no extension, because they’re meant to be typed by the user. The type of the file is irrelevant, all the user wants to do is execute the file. The kernel determines how to execute the file from the file contents: it knows some file types natively, and the shebang mechanism allows a file to declare any other executable file¹ as its interpreter.
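As a rough illustration of the shebang mechanism, here is a sketch. The script name greet is hypothetical and deliberately has no extension; it also assumes /bin/sh exists and that the temporary directory allows execution.

```shell
workdir=$(mktemp -d)

# The first line names the interpreter; the file name plays no part.
cat > "$workdir/greet" <<'EOF'
#!/bin/sh
echo "hello from a script with no extension"
EOF
chmod a+x "$workdir/greet"

# The kernel reads the #! line and starts /bin/sh on the file.
output=$("$workdir/greet")
echo "$output"

rm -r "$workdir"
```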
On the other hand, data files usually do have an extension that indicates the type of data. The general idea here is that the type of data is not synonymous with what application to use to open the file with. You may want to view a PDF in Okular, or in Evince, or in Xpdf, or in Acroread, or in Mupdf, etc.
There are many tools that do however allow opening a data file without having to explicitly specify what application to use. These tools almost exclusively base their decision on the file extension. The file extension and the file’s content are the only information that these tools have at their disposal: Linux does not store any meta information regarding the file format. So when you click on a .pdf file in a file manager (or when you run the .pdf file on a suitably-configured zsh command line, etc.), the file manager consults a database to find which application is the preferred one for .pdf files. This database may be structured in two sections, one that associates extensions to MIME types ( /etc/mime.types , ~/.local/share/mime ) and one that associates MIME types to applications ( /etc/mailcap , ~/.local/share/applications ), but even so the origin is the extension. While it would often be possible to figure out the application from the file content, this would be slower, and not always possible (many formats look just like text files, a .jar is a type of .zip , etc.).
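To make the two-stage lookup concrete, here is a toy sketch. Both tables are invented for illustration and are not read from the real databases; the function names mime_for_extension and app_for_mime are hypothetical, and the application choices simply echo the programs mentioned in this thread.

```shell
# Stage 1 (hypothetical table): extension -> MIME type.
mime_for_extension() {
    case "$1" in
        *.pdf)        echo "application/pdf" ;;
        *.html)       echo "text/html" ;;
        *.text|*.txt) echo "text/plain" ;;
        *)            echo "application/octet-stream" ;;
    esac
}

# Stage 2 (hypothetical table): MIME type -> preferred application.
app_for_mime() {
    case "$1" in
        application/pdf) echo "evince" ;;
        text/html)       echo "chrome" ;;
        text/plain)      echo "gedit" ;;
        *)               echo "(ask the user)" ;;
    esac
}

for name in abc.text abc.html report.pdf; do
    mime=$(mime_for_extension "$name")
    echo "$name -> $mime -> $(app_for_mime "$mime")"
done
```

Nothing here opens the file: renaming abc.text to abc.html changes the result even though the content is identical, which is the behaviour the question describes.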
Linux doesn’t need file extensions, and it doesn’t use them to determine how to run an executable file, but it does use them to determine which program to use to open a data file.
¹ That file has to be a native executable; a shebang executable can’t point to another shebang executable, which avoids potentially unending recursion.
File extensions and association with programs in Linux
In Windows we can associate a file’s extension with programs.
E.g. a file test.pl can be run by the installed Perl interpreter due to the .pl extension.
In Linux though it needs #!/usr/bin/perl as the first line.
Is this because there is no association between file extensions and programs in Linux?
6 Answers
No, it doesn’t mean that. If you have a text-file that has its execute-permission set (e.g. chmod a+x somefile ), and the first line of the file is something like #!/usr/bin/perl , it just tells Unix what program to use to execute the script. If a text-file is marked as an executable (i.e. is a script), Unix will start whatever program is specified in this way, and send the rest of the text-file (the script) to this program. Typically the program specified will be a shell ( /bin/sh , /bin/csh or /bin/bash ), the interpreter for some programming-language (Perl, Python or Ruby) or some other program that executes scripts (like the text-manipulators Awk or Sed).
In many languages "#" starts a comment; it is only when the first line begins with "#!" that it means something special. If a file is marked as executable but doesn’t start with "#!", Unix will assume it’s some kind of binary (e.g. an ELF-executable made by the C-compiler and linker).
In general Unix doesn’t rely on the suffix of files. Many programs neither need nor automatically add their typical suffixes, one exception being the compression-programs (like gzip and bzip2 ), which usually replace the original file with a compressed one, adding a suffix to mark the compression-type (these are among the few programs that complain about an incorrect suffix).
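A rough sketch of that exception, assuming gzip is installed; the file names notes and notes.bin are made up. It compresses a file, then renames the result so the suffix no longer matches and checks whether gzip is still willing to decompress it.

```shell
workdir=$(mktemp -d)
printf 'some data\n' > "$workdir/notes"

# gzip replaces notes with notes.gz, adding the suffix itself.
gzip "$workdir/notes"
if [ -e "$workdir/notes.gz" ] && [ ! -e "$workdir/notes" ]; then
    compressed=yes
else
    compressed=no
fi
echo "replaced by notes.gz: $compressed"

# Same bytes under an unexpected suffix: gzip refuses to decompress in place.
mv "$workdir/notes.gz" "$workdir/notes.bin"
if gzip -d "$workdir/notes.bin" 2>/dev/null; then
    suffix_check=accepted
else
    suffix_check=rejected
fi
echo "unexpected suffix: $suffix_check"

rm -r "$workdir"
```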
Instead the file is identified by its content through a series of tests, looking for "magic-numbers" and other identifiers (you can try the command file on some files to test this). This is also used by the file-browsers under GNOME and KDE to select icons and the list of programs to open/edit the file. Here the MIME-type of the file is identified by such tests, then the suitable programs for viewing and editing are found from a list associated to the MIME-type, not the suffix as in Windows.
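To give a feel for what a magic-number test looks like, here is a deliberately simplified sketch. sniff is a hypothetical helper, far cruder than the real file command: it reads the first four bytes of a file as hex and compares them with a few well-known signatures. The file names are made up and chosen so that the suffix gives no hint.

```shell
# Hypothetical helper: classify a file by its first four bytes only.
sniff() {
    signature=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$signature" in
        89504e47) echo "PNG image" ;;              # \x89 P N G
        7f454c46) echo "ELF executable" ;;         # \x7f E L F
        2321*)    echo "script with a #! line" ;;  # # !
        *)        echo "unknown" ;;
    esac
}

workdir=$(mktemp -d)
# The 8-byte PNG signature, written with octal escapes, under a .txt name.
printf '\211PNG\r\n\032\n' > "$workdir/picture.txt"
# A Perl script with no suffix at all.
printf '#!/usr/bin/perl\nprint "hi\\n";\n' > "$workdir/tool"

png_guess=$(sniff "$workdir/picture.txt")
script_guess=$(sniff "$workdir/tool")
echo "picture.txt: $png_guess"
echo "tool:        $script_guess"

rm -r "$workdir"
```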
Since one of the tests would be to check if the first line of a text-file is "#!/something", and then look at what "something" is, you could say for example that #!/usr/bin/perl identified the file as a perl-script, but that is more of a side-effect. Even if the file doesn’t start with "#!", the tests should be able to correctly identify the file. In any case, it’s the content of the file that is used to identify it, not an arbitrary suffix. As such, endings like .pl (Perl) and .awk (Awk) are purely to help a human user identify the type of file; they are not used by Unix to determine the type (like the suffixes in Windows).
You can actually make a "script" without the "#!/something", but Unix wouldn’t be able to automatically run it as an executable (it wouldn’t know what program to run the script in). Instead you’d have to "manually" start it with something like perl myscript or python myscript . Many scripts in larger Python and Perl applications will actually not start with "#!/something", as they’re scripts for "internal use" and not intended to be invoked by the user directly.
Instead you’ll start the main script (which does start with a "#!/something"), and then it will pass these other scripts to the interpreter as this script runs.
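A minimal sketch of the "manual" route, using sh as the interpreter so it runs anywhere; the script name helper is hypothetical. The file has no "#!" line and no execute permission, yet it runs fine once the interpreter is named explicitly.

```shell
workdir=$(mktemp -d)

# No #! line and no chmod: this is just a text file containing commands.
printf 'echo "run by an explicitly named interpreter"\n' > "$workdir/helper"

# The interpreter is chosen on the command line, not by the file.
output=$(sh "$workdir/helper")
echo "$output"

rm -r "$workdir"
```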