extract from line to line and then save to separate file
I tried my luck with grep and sed but somehow I don’t manage to get it right.
I have a log file which is about 8 GB in size. I need to analyze a 15 minute time period of suspicious activity. I located the part of the log file that I need to look at and I am trying to extract those lines and save it into a separate file. How would I do that on a regular CentOS machine?
My last try was this, but it didn’t work. I am at a loss when it comes to sed and that type of command.
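For reference, a minimal sketch of the kind of command that would do this, assuming the log lines carry searchable timestamps (the patterns, format and paths below are made up):

```shell
# Build a tiny stand-in for the 8 GB log:
printf '%s\n' \
    '10:00:00 normal activity' \
    '10:05:00 suspicious activity begins' \
    '10:19:59 suspicious activity ends' \
    '10:30:00 normal again' > /tmp/sample.log

# -n suppresses sed's default output; p prints only the lines from the
# first match of the start pattern through the first match of the end.
sed -n '/^10:05/,/^10:19/p' /tmp/sample.log > /tmp/extract.log
```

The same idea works with line numbers instead of patterns: sed -n '1234,5678p' file > extract.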
3 Answers
Probably the best way to do this is with shell redirection, as others have mentioned. sed, though a personal favorite, is probably not going to do this more efficiently than head, which is designed to grab only so many lines from a file.
There are other answers on this site which demonstrably show that for large files head -n[num] | tail -n[num] will outperform sed every time, but probably even faster than that is to eschew the pipe altogether.
I created a file like:
And I ran it through:
I only used sed at all there to grab only the first and last line to show you.
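A sketch of what that looked like, scaled down (the file name and line numbers here are stand-ins for the original test):

```shell
# Generate a sample infile (the original answer used a far larger file):
seq 10000 > /tmp/lines_demo

# Two heads share one input redirection: the first discards lines 1..4999
# and leaves the offset where it stopped; the second prints the next 10.
{ head -n 4999 >/dev/null
  head -n 10
} < /tmp/lines_demo | sed -n '1p;$p'    # show only the first and last line
# prints:
# 5000
# 5009
```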
This works because when you group commands with { cmd1; cmd2; } and redirect the input for the group, as in { cmd1; cmd2; } <infile, all of them will share the same input. Most commands will exhaust the whole infile while reading it, so in a { cmd1; cmd2; } <infile case, usually cmd1 reads from the head of the infile to its tail and cmd2 is left with none.
head , however, will always seek only so far through its infile as it is instructed to, and so in a { head -n [num] >/dev/null; head -n [num]; } <infile case the first head seeks through to line [num] and dumps its output to /dev/null, and the second is left to begin its read where the first left off.
This construct also works with other kinds of compound commands. For example:
But it might also work like:
Above, the shell initially sets the $n and $d variables:
- $n
- The line count as reported by wc for my test file /tmp/5mil_lines
- $d
- The quotient of $n/43 where 43 is just some arbitrarily selected divisor.
It then loops until it has decremented $n by $d to a value less than $d . While doing so it saves its split count in $s and uses that value in the loop to increment the number in the named output file, /tmp/[num].split . The result is that each iteration reads an equal number of newline-delimited lines from its infile into a new outfile, splitting it out equally 43 times over the course of the loop. It manages this without having to read its infile more than 2 times: the first is when wc counts its lines, and for the rest of the operation it only reads as many lines as it writes to the outfile each time.
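Reconstructed as a sketch, with names and paths following the description above (scaled down from the 5-million-line original):

```shell
seq 100000 > /tmp/5mil_lines            # stand-in for the test file

n=$(wc -l < /tmp/5mil_lines)            # $n: the line count reported by wc
d=$((n / 43))                           # $d: the quotient of $n/43
s=0
{
    # Every head in here shares the same input offset, so after wc the
    # infile is read straight through exactly once.
    while [ "$n" -gt "$d" ]; do
        s=$((s + 1))
        head -n "$d" > "/tmp/${s}.split"
        n=$((n - d))
    done
    head -n "$n" > "/tmp/$((s + 1)).split"   # whatever remains
} < /tmp/5mil_lines
```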
After running it I checked my results.
cat line X to line Y on a huge file
Say I have a huge text file (>2GB) and I just want to cat the lines X to Y (e.g. 57890000 to 57890010).
From what I understand I can do this by piping head into tail or vice versa, i.e.
where A, B, C and D can be computed from the number of lines in the file, X and Y.
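Sketched with a scaled-down stand-in file (the letter names map onto the computed counts):

```shell
seq 1000 > /tmp/huge_demo     # stand-in for the huge file
X=500; Y=510

# head into tail: A = Y lines off the top, keep the last B = Y-X+1 of them
head -n "$Y" /tmp/huge_demo | tail -n "$((Y - X + 1))"

# tail into head: C needs N, the total line count; D = Y-X+1 again
N=$(wc -l < /tmp/huge_demo)
tail -n "$((N - X + 1))" /tmp/huge_demo | head -n "$((Y - X + 1))"
```

Both pipelines print lines 500 through 510.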
But there are two problems with this approach:
- You have to compute A, B, C and D.
- The commands could pipe many more lines between them than I am interested in reading (e.g. if I am reading just a few lines in the middle of a huge file).
Is there a way to have the shell simply output the lines I want, while providing only X and Y?
7 Answers
I suggest the sed solution, but for the sake of completeness,
To cut out after the last line:
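For instance, a sketch of the sed form with an explicit quit, so sed stops reading the rest of the huge file once the last wanted line is printed (the filename and numbers are stand-ins):

```shell
seq 100 > /tmp/seq_demo
X=50; Y=55
# -n: no default output; p prints the range; q quits right at line Y
sed -n "${X},${Y}p;${Y}q" /tmp/seq_demo
# prints lines 50 through 55
```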
Speed test (here on macOS, YMMV on other systems):
- 100,000,000-line file generated by seq 100000000 > test.in
- Reading lines 50,000,000-50,000,010
- Tests in no particular order
- real time as reported by bash's builtin time
These are by no means precise benchmarks, but the difference is clear and repeatable enough* to give a good sense of the relative speed of each of these commands.
*: Except between the first two, sed -n p;q and head|tail , which seem to be essentially the same.
If you want lines X to Y inclusive (starting the numbering at 1), use
tail will read and discard the first X-1 lines (there’s no way around that), then read and print the following lines. head will read and print the requested number of lines, then exit. When head exits, tail receives a SIGPIPE signal and dies, so it won’t have read more than a buffer size’s worth (typically a few kilobytes) of lines from the input file.
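That is, a sketch with a scaled-down file (X and Y stand in for the question's line numbers):

```shell
seq 1000 > /tmp/xy_demo
X=500; Y=510
# tail skips the first X-1 lines; head prints Y-X+1 lines and exits,
# and tail then dies on SIGPIPE instead of reading the rest.
tail -n "+$X" /tmp/xy_demo | head -n "$((Y - X + 1))"
```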
Alternatively, as gorkypl suggested, use sed:
The sed solution is significantly slower though (at least for GNU utilities and Busybox utilities; sed might be more competitive if you extract a large part of the file on an OS where piping is slow and sed is fast). Here are quick benchmarks under Linux; the data was generated by seq 100000000 >/tmp/a , the environment is Linux/amd64, /tmp is tmpfs and the machine is otherwise idle and not swapping.
If you know the byte range you want to work with, you can extract it faster by skipping directly to the start position. But for lines, you have to read from the beginning and count newlines. To extract blocks from x inclusive to y exclusive starting at 0, with a block size of b:
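A sketch of that block extraction with dd (the sample data and numbers are made up):

```shell
printf 'abcdefghij' > /tmp/bytes_demo
b=2; x=1; y=4
# skip x blocks of size b, then copy y-x blocks: bytes 2..7 of the file here
dd if=/tmp/bytes_demo bs="$b" skip="$x" count="$((y - x))" 2>/dev/null
# prints: cdefgh
```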
The head | tail approach is one of the best and most "idiomatic" ways to do this:
As pointed out by Gilles in the comments, a faster way is
The reason this is faster is that the first X - 1 lines don't need to go through the pipe, as they do in the head | tail approach.
Your question as phrased is a bit misleading and probably explains some of your unfounded misgivings towards this approach.
You say you have to calculate A, B, C and D but as you can see, the line count of the file is not needed and at most 1 calculation is necessary, which the shell can do for you anyway.
You worry that piping will read more lines than necessary. In fact this is not true: tail | head is about as efficient as you can get in terms of file I/O. First, consider the minimum amount of work necessary: to find the X'th line in a file, the only general way is to read every byte and stop when you have counted X newline symbols, as there is no way to divine the file offset of the X'th line. Once you reach the X'th line, you have to read all the lines in order to print them, stopping at the Y'th line. Thus no approach can get away with reading fewer than Y lines. Now, head -n $Y reads no more than Y lines (rounded to the nearest buffer unit, but buffers, used correctly, improve performance, so that overhead is nothing to worry about). In addition, tail will not read any more than head, so we have shown that head | tail reads the fewest lines possible (again, plus some negligible buffering that we are ignoring). The only efficiency advantage of a single-tool approach that uses no pipes is fewer processes (and thus less overhead).
Copy a long single-line text from a terminal with undesired change-line
locate . | less outputs a long path, which is shown as several lines in the terminal, although it is actually a single line.
I copy the path from the terminal by selecting the text with the mouse and hitting Ctrl + Shift + C . When I paste it into a text file, I get unwanted line breaks in the text, exactly where the line wrapped in the terminal.
But I remember that sometimes I can copy a long path output by locate . | less without introducing unwanted line breaks, and sometimes I can't. I can't tell whether I am doing something differently.
So how can I make sure the problem will not happen? Thanks!
My OS: Ubuntu 12.04.
My terminal: Gnome Terminal 3.4.1.1
5 Answers
The real behavior of this is the following:
if you are in less with a file that has a very long line, then scrolling down over the long line and back up leaves it as multiple lines when you copy the whole text. After that, when you scroll the split lines down past the bottom of the terminal and back up again, the line breaks are removed again.
You can copy the lines of output without extra newlines (line breaks) if the text was output directly to the terminal.
The terminal can keep track of where the real line ends are.
But if the lines were output by less or a similar pager program, the terminal does not know where the newlines are. The pager uses the terminal as a full screen of characters, and tells the terminal «put these characters there» and operations like scrolling. But, for the terminal, there’s no way to see where newlines are intended.
Instead of copying what is displayed on the terminal (only what fits on the screen, with the line breaks the display imposes), copy the actual text. Use one of the external utilities xsel or xclip (they have mostly the same features; I'll use xsel in this answer) to copy data from or to the X clipboard. To copy to the clipboard, pass the desired content on standard input. When pasting from the clipboard, the content is written to standard output.
In less, use the | command to pipe a range of lines through a command. Scroll to the first line you want to act on, type mm to set a mark, scroll to the last line, and type |mxsel -b and press Enter . Two marks are predefined: ^ for the beginning of the file, $ for the end of the file. Thus, to copy the whole file, go to the top with g and use |$xsel -b . To copy a single line, use mm|mxsel -b and Enter .
Remove the -b option to copy to the primary selection instead of the clipboard.
Copy previous line to current line from text file
I have a file with lines, just like this:
I want to create a duplicate file in bash that contains each line merged with the copy of next line, like:
7 Answers
which, with your input, gives:
Quick’n’dirty way (involves reading the file twice):
What that sed expression is doing:
- x : save the incoming line in hold space, and retrieve the previous one
- G : append the new line (from hold space) to the old one
- s_\n_;_ : replace line-break with a ; .
- 1d : if this is the first line, delete it (don’t print it) and advance to next
- $ { ... } : if this is the last line, then:
- p : first print the joined pair
- x : retrieve the final line
- s_$_;_ : append final ;
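Assembled from those steps, the command might read as follows (a reconstruction; the sample input is made up):

```shell
printf 'line1\nline2\nline3\n' |
    sed 'x;G;s_\n_;_;1d;${p;x;s_$_;_}'
# prints:
# line1;line2
# line2;line3
# line3;
```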
Somewhat simpler sed solution without hold space:
- $!N to join next line (if any; the $! is not needed with GNU sed when not in POSIX mode)
- y/\n/;/ replace the newline with ;
- p : print the resulting line
- y/;/\n/ to change back to newline, so with
- D you can get rid of the first line and continue with the next one
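Assembled, that might read (a reconstruction from the steps above; note that this variant leaves the last line without a trailing ; ):

```shell
printf 'line1\nline2\nline3\n' |
    sed '$!N;y/\n/;/;p;y/;/\n/;D'
# prints:
# line1;line2
# line2;line3
# line3
```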
Or, if you’re into the whole brevity thing:
A Vim Solution
One can issue this command (credit to Conspicuous Compiler for suggesting this):
Alternatively and probably more customarily, start Vim and open the file and issue the ex command:
Substitute the pattern:
- the new line character, \n
- a character group, \(.+\) , which composes the entire next line. The quantifier which follows, * , just indicates that there can be zero or more matches
with the following:
- a semicolon
- followed by a reference to the character group, \1
- followed by the new line character, \n
- followed by a second reference to the character group, \1 .
Copy text from nano editor to shell [closed]
Is it possible to copy text from a file, opened with nano , to the shell?
I have a text file, and I want to copy several lines to the console, but I cannot find a keyboard shortcut to copy the text.
14 Answers
Nano to Shell:
1. Using mouse to mark the text.
2. Right-Click the mouse in the Shell.
Within Nano:
1. CTRL + 6 (or CTRL + Shift + 6 , or hold Shift and move the cursor) for Mark Set, and mark what you want ( End can help to reach the end of a line).
2. ALT + 6 for copying the marked text.
3. CTRL + u at the place you want to paste.
or
1. CTRL + 6 (or CTRL + Shift + 6 , or hold Shift and move the cursor) for Mark Set, and mark what you want ( End can help to reach the end of a line).
2. CTRL + k for cutting what you want to copy
3. CTRL + u for pasting what you have just cut because you just want to copy.
4. CTRL + u at the place you want to paste.
For whoever is still looking for a copy and paste solution in the nano editor:
To select text
- ctrl + 6
- Use arrow to move the cursor to where you want the mark to end
Note: If you want to copy the whole line, there is no need to mark; just move the cursor to the line
To copy: ALT + 6
To paste: CTRL + u
Much easier method:
$ cat my_file
Ctrl + Shift + c to copy the required output from the terminal
Ctrl + Shift + v to paste it wherever you like
nano does not seem to have the ability to copy/paste from the global/system clipboard or shell.
However, you can copy text from one file to another using nano 's file buffers. When you open another file buffer with ^R ( Ctrl + r ), you can use nano 's built-in copy/paste functionality (outlined below) to copy between files:
- M-6 ( Meta + 6 ) to copy lines to nano ‘s clipboard.
- ^K ( Ctrl + k ) to cut the current line and store it in nano ‘s clipboard.
- ^^ ( Ctrl + Shift + 6 ) to select text. Once you have selected the text, you can use the above commands to copy it or cut it.
- ^U ( Ctrl + u ) to paste the text from nano ‘s clipboard.
Finally, if the above solution does not work for you and you are using a terminal emulator, you may be able to copy/paste from the global clipboard with Ctrl + Shift + c and Ctrl + Shift + v ( Cmd + c and Cmd + v on OSX) respectively. screen also provides an external copy/paste that should work in nano . If all you need to do is capture certain lines or text from a file, consider using grep to find the lines and xclip or xsel (or pbcopy / pbpaste on OSX) to copy them to the global clipboard (and/or paste from the clipboard) instead of nano .