Linux bash parse string

Bash Beginner Series #6: String Operations in Bash

Let’s manipulate some strings!

If you are familiar with variables in bash, you already know that there are no separate data types for string, int etc. Everything is a variable.

But this doesn’t mean that you don’t have string manipulation functions.

In the previous chapter, you learned arithmetic operators in Bash. In this chapter, you will learn how to manipulate strings using a variety of string operations. You will learn how to get the length of a string, concatenate strings, extract substrings, replace substrings, and much more!

Get string length

Let’s start with getting the length of a string in bash.

A string is nothing but a sequence (array) of characters. Let’s create a string named distro and initialize its value to “Ubuntu”.

Now to get the length of the distro string, you just have to add # before the variable name. You can use the following echo statement:
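As a minimal sketch of the two steps just described: the name distro and the value "Ubuntu" come from the text, while len is a variable introduced here only for illustration.

```bash
#!/bin/bash
distro="Ubuntu"      # the string described in the text
len=${#distro}       # ${#var} expands to the number of characters in var
echo "$len"          # prints 6
```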

Do note that the echo command is only for printing the value; ${#string} is what gives the length of the string.

Concatenating two strings

You can append a string to the end of another string; this process is called string concatenation.

To demonstrate, let’s first create two strings str1 and str2 as follows:

Now you can join both strings and assign the result to a new string named str3 as follows:
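A short sketch of the concatenation might look like this. The names str1, str2 and str3 come from the text; the values "hand" and "book" are assumptions chosen for illustration, since the original values are not shown.

```bash
#!/bin/bash
str1="hand"          # assumed example value
str2="book"          # assumed example value
str3=$str1$str2      # two expansions written side by side are joined
echo "$str3"         # prints handbook
```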

It cannot be simpler than this, can it?

Finding substrings

You can find the position (index) of a specific letter or word in a string. To demonstrate, let’s first create a string named str as follows:

Now you can get the position (index) of the substring Cool. To accomplish that, use the expr command:
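The following sketch assumes that str holds "Bash is Cool" (a value chosen so that the result matches the one quoted in the text) and that GNU expr is installed, since the index operator is a GNU extension. The variable pos is introduced only for illustration.

```bash
#!/bin/bash
str="Bash is Cool"                 # assumed value; "C" is the 9th character
pos=$(expr index "$str" "Cool")    # 1-based position of the first character
                                   # in str that is one of C, o or l
echo "$pos"                        # prints 9
```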

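The result 9 is the position where the word “Cool” starts in the str string. Note that expr index counts from 1 and reports the first position at which any character of its second argument occurs, so it is not a true substring search.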

I am deliberately avoiding using conditional statements such as if, else because in this bash beginner series, conditional statements will be covered later.

Extracting substrings

You can also extract substrings from a string; that is to say, you can extract a letter, a word, or a few words from a string.

To demonstrate, let’s first create a string named foss as follows:

Now let’s say you want to extract the first word “Fedora” in the foss string. You need to specify the starting position (index) of the desired substring and the number of characters you need to extract.

Therefore, to extract the substring “Fedora”, you will use 0 as the starting position and you will extract 6 characters from the starting position:
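As a sketch, assuming foss holds "Fedora is a free operating system" (the value is inferred from the words and positions quoted in the text; first is a name introduced here):

```bash
#!/bin/bash
foss="Fedora is a free operating system"   # assumed value of foss
first=${foss:0:6}                          # start at index 0, take 6 characters
echo "$first"                              # prints Fedora
```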

Notice that the first position in a string is zero just like the case with arrays in bash. You may also only specify the starting position of a substring and omit the number of characters. In this case, everything from the starting position to the end of the string will be extracted.

For example, to extract the substring “free operating system” from the foss string; we only need to specify the starting position 12:
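A sketch of the open-ended form, again assuming the value of foss shown in the comment (rest is a name introduced here):

```bash
#!/bin/bash
foss="Fedora is a free operating system"   # assumed value of foss
rest=${foss:12}                            # from index 12 to the end of the string
echo "$rest"                               # prints free operating system
```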

Replacing substrings

You can also replace a substring with another substring; for example, you can replace “Fedora” with “Ubuntu” in the foss string as follows:

Let’s do another example and replace the substring “free” with “popular”:
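The two replacements described so far can be sketched as follows, assuming foss holds the value shown in the comment:

```bash
#!/bin/bash
foss="Fedora is a free operating system"   # assumed value of foss
echo "${foss/Fedora/Ubuntu}"   # prints Ubuntu is a free operating system
echo "${foss/free/popular}"    # prints Fedora is a popular operating system
echo "$foss"                   # prints Fedora is a free operating system
```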

Since you are just printing the value with the echo command, the original string is not really altered.

Deleting substrings

You can also remove substrings. To demonstrate, let’s first create a string named fact as follows:

You can now remove the substring “big” from the string fact:
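A possible sketch, assuming fact holds "Sun is a big star" (the original value is not shown; shorter is a name introduced here). Omitting the replacement part of the expansion deletes the match.

```bash
#!/bin/bash
fact="Sun is a big star"   # assumed value of fact
shorter=${fact/big}        # no replacement given, so the first "big" is deleted
echo "$shorter"            # prints: Sun is a  star (the spaces on both sides of "big" remain)
```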

Let’s create another string named cell:

Now let’s say you want to remove all the dashes from the cell string; the following statement will only remove the first dash occurrence in the cell string:

To remove all dash occurrences from the cell string, you have to use double forward slashes as follows:
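The difference between the single-slash and double-slash forms can be sketched like this, assuming cell holds a dash-separated number such as the one in the comment (the original value is not shown):

```bash
#!/bin/bash
cell="112-358-1321"   # assumed value of cell
echo "${cell/-}"      # single slash: only the first dash goes, prints 112358-1321
echo "${cell//-}"     # double slash: every dash goes, prints 1123581321
```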

Notice that you are using echo statements and so the cell string is intact and not modified; you are just displaying the desired result!


To modify the string, you need to assign the result back to the string as follows:
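A sketch of the assignment, using the same assumed value for cell:

```bash
#!/bin/bash
cell="112-358-1321"   # assumed value of cell
cell=${cell//-}       # store the result back in the same variable
echo "$cell"          # prints 1123581321
```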

Converting upper and lower-case letters in string

You can also convert a string to lowercase or uppercase letters. Let’s first create two strings named legend and actor:

You can convert all the letters in the legend string to uppercase:

You can also convert all the letters in the actor string to lowercase:
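A sketch of the whole-string conversions follows. The values of legend and actor are assumptions (the originals are not shown), upper and lower are names introduced here, and the ^^ and ,, operators need Bash 4.0 or later.

```bash
#!/bin/bash
legend="john nash"       # assumed value
actor="JULIA ROBERTS"    # assumed value
upper=${legend^^}        # ^^ converts every letter to uppercase
lower=${actor,,}         # ,, converts every letter to lowercase
echo "$upper"            # prints JOHN NASH
echo "$lower"            # prints julia roberts
```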

You can also convert only the first character of the legend string to uppercase as follows:

Likewise, you can convert only the first character of the actor string to lowercase as follows:

You can also change certain characters in a string to uppercase or lowercase; for example, you can change the letters j and n to uppercase in the legend string as follows:
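The first-character and selected-character forms can be sketched as below, under the same assumptions about the values of legend and actor and again requiring Bash 4.0 or later. The names cap, uncap and picked are introduced only for illustration.

```bash
#!/bin/bash
legend="john nash"       # assumed value
actor="JULIA ROBERTS"    # assumed value
cap=${legend^}           # single ^ : uppercase the first character only
uncap=${actor,}          # single , : lowercase the first character only
picked=${legend^^[jn]}   # ^^ with a pattern: uppercase only the letters j and n
echo "$cap"              # prints John nash
echo "$uncap"            # prints jULIA ROBERTS
echo "$picked"           # prints JohN Nash
```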

Awesome! This brings us to the end of this tutorial in the bash beginner series. I hope you have enjoyed doing string manipulation in bash and stay tuned for next week as you will learn how to add decision-making skills to your bash scripts!


Bash String Manipulation Examples – Length, Substring, Find and Replace

In bash shell, when you use a dollar sign followed by a variable name, shell expands the variable with its value. This feature of shell is called parameter expansion.

But parameter expansion has numerous other forms which allow you to expand a parameter and modify the value or substitute other values in the expansion process. In this article, let us review how to use the parameter expansion concept for string manipulation operations.

This article is part of the on-going bash tutorial series. Refer to our earlier article on bash { } expansion.

1. Identify String Length inside Bash Shell Script

The format ${#string} is used to get the length of the given bash variable.

2. Extract a Substring from a Variable inside Bash Shell Script

Bash provides a way to extract a substring from a string. The following forms explain how to extract n characters starting from a particular position.

${string:position} extracts the substring from $string starting at $position.

${string:position:length} extracts $length characters from $string starting at $position. For example, ${string:15} returns the substring starting at position 15 (counting from zero), and ${string:15:4} returns the 4 characters starting at position 15. Length must be a number greater than or equal to zero (Bash 4.2 and later also accept a negative length, which is treated as an offset from the end of the string).
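A sketch of both forms at position 15. The value of var is an assumption, chosen so that index 15 falls on the "g" of "geekstuff"; tail and word are names introduced here.

```bash
#!/bin/bash
var="Welcome to the geekstuff"   # assumed value; index 15 is the "g"
tail=${var:15}                   # everything from index 15 onward
word=${var:15:4}                 # four characters starting at index 15
echo "$tail"                     # prints geekstuff
echo "$word"                     # prints geek
```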

Also, refer to our earlier article to understand more about $*, $@, $#, $$, $!, $?, $-, $_ bash special parameters.

3. Shortest Substring Match

The following syntax, ${string#substring}, deletes the shortest match of $substring from the front of $string.

The following syntax, ${string%substring}, deletes the shortest match of $substring from the back of $string.

Following sample shell script explains the above two shortest substring match concepts.

In the first echo statement, the pattern ‘*.’ matches any characters followed by a dot, and # strips from the front of the string, so it strips the substring “bash.” from the variable called filename. In the second echo statement, the pattern ‘.*’ matches a dot followed by any characters, and % strips from the back of the string, so it deletes the substring ‘.txt’.
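The shortest-match behaviour can be sketched as follows, assuming filename holds "bash.string.txt" (a value consistent with the substrings quoted in the text; front and back are names introduced here):

```bash
#!/bin/bash
filename="bash.string.txt"   # assumed value of filename
front=${filename#*.}         # shortest "*." at the front is "bash."
back=${filename%.*}          # shortest ".*" at the back is ".txt"
echo "$front"                # prints string.txt
echo "$back"                 # prints bash.string
```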

4. Longest Substring Match

The following syntax, ${string##substring}, deletes the longest match of $substring from the front of $string.

The following syntax, ${string%%substring}, deletes the longest match of $substring from the back of $string.

Following sample shell script explains the above two longest substring match concepts.

In the above example, ##*. strips the longest match for ‘*.’, which is “bash.string.”, so after stripping it prints the remaining “txt”. And %%.* strips the longest match for ‘.*’ from the back, which is “.string.txt”, so after stripping it returns “bash”.
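A sketch of the longest-match forms, assuming filename holds "bash.string.txt" as the explanation implies (ext and base are names introduced here):

```bash
#!/bin/bash
filename="bash.string.txt"   # assumed value of filename
ext=${filename##*.}          # longest "*." at the front is "bash.string."
base=${filename%%.*}         # longest ".*" at the back is ".string.txt"
echo "$ext"                  # prints txt
echo "$base"                 # prints bash
```

This pair is a common way to split a file name into its extension and its first component.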

5. Find and Replace String Values inside Bash Shell Script

Replace only first match

${string/pattern/replacement} matches the pattern in the variable $string and replaces only the first match of the pattern with the replacement.

Replace all the matches

${string//pattern/replacement} replaces all the matches of pattern with replacement.
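The two forms can be compared in a short sketch. The variable msg and its value are hypothetical, as are the names first_only and all_matches.

```bash
#!/bin/bash
msg="Path of the bash is /bin/bash"   # hypothetical example value
first_only=${msg/bash/sh}             # single slash: first match only
all_matches=${msg//bash/sh}           # double slash: every match
echo "$first_only"                    # prints Path of the sh is /bin/bash
echo "$all_matches"                   # prints Path of the sh is /bin/sh
```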

Talking about find and replace, refer to our earlier articles – sed substitute examples and Vim find and replace.

Replace beginning and end

The following syntax, ${string/#pattern/replacement}, substitutes the replacement string only when the pattern matches at the beginning of $string.

The following syntax, ${string/%pattern/replacement}, substitutes the replacement string only when the pattern matches at the end of $string.
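A sketch of the anchored forms, using a hypothetical variable file; the names at_start, at_end and no_match are also introduced here. The last line shows that nothing changes when the pattern is not at the anchored position.

```bash
#!/bin/bash
file="log.txt"                  # hypothetical example value
at_start=${file/#log/archive}   # "log" is at the beginning, so it is replaced
at_end=${file/%txt/bak}         # "txt" is at the end, so it is replaced
no_match=${file/#txt/bak}       # "txt" is not at the beginning: unchanged
echo "$at_start"                # prints archive.txt
echo "$at_end"                  # prints log.bak
echo "$no_match"                # prints log.txt
```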



expr length $string

    stringZ=abcABC123ABCabc

    echo ${#stringZ}                 # 15
    echo `expr length $stringZ`      # 15
    echo `expr "$stringZ" : '.*'`    # 15

Example 10-1. Inserting a blank line between paragraphs in a text file

    #!/bin/bash
    # paragraph-space.sh
    # Ver. 2.1, Reldate 29Jul12 [fixup]

    # Inserts a blank line between paragraphs of a single-spaced text file.
    # Usage: $0

Length of Matching Substring at Beginning of String

expr "$string" : '$substring'

$substring is a regular expression.

    stringZ=abcABC123ABCabc
    #       |------|
    #       12345678

    echo `expr match "$stringZ" 'abc[A-Z]*.2'`   # 8
    echo `expr "$stringZ" : 'abc[A-Z]*.2'`       # 8

expr index $string $substring

Numerical position in $string of the first character in $substring that matches.

    stringZ=abcABC123ABCabc
    #       123456 ...
    echo `expr index "$stringZ" C12`             # 6
                                                 # C position.

    echo `expr index "$stringZ" 1c`              # 3
    # 'c' (in #3 position) matches before '1'.

This is the near equivalent of strchr() in C.

${string:position}

Extracts substring from $string at $position.

If the $string parameter is "*" or "@", then this extracts the positional parameters, starting at $position.

${string:position:length}

Extracts $length characters of substring from $string at $position.

    stringZ=abcABC123ABCabc
    #       0123456789.....
    #       0-based indexing.

    echo ${stringZ:0}                            # abcABC123ABCabc
    echo ${stringZ:1}                            # bcABC123ABCabc
    echo ${stringZ:7}                            # 23ABCabc

    echo ${stringZ:7:3}                          # 23A
                                                 # Three characters of substring.

    # Is it possible to index from the right end of the string?

    echo ${stringZ:-4}                           # abcABC123ABCabc
    # Defaults to full string, as in ${parameter:-default}.
    # However . . .

    echo ${stringZ:(-4)}                         # Cabc
    echo ${stringZ: -4}                          # Cabc
    # Now, it works.
    # Parentheses or added space "escape" the position parameter.

    # Thank you, Dan Jacobson, for pointing this out.

The position and length arguments can be "parameterized," that is, represented as a variable, rather than as a numerical constant.

Example 10-2. Generating an 8-character "random" string

    #!/bin/bash
    # rand-string.sh
    # Generating an 8-character "random" string.

    if [ -n "$1" ]  #  If command-line argument present,
    then            #+ then set start-string to it.
      str0="$1"
    else            #  Else use PID of script as start-string.
      str0="$$"
    fi

    POS=2  # Starting from position 2 in the string.
    LEN=8  # Extract eight characters.

    str1=$( echo "$str0" | md5sum | md5sum )
    #  Doubly scramble     ^^^^^^   ^^^^^^
    #+ by piping and repiping to md5sum.

    randstring="${str1:$POS:$LEN}"
    # Can parameterize ^^^^ ^^^^

    echo "$randstring"

    exit $?

    # bozo$ ./rand-string.sh my-password
    # 1bdd88c4

    #  No, this is not recommended
    #+ as a method of generating hack-proof passwords.

If the $string parameter is "*" or "@", then this extracts a maximum of $length positional parameters, starting at $position.

    echo ${*:2}          # Echoes second and following positional parameters.
    echo ${@:2}          # Same as above.

    echo ${*:2:3}        # Echoes three positional parameters, starting at second.

expr substr $string $position $length

Extracts $length characters from $string starting at $position.

    stringZ=abcABC123ABCabc
    #       123456789......
    #       1-based indexing.

    echo `expr substr $stringZ 1 2`              # ab
    echo `expr substr $stringZ 4 3`              # ABC

expr match "$string" '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

expr "$string" : '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

    stringZ=abcABC123ABCabc
    #       =======

    echo `expr match "$stringZ" '\(.[b-c]*[A-Z]..[0-9]\)'`   # abcABC1
    echo `expr "$stringZ" : '\(.[b-c]*[A-Z]..[0-9]\)'`       # abcABC1
    echo `expr "$stringZ" : '\(.......\)'`                   # abcABC1
    # All of the above forms give an identical result.

expr match "$string" '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

expr "$string" : '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

    stringZ=abcABC123ABCabc
    #                ======

    echo `expr match "$stringZ" '.*\([A-C][A-C][A-C][a-c]*\)'`    # ABCabc
    echo `expr "$stringZ" : '.*\(......\)'`                       # ABCabc

${string#substring}

Deletes shortest match of $substring from front of $string.

${string##substring}

Deletes longest match of $substring from front of $string.

    stringZ=abcABC123ABCabc
    #       |----|          shortest
    #       |----------|    longest

    echo ${stringZ#a*C}      # 123ABCabc
    # Strip out shortest match between 'a' and 'C'.

    echo ${stringZ##a*C}     # abc
    # Strip out longest match between 'a' and 'C'.

    # You can parameterize the substrings.

    X='a*C'

    echo ${stringZ#$X}      # 123ABCabc
    echo ${stringZ##$X}     # abc
                            # As above.

${string%substring}

Deletes shortest match of $substring from back of $string.

    # Rename all filenames in $PWD with "TXT" suffix to a "txt" suffix.
    # For example, "file1.TXT" becomes "file1.txt" . . .

    SUFF=TXT
    suff=txt

    for i in *."$SUFF"      # Glob directly instead of parsing the output of ls.
    do
      mv -f "$i" "${i%.$SUFF}.$suff"
      #  Leave unchanged everything *except* the shortest pattern match
      #+ starting from the right-hand-side of the variable $i . . .
    done ### This could be condensed into a "one-liner" if desired.

    # Thank you, Rory Winston.

${string%%substring}

Deletes longest match of $substring from back of $string.

    stringZ=abcABC123ABCabc
    #                    ||     shortest
    #        |------------|     longest

    echo ${stringZ%b*c}      # abcABC123ABCa
    # Strip out shortest match between 'b' and 'c', from back of $stringZ.

    echo ${stringZ%%b*c}     # a
    # Strip out longest match between 'b' and 'c', from back of $stringZ.


This operator is useful for generating filenames.

Example 10-3. Converting graphic file formats, with filename change

    #!/bin/bash
    #  cvt.sh:
    #  Converts all the MacPaint image files in a directory to "pbm" format.

    #  Uses the "macptopbm" binary from the "netpbm" package,
    #+ which is maintained by Brian Henderson (bryanh@giraffe-data.com).
    #  Netpbm is a standard part of most Linux distros.

    OPERATION=macptopbm
    SUFFIX=pbm          # New filename suffix.

    if [ -n "$1" ]
    then
      directory=$1      # If directory name given as a script argument.
    else
      directory=$PWD    # Otherwise use current working directory.
    fi

    #  Assumes all files in the target directory are MacPaint image files,
    #+ with a ".mac" filename suffix.

    for file in $directory/*    # Filename globbing.
    do
      filename=${file%.*c}      #  Strip ".mac" suffix off filename
                                #+ ('.*c' matches everything
                                #+ between '.' and 'c', inclusive).
      $OPERATION $file > "$filename.$SUFFIX"
                                # Redirect conversion to new filename.
      rm -f $file               # Delete original files after converting.
      echo "$filename.$SUFFIX"  # Log what is happening to stdout.
    done

    exit 0

    # Exercise:
    # --------
    #  As it stands, this script converts *all* the files in the current
    #+ working directory.
    #  Modify it to work *only* on files with a ".mac" suffix.


    # *** And here's another way to do it. *** #

    #!/bin/bash
    # Batch convert into different graphic formats.
    # Assumes imagemagick installed (standard in most Linux distros).

    INFMT=png   # Can be tif, jpg, gif, etc.
    OUTFMT=pdf  # Can be tif, jpg, gif, pdf, etc.

    for pic in *"$INFMT"
    do
      p2=$(ls "$pic" | sed -e s/\.$INFMT//)
      # echo $p2
      convert "$pic" $p2.$OUTFMT
    done

    exit $?

Example 10-4. Converting streaming audio files to ogg

    #!/bin/bash
    # ra2ogg.sh: Convert streaming audio files (*.ra) to ogg.

    # Uses the "mplayer" media player program:
    #      http://www.mplayerhq.hu/homepage
    # Uses the "ogg" library and "oggenc":
    #      http://www.xiph.org/
    #
    # This script may need appropriate codecs installed, such as sipr.so .
    # Possibly also the compat-libstdc++ package.

    OFILEPREF=${1%%ra}      # Strip off the "ra" suffix.
    OFILESUFF=wav           # Suffix for wav file.
    OUTFILE="$OFILEPREF""$OFILESUFF"
    E_NOARGS=85

    if [ -z "$1" ]          # Must specify a filename to convert.
    then
      echo "Usage: `basename $0` [filename]"
      exit $E_NOARGS
    fi

    ##########################################################################
    mplayer "$1" -ao pcm:file=$OUTFILE
    oggenc "$OUTFILE"  # Correct file extension automatically added by oggenc.
    ##########################################################################

    rm "$OUTFILE"      # Delete intermediate *.wav file.
                       # If you want to keep it, comment out above line.

    exit $?

    #  Note:
    #  ----
    #  On a Website, simply clicking on a *.ram streaming audio file
    #+ usually only downloads the URL of the actual *.ra audio file.
    #  You can then use "wget" or something similar
    #+ to download the *.ra file itself.

    #  Exercises:
    #  ---------
    #  As is, this script converts only *.ra filenames.
    #  Add flexibility by permitting use of *.ram and other filenames.
    #
    #  If you're really ambitious, expand the script
    #+ to do automatic downloads and conversions of streaming audio files.
    #  Given a URL, batch download streaming audio files (using "wget")
    #+ and convert them on the fly.

Example 10-5. Emulating getopt

    #!/bin/bash
    # getopt-simple.sh
    # Author: Chris Morgan
    # Used in the ABS Guide with permission.

    getopt_simple()
    {
        echo "getopt_simple()"
        echo "Parameters are '$*'"
        until [ -z "$1" ]
        do
          echo "Processing parameter of: '$1'"
          if [ ${1:0:1} = '/' ]
          then
              tmp=${1:1}               # Strip off leading '/' . . .
              parameter=${tmp%%=*}     # Extract name.
              value=${tmp##*=}         # Extract value.
              echo "Parameter: '$parameter', value: '$value'"
              eval $parameter=$value
          fi
          shift
        done
    }

    # Pass all options to getopt_simple().
    getopt_simple $*

    echo "test is '$test'"
    echo "test2 is '$test2'"

    exit 0  # See also, UseGetOpt.sh, a modified version of this script.

    ---

    sh getopt_example.sh /test=value1 /test2=value2

    Parameters are '/test=value1 /test2=value2'
    Processing parameter of: '/test=value1'
    Parameter: 'test', value: 'value1'
    Processing parameter of: '/test2=value2'
    Parameter: 'test2', value: 'value2'
    test is 'value1'
    test2 is 'value2'

${string/substring/replacement}

Replace first match of $substring with $replacement.

${string//substring/replacement}

Replace all matches of $substring with $replacement.

    stringZ=abcABC123ABCabc

    echo ${stringZ/abc/xyz}       # xyzABC123ABCabc
                                  # Replaces first match of 'abc' with 'xyz'.

    echo ${stringZ//abc/xyz}      # xyzABC123ABCxyz
                                  # Replaces all matches of 'abc' with 'xyz'.

    echo  ---------------
    echo "$stringZ"               # abcABC123ABCabc
    echo  ---------------
                                  # The string itself is not altered!

    # Can the match and replacement strings be parameterized?
    match=abc
    repl=000
    echo ${stringZ/$match/$repl}  # 000ABC123ABCabc
    #              ^      ^         ^^^
    echo ${stringZ//$match/$repl} # 000ABC123ABC000
    # Yes!          ^      ^        ^^^         ^^^

    echo

    # What happens if no $replacement string is supplied?
    echo ${stringZ/abc}           # ABC123ABCabc
    echo ${stringZ//abc}          # ABC123ABC
    # A simple deletion takes place.

${string/#substring/replacement}

If $substring matches front end of $string, substitute $replacement for $substring.

${string/%substring/replacement}

If $substring matches back end of $string, substitute $replacement for $substring.

    stringZ=abcABC123ABCabc

    echo ${stringZ/#abc/XYZ}          # XYZABC123ABCabc
                                      # Replaces front-end match of 'abc' with 'XYZ'.

    echo ${stringZ/%abc/XYZ}          # abcABC123ABCXYZ
                                      # Replaces back-end match of 'abc' with 'XYZ'.

A Bash script may invoke the string manipulation facilities of awk as an alternative to using its built-in operations.

Example 10-6. Alternate ways of extracting and locating substrings
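The original listing for this example is not reproduced here. As a stand-in, the following is a minimal sketch (not the original script) of how awk's substr() and index() functions compare with Bash substring extraction. The variable names and the sample value are assumptions made for illustration.

```bash
#!/bin/bash
str="23skidoo1"      # hypothetical sample value
#    012345678         Bash positions (0-based)
#    123456789         awk positions  (1-based)

bash_sub=${str:2:4}                                            # Bash: offset 2, length 4
awk_sub=$(awk -v s="$str" 'BEGIN { print substr(s, 3, 4) }')   # awk: start at 3, length 4
awk_idx=$(awk -v s="$str" 'BEGIN { print index(s, "skid") }')  # awk: 1-based position of "skid"

echo "$bash_sub"     # prints skid
echo "$awk_sub"      # prints skid
echo "$awk_idx"      # prints 3
```

The point to notice is the indexing: Bash numbers the first character 0, while awk numbers it 1, so the same substring needs a start value one higher in awk.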
