Linux string extract substring

Landoflinux

Testing Variables using the substring command

Substrings and Variables

A substring is a sequence of characters within a string. For example, "am a subs" is a substring of the string "I am a substring". Bash lets you extract information from such strings. Below are some of the frequently used methods:

Length of a String

We can retrieve the length of a given string with ${#string}, where string is the name of our variable. We can illustrate this easily with a simple script:
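The listing itself is missing from this copy of the article. A minimal sketch of such a script, assuming the variable is called string and holds the sentence used in the example:

```bash
#!/bin/bash
# Hypothetical reconstruction: the variable name and its value are assumptions.
string="Welcome to the land of Linux"

# ${#string} expands to the number of characters in $string.
length=${#string}
echo "Length of string: $length"
```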

When we run the above script, we receive the length of the string. In this example, our string "Welcome to the land of Linux" is 28 characters in length.

Knowing the length of a string is valuable when you need to validate an entry. For example, you may require an entry to be at least x characters and at most y characters long.
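As a rough illustration of that kind of validation, the sketch below checks an entry against a minimum and maximum length. The limits and the sample entry are made-up values, not taken from the article:

```bash
#!/bin/bash
# Assumed limits and sample value, chosen only for illustration.
min=3
max=12
entry="penguin"

# Reject the entry if it is shorter than $min or longer than $max.
if [ "${#entry}" -lt "$min" ] || [ "${#entry}" -gt "$max" ]; then
    result="rejected"
else
    result="accepted"
fi

echo "Entry '$entry' (${#entry} characters): $result"
```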

Extracting a Substring from a Variable

Another very useful feature of Bash is the ability to extract specific sections of text from a variable. This is done with ${string:position} or ${string:position:length} within your script:

"string" is our variable name and "position" is the starting point of the extraction. We can fine-tune this with the "length" parameter: when length is used with position, the result starts at position x and runs for y characters. We can illustrate this with a simple script:
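The script is not reproduced in this copy. A sketch under the assumption that the variable string holds the same sentence as in the length example:

```bash
#!/bin/bash
# Assumed value, matching the sentence used earlier in the article.
string="Welcome to the land of Linux"

first=${string:0:7}   # 7 characters starting at position 0
rest=${string:15}     # everything from position 15 to the end
whole=${string:0}     # everything from position 0, i.e. the full string

echo "$first"
echo "$rest"
echo "$whole"
```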

Output from the above script:

Notice that our start position is 0 (zero), the very first character of our string. Therefore, starting at position 0 for a length of 7 characters gives the word "Welcome". Likewise, the statement ${string:15} returns "land of Linux". If we specify only ${string:0}, we get the full string, as position 0 is the starting position.

Replace FIRST match of $substring with replacement value

The expansion ${string/substring/replacement} replaces the first match of the specified substring with a replacement value:

Example:
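The example listing is missing here. A small sketch with a made-up sentence (the value is an assumption):

```bash
#!/bin/bash
# Hypothetical sample string containing the word "Linux" twice.
string="Linux is fun and Linux is free"

# A single slash replaces only the first match.
result=${string/Linux/Bash}
echo "$result"
```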

Output from the above script:

Replace ALL matches of $substring with replacement value

The expansion ${string//substring/replacement} replaces all matches of the specified substring with a replacement value:

Example:
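Again the listing is missing. A sketch with a made-up sentence (an assumption), this time using the double slash:

```bash
#!/bin/bash
# Hypothetical sample string containing the word "Linux" twice.
string="Linux is fun and Linux is free"

# A double slash replaces every match.
result=${string//Linux/Bash}
echo "$result"
```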

Output from the above script:

index — Numerical position in $string of first character in $substring that matches

The expr "index" command finds the first character in a string that matches any character of a given set (the "substring"). It reports the position of that character within the string, counting from 1, or 0 if no match is found:

The following syntax can be used: expr index "$string" $substring

The script below illustrates the basic use of the "index" command.
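The script is not included in this copy. The sketch below assumes GNU coreutils expr (index is a GNU extension) and uses a made-up string chosen so that "C" sits at position 5:

```bash
#!/bin/bash
# Hypothetical string: "C" is its 5th character and it contains no "Z".
string="THE CAT"

# Position of the first "C" in $string (positions start at 1).
found=$(expr index "$string" C)

# expr prints 0 and exits with status 1 when nothing matches,
# so "|| true" keeps the script going under "set -e".
missing=$(expr index "$string" Z || true)

echo "C found at position: $found"
echo "Z found at position: $missing"
```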

Output from the above script:

From the above output, we can see that the character "C" was found in position 5. However, no match was found for the character "Z", which resulted in a position of 0 being reported (not found).


Bash Beginner Series #6: String Operations in Bash

Let’s manipulate some strings!

If you are familiar with variables in bash, you already know that there are no separate data types for string, int etc. Everything is a variable.

But this doesn’t mean that you don’t have string manipulation functions.

In the previous chapter, you learned arithmetic operators in Bash. In this chapter, you will learn how to manipulate strings using a variety of string operations. You will learn how to get the length of a string, concatenate strings, extract substrings, replace substrings, and much more!

Get string length

Let’s start with getting the length of a string in bash.

A string is nothing but a sequence (array) of characters. Let’s create a string named distro and initialize its value to “Ubuntu”.


Now to get the length of the distro string, you just have to add # before the variable name inside the braces, as in ${#distro}. You can use the following echo statement:

Note that the echo command only prints the value; ${#string} is what gives the length of the string.
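The two statements described above can be sketched as follows, using the distro variable and the value "Ubuntu" from the text:

```bash
#!/bin/bash
distro="Ubuntu"

# ${#distro} expands to the number of characters in $distro.
len=${#distro}
echo "$len"
```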

Concatenating two strings

You can append a string to the end of another string; this process is called string concatenation.

To demonstrate, let’s first create two strings str1 and str2 as follows:

Now you can join both strings and assign the result to a new string named str3 as follows:
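A short sketch of the concatenation. The variable names come from the text, while the two values are assumptions picked for illustration:

```bash
#!/bin/bash
# Assumed sample values.
str1="hand"
str2="book"

# Writing two expansions next to each other joins them.
str3=$str1$str2
echo "$str3"
```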

It cannot be simpler than this, can it?

Finding substrings

You can find the position (index) of a specific letter or word in a string. To demonstrate, let’s first create a string named str as follows:

Now you can get the position (index) of the substring “Cool”. To accomplish that, use the expr command:

The result 9 is the position where the word “Cool” starts in the str string (expr counts positions from 1).
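A sketch of this lookup, assuming str holds "Bash is Cool" (a value consistent with the result 9) and that GNU expr with its index operator is available. Note that expr index matches any single character of the given set, so it only coincides with the start of the word when none of its letters occur earlier in the string. A pure-Bash cross-check is shown alongside:

```bash
#!/bin/bash
# Assumed value: the word "Cool" starts at the 9th character.
str="Bash is Cool"

# GNU expr: position of the first character of $str found in the set "Cool".
pos=$(expr index "$str" "Cool")

# Pure Bash: drop "Cool" and everything after it, then count what is left.
prefix=${str%%Cool*}
pos_bash=$(( ${#prefix} + 1 ))

echo "$pos"
echo "$pos_bash"
```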

I am deliberately avoiding using conditional statements such as if, else because in this bash beginner series, conditional statements will be covered later.

Extracting substrings

You can also extract substrings from a string; that is to say, you can extract a letter, a word, or a few words from a string.

To demonstrate, let’s first create a string named foss as follows:

Now let’s say you want to extract the first word “Fedora” in the foss string. You need to specify the starting position (index) of the desired substring and the number of characters you need to extract.

Therefore, to extract the substring “Fedora”, you will use 0 as the starting position and you will extract 6 characters from the starting position:

Notice that the first position in a string is zero just like the case with arrays in bash. You may also only specify the starting position of a substring and omit the number of characters. In this case, everything from the starting position to the end of the string will be extracted.

For example, to extract the substring “free operating system” from the foss string, you only need to specify the starting position 12:
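Both extractions can be sketched as follows. The value of foss is an assumption, chosen so that "Fedora" starts at position 0 and "free operating system" at position 12, as the text describes:

```bash
#!/bin/bash
# Assumed value, consistent with the positions given in the text.
foss="Fedora is a free operating system"

word=${foss:0:6}   # 6 characters starting at position 0
tail=${foss:12}    # everything from position 12 to the end

echo "$word"
echo "$tail"
```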

Replacing substrings

You can also replace a substring with another substring; for example, you can replace “Fedora” with “Ubuntu” in the foss string as follows:

Let’s do another example and replace the substring “free” with “popular”:

Since you are just printing the value with the echo command, the original string is not really altered.
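The two replacements can be sketched like this, again assuming foss holds "Fedora is a free operating system". The last line shows that the variable itself is untouched:

```bash
#!/bin/bash
# Assumed value of foss.
foss="Fedora is a free operating system"

swapped=${foss/Fedora/Ubuntu}   # replace "Fedora" with "Ubuntu"
popular=${foss/free/popular}    # replace "free" with "popular"

echo "$swapped"
echo "$popular"
echo "$foss"                    # still the original text
```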

Deleting substrings

You can also remove substrings. To demonstrate, let’s first create a string named fact as follows:

You can now remove the substring “big” from the string fact:

Let’s create another string named cell:

Now let’s say you want to remove all the dashes from the cell string; the following statement will only remove the first dash occurrence in the cell string:

To remove all dash occurrences from the cell string, you have to use double forward slashes as follows:

Notice that you are using echo statements and so the cell string is intact and not modified; you are just displaying the desired result!

To modify the string, you need to assign the result back to the string as follows:
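The deletion steps above can be sketched as follows. The variable names fact and cell come from the text; their values are assumptions. Leaving out the replacement part of ${string/substring/replacement} deletes the match:

```bash
#!/bin/bash
# Assumed sample values.
fact="Sun is a big star"
cell="112-358-1321"

no_big=${fact/big}       # removes "big"; the spaces around it remain
first_dash=${cell/-}     # removes only the first dash
all_dashes=${cell//-}    # removes every dash

echo "$no_big"
echo "$first_dash"
echo "$all_dashes"

# Assign the result back to actually modify the variable.
cell=${cell//-}
echo "$cell"
```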

Converting upper and lower-case letters in string

You can also convert a string to lowercase or uppercase letters. Let’s first create two strings named legend and actor:

You can convert all the letters in the legend string to uppercase:

You can also convert all the letters in the actor string to lowercase:

You can also convert only the first character of the legend string to uppercase as follows:

Likewise, you can convert only the first character of the actor string to lowercase as follows:

You can also change certain characters in a string to uppercase or lowercase; for example, you can change the letters j and n to uppercase in the legend string as follows:
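The case conversions described in this section can be sketched as below. The operators require Bash 4.0 or later, and the values of legend and actor are assumptions:

```bash
#!/bin/bash
# Requires Bash 4.0+. Sample values are assumed.
legend="john nash"
actor="JULIA ROBERTS"

upper=${legend^^}          # all letters to uppercase
lower=${actor,,}           # all letters to lowercase
cap_first=${legend^}       # first character to uppercase
low_first=${actor,}        # first character to lowercase
some_upper=${legend^^[jn]} # only the letters j and n to uppercase

echo "$upper"
echo "$lower"
echo "$cap_first"
echo "$low_first"
echo "$some_upper"
```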

Awesome! This brings us to the end of this tutorial in the bash beginner series. I hope you have enjoyed doing string manipulation in bash and stay tuned for next week as you will learn how to add decision-making skills to your bash scripts!


Linux string extract substring

${#string}

expr length $string

expr "$string" : '.*'

All three forms return the length of $string.

    stringZ=abcABC123ABCabc

    echo ${#stringZ}                 # 15
    echo `expr length $stringZ`      # 15
    echo `expr "$stringZ" : '.*'`    # 15

Example 10-1. Inserting a blank line between paragraphs in a text file

    #!/bin/bash
    # paragraph-space.sh
    # Ver. 2.1, Reldate 29Jul12 [fixup]

    # Inserts a blank line between paragraphs of a single-spaced text file.
    # Usage: $0

Length of Matching Substring at Beginning of String

expr match "$string" '$substring'

expr "$string" : '$substring'

$substring is a regular expression.

    stringZ=abcABC123ABCabc
    #       |------|
    #       12345678

    echo `expr match "$stringZ" 'abc[A-Z]*.2'`   # 8
    echo `expr "$stringZ" : 'abc[A-Z]*.2'`       # 8

expr index $string $substring

Numerical position in $string of first character in $substring that matches.

    stringZ=abcABC123ABCabc
    #       123456 ...
    echo `expr index "$stringZ" C12`             # 6
                                                 # C position.

    echo `expr index "$stringZ" 1c`              # 3
    # 'c' (in #3 position) matches before '1'.

This is the near equivalent of strchr() in C.

${string:position}

Extracts substring from $string at $position.

If the $string parameter is "*" or "@", then this extracts the positional parameters, starting at $position.

${string:position:length}

Extracts $length characters of substring from $string at $position.

    stringZ=abcABC123ABCabc
    #       0123456789.....
    #       0-based indexing.

    echo ${stringZ:0}                            # abcABC123ABCabc
    echo ${stringZ:1}                            # bcABC123ABCabc
    echo ${stringZ:7}                            # 23ABCabc

    echo ${stringZ:7:3}                          # 23A
                                                 # Three characters of substring.

    # Is it possible to index from the right end of the string?

    echo ${stringZ:-4}                           # abcABC123ABCabc
    # Defaults to full string, as in ${parameter:-default}.
    # However . . .

    echo ${stringZ:(-4)}                         # Cabc
    echo ${stringZ: -4}                          # Cabc
    # Now, it works.
    # Parentheses or added space "escape" the position parameter.

    # Thank you, Dan Jacobson, for pointing this out.

The position and length arguments can be "parameterized," that is, represented as a variable, rather than as a numerical constant.

Example 10-2. Generating an 8-character "random" string

    #!/bin/bash
    # rand-string.sh
    # Generating an 8-character "random" string.

    if [ -n "$1" ]  #  If command-line argument present,
    then            #+ then set start-string to it.
      str0="$1"
    else            #  Else use PID of script as start-string.
      str0="$$"
    fi

    POS=2  # Starting from position 2 in the string.
    LEN=8  # Extract eight characters.

    str1=$( echo "$str0" | md5sum | md5sum )
    #  Doubly scramble     ^^^^^^   ^^^^^^
    #+ by piping and repiping to md5sum.

    randstring="${str1:$POS:$LEN}"
    # Can parameterize ^^^^ ^^^^

    echo "$randstring"

    exit $?

    # bozo$ ./rand-string.sh my-password
    # 1bdd88c4

    #  No, this is not recommended
    #+ as a method of generating hack-proof passwords.

If the $string parameter is "*" or "@", then this extracts a maximum of $length positional parameters, starting at $position.

    echo ${*:2}          # Echoes second and following positional parameters.
    echo ${@:2}          # Same as above.

    echo ${*:2:3}        # Echoes three positional parameters, starting at second.

expr substr $string $position $length

Extracts $length characters from $string starting at $position.

    stringZ=abcABC123ABCabc
    #       123456789......
    #       1-based indexing.

    echo `expr substr $stringZ 1 2`              # ab
    echo `expr substr $stringZ 4 3`              # ABC

expr match "$string" '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

expr "$string" : '\($substring\)'

Extracts $substring at beginning of $string, where $substring is a regular expression.

    stringZ=abcABC123ABCabc
    #       =======

    echo `expr match "$stringZ" '\(.[b-c]*[A-Z]..[0-9]\)'`   # abcABC1
    echo `expr "$stringZ" : '\(.[b-c]*[A-Z]..[0-9]\)'`       # abcABC1
    echo `expr "$stringZ" : '\(.......\)'`                   # abcABC1
    # All of the above forms give an identical result.

expr match "$string" '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

expr "$string" : '.*\($substring\)'

Extracts $substring at end of $string, where $substring is a regular expression.

    stringZ=abcABC123ABCabc
    #                ======

    echo `expr match "$stringZ" '.*\([A-C][A-C][A-C][a-c]*\)'`    # ABCabc
    echo `expr "$stringZ" : '.*\(......\)'`                       # ABCabc

${string#substring}

Deletes shortest match of $substring from front of $string.

${string##substring}

Deletes longest match of $substring from front of $string.

    stringZ=abcABC123ABCabc
    #       |----|          shortest
    #       |----------|    longest

    echo ${stringZ#a*C}      # 123ABCabc
    # Strip out shortest match between 'a' and 'C'.

    echo ${stringZ##a*C}     # abc
    # Strip out longest match between 'a' and 'C'.

    # You can parameterize the substrings.

    X='a*C'

    echo ${stringZ#$X}      # 123ABCabc
    echo ${stringZ##$X}     # abc
                            # As above.

${string%substring}

Deletes shortest match of $substring from back of $string.

    # Rename all filenames in $PWD with "TXT" suffix to a "txt" suffix.
    # For example, "file1.TXT" becomes "file1.txt" . . .

    SUFF=TXT
    suff=txt

    for i in $(ls *.$SUFF)
    do
      mv -f $i ${i%.$SUFF}.$suff
      #  Leave unchanged everything *except* the shortest pattern match
      #+ starting from the right-hand-side of the variable $i . . .
    done ### This could be condensed into a "one-liner" if desired.

    # Thank you, Rory Winston.

${string%%substring}

Deletes longest match of $substring from back of $string.

    stringZ=abcABC123ABCabc
    #                    ||     shortest
    #        |------------|     longest

    echo ${stringZ%b*c}      # abcABC123ABCa
    # Strip out shortest match between 'b' and 'c', from back of $stringZ.

    echo ${stringZ%%b*c}     # a
    # Strip out longest match between 'b' and 'c', from back of $stringZ.

This operator is useful for generating filenames.

Example 10-3. Converting graphic file formats, with filename change


    #!/bin/bash
    #  cvt.sh:
    #  Converts all the MacPaint image files in a directory to "pbm" format.

    #  Uses the "macptopbm" binary from the "netpbm" package,
    #+ which is maintained by Brian Henderson (bryanh@giraffe-data.com).
    #  Netpbm is a standard part of most Linux distros.

    OPERATION=macptopbm
    SUFFIX=pbm          # New filename suffix.

    if [ -n "$1" ]
    then
      directory=$1      # If directory name given as a script argument.
    else
      directory=$PWD    # Otherwise use current working directory.
    fi

    #  Assumes all files in the target directory are MacPaint image files,
    #+ with a ".mac" filename suffix.

    for file in $directory/*    # Filename globbing.
    do
      filename=${file%.*c}      #  Strip ".mac" suffix off filename
                                #+ ('.*c' matches everything
                                #+ between '.' and 'c', inclusive).
      $OPERATION $file > "$filename.$SUFFIX"
                                # Redirect conversion to new filename.
      rm -f $file               # Delete original files after converting.
      echo "$filename.$SUFFIX"  # Log what is happening to stdout.
    done

    exit 0

    # Exercise:
    # --------
    #  As it stands, this script converts *all* the files in the current
    #+ working directory.
    #  Modify it to work *only* on files with a ".mac" suffix.


    # *** And here's another way to do it. *** #

    #!/bin/bash
    # Batch convert into different graphic formats.
    # Assumes imagemagick installed (standard in most Linux distros).

    INFMT=png   # Can be tif, jpg, gif, etc.
    OUTFMT=pdf  # Can be tif, jpg, gif, pdf, etc.

    for pic in *"$INFMT"
    do
      p2=$(ls "$pic" | sed -e s/\.$INFMT//)
      # echo $p2
      convert "$pic" $p2.$OUTFMT
    done

    exit $?

Example 10-4. Converting streaming audio files to ogg

    #!/bin/bash
    # ra2ogg.sh: Convert streaming audio files (*.ra) to ogg.

    # Uses the "mplayer" media player program:
    #      http://www.mplayerhq.hu/homepage
    # Uses the "ogg" library and "oggenc":
    #      http://www.xiph.org/
    #
    # This script may need appropriate codecs installed, such as sipr.so .
    # Possibly also the compat-libstdc++ package.

    OFILEPREF=${1%%ra}      # Strip off the "ra" suffix.
    OFILESUFF=wav           # Suffix for wav file.
    OUTFILE="$OFILEPREF""$OFILESUFF"
    E_NOARGS=85

    if [ -z "$1" ]          # Must specify a filename to convert.
    then
      echo "Usage: `basename $0` [filename]"
      exit $E_NOARGS
    fi

    ##########################################################################
    mplayer "$1" -ao pcm:file=$OUTFILE
    oggenc "$OUTFILE"  # Correct file extension automatically added by oggenc.
    ##########################################################################

    rm "$OUTFILE"      # Delete intermediate *.wav file.
                       # If you want to keep it, comment out above line.

    exit $?

    #  Note:
    #  ----
    #  On a Website, simply clicking on a *.ram streaming audio file
    #+ usually only downloads the URL of the actual *.ra audio file.
    #  You can then use "wget" or something similar
    #+ to download the *.ra file itself.

    #  Exercises:
    #  ---------
    #  As is, this script converts only *.ra filenames.
    #  Add flexibility by permitting use of *.ram and other filenames.
    #
    #  If you're really ambitious, expand the script
    #+ to do automatic downloads and conversions of streaming audio files.
    #  Given a URL, batch download streaming audio files (using "wget")
    #+ and convert them on the fly.

Example 10-5. Emulating getopt

    #!/bin/bash
    # getopt-simple.sh
    # Author: Chris Morgan
    # Used in the ABS Guide with permission.

    getopt_simple()
    {
        echo "getopt_simple()"
        echo "Parameters are '$*'"
        until [ -z "$1" ]
        do
          echo "Processing parameter of: '$1'"
          if [ ${1:0:1} = '/' ]
          then
              tmp=${1:1}               # Strip off leading '/' . . .
              parameter=${tmp%%=*}     # Extract name.
              value=${tmp##*=}         # Extract value.
              echo "Parameter: '$parameter', value: '$value'"
              eval $parameter=$value
          fi
          shift
        done
    }

    # Pass all options to getopt_simple().
    getopt_simple $*

    echo "test is '$test'"
    echo "test2 is '$test2'"

    exit 0  # See also, UseGetOpt.sh, a modified version of this script.

    ---

    sh getopt_example.sh /test=value1 /test2=value2

    Parameters are '/test=value1 /test2=value2'
    Processing parameter of: '/test=value1'
    Parameter: 'test', value: 'value1'
    Processing parameter of: '/test2=value2'
    Parameter: 'test2', value: 'value2'
    test is 'value1'
    test2 is 'value2'

${string/substring/replacement}

Replace first match of $substring with $replacement.

${string//substring/replacement}

Replace all matches of $substring with $replacement.

    stringZ=abcABC123ABCabc

    echo ${stringZ/abc/xyz}       # xyzABC123ABCabc
                                  # Replaces first match of 'abc' with 'xyz'.

    echo ${stringZ//abc/xyz}      # xyzABC123ABCxyz
                                  # Replaces all matches of 'abc' with 'xyz'.

    echo  ---------------
    echo "$stringZ"               # abcABC123ABCabc
    echo  ---------------
                                  # The string itself is not altered!

    # Can the match and replacement strings be parameterized?
    match=abc
    repl=000
    echo ${stringZ/$match/$repl}  # 000ABC123ABCabc
    #              ^      ^         ^^^
    echo ${stringZ//$match/$repl} # 000ABC123ABC000
    # Yes!          ^      ^        ^^^         ^^^

    echo

    # What happens if no $replacement string is supplied?
    echo ${stringZ/abc}           # ABC123ABCabc
    echo ${stringZ//abc}          # ABC123ABC
    # A simple deletion takes place.

${string/#substring/replacement}

If $substring matches front end of $string, substitute $replacement for $substring.

${string/%substring/replacement}

If $substring matches back end of $string, substitute $replacement for $substring.

    stringZ=abcABC123ABCabc

    echo ${stringZ/#abc/XYZ}          # XYZABC123ABCabc
                                      # Replaces front-end match of 'abc' with 'XYZ'.

    echo ${stringZ/%abc/XYZ}          # abcABC123ABCXYZ
                                      # Replaces back-end match of 'abc' with 'XYZ'.

A Bash script may invoke the string manipulation facilities of awk as an alternative to using its built-in operations.

Example 10-6. Alternate ways of extracting and locating substrings
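The listing for this example is not included in this copy. What follows is not the original listing, only a rough sketch of the idea: it compares Bash substring extraction with awk's substr() and index() functions, which count positions from 1 rather than 0. The variable names other than stringZ are made up for illustration:

```bash
#!/bin/bash
stringZ=abcABC123ABCabc

# Bash: 0-based position 3, length 3.
bash_sub=${stringZ:3:3}

# awk: the same characters start at 1-based position 4.
awk_sub=$(awk -v s="$stringZ" 'BEGIN { print substr(s, 4, 3) }')

# awk's index() returns the 1-based position of a whole substring.
awk_pos=$(awk -v s="$stringZ" 'BEGIN { print index(s, "123") }')

echo "$bash_sub"
echo "$awk_sub"
echo "$awk_pos"
```

Passing the shell variable with -v keeps the awk program in single quotes, which avoids quoting problems when the string contains special characters.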
