Bash One-Liners Explained, Part II: Working with strings
by Peteris Krumins
at 2012-08-23 20:36:46
original http://feedproxy.google.com/~r/catonmat/~3/HL3b-lAPWk8/bash-one-liners-explained-part-two
This is the second part of the Bash One-Liners Explained article series. In this part I'll show you how to do various string manipulations with bash. I'll use only the best bash practices, various bash idioms and tricks. I want to illustrate how to get various tasks done with just bash built-in commands and bash programming language constructs.
See the first part of the series for introduction. After I'm done with the series I'll release an ebook (similar to my ebooks on awk, sed, and perl), and also bash1line.txt (similar to my perl1line.txt).
Also see my other articles about working fast in bash from 2007 and 2008:
- Working Productively in Bash's Emacs Command Line Editing Mode (comes with a cheat sheet)
- Working Productively in Bash's Vi Command Line Editing Mode (comes with a cheat sheet)
- The Definitive Guide to Bash Command Line History (comes with a cheat sheet)
Let's start.
Part II: Working With Strings
1. Generate the alphabet from a-z
$ echo {a..z}
This one-liner uses brace expansion. Brace expansion is a mechanism for generating arbitrary strings. This one-liner uses a sequence expression of the form {x..y}, where x and y are single characters. The sequence expression expands to each character lexicographically between x and y, inclusive.
If you run it, you get all the letters from a-z:
$ echo {a..z}
a b c d e f g h i j k l m n o p q r s t u v w x y z
2. Generate the alphabet from a-z without spaces between characters
$ printf "%c" {a..z}
This is an awesome bash trick that 99.99% of bash users don't know about. If you supply more arguments to the printf builtin than the format consumes, it reuses the format until the argument list is empty! printf as a loop! There is nothing more awesome than that!
In this one-liner the printf format is "%c", which means "a character", and the arguments are all the letters from a-z. So printf walks through the argument list, printing one character per argument, until it runs out of letters.
Here is the output if you run it:
abcdefghijklmnopqrstuvwxyz
This output has no terminating newline because the format string is "%c" and doesn't include \n. To have it newline terminated, just add $'\n' to the list of chars to print:
$ printf "%c" {a..z} $'\n'
$'\n' is the idiomatic bash way to represent a newline character. printf then just prints the chars a to z, followed by the newline character.
Another way to add a trailing newline character is to echo the output of printf:
$ echo $(printf "%c" {a..z})
This one-liner uses command substitution, which runs printf "%c" {a..z} and replaces the command with its output. Then echo prints this output and adds a newline itself.
Want to output all letters in a column instead? Add a newline after each character!
$ printf "%c\n" {a..z}
Output:
a
b
...
z
Want to put the output from printf in a variable quickly? Use the -v argument:
$ printf -v alphabet "%c" {a..z}
This puts abcdefghijklmnopqrstuvwxyz in the $alphabet variable.
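As a small sketch of the same format-reuse behavior: a format with two conversions consumes the arguments two at a time. The variable names pairs and alphabet are only illustrative.

```shell
# printf reuses the format until all arguments are consumed,
# here two arguments per pass.
printf -v pairs '%s=%s;' a 1 b 2 c 3
echo "$pairs"          # a=1;b=2;c=3;

# -v stores the result in a variable instead of printing it.
printf -v alphabet '%c' {a..z}
echo "${#alphabet}"    # 26
```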
Similarly you can generate a list of numbers. Let's say from 1 to 100:
$ echo {1..100}
Output:
1 2 3 ... 100
Alternatively, if you forget this method, you can use the external seq utility to generate a sequence of numbers:
$ seq 1 100
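One thing to watch for: brace expansion is performed before variable expansion, so a sequence expression cannot take its bounds from a variable. The sketch below shows this, and uses a C-style for loop as one possible way around it. The names n, literal and nums are made up for the example.

```shell
# Brace expansion happens before variable expansion, so {1..$n}
# is not expanded into a sequence.
n=5
literal=$(echo {1..$n})
echo "$literal"        # {1..5}

# A C-style for loop handles a variable upper bound.
nums=()
for ((i = 1; i <= n; i++)); do
    nums+=("$i")
done
echo "${nums[*]}"      # 1 2 3 4 5
```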
3. Pad numbers 0 to 9 with a leading zero
$ printf "%02d " {0..9}
Here we use the looping abilities of printf again. This time the format is "%02d ", which means "zero pad the integer up to two positions", and the items to loop through are the numbers 0-9, generated by the brace expansion (as explained in the previous one-liner).
Output:
00 01 02 03 04 05 06 07 08 09
If you use bash 4, you can do the same with the new feature of brace expansion:
$ echo {00..09}
Older bashes don't have this feature.
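Zero padding is handy when building names that should sort correctly. Here is a minimal sketch that pads to three digits with printf -v. The file-NNN.txt naming scheme and the names array are assumptions made up for the illustration.

```shell
names=()
for i in 1 7 42; do
    # %03d pads the integer with zeros up to three positions.
    printf -v name 'file-%03d.txt' "$i"
    names+=("$name")
done
printf '%s\n' "${names[@]}"
# file-001.txt
# file-007.txt
# file-042.txt
```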
4. Produce 30 English words
$ echo {w,t,}h{e{n{,ce{,forth}},re{,in,fore,with{,al}}},ither,at}
This is an abuse of brace expansion. Just look at what this produces:
when whence whenceforth where wherein wherefore wherewith wherewithal whither what then thence thenceforth there therein therefore therewith therewithal thither that hen hence henceforth here herein herefore herewith herewithal hither hat
Crazy awesome!
Here is how it works - you can produce all combinations of words/symbols with brace expansion. For example, if you do this,
$ echo {a,b,c}{1,2,3}
it will produce the result a1 a2 a3 b1 b2 b3 c1 c2 c3. It takes the first item, a, and combines it with each item of {1,2,3}, producing a1 a2 a3. Then it takes b and combines it with {1,2,3}, and then it does the same for c.
So this one-liner is just a smart combination of braces that when expanded produce all these English words!
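To check the count, one option is to expand the braces into an array and look at its length. This is only a sketch; the array names words and pairs are illustrative.

```shell
# Each expanded word becomes one array element.
words=( {w,t,}h{e{n{,ce{,forth}},re{,in,fore,with{,al}}},ither,at} )
echo "${#words[@]}"    # 30
echo "${words[0]}"     # when
echo "${words[29]}"    # hat

# 3 items combined with 3 items gives 9 results.
pairs=( {a,b,c}{1,2,3} )
echo "${#pairs[@]}"    # 9
```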
5. Produce 10 copies of the same string
$ echo foo{,,,,,,,,,}
This one-liner uses the brace expansion again. What happens here is foo gets combined with 10 empty strings (9 commas separate 10 empty items), so the output is 10 copies of foo:
foo foo foo foo foo foo foo foo foo foo
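If counting commas feels error-prone, a different approach (not the one used above) is to lean on printf's format reuse. A minimal sketch, with repeated as an illustrative variable name:

```shell
# %.0s prints zero characters of its argument, so each of the 10
# arguments triggers one more pass over the format and one more "foo".
printf -v repeated 'foo%.0s' {1..10}
echo "$repeated"       # foofoofoofoofoofoofoofoofoofoo
echo "${#repeated}"    # 30
```

Unlike the echo version, this produces the copies with no spaces between them.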
6. Join two strings
$ echo "$x$y"
This one-liner simply concatenates two variables together. If the variable x contains foo and y contains bar then the result is foobar.
Notice that "$x$y" is quoted. If we didn't quote it, the expanded value would be split into words, and echo would try to parse them to see if they contain command line switches. So if $x contains something beginning with -, it could be taken as a command line switch rather than as text for echo to print:
$ x=-n
$ y=" foo"
$ echo $x$y
Output:
foo
Versus the correct way:
$ x=-n
$ y=" foo"
$ echo "$x$y"
Output:
-n foo
If you need to put the two joined strings in a variable, you can omit the quotes:
var=$x$y
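A few related joining idioms, shown as a sketch. The variable names joined and flag are illustrative only.

```shell
x=foo
y=bar

joined=$x$y            # no word splitting happens in an assignment
joined+=baz            # += appends to the existing value
echo "$joined"         # foobarbaz

# Braces mark where the variable name ends when text follows it.
echo "${x}_backup"     # foo_backup

# printf takes its data after the format, so a value that looks
# like an echo switch is printed as plain text.
flag=-n
printf '%s\n' "$flag"  # -n
```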
7. Split a string on a given character
Let's say you have a string foo-bar-baz in the variable $str and you wish to split it on the dash and iterate over it. You can simply combine IFS with read to do it:
$ IFS=- read -r x y z <<< "$str"
Here we use the read command, which reads a line from stdin and puts the data in the x, y, and z variables. We set IFS to - as this variable is used for field splitting. If multiple variable names are specified to read, IFS is used to split the line of input so that each variable gets a single field of the input.
In this one-liner $x gets foo, $y gets bar, and $z gets baz.
Also notice the use of the <<< operator. This is the here-string operator that allows strings to be passed to stdin of commands easily. In this case the string $str is passed as stdin to read.
You can also split the fields and put them in an array:
$ IFS=- read -ra parts <<< "foo-bar-baz"
The -a argument to read makes it put the split words in the given array. In this case the array is parts. You can access array elements through ${parts[0]}, ${parts[1]}, and ${parts[2]}. Or just access all of them through ${parts[@]}.
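Putting the array form to work, here is a minimal sketch that splits the string and then iterates over the fields. The loop variable part is an illustrative name.

```shell
str="foo-bar-baz"

# The IFS assignment applies only to this one read command.
IFS=- read -ra parts <<< "$str"

echo "${#parts[@]}"    # 3
for part in "${parts[@]}"; do
    echo "part: $part"
done
# part: foo
# part: bar
# part: baz
```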
8. Process a string character by character
$ while IFS= read -rn1 c; do
    # do something with $c
  done <<< "$str"
Here we use the -n1 argument to the read command to make it read the input one character at a time. Similarly we can use -n2 to read two chars at a time, etc.
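Another way to walk a string, sketched here as an alternative, is to index it with substring expansion in a C-style loop. Note that a here-string appends a newline to its input, which this form never sees. The names reversed and c are illustrative.

```shell
str="abc"
reversed=""
for ((i = 0; i < ${#str}; i++)); do
    c=${str:i:1}           # one character at offset i
    reversed=$c$reversed   # prepend it to build the reversed string
done
echo "$reversed"           # cba
```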
9. Replace "foo" with "bar" in a string
$ echo ${str/foo/bar}
This one-liner uses parameter expansion of the form ${var/find/replace}. It finds the first occurrence of find in var and replaces it with replace. Really simple!
To replace all occurrences of "foo" with "bar", use the ${var//find/replace} form:
$ echo ${str//foo/bar}
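The same expansion has a few more variants worth knowing. A short sketch comparing them on one sample string (the value of str here is made up for the example):

```shell
str="foo-foo-foo"

echo "${str/foo/bar}"    # bar-foo-foo  (first match only)
echo "${str//foo/bar}"   # bar-bar-bar  (every match)
echo "${str/#foo/bar}"   # bar-foo-foo  (match must be at the start)
echo "${str/%foo/bar}"   # foo-foo-bar  (match must be at the end)
echo "${str//foo}"       # --           (no replacement deletes the matches)
```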
10. Check if a string matches a pattern
$ if [[ $file = *.zip ]]; then
    # do something
  fi
Here the one-liner does something if $file matches *.zip. This is simple glob pattern matching, and you can use the symbols * ? [...] to do the matching. The * matches any string, ? matches a single char, and [...] matches any character in ... or a character class.
Here is another example that matches if answer starts with Y or y:
$ if [[ $answer = [Yy]* ]]; then
    # do something
  fi
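When there are several patterns to test, a case statement uses the same glob syntax and may read better than a chain of if tests. This is a sketch; the function name classify and its messages are hypothetical.

```shell
# classify is a made-up helper that reports which pattern its argument matches.
classify() {
    case $1 in
        *.zip)          echo "zip archive" ;;
        *.tar.gz|*.tgz) echo "gzipped tarball" ;;
        [Yy]*)          echo "yes-like answer" ;;
        *)              echo "unknown" ;;
    esac
}

classify backup.zip    # zip archive
classify data.tgz      # gzipped tarball
classify Yes           # yes-like answer
classify notes.txt     # unknown
```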
11. Check if a string matches a regular expression
$ if [[ $str =~ [0-9]+\.[0-9]+ ]]; then
    # do something
  fi
This one-liner tests if the string $str contains a match for the regex [0-9]+\.[0-9]+, which means a number followed by a dot followed by a number. The format for regular expressions is described in man 3 regex.
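After a successful =~ match, bash fills the BASH_REMATCH array with the whole match and any parenthesized groups. A minimal sketch, where the sample string and the names re, major and minor are assumptions for illustration:

```shell
str="version 3.14 released"
re='([0-9]+)\.([0-9]+)'    # keeping the regex in a variable avoids quoting surprises

if [[ $str =~ $re ]]; then
    whole=${BASH_REMATCH[0]}   # the entire matched text
    major=${BASH_REMATCH[1]}   # first parenthesized group
    minor=${BASH_REMATCH[2]}   # second parenthesized group
    echo "$whole"              # 3.14
    echo "$major"              # 3
    echo "$minor"              # 14
fi
```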
12. Find the length of the string
$ echo ${#str}
Here we use parameter expansion ${#str} which returns the length of the string in variable str. Really simple.
13. Extract a substring from a string
$ str="hello world"
$ echo ${str:6}
This one-liner extracts world from hello world. It uses the substring expansion. In general substring expansion looks like ${var:offset:length}, and it extracts length characters from var starting at index offset (counting from 0). In our one-liner we omit the length, which makes it extract all characters starting at offset 6.
Here is another example:
$ echo ${str:7:2}
Output:
or
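A few more substring forms, sketched on the same string. The negative offset counts from the end of the string, and the offset can be an arithmetic expression.

```shell
str="hello world"

echo "${#str}"            # 11
echo "${str:0:5}"         # hello
echo "${str: -5}"         # world  (the space keeps it from being read as ${str:-5})
echo "${str:${#str}-3}"   # rld    (offset computed from the length)
```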
14. Uppercase a string
$ declare -u var
$ var="foo bar"
The declare command in bash declares variables and/or gives them attributes. In this case we give the variable var the -u attribute, which upper-cases its content whenever it gets assigned something. Now if you echo it, the contents will be upper-cased:
$ echo $var
FOO BAR
Note that the -u argument was introduced in bash 4. Similarly you can use another feature of bash 4, which is the ${var^^} parameter expansion that upper-cases a string in var:
$ str="zoo raw"
$ echo ${str^^}
Output:
ZOO RAW
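Two related forms, shown as a sketch: a single ^ upper-cases only the first character, and for shells older than bash 4 the external tr utility is one possible fallback (it is not a bash built-in). The names name and upper are illustrative.

```shell
name="foo bar"
echo "${name^}"     # Foo bar  (first character only)
echo "${name^^}"    # FOO BAR  (every character)

# Fallback using the external tr utility, for shells without ${var^^}.
upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
echo "$upper"       # FOO BAR
```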
15. Lowercase a string
$ declare -l var
$ var="FOO BAR"
Similar to the previous one-liner, the -l argument to declare sets the lower-case attribute on var, which makes it always be lower-case:
$ echo $var
foo bar
The -l argument is also available only in bash 4 and later.
Another way to lowercase a string is to use the ${var,,} parameter expansion:
$ str="ZOO RAW"
$ echo ${str,,}
Output:
zoo raw
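One practical use for lower-casing is a case-insensitive comparison. A minimal sketch, assuming bash 4 or later; the names answer, result and title are made up for the example.

```shell
answer="YES"
# Lower-case the value before comparing, so YES, Yes and yes all match.
if [[ ${answer,,} == yes ]]; then
    result="confirmed"
else
    result="declined"
fi
echo "$result"      # confirmed

# A single comma lower-cases only the first character.
title="ZOO RAW"
echo "${title,}"    # zOO RAW
```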
Enjoy!
Enjoy the article and let me know in the comments what you think about it! If you think that I forgot some interesting bash one-liners related to string operations, let me know in the comments also!