Cheat Sheet

[Best Practices for Writing Regular Expressions]
1. Knowing what it is you want to match and how it might appear in the text
2. Writing a pattern to describe what you want to match
3. Testing the pattern to see what it matches
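
The three steps above can be sketched in a few lines of shell. This is only an illustration: the sample text and the variable names (sample, pattern, matches) are made up for the example.

```shell
# Step 1: hypothetical sample text standing in for the real input.
sample='order 123 shipped
no digits here
order 77 pending'

# Step 2: a pattern describing what we want to match (a run of digits).
pattern='[0-9][0-9]*'

# Step 3: print only the lines the pattern matches, to see what it catches.
matches=$(printf '%s\n' "$sample" | sed -n "/$pattern/p")
printf '%s\n' "$matches"
```

If a line you expected is missing from the output (or an unexpected one shows up), adjust the pattern and run the test again before using it on the real file.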

Note: When using the commands below, you may need to change double quotes to single quotes if you are working on Linux.
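
The reason for the quoting note: inside double quotes a Unix shell expands things like $2 before awk or sed ever sees them. A minimal sketch, assuming a POSIX shell (the sample line and variable names are made up):

```shell
line='alpha beta gamma'

# Single quotes: the shell passes $2 to awk untouched.
single=$(printf '%s\n' "$line" | awk '{print $2}')

# Double quotes also work if the dollar sign is escaped for the shell.
escaped=$(printf '%s\n' "$line" | awk "{print \$2}")

printf '%s\n' "$single" "$escaped"
```

Both forms print the second field; an unescaped $2 inside double quotes would be replaced by the shell's own second positional parameter instead.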

Helpful sed awk and VIM links:
Best of VIM Tips
Bruce Barnett

–grabs value in parentheses and prints out 2nd field
sed -n -e "s/(.*)/\0/p" test.txt | awk "{print $2}" | sed -e "s/[^0-9]//g"

–convert everything in file to lowercase (\L is a GNU sed extension)
sed -e "s/\(.*\)/\L\1/" input.txt > output.txt

–convert everything in file to lowercase using awk
awk "{print tolower($0)}" input.txt > output.txt

sed (option) (scriptfile) (inputfile) > (outputfile)

option -e editing instruction follows
option -f filename of script follows
option -n suppress automatic output of input lines
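
A small sketch of the three sed options above, assuming a POSIX shell with mktemp available. The input text and the temporary script file are made up for the example.

```shell
input='one
two
three'

# -n suppresses the automatic output; the p command prints only matching lines.
only_t=$(printf '%s\n' "$input" | sed -n -e '/^t/p')

# Several -e instructions are applied in order to each line.
chained=$(printf '%s\n' "$input" | sed -e 's/one/1/' -e 's/two/2/')

# -f reads the editing instructions from a script file.
script=$(mktemp)
printf '%s\n' 's/three/3/' > "$script"
from_file=$(printf '%s\n' "$input" | sed -f "$script")
rm -f "$script"

printf '%s\n' "$only_t" "$chained" "$from_file"
```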

useful vim commands – may need modification for sed

vim ex command — remove first two characters from every line
:%normal 2x

vim ex command — remove brackets from select stmt – can be used w/ select stmt generated from sql server
:%s/[\[\]]//g

vim –search only for specific pattern as a whole word \< -beginning of word \> -end of word
\<i\>

vim ex command – put any string of characters separated by more than 1 whitespace into tab delimited format
:%s/[ \t]*// | %s/\t/ /g | %s/ \+ /\t/g | %s/[ \t]*$//

vim ex command – takes out spaces after text on a line (starts eliminating spaces once it encounters more than a single space)
–this can be done in Notepad as well by searching for 2 spaces and replacing w/ nothing
:%s/ \{2,}//g

vim ex command — takes out tab characters
–this can be done in Notepad as well by searching for tab and replacing w/ nothing
:%s/\t//g

vim ex command — puts list into single quotes w/comma you can use in SQL stmt like this: 'some text',
–this cannot be done in Notepad!!
–if you could delete from end of line to beginning of 1st word you would do this w/o having to remove white spaces
:%s/.*/'\0',/g

vim ex command — this trims whitespace at end of sometext
–http://www.oualline.com/vim-cook.html#trim
:%s/[ \t]*$//

vim ex command — trims leading spaces
:%s/[ \t]*//

vim ex command — this command does it all!! Good to format list to use in SQL WHERE
–trim beg- –trim end– –put into quotes,
:%s/[ \t]*// | %s/[ \t]*$// | %s/.*/'\0',/g
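
The same trim-and-quote formatting can be approximated with sed outside of vim. This is a sketch, not a drop-in replacement: it uses [[:space:]] rather than [ \t] so it does not depend on how a given sed treats \t, and the sample list is made up.

```shell
# Trim leading whitespace, trim trailing whitespace, then wrap each line as 'text',
quoted=$(printf '  apple  \n\tbanana\ncherry   \n' |
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e "s/.*/'&',/")
printf '%s\n' "$quoted"
```

As with the vim version, the last line still ends in a comma that has to be removed before pasting the list into an IN (...) clause.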

–use this for mass INSERT STMTS
–each line should contain the VALUES clause
–this stmt will format the INSERT stmt for each VALUES clause; just replace the INSERT verbiage below with the respective INSERT required
:%s/.*/INSERT dbo.yourTable (col1, col2, col3)\rVALUES \0\r/g

–sed append contents of one file into another (the >> redirect does the appending)
sed -n -e "p" test.txt >> test2.txt

–view non-printable characters in a file
sed -n 'l' myfile.txt

–remove non-printable chars (what counts as printable depends on the locale)
sed -e 's/[^[:print:]]//g' myfile.txt

vim ex command — delete every other line [googlefind]
:%norm jdd

–remove lines having specified pattern
sed -e "/|MNE|/d" test.txt > test_out.txt

vim ex command – remove everything after 1st (space-delimited) word
:%s/ .*//

vim ex command — replaces comma (,) with carriage return
:%s/,/\r/g

vim ex command — replaces new line with comma (,)
:%s/\n/,/g

vim ex command — put comma before each line
:%s!^!,!

vim ex command — put comma after each line
:%s!$!,!

vim ex command — remove 1st 3 characters (letters) of every line
:%s/^[a-zA-Z]\{3\}//g

vim ex command — remove characters up to period
:%s/.*\.//

vim ex command — removes all characters except numbers
:%s/[^0-9]//g

vim ex command – removes all characters up to first space. First character of each line is uppercase letter
:%s/^[A-Z][^ ]* //

vim ex command — sort (eliminate dups / i.e. sort unique)
:sort u

vim ex command — vim single space file
:%s/^\n//g

vim ex command – use to extract tables from a stored procedure
Note: :g and :v treat everything after a | as part of their own command, so run each step on its own line
1. delete lines not having dbo.
2. be sure each table is on its own line
3. for each line remove all characters up to the period
4. delete any line not starting with a letter

:v/dbo\./d
:%s/dbo/\rdbo/g
:%s/.*\.//
:v/^[A-Za-z]/d
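
The steps above can also be approximated on the command line. Note this is a different technique from the vim version: it pulls out each dbo.name match directly with grep -o (available in GNU and BSD grep, but not required by POSIX) instead of deleting and splitting lines. The sample procedure text is made up, and the identifier pattern is an assumption about what table names look like.

```shell
# Hypothetical stored-procedure text.
proc='SELECT o.id FROM dbo.Orders o JOIN dbo.Customers c ON c.id = o.cid
-- no table reference on this line
UPDATE dbo.Orders SET total = 0'

# Extract every dbo.<name>, strip everything up to the period, sort unique.
tables=$(printf '%s\n' "$proc" |
  grep -o 'dbo\.[A-Za-z_][A-Za-z0-9_]*' |
  sed -e 's/.*\.//' |
  sort -u)
printf '%s\n' "$tables"
```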

vim ex command — delete all characters up to period (.)
:%s/.*\.//

vim ex command — delete all lines containing pattern
:g/sometext/d

vim ex command — delete all lines not having pattern
:v/pattern/d
or
:g!/pattern/d

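–sed equivalent: delete all lines not having pattern (use single quotes in bash, where ! inside double quotes can trigger history expansion)
sed -e '/sometext/!d' file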
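
A quick way to convince yourself how the delete commands behave, using made-up sample data. The sketch also shows that /pattern/!d gives the same result as -n with /pattern/p.

```shell
data='keep sometext here
drop this
sometext again'

# Delete lines NOT containing the pattern (keep only matching lines).
kept=$(printf '%s\n' "$data" | sed -e '/sometext/!d')

# Same result: suppress automatic output and print only matching lines.
same=$(printf '%s\n' "$data" | sed -n -e '/sometext/p')

# Delete lines containing the pattern.
dropped=$(printf '%s\n' "$data" | sed -e '/sometext/d')

printf '%s\n' "$kept" "$dropped"
```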

vim ex command — strip HTML tags
:%s/<[^>][^>]*>//g

–strip HTML tags (alternate version)
sed -e "s/<[^>]*>//g"

–put each column of tab delimited row into quotes
:%s/\([^ \t]\+\)/'\1'/g

–here's an awk that does same thing
awk '{for (i=1; i<=NF; i++) { $i="\""$i"\"" }; print}' oldfile.txt > newfile.txt
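
One thing to watch with the awk version above: it splits on any whitespace and joins the output with single spaces, so tabs are not preserved. If the file must stay tab delimited, a variant along these lines may help (a sketch with made-up input; setting FS and OFS to a tab is the only change):

```shell
# Split and rejoin on tabs so fields containing spaces stay intact.
quoted_cols=$(printf 'a\tb c\td\n' |
  awk 'BEGIN { FS = OFS = "\t" }
       { for (i = 1; i <= NF; i++) $i = "\"" $i "\""; print }')
printf '%s\n' "$quoted_cols"
```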
————————————————————————————

Toggle case "HellO" to "hELLo" with g~ then a movement. (i.e. "l" for 1 letter or "w" for word etc)
Uppercase "HellO" to "HELLO" with gU then a movement.
Lowercase "HellO" to "hello" with gu then a movement.

–strips out everything on a line except numbers
sed -e "s/[^0-9]//g"

awk (option) (scriptfile) (inputfile(s))

–count number of lines
awk 'END{print NR}' YourFile.DAT

–command to take top 100 lines from large (or any) file
gawk 'NR < 101' YourFile.DAT > /home/user/YourFile_sample100.dat

–command to take bottom lines from large (or any) file using > operator (here, every line after line 1495716)
gawk 'NR > 1495716' YourFile >YourFile_sample_Last100.dat

–basic takes 1st column and 3rd column and prints them on a single line
awk "{ print $1 \" \" $3 }" YourFile.txt

–lines up nice in columns
awk "{printf(\"%-40s%-4s\n\", $1, $3)}" YourFile.txt

–works best for putting into DML
awk "{ print $1 \" (\" $3 \")\" }" YourFile.txt

–puts in format string (000001) FILLER; –requires 1st col = col name and 2nd col = length
awk "{ print \"string (\"$2 \") \" $1 \";\" }" YourFile.txt >YourFile_out.txt

–print the last field of each line
awk '{ print $NF }'

–print specific lines of interest
awk "NR>=90401&&NR<=90403" YourFile.dat > output.txt
awk "NR<=5" YourFile.dat > output.txt
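
A small sketch of the NR line-selection idiom on made-up input. The last variant is an assumption about what you might want on very large files: it exits as soon as the wanted lines have been printed instead of reading to the end.

```shell
lines=$(printf '%s\n' l1 l2 l3 l4 l5)

# Lines 2 through 3 only.
middle=$(printf '%s\n' "$lines" | awk 'NR>=2 && NR<=3')

# First two lines.
top=$(printf '%s\n' "$lines" | awk 'NR<=2')

# Same first two lines, but stop reading once line 3 is reached.
top_fast=$(printf '%s\n' "$lines" | awk 'NR>2 { exit } 1')

printf '%s\n' "$middle" "$top_fast"
```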

–awk replace column value in file 2 with column value in file 1
–FNR: The ordinal number of the current record in the current file.
–NR: The ordinal number of the current record from the start of input.
awk "FNR==NR{a[NR]=$4;next}{$2=a[FNR]}1" test1.txt test2.txt > output.txt
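
To see the FNR==NR idiom at work, here is a sketch with two tiny made-up files. While awk reads the first file FNR equals NR, so column 4 is saved per line; for the second file the condition is false and column 2 is replaced with the saved value from the same line number.

```shell
dir=$(mktemp -d)
printf 'a b c NEW1\nd e f NEW2\n' > "$dir/test1.txt"
printf 'x OLD1 y\nz OLD2 w\n' > "$dir/test2.txt"

# Pass 1 (test1.txt): remember column 4. Pass 2 (test2.txt): overwrite column 2.
merged=$(awk 'FNR==NR { a[NR] = $4; next } { $2 = a[FNR] } 1' \
  "$dir/test1.txt" "$dir/test2.txt")
rm -rf "$dir"

printf '%s\n' "$merged"
```

Assigning to $2 makes awk rebuild the line with single spaces between fields, so the original spacing of test2.txt is not kept.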

–find common fields in file1 & file2 and put them in output file
–in commented stmt (for debug / testing) need to set "i" = to the number of lines in file1 / the uncommented stmt should work w/o need to manually set number of records
–awk "FNR==NR{a[NR]=$1;next}{for (i=1; i<=2; ++i){if (tolower(a[i])==tolower($1)) print $1;}}" test1.txt test2.txt > output.txt
awk "BEGIN { X=0 }FNR==NR{a[NR]=$1;X=X+1;next}{for (i=1; i<=X; ++i){if (tolower(a[i])==tolower($1)) print $1;}}" test1.txt test2.txt > output.txt
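
A possible alternative to the loop above is to key an array on the lowercased field and test membership with "in". This is a sketch, not the same program: it prints each matching line of file2 once, whereas the loop prints once per matching line in file1. File contents and the array name "seen" are made up.

```shell
dir=$(mktemp -d)
printf 'Apple\nbanana\nCherry\n' > "$dir/test1.txt"
printf 'APPLE 1\ndate 2\ncherry 3\n' > "$dir/test2.txt"

# Pass 1: record the lowercased first field of every line in file1.
# Pass 2: print the first field of file2 lines whose lowercased value was seen.
common=$(awk 'FNR==NR { seen[tolower($1)] = 1; next }
              (tolower($1) in seen) { print $1 }' \
  "$dir/test1.txt" "$dir/test2.txt")
rm -rf "$dir"

printf '%s\n' "$common"
```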

option -f filename of script follows
option -F change field separator
option -v var=value follows
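
A short sketch of the three awk options above, assuming a POSIX shell with mktemp available. The input, the variable name "label" and the temporary script file are made up for the example.

```shell
# -F changes the field separator; -v sets an awk variable before the program runs.
fields=$(printf 'root:x:0\ndaemon:x:1\n' |
  awk -F: -v label=user '{ print label "=" $1 }')

# -f reads the awk program from a script file.
script=$(mktemp)
printf '%s\n' '{ print NF }' > "$script"
counts=$(printf 'a b c\nd e\n' | awk -f "$script")
rm -f "$script"

printf '%s\n' "$fields" "$counts"
```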
