Split and Join large files with Linux

Posted by: Mo Mughrabi

I've been neglecting this blog for some time, but the last post from Yousef gave me a nice boost to write something new. Today someone came up to me and asked how he could upload a 3GB file to his remote machine. Here I'll explain how you can split a large file into smaller parts of whatever size you want, and then how you can join them together again, all from the command line.

To split a file, let's assume you have a file called TV.Show.avi (300MB) and you want to split it into 50MB parts:

 
split -b 50m TV.Show.avi
 

By default, this command generates 6 files named in alphabetical sequence (xaa, xab, xac, and so on up to xaf). To join the files again, you can run:

 
cat xaa xab xac xad xae xaf > TV.Show.NewFILE.avi
 

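or, since the shell expands x* in alphabetical order (just make sure no other file in the directory starts with x):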

 
cat x* > TV.Show.NewFILE.avi
 
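If you want to convince yourself that the joined file matches the original, the whole round trip can be tried on a small throwaway file first. The following is only a sketch: the 300 KB random file stands in for the real video, and the TV.Show.avi.part. prefix is a name I made up so the parts are easier to recognise than xaa, xab and friends.

```shell
#!/bin/sh
# Round-trip sketch: split, join, then compare with the original.
# The file size and the ".part." prefix are assumptions for this demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 300 KB stand-in for the real file (300 blocks of 1024 bytes).
dd if=/dev/urandom of=TV.Show.avi bs=1024 count=300 2>/dev/null

# Split into 50 KB parts named TV.Show.avi.part.aa, TV.Show.avi.part.ab, ...
split -b 50k TV.Show.avi TV.Show.avi.part.

# Count the parts by letting the shell expand the glob.
set -- TV.Show.avi.part.*
parts=$#
echo "parts: $parts"

# Join the parts; the glob expands in alphabetical order.
cat TV.Show.avi.part.* > TV.Show.NewFILE.avi

# cmp -s is silent and only reports through its exit status.
if cmp -s TV.Show.avi TV.Show.NewFILE.avi; then
    result=identical
else
    result=different
fi
echo "result: $result"
```

Passing a prefix as the last argument to split also means the later glob cannot accidentally pick up unrelated files that happen to start with x.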

Searching for text in a collection of files

Posted by: yousef

I'm pretty sure there are tens of (possibly easier) ways to accomplish this, but I've always used the following snippet of bash code to search for text in files (the backslash at the end of the first line marks a line continuation; the two lines below are really one command, split only so they don't break the page layout at lower resolutions, so leave the backslash out if you type everything on a single line):

for i in *.txt; do echo "$i:"; cat -n "$i" | \
grep -i 'search_string'; done

First, "*.txt" is expanded to a list of all files ending in '.txt'. For example, "one.txt two.txt three.txt". Then, $i is set to one of the files (in the order they are listed) at each iteration of the for loop. The commands between the 'do' and 'done' are executed for each file. The name of the current file is printed. Then, the output of `cat -n $i`, which is the contents of file $i (with each line prefixed with the line number), is piped to `grep -i 'search_string'`. This searches for the given text in file $i (ignoring case).
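To see what the loop prints, here is a small sketch that runs it in a scratch directory. The two sample files and their contents are invented for the demo, and the variables are quoted so that file names containing spaces would also work.

```shell
#!/bin/sh
# Demo of the search loop on two made-up files in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# one.txt has no match; two.txt has a match on its second line.
printf 'nothing to see here\n' > one.txt
printf 'first line\nSearch_String appears here\n' > two.txt

# Same loop as above, with its output captured in a variable.
output=$(for i in *.txt; do echo "$i:"; cat -n "$i" | grep -i 'search_string'; done)
echo "$output"
```

The name of every file is printed, matching or not, so one.txt: shows up with nothing under it, while two.txt: is followed by the numbered matching line.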

Of course, you can change the above snippet to search for whatever you want, in whatever files you specify. For example, to search for the string 'while' in all C files (.c) in the current directory, change "*.txt" to "*.c" and 'search_string' to 'while'. Or if you want to search in both C files and header files (.h), then use "*.c *.h". You can even use the find utility, as shown here:

for i in `find /some/path -iname "*.html" -print`; \
do echo "$i:"; cat -n "$i" | grep -i 'search_string'; done

This is the same as the first command, except "*.txt" was replaced with `find ...`. The back-ticks (`) are substituted with the output of the command they enclose, in this case, the find command. The find command used here prints the paths of all HTML files in /some/path (those ending in .html, matched case-insensitively because of -iname), including those in sub-directories, sub-sub-directories, etc. Since /some/path is absolute, the printed paths are absolute too. These files are then searched for 'search_string'. One caveat: the output of find is split on whitespace, so this form breaks on paths that contain spaces.
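For directory trees where file names may contain spaces, one possible variation is to let find hand the files straight to grep. This is a sketch rather than a drop-in replacement: the directory layout below is made up, and unlike the loop it only lists files that actually contain a match. The -n option numbers the lines and -H prints the file name in front of each one.

```shell
#!/bin/sh
# Sketch: search HTML files under a directory, tolerating spaces in paths.
# The "site" tree and its files are invented for this demo.
workdir=$(mktemp -d)
mkdir -p "$workdir/site/sub dir"
printf 'hello\n<p>Search_String</p>\n' > "$workdir/site/sub dir/my page.html"
printf 'nothing relevant\n' > "$workdir/site/index.HTML"

# find passes the file names to grep as separate arguments ({} +),
# so no word splitting happens on the way.
output=$(find "$workdir/site" -iname '*.html' \
    -exec grep -inH 'search_string' {} +)
echo "$output"
```

Each output line has the form path:line-number:matching-line, which is handy if you want to feed the results to another tool.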

Feel free to ask questions or post what you personally use when searching for text in files.
