Tuesday, June 17, 2014

csplit ... text file splitter

csplit command

Today I needed to split a text file into separate files based on text inside it. A quick Google search showed that a command-line program called csplit can do the trick. It is part of GNU coreutils, and a Windows build can be downloaded from the following link:

http://gnuwin32.sourceforge.net/packages/coreutils.htm

After installation, don't forget to add the installation directory to the PATH system variable so that csplit can be run without typing its full path each time.

To use the program, follow this syntax:

csplit [OPTION]... FILE PATTERN...
where OPTION can be:

-b, --suffix-format=FORMAT    use sprintf FORMAT instead of %02d.
-f, --prefix=PREFIX           use PREFIX instead of 'xx'.
-k, --keep-files              do not remove output files on errors.
-n, --digits=DIGITS           use specified number of digits instead of 2.
-s, --quiet, --silent         do not print counts of output file sizes.
-z, --elide-empty-files       remove empty output files.
    --help                    display a help message and exit.
    --version                 output version information and exit.
FILE is the file to be split, and each PATTERN can be one of:

INTEGER              copy up to but not including specified line number.
/REGEXP/[OFFSET]     copy up to but not including a matching line.
%REGEXP%[OFFSET]     skip to, but not including a matching line.
{INTEGER}            repeat the previous pattern specified number of times.
{*}                  repeat the previous pattern as many times as possible.
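
As a rough sketch of how a line-number pattern, a repeat count, and a few of the options combine, the commands below split a ten-line file every four lines. This assumes a Unix-style shell with GNU csplit and seq available; the file name sample.txt and the prefix chunk are made up for illustration.

```shell
# Work in a scratch directory so the output files do not clutter anything.
cd "$(mktemp -d)"

# Build a 10-line sample file containing the numbers 1 to 10 (hypothetical name).
seq 1 10 > sample.txt

# Split before line 4, then repeat the pattern once more ({1}), which splits before line 8.
# -s suppresses the size counts, -f sets the prefix, -n 3 asks for 3-digit suffixes.
csplit -s -f chunk -n 3 sample.txt 4 '{1}'

# chunk000 holds lines 1-3, chunk001 holds lines 4-7, chunk002 holds lines 8-10.
cat chunk001
```

The single quotes around {1} only protect the braces from the shell; in the Windows command prompt they should be left out.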
An example:
csplit -ks -f text FILE_TO_BE_SPLIT.txt /MATCHING_TEXT/ {*}

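In this example, csplit searches the text file named FILE_TO_BE_SPLIT.txt for lines matching the regular expression MATCHING_TEXT and starts a new output file each time a match occurs, so every piece after the first begins with a matching line. The pieces are named text00, text01, and so on (-f text); -k keeps the files already written if an error occurs, and -s suppresses the size counts.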
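
To make the behaviour concrete, here is a small worked run of the same kind of command, assuming a Unix-style shell with GNU csplit. The file name book.txt and the marker text CHAPTER are invented for this sketch and stand in for FILE_TO_BE_SPLIT.txt and MATCHING_TEXT.

```shell
# Work in a scratch directory so the output files do not clutter anything.
cd "$(mktemp -d)"

# A five-line sample file with two marker lines (hypothetical content).
printf '%s\n' 'intro' 'CHAPTER one' 'body 1' 'CHAPTER two' 'body 2' > book.txt

# Start a new output file at every line that matches CHAPTER.
csplit -ks -f text book.txt '/CHAPTER/' '{*}'

# text00 holds "intro" (everything before the first match),
# text01 holds "CHAPTER one" and "body 1",
# text02 holds "CHAPTER two" and "body 2".
cat text01
```

If the very first line of the input matches, the first output file comes out empty; adding -z removes such empty files.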
