*BSD News Article 31136


Path: sserve!newshost.anu.edu.au!harbinger.cc.monash.edu.au!bunyip.cc.uq.oz.au!munnari.oz.au!constellation!convex!convex!news.duke.edu!MathWorks.Com!europa.eng.gtefsd.com!library.ucla.edu!ihnp4.ucsd.edu!qualcomm.com!qualcomm.com!not-for-mail
From: bri@qualcomm.com (Brian Ellis)
Newsgroups: comp.os.386bsd.apps
Subject: Re: Is there a split program that can split by bytecount?
Date: 31 May 1994 12:55:11 -0600
Organization: QUALCOMM, Incorporated; San Diego, CA, USA
Lines: 159
Message-ID: <2sg16f$s3h@qualcomm.com>
References: <klee.770406936@imagen>
NNTP-Posting-Host: redcloud.qualcomm.com
Keywords: split

In article <klee.770406936@imagen>, Kanghoon Lee <klee@imagen.com> wrote:

>Is there a split program that can split a file by bytecount, rather than line
>numbers?  If so, could somebody tell me where I can find one?

I have just the program for you. It's a little sparse on comments, but I
think you'll be able to figure it out.

-brian

/*
 * split a file into pieces. This is similar to the UNIX "split", but it
 * treats the source file as a binary file and splits on the given byte
 * boundaries, rather than splitting on newlines.
 *
 * usage: bsplit blocksize [file]
 *
 * if "file" is provided, you will end up with a series of new files
 * called file.000, file.001, file.002 and so on, created in the current
 * directory. the original file is not modified.
 *
 * if "file" is omitted, bsplit reads from stdin and creates a series
 * of new files called stdin.000, stdin.001, stdin.002 and so on.
 *
 * Brian Ellis 2/9/94
 */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>	/* read(), write(), close() */

#define TRUE (-1)
#define FALSE 0

int main(int argc, char **argv)
{
	int blocksize, bytesread, byteswritten, remaining;
	int fd, fd2;
	char *basename;
	char *srcfilename;
	char destfilename[128];
	int segment;
	char *buf;
	int eof;

	/*
	 * check usage
	 */
	if ((argc != 2) && (argc != 3)) {
		fprintf(stderr, "usage: %s blocksize [file]\n", argv[0]);
		exit(1);
	}

	/*
	 * read arguments
	 */
	blocksize = atoi(argv[1]);
	if (blocksize <= 0) {
		fprintf(stderr, "invalid blocksize\n");
		exit(1);
	}
	if (argc == 3) {
		srcfilename = argv[2];
		/* strrchr() is the standard C equivalent of BSD rindex() */
		basename = strrchr(srcfilename, '/');
		if (basename == NULL)
			basename = srcfilename;
		else
			basename++;
		fd = open(srcfilename, O_RDONLY, 0);
		if (fd < 0) {
			perror(srcfilename);
			exit(1);
		}
	} else {
		srcfilename = "stdin";
		basename = srcfilename;
		fd = 0; /* stdin */
	}

	/*
	 * allocate buffer
	 */
	buf = (char *)malloc(blocksize);
	if (buf == 0) {
		fprintf(stderr, "out of memory\n");
		exit(1);
	}

	/*
	 * main loop
	 */
	segment = 0;
	eof = FALSE;
	while (!eof) {

		/*
		 * read block in. in order to accommodate reads on pipes,
		 * which may return fewer bytes than requested, we have to
		 * have a loop rather than a single read.
		 */
		remaining = blocksize;
		while (remaining > 0) {
			bytesread =
			    read(fd, buf + (blocksize - remaining), remaining);
			if (bytesread < 0) {
				perror(srcfilename);
				exit(1);
			}
			if (bytesread == 0) {
				eof = TRUE;
				break;
			}
			remaining -= bytesread;
		}
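		bytesread = blocksize - remaining;

		/*
		 * nothing left to write: stop here rather than create an
		 * empty trailing segment (the input was empty or an exact
		 * multiple of blocksize)
		 */
		if (bytesread == 0)
			break;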

		/*
		 * open segment file
		 */
		/* bounded write so a long basename cannot overflow the buffer */
		snprintf(destfilename, sizeof(destfilename), "%s.%03d",
		    basename, segment);
		fd2 = open(destfilename, O_RDWR | O_CREAT | O_TRUNC, 0644);
		if (fd2 < 0) {
			perror(destfilename);
			exit(1);
		}
		++segment;

		/*
		 * write block back out
		 */
		byteswritten = write(fd2, buf, bytesread);
		if (byteswritten < 0) {
			perror(destfilename);
			exit(1);
		}
		if (byteswritten < bytesread) {
			fprintf(stderr, "short write to: %s\n", destfilename);
			exit(1);
		}

		/*
		 * close segment file
		 */
		close(fd2);

	}

	/*
	 * clean up
	 */
	close(fd);
	free(buf);
	exit(0);
}
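
The inner read loop is the part that makes this work on pipes, so here is
that idea pulled out as a standalone helper. This is only a sketch and not
part of bsplit itself: the names read_full() and read_full_selftest() are
made up for illustration, and the EINTR retry is an extra assumption that
the program above does not make.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Read up to count bytes from fd, retrying after short reads.
 * Returns the number of bytes stored in buf (less than count only
 * at end of input), or -1 on error.
 */
static ssize_t read_full(int fd, char *buf, size_t count)
{
	size_t done = 0;

	while (done < count) {
		ssize_t n = read(fd, buf + done, count - done);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0)
			break;		/* end of input */
		done += (size_t)n;
	}
	return (ssize_t)done;
}

/*
 * Hypothetical self-check: put 7 bytes into a pipe with two writes,
 * then pull them back out in 5-byte blocks.
 * Returns 0 on success, -1 if the pipe could not be set up.
 */
static int read_full_selftest(void)
{
	int p[2];
	char buf[16];

	if (pipe(p) < 0)
		return -1;
	if (write(p[1], "abc", 3) != 3 || write(p[1], "defg", 4) != 4)
		return -1;
	close(p[1]);		/* reader sees end of input after 7 bytes */

	assert(read_full(p[0], buf, 5) == 5);	/* full block */
	assert(memcmp(buf, "abcde", 5) == 0);
	assert(read_full(p[0], buf, 5) == 2);	/* final partial block */
	assert(memcmp(buf, "fg", 2) == 0);
	assert(read_full(p[0], buf, 5) == 0);	/* nothing left */
	close(p[0]);
	return 0;
}
```

A return value of 0 from read_full() is the "no more data" case, which is
the point at which a splitter can stop without creating another segment
file. To try it, call read_full_selftest() from a main() of your own.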