ʡڡ **:JZZjjzʢڢ *:JJJJJJJJJZjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjzzzzzzʣʣʣʣʣʣʣʣʣʣڣڣڣڣڣڣڣHCDB_File-0.860/G1RHCDB_File-0.86/ACKNOWLEDGE1_GD1The help of these people is gratefully acknowledged. AK Andreas Koenig BD Bert Driehuis CMC Chris Chalfant DB Dan Bernstein FvL Felix von Leitner FL Frederik Lindberg GT Gene Titus IP Ian Phillipps IW Ira Woodhead JB Jos Backus JH John Horne JPB Joao Bordalo MdlR Michael de la Rue MP Mark Powell NMS Nickolay Saukh RDM Raul Miller RDW Rich Williams SB Stephen Beckstrom-Sternberg Tim Goodwin 2001-05-25 G2RHCDB_File-0.86/CDB_File.pm2_G!%D2package CDB_File; use strict; use Carp; use vars qw($VERSION @ISA @EXPORT_OK); use DynaLoader (); use Exporter (); @ISA = qw(Exporter DynaLoader); @EXPORT_OK = qw(create); $VERSION = '0.86'; =head1 NAME CDB_File - Perl extension for access to cdb databases =head1 SYNOPSIS use CDB_File; $c = tie %h, 'CDB_File', 'file.cdb' or die "tie failed: $!\n"; $fh = $c->handle; sysseek $fh, $c->datapos, 0 or die ...; sysread $fh, $x, $c->datalen; undef $c; untie %h; $t = new CDB_File ('t.cdb', "t.$$") or die ...; $t->insert('key', 'value'); $t->finish; CDB_File::create %t, $file, "$file.$$"; or use CDB_File 'create'; create %t, $file, "$file.$$"; =head1 DESCRIPTION B is a module which provides a Perl interface to Dan Berstein's B package: cdb is a fast, reliable, lightweight package for creating and reading constant databases. =head2 Reading from a cdb After the C shown above, accesses to C<%h> will refer to the B file C, as described in L. Low level access to the database is provided by the three methods C, C, and C. To use them, you must remember the C object returned by the C call: C<$c> in the example above. The C and C methods return the file offset position and length respectively of the most recently visited key (for example, via C). Beware that if you create an extra reference to the C object (like C<$c> in the example above) you must destroy it (with C) before calling C on the hash. This ensures that the object's C method is called. Note that C will check this for you; see L for further details. =head2 Creating a cdb A B file is created in three steps. First call C, where C<$final> is the name of the database to be created, and C<$tmp> is the name of a temporary file which can be atomically renamed to C<$final>. Secondly, call the C method once for each (I, I) pair. Finally, call the C method to complete the creation and renaming of the B file. A simpler interface to B file creation is provided by C. This creates a B file named C<$final> containing the contents of C<%t>. As before, C<$tmp> must name a temporary file which can be atomically renamed to C<$final>. C may be imported. =head1 EXAMPLES These are all complete programs. 1. Convert a Berkeley DB (B-tree) database to B format. use CDB_File; use DB_File; tie %h, DB_File, $ARGV[0], O_RDONLY, undef, $DB_BTREE or die "$0: can't tie to $ARGV[0]: $!\n"; CDB_File::create %h, $ARGV[1], "$ARGV[1].$$" or die "$0: can't create cdb: $!\n"; 2. Convert a flat file to B format. In this example, the flat file consists of one key per line, separated by a colon from the value. Blank lines and lines beginning with B<#> are skipped. use CDB_File; $cdb = new CDB_File("data.cdb", "data.$$") or die "$0: new CDB_File failed: $!\n"; while (<>) { next if /^$/ or /^#/; chop; ($k, $v) = split /:/, $_, 2; if (defined $v) { $cdb->insert($k, $v); } else { warn "bogus line: $_\n"; } } $cdb->finish or die "$0: CDB_File finish failed: $!\n"; 3. Perl version of B. use CDB_File; tie %data, 'CDB_File', $ARGV[0] or die "$0: can't tie to $ARGV[0]: $!\n"; while (($k, $v) = each %data) { print '+', length $k, ',', length $v, ":$k->$v\n"; } print "\n"; 4. For really enormous data values, you can use C, C, and C, in combination with C and C, to avoid reading the values into memory. Here is the script F, which can extract uncompressed files and directories from a B file. use CDB_File; sub unnetstrings { my($netstrings) = @_; my @result; while ($netstrings =~ s/^([0-9]+)://) { push @result, substr($netstrings, 0, $1, ''); $netstrings =~ s/^,//; } return @result; } my $chunk = 8192; sub extract { my($file, $t, $b) = @_; my $head = $$b{"H$file"}; my ($code, $type) = $head =~ m/^([0-9]+)(.)/; if ($type eq "/") { mkdir $file, 0777; } elsif ($type eq "_") { my ($total, $now, $got, $x); open OUT, ">$file" or die "open for output: $!\n"; exists $$b{"D$code"} or die "corrupt bun file\n"; my $fh = $t->handle; sysseek $fh, $t->datapos, 0; $total = $t->datalen; while ($total) { $now = ($total > $chunk) ? $chunk : $total; $got = sysread $fh, $x, $now; if (not $got) { die "read error\n"; } $total -= $got; print OUT $x; } close OUT; } else { print STDERR "warning: skipping unknown file type\n"; } } die "usage\n" if @ARGV != 1; my (%b, $t); $t = tie %b, 'CDB_File', $ARGV[0] or die "tie: $!\n"; map { extract $_, $t, \%b } unnetstrings $b{""}; 5. Although a B file is constant, you can simulate updating it in Perl. This is an expensive operation, as you have to create a new database, and copy into it everything that's unchanged from the old database. (As compensation, the update does not affect database readers. The old database is available for them, till the moment the new one is Ced.) use CDB_File; $file = 'data.cdb'; $new = new CDB_File($file, "$file.$$") or die "$0: new CDB_File failed: $!\n"; # Add the new values; remember which keys we've seen. while (<>) { chop; ($k, $v) = split; $new->insert($k, $v); $seen{$k} = 1; } # Add any old values that haven't been replaced. tie %old, 'CDB_File', $file or die "$0: can't tie to $file: $!\n"; while (($k, $v) = each %old) { $new->insert($k, $v) unless $seen{$k}; } $new->finish or die "$0: CDB_File finish failed: $!\n"; =head1 REPEATED KEYS Most users can ignore this section. A B file can contain repeated keys. If the C method is called more than once with the same key during the creation of a B file, that key will be repeated. Here's an example. $cdb = new CDB_File ("$file.cdb", "$file.$$") or die ...; $cdb->insert('cat', 'gato'); $cdb->insert('cat', 'chat'); $cdb->finish; Normally, any attempt to access a key retrieves the first value stored under that key. This code snippet always prints B. $catref = tie %catalogue, CDB_File, "$file.cdb" or die ...; print "$catalogue{cat}"; However, all the usual ways of iterating over a hash---C, C, and C---do the Right Thing, even in the presence of repeated keys. This code snippet prints B. print join(' ', keys %catalogue, values %catalogue); And these two both print B, although the second is more efficient. foreach $key (keys %catalogue) { print "$key:$catalogue{$key} "; } while (($key, $val) = each %catalogue) { print "$key:$val "; } The C method retrieves all the values associated with a key. It returns a reference to an array containing all the values. This code prints B. print "@{$catref->multi_get('cat')}"; =head1 RETURN VALUES The routines C, C, and C return B if the attempted operation failed; C<$!> contains the reason for failure. =head1 DIAGNOSTICS The following fatal errors may occur. (See L if you want to trap them.) =over 4 =item Modification of a CDB_File attempted You attempted to modify a hash tied to a B. =item CDB database too large You attempted to create a B file larger than 4 gigabytes. =item [ Write to | Read of | Seek in ] CDB_File failed: If B is B, you tried to C to access something that isn't a B file. Otherwise a serious OS level problem occurred, for example, you have run out of disk space. =item Use CDB_File::FIRSTKEY before CDB_File::NEXTKEY If you are using the NEXTKEY method directly (I can't think of a reason why you'd want to do this), you need to call FIRSTKEY first. =back =head1 BUGS It ain't lightweight after you've plumbed Perl into it. The Perl interface to B imposes the restriction that data must fit into memory. =head1 SEE ALSO cdb(3). =head1 AUTHOR Tim Goodwin, . B began on 1997-01-08. =cut bootstrap CDB_File $VERSION; sub CLEAR { croak "Modification of a CDB_File attempted" } sub DELETE { &CLEAR } sub STORE { &CLEAR } # Must be preloaded for the prototype. sub create(\%$$) { my($RHdata, $fn, $fntemp) = @_; my $cdb = new CDB_File($fn, $fntemp) or return undef; my($k, $v); while (($k, $v) = each %$RHdata) { $cdb->insert($k, $v); } $cdb->finish; return 1; } 1; G3RHCDB_File-0.86/CDB_File.xs3_GJBD3/* Most of this is reasonably straightforward. The complications arise when we are "iterating" over the CDB file, that is to say, using `keys' or `values' or `each' to retrieve all the data in the file in order. This interface stores extra data to allow us to track iterations: end is a pointer to the end of data in the CDB file, and also a flag which indicates whether we are iterating or not (note that the end of data occurs at a position >= 2048); curkey is a copy of the current key; curpos is the file offset of curkey; and fetch_advance is 0 for FIRSTKEY, fetch, NEXTKEY, fetch, NEXTKEY, fetch, ... but 1 for FIRSTKEY, NEXTKEY, NEXTKEY, ..., fetch, fetch, fetch, ... Don't tell the OO Police, but there are actually two different objects called CDB_File. One is created by TIEHASH, and accessed by the usual tied hash methods (FETCH, FIRSTKEY, etc.). The other is created by new, and accessed by insert and finish. In both cases, the object is a blessed reference to a scalar. The scalar contains either a struct cdbobj or a struct cdbmakeobj. It gets a little messy in DESTROY: since this method will automatically be called for both sorts of object, it distinguishes them by their different sizes. */ #ifdef __cplusplus extern "C" { #endif #include "EXTERN.h" #include "perl.h" #include "XSUB.h" #include #include #include #include #include #include #ifdef HASMMAP #include #endif /* We need to whistle up an error number for a file that is not a CDB file. The BSDish EFTYPE probably gives the most useful error message; failing that we'll settle for the Single Unix Specification v2 EPROTO; and finally the rather inappropriate, but universally(?) implemented, EINVAL. */ #ifdef EFTYPE #else #ifdef EPROTO #define EFTYPE EPROTO #else #define EFTYPE EINVAL #endif #endif /* These two provide backwards compatibility with perl 5.005. */ #ifndef WARN_UNINITIALIZED #define ckWARN(x) dowarn #define report_uninit() warn(warn_uninit) #endif #ifdef __cplusplus } #endif struct cdb { GV *glob; /* */ #ifdef HASMMAP char *map; #endif U32 end; /* If non zero, the file offset of the first byte of hash tables. */ SV *curkey; /* While iterating: a copy of the current key; */ U32 curpos; /* the file offset of the current record. */ int fetch_advance; /* the kludge */ U32 size; /* initialized if map is nonzero */ U32 loop; /* number of hash slots searched under this key */ U32 khash; /* initialized if loop is nonzero */ U32 kpos; /* initialized if loop is nonzero */ U32 hpos; /* initialized if loop is nonzero */ U32 hslots; /* initialized if loop is nonzero */ U32 dpos; /* initialized if cdb_findnext() returns 1 */ U32 dlen; /* initialized if cdb_findnext() returns 1 */ } ; #define CDB_HPLIST 1000 struct cdb_hp { U32 h; U32 p; } ; struct cdb_hplist { struct cdb_hp hp[CDB_HPLIST]; struct cdb_hplist *next; int num; } ; struct cdb_make { PerlIO *f; /* Handle of file being created. */ char *fn; /* Final name of file. */ char *fntemp; /* Temporary name of file. */ char final[2048]; char bspace[1024]; U32 count[256]; U32 start[256]; struct cdb_hplist *head; struct cdb_hp *split; /* includes space for hash */ struct cdb_hp *hash; U32 numentries; U32 pos; int fd; } ; static void writeerror() { croak("Write to CDB_File failed: %s", Strerror(errno)); } static void readerror() { croak("Read of CDB_File failed: %s", Strerror(errno)); } static void seekerror() { croak("Seek in CDB_File failed: %s", Strerror(errno)); } static void nomem() { croak("Out of memory!"); } static int cdb_make_start(struct cdb_make *c) { c->head = 0; c->split = 0; c->hash = 0; c->numentries = 0; c->pos = sizeof c->final; return PerlIO_seek(c->f, c->pos, SEEK_SET); } static int posplus(struct cdb_make *c, U32 len) { U32 newpos = c->pos + len; if (newpos < len) { errno = ENOMEM; return -1; } c->pos = newpos; return 0; } static int cdb_make_addend(struct cdb_make *c, unsigned int keylen, unsigned int datalen, U32 h) { struct cdb_hplist *head; head = c->head; if (!head || (head->num >= CDB_HPLIST)) { New(0xCDB, head, 1, struct cdb_hplist); head->num = 0; head->next = c->head; c->head = head; } head->hp[head->num].h = h; head->hp[head->num].p = c->pos; ++head->num; ++c->numentries; if (posplus(c, 8) == -1) return -1; if (posplus(c, keylen) == -1) return -1; if (posplus(c, datalen) == -1) return -1; return 0; } #define CDB_HASHSTART 5381 static U32 cdb_hashadd(U32 h, unsigned char c) { h += (h << 5); return h ^ c; } static U32 cdb_hash(char *buf, unsigned int len) { U32 h; h = CDB_HASHSTART; while (len) { h = cdb_hashadd(h,*buf++); --len; } return h; } static void uint32_pack(char s[4], U32 u) { s[0] = u & 255; u >>= 8; s[1] = u & 255; u >>= 8; s[2] = u & 255; s[3] = u >> 8; } static void uint32_unpack(char s[4], U32 *u) { U32 result; result = (unsigned char) s[3]; result <<= 8; result += (unsigned char) s[2]; result <<= 8; result += (unsigned char) s[1]; result <<= 8; result += (unsigned char) s[0]; *u = result; } static void cdb_findstart(struct cdb *c) { c->loop = 0; } static int cdb_read(struct cdb *c, char *buf, unsigned int len, U32 pos) { #ifdef HASMMAP if (c->map) { if ((pos > c->size) || (c->size - pos < len)) { errno = EFTYPE; return -1; } memcpy(buf, c->map + pos, len); return 0; } #endif if (PerlIO_seek(IoIFP(GvIOn(c->glob)), pos, SEEK_SET) == -1) return -1; while (len > 0) { int r; do r = PerlIO_read(IoIFP(GvIOn(c->glob)), buf, len); while ((r == -1) && (errno == EINTR)); if (r == -1) return -1; if (r == 0) { errno = EFTYPE; return -1; } buf += r; len -= r; } return 0; } static int match(struct cdb *c,char *key,unsigned int len, U32 pos) { char buf[32]; int n; while (len > 0) { n = sizeof buf; if (n > len) n = len; if (cdb_read(c, buf, n, pos) == -1) return -1; if (memcmp(buf, key, n)) return 0; pos += n; key += n; len -= n; } return 1; } int cdb_findnext(struct cdb *c,char *key,unsigned int len) { char buf[8]; U32 pos; U32 u; if (!c->loop) { u = cdb_hash(key,len); if (cdb_read(c,buf,8,(u << 3) & 2047) == -1) return -1; uint32_unpack(buf + 4,&c->hslots); if (!c->hslots) return 0; uint32_unpack(buf,&c->hpos); c->khash = u; u >>= 8; u %= c->hslots; u <<= 3; c->kpos = c->hpos + u; } while (c->loop < c->hslots) { if (cdb_read(c,buf,8,c->kpos) == -1) return -1; uint32_unpack(buf + 4,&pos); if (!pos) return 0; c->loop += 1; c->kpos += 8; if (c->kpos == c->hpos + (c->hslots << 3)) c->kpos = c->hpos; uint32_unpack(buf,&u); if (u == c->khash) { if (cdb_read(c,buf,8,pos) == -1) return -1; uint32_unpack(buf,&u); if (u == len) switch(match(c,key,len,pos + 8)) { case -1: return -1; case 1: uint32_unpack(buf + 4,&c->dlen); c->dpos = pos + 8 + len; return 1; } } } return 0; } static int cdb_find(struct cdb *c, char *key, unsigned int len) { cdb_findstart(c); return cdb_findnext(c,key,len); } static void iter_start(struct cdb *c) { char buf[4]; c->curpos = 2048; if (cdb_read(c, buf, 4, 0) == -1) readerror(); uint32_unpack(buf, &c->end); c->curkey = NEWSV(0xcdb, 1); c->fetch_advance = 0; } static int iter_key(struct cdb *c) { char buf[8]; U32 klen; if (c->curpos < c->end) { if (cdb_read(c, buf, 8, c->curpos) == -1) readerror(); uint32_unpack(buf, &klen); (void)SvPOK_only(c->curkey); SvGROW(c->curkey, klen); SvCUR_set(c->curkey, klen); if (cdb_read(c, SvPVX(c->curkey), klen, c->curpos + 8) == -1) readerror(); return 1; } return 0; } static void iter_advance(struct cdb *c) { char buf[8]; U32 klen, dlen; if (cdb_read(c, buf, 8, c->curpos) == -1) readerror(); uint32_unpack(buf, &klen); uint32_unpack(buf + 4, &dlen); c->curpos += 8 + klen + dlen; } static void iter_end(struct cdb *c) { if (c->end != 0) { c->end = 0; SvREFCNT_dec(c->curkey); } } #define cdb_datapos(c) ((c)->dpos) #define cdb_datalen(c) ((c)->dlen) MODULE = CDB_File PACKAGE = CDB_File PREFIX = cdb_ # Some accessor methods. # WARNING: I don't really understand enough about Perl's guts (file # handles / globs, etc.) to write this code. I think this is right, and # it seems to work, but input from anybody with a deeper # understanding would be most welcome. SV * cdb_handle(db) SV * db PROTOTYPE: $ PREINIT: struct cdb *this; CODE: this = (struct cdb *)SvPV(SvRV(db), PL_na); RETVAL = newRV_inc((SV *)GvIOn(this->glob)); OUTPUT: RETVAL U32 cdb_datalen(db) SV * db PROTOTYPE: $ CODE: RETVAL = cdb_datalen((struct cdb *)SvPV(SvRV(db), PL_na)); OUTPUT: RETVAL U32 cdb_datapos(db) SV * db PROTOTYPE: $ CODE: RETVAL = cdb_datapos((struct cdb *)SvPV(SvRV(db), PL_na)); OUTPUT: RETVAL SV * cdb_TIEHASH(dbtype, filename) char * dbtype char * filename PROTOTYPE: $$ CODE: PerlIO *f; IO *io; struct cdb cdb; SV *cdbp; f = PerlIO_open(filename, "rb"); if (!f) XSRETURN_NO; cdb.glob = newGVgen("cdb"); io = GvIOn(cdb.glob); IoIFP(io) = f; cdb.end = 0; #ifdef HASMMAP { struct stat st; int fd = PerlIO_fileno(f); cdb.map = 0; if (fstat(fd, &st) == 0) { if (st.st_size <= 0xffffffff) { char *x; x = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, 0); if (x != (char *)-1) { cdb.size = st.st_size; cdb.map = x; } } } } #endif cdbp = newSVpv((char *)&cdb, sizeof(struct cdb)); RETVAL = newRV_noinc(cdbp); sv_bless(RETVAL, gv_stashpv(dbtype, 0)); /* Prevent the user stomping on the cdb. */ SvREADONLY_on(cdbp); OUTPUT: RETVAL SV * cdb_FETCH(db, k) SV * db SV * k PROTOTYPE: $$ PREINIT: struct cdb *this; PerlIO *f; char buf[8]; int found; off_t pos; STRLEN klen, x; U32 klen0; char *kp; CODE: if (!SvOK(k)) { if (ckWARN(WARN_UNINITIALIZED)) report_uninit(); XSRETURN_UNDEF; } this = (struct cdb *)SvPV(SvRV(db), PL_na); kp = SvPV(k, klen); if (this->end && sv_eq(this->curkey, k)) { if (cdb_read(this, buf, 8, this->curpos) == -1) readerror(); uint32_unpack(buf + 4, &this->dlen); this->dpos = this->curpos + 8 + klen; if (this->fetch_advance) { iter_advance(this); if (!iter_key(this)) iter_end(this); } found = 1; } else { cdb_findstart(this); found = cdb_findnext(this, kp, klen); if ((found != 0) && (found != 1)) readerror(); } ST(0) = sv_newmortal(); if (found && sv_upgrade(ST(0), SVt_PV)) { U32 dlen = cdb_datalen(this); (void)SvPOK_only(ST(0)); SvGROW(ST(0), dlen + 1); SvCUR_set(ST(0), dlen); if (cdb_read(this, SvPVX(ST(0)), dlen, cdb_datapos(this)) == -1) readerror(); SvPV(ST(0), PL_na)[dlen] = '\0'; } AV * cdb_multi_get(db, k) SV * db SV * k PROTOTYPE: $$ PREINIT: struct cdb *this; PerlIO *f; char buf[8]; int found; off_t pos; STRLEN klen; U32 dlen, klen0; char *kp; SV *x; CODE: if (!SvOK(k)) { if (ckWARN(WARN_UNINITIALIZED)) report_uninit(); XSRETURN_UNDEF; } this = (struct cdb *)SvPV(SvRV(db), PL_na); cdb_findstart(this); RETVAL = newAV(); sv_2mortal((SV *)RETVAL); kp = SvPV(k, klen); for (;;) { found = cdb_findnext(this, kp, klen); if ((found != 0) && (found != 1)) readerror(); if (!found) break; x = newSVpvn("", 0); dlen = cdb_datalen(this); SvGROW(x, dlen + 1); SvCUR_set(x, dlen); if (cdb_read(this, SvPVX(x), dlen, cdb_datapos(this)) == -1) readerror(); SvPV(x, PL_na)[dlen] = '\0'; av_push(RETVAL, x); } OUTPUT: RETVAL int cdb_EXISTS(db, k) SV * db SV * k PROTOTYPE: $$ CODE: struct cdb *this; STRLEN klen; char *kp; if (!SvOK(k)) { if (ckWARN(WARN_UNINITIALIZED)) report_uninit(); XSRETURN_NO; } this = (struct cdb *)SvPV(SvRV(db), PL_na); kp = SvPV(k, klen); RETVAL = cdb_find(this, kp, klen); if (RETVAL != 0 && RETVAL != 1) readerror(); OUTPUT: RETVAL void cdb_DESTROY(db) SV * db PROTOTYPE: $ CODE: if (SvCUR(SvRV(db)) == sizeof(struct cdb)) { /* It came from TIEHASH. */ struct cdb *this; IO *io; this = (struct cdb *)SvPV(SvRV(db), PL_na); iter_end(this); #ifdef HASMMAP if (this->map) { munmap(this->map, this->size); this->map = 0; } #endif io = GvIOn(this->glob); PerlIO_close(IoIFP(io)); /* close() on O_RDONLY cannot fail */ IoIFP(io) = Nullfp; SvREFCNT_dec((SV *)this->glob); } else { struct cdb_make *this; this = (struct cdb_make *)SvPV(SvRV(db), PL_na); SvREFCNT_dec((SV *)this); } SV * cdb_FIRSTKEY(db) SV * db PROTOTYPE: $ CODE: struct cdb *this; char buf[8]; U32 klen; this = (struct cdb *)SvPV(SvRV(db), PL_na); iter_start(this); if (iter_key(this)) ST(0) = sv_mortalcopy(this->curkey); else XSRETURN_UNDEF; /* empty database */ SV * cdb_NEXTKEY(db, k) SV * db SV * k PROTOTYPE: $$ CODE: struct cdb *this; char buf[8], *kp; int found; off_t pos; U32 dlen, klen0; STRLEN klen1; if (!SvOK(k)) { if (ckWARN(WARN_UNINITIALIZED)) report_uninit(); XSRETURN_UNDEF; } this = (struct cdb *)SvPV(SvRV(db), PL_na); if (this->end == 0 || !sv_eq(this->curkey, k)) croak("Use CDB_File::FIRSTKEY before CDB_File::NEXTKEY"); iter_advance(this); if (iter_key(this)) ST(0) = sv_mortalcopy(this->curkey); else { iter_start(this); (void)iter_key(this); /* prepare curkey for FETCH */ this->fetch_advance = 1; XSRETURN_UNDEF; } SV * cdb_new(this, fn, fntemp) char * this char * fn char * fntemp PROTOTYPE: $$$ CODE: SV *cdbmp; struct cdb_make cdbmake; int i; cdbmake.f = PerlIO_open(fntemp, "wb"); if (!cdbmake.f) XSRETURN_UNDEF; if (cdb_make_start(&cdbmake) < 0) XSRETURN_UNDEF; /* Oh, for referential transparency. */ New(0, cdbmake.fn, strlen(fn) + 1, char); New(0, cdbmake.fntemp, strlen(fntemp) + 1, char); strncpy(cdbmake.fn, fn, strlen(fn) + 1); strncpy(cdbmake.fntemp, fntemp, strlen(fntemp) + 1); cdbmp = newSVpv((char *)&cdbmake, sizeof(struct cdb_make)); RETVAL = newRV_noinc(cdbmp); sv_bless(RETVAL, gv_stashpv(this, 0)); OUTPUT: RETVAL void cdb_insert(cdbmake, k, v) SV * cdbmake SV * k SV * v PROTOTYPE: $$$ CODE: char *kp, *vp, packbuf[8]; int c, i; STRLEN klen, vlen; struct cdb_make *this; U32 h; this = (struct cdb_make *)SvPV(SvRV(cdbmake), PL_na); kp = SvPV(k, klen); vp = SvPV(v, vlen); uint32_pack(packbuf, klen); uint32_pack(packbuf + 4, vlen); if (PerlIO_write(this->f, packbuf, 8) < 8) writeerror(); h = cdb_hash(kp, klen); if (PerlIO_write(this->f, kp, klen) < klen) writeerror(); if (PerlIO_write(this->f, vp, vlen) < vlen) writeerror(); if (cdb_make_addend(this, klen, vlen, h) == -1) nomem(); int cdb_finish(cdbmake) SV * cdbmake; PROTOTYPE: $ CODE: char buf[8]; int i; struct cdb_make *this; U32 len, u; U32 count, memsize, where; struct cdb_hplist *x; struct cdb_hp *hp; this = (struct cdb_make *)SvPV(SvRV(cdbmake), PL_na); for (i = 0; i < 256; ++i) this->count[i] = 0; for (x = this->head; x; x = x->next) { i = x->num; while (i--) ++this->count[255 & x->hp[i].h]; } memsize = 1; for (i = 0; i < 256; ++i) { u = this->count[i] * 2; if (u > memsize) memsize = u; } memsize += this->numentries; /* no overflow possible up to now */ u = (U32) 0 - (U32) 1; u /= sizeof(struct cdb_hp); if (memsize > u) { errno = ENOMEM; XSRETURN_UNDEF; } New(0xCDB, this->split, memsize, struct cdb_hp); this->hash = this->split + this->numentries; u = 0; for (i = 0; i < 256; ++i) { u += this->count[i]; /* bounded by numentries, so no overflow */ this->start[i] = u; } for (x = this->head; x; x = x->next) { i = x->num; while (i--) this->split[--this->start[255 & x->hp[i].h]] = x->hp[i]; } for (i = 0; i < 256; ++i) { count = this->count[i]; len = count + count; /* no overflow possible */ uint32_pack(this->final + 8 * i, this->pos); uint32_pack(this->final + 8 * i + 4, len); for (u = 0; u < len; ++u) this->hash[u].h = this->hash[u].p = 0; hp = this->split + this->start[i]; for (u = 0; u < count; ++u) { where = (hp->h >> 8) % len; while (this->hash[where].p) if (++where == len) where = 0; this->hash[where] = *hp++; } for (u = 0; u < len; ++u) { uint32_pack(buf, this->hash[u].h); uint32_pack(buf + 4, this->hash[u].p); if (PerlIO_write(this->f, buf, 8) == -1) XSRETURN_UNDEF; if (posplus(this, 8) == -1) XSRETURN_UNDEF; } } Safefree(this->split); if (PerlIO_flush(this->f) == EOF) writeerror(); PerlIO_rewind(this->f); if (PerlIO_write(this->f, this->final, sizeof this->final) < sizeof this->final) writeerror(); if (PerlIO_flush(this->f) == EOF) writeerror(); if (fsync(PerlIO_fileno(this->f)) == -1) XSRETURN_NO; if (PerlIO_close(this->f) == EOF) XSRETURN_NO; if (rename(this->fntemp, this->fn)) XSRETURN_NO; Safefree(this->fn); Safefree(this->fntemp); RETVAL = 1; OUTPUT: RETVAL HCDB_File-0.86/COPYRIGHT4_D4The files in this directory are Copyright 1997 - 2001 Tim Goodwin. You may redistribute them under the same terms as Perl itself. G5RHCDB_File-0.86/CHANGES5_G. D5Revision history for Perl extension CDB_File. 0.86 2001-05-25 - add handle, datalen, and datapos methods for low level access - simplify multi_get, and remove a memory leak - document need to destroy extra references - open files in binary mode (thanks IW) - use mmap() (thanks RDW) - beta release 0.85 2001-02-06 - multi_get now works during each (thanks MdlR) - move multi_get to CDB_File.xs, remove dumb O(n*n), and fix bug - don't make the database files read-only (thanks FL) - beta release 0.84 2000-11-21 - backwards compatibility with perl-5.005 (thanks BD) - EPROTO not available everywhere (thanks BD); EFTYPE preferred - beta release 0.83 2000-11-03 - fix stupid typo - beta release 0.82 2000-05-30 - fix bug in `each', introduced in 0.81 - beta release 0.81 2000-05-12 - port to perl 5.6.0 - cdb code derived from cdb-0.75 - cdb code incorporated into CDB_File.xs - multi_get works even for non-adjacent keys - fetching values in order from previously obtained keys array works - use perlapio 0.8 1999-09-08 - fix bug with undefined keys / values (thanks CMC, JPB) - beta release 0.7 1997-10-20 - use Perl's Strerror instead of strerror - fix bogus warning in multi_get (thanks MdlR) - fix bug with empty values (thanks RDM) - don't fail test 6 if run as root (thanks MP, JB) - alpha release 0.6 1997-03-25 - fix unsigned off_t bug - fix version number confusion - propagate Perl's idea of CC and LD to cdb (thanks IP, SB) - use safe cdb_bread() in preference to read() (thanks MdlR) - object is now a scalar again, containing struct cdbobj - support repeated keys (thanks MdlR) - split create into new, insert, finish - optimize FETCH and NEXTKEY - support building as a static extension - PERLIO_NOT_STDIO so it works with useperlio defined (thanks AK, NMS) - add multi_get method (thanks MdlR) - fix some core dumps (thanks MdlR) - make cdb object read only (thanks MdlR) - alpha release 0.5 1997-02-12 - fix order of @ISA, so imports work - alpha release 0.4 1997-02-06 - iteration (FIRSTKEY, NEXTKEY) added - "pre-alpha" release 0.3 1997-01-28 - no longer dependent on cdbmake - CDB_File::cdbm removed - temporary file name no longer optional - "pre-alpha" release 0.2 1997-01-14 - first "pre-alpha" release 0.1 1997-01-08 - original version; created by h2xs 1.16 G6RHCDB_File-0.86/INSTALL6_GD6You need Perl 5.005 or later. 1. Create a Makefile. perl Makefile.PL 2. Build the CDB_File extension. make 3. Test it (please don't omit this step). make test You should see `ok 1' through to `ok 37'. If any tests fail, please get in touch so we can sort out the problem. 4. Install the extension. If you have built CDB_File as a dynamic extension, it's as simple as this. make install If you have built CDB_File as a static extension, follow the instructions given during the build process. 5. If you have any problems, questions, or ideas for future enhancements, please contact me. Tim Goodwin 2001-02-06 G7RHCDB_File-0.86/MANIFEST7_GkD7ACKNOWLEDGE CDB_File.pm CDB_File.xs COPYRIGHT CHANGES INSTALL MANIFEST Makefile.PL README bun-x.pl test.pl G8RHCDB_File-0.86/Makefile.PL8_GD8use Config; use ExtUtils::MakeMaker; # See lib/ExtUtils/MakeMaker.pm for details of how to influence # the contents of the Makefile that is written. WriteMakefile( 'NAME' => 'CDB_File', 'VERSION_FROM' => 'CDB_File.pm', 'DEFINE' => ($Config{d_mmap} eq define) && '-DHASMMAP', ); G9RHCDB_File-0.86/README9_GD9This is beta prerelease 0.86 of CDB_File. See INSTALL for installation instructions. CDB_File is a module which provides a Perl interface to Dan Berstein's cdb package: cdb is a fast, reliable, lightweight package for creating and reading constant databases. See http://pobox.com/~djb/cdb.html for the latest information about cdb. Tim Goodwin 2001-05-25 G10RXHCDB_File-0.86/bun-x.pl10_G&D10#! /usr/bin/perl use CDB_File; use strict; sub unnetstrings { my($netstrings) = @_; my @result; while ($netstrings =~ s/^([0-9]+)://) { push @result, substr($netstrings, 0, $1, ''); $netstrings =~ s/^,//; } return @result; } my $chunk = 8192; sub extract { my($file, $t, $b) = @_; my $head = $$b{"H$file"}; my ($code, $type) = $head =~ m/^([0-9]+)(.)/; if ($type eq "/") { mkdir $file, 0777; } elsif ($type eq "_") { my ($total, $now, $got, $x); open OUT, ">$file" or die "open for output: $!\n"; exists $$b{"D$code"} or die "corrupt bun file\n"; my $fh = $t->handle; sysseek $fh, $t->datapos, 0; $total = $t->datalen; while ($total) { $now = ($total > $chunk) ? $chunk : $total; $got = sysread $fh, $x, $now; if (not $got) { die "read error\n"; } $total -= $got; print OUT $x; } close OUT; } else { print STDERR "warning: skipping unknown file type\n"; } } die "usage\n" if @ARGV != 1; my (%b, $t); $t = tie %b, 'CDB_File', $ARGV[0] or die "tie: $!\n"; map { extract $_, $t, \%b } unnetstrings $b{""}; G11RHCDB_File-0.86/test.pl11_GD11# Before `make install' is performed this script should be runnable with # `make test'. After `make install' it should work as `perl test.pl' BEGIN {print "1..38\n";} END {print "not ok 1\n" unless $loaded;} use CDB_File; $loaded = 1; print "ok 1\n"; # Test that attempt to tie to nonexist file fails. tie %h, CDB_File, 'nonesuch.cdb' and print 'not '; print "ok 2\n"; # Test that attempt to read incorrect file fails. open OUT, '> bad.cdb'; close OUT; tie %h, CDB_File, 'bad.cdb' or print 'not '; print "ok 3\n"; eval { print $h{'one'} }; print 'not ' unless $@ =~ /^Read of CDB_File failed:/; print "ok 4\n"; untie %h; unlink 'bad.cdb'; # Test that file can be created. %a = qw(one Hello two Goodbye); eval { CDB_File::create %a, 'good.cdb', 'good.tmp' or print 'not ' }; print "$@ not " if $@; print "ok 5\n"; # Test that good file works. tie %h, CDB_File, 'good.cdb' or print 'not '; print "ok 6\n"; $t = tied %h; $t->FETCH('one') eq 'Hello' or print 'not '; print "ok 7\n"; $h{'one'} eq 'Hello' or print 'not '; print "ok 8\n"; defined $h{'1'} and print 'not '; print "ok 9\n"; exists $h{'two'} or print 'not '; print "ok 10\n"; exists $h{'three'} and print 'not '; print "ok 11\n"; # Test low level access. $fh = $t->handle; exists $h{'one'}; sysseek $fh, $t->datapos, 0; sysread $fh, $x, $t->datalen; $x eq 'Hello' or print "not "; exists $h{'two'}; sysseek $fh, $t->datapos, 0; sysread $fh, $x, $t->datalen; $x eq 'Goodbye' or print "not "; print "ok 12\n"; @h = sort keys %h; @h == 2 and $h[0] eq 'one' and $h[1] eq 'two' or print 'not '; print "ok 13\n"; eval { $h{'four'} = 'foo' }; print 'not ' unless $@ =~ /Modification of a CDB_File attempted/; print "ok 14\n"; eval { delete $h{'five'} }; print 'not ' unless $@ =~ /Modification of a CDB_File attempted/; print "ok 15\n"; unlink 'good.cdb'; # Test empty file. undef %a; eval { CDB_File::create %a, 'empty.cdb', 'empty.tmp' or print 'not ' }; print "$@ not " if $@; print "ok 16\n"; tie %h, CDB_File, 'empty.cdb' or print 'not '; print "ok 17\n"; keys %h == 0 or print 'not '; print "ok 18\n"; unlink 'empty.cdb'; # Test failing new. new CDB_File '..', '.' and print 'not '; print "ok 19\n"; # Test file with repeated keys. $tmp = 'repeat.tmp'; $cdbm = new CDB_File 'repeat.cdb', $tmp or print 'not '; print "ok 20\n"; $cdbm->insert('dog', 'perro'); $cdbm->insert('cat', 'gato'); $cdbm->insert('cat', 'chat'); $cdbm->insert('dog', 'chien'); $cdbm->insert('rabbit', 'conejo'); $tmp = 'ERROR!'; # Test that name was stashed correctly. $cdbm->finish; $t = tie %h, CDB_File, 'repeat.cdb' or print 'not '; print "ok 21\n"; # Test that NEXTKEY can't be used immediately after TIEHASH. eval { $t->NEXTKEY('dog') }; print 'not ' unless $@ =~ /^Use CDB_File::FIRSTKEY before CDB_File::NEXTKEY/; print "ok 22\n"; @k = keys %h; @v = values %h; $k[0] eq 'dog' and $k[1] eq 'cat' and $k[2] eq 'cat' and $k[3] eq 'dog' and $k[4] eq 'rabbit' and $v[0] eq 'perro' and $v[1] eq 'gato' and $v[2] eq 'chat' and $v[3] eq 'chien' and $v[4] eq 'conejo' or print 'not '; print "ok 23\n"; @k = (); @v = (); while (($k, $v) = each %h) { push @k, $k; push @v, $v; } $k[0] eq 'dog' and $k[1] eq 'cat' and $k[2] eq 'cat' and $k[3] eq 'dog' and $k[4] eq 'rabbit' and $v[0] eq 'perro' and $v[1] eq 'gato' and $v[2] eq 'chat' and $v[3] eq 'chien' and $v[4] eq 'conejo' or print 'not '; print "ok 24\n"; $v = $t->multi_get('cat'); @$v == 2 and $$v[0] eq 'gato' and $$v[1] eq 'chat' or print 'not '; print "ok 25\n"; $v = $t->multi_get('dog'); @$v == 2 and $$v[0] eq 'perro' and $$v[1] eq 'chien' or print 'not '; print "ok 26\n"; $v = $t->multi_get('rabbit'); @$v == 1 and $$v[0] eq 'conejo' or print 'not '; print "ok 27\n"; $v = $t->multi_get('foo'); @$v and print 'not '; print "ok 28\n"; while (($k, $v) = each %h) { $v = $t->multi_get($k); $k eq 'cat' and $$v[0] eq 'gato' and $$v[1] eq 'chat' or $k eq 'dog' and $$v[0] eq 'perro' and $$v[1] eq 'chien' or $k eq 'rabbit' and $$v[0] eq 'conejo' or print 'not '; } print "ok 29\n"; # Test undefined keys. { local $SIG{__WARN__} = sub { $warned = 1 if $_[0] =~ /^Use of uninitialized value/ }; local $^W = 1; $warned = 0; $x = undef; not defined $h{$x} and $warned or print 'not '; print "ok 30\n"; $warned = 0; not exists $h{$x} and $warned or print 'not '; print "ok 31\n"; $warned = 0; $v = $t->multi_get('rabbit') and not $warned or print 'not '; print "ok 32\n"; } # Check that object is readonly. eval { $$t = 'foo' }; $@ =~ /^Modification of a read-only value/ and $h{'cat'} eq 'gato' or print 'not '; print "ok 33\n"; unlink 'repeat.cdb'; # Regression test - dumps core in 0.6. %a = ('one', ''); CDB_File::create %a, 'good.cdb', 'good.tmp' or print "not "; tie %h, CDB_File, 'good.cdb' or print "not "; print "not " if $h{'zero'} or $h{'one'}; print "ok 34\n"; # And here's one I introduced while fixing 34 :-(. defined $h{'one'} or print "not "; print "ok 35\n"; unlink 'good.cdb'; # Test numeric data (broken before 0.8) $h = new CDB_File 't.cdb', 't.tmp' or print "not "; $h->insert(1, 1 * 23); $h->finish or print "not "; tie %h, CDB_File, 't.cdb' or print "not "; $h{1} == 23 or print "not "; untie %h; print "ok 36\n"; unlink 't.cdb'; # Test zero value with multi_get (broken before 0.85) $h = new CDB_File 't.cdb', 't.tmp' or print "not "; $h->insert('x', 0); $h->insert('x', 1); $h->finish or print "not "; $t = tie %h, CDB_File, 't.cdb' or print "not "; $x = $t->multi_get('x'); scalar @$x == 2 or print "not "; print "ok 37\n"; unlink 't.cdb'; $h = new CDB_File 't.cdb', 't.tmp' or print "not "; for ($i = 0; $i < 10; ++$i) { $h->insert($i, $i); } $h->finish or print "not "; $t = tie %h, CDB_File, 't.cdb' or print "not "; for ($i = 0; $i < 10; ++$i) { ($k, $v) = each %h; if ($k == 2) { exists $h{4} or print "not "; } if ($k == 5) { exists $h{23} and print "not "; } if ($k == 7) { $m = $t->multi_get(3); @$m == 1 and $$m[0] = 3 or print "not "; } $k == $i and $v == $i or print "not "; } print "ok 38\n"; unlink 't.cdb'; 713:CDB_File-0.86,25:CDB_File-0.86/ACKNOWLEDGE,25:CDB_File-0.86/CDB_File.pm,25:CDB_File-0.86/CDB_File.xs,23:CDB_File-0.86/COPYRIGHT,21:CDB_File-0.86/CHANGES,21:CDB_File-0.86/INSTALL,22:CDB_File-0.86/MANIFEST,25:CDB_File-0.86/Makefile.PL,20:CDB_File-0.86/README,22:CDB_File-0.86/bun-x.pl,21:CDB_File-0.86/test.pl,   &  rY rYV1rYrY}rYrYtJ& a1rYWrY0rYH2rY13rY+ 4rYt5Db5rYs6rY7rY~8rY؂9rYB ˈC j:s-&Ytg׈݆w2]}(E GK#