tar-stream

tar-stream is a streaming tar parser and generator and nothing else. It operates purely using streams which means you can easily extract/parse tarballs without ever hitting the file system.

Note that you still need to gunzip your data if you have a .tar.gz. We recommend using gunzip-maybe in conjunction with this.
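
As a minimal sketch of the gunzip step, the snippet below uses `zlib.createGunzip()` from Node core rather than gunzip-maybe. That substitution is an assumption made to keep the example free of dependencies, and unlike gunzip-maybe it will fail on input that is not gzipped. The helper name `gunzipInto` is hypothetical.

```js
var zlib = require('zlib')

// hypothetical helper: decompress `source` and pipe the result into `dest`
// dest would typically be the stream returned by tar.extract()
function gunzipInto (source, dest) {
  var gunzip = zlib.createGunzip()
  // forward decompression errors so the consumer can handle them
  gunzip.on('error', function (err) {
    dest.destroy(err)
  })
  return source.pipe(gunzip).pipe(dest)
}
```

With a gzipped tarball on disk this might be called as `gunzipInto(fs.createReadStream('archive.tar.gz'), tar.extract())`, where `archive.tar.gz` is a placeholder path.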

```
npm install tar-stream
```


Usage


tar-stream exposes two streams: pack, which creates tarballs, and extract, which extracts tarballs. To modify an existing tarball, use both.


It implements USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions out there (gnutar, bsdtar etc).

Related


If you want to pack/unpack directories on the file system check out tar-fs which provides file system bindings to this module.

Packing


To create a pack stream use tar.pack() and call pack.entry(header, [callback]) to add tar entries.

``` js
var tar = require('tar-stream')
var pack = tar.pack() // pack is a stream

// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')

// add a file called my-stream-test.txt from a stream
var entry = pack.entry({ name: 'my-stream-test.txt', size: 11 }, function(err) {
  // the stream was added
  // no more entries
  pack.finalize()
})

entry.write('hello')
entry.write(' ')
entry.write('world')
entry.end()

// pipe the pack stream somewhere
pack.pipe(process.stdout)
```

Extracting


To extract a stream use tar.extract() and listen for the entry event with extract.on('entry', function(header, stream, next) {}).

``` js
var extract = tar.extract()

extract.on('entry', function(header, stream, next) {
  // header is the tar header
  // stream is the content body (might be an empty stream)
  // call next when you are done with this entry

  stream.on('end', function() {
    next() // ready for next entry
  })

  stream.resume() // just auto drain the stream
})

extract.on('finish', function() {
  // all entries read
})

pack.pipe(extract)
```

The tar archive is streamed sequentially, meaning you must drain each entry's stream as you receive it, or else the main extract stream will receive backpressure and stop reading.
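
To illustrate draining, here is a minimal sketch that buffers every entry's body in memory and only calls next once the entry stream has ended. The helper name `collectEntries` and the shape of its result are assumptions for illustration. It relies only on the entry, error and finish events of the extract stream, and buffering whole entries is only suitable for small archives.

```js
// hypothetical helper: gather all entries from an extract stream
// calls done(err, entries) where entries is [{ header, body }]
function collectEntries (extract, done) {
  var entries = []

  extract.on('entry', function (header, stream, next) {
    var chunks = []
    // consuming 'data' drains the entry so the extract stream keeps reading
    stream.on('data', function (chunk) {
      chunks.push(chunk)
    })
    stream.on('end', function () {
      entries.push({ header: header, body: Buffer.concat(chunks) })
      next() // ready for next entry
    })
  })

  extract.on('error', done)
  extract.on('finish', function () {
    done(null, entries)
  })
}
```

Call `collectEntries(extract, function (err, entries) { ... })` before piping the tarball into `extract`, so that no entry event is missed.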

Headers


The header object used in entry should contain the following properties.
Most of these values can be found by stat'ing a file.

``` js
{
  name: 'path/to/this/entry.txt',
  size: 1314,        // entry size. defaults to 0
  mode: 0o644,       // entry mode. defaults to 0o755 for dirs and 0o644 otherwise
  mtime: new Date(), // last modified date for entry. defaults to now.
  type: 'file',      // type of entry. defaults to file. can be:
                     // file | link | symlink | directory | block-device
                     // character-device | fifo | contiguous-file
  linkname: 'path',  // linked file name
  uid: 0,            // uid of entry owner. defaults to 0
  gid: 0,            // gid of entry owner. defaults to 0
  uname: 'maf',      // uname of entry owner. defaults to null
  gname: 'staff',    // gname of entry owner. defaults to null
  devmajor: 0,       // device major version. defaults to 0
  devminor: 0        // device minor version. defaults to 0
}
```
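
Since most header values come from stat'ing a file, the following sketch shows one way to build a header from `fs.statSync()`. The helper name `headerFromStat` is hypothetical, and the sketch assumes only regular files and directories need to be handled; links, devices and fifos would need extra fields such as linkname.

```js
var fs = require('fs')

// hypothetical helper: derive a header object from fs.statSync()
// only regular files and directories are handled in this sketch
function headerFromStat (name, filename) {
  var stat = fs.statSync(filename)
  var isDir = stat.isDirectory()
  return {
    name: name,
    size: isDir ? 0 : stat.size, // directories carry no body
    mode: stat.mode & 0o7777,    // keep permission bits, drop the file type bits
    mtime: stat.mtime,
    type: isDir ? 'directory' : 'file',
    uid: stat.uid,
    gid: stat.gid
  }
}
```

The result could then be passed as the first argument to `pack.entry`, for example `pack.entry(headerFromStat('my-test.txt', './my-test.txt'), contents)`, where the paths and `contents` are placeholders.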

Modifying existing tarballs


Using tar-stream, it is easy to rewrite paths, change modes, etc. in an existing tarball.

``` js
var extract = tar.extract()
var pack = tar.pack()
var path = require('path')

extract.on('entry', function(header, stream, callback) {
  // let's prefix all names with 'tmp'
  header.name = path.join('tmp', header.name)
  // write the new entry to the pack stream
  stream.pipe(pack.entry(header, callback))
})

extract.on('finish', function() {
  // all entries done - let's finalize it
  pack.finalize()
})

// pipe the old tarball to the extractor
oldTarballStream.pipe(extract)

// pipe the new tarball to another stream
pack.pipe(newTarballStream)
```
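
The same pattern covers other header changes such as modes. As a rough sketch, the wiring can be wrapped in a helper that takes a function to apply to each header. The name `rewriteEntries` and the `map` parameter are hypothetical and not part of tar-stream.

```js
// hypothetical helper: copy every entry from extract to pack,
// letting map(header) adjust each header on the way
function rewriteEntries (extract, pack, map) {
  extract.on('entry', function (header, stream, callback) {
    // pack.entry calls callback once the entry has been written
    stream.pipe(pack.entry(map(header), callback))
  })

  extract.on('finish', function () {
    // all entries done, close the new tarball
    pack.finalize()
  })
}
```

For example, `rewriteEntries(extract, pack, function (header) { header.mode = 0o644; return header })` would set the mode of every entry to 0o644.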

Saving tarball to fs



``` js
var fs = require('fs')
var tar = require('tar-stream')

var pack = tar.pack() // pack is a stream
var path = 'YourTarBall.tar'
var yourTarball = fs.createWriteStream(path)

// add a file called YourFile.txt with the content "Hello World!"
pack.entry({name: 'YourFile.txt'}, 'Hello World!', function (err) {
  if (err) throw err
  pack.finalize()
})

// pipe the pack stream to your file
pack.pipe(yourTarball)

yourTarball.on('close', function () {
  console.log(path + ' has been written')
  fs.stat(path, function(err, stats) {
    if (err) throw err
    console.log(stats)
    console.log('Got file info successfully!')
  })
})
```

Performance



License


MIT