cascalog.cascading.tap documentation

->CascalogTap

(->CascalogTap source sink)
Positional factory function for class cascalog.cascading.tap.CascalogTap.

cascalog-tap

(cascalog-tap source sink)

fill-tap!

(fill-tap! tap xs)

get-sink-tuples

(get-sink-tuples sink)

glob-hfs

(glob-hfs scheme path-or-file source-pattern)

hfs

(hfs scheme path-or-file)
(hfs scheme path-or-file sinkmode)

hfs-seqfile

(hfs-seqfile path & opts)
Creates a tap on HDFS using sequence file format. Different
filesystems can be selected by using different prefixes for `path`.

Supports keyword option for `:outfields`. See
`cascalog.cascading.tap/hfs-tap` for more keyword arguments.

See http://www.cascading.org/javadoc/cascading/tap/Hfs.html and
http://www.cascading.org/javadoc/cascading/scheme/SequenceFile.html

hfs-tap

(hfs-tap scheme path-or-file & {:keys [sinkmode sinkparts sink-template source-pattern templatefields], :or {templatefields Fields/ALL}})
Returns a Cascading Hfs tap with support for the supplied scheme,
opened up on the supplied path or file object. Supported keyword
options are:

`:sinkmode` - can be `:keep`, `:update` or `:replace`.

`:sinkparts` - used to constrain the segmentation of output files.

`:source-pattern` - Causes resulting tap to respond as a GlobHfs tap
when used as source.

`:sink-template` - Causes resulting tap to respond as a TemplateTap when
used as a sink.

`:templatefields` - When a pattern is supplied via `:sink-template`,
this option allows a subset of output fields to be used in the
naming scheme.

See, for example, the TextDelimited scheme:
http://docs.cascading.org/cascading/2.0/javadoc/cascading/scheme/local/TextDelimited.html
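To make the keyword options concrete, here is a minimal sketch that only constructs taps and runs no job. The paths, the glob, and the field name `?country` are invented for illustration. It also assumes, which the docstring does not state, that `:source-pattern` is a glob appended to the base path and that `:templatefields` accepts a vector of field names.

```clojure
(require '[cascalog.cascading.tap :as tap])

;; Plain sink: overwrite any existing output and ask for 4 output parts.
(def plain-sink
  (tap/hfs-tap (tap/text-line) "/tmp/example/out"
               :sinkmode :replace
               :sinkparts 4))

;; Source that responds as a GlobHfs tap.
;; Assumption: the pattern is a glob appended to the base path.
(def globbed-source
  (tap/hfs-tap (tap/text-line ["line"]) "/tmp/example/logs"
               :source-pattern "/2013-*/part-*"))

;; Sink that responds as a TemplateTap. The "%s/" template and the
;; "?country" field are hypothetical; only that field is used for naming.
(def partitioned-sink
  (tap/hfs-tap (tap/text-line) "/tmp/example/by-country"
               :sinkmode :replace
               :sink-template "%s/"
               :templatefields ["?country"]))
```

Because the options are plain keyword arguments, the same ones can be passed through `hfs-textline` and `hfs-seqfile`, which forward them to `hfs-tap`.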

hfs-textline

(hfs-textline path & opts)
Creates a tap on HDFS using textline format. Different filesystems
can be selected by using different prefixes for `path`. Supported
keyword options are:

`:outfields` - used to select the fields written to the tap

`:compression` - one of `:enable`, `:disable` or `:default`

See `cascalog.cascading.tap/hfs-tap` for more keyword arguments.

See http://www.cascading.org/javadoc/cascading/tap/Hfs.html and
http://www.cascading.org/javadoc/cascading/scheme/TextLine.html
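A short sketch of how the path prefix and the keyword options might be used. The host name, paths, and field names below are placeholders, not values taken from the library.

```clojure
(require '[cascalog.cascading.tap :as tap])

;; No prefix: the path is resolved against the default filesystem
;; configured for Hadoop.
(def input (tap/hfs-textline "/data/example/input"))

;; Explicit prefixes select a filesystem (placeholder host and paths).
(def hdfs-input
  (tap/hfs-textline "hdfs://namenode.example:8020/data/example/input"))
(def local-input
  (tap/hfs-textline "file:///tmp/example/input"))

;; Sink that writes two (hypothetical) fields, enables compression,
;; and replaces existing output. `:sinkmode` is forwarded to `hfs-tap`.
(def output
  (tap/hfs-textline "/data/example/output"
                    :outfields ["?word" "?count"]
                    :compression :enable
                    :sinkmode :replace))
```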

lfs

(lfs scheme path-or-file)
(lfs scheme path-or-file sinkmode)

lfs-seqfile

(lfs-seqfile path & opts)
Creates a tap on the local filesystem using sequence file format.

Supports keyword option for `:outfields`. See
`cascalog.cascading.tap/lfs-tap` for more keyword arguments.

See http://www.cascading.org/javadoc/cascading/tap/Lfs.html and
http://www.cascading.org/javadoc/cascading/scheme/SequenceFile.html

lfs-tap

(lfs-tap scheme path-or-file & {:keys [sinkmode sinkparts sink-template source-pattern templatefields], :or {templatefields Fields/ALL}})
Returns a Cascading Lfs tap with support for the supplied scheme,
opened up on the supplied path or file object. Supported keyword
options are:

`:sinkmode` - can be `:keep`, `:update` or `:replace`.

`:sinkparts` - used to constrain the segmentation of output files.

`:source-pattern` - Causes resulting tap to respond as a GlobHfs tap
when used as source.

`:sink-template` - Causes resulting tap to respond as a TemplateTap
when used as a sink.

`:templatefields` - When a pattern is supplied via `:sink-template`,
this option allows a subset of output fields to be used in the
naming scheme.

lfs-textline

(lfs-textline path & opts)
Creates a tap on the local filesystem using textline format.

Supports keyword option for `:outfields`. See
`cascalog.cascading.tap/lfs-tap` for more keyword arguments.

See http://www.cascading.org/javadoc/cascading/tap/Lfs.html and
http://www.cascading.org/javadoc/cascading/scheme/TextLine.html
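As a rough usage sketch, a local textline tap can serve as both the source and the sink of a query. This assumes `cascalog.api` is on the classpath and that `/tmp/example/in.txt` exists; both paths are placeholders.

```clojure
(require '[cascalog.api :refer [?<-]]
         '[cascalog.cascading.tap :as tap])

;; Copy every line of a local text file into a local output directory,
;; replacing the directory if it already exists.
(?<- (tap/lfs-textline "/tmp/example/out" :sinkmode :replace)
     [?line]
     ((tap/lfs-textline "/tmp/example/in.txt") ?line))
```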

map->CascalogTap

(map->CascalogTap m)
Factory function for class cascalog.cascading.tap.CascalogTap, taking a map of keywords to field values.

memory-source-tap

(memory-source-tap tuples)
(memory-source-tap fields-in tuples)

pluck-tuple

(pluck-tuple tap)

sequence-file

(sequence-file field-names)

set-sinkparts!

(set-sinkparts! scheme sinkparts)
If `sinkparts` is truthy, returns the supplied cascading scheme
with the `sinkparts` field updated appropriately; else, acts as
identity.
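The two branches described above can be sketched as follows. The scheme is built with `text-line` from this namespace, and the value 4 is arbitrary.

```clojure
(require '[cascalog.cascading.tap :as tap])

(def scheme (tap/text-line ["line"]))

;; Truthy sinkparts: the scheme is returned with its sinkparts setting updated.
(tap/set-sinkparts! scheme 4)

;; nil sinkparts: the scheme is returned as-is.
(tap/set-sinkparts! scheme nil)
```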

stdout

(stdout)
Creates a tap that prints tuples sunk to it to standard
output. Useful for experimentation in the REPL.
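For example, a REPL session might sink a small in-memory dataset to `stdout`. This sketch assumes `cascalog.api` is available and uses a literal vector of tuples as the generator.

```clojure
(require '[cascalog.api :refer [?<-]]
         '[cascalog.cascading.tap :as tap])

;; Run a trivial query and print each resulting tuple to standard output.
(?<- (tap/stdout)
     [?n]
     ([[1] [2] [3]] ?n))
```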

template-tap

(template-tap parent sink-template)
(template-tap parent sink-template templatefields)

text-line

(text-line)
(text-line field-names)
(text-line source-fields sink-fields)
(text-line source-fields sink-fields compression)

valid-sinkmode?