cascalog.cascading.tap documentation
->CascalogTap
(->CascalogTap source sink)
Positional factory function for class cascalog.cascading.tap.CascalogTap.
cascalog-tap
(cascalog-tap source sink)
fill-tap!
(fill-tap! tap xs)
get-sink-tuples
(get-sink-tuples sink)
glob-hfs
(glob-hfs scheme path-or-file source-pattern)
hfs
(hfs scheme path-or-file)
(hfs scheme path-or-file sinkmode)
hfs-seqfile
(hfs-seqfile path & opts)
Creates a tap on HDFS using sequence file format. Different
filesystems can be selected by using different prefixes for `path`.
Supports the `:outfields` keyword option. See
`cascalog.cascading.tap/hfs-tap` for more keyword arguments.
See http://www.cascading.org/javadoc/cascading/tap/Hfs.html and
http://www.cascading.org/javadoc/cascading/scheme/SequenceFile.html
hfs-tap
(hfs-tap scheme path-or-file & {:keys [sinkmode sinkparts sink-template source-pattern templatefields], :or {templatefields Fields/ALL}})
Returns a Cascading Hfs tap with support for the supplied scheme,
opened up on the supplied path or file object. Supported keyword
options are:
`:sinkmode` - can be `:keep`, `:update` or `:replace`.
`:sinkparts` - used to constrain the segmentation of output files.
`:source-pattern` - Causes resulting tap to respond as a GlobHfs tap
when used as source.
`:sink-template` - Causes resulting tap to respond as a TemplateTap when
used as a sink.
`:templatefields` - When a pattern is supplied via `:sink-template`,
this option allows a subset of output fields to be used in the
naming scheme.
See, for example, the TextDelimited scheme:
http://docs.cascading.org/cascading/2.0/javadoc/cascading/scheme/local/TextDelimited.html
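As a rough illustration of the keyword options listed for `hfs-tap`, the sketch below builds a sink tap from a scheme created with this namespace's zero-argument `text-line`. The path is a hypothetical placeholder and the option values are arbitrary choices for the example, not recommendations.

```clojure
(require '[cascalog.cascading.tap :as tap])

;; Assumption: "/tmp/cascalog-example/out" is a made-up path for this sketch.
(def out-tap
  (tap/hfs-tap (tap/text-line)              ; scheme from this namespace
               "/tmp/cascalog-example/out"
               :sinkmode :replace           ; one of :keep, :update, :replace
               :sinkparts 1))               ; constrain output segmentation
```

Building the tap only describes where data will go; nothing is written until the tap is used as the sink of a query.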
hfs-textline
(hfs-textline path & opts)
Creates a tap on HDFS using textline format. Different filesystems
can be selected by using different prefixes for `path`. Supported
keyword options are:
`:outfields` - used to select the fields written to the tap
`:compression` - one of `:enable`, `:disable` or `:default`
See `cascalog.cascading.tap/hfs-tap` for more keyword arguments.
See http://www.cascading.org/javadoc/cascading/tap/Hfs.html and
http://www.cascading.org/javadoc/cascading/scheme/TextLine.html
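A minimal sketch of how the `hfs-textline` options might be combined. The paths and field names are hypothetical, and passing `:outfields` as a vector of field-name strings is an assumption about the accepted form.

```clojure
(require '[cascalog.cascading.tap :as tap])

;; Hypothetical paths. A prefix on the path (for example "hdfs://" or
;; "file://") would select a different filesystem.
(def source-tap
  (tap/hfs-textline "/tmp/cascalog-example/in"))

(def sink-tap
  (tap/hfs-textline "/tmp/cascalog-example/out"
                    :outfields ["?word" "?count"] ; assumed: vector of field names
                    :compression :enable          ; :enable, :disable or :default
                    :sinkmode :replace))          ; forwarded to hfs-tap
```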
lfs
(lfs scheme path-or-file)
(lfs scheme path-or-file sinkmode)
lfs-seqfile
(lfs-seqfile path & opts)
Creates a tap on the local filesystem using sequence file format.
Supports the `:outfields` keyword option. See
`cascalog.cascading.tap/lfs-tap` for more keyword arguments.
See http://www.cascading.org/javadoc/cascading/tap/Lfs.html and
http://www.cascading.org/javadoc/cascading/scheme/SequenceFile.html
lfs-tap
(lfs-tap scheme path-or-file & {:keys [sinkmode sinkparts sink-template source-pattern templatefields], :or {templatefields Fields/ALL}})
Returns a Cascading Lfs tap with support for the supplied scheme,
opened up on the supplied path or file object. Supported keyword
options are:
`:sinkmode` - can be `:keep`, `:update` or `:replace`.
`:sinkparts` - used to constrain the segmentation of output files.
`:source-pattern` - Causes resulting tap to respond as a GlobHfs tap
when used as source.
`:sink-template` - Causes resulting tap to respond as a TemplateTap
when used as a sink.
`:templatefields` - When a pattern is supplied via `:sink-template`,
this option allows a subset of output fields to be used in the
naming scheme.
lfs-textline
(lfs-textline path & opts)
Creates a tap on the local filesystem using textline format.
Supports the `:outfields` keyword option. See
`cascalog.cascading.tap/lfs-tap` for more keyword arguments.
See http://www.cascading.org/javadoc/cascading/tap/Lfs.html and
http://www.cascading.org/javadoc/cascading/scheme/TextLine.html
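For local experimentation, a tap from `lfs-textline` can serve as the sink of a query. The sketch below assumes the `?<-` query macro from `cascalog.api` (not part of this namespace) and uses an in-memory vector of tuples as the generator. The output path is a hypothetical placeholder.

```clojure
(require '[cascalog.api :refer [?<-]]
         '[cascalog.cascading.tap :as tap])

;; Assumption: "/tmp/cascalog-example/words" is a made-up output directory.
;; :sinkmode :replace overwrites any output left by an earlier run.
(?<- (tap/lfs-textline "/tmp/cascalog-example/words" :sinkmode :replace)
     [?word]
     ([["hello"] ["world"]] ?word))
```

Swapping `lfs-textline` for `hfs-textline` should leave the query itself unchanged, since only the tap decides where the tuples land.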
map->CascalogTap
(map->CascalogTap m__5818__auto__)
Factory function for class cascalog.cascading.tap.CascalogTap, taking a map of keywords to field values.
memory-source-tap
(memory-source-tap tuples)
(memory-source-tap fields-in tuples)
pluck-tuple
(pluck-tuple tap)
sequence-file
(sequence-file field-names)
set-sinkparts!
(set-sinkparts! scheme sinkparts)
If `sinkparts` is truthy, returns the supplied cascading scheme
with the `sinkparts` field updated appropriately; else, acts as
identity.
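A small sketch of the two branches described above, using a scheme from this namespace's zero-argument `text-line`. The value `4` is an arbitrary choice for the example.

```clojure
(require '[cascalog.cascading.tap :as tap])

(def scheme (tap/text-line))

;; Truthy sinkparts: the scheme comes back with its sinkparts setting updated.
(def with-parts (tap/set-sinkparts! scheme 4))

;; nil sinkparts: the scheme is returned as-is.
(def unchanged (tap/set-sinkparts! scheme nil))
```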
stdout
(stdout)
Creates a tap that prints tuples sunk to it to standard
output. Useful for experimentation in the REPL.
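A typical REPL use might look like the following sketch. It assumes the `?<-` query macro from `cascalog.api`, which is outside this namespace, and an in-memory vector of tuples as the generator.

```clojure
(require '[cascalog.api :refer [?<-]]
         '[cascalog.cascading.tap :as tap])

;; Sinks each ?n tuple to standard output instead of a file.
(?<- (tap/stdout)
     [?n]
     ([[1] [2] [3]] ?n))
```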
template-tap
(template-tap parent sink-template)
(template-tap parent sink-template templatefields)
text-line
(text-line)
(text-line field-names)
(text-line source-fields sink-fields)
(text-line source-fields sink-fields compression)