cascalog.cascading.flow documentation

IRunnable

All runnable items should implement this protocol.

all-to-memory

(all-to-memory & args)
Returns the results of the supplied workflows as data
structures. Accepts any number of workflows and, optionally, a flow
name as the first argument.

compile-flow

(compile-flow & args)
Attaches output taps to some number of subqueries and creates a
Cascading flow. The flow can be executed with `.complete`, or
introspection can be done on the flow.

Syntax: (compile-flow sink1 query1 sink2 query2 ...)
or (compile-flow flow-name sink1 query1 sink2 query2 ...)

If the first argument is a string, it is used as the name of the
flow and shows up in the JobTracker UI.
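
The two call shapes above can be sketched in plain Clojure. This is only an illustration of how the arguments group into an optional name plus sink/query pairs; `describe-compile-flow-args` is a hypothetical helper, not part of Cascalog, and the keywords stand in for real taps and queries. It assumes an unnamed flow is given the empty string as its name.

```clojure
;; Hypothetical helper, NOT part of cascalog.cascading.flow.
;; Shows how a compile-flow style argument list breaks down into
;; an optional flow name followed by sink/query pairs.
(defn describe-compile-flow-args
  [& args]
  (let [[flow-name bindings] (if (string? (first args))
                               [(first args) (rest args)]
                               ;; Assumption: unnamed flows use "" as the name.
                               ["" args])]
    {:name  flow-name
     :pairs (mapv vec (partition 2 bindings))}))

;; Unnamed form: every argument belongs to a sink/query pair.
(describe-compile-flow-args :sink1 :query1 :sink2 :query2)
;; => {:name "", :pairs [[:sink1 :query1] [:sink2 :query2]]}

;; Named form: the leading string is split off first.
(describe-compile-flow-args "daily-report" :sink1 :query1)
;; => {:name "daily-report", :pairs [[:sink1 :query1]]}
```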

compile-hadoop

(compile-hadoop fd)
Compiles the supplied FlowDef into a Hadoop flow.

flow-def

(flow-def {:keys [source-map sink-map trap-map tails name]})
Generates an instance of FlowDef from the supplied ClojureFlow.

graph

(graph flow path)
Writes a dotfile for the flow at hand to the supplied path.

parse-exec-args

(parse-exec-args [f & rest :as args])
Accepts a sequence whose first item may be a string, and returns a
vector of [the string or "", [the remaining items]].
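
A minimal sketch of the behavior described above, written from the docstring alone rather than taken from the Cascalog source. The name `split-leading-name` is hypothetical, and returning the remaining items as a vector is an assumption; the real function may return a different sequence type.

```clojure
;; Hypothetical re-implementation for illustration only.
;; If the first item is a string, treat it as the name; otherwise the
;; name defaults to "" and every item is kept.
(defn split-leading-name
  [[f & more :as args]]
  (if (string? f)
    [f (vec more)]
    ["" (vec args)]))

(split-leading-name ["my-flow" :sink :query])
;; => ["my-flow" [:sink :query]]

(split-leading-name [:sink :query])
;; => ["" [:sink :query]]
```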

run!

(run! x)

to-memory

(to-memory m)
Executes the supplied flow and returns the results as a sequence of
tuples.