cascalog.logic.ops documentation
all
(all & ops)
Accepts any number of filtering ops and returns a new op that
checks that every one of the original filters passes. For
example:
((all #'even? #'positive? #'small?) ?x) ;; within some query
Is equivalent to:
;; within some query
(even? ?x :> ?temp1)
(positive? ?x :> ?temp2)
(small? ?x :> ?temp3)
(and ?temp1 ?temp2 ?temp3)
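The same "every filter must pass" logic can be tried at a REPL on plain values. This is only a sketch: it uses `clojure.core/every-pred` as a stand-in for the Cascalog op, and `pos?` replaces the `positive?` filter named above.

```clojure
;; Plain-Clojure analogy, not a Cascalog query:
;; every-pred builds one predicate that requires all of its inputs to pass.
(def even-and-positive? (every-pred even? pos?))

(mapv even-and-positive? [4 -4 3])
;;=> [true false false]
```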
any
(any & ops)
Accepts any number of filtering ops and returns a new op that
checks that at least one of the original filters passes. For
example:
((any #'even? #'positive? #'small?) ?x) ;; within some query
Is equivalent to:
;; within some query
(even? ?x :> ?temp1)
(positive? ?x :> ?temp2)
(small? ?x :> ?temp3)
(or ?temp1 ?temp2 ?temp3)
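For comparison, the "at least one filter passes" behaviour looks like this on ordinary values. The sketch assumes `clojure.core/some-fn` is an acceptable stand-in for the Cascalog op, with `pos?` in place of `positive?`.

```clojure
;; Plain-Clojure analogy, not a Cascalog query:
;; some-fn builds one predicate that passes when any input predicate passes.
(def even-or-positive? (some-fn even? pos?))

(mapv even-or-positive? [-3 3 -4])
;;=> [false true true]
```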
avg
Predicate operation that produces the average value of the
supplied input variable. For example:
(let [src [[1] [2]]]
  (<- [?avg]
      (src ?x)
      (avg ?x :> ?avg)))
;;=> ([1.5])
comp
(comp & ops)
Accepts any number of predicate ops and returns an op that is the
composition of those ops.
(require '[cascalog.logic.ops :as c])
((c/comp #'str #'+) ?x ?y :> ?sum-string) ;; within some query
Is equivalent to:
;; within some query
(+ ?x ?y :> ?intermediate)
(str ?intermediate :> ?sum-string)
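The ordering follows `clojure.core/comp`: the rightmost op runs first and its result feeds the op to its left. A minimal illustration on plain values, using the core function rather than the Cascalog op:

```clojure
;; Plain-Clojure analogy: + is applied first, then str.
((comp str +) 1 2)
;;=> "3"
```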
distinct-count
Predicate operation that produces a count of all distinct
values of the supplied input variable. For example:
(let [src [[1] [2] [2]]]
  (<- [?count]
      (src ?x)
      (distinct-count ?x :> ?count)))
;;=> ([2])
each
(each op)
Accepts an operation and returns a predicate macro that maps `op`
across any number of input variables. For example:
((each #'str) ?x ?y ?z :> ?x-str ?y-str ?z-str) ;; within some query
Is equivalent to:
;; within some query
(str ?x :> ?x-str)
(str ?y :> ?y-str)
(str ?z :> ?z-str)
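Conceptually this is a positional map: each input variable is paired with the output variable in the same position. A rough plain-Clojure analogy, with `mapv` standing in for the predicate macro:

```clojure
;; Plain-Clojure analogy: apply str to each value independently,
;; keeping the results in input order.
(mapv str [1 :b "c"])
;;=> ["1" ":b" "c"]
```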
first-n
(first-n gen n & options)
Accepts a generator and a number `n` and returns a subquery that
produces the first `n` elements from the supplied generator. Two
keyword arguments are supported:
:sort -- accepts a vector of variables on which to sort. Defaults to
nil (unsorted).
:reverse -- If true, sorts items in reverse order. (false by default).
For example:
(def src [[1] [3] [2]]) ;; produces 3 tuples
;; produces ([1 2] [3 4] [2 3]) when executed
(def query (<- [?x ?y] (src ?x) (inc ?x :> ?y)))
;; produces ([3 4]) when executed
(first-n query 1 :sort ["?x"] :reverse true)
fixed-sample
(fixed-sample gen n)
Returns a subquery that produces a random sample of `n` elements from the
supplied generator.
fixed-sample-agg
(fixed-sample-agg amt)
juxt
(juxt & ops)
Accepts any number of predicate ops and returns an op that is the
juxtaposition of those ops.
(require '[cascalog.logic.ops :as c])
((c/juxt #'+ #'- #'*) !x !y :> !sum !diff !mult) ;; within some query
Is equivalent to:
;; within some query
(+ !x !y :> !sum)
(- !x !y :> !diff)
(* !x !y :> !mult)
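The name follows `clojure.core/juxt`, which applies every function to the same arguments and collects the results in order. A small sketch on plain values, using the core function rather than the Cascalog op:

```clojure
;; Plain-Clojure analogy: one call yields sum, difference and product.
((juxt + - *) 5 3)
;;=> [8 2 15]
```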
lazy-generator
(lazy-generator tmp-path [tuple :as l-seq])
Returns a cascalog generator on the supplied sequence of
tuples. `lazy-generator` serializes each item in the lazy sequence
into a sequencefile located at the supplied temporary directory and returns
a tap for the data in that directory.
It's recommended to wrap queries that use this tap with
`cascalog.cascading.io/with-fs-tmp`; for example,
(with-fs-tmp [_ tmp-dir]
  (let [lazy-tap (lazy-generator tmp-dir lazy-seq)]
    (?<- (stdout)
         [?field1 ?field2 ... etc]
         (lazy-tap ?field1 ?field2)
         ...)))
limit-buffer
(limit-buffer n)
limit-combine
(limit-combine options n)
limit-init
(limit-init sort-tuple & tuple)
limit-rank-buffer
(limit-rank-buffer n)
negate
(negate op)
Accepts a filtering op and returns a new op that acts as the
negation (or complement) of the original. For example:
((negate #'string?) ?string-var) ;; within some query
Is equivalent to:
;; within some query
(string? ?string-var :> ?temp-bool)
(not ?temp-bool)
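On plain values, the closest core counterpart is `clojure.core/complement`. The sketch below uses it only as an analogy for the Cascalog op:

```clojure
;; Plain-Clojure analogy: complement flips the result of a predicate.
(def not-string? (complement string?))

(mapv not-string? ["a" 42])
;;=> [false true]
```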
partial
(partial op & args)
Accepts an operation and fewer than normal arguments, and returns a
new operation that can be called with the remaining unspecified
args. For example, given this require and defmapop:
(require '[cascalog.logic.ops :as c])
(defmapop plus [x y] (+ x y))
The following two forms are equivalent:
(let [plus-10 (c/partial plus 10)]
  (<- [?y] (src ?x) (plus-10 ?x :> ?y)))
(<- [?y] (src ?x) (plus 10 ?x :> ?y))
With the benefit that `10` doesn't need to be hardcoded into the
first query.
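The idea matches partial application in plain Clojure. A minimal sketch, assuming an ordinary function `plus` stands in for the `defmapop` above and using `clojure.core/partial` rather than the Cascalog op:

```clojure
;; Plain-Clojure analogy: fix the first argument, supply the rest later.
(defn plus [x y] (+ x y))

(def plus-10 (partial plus 10))

(plus-10 5)
;;=> 15
```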
re-parse
(re-parse pattern)
Accepts a regex `pattern` and returns an op that takes a string
argument `str` and returns the groups within `str` that match the
supplied `pattern`.