q vs python: Tables joins
(Series) A side-by-side comparison of q/kdb+ and python for qbies (q newbies).
This cheat sheet is a list of code snippets that do the same thing in q and python -- something I wished existed when I first came to q/KDB+ from python. Note that in both languages, the code shown is sometimes not the most elegant or succinct; that's on purpose because I wanted to show the same constructs in both languages. For a comprehensive intro to q, check out Q for Mortals.
Note q code snippets are edited to have line breaks for readability (my opinion; ymmv). Multiline statements do not work at the prompt, but they do work in scripts. See docs.
Versions: q (3.6), python (3.7)
Init*
q)t1:([]k1:`a`b`c; v1:til 3)
q)t2:([]k2:`a`X`c; v2:3+til 3)
q)t1
k1 v1
-----
a 0
b 1
c 2
q)t2
k2 v2
-----
a 3
X 4
c 5
|
>>> df = pd.DataFrame(dict(k1=list('abc'), v1=range(3)))
>>> dg = pd.DataFrame(dict(k2=list('aXc'), v2=range(3, 6)))
>>> df
k1 v1
--------
0 a 0
1 b 1
2 c 2
>>> dg
k2 v2
--------
0 a 3
1 X 4
2 c 5
|
Join and Concatenate*
// left-join
q)t1 lj `k1 xkey `k1 xcol t2
k1 v1 v2
--------
a 0 3
b 1
c 2 5
// concat side-by-side
q)t1 ,' t2
k1 v1 k2 v2
-----------
a 0 a 3
b 1 X 4
c 2 c 5
// concat one-below-the-other
q)t1, `k1`v1 xcol t2
k1 v1
-----
a 0
b 1
c 2
a 3
X 4
c 5
// concat side-by-side with dup col names
// (second table wins)
q)t1 ,' `k1`v1 xcol t2
k1 v1
-----
a 3
X 4
c 5
|
# left-join
>>> df.set_index('k1').join(dg.set_index('k2')).reset_index()
k1 v1 v2
-------------
0 a 0 3.0
1 b 1 NaN
2 c 2 5.0
# concat side-by-side
>>> pd.concat([df, dg], axis=1)
k1 v1 k2 v2
---------------
0 a 0 a 3
1 b 1 X 4
2 c 2 c 5
# concat one-below-the-other
>>> pd.concat([df, dg.rename(columns=dict(k2="k1", v2="v1"))],
... axis=0, ignore_index=True)
k1 v1
--------
0 a 0
1 b 1
2 c 2
3 a 3
4 X 4
5 c 5
# concat side-by-side with dup col names
# (output has duplicated col names; don't do this)
>>> pd.concat([df, dg.rename(columns=dict(k2="k1", v2="v1"))], axis=1)
k1 v1 k1 v1
---------------
0 a 0 a 3
1 b 1 X 4
2 c 2 c 5
|