JAX Quickstart — JAX documentation
JAX is NumPy on the CPU, GPU, and TPU, with great automatic differentiation for high-performance machine learning research.
With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy code. It can differentiate through a large subset of Python's features, including loops, ifs, recursion, and closures, and it can even take derivatives of derivatives of derivatives. It supports reverse-mode as well as forward-mode differentiation, and the two can be composed arbitrarily to any order.
What's new is that JAX uses XLA to compile and run your NumPy code on accelerators, like GPUs and TPUs. Compilation happens under the hood by default, with library calls getting just-in-time compiled and executed. But JAX even lets you just-in-time compile your own Python functions into XLA-optimized kernels using a one-function API. Compilation and automatic differentiation can be composed arbitrarily, so you can express sophisticated algorithms and get maximal performance without having to leave Python.
import jax.numpy as jnp
from jax import grad, jit, vmap
from jax import random
Multiplying Matrices
We'll be generating random data in the following examples. One big difference between NumPy and JAX is how you generate random numbers. For more details, see Common Gotchas in JAX.
key = random.PRNGKey(0)
x = random.normal(key, (10,))
print(x)
[-0.3721109  0.26423115 -0.18252768 -0.7368197 -0.44030377 -0.1521442
 -0.67135346 -0.5908641   0.73168886  0.5673026 ]
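Unlike NumPy's stateful global generator, JAX's random functions are pure: the same key always yields the same values, and you split keys to get independent streams. Here is a minimal sketch of that pattern (our own illustration, not part of the quickstart's code):

```python
import jax.numpy as jnp
from jax import random

key = random.PRNGKey(0)
# The same key always produces the same numbers...
a = random.normal(key, (3,))
b = random.normal(key, (3,))

# ...so to get fresh, independent draws, split the key instead of reusing it.
key, subkey = random.split(key)
c = random.normal(subkey, (3,))
```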
Let's dive right in and multiply two big matrices.
size = 3000
x = random.normal(key, (size, size), dtype=jnp.float32)
%timeit jnp.dot(x, x.T).block_until_ready()  # runs on the GPU
13.5 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
We added that block_until_ready because JAX uses asynchronous execution by default (see Asynchronous dispatch).
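To see why this matters, note that jnp.dot returns a future-like array before the computation has finished, so timing the call alone measures only dispatch. A rough sketch (our own example; absolute timings will vary by backend):

```python
import time
import jax.numpy as jnp
from jax import random

key = random.PRNGKey(0)
x = random.normal(key, (2000, 2000), dtype=jnp.float32)

t0 = time.perf_counter()
y = jnp.dot(x, x.T)            # returns as soon as the work is dispatched
dispatch_time = time.perf_counter() - t0

y.block_until_ready()          # blocks until the result is actually computed
total_time = time.perf_counter() - t0
```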
JAX NumPy functions work on regular NumPy arrays.
import numpy as np
x = np.random.normal(size=(size, size)).astype(np.float32)
%timeit jnp.dot(x, x.T).block_until_ready()
80 ms ± 30.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
That's slower because it has to transfer the data to the GPU every time. You can ensure that an NDArray is backed by device memory using device_put().
from jax import device_put
x = np.random.normal(size=(size, size)).astype(np.float32)
x = device_put(x)
%timeit jnp.dot(x, x.T).block_until_ready()
15.8 ms ± 113 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The output of device_put() still acts like an NDArray, but it only copies values back to the CPU when they're needed for printing, plotting, saving to disk, branching, and so on. The behavior of device_put() is equivalent to the function jit(lambda x: x), but it's faster.
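A small sketch of that transfer-once pattern, using a toy array of our own choosing:

```python
import numpy as np
import jax.numpy as jnp
from jax import device_put

x = np.arange(6.0, dtype=np.float32)
dx = device_put(x)             # one host-to-device transfer, up front
y = jnp.dot(dx, dx)            # subsequent ops use the device-resident copy
back = np.asarray(dx)          # values return to the host only when asked for
```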
If you have a GPU (or TPU!) these calls run on the accelerator and have the potential to be much faster than on CPU.
See Is JAX faster than NumPy? for more comparison of the performance characteristics of NumPy and JAX.
JAX is much more than just a GPU-backed NumPy. It also comes with a few program transformations that are useful when writing numerical code. For now, there are three main ones:

jit(), for speeding up your code
grad(), for taking derivatives
vmap(), for automatic vectorization or batching.

Let's go over these, one by one. We'll also end up composing these in interesting ways.
Using jit() to speed up functions
JAX runs transparently on the GPU or TPU (falling back to CPU if you don't have one). However, in the above example, JAX is dispatching kernels to the GPU one operation at a time. If we have a sequence of operations, we can use the @jit decorator to compile multiple operations together using XLA. Let's try that.
def selu(x, alpha=1.67, lmbda=1.05):
  return lmbda * jnp.where(x > 0, x, alpha * jnp.exp(x) - alpha)
x = random.normal(key, (1000000,))
%timeit selu(x).block_until_ready()
1.07 ms ± 261 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
We can speed it up with @jit, which will jit-compile the first time selu is called, after which it will be cached.
selu_jit = jit(selu)
%timeit selu_jit(x).block_until_ready()
127 µs ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
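One detail worth knowing: the compilation cache is keyed on the input's shape and dtype, so calling a jitted function with a new shape triggers a fresh trace and compile. A small illustrative sketch (double is a made-up example function):

```python
import jax.numpy as jnp
from jax import jit

@jit
def double(x):
    return x * 2.0

a = double(jnp.ones(3))        # first call: traced and compiled for shape (3,)
b = double(jnp.ones(3))        # same shape and dtype: reuses the cached executable
c = double(jnp.ones((2, 2)))   # new shape: traced and compiled again
```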
Taking derivatives with grad()
In addition to evaluating numerical functions, we also want to transform them. One transformation is automatic differentiation. In JAX, just like in Autograd, you can compute gradients with the grad() function.
def sum_logistic(x):
  return jnp.sum(1.0 / (1.0 + jnp.exp(-x)))
x_small = jnp.arange(3.)
derivative_fn = grad(sum_logistic)
print(derivative_fn(x_small))
[0.25 0.19661194 0.10499357]
Let's verify with finite differences that our result is correct.
def first_finite_differences(f, x):
  eps = 1e-3
  return jnp.array([(f(x + eps * v) - f(x - eps * v)) / (2 * eps)
                    for v in jnp.eye(len(x))])
print(first_finite_differences(sum_logistic, x_small))
[0.24998187 0.1965761 0.10502338]
Taking derivatives is as easy as calling grad(). grad() and jit() compose and can be mixed arbitrarily. In the above example we jitted sum_logistic and then took its derivative. We can go further:
print(grad(jit(grad(jit(grad(sum_logistic)))))(1.0))
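Relatedly (though not shown in the snippet above), jax.value_and_grad() returns a function's value together with its gradient in a single call, which is handy when you need both, e.g. for logging a loss while training:

```python
import jax.numpy as jnp
from jax import grad, value_and_grad

def sum_logistic(x):
    return jnp.sum(1.0 / (1.0 + jnp.exp(-x)))

x_small = jnp.arange(3.)
# value_and_grad computes the function value and its gradient in one call
value, grads = value_and_grad(sum_logistic)(x_small)
```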
For more advanced autodiff, you can use jax.vjp() for reverse-mode vector-Jacobian products and jax.jvp() for forward-mode Jacobian-vector products. The two can be composed arbitrarily with one another, and with other JAX transformations. Here's one way to compose them to make a function that efficiently computes full Hessian matrices:
from jax import jacfwd, jacrev
def hessian(fun):
  return jit(jacfwd(jacrev(fun)))
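As a quick sanity check (our own addition, not from the quickstart): applying this hessian to sum_logistic gives a 3x3 matrix, and because the function is a sum of elementwise terms, the off-diagonal entries are zero:

```python
import jax.numpy as jnp
from jax import jit, jacfwd, jacrev

def hessian(fun):
    return jit(jacfwd(jacrev(fun)))

def sum_logistic(x):
    return jnp.sum(1.0 / (1.0 + jnp.exp(-x)))

x_small = jnp.arange(3.)
# sum_logistic applies the logistic elementwise before summing,
# so its Hessian is diagonal
H = hessian(sum_logistic)(x_small)
```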
Auto-vectorization with vmap()
JAX has one more transformation in its API that you might find useful: vmap(), the vectorizing map. It has the familiar semantics of mapping a function along array axes, but instead of keeping the loop on the outside, it pushes the loop down into a function's primitive operations for better performance. When composed with jit(), it can be just as fast as adding the batch dimensions by hand.
We're going to work with a simple example, and promote matrix-vector products to matrix-matrix products using vmap(). Although this is easy to do by hand in this specific case, the same technique can apply to more complicated functions.
mat = random.normal(key, (150, 100))
batched_x = random.normal(key, (10, 100))
def apply_matrix(v):
  return jnp.dot(mat, v)
Given a function such as apply_matrix, we can loop over a batch dimension in Python, but usually the performance of doing so is poor.
def naively_batched_apply_matrix(v_batched):
  return jnp.stack([apply_matrix(v) for v in v_batched])
print('Naively batched')
%timeit naively_batched_apply_matrix(batched_x).block_until_ready()
Naively batched
3.12 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
We know how to batch this operation manually. In this case, jnp.dot handles extra batch dimensions transparently.
@jit
def batched_apply_matrix(v_batched):
  return jnp.dot(v_batched, mat.T)
print('Manually batched')
%timeit batched_apply_matrix(batched_x).block_until_ready()
Manually batched
45.6 µs ± 5.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
However, suppose we had a more complicated function without batching support. We can use vmap() to add batching support automatically.
@jit
def vmap_batched_apply_matrix(v_batched):
  return vmap(apply_matrix)(v_batched)
print('Autovectorized with vmap')
%timeit vmap_batched_apply_matrix(batched_x).block_until_ready()
Autovectorized with vmap
48.3 µs ± 1.06 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Of course, vmap() can be arbitrarily composed with jit(), grad(), and any other JAX transformation.
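For instance, one popular composition (our own illustrative example, with a made-up loss function) computes per-example gradients by wrapping grad in vmap and jitting the result:

```python
import jax.numpy as jnp
from jax import grad, jit, vmap

def loss(w, x):
    # a toy scalar loss for illustration
    return jnp.sum((w * x) ** 2)

w = jnp.ones(3)
xs = jnp.arange(12.).reshape(4, 3)   # a batch of 4 examples

# vmap(grad(loss)) maps the gradient over the batch axis of xs only
# (in_axes=(None, 0)), yielding one gradient of shape (3,) per example
per_example_grads = jit(vmap(grad(loss), in_axes=(None, 0)))(w, xs)
```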
This is just a taste of what JAX can do. We're really excited to see what you do with it!