We develop, train, and deploy TensorFlow models from R. But that doesn't mean we don't make use of documentation, blog posts, and examples written in Python. We look up specific functionality in the official TensorFlow API docs; we get inspiration from other people's code.
Depending on how comfortable you are with Python, there can be a problem. For example: You're supposed to know how broadcasting works. And perhaps you'd say you're vaguely familiar with it: So when arrays have different shapes, some elements get duplicated until their shapes match and ... and isn't R vectorized anyway?
While such a global notion may work in general, like when skimming a blog post, it's not enough to understand, say, examples in the TensorFlow API docs. In this post, we'll try to arrive at a more exact understanding, and check it on concrete examples.
Speaking of examples, here are two motivating ones.
Broadcasting in action
The first uses TensorFlow's matmul to multiply two tensors. Would you like to guess the result – not the numbers, but how it comes about in general? Does this even run without error – shouldn't matrices be two-dimensional (rank-2 tensors, in TensorFlow speak)?
a <- tf$constant(keras::array_reshape(1:12, dim = c(2, 2, 3)))
a
# tf.Tensor(
# [[[ 1. 2. 3.]
# [ 4. 5. 6.]]
#
# [[ 7. 8. 9.]
# [10. 11. 12.]]], shape=(2, 2, 3), dtype=float64)
b <- tf$constant(keras::array_reshape(101:106, dim = c(1, 3, 2)))
b
# tf.Tensor(
# [[[101. 102.]
# [103. 104.]
# [105. 106.]]], shape=(1, 3, 2), dtype=float64)
c <- tf$matmul(a, b)
Second, here is a "real example" from a TensorFlow Probability (TFP) GitHub issue. (Translated to R, but keeping the semantics.)
In TFP, we can have batches of distributions. That, per se, is not surprising. But look at this:
library(tfprobability)
d <- tfd_normal(loc = c(0, 1), scale = matrix(1.5:4.5, ncol = 2, byrow = TRUE))
d
# tfp.distributions.Normal("Normal", batch_shape=[2, 2], event_shape=[], dtype=float64)
We create a batch of four normal distributions: each with a different scale (1.5, 2.5, 3.5, 4.5). But wait: there are only two location parameters given. So what are their scales, respectively?
Luckily, TFP developers Brian Patton and Chris Suter explained how it works: TFP actually does broadcasting – with distributions – just like with tensors!
We get back to both examples at the end of this post. Our main focus will be to explain broadcasting as done in NumPy, since NumPy-style broadcasting is what numerous other frameworks have adopted (e.g., TensorFlow).
Before that though, let's quickly review a few basics about NumPy arrays: how to index or slice them (indexing generally referring to single-element extraction, while slicing would yield – well – slices containing several elements); how to parse their shapes; some terminology and related background.
Though not complicated per se, these are the kinds of things that can be confusing to infrequent Python users; yet they are often a prerequisite to successfully applying Python documentation.
Stated upfront, we'll really restrict ourselves to the basics here; for example, we won't touch advanced indexing, which – just like lots more – can be looked up in detail in the NumPy documentation.
A few facts about NumPy
Basic slicing
For simplicity, we'll use the terms indexing and slicing more or less synonymously from now on. The basic device here is a slice, namely, a start:stop structure indicating, for a single dimension, which range of elements to include in the selection.
In contrast to R, Python indexing is zero-based, and the end index is exclusive:
import numpy as np
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[1:7]
# array([1, 2, 3, 4, 5, 6])
Minus, to R users, is a false friend; it means we start counting from the end (the last element being -1).
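A quick illustration (our own addition, reusing x from above):
x[-1]
# 9
x[2:-2]
# array([2, 3, 4, 5, 6, 7])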
Leaving out start (stop, resp.) selects all elements from the start (until the end).
This may feel so convenient that Python users might miss it in R:
x[5:]
# array([5, 6, 7, 8, 9])
x[:7]
# array([0, 1, 2, 3, 4, 5, 6])
Just to make a point about the syntax, we could omit both the start and the stop indices, in this one-dimensional case effectively resulting in a no-op:
x[:]
# array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Going on to two dimensions – without commenting on array creation just yet – we can immediately apply the "colon trick" here too. This will select the second row with all its columns:
x = np.array([[1, 2], [3, 4], [5, 6]])
x
# array([[1, 2],
# [3, 4],
# [5, 6]])
x[1, :]
# array([3, 4])
While this, arguably, is the most readable way to achieve that result and thus would be the way you'd write it yourself, it's good to know that these are two alternative ways that do the same:
x[1]
# array([3, 4])
x[1, ]
# array([3, 4])
While the second sure looks a bit like R, the mechanism is different. Technically, these start:stop things are components of a Python tuple – that list-like, but immutable data structure that can be written with or without parentheses, e.g., 1,2 or (1,2) – and whenever we have more dimensions in the array than elements in the tuple, NumPy will assume we meant : for that dimension: just select everything.
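To make that tuple explicit (a small illustration of our own, not part of the original examples):
idx = (1,)   # a one-element tuple
x[idx]       # equivalent to x[1] and x[1, ]
# array([3, 4])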
We can see that in action, moving on to three dimensions. Here is a 2 x 3 x 1-dimensional array:
x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
x
# array([[[1],
# [2],
# [3]],
#
# [[4],
# [5],
# [6]]])
x.shape
# (2, 3, 1)
In R, this would throw an error, while in Python it works:
x[0,]
# array([[1],
#  [2],
#  [3]])
In such a case, for enhanced readability, we could instead use the so-called Ellipsis, explicitly asking Python to "use up all dimensions required to make this work":
x[0, ...]
# array([[1],
#  [2],
#  [3]])
We stop here with our selection of essential (but possibly confusing to infrequent Python users) NumPy indexing features; regarding "possibly confusing" though, here are a few remarks about array creation.
Syntax for array creation
Creating a higher-dimensional NumPy array is not that hard – depending on how you do it. The trick is to use reshape to tell NumPy exactly what shape you want. For example, to create an array of all zeros, of dimensions 3 x 4 x 2:
np.zeros(24).reshape(3, 4, 2)
But we also want to understand what others might write. And then, you might see things like these:
c1 = np.array([[[0, 0, 0]]])
c2 = np.array([[[0], [0], [0]]])
c3 = np.array([[[0]], [[0]], [[0]]])
These are all three-dimensional, and all have three elements, so their shapes have to be 1 x 1 x 3, 1 x 3 x 1, and 3 x 1 x 1, in some order. Of course, shape is there to tell us:
c1.shape # (1, 1, 3)
c2.shape # (1, 3, 1)
c3.shape # (3, 1, 1)
But we'd like to be able to "parse" them internally, without executing the code. One way to think about it would be to process the brackets like a state machine, every opening bracket moving one axis to the right and every closing bracket moving back left by one axis. Let us know if you can think of other – possibly more helpful – mnemonics!
In the last sentence, we on purpose used "left" and "right" referring to the array axes; "out there" though, you'll also hear "outmost" and "innermost". Which, then, is which?
A bit of terminology
In common Python (TensorFlow, for example) usage, when talking of an array shape like (2, 6, 7), outmost is left and innermost is right. Why?
Let's take a simpler, two-dimensional example of shape (2, 3).
a = np.array([[1, 2, 3], [4, 5, 6]])
a
# array([[1, 2, 3],
# [4, 5, 6]])
Computer memory is conceptually one-dimensional, a sequence of locations; so when we create arrays in a high-level programming language, their contents are effectively "flattened" into a vector. That flattening could happen "by row" (row-major, C-style, the default in NumPy), resulting in the above array ending up like this
1 2 3 4 5 6
or "by column" (column-major, Fortran-style, the ordering used in R), yielding
1 4 2 5 3 6
for the above example.
Now if we see "outmost" as the axis whose index varies least often, and "innermost" as the one that changes most quickly, then in row-major ordering the left axis is "outer", and the right one is "inner".
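If you want to see both orderings concretely, NumPy's ravel can flatten with either convention (a quick check we added; order='C' is the default):
a.ravel(order='C')  # row-major flattening
# array([1, 2, 3, 4, 5, 6])
a.ravel(order='F')  # column-major flattening
# array([1, 4, 2, 5, 3, 6])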
Just as a (cool!) aside, NumPy arrays have an attribute called strides that stores how many bytes have to be traversed, for each axis, to arrive at its next element. For our above examples:
c1 = np.array([[[0, 0, 0]]])
c1.shape # (1, 1, 3)
c1.strides # (24, 24, 8)
c2 = np.array([[[0], [0], [0]]])
c2.shape # (1, 3, 1)
c2.strides # (24, 8, 8)
c3 = np.array([[[0]], [[0]], [[0]]])
c3.shape # (3, 1, 1)
c3.strides # (8, 8, 8)
For array c3, every element is on its own at the outmost level; so for axis 0, to jump from one element to the next, it's just 8 bytes. For c2 and c1 though, everything is "squished" into the first element of axis 0 (there is just a single element there). So if we wanted to jump to another, not-yet-existing, outmost item, it would take us 3 * 8 = 24 bytes.
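To connect strides back to the row-major picture, here is one more example we added: a contiguous 2 x 3 array of float64 values (8 bytes each), where moving to the next row means skipping a whole row of 3 * 8 = 24 bytes:
m = np.array([[1., 2., 3.], [4., 5., 6.]])
m.shape    # (2, 3)
m.strides  # (24, 8)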
At this point, we're ready to talk about broadcasting. We first stay with NumPy and then examine some TensorFlow examples.
NumPy Broadcasting
What happens if we add a scalar to an array? This won't be surprising for R users:
a = np.array([1,2,3])
b = 1
a + b
# array([2, 3, 4])
Technically, this is already broadcasting in action; b is virtually (not physically!) expanded to shape (3,) in order to match the shape of a.
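If you'd like to see that virtual expansion made explicit, np.broadcast_to returns the broadcast view (read-only, no data is copied) – a small aside we added:
np.broadcast_to(b, (3,))
# array([1, 1, 1])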
How about two arrays, one of shape (2, 3) – two rows, three columns – the other one-dimensional, of shape (3,)?
a = np.array([1,2,3])
b = np.array([[1,2,3], [4,5,6]])
a + b
# array([[2, 4, 6],
#        [5, 7, 9]])
The one-dimensional array gets added to both rows. If a were length-two instead, would it get added to every column?
a = np.array([1,2])
b = np.array([[1,2,3], [4,5,6]])
a + b
# ValueError: operands could not be broadcast together with shapes (2,) (2,3)
So now it's time for the broadcasting rule. For broadcasting (virtual expansion) to happen, the following is required.
1. We align array shapes, starting from the right.
# array 1, shape:     8  1  6  1
# array 2, shape:        7  1  5
2. Starting to look from the right, the sizes along aligned axes either have to match exactly, or one of them has to be 1: in which case the latter is broadcast to the one not equal to 1.
3. If, on the left, one of the arrays has an additional axis (or more than one), the other is virtually expanded to have a 1 in that place, in which case broadcasting will happen as stated in (2).
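As a quick check of these rules, using the shapes from the alignment example above (this snippet is our own addition):
a = np.zeros((8, 1, 6, 1))
b = np.zeros((7, 1, 5))
(a + b).shape
# (8, 7, 6, 5)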
Stated like this, it probably sounds extremely simple. Maybe it is, and it only seems complicated because it presupposes correctly parsing array shapes (which, as shown above, can be confusing)?
Here again is a quick example to test our understanding:
a = np.zeros([2, 3]) # shape (2, 3)
b = np.zeros([2]) # shape (2,)
c = np.zeros([3]) # shape (3,)
a + b # error
a + c
# array([[0., 0., 0.],
#        [0., 0., 0.]])
All in accord with the rules. Maybe there's something else that makes it confusing?
From linear algebra, we are used to thinking in terms of column vectors (often seen as the default) and row vectors (accordingly, seen as their transposes). What, then, is an array of shape – as we've seen a few times by now – (2,)? Really it is neither; it is just some one-dimensional array structure. We can create row vectors and column vectors though, in the sense of 1 x n and n x 1 matrices, by explicitly adding a second axis. Any of these would create a column vector:
# start with the above "non-vector"
c = np.array([0, 0])
c.shape
# (2,)
# way 1: reshape
c.reshape(2, 1).shape
# (2, 1)
# np.newaxis inserts a new axis
c[:, np.newaxis].shape
# (2, 1)
# None does the same
c[:, None].shape
# (2, 1)
# or construct directly as (2, 1), paying attention to the parentheses...
c = np.array([[0], [0]])
c.shape
# (2, 1)
And analogously for row vectors. Now these "more explicit", to a human reader, shapes should make it easier to assess where broadcasting will work, and where it won't.
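Before we check that, for completeness, here is how a row vector could be obtained along the same lines (our own quick sketch):
c = np.array([0, 0])
c.reshape(1, 2).shape
# (1, 2)
c[np.newaxis, :].shape
# (1, 2)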
c = np.array([[0], [0]])
c.shape
# (2, 1)
a = np.zeros([2, 3])
a.shape
# (2, 3)
a + c
# array([[0., 0., 0.],
# [0., 0., 0.]])
a = np.zeros([3, 2])
a.shape
# (3, 2)
a + c
# ValueError: operands could not be broadcast together with shapes (3,2) (2,1)
Before we jump to TensorFlow, let's look at a simple practical application: computing an outer product.
a = np.array([0.0, 10.0, 20.0, 30.0])
a.shape
# (4,)
b = np.array([1.0, 2.0, 3.0])
b.shape
# (3,)
a[:, np.newaxis] * b
# array([[ 0., 0., 0.],
# [10., 20., 30.],
# [20., 40., 60.],
# [30., 60., 90.]])
TensorFlow
If by now you're feeling less than enthusiastic about hearing a detailed exposition of how TensorFlow broadcasting differs from NumPy's, there is good news: Basically, the rules are the same. However, when matrix operations work on batches – as in the case of matmul and friends – things can get complicated; the best advice here probably is to carefully read the documentation (and, as always, try things out).
Before revisiting our introductory matmul example, we quickly verify that, really, things work just like in NumPy. Thanks to the tensorflow R package, there is no reason to do this in Python; so at this point, we switch to R – attention, it's 1-based indexing from here on.
First check – (4, 1) added to (4,) should yield (4, 4):
a <- tf$ones(shape = c(4L, 1L))
a
# tf.Tensor(
# [[1.]
# [1.]
# [1.]
# [1.]], shape=(4, 1), dtype=float32)
b <- tf$constant(c(1, 2, 3, 4))
b
# tf.Tensor([1. 2. 3. 4.], shape=(4,), dtype=float32)
a + b
# tf.Tensor(
# [[2. 3. 4. 5.]
# [2. 3. 4. 5.]
# [2. 3. 4. 5.]
# [2. 3. 4. 5.]], form=(4, 4), dtype=float32)And second, once we add tensors with shapes (3, 3) and (3,), the 1-d tensor ought to get added to each row (not each column):
a <- tf$constant(matrix(1:9, ncol = 3, byrow = TRUE), dtype = tf$float32)
a
# tf.Tensor(
# [[1. 2. 3.]
# [4. 5. 6.]
# [7. 8. 9.]], shape=(3, 3), dtype=float32)
b <- tf$constant(c(100, 200, 300))
b
# tf.Tensor([100. 200. 300.], shape=(3,), dtype=float32)
a + b
# tf.Tensor(
# [[101. 202. 303.]
# [104. 205. 306.]
# [107. 208. 309.]], shape=(3, 3), dtype=float32)
Now back to the initial matmul example.
Back to the puzzles
The documentation for matmul says:
The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication dimensions, and any further outer dimensions specify matching batch size.
So here (see code just below), the inner two dimensions look good – (2, 3) and (3, 2) – while the single (one and only, in this case) batch dimension shows mismatching values 2 and 1, respectively.
A case for broadcasting thus: Both "batches" of a get matrix-multiplied with b.
a <- tf$constant(keras::array_reshape(1:12, dim = c(2, 2, 3)))
a
# tf.Tensor(
# [[[ 1. 2. 3.]
# [ 4. 5. 6.]]
#
# [[ 7. 8. 9.]
# [10. 11. 12.]]], shape=(2, 2, 3), dtype=float64)
b <- tf$constant(keras::array_reshape(101:106, dim = c(1, 3, 2)))
b
# tf.Tensor(
# [[[101. 102.]
# [103. 104.]
# [105. 106.]]], shape=(1, 3, 2), dtype=float64)
c <- tf$matmul(a, b)
c
# tf.Tensor(
# [[[ 622. 628.]
# [1549. 1564.]]
#
# [[2476. 2500.]
# [3403. 3436.]]], shape=(2, 2, 2), dtype=float64)
Let's quickly check this really is what happens, by multiplying both batches separately:
tf$matmul(a[1, , ], b)
# tf.Tensor(
# [[[ 622. 628.]
# [1549. 1564.]]], shape=(1, 2, 2), dtype=float64)
tf$matmul(a[2, , ], b)
# tf.Tensor(
# [[[2476. 2500.]
# [3403. 3436.]]], shape=(1, 2, 2), dtype=float64)
Is it too weird to be wondering if broadcasting would also happen for matrix dimensions? E.g., could we try matmul-ing tensors of shapes (2, 4, 1) and (2, 3, 1), where the 4 x 1 matrix would be broadcast to 4 x 3? – A quick test shows that no.
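Here is a sketch of such a check (our own; we don't reproduce the exact error text, which may vary across TensorFlow versions):
a2 <- tf$constant(keras::array_reshape(1:8, dim = c(2, 4, 1)))
b2 <- tf$constant(keras::array_reshape(1:6, dim = c(2, 3, 1)))
# tf$matmul(a2, b2)  # errors: the inner dimensions (1 and 3) don't match,
#                    # so there is no broadcasting over matrix dimensions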
To see how, really, when dealing with TensorFlow operations, it pays off to overcome one's initial reluctance and actually consult the documentation, let's try another one.
In the documentation for matvec, we are told:
Multiplies matrix a by vector b, producing a * b.
The matrix a must, following any transpositions, be a tensor of rank >= 2, with shape(a)[-1] == shape(b)[-1], and shape(a)[:-2] able to broadcast with shape(b)[:-1].
In our understanding, given input tensors of shapes (2, 2, 3) and (2, 3), matvec should perform two matrix-vector multiplications: once for every batch, as indexed by each input's leftmost dimension. Let's check this – so far, there is no broadcasting involved:
# two matrices
a <- tf$constant(keras::array_reshape(1:12, dim = c(2, 2, 3)))
a
# tf.Tensor(
# [[[ 1. 2. 3.]
# [ 4. 5. 6.]]
#
# [[ 7. 8. 9.]
# [10. 11. 12.]]], shape=(2, 2, 3), dtype=float64)
b = tf$constant(keras::array_reshape(101:106, dim = c(2, 3)))
b
# tf.Tensor(
# [[101. 102. 103.]
# [104. 105. 106.]], shape=(2, 3), dtype=float64)
c <- tf$linalg$matvec(a, b)
c
# tf.Tensor(
# [[ 614. 1532.]
# [2522. 3467.]], shape=(2, 2), dtype=float64)
Double-checking, we manually multiply the corresponding matrices and vectors, and get:
tf$linalg$matvec(a[1, , ], b[1, ])
# tf.Tensor([ 614. 1532.], shape=(2,), dtype=float64)
tf$linalg$matvec(a[2, , ], b[2, ])
# tf.Tensor([2522. 3467.], shape=(2,), dtype=float64)
The same. Now, do we see broadcasting if b has just a single batch?
b = tf$constant(keras::array_reshape(101:103, dim = c(1, 3)))
b
# tf.Tensor([[101. 102. 103.]], shape=(1, 3), dtype=float64)
c <- tf$linalg$matvec(a, b)
c
# tf.Tensor(
# [[ 614. 1532.]
# [2450. 3368.]], shape=(2, 2), dtype=float64)
Multiplying every batch of a with b, for comparison:
tf$linalg$matvec(a[1, , ], b)
# tf.Tensor([ 614. 1532.], shape=(2,), dtype=float64)
tf$linalg$matvec(a[2, , ], b)
# tf.Tensor([[2450. 3368.]], shape=(1, 2), dtype=float64)
It worked!
Now, on to the other motivating example, using tfprobability.
Broadcasting everywhere
Here again is the setup:
library(tfprobability)
d <- tfd_normal(loc = c(0, 1), scale = matrix(1.5:4.5, ncol = 2, byrow = TRUE))
d
# tfp.distributions.Normal("Normal", batch_shape=[2, 2], event_shape=[], dtype=float64)
What is going on? Let's inspect location and scale separately:
d$loc
# tf.Tensor([0. 1.], shape=(2,), dtype=float64)
d$scale
# tf.Tensor(
# [[1.5 2.5]
# [3.5 4.5]], shape=(2, 2), dtype=float64)
Just focusing on these tensors and their shapes, and having been told that there is broadcasting going on, we can reason like this: Aligning both shapes on the right and extending loc's shape by 1 (on the left), we have (1, 2), which can be broadcast with (2, 2) – in matrix-speak, loc is treated as a row and duplicated.
Meaning: We have two distributions with mean 0 (one of scale 1.5, the other of scale 3.5), and also two with mean 1 (corresponding scales being 2.5 and 4.5).
Here's a more direct way to see this:
d$mean()
# tf.Tensor(
# [[0. 1.]
# [0. 1.]], shape=(2, 2), dtype=float64)
d$stddev()
# tf.Tensor(
# [[1.5 2.5]
# [3.5 4.5]], shape=(2, 2), dtype=float64)
Puzzle solved!
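As a final check of our understanding (our own variation, not from the GitHub issue): if we wanted each location to pair with a whole row of scales instead, we could pass loc as a column, i.e., with shape (2, 1):
d2 <- tfd_normal(loc = matrix(c(0, 1), ncol = 1),
                 scale = matrix(1.5:4.5, ncol = 2, byrow = TRUE))
d2$mean()
# now row one should be all 0s, and row two all 1s:
# tf.Tensor(
# [[0. 0.]
#  [1. 1.]], shape=(2, 2), dtype=float64)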
Summing up, broadcasting is simple "in theory" (its rules are), but may need some practice to get right. Especially in conjunction with the fact that functions / operators have their own views on which parts of their inputs should broadcast, and which shouldn't. Really, there is no way around looking up the actual behaviors in the documentation.
Hopefully though, you've found this post to be a good start into the topic. Maybe, like the author, you now feel like you might see broadcasting going on anywhere in the world. Thanks for reading!