x, y : pl.Tensor
    Input tensors (same shape or broadcastable).
output : pl.Tensor
    Pre-allocated output tensor.
tile_op : callable(tile_a, tile_b) -> tile ...
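
To illustrate how these parameters could fit together, here is a minimal sketch that uses NumPy arrays as a stand-in for `pl.Tensor`. The function name `tiled_binary_op`, the `tile_size` parameter, and the choice to tile along the first axis only are assumptions made for illustration; they are not taken from the library itself, and `output` is assumed to be at least 1-D.

```python
import numpy as np


def tiled_binary_op(x, y, output, tile_op, tile_size=4):
    """Hypothetical helper: apply tile_op tile by tile, writing into output.

    NumPy arrays stand in for pl.Tensor here (an assumption).
    """
    # Broadcast both inputs to the shape of the pre-allocated output.
    # np.broadcast_to returns read-only views, so no data is copied.
    xb = np.broadcast_to(x, output.shape)
    yb = np.broadcast_to(y, output.shape)

    # Walk the first axis in chunks of tile_size (the last tile may be smaller).
    n = output.shape[0]
    for start in range(0, n, tile_size):
        stop = min(start + tile_size, n)
        # tile_op receives one tile from each input and returns the result tile.
        output[start:stop] = tile_op(xb[start:stop], yb[start:stop])
    return output


# Usage: a (10, 1) input broadcast against a (1, 3) input.
x = np.arange(10, dtype=np.float32).reshape(10, 1)
y = np.ones((1, 3), dtype=np.float32)
out = np.empty((10, 3), dtype=np.float32)
tiled_binary_op(x, y, out, lambda tile_a, tile_b: tile_a + tile_b)
```

Writing into a pre-allocated `output` avoids allocating a fresh result on each call, which is presumably why the signature takes it as an argument rather than returning a new tensor.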