torch
The torch package contains data structures for multi-dimensional tensors and defines mathematical operations over these tensors. Additionally, it provides many utilities for efficient serialization of tensors and arbitrary types, along with other useful tools.
It has a CUDA counterpart that enables you to run your tensor computations on an NVIDIA GPU with compute capability >= 3.0.
Tensors
- 
torch.is_tensor(obj)[source]
Returns True if obj is a PyTorch tensor.
Note that this function is simply doing isinstance(obj, Tensor). Using that isinstance check is better for typechecking with mypy, and more explicit, so it is recommended instead of is_tensor.
- Parameters
 obj (Object) – Object to test
- 
torch.is_storage(obj)[source]
Returns True if obj is a PyTorch storage object.
- Parameters
 obj (Object) – Object to test
- 
torch.set_default_dtype(d)[source]
Sets the default floating point dtype to d. This dtype is:
- The inferred dtype for Python floats in torch.tensor().
- Used to infer the dtype for Python complex numbers. The default complex dtype is set to torch.complex128 if the default floating point dtype is torch.float64, otherwise it is set to torch.complex64.
The default floating point dtype is initially torch.float32.
- Parameters
 d (torch.dtype) – the floating point dtype to make the default
Example
>>> # initial default for floating point is torch.float32
>>> torch.tensor([1.2, 3]).dtype
torch.float32
>>> # initial default for complex is torch.complex64
>>> torch.tensor([1.2, 3j]).dtype
torch.complex64
>>> torch.set_default_dtype(torch.float64)
>>> torch.tensor([1.2, 3]).dtype  # a new floating point tensor
torch.float64
>>> torch.tensor([1.2, 3j]).dtype  # a new complex tensor
torch.complex128
- 
torch.get_default_dtype() → torch.dtype
Get the current default floating point torch.dtype.
Example:
>>> torch.get_default_dtype()  # initial default for floating point is torch.float32
torch.float32
>>> torch.set_default_dtype(torch.float64)
>>> torch.get_default_dtype()  # default is now changed to torch.float64
torch.float64
>>> torch.set_default_tensor_type(torch.FloatTensor)  # setting tensor type also affects this
>>> torch.get_default_dtype()  # changed to torch.float32, the dtype for torch.FloatTensor
torch.float32
- 
torch.set_default_tensor_type(t)[source]
Sets the default torch.Tensor type to floating point tensor type t. This type will also be used as the default floating point type for type inference in torch.tensor().
The default floating point tensor type is initially torch.FloatTensor.
- Parameters
 t (type or string) – the floating point tensor type or its name
Example:
>>> torch.tensor([1.2, 3]).dtype  # initial default for floating point is torch.float32
torch.float32
>>> torch.set_default_tensor_type(torch.DoubleTensor)
>>> torch.tensor([1.2, 3]).dtype  # a new floating point tensor
torch.float64
- 
torch.numel(input) → int
Returns the total number of elements in the input tensor.
- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(1, 2, 3, 4, 5)
>>> torch.numel(a)
120
>>> a = torch.zeros(4, 4)
>>> torch.numel(a)
16
- 
torch.set_printoptions(precision=None, threshold=None, edgeitems=None, linewidth=None, profile=None, sci_mode=None)[source]
Set options for printing. Items shamelessly taken from NumPy.
- Parameters
 precision – Number of digits of precision for floating point output (default = 4).
threshold – Total number of array elements which trigger summarization rather than full repr (default = 1000).
edgeitems – Number of array items in summary at beginning and end of each dimension (default = 3).
linewidth – The number of characters per line for the purpose of inserting line breaks (default = 80). Thresholded matrices will ignore this parameter.
profile – Sane defaults for pretty printing. Can be overridden with any of the above options. (any one of default, short, full)
sci_mode – Enable (True) or disable (False) scientific notation. If None (default) is specified, the value is defined by torch._tensor_str._Formatter. This value is automatically chosen by the framework.
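No usage example appears here in the source; the following is a brief illustrative sketch (the exact rendering of printed tensors may differ slightly across versions):
Example:
>>> torch.set_printoptions(precision=2)
>>> torch.tensor([1.23456789])
tensor([1.23])
>>> torch.set_printoptions(profile='default')  # restore the defaults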
- 
torch.set_flush_denormal(mode) → bool
Disables denormal floating point numbers on the CPU.
Returns True if your system supports flushing denormal numbers and it successfully configures flush denormal mode. set_flush_denormal() is only supported on x86 architectures supporting SSE3.
- Parameters
 mode (bool) – Controls whether to enable flush denormal mode or not
Example:
>>> torch.set_flush_denormal(True)
True
>>> torch.tensor([1e-323], dtype=torch.float64)
tensor([ 0.], dtype=torch.float64)
>>> torch.set_flush_denormal(False)
True
>>> torch.tensor([1e-323], dtype=torch.float64)
tensor(9.88131e-324 *
       [ 1.0000], dtype=torch.float64)
Creation Ops
Note
Random sampling creation ops are listed under Random sampling and
include:
torch.rand()
torch.rand_like()
torch.randn()
torch.randn_like()
torch.randint()
torch.randint_like()
torch.randperm()
You may also use torch.empty() with the In-place random sampling
methods to create torch.Tensor objects with values sampled from a broader
range of distributions, as in the sketch below.
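For instance (an illustrative sketch; the sampled values are random, so the output shown is only indicative):
>>> torch.empty(3).exponential_()  # in-place draw from Exponential(1)
tensor([0.3623, 1.4657, 0.0510])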
- 
torch.tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor
Constructs a tensor with data.
Warning
torch.tensor() always copies data. If you have a Tensor data and want to avoid a copy, use torch.Tensor.requires_grad_() or torch.Tensor.detach(). If you have a NumPy ndarray and want to avoid a copy, use torch.as_tensor().
Warning
When data is a tensor x, torch.tensor() reads out "the data" from whatever it is passed, and constructs a leaf variable. Therefore torch.tensor(x) is equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) is equivalent to x.clone().detach().requires_grad_(True). The equivalents using clone() and detach() are recommended.
- Parameters
 data (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, infers data type from data.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
Example:
>>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
tensor([[ 0.1000,  1.2000],
        [ 2.2000,  3.1000],
        [ 4.9000,  5.2000]])
>>> torch.tensor([0, 1])  # Type inference on data
tensor([ 0,  1])
>>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
...              dtype=torch.float64,
...              device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')
>>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
tensor(3.1416)
>>> torch.tensor([])  # Create an empty tensor (of size (0,))
tensor([])
- 
torch.sparse_coo_tensor(indices, values, size=None, *, dtype=None, device=None, requires_grad=False) → Tensor
Constructs a sparse tensor in COO(rdinate) format with specified values at the given indices.
Note
This function returns an uncoalesced tensor.
- Parameters
 indices (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types. Will be cast to a torch.LongTensor internally. The indices are the coordinates of the non-zero values in the matrix, and thus should be two-dimensional where the first dimension is the number of tensor dimensions and the second dimension is the number of non-zero values.
 values (array_like) – Initial values for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
 size (list, tuple, or torch.Size, optional) – Size of the sparse tensor. If not provided, the size will be inferred as the minimum size big enough to hold all non-zero elements.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, infers data type from values.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> i = torch.tensor([[0, 1, 1],
...                   [2, 0, 2]])
>>> v = torch.tensor([3, 4, 5], dtype=torch.float32)
>>> torch.sparse_coo_tensor(i, v, [2, 4])
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3., 4., 5.]),
       size=(2, 4), nnz=3, layout=torch.sparse_coo)
>>> torch.sparse_coo_tensor(i, v)  # Shape inference
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3., 4., 5.]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
>>> torch.sparse_coo_tensor(i, v, [2, 4],
...                         dtype=torch.float64,
...                         device=torch.device('cuda:0'))
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3., 4., 5.]),
       device='cuda:0', size=(2, 4), nnz=3, dtype=torch.float64,
       layout=torch.sparse_coo)

# Create an empty sparse tensor with the following invariants:
#   1. sparse_dim + dense_dim = len(SparseTensor.shape)
#   2. SparseTensor._indices().shape = (sparse_dim, nnz)
#   3. SparseTensor._values().shape = (nnz, SparseTensor.shape[sparse_dim:])
#
# For instance, to create an empty sparse tensor with nnz = 0, dense_dim = 0 and
# sparse_dim = 1 (hence indices is a 2D tensor of shape = (1, 0))
>>> S = torch.sparse_coo_tensor(torch.empty([1, 0]), [], [1])
tensor(indices=tensor([], size=(1, 0)),
       values=tensor([], size=(0,)),
       size=(1,), nnz=0, layout=torch.sparse_coo)

# and to create an empty sparse tensor with nnz = 0, dense_dim = 1 and
# sparse_dim = 1
>>> S = torch.sparse_coo_tensor(torch.empty([1, 0]), torch.empty([0, 2]), [1, 2])
tensor(indices=tensor([], size=(1, 0)),
       values=tensor([], size=(0, 2)),
       size=(1, 2), nnz=0, layout=torch.sparse_coo)
- 
torch.as_tensor(data, dtype=None, device=None) → Tensor
Converts the data into a torch.Tensor. If the data is already a Tensor with the same dtype and device, no copy will be performed; otherwise a new Tensor will be returned, with the computational graph retained if the data Tensor has requires_grad=True. Similarly, if the data is an ndarray of the corresponding dtype and the device is the CPU, no copy will be performed.
- Parameters
 data (array_like) – Initial data for the tensor. Can be a list, tuple, NumPy ndarray, scalar, and other types.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, infers data type from data.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
Example:
>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a)
>>> t
tensor([ 1,  2,  3])
>>> t[0] = -1
>>> a
array([-1,  2,  3])

>>> a = numpy.array([1, 2, 3])
>>> t = torch.as_tensor(a, device=torch.device('cuda'))
>>> t
tensor([ 1,  2,  3])
>>> t[0] = -1
>>> a
array([1, 2, 3])
- 
torch.from_numpy(ndarray) → Tensor
Creates a Tensor from a numpy.ndarray.
The returned tensor and ndarray share the same memory. Modifications to the tensor will be reflected in the ndarray and vice versa. The returned tensor is not resizable.
It currently accepts ndarray with dtypes of numpy.float64, numpy.float32, numpy.float16, numpy.complex64, numpy.complex128, numpy.int64, numpy.int32, numpy.int16, numpy.int8, numpy.uint8, and numpy.bool.
Example:
>>> a = numpy.array([1, 2, 3])
>>> t = torch.from_numpy(a)
>>> t
tensor([ 1,  2,  3])
>>> t[0] = -1
>>> a
array([-1,  2,  3])
- 
torch.zeros(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with the scalar value 0, with the shape defined by the variable argument size.
- Parameters
 size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.zeros(2, 3)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])
>>> torch.zeros(5)
tensor([ 0.,  0.,  0.,  0.,  0.])
- 
torch.zeros_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor filled with the scalar value 0, with the same size as input. torch.zeros_like(input) is equivalent to torch.zeros(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).
Warning
As of 0.4, this function does not support an out keyword. As an alternative, the old torch.zeros_like(input, out=output) is equivalent to torch.zeros(input.size(), out=output).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
Example:
>>> input = torch.empty(2, 3)
>>> torch.zeros_like(input)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])
- 
torch.ones(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with the scalar value 1, with the shape defined by the variable argument size.
- Parameters
 size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.ones(2, 3)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
>>> torch.ones(5)
tensor([ 1.,  1.,  1.,  1.,  1.])
- 
torch.ones_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor filled with the scalar value 1, with the same size as input. torch.ones_like(input) is equivalent to torch.ones(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).
Warning
As of 0.4, this function does not support an out keyword. As an alternative, the old torch.ones_like(input, out=output) is equivalent to torch.ones(input.size(), out=output).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
Example:
>>> input = torch.empty(2, 3)
>>> torch.ones_like(input)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
- 
torch.arange(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a 1-D tensor of size \(\left\lceil \frac{\text{end} - \text{start}}{\text{step}} \right\rceil\) with values from the interval [start, end) taken with common difference step beginning from start.
Note that non-integer step is subject to floating point rounding errors when comparing against end; to avoid inconsistency, we advise adding a small epsilon to end in such cases.
\[\text{out}_{i+1} = \text{out}_{i} + \text{step}\]
- Parameters
 start (Number) – the starting value for the set of points. Default: 0.
 end (Number) – the ending value for the set of points
 step (Number) – the gap between each pair of adjacent points. Default: 1.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()). If dtype is not given, infer the data type from the other input arguments. If any of start, end, or step are floating-point, the dtype is inferred to be the default dtype, see get_default_dtype(). Otherwise, the dtype is inferred to be torch.int64.
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.arange(5)
tensor([ 0,  1,  2,  3,  4])
>>> torch.arange(1, 4)
tensor([ 1,  2,  3])
>>> torch.arange(1, 2.5, 0.5)
tensor([ 1.0000,  1.5000,  2.0000])
- 
torch.range(start=0, end, step=1, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a 1-D tensor of size \(\left\lfloor \frac{\text{end} - \text{start}}{\text{step}} \right\rfloor + 1\) with values from start to end with step step. Step is the gap between two values in the tensor.
\[\text{out}_{i+1} = \text{out}_i + \text{step}.\]
Warning
This function is deprecated and will be removed in a future release because its behavior is inconsistent with Python's range builtin. Instead, use torch.arange(), which produces values in [start, end).
- Parameters
 start (float) – the starting value for the set of points. Default: 0.
 end (float) – the ending value for the set of points
 step (float) – the gap between each pair of adjacent points. Default: 1.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()). If dtype is not given, infer the data type from the other input arguments. If any of start, end, or step are floating-point, the dtype is inferred to be the default dtype, see get_default_dtype(). Otherwise, the dtype is inferred to be torch.int64.
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.range(1, 4)
tensor([ 1.,  2.,  3.,  4.])
>>> torch.range(1, 4, 0.5)
tensor([ 1.0000,  1.5000,  2.0000,  2.5000,  3.0000,  3.5000,  4.0000])
- 
torch.linspace(start, end, steps, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Creates a one-dimensional tensor of size steps whose values are evenly spaced from start to end, inclusive. That is, the values are:
\[(\text{start}, \text{start} + \frac{\text{end} - \text{start}}{\text{steps} - 1}, \ldots, \text{start} + (\text{steps} - 2) * \frac{\text{end} - \text{start}}{\text{steps} - 1}, \text{end})\]
Warning
Not providing a value for steps is deprecated. For backwards compatibility, not providing a value for steps will create a tensor with 100 elements. Note that this behavior is not reflected in the documented function signature and should not be relied on. In a future PyTorch release, failing to provide a value for steps will throw a runtime error.
- Parameters
 start (float) – the starting value for the set of points
 end (float) – the ending value for the set of points
 steps (int) – size of the constructed tensor
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.linspace(3, 10, steps=5)
tensor([  3.0000,   4.7500,   6.5000,   8.2500,  10.0000])
>>> torch.linspace(-10, 10, steps=5)
tensor([-10.,  -5.,   0.,   5.,  10.])
>>> torch.linspace(start=-10, end=10, steps=5)
tensor([-10.,  -5.,   0.,   5.,  10.])
>>> torch.linspace(start=-10, end=10, steps=1)
tensor([-10.])
- 
torch.logspace(start, end, steps, base=10.0, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Creates a one-dimensional tensor of size steps whose values are evenly spaced from \(\text{base}^{\text{start}}\) to \(\text{base}^{\text{end}}\), inclusive, on a logarithmic scale with base base. That is, the values are:
\[(\text{base}^{\text{start}}, \text{base}^{(\text{start} + \frac{\text{end} - \text{start}}{ \text{steps} - 1})}, \ldots, \text{base}^{(\text{start} + (\text{steps} - 2) * \frac{\text{end} - \text{start}}{ \text{steps} - 1})}, \text{base}^{\text{end}})\]
Warning
Not providing a value for steps is deprecated. For backwards compatibility, not providing a value for steps will create a tensor with 100 elements. Note that this behavior is not reflected in the documented function signature and should not be relied on. In a future PyTorch release, failing to provide a value for steps will throw a runtime error.
- Parameters
 start (float) – the starting value for the set of points
 end (float) – the ending value for the set of points
 steps (int) – size of the constructed tensor
 base (float, optional) – base of the logarithm function. Default: 10.0.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.logspace(start=-10, end=10, steps=5)
tensor([ 1.0000e-10,  1.0000e-05,  1.0000e+00,  1.0000e+05,  1.0000e+10])
>>> torch.logspace(start=0.1, end=1.0, steps=5)
tensor([  1.2589,   2.1135,   3.5481,   5.9566,  10.0000])
>>> torch.logspace(start=0.1, end=1.0, steps=1)
tensor([1.2589])
>>> torch.logspace(start=2, end=2, steps=1, base=2)
tensor([4.])
- 
torch.eye(n, m=None, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a 2-D tensor with ones on the diagonal and zeros elsewhere.
- Parameters
 n (int) – the number of rows
 m (int, optional) – the number of columns, with default being n
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
- Returns
 A 2-D tensor with ones on the diagonal and zeros elsewhere
- Return type
 Tensor
Example:
>>> torch.eye(3)
tensor([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]])
- 
torch.empty(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False, pin_memory=False, memory_format=torch.contiguous_format) → Tensor
Returns a tensor filled with uninitialized data. The shape of the tensor is defined by the variable argument size.
- Parameters
 size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.contiguous_format.
Example:
>>> torch.empty(2, 3)
tensor(1.00000e-08 *
       [[ 6.3984,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000]])
- 
torch.empty_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns an uninitialized tensor with the same size as input. torch.empty_like(input) is equivalent to torch.empty(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
Example:
>>> a = torch.empty((2,3), dtype=torch.int64)
>>> torch.empty_like(a)
tensor([[ 9.4064e+13,  2.8000e+01,  9.3493e+13],
        [ 7.5751e+18,  7.1428e+18,  7.5955e+18]])
- 
torch.full(size, fill_value, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Creates a tensor of size size filled with fill_value. The tensor's dtype is inferred from fill_value.
- Parameters
 size (int...) – a list, tuple, or torch.Size of integers defining the shape of the output tensor.
 fill_value (Scalar) – the value to fill the output tensor with.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.full((2, 3), 3.141592)
tensor([[ 3.1416,  3.1416,  3.1416],
        [ 3.1416,  3.1416,  3.1416]])
- 
torch.full_like(input, fill_value, *, dtype=None, layout=torch.strided, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor with the same size as input filled with fill_value. torch.full_like(input, fill_value) is equivalent to torch.full(input.size(), fill_value, dtype=input.dtype, layout=input.layout, device=input.device).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
 fill_value – the number to fill the output tensor with.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
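The source gives no example for full_like; a brief illustrative sketch:
Example:
>>> a = torch.zeros(2, 3)
>>> torch.full_like(a, 7.)
tensor([[7., 7., 7.],
        [7., 7., 7.]])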
Indexing, Slicing, Joining, Mutating Ops
- 
torch.cat(tensors, dim=0, *, out=None) → Tensor
Concatenates the given sequence of seq tensors in the given dimension. All tensors must either have the same shape (except in the concatenating dimension) or be empty.
torch.cat() can be seen as an inverse operation for torch.split() and torch.chunk().
torch.cat() can be best understood via examples.
- Parameters
 tensors (sequence of Tensors) – any python sequence of tensors of the same type. Non-empty tensors provided must have the same shape, except in the cat dimension.
 dim (int, optional) – the dimension over which the tensors are concatenated
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 0.6580, -1.0969, -0.4614],
        [-0.1034, -0.5790,  0.1497]])
>>> torch.cat((x, x, x), 0)
tensor([[ 0.6580, -1.0969, -0.4614],
        [-0.1034, -0.5790,  0.1497],
        [ 0.6580, -1.0969, -0.4614],
        [-0.1034, -0.5790,  0.1497],
        [ 0.6580, -1.0969, -0.4614],
        [-0.1034, -0.5790,  0.1497]])
>>> torch.cat((x, x, x), 1)
tensor([[ 0.6580, -1.0969, -0.4614,  0.6580, -1.0969, -0.4614,  0.6580,
         -1.0969, -0.4614],
        [-0.1034, -0.5790,  0.1497, -0.1034, -0.5790,  0.1497, -0.1034,
         -0.5790,  0.1497]])
- 
torch.chunk(input, chunks, dim=0) → List of Tensors
Splits a tensor into a specific number of chunks. Each chunk is a view of the input tensor.
The last chunk will be smaller if the tensor size along the given dimension dim is not divisible by chunks.
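No example appears here in the source; a short illustrative sketch:
Example:
>>> a = torch.arange(11)
>>> torch.chunk(a, 6)
(tensor([0, 1]), tensor([2, 3]), tensor([4, 5]), tensor([6, 7]), tensor([8, 9]), tensor([10]))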
- 
torch.gather(input, dim, index, *, sparse_grad=False, out=None) → Tensor
Gathers values along an axis specified by dim.
For a 3-D tensor the output is specified by:

out[i][j][k] = input[index[i][j][k]][j][k]  # if dim == 0
out[i][j][k] = input[i][index[i][j][k]][k]  # if dim == 1
out[i][j][k] = input[i][j][index[i][j][k]]  # if dim == 2

input and index must have the same number of dimensions. It is also required that index.size(d) <= input.size(d) for all dimensions d != dim. out will have the same shape as index. Note that input and index do not broadcast against each other.
- Parameters
 input (Tensor) – the source tensor
 dim (int) – the axis along which to index
 index (LongTensor) – the indices of elements to gather
- Keyword Arguments
 sparse_grad (bool, optional) – If True, gradient w.r.t. input will be a sparse tensor.
 out (Tensor, optional) – the destination tensor
Example:
>>> t = torch.tensor([[1, 2], [3, 4]])
>>> torch.gather(t, 1, torch.tensor([[0, 0], [1, 0]]))
tensor([[ 1,  1],
        [ 4,  3]])
- 
torch.index_select(input, dim, index, *, out=None) → Tensor
Returns a new tensor which indexes the input tensor along dimension dim using the entries in index, which is a LongTensor.
The returned tensor has the same number of dimensions as the original tensor (input). The dim-th dimension has the same size as the length of index; other dimensions have the same size as in the original tensor.
Note
The returned tensor does not use the same storage as the original tensor. If out has a different shape than expected, we silently change it to the correct shape, reallocating the underlying storage if necessary.
- Parameters
 input (Tensor) – the input tensor.
 dim (int) – the dimension in which we index
 index (IntTensor or LongTensor) – the 1-D tensor containing the indices to index
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> x = torch.randn(3, 4)
>>> x
tensor([[ 0.1427,  0.0231, -0.5414, -1.0009],
        [-0.4664,  0.2647, -0.1228, -1.1068],
        [-1.1734, -0.6571,  0.7230, -0.6004]])
>>> indices = torch.tensor([0, 2])
>>> torch.index_select(x, 0, indices)
tensor([[ 0.1427,  0.0231, -0.5414, -1.0009],
        [-1.1734, -0.6571,  0.7230, -0.6004]])
>>> torch.index_select(x, 1, indices)
tensor([[ 0.1427, -0.5414],
        [-0.4664, -0.1228],
        [-1.1734,  0.7230]])
- 
torch.masked_select(input, mask, *, out=None) → Tensor
Returns a new 1-D tensor which indexes the input tensor according to the boolean mask mask, which is a BoolTensor.
The shapes of the mask tensor and the input tensor don't need to match, but they must be broadcastable.
Note
The returned tensor does not use the same storage as the original tensor.
- Parameters
 input (Tensor) – the input tensor.
mask (BoolTensor) – the tensor containing the binary mask to index with
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> x = torch.randn(3, 4)
>>> x
tensor([[ 0.3552, -2.3825, -0.8297,  0.3477],
        [-1.2035,  1.2252,  0.5002,  0.6248],
        [ 0.1307, -2.0608,  0.1244,  2.0139]])
>>> mask = x.ge(0.5)
>>> mask
tensor([[False, False, False, False],
        [False, True, True, True],
        [False, False, False, True]])
>>> torch.masked_select(x, mask)
tensor([ 1.2252,  0.5002,  0.6248,  2.0139])
- 
torch.narrow(input, dim, start, length) → Tensor
Returns a new tensor that is a narrowed version of the input tensor. The dimension dim is narrowed from start to start + length. The returned tensor and the input tensor share the same underlying storage.
- Parameters
 input (Tensor) – the tensor to narrow
 dim (int) – the dimension along which to narrow
 start (int) – the starting dimension index
 length (int) – the distance to the ending dimension
Example:
>>> x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> torch.narrow(x, 0, 0, 2)
tensor([[ 1,  2,  3],
        [ 4,  5,  6]])
>>> torch.narrow(x, 1, 1, 2)
tensor([[ 2,  3],
        [ 5,  6],
        [ 8,  9]])
- 
torch.nonzero(input, *, out=None, as_tuple=False) → LongTensor or tuple of LongTensors
Note
torch.nonzero(..., as_tuple=False) (default) returns a 2-D tensor where each row is the index for a nonzero value.
torch.nonzero(..., as_tuple=True) returns a tuple of 1-D index tensors, allowing for advanced indexing, so x[x.nonzero(as_tuple=True)] gives all nonzero values of tensor x. Of the returned tuple, each index tensor contains nonzero indices for a certain dimension.
See below for more details on the two behaviors.
When input is on CUDA, torch.nonzero() causes host-device synchronization.
When as_tuple is False (default):
Returns a tensor containing the indices of all non-zero elements of input. Each row in the result contains the indices of a non-zero element in input. The result is sorted lexicographically, with the last index changing the fastest (C-style).
If input has \(n\) dimensions, then the resulting indices tensor out is of size \((z \times n)\), where \(z\) is the total number of non-zero elements in the input tensor.
When as_tuple is True:
Returns a tuple of 1-D tensors, one for each dimension in input, each containing the indices (in that dimension) of all non-zero elements of input.
If input has \(n\) dimensions, then the resulting tuple contains \(n\) tensors of size \(z\), where \(z\) is the total number of non-zero elements in the input tensor.
As a special case, when input has zero dimensions and a nonzero scalar value, it is treated as a one-dimensional tensor with one element.
- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (LongTensor, optional) – the output tensor containing indices
- Returns
 If as_tuple is False, the output tensor containing indices. If as_tuple is True, one 1-D tensor for each dimension, containing the indices of each nonzero element along that dimension.
- Return type
 LongTensor or tuple of LongTensor
Example:
>>> torch.nonzero(torch.tensor([1, 1, 1, 0, 1]))
tensor([[ 0],
        [ 1],
        [ 2],
        [ 4]])
>>> torch.nonzero(torch.tensor([[0.6, 0.0, 0.0, 0.0],
...                             [0.0, 0.4, 0.0, 0.0],
...                             [0.0, 0.0, 1.2, 0.0],
...                             [0.0, 0.0, 0.0,-0.4]]))
tensor([[ 0,  0],
        [ 1,  1],
        [ 2,  2],
        [ 3,  3]])
>>> torch.nonzero(torch.tensor([1, 1, 1, 0, 1]), as_tuple=True)
(tensor([0, 1, 2, 4]),)
>>> torch.nonzero(torch.tensor([[0.6, 0.0, 0.0, 0.0],
...                             [0.0, 0.4, 0.0, 0.0],
...                             [0.0, 0.0, 1.2, 0.0],
...                             [0.0, 0.0, 0.0,-0.4]]), as_tuple=True)
(tensor([0, 1, 2, 3]), tensor([0, 1, 2, 3]))
>>> torch.nonzero(torch.tensor(5), as_tuple=True)
(tensor([0]),)
- 
torch.reshape(input, shape) → Tensor
Returns a tensor with the same data and number of elements as input, but with the specified shape. When possible, the returned tensor will be a view of input. Otherwise, it will be a copy. Contiguous inputs and inputs with compatible strides can be reshaped without copying, but you should not depend on the copying vs. viewing behavior.
See torch.Tensor.view() on when it is possible to return a view.
A single dimension may be -1, in which case it's inferred from the remaining dimensions and the number of elements in input.
- Parameters
 input (Tensor) – the tensor to be reshaped
 shape (tuple of python:ints) – the new shape
Example:
>>> a = torch.arange(4.)
>>> torch.reshape(a, (2, 2))
tensor([[ 0.,  1.],
        [ 2.,  3.]])
>>> b = torch.tensor([[0, 1], [2, 3]])
>>> torch.reshape(b, (-1,))
tensor([ 0,  1,  2,  3])
- 
torch.split(tensor, split_size_or_sections, dim=0)[source]
Splits the tensor into chunks. Each chunk is a view of the original tensor.
If split_size_or_sections is an integer type, then tensor will be split into equally sized chunks (if possible). The last chunk will be smaller if the tensor size along the given dimension dim is not divisible by split_size.
If split_size_or_sections is a list, then tensor will be split into len(split_size_or_sections) chunks with sizes in dim according to split_size_or_sections.
- Parameters
 tensor (Tensor) – tensor to split.
 split_size_or_sections (int or list(int)) – size of a single chunk or list of sizes for each chunk
 dim (int) – dimension along which to split the tensor.
Example:
>>> a = torch.arange(10).reshape(5, 2)
>>> a
tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
>>> torch.split(a, 2)
(tensor([[0, 1],
         [2, 3]]),
 tensor([[4, 5],
         [6, 7]]),
 tensor([[8, 9]]))
>>> torch.split(a, [1, 4])
(tensor([[0, 1]]),
 tensor([[2, 3],
         [4, 5],
         [6, 7],
         [8, 9]]))
- 
torch.squeeze(input, dim=None, *, out=None) → Tensor
Returns a tensor with all the dimensions of input of size 1 removed.
For example, if input is of shape \((A \times 1 \times B \times C \times 1 \times D)\), then the out tensor will be of shape \((A \times B \times C \times D)\).
When dim is given, a squeeze operation is done only in the given dimension. If input is of shape \((A \times 1 \times B)\), squeeze(input, 0) leaves the tensor unchanged, but squeeze(input, 1) will squeeze the tensor to the shape \((A \times B)\).
Note
The returned tensor shares the storage with the input tensor, so changing the contents of one will change the contents of the other.
Warning
If the tensor has a batch dimension of size 1, then squeeze(input) will also remove the batch dimension, which can lead to unexpected errors.
- Parameters
 input (Tensor) – the input tensor.
 dim (int, optional) – if given, the input will be squeezed only in this dimension
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> x = torch.zeros(2, 1, 2, 1, 2)
>>> x.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x)
>>> y.size()
torch.Size([2, 2, 2])
>>> y = torch.squeeze(x, 0)
>>> y.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x, 1)
>>> y.size()
torch.Size([2, 2, 1, 2])
- 
torch.stack(tensors, dim=0, *, out=None) → Tensor
Concatenates a sequence of tensors along a new dimension.
All tensors need to be of the same size.
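The source lists no example for stack; a brief illustrative sketch (only the shapes are checked, since the values are random):
Example:
>>> x = torch.randn(2, 3)
>>> torch.stack((x, x)).size()  # inserts a new leading dimension
torch.Size([2, 2, 3])
>>> torch.stack((x, x), dim=2).size()  # inserts a new trailing dimension
torch.Size([2, 3, 2])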
- 
torch.t(input) → Tensor
Expects input to be a <= 2-D tensor and transposes dimensions 0 and 1.
0-D and 1-D tensors are returned as is. When input is a 2-D tensor, this is equivalent to transpose(input, 0, 1).
- Parameters
 input (Tensor) – the input tensor.
Example:
>>> x = torch.randn(())
>>> x
tensor(0.1995)
>>> torch.t(x)
tensor(0.1995)
>>> x = torch.randn(3)
>>> x
tensor([ 2.4320, -0.4608,  0.7702])
>>> torch.t(x)
tensor([ 2.4320, -0.4608,  0.7702])
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 0.4875,  0.9158, -0.5872],
        [ 0.3938, -0.6929,  0.6932]])
>>> torch.t(x)
tensor([[ 0.4875,  0.3938],
        [ 0.9158, -0.6929],
        [-0.5872,  0.6932]])
- 
torch.take(input, index) → Tensor
Returns a new tensor with the elements of input at the given indices. The input tensor is treated as if it were viewed as a 1-D tensor. The result takes the same shape as the indices.
- Parameters
 input (Tensor) – the input tensor.
index (LongTensor) – the indices into tensor
Example:
>>> src = torch.tensor([[4, 3, 5],
...                     [6, 7, 8]])
>>> torch.take(src, torch.tensor([0, 2, 5]))
tensor([ 4,  5,  8])
- 
torch.transpose(input, dim0, dim1) → Tensor
Returns a tensor that is a transposed version of input. The given dimensions dim0 and dim1 are swapped.
The resulting out tensor shares its underlying storage with the input tensor, so changing the content of one would change the content of the other.
- Parameters
 input (Tensor) – the input tensor.
 dim0 (int) – the first dimension to be transposed
 dim1 (int) – the second dimension to be transposed
Example:
>>> x = torch.randn(2, 3)
>>> x
tensor([[ 1.0028, -0.9893,  0.5809],
        [-0.1669,  0.7299,  0.4942]])
>>> torch.transpose(x, 0, 1)
tensor([[ 1.0028, -0.1669],
        [-0.9893,  0.7299],
        [ 0.5809,  0.4942]])
- 
torch.unbind(input, dim=0) → seq
Removes a tensor dimension.
Returns a tuple of all slices along a given dimension, already without it.
Example:
>>> torch.unbind(torch.tensor([[1, 2, 3],
...                            [4, 5, 6],
...                            [7, 8, 9]]))
(tensor([1, 2, 3]), tensor([4, 5, 6]), tensor([7, 8, 9]))
- 
torch.unsqueeze(input, dim) → Tensor
Returns a new tensor with a dimension of size one inserted at the specified position.
The returned tensor shares the same underlying data with this tensor.
A dim value within the range [-input.dim() - 1, input.dim() + 1) can be used. Negative dim will correspond to unsqueeze() applied at dim = dim + input.dim() + 1.
- Parameters
 input (Tensor) – the input tensor.
 dim (int) – the index at which to insert the singleton dimension
Example:
>>> x = torch.tensor([1, 2, 3, 4])
>>> torch.unsqueeze(x, 0)
tensor([[ 1,  2,  3,  4]])
>>> torch.unsqueeze(x, 1)
tensor([[ 1],
        [ 2],
        [ 3],
        [ 4]])
- 
torch.where(condition, x, y) → Tensor
Returns a tensor of elements selected from either x or y, depending on condition.
The operation is defined as:
\[\text{out}_i = \begin{cases} \text{x}_i & \text{if } \text{condition}_i \\ \text{y}_i & \text{otherwise} \end{cases}\]
Note
The tensors condition, x, y must be broadcastable.
Note
Currently valid scalar and tensor combinations are:
- Scalar of floating dtype and torch.double
- Scalar of integral dtype and torch.long
- Scalar of complex dtype and torch.complex128
- Parameters
 condition (BoolTensor) – When True (nonzero), yield x, otherwise yield y
 x (Tensor or Scalar) – value (if x is a scalar) or values selected at indices where condition is True
 y (Tensor or Scalar) – value (if y is a scalar) or values selected at indices where condition is False
- Returns
 A tensor of shape equal to the broadcasted shape of condition, x, y
- Return type
 Tensor
Example:
>>> x = torch.randn(3, 2)
>>> y = torch.ones(3, 2)
>>> x
tensor([[-0.4620,  0.3139],
        [ 0.3898, -0.7197],
        [ 0.0478, -0.1657]])
>>> torch.where(x > 0, x, y)
tensor([[ 1.0000,  0.3139],
        [ 0.3898,  1.0000],
        [ 0.0478,  1.0000]])
>>> x = torch.randn(2, 2, dtype=torch.double)
>>> x
tensor([[ 1.0779,  0.0383],
        [-0.8785, -1.1089]], dtype=torch.float64)
>>> torch.where(x > 0, x, 0.)
tensor([[1.0779, 0.0383],
        [0.0000, 0.0000]], dtype=torch.float64)
- 
torch.where(condition) → tuple of LongTensor
torch.where(condition) is identical to torch.nonzero(condition, as_tuple=True).
Note
See also torch.nonzero().
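A short illustrative sketch of this overload (not present in the source):
Example:
>>> x = torch.tensor([[1, 0], [0, 2]])
>>> torch.where(x != 0)
(tensor([0, 1]), tensor([0, 1]))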
Random sampling
- 
torch.manual_seed(seed) → torch._C.Generator[source]
Sets the seed for generating random numbers. Returns a torch.Generator object.
- Parameters
 seed (int) – The desired seed. Value must be within the inclusive range [-0x8000_0000_0000_0000, 0xffff_ffff_ffff_ffff]. Otherwise, a RuntimeError is raised. Negative inputs are remapped to positive values with the formula 0xffff_ffff_ffff_ffff + seed.
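For example (an illustrative sketch; seeding makes subsequent draws reproducible):
Example:
>>> _ = torch.manual_seed(0)
>>> a = torch.rand(2)
>>> _ = torch.manual_seed(0)
>>> b = torch.rand(2)
>>> torch.equal(a, b)
True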
- 
torch.initial_seed() → int[source]
Returns the initial seed for generating random numbers as a Python long.
- 
torch.get_rng_state() → torch.Tensor[source]
Returns the random number generator state as a torch.ByteTensor.
- 
torch.set_rng_state(new_state) → None[source]
Sets the random number generator state.
- Parameters
 new_state (torch.ByteTensor) – The desired state
- 
torch.default_generator = <torch._C.Generator object>
- 
torch.bernoulli(input, *, generator=None, out=None) → Tensor
Draws binary random numbers (0 or 1) from a Bernoulli distribution.
The input tensor should be a tensor containing probabilities to be used for drawing the binary random number. Hence, all values in input have to be in the range \(0 \leq \text{input}_i \leq 1\).
The \(\text{i}^{th}\) element of the output tensor will draw a value \(1\) according to the \(\text{i}^{th}\) probability value given in input.
\[\text{out}_{i} \sim \mathrm{Bernoulli}(p = \text{input}_{i})\]
The returned out tensor only has values 0 or 1 and is of the same shape as input. out can have integral dtype, but input must have floating point dtype.
- Parameters
 input (Tensor) – the input tensor of probability values for the Bernoulli distribution
- Keyword Arguments
 generator (torch.Generator, optional) – a pseudorandom number generator for sampling
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.empty(3, 3).uniform_(0, 1)  # generate a uniform random matrix with range [0, 1]
>>> a
tensor([[ 0.1737,  0.0950,  0.3609],
        [ 0.7148,  0.0289,  0.2676],
        [ 0.9456,  0.8937,  0.7202]])
>>> torch.bernoulli(a)
tensor([[ 1.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 1.,  1.,  1.]])
>>> a = torch.ones(3, 3)  # probability of drawing "1" is 1
>>> torch.bernoulli(a)
tensor([[ 1.,  1.,  1.],
        [ 1.,  1.,  1.],
        [ 1.,  1.,  1.]])
>>> a = torch.zeros(3, 3)  # probability of drawing "1" is 0
>>> torch.bernoulli(a)
tensor([[ 0.,  0.,  0.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]])
- 
torch.multinomial(input, num_samples, replacement=False, *, generator=None, out=None) → LongTensor
Returns a tensor where each row contains num_samples indices sampled from the multinomial probability distribution located in the corresponding row of tensor input.
Note
The rows of input do not need to sum to one (in which case we use the values as weights), but must be non-negative, finite and have a non-zero sum.
Indices are ordered from left to right according to when each was sampled (first samples are placed in the first column).
If input is a vector, out is a vector of size num_samples.
If input is a matrix with m rows, out is a matrix of shape \((m \times \text{num\_samples})\).
If replacement is True, samples are drawn with replacement.
If not, they are drawn without replacement, which means that when a sample index is drawn for a row, it cannot be drawn again for that row.
Note
When drawn without replacement, num_samples must be lower than the number of non-zero elements in input (or the min number of non-zero elements in each row of input if it is a matrix).
- Parameters
 input (Tensor) – the input tensor containing probabilities
 num_samples (int) – number of samples to draw
 replacement (bool, optional) – whether to draw with replacement or not
- Keyword Arguments
 generator (torch.Generator, optional) – a pseudorandom number generator for sampling
 out (Tensor, optional) – the output tensor.
Example:
>>> weights = torch.tensor([0, 10, 3, 0], dtype=torch.float)  # create a tensor of weights
>>> torch.multinomial(weights, 2)
tensor([1, 2])
>>> torch.multinomial(weights, 4)  # ERROR!
RuntimeError: invalid argument 2: invalid multinomial distribution (with replacement=False,
not enough non-negative category to sample) at ../aten/src/TH/generic/THTensorRandom.cpp:320
>>> torch.multinomial(weights, 4, replacement=True)
tensor([ 2,  1,  1,  1])
- 
torch.normal(mean, std, *, generator=None, out=None) → Tensor
Returns a tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given.
The mean is a tensor with the mean of each output element's normal distribution.
The std is a tensor with the standard deviation of each output element's normal distribution.
The shapes of mean and std don't need to match, but the total number of elements in each tensor needs to be the same.
Note
When the shapes do not match, the shape of mean is used as the shape for the returned output tensor.
- Parameters
 mean (Tensor) – the tensor of per-element means
 std (Tensor) – the tensor of per-element standard deviations
- Keyword Arguments
 generator (torch.Generator, optional) – a pseudorandom number generator for sampling
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.normal(mean=torch.arange(1., 11.), std=torch.arange(1, 0, -0.1))
tensor([  1.0425,   3.5672,   2.7969,   4.2925,   4.7229,   6.2134,
          8.0505,   8.1408,   9.0563,  10.0566])
torch.normal(mean=0.0, std, *, out=None) → Tensor
Similar to the function above, but the means are shared among all drawn elements.
- Parameters
 mean (float, optional) – the mean for all distributions
 std (Tensor) – the tensor of per-element standard deviations
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.normal(mean=0.5, std=torch.arange(1., 6.))
tensor([-1.2793, -1.0732, -2.0687,  5.1177, -1.2303])
torch.normal(mean, std=1.0, *, out=None) → Tensor
Similar to the function above, but the standard deviations are shared among all drawn elements.
- Parameters
 mean (Tensor) – the tensor of per-element means
 std (float, optional) – the standard deviation for all distributions
- Keyword Arguments
 out (Tensor, optional) – the output tensor
Example:
>>> torch.normal(mean=torch.arange(1., 6.))
tensor([ 1.1552,  2.6148,  2.6535,  5.8318,  4.2361])
torch.normal(mean, std, size, *, out=None) → Tensor
Similar to the function above, but the means and standard deviations are shared among all drawn elements. The resulting tensor has size given by size.
- Parameters
 mean (float) – the mean for all distributions
 std (float) – the standard deviation for all distributions
 size (int...) – a sequence of integers defining the shape of the output tensor
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.normal(2, 3, size=(1, 4))
tensor([[-1.3987, -1.9544,  3.6048,  0.7909]])
- 
torch.rand(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with random numbers from a uniform distribution on the interval \([0, 1)\).
The shape of the tensor is defined by the variable argument size.
- Parameters
 size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.rand(4)
tensor([ 0.5204,  0.2503,  0.3525,  0.5673])
>>> torch.rand(2, 3)
tensor([[ 0.8237,  0.5781,  0.6879],
        [ 0.3816,  0.7249,  0.0998]])
- 
torch.rand_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor with the same size as input that is filled with random numbers from a uniform distribution on the interval \([0, 1)\). torch.rand_like(input) is equivalent to torch.rand(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
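The source provides no example here; a brief illustrative sketch (the sampled values are random, so only the inherited dtype is checked):
Example:
>>> a = torch.zeros(2, 2, dtype=torch.float64)
>>> torch.rand_like(a).dtype
torch.float64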
- 
torch.randint(low=0, high, size, *, generator=None, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with random integers generated uniformly between low (inclusive) and high (exclusive).
The shape of the tensor is defined by the variable argument size.
Note
With the global dtype default (torch.float32), this function returns a tensor with dtype torch.int64.
- Parameters
 low (int, optional) – Lowest integer to be drawn from the distribution. Default: 0.
 high (int) – One above the highest integer to be drawn from the distribution.
 size (tuple) – a tuple defining the shape of the output tensor.
- Keyword Arguments
 generator (torch.Generator, optional) – a pseudorandom number generator for sampling
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.randint(3, 5, (3,))
tensor([4, 3, 4])
>>> torch.randint(10, (2, 2))
tensor([[0, 2],
        [5, 5]])
>>> torch.randint(3, 10, (2, 2))
tensor([[4, 5],
        [6, 7]])
- 
torch.randint_like(input, low=0, high, *, dtype=None, layout=torch.strided, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor with the same shape as Tensor input filled with random integers generated uniformly between low (inclusive) and high (exclusive).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
 low (int, optional) – Lowest integer to be drawn from the distribution. Default: 0.
 high (int) – One above the highest integer to be drawn from the distribution.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
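No example is given in the source; an illustrative sketch (the drawn integers are random, so the output shown is only indicative):
Example:
>>> a = torch.zeros(2, 2)
>>> torch.randint_like(a, 10)  # integers in [0, 10), same shape and dtype as a
tensor([[3., 7.],
        [0., 5.]])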
- 
torch.randn(*size, *, out=None, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor
Returns a tensor filled with random numbers from a normal distribution with mean 0 and variance 1 (also called the standard normal distribution).
\[\text{out}_{i} \sim \mathcal{N}(0, 1)\]
The shape of the tensor is defined by the variable argument size.
- Parameters
 size (int...) – a sequence of integers defining the shape of the output tensor. Can be a variable number of arguments or a collection like a list or tuple.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
 dtype (torch.dtype, optional) – the desired data type of returned tensor. Default: if None, uses a global default (see torch.set_default_tensor_type()).
 layout (torch.layout, optional) – the desired layout of returned Tensor. Default: torch.strided.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, uses the current device for the default tensor type (see torch.set_default_tensor_type()). device will be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
Example:
>>> torch.randn(4)
tensor([-2.1436,  0.9966,  2.3426, -0.6366])
>>> torch.randn(2, 3)
tensor([[ 1.5954,  2.8929, -1.0923],
        [ 1.1719, -0.4709, -0.1996]])
- 
torch.randn_like(input, *, dtype=None, layout=None, device=None, requires_grad=False, memory_format=torch.preserve_format) → Tensor
Returns a tensor with the same size as input that is filled with random numbers from a normal distribution with mean 0 and variance 1. torch.randn_like(input) is equivalent to torch.randn(input.size(), dtype=input.dtype, layout=input.layout, device=input.device).
- Parameters
 input (Tensor) – the size of input will determine size of the output tensor.
- Keyword Arguments
 dtype (torch.dtype, optional) – the desired data type of returned Tensor. Default: if None, defaults to the dtype of input.
 layout (torch.layout, optional) – the desired layout of returned tensor. Default: if None, defaults to the layout of input.
 device (torch.device, optional) – the desired device of returned tensor. Default: if None, defaults to the device of input.
 requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default: False.
 memory_format (torch.memory_format, optional) – the desired memory format of returned Tensor. Default: torch.preserve_format.
- 
torch.randperm(n, *, generator=None, out=None, dtype=torch.int64, layout=torch.strided, device=None, requires_grad=False, pin_memory=False) → Tensor¶ Returns a random permutation of integers from
0ton - 1.- Parameters
 n (int) – the upper bound (exclusive)
- Keyword Arguments
 generator (
torch.Generator, optional) – a pseudorandom number generator for samplingout (Tensor, optional) – the output tensor.
dtype (
torch.dtype, optional) – the desired data type of returned tensor. Default:torch.int64.layout (
torch.layout, optional) – the desired layout of returned Tensor. Default:torch.strided.device (
torch.device, optional) – the desired device of returned tensor. Default: ifNone, uses the current device for the default tensor type (seetorch.set_default_tensor_type()).devicewill be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default:
False.pin_memory (bool, optional) – If set, returned tensor would be allocated in the pinned memory. Works only for CPU tensors. Default:
False.
Example:
>>> torch.randperm(4)
tensor([2, 1, 0, 3])
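For reproducible permutations, a torch.Generator can be seeded and passed via the generator keyword. A minimal sketch (the seed value is arbitrary, and the resulting permutation is not shown since it depends on the seed):
>>> g = torch.Generator()
>>> _ = g.manual_seed(0)            # manual_seed returns the generator; discard it
>>> torch.randperm(4, generator=g)  # same permutation on every run with this seed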
In-place random sampling¶
There are a few more in-place random sampling functions defined on Tensors as well. Click through to refer to their documentation:
torch.Tensor.bernoulli_()- in-place version oftorch.bernoulli()torch.Tensor.cauchy_()- numbers drawn from the Cauchy distributiontorch.Tensor.exponential_()- numbers drawn from the exponential distributiontorch.Tensor.geometric_()- elements drawn from the geometric distributiontorch.Tensor.log_normal_()- samples from the log-normal distributiontorch.Tensor.normal_()- in-place version oftorch.normal()torch.Tensor.random_()- numbers sampled from the discrete uniform distributiontorch.Tensor.uniform_()- numbers sampled from the continuous uniform distribution
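As a brief sketch of how these in-place methods are used (note the trailing-underscore naming convention for in-place ops; the sampled values are random, so no outputs are shown):
>>> t = torch.empty(3)
>>> _ = t.uniform_(0, 1)          # fill t in place from U(0, 1); returns t
>>> _ = t.normal_(mean=0, std=1)  # refill t in place from N(0, 1)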
Serialization¶
- 
torch.save(obj, f: Union[str, os.PathLike, BinaryIO, IO[bytes]], pickle_module=pickle, pickle_protocol=2, _use_new_zipfile_serialization=True) → None[source]¶ Saves an object to a disk file.
See also: Saving and loading tensors.
- Parameters
 obj – saved object
f – a file-like object (has to implement write and flush) or a string or os.PathLike object containing a file name
pickle_module – module used for pickling metadata and objects
pickle_protocol – can be specified to override the default protocol
Note
A common PyTorch convention is to save tensors using the .pt file extension.
Note
PyTorch preserves storage sharing across serialization. See the notes on preserving storage sharing for more details.
Note
The 1.6 release of PyTorch switched
torch.save to use a new zipfile-based file format. torch.load still retains the ability to load files in the old format. If for any reason you want torch.save to use the old format, pass the kwarg _use_new_zipfile_serialization=False.
Example
>>> # Save to file
>>> x = torch.tensor([0, 1, 2, 3, 4])
>>> torch.save(x, 'tensor.pt')
>>> # Save to io.BytesIO buffer
>>> buffer = io.BytesIO()
>>> torch.save(x, buffer)
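A minimal sketch of opting into the legacy format described in the note above ('tensor_legacy.pt' is a hypothetical file name):
>>> x = torch.tensor([0, 1, 2, 3, 4])
>>> torch.save(x, 'tensor_legacy.pt', _use_new_zipfile_serialization=False)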
- 
torch.load(f, map_location=None, pickle_module=pickle, **pickle_load_args)[source]¶ Loads an object saved with
torch.save()from a file.torch.load()uses Python’s unpickling facilities but treats storages, which underlie tensors, specially. They are first deserialized on the CPU and are then moved to the device they were saved from. If this fails (e.g. because the run time system doesn’t have certain devices), an exception is raised. However, storages can be dynamically remapped to an alternative set of devices using themap_locationargument.If
map_locationis a callable, it will be called once for each serialized storage with two arguments: storage and location. The storage argument will be the initial deserialization of the storage, residing on the CPU. Each serialized storage has a location tag associated with it which identifies the device it was saved from, and this tag is the second argument passed tomap_location. The builtin location tags are'cpu'for CPU tensors and'cuda:device_id'(e.g.'cuda:2') for CUDA tensors.map_locationshould return eitherNoneor a storage. Ifmap_locationreturns a storage, it will be used as the final deserialized object, already moved to the right device. Otherwise,torch.load()will fall back to the default behavior, as ifmap_locationwasn’t specified.If
map_locationis atorch.deviceobject or a string containing a device tag, it indicates the location where all tensors should be loaded.Otherwise, if
map_locationis a dict, it will be used to remap location tags appearing in the file (keys), to ones that specify where to put the storages (values).User extensions can register their own location tags and tagging and deserialization methods using
torch.serialization.register_package().- Parameters
 f – a file-like object (has to implement
read(),readline(),tell(), andseek()), or a string or os.PathLike object containing a file namemap_location – a function,
torch.device, string or a dict specifying how to remap storage locationspickle_module – module used for unpickling metadata and objects (has to match the
pickle_moduleused to serialize file)pickle_load_args – (Python 3 only) optional keyword arguments passed over to
pickle_module.load()andpickle_module.Unpickler(), e.g.,errors=....
Warning
torch.load()usespicklemodule implicitly, which is known to be insecure. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never load data that could have come from an untrusted source, or that could have been tampered with. Only load data you trust.Note
When you call
torch.load() on a file which contains GPU tensors, those tensors will be loaded to GPU by default. You can call torch.load(..., map_location='cpu') and then load_state_dict() to avoid GPU RAM surge when loading a model checkpoint.
Note
By default, we decode byte strings as
utf-8. This is to avoid a common error caseUnicodeDecodeError: 'ascii' codec can't decode byte 0x...when loading files saved by Python 2 in Python 3. If this default is incorrect, you may use an extraencodingkeyword argument to specify how these objects should be loaded, e.g.,encoding='latin1'decodes them to strings usinglatin1encoding, andencoding='bytes'keeps them as byte arrays which can be decoded later withbyte_array.decode(...).Example
>>> torch.load('tensors.pt')
# Load all tensors onto the CPU
>>> torch.load('tensors.pt', map_location=torch.device('cpu'))
# Load all tensors onto the CPU, using a function
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage)
# Load all tensors onto GPU 1
>>> torch.load('tensors.pt', map_location=lambda storage, loc: storage.cuda(1))
# Map tensors from GPU 1 to GPU 0
>>> torch.load('tensors.pt', map_location={'cuda:1':'cuda:0'})
# Load tensor from io.BytesIO object
>>> with open('tensor.pt', 'rb') as f:
...     buffer = io.BytesIO(f.read())
>>> torch.load(buffer)
# Load a module with 'ascii' encoding for unpickling
>>> torch.load('module.pt', encoding='ascii')
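Tying the notes above together, a common checkpoint-loading sketch (model and 'checkpoint.pt' are hypothetical placeholders for an existing torch.nn.Module and a saved state dict):
>>> state_dict = torch.load('checkpoint.pt', map_location='cpu')  # deserialize on CPU
>>> model.load_state_dict(state_dict)  # then copy the parameters into the model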
Parallelism¶
- 
torch.set_num_threads(int)¶ Sets the number of threads used for intraop parallelism on CPU.
Warning
To ensure that the correct number of threads is used, set_num_threads must be called before running eager, JIT or autograd code.
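For example, the thread count can be set once at program start and read back with torch.get_num_threads():
>>> torch.set_num_threads(4)  # before any eager, JIT or autograd work
>>> torch.get_num_threads()
4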
Locally disabling gradient computation¶
The context managers torch.no_grad(), torch.enable_grad(), and
torch.set_grad_enabled() are helpful for locally disabling and enabling
gradient computation. See Locally disabling gradient computation for more details on
their usage.
Examples:
>>> x = torch.zeros(1, requires_grad=True)
>>> with torch.no_grad():
...     y = x * 2
>>> y.requires_grad
False
>>> is_train = False
>>> with torch.set_grad_enabled(is_train):
...     y = x * 2
>>> y.requires_grad
False
>>> torch.set_grad_enabled(True)  # this can also be used as a function
>>> y = x * 2
>>> y.requires_grad
True
>>> torch.set_grad_enabled(False)
>>> y = x * 2
>>> y.requires_grad
False
Math operations¶
Pointwise Ops¶
- 
torch.abs(input, *, out=None) → Tensor¶ Computes the absolute value of each element in
input.\[\text{out}_{i} = |\text{input}_{i}| \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.abs(torch.tensor([-1, -2, 3]))
tensor([ 1, 2, 3])
- 
torch.acos(input, *, out=None) → Tensor¶ Computes the inverse cosine of each element in
input.\[\text{out}_{i} = \cos^{-1}(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.3348, -0.5889, 0.2005, -0.1584])
>>> torch.acos(a)
tensor([ 1.2294, 2.2004, 1.3690, 1.7298])
- 
torch.add(input, other, *, out=None)¶ Adds the scalar
otherto each element of the inputinputand returns a new resulting tensor.\[\text{out} = \text{input} + \text{other} \]If
inputis of type FloatTensor or DoubleTensor,othermust be a real number, otherwise it should be an integer.- Parameters
 input (Tensor) – the input tensor.
value (Number) – the number to be added to each element of
input
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.0202, 1.0985, 1.3506, -0.6056])
>>> torch.add(a, 20)
tensor([ 20.0202, 21.0985, 21.3506, 19.3944])
- 
torch.add(input, other, *, alpha=1, out=None)¶ 
Each element of the tensor
otheris multiplied by the scalaralphaand added to each element of the tensorinput. The resulting tensor is returned.The shapes of
inputandothermust be broadcastable.\[\text{out} = \text{input} + \text{alpha} \times \text{other} \]If
otheris of type FloatTensor or DoubleTensor,alphamust be a real number, otherwise it should be an integer.- Parameters
 - Keyword Arguments
 alpha (Number) – the scalar multiplier for
otherout (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.9732, -0.3497, 0.6245, 0.4022])
>>> b = torch.randn(4, 1)
>>> b
tensor([[ 0.3743],
        [-1.7724],
        [-0.5811],
        [-0.8017]])
>>> torch.add(a, b, alpha=10)
tensor([[  2.7695,   3.3930,   4.3672,   4.1450],
        [-18.6971, -18.0736, -17.0994, -17.3216],
        [ -6.7845,  -6.1610,  -5.1868,  -5.4090],
        [ -8.9902,  -8.3667,  -7.3925,  -7.6147]])
- 
torch.addcdiv(input, tensor1, tensor2, *, value=1, out=None) → Tensor¶ Performs the element-wise division of
tensor1 by tensor2, multiplies the result by the scalar value and adds it to input.
Warning
Integer division with addcdiv is no longer supported, and in a future release addcdiv will perform a true division of tensor1 and tensor2. The historic addcdiv behavior can be implemented as (input + value * torch.trunc(tensor1 / tensor2)).to(input.dtype) for integer inputs and as (input + value * tensor1 / tensor2) for float inputs. The future addcdiv behavior is just the latter implementation: (input + value * tensor1 / tensor2), for all dtypes.
\[\text{out}_i = \text{input}_i + \text{value} \times \frac{\text{tensor1}_i}{\text{tensor2}_i} \]The shapes of
input,tensor1, andtensor2must be broadcastable.For inputs of type FloatTensor or DoubleTensor,
valuemust be a real number, otherwise an integer.- Parameters
 - Keyword Arguments
 value (Number, optional) – multiplier for \(\text{tensor1} / \text{tensor2}\)
out (Tensor, optional) – the output tensor.
Example:
>>> t = torch.randn(1, 3)
>>> t1 = torch.randn(3, 1)
>>> t2 = torch.randn(1, 3)
>>> torch.addcdiv(t, t1, t2, value=0.1)
tensor([[-0.2312, -3.6496,  0.1312],
        [-1.0428,  3.4292, -0.1030],
        [-0.5369, -0.9829,  0.0430]])
- 
torch.addcmul(input, tensor1, tensor2, *, value=1, out=None) → Tensor¶ Performs the element-wise multiplication of
tensor1 by tensor2, multiplies the result by the scalar value and adds it to input.\[\text{out}_i = \text{input}_i + \text{value} \times \text{tensor1}_i \times \text{tensor2}_i \]The shapes of
input, tensor1, and tensor2 must be broadcastable.For inputs of type FloatTensor or DoubleTensor,
valuemust be a real number, otherwise an integer.- Parameters
 - Keyword Arguments
 value (Number, optional) – multiplier for \(tensor1 .* tensor2\)
out (Tensor, optional) – the output tensor.
Example:
>>> t = torch.randn(1, 3)
>>> t1 = torch.randn(3, 1)
>>> t2 = torch.randn(1, 3)
>>> torch.addcmul(t, t1, t2, value=0.1)
tensor([[-0.8635, -0.6391,  1.6174],
        [-0.7617, -0.5879,  1.7388],
        [-0.8353, -0.6249,  1.6511]])
- 
torch.asin(input, *, out=None) → Tensor¶ Returns a new tensor with the arcsine of the elements of
input.\[\text{out}_{i} = \sin^{-1}(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.5962, 1.4985, -0.4396, 1.4525])
>>> torch.asin(a)
tensor([-0.6387, nan, -0.4552, nan])
- 
torch.atan(input, *, out=None) → Tensor¶ Returns a new tensor with the arctangent of the elements of
input.\[\text{out}_{i} = \tan^{-1}(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.2341, 0.2539, -0.6256, -0.6448])
>>> torch.atan(a)
tensor([ 0.2299, 0.2487, -0.5591, -0.5727])
- 
torch.atan2(input, other, *, out=None) → Tensor¶ Element-wise arctangent of \(\text{input}_{i} / \text{other}_{i}\) with consideration of the quadrant. Returns a new tensor with the signed angles in radians between vector \((\text{other}_{i}, \text{input}_{i})\) and vector \((1, 0)\). (Note that \(\text{other}_{i}\), the second parameter, is the x-coordinate, while \(\text{input}_{i}\), the first parameter, is the y-coordinate.)
The shapes of
inputandothermust be broadcastable.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.9041, 0.0196, -0.3108, -2.4423])
>>> torch.atan2(a, torch.randn(4))
tensor([ 0.9833, 0.0811, -1.9743, -1.4151])
- 
torch.ceil(input, *, out=None) → Tensor¶ Returns a new tensor with the ceil of the elements of
input, the smallest integer greater than or equal to each element.\[\text{out}_{i} = \left\lceil \text{input}_{i} \right\rceil \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.6341, -1.4208, -1.0900, 0.5826])
>>> torch.ceil(a)
tensor([-0., -1., -1., 1.])
- 
torch.clamp(input, min, max, *, out=None) → Tensor¶ Clamp all elements in
input into the range [ min, max ]. Letting min_value and max_value be min and max, respectively, this returns:\[y_i = \min(\max(x_i, \text{min\_value}), \text{max\_value}) \]- Parameters
 input (Tensor) – the input tensor.
min (Number) – lower-bound of the range to be clamped to
max (Number) – upper-bound of the range to be clamped to
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-1.7120, 0.1734, -0.0478, -0.0922])
>>> torch.clamp(a, min=-0.5, max=0.5)
tensor([-0.5000, 0.1734, -0.0478, -0.0922])
- 
torch.clamp(input, *, min, out=None) → Tensor¶ 
Clamps all elements in
input to be greater than or equal to min.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 min (Number) – minimal value of each element in the output
out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.0299, -2.3184, 2.1593, -0.8883])
>>> torch.clamp(a, min=0.5)
tensor([ 0.5000, 0.5000, 2.1593, 0.5000])
- 
torch.clamp(input, *, max, out=None) → Tensor¶ 
Clamps all elements in
input to be less than or equal to max.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 max (Number) – maximal value of each element in the output
out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.7753, -0.4702, -0.4599, 1.1899])
>>> torch.clamp(a, max=0.5)
tensor([ 0.5000, -0.4702, -0.4599, 0.5000])
- 
torch.cos(input, *, out=None) → Tensor¶ Returns a new tensor with the cosine of the elements of
input.\[\text{out}_{i} = \cos(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 1.4309, 1.2706, -0.8562, 0.9796])
>>> torch.cos(a)
tensor([ 0.1395, 0.2957, 0.6553, 0.5574])
- 
torch.cosh(input, *, out=None) → Tensor¶ Returns a new tensor with the hyperbolic cosine of the elements of
input.\[\text{out}_{i} = \cosh(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.1632, 1.1835, -0.6979, -0.7325])
>>> torch.cosh(a)
tensor([ 1.0133, 1.7860, 1.2536, 1.2805])
Note
When
input is on the CPU, the implementation of torch.cosh may use the Sleef library, which rounds very large results to infinity or negative infinity. See the Sleef documentation for details.
- 
torch.div(input, other, *, rounding_mode=None, out=None) → Tensor¶ Divides each element of the input
inputby the corresponding element ofother.\[\text{out}_i = \frac{\text{input}_i}{\text{other}_i} \]Note
By default, this performs a “true” division like Python 3. See the
rounding_modeargument for floor division.Supports broadcasting to a common shape, type promotion, and integer, float, and complex inputs. Always promotes integer types to the default scalar type.
- Parameters
 - Keyword Arguments
 rounding_mode (str, optional) –
Type of rounding applied to the result:
None - default behavior. Performs no rounding and, if both
inputandotherare integer types, promotes the inputs to the default scalar type. Equivalent to true division in Python (the/operator) and NumPy’snp.true_divide."trunc"- rounds the results of the division towards zero. Equivalent to C-style integer division."floor"- rounds the results of the division down. Equivalent to floor division in Python (the//operator) and NumPy’snp.floor_divide.
out (Tensor, optional) – the output tensor.
Examples:
>>> x = torch.tensor([ 0.3810, 1.2774, -0.2972, -0.3719, 0.4637])
>>> torch.div(x, 0.5)
tensor([ 0.7620, 2.5548, -0.5944, -0.7438, 0.9274])
>>> a = torch.tensor([[-0.3711, -1.9353, -0.4605, -0.2917],
...                   [ 0.1815, -1.0111,  0.9805, -1.5923],
...                   [ 0.1062,  1.4581,  0.7759, -1.2344],
...                   [-0.1830, -0.0313,  1.1908, -1.4757]])
>>> b = torch.tensor([ 0.8032, 0.2930, -0.8113, -0.2308])
>>> torch.div(a, b)
tensor([[-0.4620, -6.6051,  0.5676,  1.2639],
        [ 0.2260, -3.4509, -1.2086,  6.8990],
        [ 0.1322,  4.9764, -0.9564,  5.3484],
        [-0.2278, -0.1068, -1.4678,  6.3938]])
>>> torch.div(a, b, rounding_mode='trunc')
tensor([[-0., -6.,  0.,  1.],
        [ 0., -3., -1.,  6.],
        [ 0.,  4., -0.,  5.],
        [-0., -0., -1.,  6.]])
>>> torch.div(a, b, rounding_mode='floor')
tensor([[-1., -7.,  0.,  1.],
        [ 0., -4., -2.,  6.],
        [ 0.,  4., -1.,  5.],
        [-1., -1., -2.,  6.]])
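The promotion rule above means that true division of two integer tensors yields a floating point result, while a specified rounding_mode keeps the integer dtype; a small sketch:
>>> torch.div(torch.tensor([4, 3]), torch.tensor([2, 2]))
tensor([2.0000, 1.5000])
>>> torch.div(torch.tensor([4, 3]), torch.tensor([2, 2]), rounding_mode='floor')
tensor([2, 1])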
- 
torch.digamma(input, *, out=None) → Tensor¶ Computes the logarithmic derivative of the gamma function on input.
\[\psi(x) = \frac{d}{dx} \ln\left(\Gamma\left(x\right)\right) = \frac{\Gamma'(x)}{\Gamma(x)} \]- Parameters
 input (Tensor) – the tensor to compute the digamma function on
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Note
This function is similar to SciPy’s scipy.special.digamma.
Note
From PyTorch 1.8 onwards, the digamma function returns -Inf for 0. Previously it returned NaN for 0.
Example:
>>> a = torch.tensor([1, 0.5])
>>> torch.digamma(a)
tensor([-0.5772, -1.9635])
- 
torch.erf(input, *, out=None) → Tensor¶ Computes the error function of each element. The error function is defined as follows:
\[\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2} dt \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.erf(torch.tensor([0, -1., 10.]))
tensor([ 0.0000, -0.8427, 1.0000])
- 
torch.erfc(input, *, out=None) → Tensor¶ Computes the complementary error function of each element of
input. The complementary error function is defined as follows:\[\mathrm{erfc}(x) = 1 - \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2} dt \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.erfc(torch.tensor([0, -1., 10.]))
tensor([ 1.0000, 1.8427, 0.0000])
- 
torch.erfinv(input, *, out=None) → Tensor¶ Computes the inverse error function of each element of
input. The inverse error function is defined in the range \((-1, 1)\) as:\[\mathrm{erfinv}(\mathrm{erf}(x)) = x \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.erfinv(torch.tensor([0, 0.5, -1.]))
tensor([ 0.0000, 0.4769, -inf])
- 
torch.exp(input, *, out=None) → Tensor¶ Returns a new tensor with the exponential of the elements of the input tensor
input.\[y_{i} = e^{x_{i}} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.exp(torch.tensor([0, math.log(2.)]))
tensor([ 1., 2.])
- 
torch.expm1(input, *, out=None) → Tensor¶ Returns a new tensor with the exponential of the elements minus 1 of
input.\[y_{i} = e^{x_{i}} - 1 \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.expm1(torch.tensor([0, math.log(2.)]))
tensor([ 0., 1.])
- 
torch.floor(input, *, out=None) → Tensor¶ Returns a new tensor with the floor of the elements of
input, the largest integer less than or equal to each element.\[\text{out}_{i} = \left\lfloor \text{input}_{i} \right\rfloor \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.8166, 1.5308, -0.2530, -0.2091])
>>> torch.floor(a)
tensor([-1., 1., -1., -1.])
- 
torch.fmod(input, other, *, out=None) → Tensor¶ Computes the element-wise remainder of division.
The dividend and divisor may contain both integer and floating point numbers. The remainder has the same sign as the dividend
input.Supports broadcasting to a common shape, type promotion, and integer and float inputs.
Note
When the divisor is zero, returns
NaN for floating point dtypes on both CPU and GPU; raises RuntimeError for integer division by zero on CPU; integer division by zero on GPU may return any value.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.fmod(torch.tensor([-3., -2, -1, 1, 2, 3]), 2)
tensor([-1., -0., -1., 1., 0., 1.])
>>> torch.fmod(torch.tensor([1, 2, 3, 4, 5]), 1.5)
tensor([1.0000, 0.5000, 0.0000, 1.0000, 0.5000])
- 
torch.frac(input, *, out=None) → Tensor¶ Computes the fractional portion of each element in
input.\[\text{out}_{i} = \text{input}_{i} - \left\lfloor |\text{input}_{i}| \right\rfloor * \operatorname{sgn}(\text{input}_{i}) \]Example:
>>> torch.frac(torch.tensor([1, 2.5, -3.2]))
tensor([ 0.0000, 0.5000, -0.2000])
- 
torch.lerp(input, end, weight, *, out=None)¶ Does a linear interpolation of two tensors
start(given byinput) andendbased on a scalar or tensorweightand returns the resultingouttensor.\[\text{out}_i = \text{start}_i + \text{weight}_i \times (\text{end}_i - \text{start}_i) \]The shapes of
startandendmust be broadcastable. Ifweightis a tensor, then the shapes ofweight,start, andendmust be broadcastable.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> start = torch.arange(1., 5.)
>>> end = torch.empty(4).fill_(10)
>>> start
tensor([ 1., 2., 3., 4.])
>>> end
tensor([ 10., 10., 10., 10.])
>>> torch.lerp(start, end, 0.5)
tensor([ 5.5000, 6.0000, 6.5000, 7.0000])
>>> torch.lerp(start, end, torch.full_like(start, 0.5))
tensor([ 5.5000, 6.0000, 6.5000, 7.0000])
- 
torch.log(input, *, out=None) → Tensor¶ Returns a new tensor with the natural logarithm of the elements of
input.\[y_{i} = \log_{e} (x_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(5)
>>> a
tensor([-0.7168, -0.5471, -0.8933, -1.4428, -0.1190])
>>> torch.log(a)
tensor([ nan, nan, nan, nan, nan])
- 
torch.log10(input, *, out=None) → Tensor¶ Returns a new tensor with the logarithm to the base 10 of the elements of
input.\[y_{i} = \log_{10} (x_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.rand(5)
>>> a
tensor([ 0.5224, 0.9354, 0.7257, 0.1301, 0.2251])
>>> torch.log10(a)
tensor([-0.2820, -0.0290, -0.1392, -0.8857, -0.6476])
- 
torch.log1p(input, *, out=None) → Tensor¶ Returns a new tensor with the natural logarithm of (1 +
input).\[y_i = \log_{e} (x_i + 1) \]Note
This function is more accurate than
torch.log()for small values ofinput- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(5)
>>> a
tensor([-1.0090, -0.9923, 1.0249, -0.5372, 0.2492])
>>> torch.log1p(a)
tensor([ nan, -4.8653, 0.7055, -0.7705, 0.2225])
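The accuracy note above is easy to demonstrate: in float32, 1 + 1e-10 rounds to exactly 1, so the naive formulation loses the result entirely:
>>> x = torch.tensor([1e-10])
>>> torch.log1p(x)
tensor([1.0000e-10])
>>> torch.log(1 + x)  # 1 + 1e-10 rounds to 1.0 in float32
tensor([0.])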
- 
torch.log2(input, *, out=None) → Tensor¶ Returns a new tensor with the logarithm to the base 2 of the elements of
input.\[y_{i} = \log_{2} (x_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.rand(5)
>>> a
tensor([ 0.8419, 0.8003, 0.9971, 0.5287, 0.0490])
>>> torch.log2(a)
tensor([-0.2483, -0.3213, -0.0042, -0.9196, -4.3504])
- 
torch.mul(input, other, *, out=None)¶ Multiplies each element of the input
inputwith the scalarotherand returns a new resulting tensor.\[\text{out}_i = \text{other} \times \text{input}_i \]If
inputis of type FloatTensor or DoubleTensor,othershould be a real number, otherwise it should be an integer- Parameters
 input (Tensor) – the input tensor.
other (Number) – the number to be multiplied to each element of
input
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(3)
>>> a
tensor([ 0.2015, -0.4255, 2.6087])
>>> torch.mul(a, 100)
tensor([ 20.1494, -42.5491, 260.8663])
- 
torch.mul(input, other, *, out=None)¶ 
Each element of the tensor
inputis multiplied by the corresponding element of the Tensorother. The resulting tensor is returned.The shapes of
inputandothermust be broadcastable.\[\text{out}_i = \text{input}_i \times \text{other}_i \]- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4, 1)
>>> a
tensor([[ 1.1207],
        [-0.3137],
        [ 0.0700],
        [ 0.8378]])
>>> b = torch.randn(1, 4)
>>> b
tensor([[ 0.5146, 0.1216, -0.5244, 2.2382]])
>>> torch.mul(a, b)
tensor([[ 0.5767,  0.1363, -0.5877,  2.5083],
        [-0.1614, -0.0382,  0.1645, -0.7021],
        [ 0.0360,  0.0085, -0.0367,  0.1567],
        [ 0.4312,  0.1019, -0.4394,  1.8753]])
- 
torch.mvlgamma(input, p) → Tensor¶ Computes the multivariate log-gamma function with dimension \(p\) element-wise, given by
\[\log(\Gamma_{p}(a)) = C + \displaystyle \sum_{i=1}^{p} \log\left(\Gamma\left(a - \frac{i - 1}{2}\right)\right) \]where \(C = \log(\pi) \times \frac{p (p - 1)}{4}\) and \(\Gamma(\cdot)\) is the Gamma function.
All elements must be greater than \(\frac{p - 1}{2}\), otherwise an error is thrown.
- Parameters
 
Example:
>>> a = torch.empty(2, 3).uniform_(1, 2)
>>> a
tensor([[1.6835, 1.8474, 1.1929],
        [1.0475, 1.7162, 1.4180]])
>>> torch.mvlgamma(a, 2)
tensor([[0.3928, 0.4007, 0.7586],
        [1.0311, 0.3901, 0.5049]])
- 
torch.neg(input, *, out=None) → Tensor¶ Returns a new tensor with the negative of the elements of
input.\[\text{out} = -1 \times \text{input} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(5)
>>> a
tensor([ 0.0090, -0.2262, -0.0682, -0.2866, 0.3940])
>>> torch.neg(a)
tensor([-0.0090, 0.2262, 0.0682, 0.2866, -0.3940])
- 
torch.pow(input, exponent, *, out=None) → Tensor¶ Takes the power of each element in
inputwithexponentand returns a tensor with the result.exponentcan be either a singlefloatnumber or a Tensor with the same number of elements asinput.When
exponentis a scalar value, the operation applied is:\[\text{out}_i = x_i ^ \text{exponent} \]When
exponentis a tensor, the operation applied is:\[\text{out}_i = x_i ^ {\text{exponent}_i} \]When
exponentis a tensor, the shapes ofinputandexponentmust be broadcastable.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.4331, 1.2475, 0.6834, -0.2791])
>>> torch.pow(a, 2)
tensor([ 0.1875, 1.5561, 0.4670, 0.0779])
>>> exp = torch.arange(1., 5.)
>>> a = torch.arange(1., 5.)
>>> a
tensor([ 1., 2., 3., 4.])
>>> exp
tensor([ 1., 2., 3., 4.])
>>> torch.pow(a, exp)
tensor([ 1., 4., 27., 256.])
- 
torch.pow(self, exponent, *, out=None) → Tensor¶ 
self is a scalar float value, and exponent is a tensor. The returned tensor out is of the same shape as exponent. The operation applied is:
\[\text{out}_i = \text{self} ^ {\text{exponent}_i} \]- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> exp = torch.arange(1., 5.)
>>> base = 2
>>> torch.pow(base, exp)
tensor([ 2., 4., 8., 16.])
- 
torch.reciprocal(input, *, out=None) → Tensor¶ Returns a new tensor with the reciprocal of the elements of
inputNote
Unlike NumPy’s reciprocal, torch.reciprocal supports integral inputs. Integral inputs to reciprocal are automatically promoted to the default scalar type.
\[\text{out}_{i} = \frac{1}{\text{input}_{i}} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.4595, -2.1219, -1.4314, 0.7298])
>>> torch.reciprocal(a)
tensor([-2.1763, -0.4713, -0.6986, 1.3702])
- 
torch.remainder(input, other, *, out=None) → Tensor¶ Computes the element-wise remainder of division.
The dividend and divisor may contain both integer and floating point numbers. The remainder has the same sign as the divisor
other.Supports broadcasting to a common shape, type promotion, and integer and float inputs.
Note
Complex inputs are not supported. In some cases, it is not mathematically possible to satisfy the definition of a modulo operation with complex numbers. See
torch.fmod()for how division by zero is handled.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> torch.remainder(torch.tensor([-3., -2, -1, 1, 2, 3]), 2)
tensor([ 1., 0., 1., 1., 0., 1.])
>>> torch.remainder(torch.tensor([1, 2, 3, 4, 5]), 1.5)
tensor([ 1.0000, 0.5000, 0.0000, 1.0000, 0.5000])
See also
torch.fmod(), which computes the element-wise remainder of division equivalently to the C library functionfmod().
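Since the two functions differ only in which operand's sign the result follows, a side-by-side sketch on the same input (values taken from the examples above) makes the distinction concrete:
>>> t = torch.tensor([-3., -2, -1, 1, 2, 3])
>>> torch.fmod(t, 2)       # sign follows the dividend
tensor([-1., -0., -1., 1., 0., 1.])
>>> torch.remainder(t, 2)  # sign follows the divisor
tensor([ 1., 0., 1., 1., 0., 1.])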
- 
torch.round(input, *, out=None) → Tensor¶ Returns a new tensor with each of the elements of
inputrounded to the closest integer.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.9920, 0.6077, 0.9734, -1.0362])
>>> torch.round(a)
tensor([ 1., 1., 1., -1.])
- 
torch.rsqrt(input, *, out=None) → Tensor¶ Returns a new tensor with the reciprocal of the square-root of each of the elements of
input.\[\text{out}_{i} = \frac{1}{\sqrt{\text{input}_{i}}} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.0370, 0.2970, 1.5420, -0.9105])
>>> torch.rsqrt(a)
tensor([ nan, 1.8351, 0.8053, nan])
- 
torch.sigmoid(input, *, out=None) → Tensor¶ Returns a new tensor with the sigmoid of the elements of
input.\[\text{out}_{i} = \frac{1}{1 + e^{-\text{input}_{i}}} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.9213, 1.0887, -0.8858, -1.7683])
>>> torch.sigmoid(a)
tensor([ 0.7153, 0.7481, 0.2920, 0.1458])
- 
torch.sign(input, *, out=None) → Tensor¶ Returns a new tensor with the signs of the elements of
input.\[\text{out}_{i} = \operatorname{sgn}(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.tensor([0.7, -1.2, 0., 2.3])
>>> a
tensor([ 0.7000, -1.2000, 0.0000, 2.3000])
>>> torch.sign(a)
tensor([ 1., -1., 0., 1.])
- 
torch.sin(input, *, out=None) → Tensor¶ Returns a new tensor with the sine of the elements of
input.\[\text{out}_{i} = \sin(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-0.5461, 0.1347, -2.7266, -0.2746])
>>> torch.sin(a)
tensor([-0.5194, 0.1343, -0.4032, -0.2711])
- 
torch.sinh(input, *, out=None) → Tensor¶ Returns a new tensor with the hyperbolic sine of the elements of
input.\[\text{out}_{i} = \sinh(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.5380, -0.8632, -0.1265, 0.9399])
>>> torch.sinh(a)
tensor([ 0.5644, -0.9744, -0.1268, 1.0845])
Note
When
input is on the CPU, the implementation of torch.sinh may use the Sleef library, which rounds very large results to infinity or negative infinity. See the Sleef documentation for details.
- 
torch.sqrt(input, *, out=None) → Tensor¶ Returns a new tensor with the square-root of the elements of
input.\[\text{out}_{i} = \sqrt{\text{input}_{i}} \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-2.0755, 1.0226, 0.0831, 0.4806])
>>> torch.sqrt(a)
tensor([ nan, 1.0112, 0.2883, 0.6933])
- 
torch.tan(input, *, out=None) → Tensor¶ Returns a new tensor with the tangent of the elements of
input.\[\text{out}_{i} = \tan(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([-1.2027, -1.7687, 0.4412, -1.3856])
>>> torch.tan(a)
tensor([-2.5930, 4.9859, 0.4722, -5.3366])
- 
torch.tanh(input, *, out=None) → Tensor¶ Returns a new tensor with the hyperbolic tangent of the elements of
input.\[\text{out}_{i} = \tanh(\text{input}_{i}) \]- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 0.8986, -0.7279, 1.1745, 0.2611])
>>> torch.tanh(a)
tensor([ 0.7156, -0.6218, 0.8257, 0.2553])
- 
torch.trunc(input, *, out=None) → Tensor¶ Returns a new tensor with the truncated integer values of the elements of
input.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4)
>>> a
tensor([ 3.4742, 0.5466, -0.8008, -0.9079])
>>> torch.trunc(a)
tensor([ 3., 0., -0., -0.])
Reduction Ops¶
- 
torch.argmax(input) → LongTensor¶ Returns the indices of the maximum value of all elements in the
inputtensor.This is the second value returned by
torch.max(). See its documentation for the exact semantics of this method.Note
If there are multiple maximal values then the indices of the first maximal value are returned.
- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 1.3398,  0.2663, -0.2686,  0.2450],
        [-0.7401, -0.8805, -0.3402, -1.1936],
        [ 0.4907, -1.3948, -1.0691, -0.3132],
        [-1.6092,  0.5419, -0.2993,  0.3195]])
>>> torch.argmax(a)
tensor(0)
- 
torch.argmax(input, dim, keepdim=False) → LongTensor¶ 
Returns the indices of the maximum values of a tensor across a dimension.
This is the second value returned by
torch.max(). See its documentation for the exact semantics of this method.- Parameters
 
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 1.3398,  0.2663, -0.2686,  0.2450],
        [-0.7401, -0.8805, -0.3402, -1.1936],
        [ 0.4907, -1.3948, -1.0691, -0.3132],
        [-1.6092,  0.5419, -0.2993,  0.3195]])
>>> torch.argmax(a, dim=1)
tensor([ 0, 2, 0, 1])
- 
torch.argmin(input, dim=None, keepdim=False) → LongTensor¶ Returns the indices of the minimum value(s) of the flattened tensor or along a dimension
This is the second value returned by
torch.min(). See its documentation for the exact semantics of this method.Note
If there are multiple minimal values then the indices of the first minimal value are returned.
- Parameters
 
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 0.1139,  0.2254, -0.1381,  0.3687],
        [ 1.0100, -1.1975, -0.0102, -0.4732],
        [-0.9240,  0.1207, -0.7506, -1.0213],
        [ 1.7809, -1.2960,  0.9384,  0.1438]])
>>> torch.argmin(a)
tensor(13)
>>> torch.argmin(a, dim=1)
tensor([ 2, 1, 3, 1])
>>> torch.argmin(a, dim=1, keepdim=True)
tensor([[2],
        [1],
        [3],
        [1]])
- 
torch.cumprod(input, dim, *, dtype=None, out=None) → Tensor¶ Returns the cumulative product of elements of
inputin the dimensiondim.For example, if
inputis a vector of size N, the result will also be a vector of size N, with elements.\[y_i = x_1 \times x_2\times x_3\times \dots \times x_i \]- Parameters
 - Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(10)
>>> a
tensor([ 0.6001, 0.2069, -0.1919, 0.9792, 0.6727, 1.0062, 0.4126, -0.2129, -0.4206, 0.1968])
>>> torch.cumprod(a, dim=0)
tensor([ 0.6001, 0.1241, -0.0238, -0.0233, -0.0157, -0.0158, -0.0065, 0.0014, -0.0006, -0.0001])
>>> a[5] = 0.0
>>> torch.cumprod(a, dim=0)
tensor([ 0.6001, 0.1241, -0.0238, -0.0233, -0.0157, -0.0000, -0.0000, 0.0000, -0.0000, -0.0000])
- 
torch.cumsum(input, dim, *, dtype=None, out=None) → Tensor¶ Returns the cumulative sum of elements of
inputin the dimensiondim.For example, if
inputis a vector of size N, the result will also be a vector of size N, with elements.\[y_i = x_1 + x_2 + x_3 + \dots + x_i \]- Parameters
 - Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(10)
>>> a
tensor([-0.8286, -0.4890, 0.5155, 0.8443, 0.1865, -0.1752, -2.0595, 0.1850, -1.1571, -0.4243])
>>> torch.cumsum(a, dim=0)
tensor([-0.8286, -1.3175, -0.8020, 0.0423, 0.2289, 0.0537, -2.0058, -1.8209, -2.9780, -3.4022])
- 
torch.dist(input, other, p=2) → Tensor¶ Returns the p-norm of (
input-other)The shapes of
inputandothermust be broadcastable.- Parameters
 
Example:
>>> x = torch.randn(4)
>>> x
tensor([-1.5393, -0.8675, 0.5916, 1.6321])
>>> y = torch.randn(4)
>>> y
tensor([ 0.0967, -1.0511, 0.6295, 0.8360])
>>> torch.dist(x, y, 3.5)
tensor(1.6727)
>>> torch.dist(x, y, 3)
tensor(1.6973)
>>> torch.dist(x, y, 0)
tensor(inf)
>>> torch.dist(x, y, 1)
tensor(2.6537)
- 
torch.logsumexp(input, dim, keepdim=False, *, out=None)¶ Returns the log of summed exponentials of each row of the
inputtensor in the given dimensiondim. The computation is numerically stabilized.For summation index \(j\) given by dim and other indices \(i\), the result is
\[\text{logsumexp}(x)_{i} = \log \sum_j \exp(x_{ij}) \]If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimension(s)dimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 (orlen(dim)) fewer dimension(s).- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(3, 3)
>>> torch.logsumexp(a, 1)
tensor([ 0.8442, 1.4322, 0.8711])
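The stabilized result agrees with the naive composition of exp, sum, and log whenever the latter does not overflow; a quick check:
>>> a = torch.randn(3, 3)
>>> torch.allclose(torch.logsumexp(a, 1), torch.log(torch.sum(torch.exp(a), 1)))
True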
- 
torch.mean(input) → Tensor¶ Returns the mean value of all elements in the
inputtensor.- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 0.2294, -0.5481, 1.3288]])
>>> torch.mean(a)
tensor(0.3367)
- 
torch.mean(input, dim, keepdim=False, *, out=None) → Tensor¶ 
Returns the mean value of each row of the
inputtensor in the given dimensiondim. Ifdimis a list of dimensions, reduce over all of them.If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimension(s)dimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 (orlen(dim)) fewer dimension(s).- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[-0.3841,  0.6320,  0.4254, -0.7384],
        [-0.9644,  1.0131, -0.6549, -1.4279],
        [-0.2951, -1.3350, -0.7694,  0.5600],
        [ 1.0842, -0.9580,  0.3623,  0.2343]])
>>> torch.mean(a, 1)
tensor([-0.0163, -0.5085, -0.4599, 0.1807])
>>> torch.mean(a, 1, True)
tensor([[-0.0163],
        [-0.5085],
        [-0.4599],
        [ 0.1807]])
- 
torch.median(input) → Tensor¶ Returns the median of the values in
input.Note
The median is not unique for
inputtensors with an even number of elements. In this case the lower of the two medians is returned. To compute the mean of both medians, usetorch.quantile()withq=0.5instead.Warning
This function produces deterministic (sub)gradients unlike
median(dim=0)- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 1.5219, -1.5212, 0.2202]])
>>> torch.median(a)
tensor(0.2202)
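The note above can be seen on an even-length input: torch.median returns the lower of the two middle values, while torch.quantile with q=0.5 interpolates between them:
>>> x = torch.tensor([0., 1., 2., 3.])
>>> torch.median(x)         # lower of the two middle values
tensor(1.)
>>> torch.quantile(x, 0.5)  # interpolates between 1. and 2.
tensor(1.5000)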
- 
torch.median(input, dim=-1, keepdim=False, *, out=None)¶ 
Returns a namedtuple
(values, indices)wherevaluescontains the median of each row ofinputin the dimensiondim, andindicescontains the index of the median values found in the dimensiondim.By default,
dimis the last dimension of theinputtensor.If
keepdimisTrue, the output tensors are of the same size asinputexcept in the dimensiondimwhere they are of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the outputs tensor having 1 fewer dimension thaninput.Note
The median is not unique for
inputtensors with an even number of elements in the dimensiondim. In this case the lower of the two medians is returned. To compute the mean of both medians ininput, usetorch.quantile()withq=0.5instead.Warning
indicesdoes not necessarily contain the first occurrence of each median value found, unless it is unique. The exact implementation details are device-specific. Do not expect the same result when run on CPU and GPU in general. For the same reason do not expect the gradients to be deterministic.- Parameters
 - Keyword Arguments
 out ((Tensor, Tensor), optional) – The first tensor will be populated with the median values and the second tensor, which must have dtype long, with their indices in the dimension
dimofinput.
Example:
>>> a = torch.randn(4, 5)
>>> a
tensor([[ 0.2505, -0.3982, -0.9948,  0.3518, -1.3131],
        [ 0.3180, -0.6993,  1.0436,  0.0438,  0.2270],
        [-0.2751,  0.7303,  0.2192,  0.3321,  0.2488],
        [ 1.0778, -1.9510,  0.7048,  0.4742, -0.7125]])
>>> torch.median(a, 1)
torch.return_types.median(values=tensor([-0.3982, 0.2270, 0.2488, 0.4742]), indices=tensor([1, 4, 4, 3]))
- 
torch.mode(input, dim=-1, keepdim=False, *, out=None)¶ Returns a namedtuple
(values, indices)wherevaluesis the mode value of each row of theinputtensor in the given dimensiondim, i.e. a value which appears most often in that row, andindicesis the index location of each mode value found.By default,
dimis the last dimension of theinputtensor.If
keepdimisTrue, the output tensors are of the same size asinputexcept in the dimensiondimwhere they are of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensors having 1 fewer dimension thaninput.Note
This function is not defined for
torch.cuda.Tensoryet.- Parameters
 - Keyword Arguments
 out (tuple, optional) – the result tuple of two output tensors (values, indices)
Example:
>>> a = torch.randint(10, (5,))
>>> a
tensor([6, 5, 1, 0, 2])
>>> b = a + (torch.randn(50, 1) * 5).long()
>>> torch.mode(b, 0)
torch.return_types.mode(values=tensor([6, 5, 1, 0, 2]), indices=tensor([2, 2, 2, 2, 2]))
- 
torch.norm(input, p='fro', dim=None, keepdim=False, out=None, dtype=None)[source]¶ Returns the matrix norm or vector norm of a given tensor.
Warning
torch.norm is deprecated and may be removed in a future PyTorch release. Use
torch.linalg.norm()instead, but note thattorch.linalg.norm()has a different signature and slightly different behavior that is more consistent with NumPy’s numpy.linalg.norm.- Parameters
 input (Tensor) – The input tensor. Its data type must be either a floating point or complex type. For complex inputs, the norm is calculated using the absolute value of each element. If the input is complex and neither
dtypenoroutis specified, the result’s data type will be the corresponding floating point type (e.g. float ifinputis complexfloat).p (int, float, inf, -inf, 'fro', 'nuc', optional) –
the order of norm. Default: 'fro'. The following norms can be calculated:

ord    | matrix norm    | vector norm
------ | -------------- | --------------------------
'fro'  | Frobenius norm | –
'nuc'  | nuclear norm   | –
Number | –              | sum(abs(x)**ord)**(1./ord)
The vector norm can be calculated across any number of dimensions. The corresponding dimensions of
inputare flattened into one dimension, and the norm is calculated on the flattened dimension.Frobenius norm produces the same result as
p=2in all cases except whendimis a list of three or more dims, in which case Frobenius norm throws an error.Nuclear norm can only be calculated across exactly two dimensions.
dim (int, tuple of python:ints, list of python:ints, optional) – Specifies which dimension or dimensions of
inputto calculate the norm across. IfdimisNone, the norm will be calculated across all dimensions ofinput. If the norm type indicated bypdoes not support the specified number of dimensions, an error will occur.keepdim (bool, optional) – whether the output tensors have
dimretained or not. Ignored ifdim=Noneandout=None. Default:Falseout (Tensor, optional) – the output tensor. Ignored if
dim=Noneandout=None.dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype while performing the operation. Default: None.
Note
Even though
p='fro'supports any number of dimensions, the true mathematical definition of Frobenius norm only applies to tensors with exactly two dimensions.torch.linalg.norm()withord='fro'aligns with the mathematical definition, since it can only be applied across exactly two dimensions.Example:
>>> import torch
>>> a = torch.arange(9, dtype=torch.float) - 4
>>> b = a.reshape((3, 3))
>>> torch.norm(a)
tensor(7.7460)
>>> torch.norm(b)
tensor(7.7460)
>>> torch.norm(a, float('inf'))
tensor(4.)
>>> torch.norm(b, float('inf'))
tensor(4.)
>>> c = torch.tensor([[ 1, 2, 3], [-1, 1, 4]], dtype=torch.float)
>>> torch.norm(c, dim=0)
tensor([1.4142, 2.2361, 5.0000])
>>> torch.norm(c, dim=1)
tensor([3.7417, 4.2426])
>>> torch.norm(c, p=1, dim=1)
tensor([6., 6.])
>>> d = torch.arange(8, dtype=torch.float).reshape(2, 2, 2)
>>> torch.norm(d, dim=(1, 2))
tensor([ 3.7417, 11.2250])
>>> torch.norm(d[0, :, :]), torch.norm(d[1, :, :])
(tensor(3.7417), tensor(11.2250))
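As a migration sketch for the deprecation warning above, the torch.linalg.norm calls below reproduce two of the torch.norm results from this example (the ord/dim arguments follow the numpy.linalg.norm-style signature):
>>> c = torch.tensor([[ 1., 2., 3.], [-1., 1., 4.]])
>>> torch.linalg.norm(c, dim=0)         # vector 2-norm down each column
tensor([1.4142, 2.2361, 5.0000])
>>> torch.linalg.norm(c, ord=1, dim=1)  # vector 1-norm along each row
tensor([6., 6.])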
- 
torch.prod(input, *, dtype=None) → Tensor¶ Returns the product of all elements in the
inputtensor.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[-0.8020, 0.5428, -1.5854]])
>>> torch.prod(a)
tensor(0.6902)
- 
torch.prod(input, dim, keepdim=False, *, dtype=None) → Tensor¶ 
Returns the product of each row of the
inputtensor in the given dimensiondim.If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimensiondimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 fewer dimension thaninput.- Parameters
 - Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
Example:
>>> a = torch.randn(4, 2)
>>> a
tensor([[ 0.5261, -0.3837],
        [ 1.1857, -0.2498],
        [-1.1646,  0.0705],
        [ 1.1131, -1.0629]])
>>> torch.prod(a, 1)
tensor([-0.2018, -0.2962, -0.0821, -1.1831])
- 
torch.std(input, unbiased=True) → Tensor¶ Returns the standard-deviation of all elements in the
inputtensor.If
unbiasedisFalse, then the standard-deviation will be calculated via the biased estimator. Otherwise, Bessel’s correction will be used.- Parameters
 
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[-0.8166, -1.3802, -0.3560]])
>>> torch.std(a)
tensor(0.5130)
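To make the unbiased flag concrete: for the four values below the squared deviations from the mean sum to 5, so the unbiased estimator divides by n - 1 = 3 and the biased one by n = 4:
>>> a = torch.tensor([1., 2., 3., 4.])
>>> torch.std(a)                  # sqrt(5 / 3)
tensor(1.2910)
>>> torch.std(a, unbiased=False)  # sqrt(5 / 4)
tensor(1.1180)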
- 
torch.std(input, dim, unbiased=True, keepdim=False, *, out=None) → Tensor¶ 
Returns the standard-deviation of each row of the
inputtensor in the dimensiondim. Ifdimis a list of dimensions, reduce over all of them.If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimension(s)dimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 (orlen(dim)) fewer dimension(s).If
unbiasedisFalse, then the standard-deviation will be calculated via the biased estimator. Otherwise, Bessel’s correction will be used.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 0.2035,  1.2959,  1.8101, -0.4644],
        [ 1.5027, -0.3270,  0.5905,  0.6538],
        [-1.5745,  1.3330, -0.5596, -0.6548],
        [ 0.1264, -0.5080,  1.6420,  0.1992]])
>>> torch.std(a, dim=1)
tensor([ 1.0311, 0.7477, 1.2204, 0.9087])
- 
torch.sum(input, *, dtype=None) → Tensor¶ Returns the sum of all elements in the
inputtensor.- Parameters
 input (Tensor) – the input tensor.
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 0.1133, -0.9567, 0.2958]])
>>> torch.sum(a)
tensor(-0.5475)
- 
torch.sum(input, dim, keepdim=False, *, dtype=None) → Tensor¶ 
Returns the sum of each row of the
inputtensor in the given dimensiondim. Ifdimis a list of dimensions, reduce over all of them.If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimension(s)dimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 (orlen(dim)) fewer dimension(s).- Parameters
 - Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. If specified, the input tensor is cast to dtype before the operation is performed. This is useful for preventing data type overflows. Default: None.
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 0.0569, -0.2475,  0.0737, -0.3429],
        [-0.2993,  0.9138,  0.9337, -1.6864],
        [ 0.1132,  0.7892, -0.1003,  0.5688],
        [ 0.3637, -0.9906, -0.4752, -1.5197]])
>>> torch.sum(a, 1)
tensor([-0.4598, -0.1381, 1.3708, -2.6217])
>>> b = torch.arange(4 * 5 * 6).view(4, 5, 6)
>>> torch.sum(b, (2, 1))
tensor([ 435., 1335., 2235., 3135.])
- 
torch.unique(*args, **kwargs)¶ Returns the unique elements of the input tensor.
Note
This function is different from
torch.unique_consecutive()in the sense that this function also eliminates non-consecutive duplicate values.Note
Currently in the CUDA implementation and the CPU implementation when dim is specified, torch.unique always sorts the tensor at the beginning regardless of the sort argument. Sorting can be slow, so if your input tensor is already sorted, it is recommended to use
torch.unique_consecutive()which avoids the sorting.- Parameters
 input (Tensor) – the input tensor
sorted (bool) – Whether to sort the unique elements in ascending order before returning as output.
return_inverse (bool) – Whether to also return the indices for where elements in the original input ended up in the returned unique list.
return_counts (bool) – Whether to also return the counts for each unique element.
dim (int) – the dimension to apply unique. If
None, the unique of the flattened input is returned. default:None
- Returns
 A tensor or a tuple of tensors containing
output (Tensor): the output list of unique scalar elements.
inverse_indices (Tensor): (optional) if
return_inverseis True, there will be an additional returned tensor (same shape as input) representing the indices for where elements in the original input map to in the output; otherwise, this function will only return a single tensor.counts (Tensor): (optional) if
return_countsis True, there will be an additional returned tensor (same shape as output or output.size(dim), if dim was specified) representing the number of occurrences for each unique value or tensor.
- Return type
 
Example:
>>> output = torch.unique(torch.tensor([1, 3, 2, 3], dtype=torch.long))
>>> output
tensor([ 2, 3, 1])
>>> output, inverse_indices = torch.unique(
...     torch.tensor([1, 3, 2, 3], dtype=torch.long), sorted=True, return_inverse=True)
>>> output
tensor([ 1, 2, 3])
>>> inverse_indices
tensor([ 0, 2, 1, 2])
>>> output, inverse_indices = torch.unique(
...     torch.tensor([[1, 3], [2, 3]], dtype=torch.long), sorted=True, return_inverse=True)
>>> output
tensor([ 1, 2, 3])
>>> inverse_indices
tensor([[ 0, 2],
        [ 1, 2]])
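The return_counts flag described above adds a counts tensor alongside the unique values; a short sketch reusing the input from the example:
>>> output, counts = torch.unique(torch.tensor([1, 3, 2, 3]), return_counts=True)
>>> output
tensor([1, 2, 3])
>>> counts
tensor([1, 1, 2])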
- 
torch.var(input, unbiased=True) → Tensor¶ Returns the variance of all elements in the
inputtensor.If
unbiasedisFalse, then the variance will be calculated via the biased estimator. Otherwise, Bessel’s correction will be used.- Parameters
 
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[-0.3425, -1.2636, -0.4864]])
>>> torch.var(a)
tensor(0.2455)
- 
torch.var(input, dim, unbiased=True, keepdim=False, *, out=None) → Tensor¶ 
Returns the variance of each row of the
inputtensor in the given dimensiondim.If
keepdimisTrue, the output tensor is of the same size asinputexcept in the dimension(s)dimwhere it is of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensor having 1 (orlen(dim)) fewer dimension(s).If
unbiasedisFalse, then the variance will be calculated via the biased estimator. Otherwise, Bessel’s correction will be used.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[-0.3567,  1.7385, -1.3042,  0.7423],
        [ 1.3436, -0.1015, -0.9834, -0.8438],
        [ 0.6056,  0.1089, -0.3112, -1.4085],
        [-0.7700,  0.6074, -0.1469,  0.7777]])
>>> torch.var(a, 1)
tensor([ 1.7444, 1.1363, 0.7356, 0.5112])
Comparison Ops¶
- 
torch.allclose(input, other, rtol=1e-05, atol=1e-08, equal_nan=False) → bool¶ This function checks if all
inputandothersatisfy the condition:\[\lvert \text{input} - \text{other} \rvert \leq \texttt{atol} + \texttt{rtol} \times \lvert \text{other} \rvert \]elementwise, for all elements of
inputandother. The behaviour of this function is analogous to numpy.allclose- Parameters
 
Example:
>>> torch.allclose(torch.tensor([10000., 1e-07]), torch.tensor([10000.1, 1e-08]))
False
>>> torch.allclose(torch.tensor([10000., 1e-08]), torch.tensor([10000.1, 1e-09]))
True
>>> torch.allclose(torch.tensor([1.0, float('nan')]), torch.tensor([1.0, float('nan')]))
False
>>> torch.allclose(torch.tensor([1.0, float('nan')]), torch.tensor([1.0, float('nan')]), equal_nan=True)
True
- 
torch.argsort(input, dim=-1, descending=False) → LongTensor¶ Returns the indices that sort a tensor along a given dimension in ascending order by value.
This is the second value returned by
torch.sort(). See its documentation for the exact semantics of this method.- Parameters
 
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[ 0.0785,  1.5267, -0.8521,  0.4065],
        [ 0.1598,  0.0788, -0.0745, -1.2700],
        [ 1.2208,  1.0722, -0.7064,  1.2564],
        [ 0.0669, -0.2318, -0.8229, -0.9280]])
>>> torch.argsort(a, dim=1)
tensor([[2, 0, 3, 1],
        [3, 2, 1, 0],
        [2, 1, 0, 3],
        [3, 2, 1, 0]])
- 
torch.eq(input, other, *, out=None) → Tensor¶ Computes element-wise equality
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis equal tootherand False elsewhere
Example:
>>> torch.eq(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[ True, False],
        [False, True]])
- 
torch.equal(input, other) → bool¶ Trueif two tensors have the same size and elements,Falseotherwise.Example:
>>> torch.equal(torch.tensor([1, 2]), torch.tensor([1, 2]))
True
- 
torch.ge(input, other, *, out=None) → Tensor¶ Computes \(\text{input} \geq \text{other}\) element-wise.
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis greater than or equal tootherand False elsewhere
Example:
>>> torch.ge(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[True, True],
        [False, True]])
- 
torch.gt(input, other, *, out=None) → Tensor¶ Computes \(\text{input} > \text{other}\) element-wise.
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
input (Tensor) – the tensor to compare
other (Tensor or float) – the tensor or value to compare
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis greater thanotherand False elsewhere
Example:
>>> torch.gt(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[False, True],
        [False, False]])
- 
torch.isfinite(input) → Tensor¶ Returns a new tensor with boolean elements representing if each element is finite or not.
Real values are finite when they are not NaN, negative infinity, or infinity. Complex values are finite when both their real and imaginary parts are finite.
- Parameters
input (Tensor) – the input tensor.
- Returns
 A boolean tensor that is True where
inputis finite and False elsewhere
Example:
>>> torch.isfinite(torch.tensor([1, float('inf'), 2, float('-inf'), float('nan')]))
tensor([True, False, True, False, False])
- 
torch.isinf(input) → Tensor¶ Tests if each element of
inputis infinite (positive or negative infinity) or not.Note
Complex values are infinite when their real or imaginary part is infinite.
- Parameters
input (Tensor) – the input tensor.
- Returns
 A boolean tensor that is True where
inputis infinite and False elsewhere
Example:
>>> torch.isinf(torch.tensor([1, float('inf'), 2, float('-inf'), float('nan')]))
tensor([False, True, False, True, False])
- 
torch.isnan(input) → Tensor¶ Returns a new tensor with boolean elements representing if each element of
input is NaN or not. Complex values are considered NaN when either their real or imaginary part is NaN.
- Parameters
 input (Tensor) – the input tensor.
- Returns
 A boolean tensor that is True where
inputis NaN and False elsewhere
Example:
>>> torch.isnan(torch.tensor([1, float('nan'), 2]))
tensor([False, True, False])
- 
torch.kthvalue(input, k, dim=None, keepdim=False, *, out=None)¶ Returns a namedtuple
(values, indices)wherevaluesis thekth smallest element of each row of theinputtensor in the given dimensiondim. Andindicesis the index location of each element found.If
dimis not given, the last dimension of the input is chosen.If
keepdimisTrue, both thevaluesandindicestensors are the same size asinput, except in the dimensiondimwhere they are of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in both thevaluesandindicestensors having 1 fewer dimension than theinputtensor.Note
When
inputis a CUDA tensor and there are multiple validkth values, this function may nondeterministically returnindicesfor any of them.- Parameters
input (Tensor) – the input tensor.
k (int) – k for the k-th smallest element
dim (int, optional) – the dimension to find the kth value along
keepdim (bool) – whether the output tensor has dim retained or not
- Keyword Arguments
 out (tuple, optional) – the output tuple of (Tensor, LongTensor) can be optionally given to be used as output buffers
Example:
>>> x = torch.arange(1., 6.)
>>> x
tensor([ 1.,  2.,  3.,  4.,  5.])
>>> torch.kthvalue(x, 4)
torch.return_types.kthvalue(values=tensor(4.), indices=tensor(3))
>>> x = torch.arange(1., 7.).resize_(2, 3)
>>> x
tensor([[ 1.,  2.,  3.],
        [ 4.,  5.,  6.]])
>>> torch.kthvalue(x, 2, 0, True)
torch.return_types.kthvalue(values=tensor([[4., 5., 6.]]), indices=tensor([[1, 1, 1]]))
- 
torch.le(input, other, *, out=None) → Tensor¶ Computes \(\text{input} \leq \text{other}\) element-wise.
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
input (Tensor) – the tensor to compare
other (Tensor or float) – the tensor or value to compare
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis less than or equal tootherand False elsewhere
Example:
>>> torch.le(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[True, False],
        [True, True]])
- 
torch.lt(input, other, *, out=None) → Tensor¶ Computes \(\text{input} < \text{other}\) element-wise.
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
input (Tensor) – the tensor to compare
other (Tensor or float) – the tensor or value to compare
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis less thanotherand False elsewhere
Example:
>>> torch.lt(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[False, False],
        [True, False]])
- 
torch.max(input) → Tensor¶ Returns the maximum value of all elements in the
inputtensor.Warning
This function produces deterministic (sub)gradients unlike
max(dim=0)- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 0.6763,  0.7445, -2.2369]])
>>> torch.max(a)
tensor(0.7445)
- 
torch.max(input, dim, keepdim=False, *, out=None)¶ 
Returns a namedtuple
(values, indices)wherevaluesis the maximum value of each row of theinputtensor in the given dimensiondim. Andindicesis the index location of each maximum value found (argmax).If
keepdimisTrue, the output tensors are of the same size asinputexcept in the dimensiondimwhere they are of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensors having 1 fewer dimension thaninput.Note
If there are multiple maximal values in a reduced row then the indices of the first maximal value are returned.
- Parameters
input (Tensor) – the input tensor.
dim (int) – the dimension to reduce.
keepdim (bool) – whether the output tensor has dim retained or not. Default: False.
- Keyword Arguments
 out (tuple, optional) – the result tuple of two output tensors (max, max_indices)
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[-1.2360, -0.2942, -0.1222,  0.8475],
        [ 1.1949, -1.1127, -2.2379, -0.6702],
        [ 1.5717, -0.9207,  0.1297, -1.8768],
        [-0.6172,  1.0036, -0.6060, -0.2432]])
>>> torch.max(a, 1)
torch.return_types.max(values=tensor([0.8475, 1.1949, 1.5717, 1.0036]), indices=tensor([3, 0, 0, 1]))
See
torch.maximum().
- 
torch.min(input) → Tensor¶ Returns the minimum value of all elements in the
inputtensor.Warning
This function produces deterministic (sub)gradients unlike
min(dim=0)- Parameters
 input (Tensor) – the input tensor.
Example:
>>> a = torch.randn(1, 3)
>>> a
tensor([[ 0.6750,  1.0857,  1.7197]])
>>> torch.min(a)
tensor(0.6750)
- 
torch.min(input, dim, keepdim=False, *, out=None)¶ 
Returns a namedtuple
(values, indices)wherevaluesis the minimum value of each row of theinputtensor in the given dimensiondim. Andindicesis the index location of each minimum value found (argmin).If
keepdimisTrue, the output tensors are of the same size asinputexcept in the dimensiondimwhere they are of size 1. Otherwise,dimis squeezed (seetorch.squeeze()), resulting in the output tensors having 1 fewer dimension thaninput.Note
If there are multiple minimal values in a reduced row then the indices of the first minimal value are returned.
- Parameters
input (Tensor) – the input tensor.
dim (int) – the dimension to reduce.
keepdim (bool) – whether the output tensor has dim retained or not. Default: False.
- Keyword Arguments
 out (tuple, optional) – the tuple of two output tensors (min, min_indices)
Example:
>>> a = torch.randn(4, 4)
>>> a
tensor([[-0.6248,  1.1334, -1.1899, -0.2803],
        [-1.4644, -0.2635, -0.3651,  0.6134],
        [ 0.2457,  0.0384,  1.0128,  0.7015],
        [-0.1153,  2.9849,  2.1458,  0.5788]])
>>> torch.min(a, 1)
torch.return_types.min(values=tensor([-1.1899, -1.4644,  0.0384, -0.1153]), indices=tensor([2, 0, 1, 0]))
See
torch.minimum().
- 
torch.ne(input, other, *, out=None) → Tensor¶ Computes \(\text{input} \neq \text{other}\) element-wise.
The second argument can be a number or a tensor whose shape is broadcastable with the first argument.
- Parameters
input (Tensor) – the tensor to compare
other (Tensor or float) – the tensor or value to compare
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 A boolean tensor that is True where
inputis not equal tootherand False elsewhere
Example:
>>> torch.ne(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
tensor([[False, True],
        [True, False]])
- 
torch.sort(input, dim=-1, descending=False, *, out=None)¶ Sorts the elements of the
inputtensor along a given dimension in ascending order by value.If
dimis not given, the last dimension of the input is chosen.If
descendingisTruethen the elements are sorted in descending order by value.A namedtuple of (values, indices) is returned, where the values are the sorted values and indices are the indices of the elements in the original input tensor.
- Parameters
input (Tensor) – the input tensor.
dim (int, optional) – the dimension to sort along
descending (bool, optional) – controls the sorting order (ascending or descending)
- Keyword Arguments
 out (tuple, optional) – the output tuple of (Tensor, LongTensor) that can be optionally given to be used as output buffers
Example:
>>> x = torch.randn(3, 4)
>>> sorted, indices = torch.sort(x)
>>> sorted
tensor([[-0.2162,  0.0608,  0.6719,  2.3332],
        [-0.5793,  0.0061,  0.6058,  0.9497],
        [-0.5071,  0.3343,  0.9553,  1.0960]])
>>> indices
tensor([[ 1,  0,  2,  3],
        [ 3,  1,  0,  2],
        [ 0,  3,  1,  2]])
>>> sorted, indices = torch.sort(x, 0)
>>> sorted
tensor([[-0.5071, -0.2162,  0.6719, -0.5793],
        [ 0.0608,  0.0061,  0.9497,  0.3343],
        [ 0.6058,  0.9553,  1.0960,  2.3332]])
>>> indices
tensor([[ 2,  0,  0,  1],
        [ 0,  1,  1,  2],
        [ 1,  2,  2,  0]])
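With descending=True the order flips; a minimal sketch with deterministic values:

>>> v = torch.tensor([3., 1., 2.])
>>> torch.sort(v, descending=True)
torch.return_types.sort(values=tensor([3., 2., 1.]), indices=tensor([0, 2, 1]))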
- 
torch.topk(input, k, dim=None, largest=True, sorted=True, *, out=None)¶ Returns the
klargest elements of the giveninputtensor along a given dimension.If
dimis not given, the last dimension of the input is chosen.If
largestisFalsethen the k smallest elements are returned.A namedtuple of (values, indices) is returned, where the indices are the indices of the elements in the original input tensor.
If the boolean option sorted is True, the returned k elements are themselves sorted.
- Parameters
input (Tensor) – the input tensor.
k (int) – the k in “top-k”
dim (int, optional) – the dimension to sort along
largest (bool, optional) – controls whether to return largest or smallest elements
sorted (bool, optional) – controls whether to return the elements in sorted order
- Keyword Arguments
 out (tuple, optional) – the output tuple of (Tensor, LongTensor) that can be optionally given to be used as output buffers
Example:
>>> x = torch.arange(1., 6.)
>>> x
tensor([ 1.,  2.,  3.,  4.,  5.])
>>> torch.topk(x, 3)
torch.return_types.topk(values=tensor([5., 4., 3.]), indices=tensor([4, 3, 2]))
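Passing largest=False returns the k smallest elements instead (a minimal sketch continuing the example above):

>>> torch.topk(x, 3, largest=False)
torch.return_types.topk(values=tensor([1., 2., 3.]), indices=tensor([0, 1, 2]))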
Spectral Ops¶
- 
torch.fft()¶ 
- 
torch.stft(input: torch.Tensor, n_fft: int, hop_length: Optional[int] = None, win_length: Optional[int] = None, window: Optional[torch.Tensor] = None, center: bool = True, pad_mode: str = 'reflect', normalized: bool = False, onesided: Optional[bool] = None, return_complex: Optional[bool] = None) → torch.Tensor[source]¶ Short-time Fourier transform (STFT).
Warning
From version 1.8.0,
return_complex must always be given explicitly for real inputs and return_complex=False has been deprecated. Strongly prefer return_complex=True, as in a future PyTorch release this function will only return complex tensors. Note that
torch.view_as_real() can be used to recover a real tensor with an extra last dimension for real and imaginary components. The STFT computes the Fourier transform of short overlapping windows of the input, giving the frequency components of the signal as they change over time. The interface of this function is modeled after the librosa stft function.
Ignoring the optional batch dimension, this method computes the following expression:
\[X[m, \omega] = \sum_{k = 0}^{\text{win\_length} - 1} \text{window}[k]\ \text{input}[m \times \text{hop\_length} + k]\ \exp\left(-j \frac{2 \pi \cdot \omega k}{\text{win\_length}}\right), \]where \(m\) is the index of the sliding window, and \(\omega\) is the frequency, with \(0 \leq \omega < \text{n\_fft}\). When
onesidedis the default valueTrue,inputmust be either a 1-D time sequence or a 2-D batch of time sequences.If
hop_lengthisNone(default), it is treated as equal tofloor(n_fft / 4).If
win_lengthisNone(default), it is treated as equal ton_fft.windowcan be a 1-D tensor of sizewin_length, e.g., fromtorch.hann_window(). IfwindowisNone(default), it is treated as if having \(1\) everywhere in the window. If \(\text{win\_length} < \text{n\_fft}\),windowwill be padded on both sides to lengthn_fftbefore being applied.If
centerisTrue(default),inputwill be padded on both sides so that the \(t\)-th frame is centered at time \(t \times \text{hop\_length}\). Otherwise, the \(t\)-th frame begins at time \(t \times \text{hop\_length}\).pad_modedetermines the padding method used oninputwhencenterisTrue. Seetorch.nn.functional.pad()for all available options. Default is"reflect".If
onesidedisTrue(default for real input), only values for \(\omega\) in \(\left[0, 1, 2, \dots, \left\lfloor \frac{\text{n\_fft}}{2} \right\rfloor + 1\right]\) are returned because the real-to-complex Fourier transform satisfies the conjugate symmetry, i.e., \(X[m, \omega] = X[m, \text{n\_fft} - \omega]^*\). Note if the input or window tensors are complex, thenonesidedoutput is not possible.If
normalizedisTrue(default isFalse), the function returns the normalized STFT results, i.e., multiplied by \((\text{frame\_length})^{-0.5}\).If
return_complexisTrue(default if input is complex), the return is ainput.dim() + 1dimensional complex tensor. IfFalse, the output is ainput.dim() + 2dimensional real tensor where the last dimension represents the real and imaginary components.
Returns either a complex tensor of size \((* \times N \times T)\) if
return_complex is true, or a real tensor of size \((* \times N \times T \times 2)\), where \(*\) is the optional batch size of input, \(N\) is the number of frequencies where STFT is applied, and \(T\) is the total number of frames used. Warning
This function changed signature at version 0.4.1. Calling with the previous signature may cause an error or return an incorrect result.
- Parameters
 input (Tensor) – the input tensor
n_fft (int) – size of Fourier transform
hop_length (int, optional) – the distance between neighboring sliding window frames. Default:
None(treated as equal tofloor(n_fft / 4))win_length (int, optional) – the size of window frame and STFT filter. Default:
None(treated as equal ton_fft)window (Tensor, optional) – the optional window function. Default:
None(treated as window of all \(1\) s)center (bool, optional) – whether to pad
inputon both sides so that the \(t\)-th frame is centered at time \(t \times \text{hop\_length}\). Default:Truepad_mode (string, optional) – controls the padding method used when
centerisTrue. Default:"reflect"normalized (bool, optional) – controls whether to return the normalized STFT results Default:
Falseonesided (bool, optional) – controls whether to return half of results to avoid redundancy for real inputs. Default:
Truefor realinputandwindow,Falseotherwise.return_complex (bool, optional) – whether to return a complex tensor, or a real tensor with an extra last dimension for the real and imaginary components.
- Returns
 A tensor containing the STFT result with shape described above
- Return type
 
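As a quick sanity check of the shapes described above, a minimal sketch (the sizes are illustrative assumptions, not prescriptive): one second of audio at a hypothetical 16 kHz sample rate with n_fft=400 and hop_length=160 yields n_fft // 2 + 1 = 201 one-sided frequency bins and, with the default center=True padding, 1 + 16000 // 160 = 101 frames:

>>> signal = torch.randn(1, 16000)   # (batch, time)
>>> spec = torch.stft(signal, n_fft=400, hop_length=160, return_complex=True)
>>> spec.shape                       # (batch, frequencies, frames)
torch.Size([1, 201, 101])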
- 
torch.bartlett_window(window_length, periodic=True, *, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶ Bartlett window function.
\[w[n] = 1 - \left| \frac{2n}{N-1} - 1 \right| = \begin{cases} \frac{2n}{N - 1} & \text{if } 0 \leq n \leq \frac{N - 1}{2} \\ 2 - \frac{2n}{N - 1} & \text{if } \frac{N - 1}{2} < n < N \\ \end{cases}, \]where \(N\) is the full window size.
The input
window_length is a positive integer controlling the returned window size. The periodic flag determines whether the returned window trims off the last duplicate value from the symmetric window and is ready to be used as a periodic window with functions like torch.stft(). Therefore, if periodic is true, the \(N\) in the above formula is in fact \(\text{window\_length} + 1\). Also, we always have torch.bartlett_window(L, periodic=True) equal to torch.bartlett_window(L + 1, periodic=False)[:-1]. Note
If
window_length\(=1\), the returned window contains a single value 1.- Parameters
window_length (int) – the size of returned window
periodic (bool, optional) – If True, returns a window to be used as periodic function. If False, return a symmetric window.
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. Default: ifNone, uses a global default (seetorch.set_default_tensor_type()). Only floating point types are supported.layout (
torch.layout, optional) – the desired layout of returned window tensor. Onlytorch.strided(dense layout) is supported.device (
torch.device, optional) – the desired device of returned tensor. Default: ifNone, uses the current device for the default tensor type (seetorch.set_default_tensor_type()).devicewill be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default:
False.
- Returns
 A 1-D tensor of size \((\text{window\_length},)\) containing the window
- Return type
 
- 
torch.blackman_window(window_length, periodic=True, *, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶ Blackman window function.
\[w[n] = 0.42 - 0.5 \cos \left( \frac{2 \pi n}{N - 1} \right) + 0.08 \cos \left( \frac{4 \pi n}{N - 1} \right) \]where \(N\) is the full window size.
The input
window_length is a positive integer controlling the returned window size. The periodic flag determines whether the returned window trims off the last duplicate value from the symmetric window and is ready to be used as a periodic window with functions like torch.stft(). Therefore, if periodic is true, the \(N\) in the above formula is in fact \(\text{window\_length} + 1\). Also, we always have torch.blackman_window(L, periodic=True) equal to torch.blackman_window(L + 1, periodic=False)[:-1]. Note
If
window_length\(=1\), the returned window contains a single value 1.- Parameters
window_length (int) – the size of returned window
periodic (bool, optional) – If True, returns a window to be used as periodic function. If False, return a symmetric window.
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. Default: ifNone, uses a global default (seetorch.set_default_tensor_type()). Only floating point types are supported.layout (
torch.layout, optional) – the desired layout of returned window tensor. Onlytorch.strided(dense layout) is supported.device (
torch.device, optional) – the desired device of returned tensor. Default: ifNone, uses the current device for the default tensor type (seetorch.set_default_tensor_type()).devicewill be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default:
False.
- Returns
 A 1-D tensor of size \((\text{window\_length},)\) containing the window
- Return type
 
- 
torch.hamming_window(window_length, periodic=True, alpha=0.54, beta=0.46, *, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶ Hamming window function.
\[w[n] = \alpha - \beta\ \cos \left( \frac{2 \pi n}{N - 1} \right), \]where \(N\) is the full window size.
The input
window_length is a positive integer controlling the returned window size. The periodic flag determines whether the returned window trims off the last duplicate value from the symmetric window and is ready to be used as a periodic window with functions like torch.stft(). Therefore, if periodic is true, the \(N\) in the above formula is in fact \(\text{window\_length} + 1\). Also, we always have torch.hamming_window(L, periodic=True) equal to torch.hamming_window(L + 1, periodic=False)[:-1]. Note
If
window_length\(=1\), the returned window contains a single value 1.Note
This is a generalized version of
torch.hann_window().- Parameters
 window_length (int) – the size of returned window
periodic (bool, optional) – If True, returns a window to be used as periodic function. If False, return a symmetric window.
alpha (float, optional) – The coefficient \(\alpha\) in the equation above
beta (float, optional) – The coefficient \(\beta\) in the equation above
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. Default: ifNone, uses a global default (seetorch.set_default_tensor_type()). Only floating point types are supported.layout (
torch.layout, optional) – the desired layout of returned window tensor. Onlytorch.strided(dense layout) is supported.device (
torch.device, optional) – the desired device of returned tensor. Default: ifNone, uses the current device for the default tensor type (seetorch.set_default_tensor_type()).devicewill be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default:
False.
- Returns
 A 1-D tensor of size \((\text{window\_length},)\) containing the window
- Return type
 
- 
torch.hann_window(window_length, periodic=True, *, dtype=None, layout=torch.strided, device=None, requires_grad=False) → Tensor¶ Hann window function.
\[w[n] = \frac{1}{2}\ \left[1 - \cos \left( \frac{2 \pi n}{N - 1} \right)\right] = \sin^2 \left( \frac{\pi n}{N - 1} \right), \]where \(N\) is the full window size.
The input
window_length is a positive integer controlling the returned window size. The periodic flag determines whether the returned window trims off the last duplicate value from the symmetric window and is ready to be used as a periodic window with functions like torch.stft(). Therefore, if periodic is true, the \(N\) in the above formula is in fact \(\text{window\_length} + 1\). Also, we always have torch.hann_window(L, periodic=True) equal to torch.hann_window(L + 1, periodic=False)[:-1]. Note
If
window_length\(=1\), the returned window contains a single value 1.- Parameters
window_length (int) – the size of returned window
periodic (bool, optional) – If True, returns a window to be used as periodic function. If False, return a symmetric window.
- Keyword Arguments
 dtype (
torch.dtype, optional) – the desired data type of returned tensor. Default: ifNone, uses a global default (seetorch.set_default_tensor_type()). Only floating point types are supported.layout (
torch.layout, optional) – the desired layout of returned window tensor. Onlytorch.strided(dense layout) is supported.device (
torch.device, optional) – the desired device of returned tensor. Default: ifNone, uses the current device for the default tensor type (seetorch.set_default_tensor_type()).devicewill be the CPU for CPU tensor types and the current CUDA device for CUDA tensor types.requires_grad (bool, optional) – If autograd should record operations on the returned tensor. Default:
False.
- Returns
 A 1-D tensor of size \((\text{window\_length},)\) containing the window
- Return type
 
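The periodic/symmetric relationship stated above is easy to verify numerically (a minimal sketch; the window length 8 is arbitrary):

>>> L = 8
>>> torch.allclose(torch.hann_window(L, periodic=True),
...                torch.hann_window(L + 1, periodic=False)[:-1])
True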
Other Operations¶
- 
torch.bincount(input, weights=None, minlength=0) → Tensor¶ Count the frequency of each value in an array of non-negative ints.
The number of bins (size 1) is one larger than the largest value in
inputunlessinputis empty, in which case the result is a tensor of size 0. Ifminlengthis specified, the number of bins is at leastminlengthand ifinputis empty, then the result is tensor of sizeminlengthfilled with zeros. Ifnis the value at positioni,out[n] += weights[i]ifweightsis specified elseout[n] += 1.Note
This operation may produce nondeterministic gradients when given tensors on a CUDA device. See Reproducibility for more information.
- Parameters
 - Returns
 a tensor of shape
Size([max(input) + 1]) if input is non-empty, else Size(0)
- Return type
 output (Tensor)
Example:
>>> input = torch.randint(0, 8, (5,), dtype=torch.int64)
>>> weights = torch.linspace(0, 1, steps=5)
>>> input, weights
(tensor([4, 3, 6, 3, 4]), tensor([ 0.0000,  0.2500,  0.5000,  0.7500,  1.0000]))
>>> torch.bincount(input)
tensor([0, 0, 0, 2, 2, 0, 1])
>>> input.bincount(weights)
tensor([0.0000, 0.0000, 0.0000, 1.0000, 1.0000, 0.0000, 0.5000])
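Following the out[n] += weights[i] rule above: the value 3 occurs at positions 1 and 3, so bin 3 accumulates weights[1] + weights[3] = 0.2500 + 0.7500 = 1.0000, and likewise bin 4 gets weights[0] + weights[4] = 0.0000 + 1.0000 = 1.0000.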
- 
torch.broadcast_tensors(*tensors) → List of Tensors[source]¶ Broadcasts the given tensors according to Broadcasting semantics.
- Parameters
 *tensors – any number of tensors of the same type
Warning
More than one element of a broadcasted tensor may refer to a single memory location. As a result, in-place operations (especially ones that are vectorized) may result in incorrect behavior. If you need to write to the tensors, please clone them first.
Example:
>>> x = torch.arange(3).view(1, 3)
>>> y = torch.arange(2).view(2, 1)
>>> a, b = torch.broadcast_tensors(x, y)
>>> a.size()
torch.Size([2, 3])
>>> a
tensor([[0, 1, 2],
        [0, 1, 2]])
- 
torch.cross(input, other, dim=None, *, out=None) → Tensor¶ Returns the cross product of vectors in dimension
dimofinputandother.inputandothermust have the same size, and the size of theirdimdimension should be 3.If
dimis not given, it defaults to the first dimension found with the size 3. Note that this might be unexpected.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(4, 3)
>>> a
tensor([[-0.3956,  1.1455,  1.6895],
        [-0.5849,  1.3672,  0.3599],
        [-1.1626,  0.7180, -0.0521],
        [-0.1339,  0.9902, -2.0225]])
>>> b = torch.randn(4, 3)
>>> b
tensor([[-0.0257, -1.4725, -1.2251],
        [-1.1479, -0.7005, -1.9757],
        [-1.3904,  0.3726, -1.1836],
        [-0.9688, -0.7153,  0.2159]])
>>> torch.cross(a, b, dim=1)
tensor([[ 1.0844, -0.5281,  0.6120],
        [-2.4490, -1.5687,  1.9792],
        [-0.8304, -1.3037,  0.5650],
        [-1.2329,  1.9883,  1.0551]])
>>> torch.cross(a, b)
tensor([[ 1.0844, -0.5281,  0.6120],
        [-2.4490, -1.5687,  1.9792],
        [-0.8304, -1.3037,  0.5650],
        [-1.2329,  1.9883,  1.0551]])
- 
torch.diag(input, diagonal=0, *, out=None) → Tensor¶ If
inputis a vector (1-D tensor), then returns a 2-D square tensor with the elements ofinputas the diagonal.If
inputis a matrix (2-D tensor), then returns a 1-D tensor with the diagonal elements ofinput.
The argument
diagonalcontrols which diagonal to consider:If
diagonal= 0, it is the main diagonal.If
diagonal> 0, it is above the main diagonal.If
diagonal< 0, it is below the main diagonal.
- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
See also
torch.diagonal()always returns the diagonal of its input.torch.diagflat()always constructs a tensor with diagonal elements specified by the input.Examples:
Get the square matrix where the input vector is the diagonal:
>>> a = torch.randn(3)
>>> a
tensor([ 0.5950, -0.0872,  2.3298])
>>> torch.diag(a)
tensor([[ 0.5950,  0.0000,  0.0000],
        [ 0.0000, -0.0872,  0.0000],
        [ 0.0000,  0.0000,  2.3298]])
>>> torch.diag(a, 1)
tensor([[ 0.0000,  0.5950,  0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.0872,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  2.3298],
        [ 0.0000,  0.0000,  0.0000,  0.0000]])
Get the k-th diagonal of a given matrix:
>>> a = torch.randn(3, 3)
>>> a
tensor([[-0.4264,  0.0255, -0.1064],
        [ 0.8795, -0.2429,  0.1374],
        [ 0.1029, -0.6482, -1.6300]])
>>> torch.diag(a, 0)
tensor([-0.4264, -0.2429, -1.6300])
>>> torch.diag(a, 1)
tensor([ 0.0255,  0.1374])
- 
torch.diag_embed(input, offset=0, dim1=-2, dim2=-1) → Tensor¶ Creates a tensor whose diagonals of certain 2D planes (specified by
dim1anddim2) are filled byinput. To facilitate creating batched diagonal matrices, the 2D planes formed by the last two dimensions of the returned tensor are chosen by default.The argument
offsetcontrols which diagonal to consider:If
offset= 0, it is the main diagonal.If
offset> 0, it is above the main diagonal.If
offset< 0, it is below the main diagonal.
The size of the new matrix will be calculated to make the specified diagonal of the size of the last input dimension. Note that for
offsetother than \(0\), the order ofdim1anddim2matters. Exchanging them is equivalent to changing the sign ofoffset.Applying
torch.diagonal()to the output of this function with the same arguments yields a matrix identical to input. However,torch.diagonal()has different default dimensions, so those need to be explicitly specified.- Parameters
 input (Tensor) – the input tensor. Must be at least 1-dimensional.
offset (int, optional) – which diagonal to consider. Default: 0 (main diagonal).
dim1 (int, optional) – first dimension with respect to which to take diagonal. Default: -2.
dim2 (int, optional) – second dimension with respect to which to take diagonal. Default: -1.
Example:
>>> a = torch.randn(2, 3)
>>> torch.diag_embed(a)
tensor([[[ 1.5410,  0.0000,  0.0000],
         [ 0.0000, -0.2934,  0.0000],
         [ 0.0000,  0.0000, -2.1788]],

        [[ 0.5684,  0.0000,  0.0000],
         [ 0.0000, -1.0845,  0.0000],
         [ 0.0000,  0.0000, -1.3986]]])
>>> torch.diag_embed(a, offset=1, dim1=0, dim2=2)
tensor([[[ 0.0000,  1.5410,  0.0000,  0.0000],
         [ 0.0000,  0.5684,  0.0000,  0.0000]],

        [[ 0.0000,  0.0000, -0.2934,  0.0000],
         [ 0.0000,  0.0000, -1.0845,  0.0000]],

        [[ 0.0000,  0.0000,  0.0000, -2.1788],
         [ 0.0000,  0.0000,  0.0000, -1.3986]],

        [[ 0.0000,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.0000]]])
- 
torch.diagflat(input, offset=0) → Tensor¶ If
inputis a vector (1-D tensor), then returns a 2-D square tensor with the elements ofinputas the diagonal.If
inputis a tensor with more than one dimension, then returns a 2-D tensor with diagonal elements equal to a flattenedinput.
The argument
offsetcontrols which diagonal to consider:If
offset= 0, it is the main diagonal.If
offset> 0, it is above the main diagonal.If
offset< 0, it is below the main diagonal.
- Parameters
 
Examples:
>>> a = torch.randn(3)
>>> a
tensor([-0.2956, -0.9068,  0.1695])
>>> torch.diagflat(a)
tensor([[-0.2956,  0.0000,  0.0000],
        [ 0.0000, -0.9068,  0.0000],
        [ 0.0000,  0.0000,  0.1695]])
>>> torch.diagflat(a, 1)
tensor([[ 0.0000, -0.2956,  0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.9068,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.1695],
        [ 0.0000,  0.0000,  0.0000,  0.0000]])
>>> a = torch.randn(2, 2)
>>> a
tensor([[ 0.2094, -0.3018],
        [-0.1516,  1.9342]])
>>> torch.diagflat(a)
tensor([[ 0.2094,  0.0000,  0.0000,  0.0000],
        [ 0.0000, -0.3018,  0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.1516,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  1.9342]])
- 
torch.diagonal(input, offset=0, dim1=0, dim2=1) → Tensor¶ Returns a partial view of
input with its diagonal elements with respect to dim1 and dim2 appended as a dimension at the end of the shape. The argument
offsetcontrols which diagonal to consider:If
offset= 0, it is the main diagonal.If
offset> 0, it is above the main diagonal.If
offset< 0, it is below the main diagonal.
Applying
torch.diag_embed()to the output of this function with the same arguments yields a diagonal matrix with the diagonal entries of the input. However,torch.diag_embed()has different default dimensions, so those need to be explicitly specified.- Parameters
 input (Tensor) – the input tensor. Must be at least 2-dimensional.
offset (int, optional) – which diagonal to consider. Default: 0 (main diagonal).
dim1 (int, optional) – first dimension with respect to which to take diagonal. Default: 0.
dim2 (int, optional) – second dimension with respect to which to take diagonal. Default: 1.
Note
To take a batch diagonal, pass in dim1=-2, dim2=-1.
Examples:
>>> a = torch.randn(3, 3)
>>> a
tensor([[-1.0854,  1.1431, -0.1752],
        [ 0.8536, -0.0905,  0.0360],
        [ 0.6927, -0.3735, -0.4945]])
>>> torch.diagonal(a, 0)
tensor([-1.0854, -0.0905, -0.4945])
>>> torch.diagonal(a, 1)
tensor([ 1.1431,  0.0360])
>>> x = torch.randn(2, 5, 4, 2)
>>> torch.diagonal(x, offset=-1, dim1=1, dim2=2)
tensor([[[-1.2631,  0.3755, -1.5977, -1.8172],
         [-1.1065,  1.0401, -0.2235, -0.7938]],

        [[-1.7325, -0.3081,  0.6166,  0.2335],
         [ 1.0500,  0.7336, -0.3836, -1.1015]]])
- 
torch.einsum(equation, *operands) → Tensor[source]¶ Sums the product of the elements of the input
operandsalong dimensions specified using a notation based on the Einstein summation convention.Einsum allows computing many common multi-dimensional linear algebraic array operations by representing them in a short-hand format based on the Einstein summation convention, given by
equation. The details of this format are described below, but the general idea is to label every dimension of the inputoperandswith some subscript and define which subscripts are part of the output. The output is then computed by summing the product of the elements of theoperandsalong the dimensions whose subscripts are not part of the output. For example, matrix multiplication can be computed using einsum as torch.einsum(“ij,jk->ik”, A, B). Here, j is the summation subscript and i and k the output subscripts (see section below for more details on why).Equation:
The
equation string specifies the subscripts (lower case letters [‘a’, ‘z’]) for each dimension of the input operands in the same order as the dimensions, separating subscripts for each operand by a comma (‘,’), e.g. ‘ij,jk’ specifies subscripts for two 2D operands. The dimensions labeled with the same subscript must be broadcastable, that is, their size must either match or be 1. The exception is if a subscript is repeated for the same input operand, in which case the dimensions labeled with this subscript for this operand must match in size and the operand will be replaced by its diagonal along these dimensions. The subscripts that appear exactly once in the equation will be part of the output, sorted in increasing alphabetical order. The output is computed by multiplying the input operands element-wise, with their dimensions aligned based on the subscripts, and then summing out the dimensions whose subscripts are not part of the output. Optionally, the output subscripts can be explicitly defined by adding an arrow (‘->’) at the end of the equation followed by the subscripts for the output. For instance, the following equation computes the transpose of a matrix multiplication: ‘ij,jk->ki’. The output subscripts must appear at least once for some input operand and at most once for the output.
Ellipsis (’…’) can be used in place of subscripts to broadcast the dimensions covered by the ellipsis. Each input operand may contain at most one ellipsis which will cover the dimensions not covered by subscripts, e.g. for an input operand with 5 dimensions, the ellipsis in the equation ‘ab…c’ cover the third and fourth dimensions. The ellipsis does not need to cover the same number of dimensions across the
operands but the ‘shape’ of the ellipsis (the size of the dimensions covered by them) must broadcast together. If the output is not explicitly defined with the arrow (‘->’) notation, the ellipsis will come first in the output (left-most dimensions), before the subscript labels that appear exactly once for the input operands, e.g. the following equation implements batch matrix multiplication: ‘...ij,...jk’. A few final notes: the equation may contain whitespaces between the different elements (subscripts, ellipsis, arrow and comma), but something like ‘. . .’ (an ellipsis written with spaces between the dots) is not valid. An empty string ‘’ is valid for scalar operands.
Note
torch.einsumhandles ellipsis (’…’) differently from NumPy in that it allows dimensions covered by the ellipsis to be summed over, that is, ellipsis are not required to be part of the output.Note
This function does not optimize the given expression, so a different formula for the same computation may run faster or consume less memory. Projects like opt_einsum (https://optimized-einsum.readthedocs.io/en/stable/) can optimize the formula for you.
- Parameters
 equation (string) – The subscripts for the Einstein summation.
operands (Tensor) – The operands to compute the Einstein sum of.
Examples:
>>> # trace
>>> torch.einsum('ii', torch.randn(4, 4))
tensor(-1.2104)

>>> # diagonal
>>> torch.einsum('ii->i', torch.randn(4, 4))
tensor([-0.1034,  0.7952, -0.2433,  0.4545])

>>> # outer product
>>> x = torch.randn(5)
>>> y = torch.randn(4)
>>> torch.einsum('i,j->ij', x, y)
tensor([[ 0.1156, -0.2897, -0.3918,  0.4963],
        [-0.3744,  0.9381,  1.2685, -1.6070],
        [ 0.7208, -1.8058, -2.4419,  3.0936],
        [ 0.1713, -0.4291, -0.5802,  0.7350],
        [ 0.5704, -1.4290, -1.9323,  2.4480]])

>>> # batch matrix multiplication
>>> As = torch.randn(3, 2, 5)
>>> Bs = torch.randn(3, 5, 4)
>>> torch.einsum('bij,bjk->bik', As, Bs)
tensor([[[-1.0564, -1.5904,  3.2023,  3.1271],
         [-1.6706, -0.8097, -0.8025, -2.1183]],

        [[ 4.2239,  0.3107, -0.5756, -0.2354],
         [-1.4558, -0.3460,  1.5087, -0.8530]],

        [[ 2.8153,  1.8787, -4.3839, -1.2112],
         [ 0.3728, -2.1131,  0.0921,  0.8305]]])

>>> # batch permute
>>> A = torch.randn(2, 3, 4, 5)
>>> torch.einsum('...ij->...ji', A).shape
torch.Size([2, 3, 5, 4])

>>> # equivalent to torch.nn.functional.bilinear
>>> A = torch.randn(3, 5, 4)
>>> l = torch.randn(2, 5)
>>> r = torch.randn(2, 4)
>>> torch.einsum('bn,anm,bm->ba', l, A, r)
tensor([[-0.3430, -5.2405,  0.4494],
        [ 0.3311,  5.5201, -3.0356]])
- 
torch.flatten(input, start_dim=0, end_dim=-1) → Tensor¶ Flattens
inputby reshaping it into a one-dimensional tensor. Ifstart_dimorend_dimare passed, only dimensions starting withstart_dimand ending withend_dimare flattened. The order of elements ininputis unchanged.Unlike NumPy’s flatten, which always copies input’s data, this function may return the original object, a view, or copy. If no dimensions are flattened, then the original object
inputis returned. Otherwise, if input can be viewed as the flattened shape, then that view is returned. Finally, only if the input cannot be viewed as the flattened shape is input’s data copied. Seetorch.Tensor.view()for details on when a view will be returned.Note
Flattening a zero-dimensional tensor will return a one-dimensional view.
- Parameters
 
Example:
>>> t = torch.tensor([[[1, 2],
...                    [3, 4]],
...                   [[5, 6],
...                    [7, 8]]])
>>> torch.flatten(t)
tensor([1, 2, 3, 4, 5, 6, 7, 8])
>>> torch.flatten(t, start_dim=1)
tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
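end_dim works the same way from the other side; flattening only the first two dimensions of the same tensor gives a (4, 2) result (a minimal sketch):

>>> torch.flatten(t, start_dim=0, end_dim=1)
tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])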
- 
torch.flip(input, dims) → Tensor¶ Reverses the order of an n-D tensor along the given axes in dims.
Note
torch.flip makes a copy of
input’s data. This is different from NumPy’s np.flip, which returns a view in constant time. Since copying a tensor’s data is more work than viewing that data, torch.flip is expected to be slower than np.flip.Example:
>>> x = torch.arange(8).view(2, 2, 2)
>>> x
tensor([[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]])
>>> torch.flip(x, [0, 1])
tensor([[[ 6,  7],
         [ 4,  5]],

        [[ 2,  3],
         [ 0,  1]]])
- 
torch.histc(input, bins=100, min=0, max=0, *, out=None) → Tensor¶ Computes the histogram of a tensor.
The elements are sorted into equal width bins between
minandmax. Ifminandmaxare both zero, the minimum and maximum values of the data are used.Elements lower than min and higher than max are ignored.
- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
- Returns
 Histogram represented as a tensor
- Return type
 
Example:
>>> torch.histc(torch.tensor([1., 2, 1]), bins=4, min=0, max=3)
tensor([ 0.,  2.,  1.,  0.])
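To trace the result: four bins over [0, 3] each have width 0.75, i.e. [0, 0.75), [0.75, 1.5), [1.5, 2.25) and [2.25, 3]; the two 1 s land in the second bin and the 2 lands in the third, giving the counts [0, 2, 1, 0].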
- 
torch.meshgrid(*tensors)[source]¶ Take \(N\) tensors, each of which can be either a scalar or a 1-dimensional vector, and create \(N\) N-dimensional grids, where the \(i\) th grid is defined by expanding the \(i\) th input over dimensions defined by the other inputs.
- Parameters
 tensors (list of Tensor) – list of scalars or 1 dimensional tensors. Scalars will be treated as tensors of size \((1,)\) automatically
- Returns
 If the input has \(k\) tensors of size \((N_1,), (N_2,), \ldots , (N_k,)\), then the output would also have \(k\) tensors, where all tensors are of size \((N_1, N_2, \ldots , N_k)\).
- Return type
 seq (sequence of Tensors)
Example:
>>> x = torch.tensor([1, 2, 3])
>>> y = torch.tensor([4, 5, 6])
>>> grid_x, grid_y = torch.meshgrid(x, y)
>>> grid_x
tensor([[1, 1, 1],
        [2, 2, 2],
        [3, 3, 3]])
>>> grid_y
tensor([[4, 5, 6],
        [4, 5, 6],
        [4, 5, 6]])
- 
torch.renorm(input, p, dim, maxnorm, *, out=None) → Tensor¶ Returns a tensor where each sub-tensor of
input along dimension dim is normalized such that the p-norm of the sub-tensor is lower than the value maxnorm. Note
If the norm of a row is lower than maxnorm, the row is unchanged
- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> x = torch.ones(3, 3)
>>> x[1].fill_(2)
tensor([ 2.,  2.,  2.])
>>> x[2].fill_(3)
tensor([ 3.,  3.,  3.])
>>> x
tensor([[ 1.,  1.,  1.],
        [ 2.,  2.,  2.],
        [ 3.,  3.,  3.]])
>>> torch.renorm(x, 1, 0, 5)
tensor([[ 1.0000,  1.0000,  1.0000],
        [ 1.6667,  1.6667,  1.6667],
        [ 1.6667,  1.6667,  1.6667]])
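To trace the result: with p=1 the row norms are 3, 6 and 9. The first row is already under maxnorm=5 and is left untouched, while the other two are rescaled to norm 5, e.g. 2 × 5/6 ≈ 1.6667 and 3 × 5/9 ≈ 1.6667.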
- 
torch.tensordot(a, b, dims=2, out=None)[source]¶ Returns a contraction of a and b over multiple dimensions.
tensordotimplements a generalized matrix product.- Parameters
 
When called with a non-negative integer argument
dims= \(d\), and the number of dimensions ofaandbis \(m\) and \(n\), respectively,tensordot()computes\[r_{i_0,...,i_{m-d}, i_d,...,i_n} = \sum_{k_0,...,k_{d-1}} a_{i_0,...,i_{m-d},k_0,...,k_{d-1}} \times b_{k_0,...,k_{d-1}, i_d,...,i_n}. \]When called with
dims of the list form, the given dimensions will be contracted in place of the last \(d\) dimensions of a and the first \(d\) dimensions of b. The sizes in these dimensions must match, but tensordot() will deal with broadcasted dimensions. Examples:
>>> a = torch.arange(60.).reshape(3, 4, 5)
>>> b = torch.arange(24.).reshape(4, 3, 2)
>>> torch.tensordot(a, b, dims=([1, 0], [0, 1]))
tensor([[4400., 4730.],
        [4532., 4874.],
        [4664., 5018.],
        [4796., 5162.],
        [4928., 5306.]])
>>> a = torch.randn(3, 4, 5, device='cuda')
>>> b = torch.randn(4, 5, 6, device='cuda')
>>> c = torch.tensordot(a, b, dims=2).cpu()
>>> c
tensor([[ 8.3504, -2.5436,  6.2922,  2.7556, -1.0732,  3.2741],
        [ 3.3161,  0.0704,  5.0187, -0.4079, -4.3126,  4.8744],
        [ 0.8223,  3.9445,  3.2168, -0.2400,  3.4117,  1.7780]])
>>> a = torch.randn(3, 5, 4, 6)
>>> b = torch.randn(6, 4, 5, 3)
>>> torch.tensordot(a, b, dims=([2, 1, 3], [1, 2, 0]))
tensor([[  7.7193,  -2.4867, -10.3204],
        [  1.5513, -14.4737,  -6.5113],
        [ -0.2850,   4.2573,  -3.5997]])
- 
torch.trace(input) → Tensor¶ Returns the sum of the elements of the diagonal of the input 2-D matrix.
Example:
>>> x = torch.arange(1., 10.).view(3, 3)
>>> x
tensor([[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.]])
>>> torch.trace(x)
tensor(15.)
- 
torch.tril(input, diagonal=0, *, out=None) → Tensor¶ Returns the lower triangular part of the matrix (2-D tensor) or batch of matrices
input, the other elements of the result tensoroutare set to 0.The lower triangular part of the matrix is defined as the elements on and below the diagonal.
The argument
diagonalcontrols which diagonal to consider. Ifdiagonal= 0, all elements on and below the main diagonal are retained. A positive value includes just as many diagonals above the main diagonal, and similarly a negative value excludes just as many diagonals below the main diagonal. The main diagonal are the set of indices \(\lbrace (i, i) \rbrace\) for \(i \in [0, \min\{d_{1}, d_{2}\} - 1]\) where \(d_{1}, d_{2}\) are the dimensions of the matrix.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(3, 3)
>>> a
tensor([[-1.0813, -0.8619,  0.7105],
        [ 0.0935,  0.1380,  2.2112],
        [-0.3409, -0.9828,  0.0289]])
>>> torch.tril(a)
tensor([[-1.0813,  0.0000,  0.0000],
        [ 0.0935,  0.1380,  0.0000],
        [-0.3409, -0.9828,  0.0289]])
>>> b = torch.randn(4, 6)
>>> b
tensor([[ 1.2219,  0.5653, -0.2521, -0.2345,  1.2544,  0.3461],
        [ 0.4785, -0.4477,  0.6049,  0.6368,  0.8775,  0.7145],
        [ 1.1502,  3.2716, -1.1243, -0.5413,  0.3615,  0.6864],
        [-0.0614, -0.7344, -1.3164, -0.7648, -1.4024,  0.0978]])
>>> torch.tril(b, diagonal=1)
tensor([[ 1.2219,  0.5653,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.4785, -0.4477,  0.6049,  0.0000,  0.0000,  0.0000],
        [ 1.1502,  3.2716, -1.1243, -0.5413,  0.0000,  0.0000],
        [-0.0614, -0.7344, -1.3164, -0.7648, -1.4024,  0.0000]])
>>> torch.tril(b, diagonal=-1)
tensor([[ 0.0000,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.4785,  0.0000,  0.0000,  0.0000,  0.0000,  0.0000],
        [ 1.1502,  3.2716,  0.0000,  0.0000,  0.0000,  0.0000],
        [-0.0614, -0.7344, -1.3164,  0.0000,  0.0000,  0.0000]])
- 
torch.triu(input, diagonal=0, *, out=None) → Tensor¶ Returns the upper triangular part of a matrix (2-D tensor) or batch of matrices
input, the other elements of the result tensoroutare set to 0.The upper triangular part of the matrix is defined as the elements on and above the diagonal.
The argument
diagonalcontrols which diagonal to consider. Ifdiagonal= 0, all elements on and above the main diagonal are retained. A positive value excludes just as many diagonals above the main diagonal, and similarly a negative value includes just as many diagonals below the main diagonal. The main diagonal are the set of indices \(\lbrace (i, i) \rbrace\) for \(i \in [0, \min\{d_{1}, d_{2}\} - 1]\) where \(d_{1}, d_{2}\) are the dimensions of the matrix.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.randn(3, 3)
>>> a
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.2072, -1.0680,  0.6602],
        [ 0.3480, -0.5211, -0.4573]])
>>> torch.triu(a)
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.0000, -1.0680,  0.6602],
        [ 0.0000,  0.0000, -0.4573]])
>>> torch.triu(a, diagonal=1)
tensor([[ 0.0000,  0.5207,  2.0049],
        [ 0.0000,  0.0000,  0.6602],
        [ 0.0000,  0.0000,  0.0000]])
>>> torch.triu(a, diagonal=-1)
tensor([[ 0.2309,  0.5207,  2.0049],
        [ 0.2072, -1.0680,  0.6602],
        [ 0.0000, -0.5211, -0.4573]])
>>> b = torch.randn(4, 6)
>>> b
tensor([[ 0.5876, -0.0794, -1.8373,  0.6654,  0.2604,  1.5235],
        [-0.2447,  0.9556, -1.2919,  1.3378, -0.1768, -1.0857],
        [ 0.4333,  0.3146,  0.6576, -1.0432,  0.9348, -0.4410],
        [-0.9888,  1.0679, -1.3337, -1.6556,  0.4798,  0.2830]])
>>> torch.triu(b, diagonal=1)
tensor([[ 0.0000, -0.0794, -1.8373,  0.6654,  0.2604,  1.5235],
        [ 0.0000,  0.0000, -1.2919,  1.3378, -0.1768, -1.0857],
        [ 0.0000,  0.0000,  0.0000, -1.0432,  0.9348, -0.4410],
        [ 0.0000,  0.0000,  0.0000,  0.0000,  0.4798,  0.2830]])
>>> torch.triu(b, diagonal=-1)
tensor([[ 0.5876, -0.0794, -1.8373,  0.6654,  0.2604,  1.5235],
        [-0.2447,  0.9556, -1.2919,  1.3378, -0.1768, -1.0857],
        [ 0.0000,  0.3146,  0.6576, -1.0432,  0.9348, -0.4410],
        [ 0.0000,  0.0000, -1.3337, -1.6556,  0.4798,  0.2830]])
BLAS and LAPACK Operations¶
- 
torch.addbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor¶ Performs a batch matrix-matrix product of matrices stored in
batch1andbatch2, with a reduced add step (all matrix multiplications get accumulated along the first dimension).inputis added to the final result.batch1andbatch2must be 3-D tensors each containing the same number of matrices.If
batch1is a \((b \times n \times m)\) tensor,batch2is a \((b \times m \times p)\) tensor,inputmust be broadcastable with a \((n \times p)\) tensor andoutwill be a \((n \times p)\) tensor.\[out = \beta\ \text{input} + \alpha\ (\sum_{i=0}^{b-1} \text{batch1}_i \mathbin{@} \text{batch2}_i) \]If
betais 0, theninputwill be ignored, and nan and inf in it will not be propagated.For inputs of type FloatTensor or DoubleTensor, arguments
betaandalphamust be real numbers, otherwise they should be integers.This operator supports TensorFloat32.
- Parameters
 - Keyword Arguments
 
Example:
>>> M = torch.randn(3, 5)
>>> batch1 = torch.randn(10, 3, 4)
>>> batch2 = torch.randn(10, 4, 5)
>>> torch.addbmm(M, batch1, batch2)
tensor([[  6.6311,   0.0503,   6.9768, -12.0362,  -2.1653],
        [ -4.8185,  -1.4255,  -6.6760,   8.9453,   2.5743],
        [ -3.8202,   4.3691,   1.0943,  -1.1109,   5.4730]])
- 
torch.addmm(input, mat1, mat2, *, beta=1, alpha=1, out=None) → Tensor¶ Performs a matrix multiplication of the matrices
mat1andmat2. The matrixinputis added to the final result.If
mat1 is a \((n \times m)\) tensor, mat2 is a \((m \times p)\) tensor, then input must be broadcastable with a \((n \times p)\) tensor and out will be a \((n \times p)\) tensor. alpha and beta are scaling factors on the matrix-matrix product between mat1 and mat2 and the added matrix input respectively.\[\text{out} = \beta\ \text{input} + \alpha\ (\text{mat1} \mathbin{@} \text{mat2}) \]If
betais 0, theninputwill be ignored, and nan and inf in it will not be propagated.For inputs of type FloatTensor or DoubleTensor, arguments
betaandalphamust be real numbers, otherwise they should be integers.This operator supports TensorFloat32.
- Parameters
 - Keyword Arguments
 beta (Number, optional) – multiplier for
input(\(\beta\))alpha (Number, optional) – multiplier for \(mat1 @ mat2\) (\(\alpha\))
out (Tensor, optional) – the output tensor.
Example:
>>> M = torch.randn(2, 3)
>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.addmm(M, mat1, mat2)
tensor([[-4.8716,  1.4671, -1.3746],
        [ 0.7573, -3.9555, -2.8681]])
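The beta=0 short-circuit described above means a non-finite input is never propagated (a minimal sketch continuing the example):

>>> inf_M = torch.full((2, 3), float('inf'))
>>> torch.addmm(inf_M, mat1, mat2, beta=0).isfinite().all()
tensor(True)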
- 
torch.addmv(input, mat, vec, *, beta=1, alpha=1, out=None) → Tensor¶ Performs a matrix-vector product of the matrix
matand the vectorvec. The vectorinputis added to the final result.If
matis a \((n \times m)\) tensor,vecis a 1-D tensor of size m, theninputmust be broadcastable with a 1-D tensor of size n andoutwill be 1-D tensor of size n.alphaandbetaare scaling factors on matrix-vector product betweenmatandvecand the added tensorinputrespectively.\[\text{out} = \beta\ \text{input} + \alpha\ (\text{mat} \mathbin{@} \text{vec}) \]If
betais 0, theninputwill be ignored, and nan and inf in it will not be propagated.For inputs of type FloatTensor or DoubleTensor, arguments
betaandalphamust be real numbers, otherwise they should be integers- Parameters
 - Keyword Arguments
 beta (Number, optional) – multiplier for
input(\(\beta\))alpha (Number, optional) – multiplier for \(mat @ vec\) (\(\alpha\))
out (Tensor, optional) – the output tensor.
Example:
>>> M = torch.randn(2)
>>> mat = torch.randn(2, 3)
>>> vec = torch.randn(3)
>>> torch.addmv(M, mat, vec)
tensor([-0.3768, -5.5565])
- 
torch.addr(input, vec1, vec2, *, beta=1, alpha=1, out=None) → Tensor¶ Performs the outer-product of vectors
vec1andvec2and adds it to the matrixinput.Optional values
betaandalphaare scaling factors on the outer product betweenvec1andvec2and the added matrixinputrespectively.\[\text{out} = \beta\ \text{input} + \alpha\ (\text{vec1} \otimes \text{vec2}) \]If
betais 0, theninputwill be ignored, and nan and inf in it will not be propagated.If
vec1is a vector of size n andvec2is a vector of size m, theninputmust be broadcastable with a matrix of size \((n \times m)\) andoutwill be a matrix of size \((n \times m)\).- Parameters
 - Keyword Arguments
 beta (Number, optional) – multiplier for
input(\(\beta\))alpha (Number, optional) – multiplier for \(\text{vec1} \otimes \text{vec2}\) (\(\alpha\))
out (Tensor, optional) – the output tensor.
Example:
>>> vec1 = torch.arange(1., 4.)
>>> vec2 = torch.arange(1., 3.)
>>> M = torch.zeros(3, 2)
>>> torch.addr(M, vec1, vec2)
tensor([[ 1.,  2.],
        [ 2.,  4.],
        [ 3.,  6.]])
- 
torch.baddbmm(input, batch1, batch2, *, beta=1, alpha=1, out=None) → Tensor¶ Performs a batch matrix-matrix product of matrices in
batch1andbatch2.inputis added to the final result.batch1andbatch2must be 3-D tensors each containing the same number of matrices.If
batch1is a \((b \times n \times m)\) tensor,batch2is a \((b \times m \times p)\) tensor, theninputmust be broadcastable with a \((b \times n \times p)\) tensor andoutwill be a \((b \times n \times p)\) tensor. Bothalphaandbetamean the same as the scaling factors used intorch.addbmm().\[\text{out}_i = \beta\ \text{input}_i + \alpha\ (\text{batch1}_i \mathbin{@} \text{batch2}_i) \]If
betais 0, theninputwill be ignored, and nan and inf in it will not be propagated.For inputs of type FloatTensor or DoubleTensor, arguments
betaandalphamust be real numbers, otherwise they should be integers.This operator supports TensorFloat32.
- Parameters
 - Keyword Arguments
 beta (Number, optional) – multiplier for
input(\(\beta\))alpha (Number, optional) – multiplier for \(\text{batch1} \mathbin{@} \text{batch2}\) (\(\alpha\))
out (Tensor, optional) – the output tensor.
Example:
>>> M = torch.randn(10, 3, 5)
>>> batch1 = torch.randn(10, 3, 4)
>>> batch2 = torch.randn(10, 4, 5)
>>> torch.baddbmm(M, batch1, batch2).size()
torch.Size([10, 3, 5])
- 
torch.bmm(input, mat2, *, deterministic=False, out=None) → Tensor¶ Performs a batch matrix-matrix product of matrices stored in
inputandmat2.inputandmat2must be 3-D tensors each containing the same number of matrices.If
inputis a \((b \times n \times m)\) tensor,mat2is a \((b \times m \times p)\) tensor,outwill be a \((b \times n \times p)\) tensor.\[\text{out}_i = \text{input}_i \mathbin{@} \text{mat2}_i \]This operator supports TensorFloat32.
Note
This function does not broadcast. For broadcasting matrix products, see
torch.matmul().- Parameters
 - Keyword Arguments
 
Example:
>>> input = torch.randn(10, 3, 4)
>>> mat2 = torch.randn(10, 4, 5)
>>> res = torch.bmm(input, mat2)
>>> res.size()
torch.Size([10, 3, 5])
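By contrast, torch.matmul() broadcasts, so a single unbatched matrix can multiply every batch element (a minimal sketch continuing the example above):

>>> single = torch.randn(4, 5)
>>> torch.matmul(input, single).size()   # bmm would reject the 2-D operand
torch.Size([10, 3, 5])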
- 
torch.chain_matmul(*matrices)[source]¶ Returns the matrix product of the \(N\) 2-D tensors. This product is efficiently computed using the matrix chain order algorithm, which selects the order that incurs the lowest cost in terms of arithmetic operations ([CLRS]). Note that since this is a function to compute the product, \(N\) needs to be greater than or equal to 2; if equal to 2 then a trivial matrix-matrix product is returned. If \(N\) is 1, then this is a no-op - the original matrix is returned as is.
- Parameters
 matrices (Tensors...) – a sequence of 2 or more 2-D tensors whose product is to be determined.
- Returns
 if the \(i^{th}\) tensor was of dimensions \(p_{i} \times p_{i + 1}\), then the product would be of dimensions \(p_{1} \times p_{N + 1}\).
- Return type
 
Example:
>>> a = torch.randn(3, 4)
>>> b = torch.randn(4, 5)
>>> c = torch.randn(5, 6)
>>> d = torch.randn(6, 7)
>>> torch.chain_matmul(a, b, c, d)
tensor([[ -2.3375,  -3.9790,  -4.1119,  -6.6577,   9.5609, -11.5095,  -3.2614],
        [ 21.4038,   3.3378,  -8.4982,  -5.2457, -10.2561,  -2.4684,   2.7163],
        [ -0.9647,  -5.8917,  -2.3213,  -5.2284,  12.8615, -12.2816,  -2.5095]])
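For the shapes above, evaluating left to right, ((ab)c)d, costs 3·4·5 + 3·5·6 + 3·6·7 = 276 scalar multiplications, while (ab)(cd) would cost 3·4·5 + 5·6·7 + 3·5·7 = 375; the chain-order algorithm searches the possible parenthesizations and picks the cheapest.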
- 
torch.cholesky(input, upper=False, *, out=None) → Tensor¶ Computes the Cholesky decomposition of a symmetric positive-definite matrix \(A\) or for batches of symmetric positive-definite matrices.
If
upperisTrue, the returned matrixUis upper-triangular, and the decomposition has the form:\[A = U^TU\]If
upperisFalse, the returned matrixLis lower-triangular, and the decomposition has the form:\[A = LL^T\]If
upperisTrue, and \(A\) is a batch of symmetric positive-definite matrices, then the returned tensor will be composed of upper-triangular Cholesky factors of each of the individual matrices. Similarly, whenupperisFalse, the returned tensor will be composed of lower-triangular Cholesky factors of each of the individual matrices.Note
torch.linalg.cholesky()should be used overtorch.choleskywhen possible. Note however thattorch.linalg.cholesky()does not yet support theupperparameter and instead always returns the lower triangular matrix.- Parameters
 - Keyword Arguments
 out (Tensor, optional) – the output matrix
Example:
>>> a = torch.randn(3, 3)
>>> a = torch.mm(a, a.t()) # make symmetric positive-definite
>>> l = torch.cholesky(a)
>>> a
tensor([[ 2.4112, -0.7486,  1.4551],
        [-0.7486,  1.3544,  0.1294],
        [ 1.4551,  0.1294,  1.6724]])
>>> l
tensor([[ 1.5528,  0.0000,  0.0000],
        [-0.4821,  1.0592,  0.0000],
        [ 0.9371,  0.5487,  0.7023]])
>>> torch.mm(l, l.t())
tensor([[ 2.4112, -0.7486,  1.4551],
        [-0.7486,  1.3544,  0.1294],
        [ 1.4551,  0.1294,  1.6724]])
>>> a = torch.randn(3, 2, 2)
>>> a = torch.matmul(a, a.transpose(-1, -2)) + 1e-03 # make symmetric positive-definite
>>> l = torch.cholesky(a)
>>> z = torch.matmul(l, l.transpose(-1, -2))
>>> torch.max(torch.abs(z - a)) # Max non-zero
tensor(2.3842e-07)
- 
torch.dot(input, other, *, out=None) → Tensor¶ Computes the dot product of two 1D tensors.
Note
Unlike NumPy’s dot, torch.dot intentionally only supports computing the dot product of two 1D tensors with the same number of elements.
- Parameters
input (Tensor) – first tensor in the dot product, must be 1D.
other (Tensor) – second tensor in the dot product, must be 1D.
- Keyword Arguments
out (Tensor, optional) – the output tensor.
Example:
>>> torch.dot(torch.tensor([2, 3]), torch.tensor([2, 1]))
tensor(7)
- 
torch.eig(input, eigenvectors=False, *, out=None)¶ Computes the eigenvalues and eigenvectors of a real square matrix.
Note
Since eigenvalues and eigenvectors might be complex, backward pass is supported only if eigenvalues and eigenvectors are all real valued.
When
inputis on CUDA,torch.eig()causes host-device synchronization.- Parameters
 - Keyword Arguments
 out (tuple, optional) – the output tensors
- Returns
 A namedtuple (eigenvalues, eigenvectors) containing
eigenvalues (Tensor): Shape \((n \times 2)\). Each row is an eigenvalue of
input, where the first element is the real part and the second element is the imaginary part. The eigenvalues are not necessarily ordered.eigenvectors (Tensor): If
eigenvectors=False, it’s an empty tensor. Otherwise, this tensor of shape \((n \times n)\) can be used to compute normalized (unit length) eigenvectors of corresponding eigenvalues as follows. If the corresponding eigenvalues[j] is a real number, column eigenvectors[:, j] is the eigenvector corresponding to eigenvalues[j]. If the corresponding eigenvalues[j] and eigenvalues[j + 1] form a complex conjugate pair, then the true eigenvectors can be computed as \(\text{true eigenvector}[j] = eigenvectors[:, j] + i \times eigenvectors[:, j + 1]\), \(\text{true eigenvector}[j + 1] = eigenvectors[:, j] - i \times eigenvectors[:, j + 1]\).
- Return type
 
Example:
Trivial example with a diagonal matrix. By default, only eigenvalues are computed:

>>> a = torch.diag(torch.tensor([1, 2, 3], dtype=torch.double))
>>> e, v = torch.eig(a)
>>> e
tensor([[1., 0.],
        [2., 0.],
        [3., 0.]], dtype=torch.float64)
>>> v
tensor([], dtype=torch.float64)

Compute also the eigenvectors:

>>> e, v = torch.eig(a, eigenvectors=True)
>>> e
tensor([[1., 0.],
        [2., 0.],
        [3., 0.]], dtype=torch.float64)
>>> v
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]], dtype=torch.float64)
- 
torch.geqrf(input, *, out=None)¶ This is a low-level function for calling LAPACK directly. This function returns a namedtuple (a, tau) as defined in LAPACK documentation for geqrf .
You’ll generally want to use
torch.qr()instead.Computes a QR decomposition of
input, but without constructing \(Q\) and \(R\) as explicit separate matrices.Rather, this directly calls the underlying LAPACK function ?geqrf which produces a sequence of ‘elementary reflectors’.
See LAPACK documentation for geqrf for further details.
- 
torch.ger(input, vec2, *, out=None) → Tensor¶ Alias of
torch.outer().Warning
This function is deprecated and will be removed in a future PyTorch release. Use
torch.outer()instead.
- 
torch.inverse(input, *, out=None) → Tensor¶ Takes the inverse of the square matrix
input.inputcan be batches of 2D square tensors, in which case this function would return a tensor composed of individual inverses.Supports real and complex input.
Note
torch.inverse() is deprecated. Please use torch.linalg.inv() instead.
Note
Irrespective of the original strides, the returned tensors will be transposed, i.e. with strides like input.contiguous().transpose(-2, -1).stride()
- Parameters
 input (Tensor) – the input tensor of size \((*, n, n)\) where * is zero or more batch dimensions
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Examples:
>>> x = torch.rand(4, 4)
>>> y = torch.inverse(x)
>>> z = torch.mm(x, y)
>>> z
tensor([[ 1.0000, -0.0000, -0.0000,  0.0000],
        [ 0.0000,  1.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  1.0000,  0.0000],
        [ 0.0000, -0.0000, -0.0000,  1.0000]])
>>> torch.max(torch.abs(z - torch.eye(4))) # Max non-zero
tensor(1.1921e-07)
>>> # Batched inverse example
>>> x = torch.randn(2, 3, 4, 4)
>>> y = torch.inverse(x)
>>> z = torch.matmul(x, y)
>>> torch.max(torch.abs(z - torch.eye(4).expand_as(x))) # Max non-zero
tensor(1.9073e-06)
>>> x = torch.rand(4, 4, dtype=torch.cdouble)
>>> y = torch.inverse(x)
>>> z = torch.mm(x, y)
>>> z
tensor([[ 1.0000e+00+0.0000e+00j, -1.3878e-16+3.4694e-16j,
          5.5511e-17-1.1102e-16j,  0.0000e+00-1.6653e-16j],
        [ 5.5511e-16-1.6653e-16j,  1.0000e+00+6.9389e-17j,
          2.2204e-16-1.1102e-16j, -2.2204e-16+1.1102e-16j],
        [ 3.8858e-16-1.2490e-16j,  2.7756e-17+3.4694e-17j,
          1.0000e+00+0.0000e+00j, -4.4409e-16+5.5511e-17j],
        [ 4.4409e-16+5.5511e-16j, -3.8858e-16+1.8041e-16j,
          2.2204e-16+0.0000e+00j,  1.0000e+00-3.4694e-16j]],
       dtype=torch.complex128)
>>> torch.max(torch.abs(z - torch.eye(4, dtype=torch.cdouble))) # Max non-zero
tensor(7.5107e-16, dtype=torch.float64)
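Per the deprecation note above, new code should call torch.linalg.inv(); both entry points compute the same inverse, so migration is mechanical:

>>> x = torch.rand(4, 4)
>>> torch.allclose(torch.inverse(x), torch.linalg.inv(x))
True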
- 
torch.det(input) → Tensor¶ Calculates the determinant of a square matrix or batches of square matrices.
Note
torch.det() is deprecated. Please use torch.linalg.det() instead.
Note
Backward through det() internally uses SVD results when input is not invertible. In this case, double backward through det() will be unstable when input doesn’t have distinct singular values. See torch.svd() for details.
 input (Tensor) – the input tensor of size
(*, n, n) where * is zero or more batch dimensions.
Example:
>>> A = torch.randn(3, 3)
>>> torch.det(A)
tensor(3.7641)
>>> A = torch.randn(3, 2, 2)
>>> A
tensor([[[ 0.9254, -0.6213],
         [-0.5787,  1.6843]],

        [[ 0.3242, -0.9665],
         [ 0.4539, -0.0887]],

        [[ 1.1336, -0.4025],
         [-0.7089,  0.9032]]])
>>> A.det()
tensor([1.1990, 0.4099, 0.7386])
- 
torch.logdet(input) → Tensor¶ Calculates the log determinant of a square matrix or batches of square matrices.
Note
Result is
-inf if input has zero determinant, and is nan if input has negative determinant.
Note
Backward through
logdet() internally uses SVD results when input is not invertible. In this case, double backward through logdet() will be unstable when input doesn’t have distinct singular values. See torch.svd() for details.
 input (Tensor) – the input tensor of size
(*, n, n) where * is zero or more batch dimensions.
Example:
>>> A = torch.randn(3, 3)
>>> torch.det(A)
tensor(0.2611)
>>> torch.logdet(A)
tensor(-1.3430)
>>> A = torch.randn(3, 2, 2)
>>> A
tensor([[[ 0.9254, -0.6213],
         [-0.5787,  1.6843]],

        [[ 0.3242, -0.9665],
         [ 0.4539, -0.0887]],

        [[ 1.1336, -0.4025],
         [-0.7089,  0.9032]]])
>>> A.det()
tensor([1.1990, 0.4099, 0.7386])
>>> A.det().log()
tensor([ 0.1815, -0.8917, -0.3031])
- 
torch.slogdet(input)¶ Calculates the sign and log absolute value of the determinant(s) of a square matrix or batches of square matrices.
Note
torch.slogdet() is deprecated. Please use torch.linalg.slogdet() instead.
Note
If
input has zero determinant, this returns (0, -inf).
Note
Backward through
slogdet() internally uses SVD results when input is not invertible. In this case, double backward through slogdet() will be unstable when input doesn’t have distinct singular values. See torch.svd() for details.
 input (Tensor) – the input tensor of size
(*, n, n) where * is zero or more batch dimensions.
- Returns
 A namedtuple (sign, logabsdet) containing the sign of the determinant, and the log value of the absolute determinant.
Example:
>>> A = torch.randn(3, 3)
>>> A
tensor([[ 0.0032, -0.2239, -1.1219],
        [-0.6690,  0.1161,  0.4053],
        [-1.6218, -0.9273, -0.0082]])
>>> torch.det(A)
tensor(-0.7576)
>>> torch.logdet(A)
tensor(nan)
>>> torch.slogdet(A)
torch.return_types.slogdet(sign=tensor(-1.), logabsdet=tensor(-0.2776))
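Whenever the determinant is finite it can be recovered from the returned pair. Continuing the example above:

>>> sign, logabsdet = torch.slogdet(A)
>>> sign * torch.exp(logabsdet)  # equals torch.det(A) up to rounding
tensor(-0.7576)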
- 
torch.matmul(input, other, *, out=None) → Tensor¶ Matrix product of two tensors.
The behavior depends on the dimensionality of the tensors as follows:
If both tensors are 1-dimensional, the dot product (scalar) is returned.
If both arguments are 2-dimensional, the matrix-matrix product is returned.
If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.
If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned. If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after. If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiply and removed after. The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if input is a \((j \times 1 \times n \times n)\) tensor and other is a \((k \times n \times n)\) tensor, out will be a \((j \times k \times n \times n)\) tensor.
Note that the broadcasting logic only looks at the batch dimensions when determining if the inputs are broadcastable, and not the matrix dimensions. For example, if input is a \((j \times 1 \times n \times m)\) tensor and other is a \((k \times m \times p)\) tensor, these inputs are valid for broadcasting even though the final two dimensions (i.e. the matrix dimensions) are different. out will be a \((j \times k \times n \times p)\) tensor.
This operator supports TensorFloat32.
Note
The 1-dimensional dot product version of this function does not support an
out parameter.
- Parameters
 input (Tensor) – the first tensor to be multiplied
 other (Tensor) – the second tensor to be multiplied
- Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> # vector x vector
>>> tensor1 = torch.randn(3)
>>> tensor2 = torch.randn(3)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([])
>>> # matrix x vector
>>> tensor1 = torch.randn(3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([3])
>>> # batched matrix x broadcasted vector
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3])
>>> # batched matrix x batched matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(10, 4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
>>> # batched matrix x broadcasted matrix
>>> tensor1 = torch.randn(10, 3, 4)
>>> tensor2 = torch.randn(4, 5)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([10, 3, 5])
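The batch-broadcasting rule described above (only the batch dimensions broadcast; the matrix dimensions follow the usual multiplication rule) can be checked directly:

>>> # broadcasted batch dims: (2, 1) and (5,) broadcast to (2, 5)
>>> tensor1 = torch.randn(2, 1, 3, 4)
>>> tensor2 = torch.randn(5, 4, 6)
>>> torch.matmul(tensor1, tensor2).size()
torch.Size([2, 5, 3, 6])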
- 
torch.matrix_power(input, n) → Tensor¶ Returns the matrix raised to the power
n for square matrices. For a batch of matrices, each individual matrix is raised to the power n.
If n is negative, then the inverse of the matrix (if invertible) is raised to the power n. For a batch of matrices, the batched inverse (if invertible) is raised to the power n. If n is 0, then an identity matrix is returned.
Example:
>>> a = torch.randn(2, 2, 2)
>>> a
tensor([[[-1.9975, -1.9610],
         [ 0.9592, -2.3364]],

        [[-1.2534, -1.3429],
         [ 0.4153, -1.4664]]])
>>> torch.matrix_power(a, 3)
tensor([[[  3.9392, -23.9916],
         [ 11.7357,  -0.2070]],

        [[  0.2468,  -6.7168],
         [  2.0774,  -0.8187]]])
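The n = 0 and negative-power behaviors described above can also be verified directly (negative powers require an invertible matrix):

>>> a = torch.tensor([[2., 0.], [0., 4.]])
>>> torch.matrix_power(a, 0)  # n = 0 yields the identity
tensor([[1., 0.],
        [0., 1.]])
>>> torch.allclose(torch.matrix_power(a, -1), torch.inverse(a))
True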
- 
torch.matrix_rank(input, tol=None, symmetric=False, *, out=None) → Tensor¶ Returns the numerical rank of a 2-D tensor. The matrix rank is computed using SVD by default. If
symmetric is True, then input is assumed to be symmetric, and the computation of the rank is done by obtaining the eigenvalues.
tol is the threshold below which the singular values (or the eigenvalues when symmetric is True) are considered to be 0. If tol is not specified, tol is set to S.max() * max(S.size()) * eps where S is the singular values (or the eigenvalues when symmetric is True), and eps is the epsilon value for the datatype of input.
Note
torch.matrix_rank() is deprecated. Please use torch.linalg.matrix_rank() instead. The parameter symmetric was renamed in torch.linalg.matrix_rank() to hermitian.
- Parameters
 input (Tensor) – the input 2-D tensor
 tol (float, optional) – the tolerance value. Default: None
 symmetric (bool, optional) – indicates whether input is symmetric. Default: False
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> a = torch.eye(10)
>>> torch.matrix_rank(a)
tensor(10)
>>> b = torch.eye(10)
>>> b[0, 0] = 0
>>> torch.matrix_rank(b)
tensor(9)
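Migrating per the deprecation note, with symmetric renamed to hermitian; a random symmetric matrix is almost surely full rank, so:

>>> s = torch.randn(4, 4)
>>> s = s + s.t()  # make it symmetric
>>> torch.linalg.matrix_rank(s, hermitian=True)  # was torch.matrix_rank(s, symmetric=True)
tensor(4)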
- 
torch.mm(input, mat2, *, out=None) → Tensor¶ Performs a matrix multiplication of the matrices
input and mat2.
If input is a \((n \times m)\) tensor, mat2 is a \((m \times p)\) tensor, out will be a \((n \times p)\) tensor.
Note
This function does not broadcast. For broadcasting matrix products, see
torch.matmul().
Supports strided and sparse 2-D tensors as inputs, and autograd with respect to strided inputs.
This operator supports TensorFloat32.
- Parameters
 input (Tensor) – the first matrix to be multiplied
 mat2 (Tensor) – the second matrix to be multiplied
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
tensor([[ 0.4851,  0.5037, -0.3633],
        [-0.0760, -3.6705,  2.4784]])
- 
torch.mv(input, vec, *, out=None) → Tensor¶ Performs a matrix-vector product of the matrix
input and the vector vec.
If input is a \((n \times m)\) tensor, vec is a 1-D tensor of size \(m\), out will be 1-D of size \(n\).
Note
This function does not broadcast.
- Parameters
 input (Tensor) – matrix to be multiplied
 vec (Tensor) – vector to be multiplied
 - Keyword Arguments
 out (Tensor, optional) – the output tensor.
Example:
>>> mat = torch.randn(2, 3)
>>> vec = torch.randn(3)
>>> torch.mv(mat, vec)
tensor([ 1.0404, -0.6361])
- 
torch.orgqr(input, input2) → Tensor¶ Computes the orthogonal matrix Q of a QR factorization, from the (input, input2) tuple returned by
torch.geqrf().
This directly calls the underlying LAPACK function ?orgqr. See LAPACK documentation for orgqr for further details.
- Parameters
 input (Tensor) – the a from
torch.geqrf().
input2 (Tensor) – the tau from
torch.geqrf().
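Together with torch.geqrf(), this reproduces the reduced QR factorization: R sits in the upper triangle of the first geqrf output, and orgqr materializes Q from the reflectors. A minimal sketch:

>>> a = torch.randn(5, 3, dtype=torch.float64)
>>> g, tau = torch.geqrf(a)
>>> q = torch.orgqr(g, tau)  # explicit Q with orthonormal columns
>>> r = torch.triu(g[:3, :])  # R is stored in the upper triangle of g
>>> torch.allclose(torch.mm(q, r), a)
True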
- 
torch.ormqr(input, input2, input3, left=True, transpose=False) → Tensor¶ Multiplies mat (given by
input3) by the orthogonal Q matrix of the QR factorization formed by torch.geqrf() that is represented by (a, tau) (given by (input, input2)).
This directly calls the underlying LAPACK function ?ormqr. See LAPACK documentation for ormqr for further details.
- Parameters
 input (Tensor) – the a from
torch.geqrf().
input2 (Tensor) – the tau from torch.geqrf().
input3 (Tensor) – the matrix to be multiplied.
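This makes it possible to apply Qᵀ (or Q) to a matrix without ever materializing Q, e.g. to form Qᵀb in a least-squares solve. A minimal sketch, checked against the explicit Q from torch.orgqr():

>>> a = torch.randn(5, 3, dtype=torch.float64)
>>> g, tau = torch.geqrf(a)
>>> b = torch.randn(5, 2, dtype=torch.float64)
>>> qtb = torch.ormqr(g, tau, b, left=True, transpose=True)  # applies Q^T without forming Q
>>> q = torch.orgqr(g, tau)
>>> torch.allclose(qtb[:3], torch.mm(q.t(), b))
True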
- 
torch.pinverse(input, rcond=1e-15) → Tensor¶ Calculates the pseudo-inverse (also known as the Moore-Penrose inverse) of a 2D tensor. Please look at Moore-Penrose inverse for more details.
Note
torch.pinverse() is deprecated. Please use torch.linalg.pinv() instead, which includes new parameters hermitian and out.
Note
This method is implemented using the Singular Value Decomposition.
Note
The pseudo-inverse is not necessarily a continuous function in the elements of the matrix [1]. Therefore, derivatives do not always exist, and exist for a constant rank only [2]. However, this method is backprop-able due to the implementation using SVD results, and could be unstable. Double-backward will also be unstable due to the usage of SVD internally. See torch.svd() for more details.
Note
Supports real and complex inputs. Batched version for complex inputs is only supported on the CPU.
- Parameters
 input (Tensor) – the input tensor of size \((*, m, n)\) where * is zero or more batch dimensions
 rcond (float, optional) – a floating point value to determine the cutoff for small singular values. Default: 1e-15
 - Returns
 The pseudo-inverse of
input of dimensions \((*, n, m)\)
Example:
>>> input = torch.randn(3, 5)
>>> input
tensor([[ 0.5495,  0.0979, -1.4092, -0.1128,  0.4132],
        [-1.1143, -0.3662,  0.3042,  1.6374, -0.9294],
        [-0.3269, -0.5745, -0.0382, -0.5922, -0.6759]])
>>> torch.pinverse(input)
tensor([[ 0.0600, -0.1933, -0.2090],
        [-0.0903, -0.0817, -0.4752],
        [-0.7124, -0.1631, -0.2272],
        [ 0.1356,  0.3933, -0.5023],
        [-0.0308, -0.1725, -0.5216]])
>>> # Batched pinverse example
>>> a = torch.randn(2, 6, 3)
>>> b = torch.pinverse(a)
>>> torch.matmul(b, a)
tensor([[[ 1.0000e+00,  1.6391e-07, -1.1548e-07],
         [ 8.3121e-08,  1.0000e+00, -2.7567e-07],
         [ 3.5390e-08,  1.4901e-08,  1.0000e+00]],

        [[ 1.0000e+00, -8.9407e-08,  2.9802e-08],
         [-2.2352e-07,  1.0000e+00,  1.1921e-07],
         [ 0.0000e+00,  8.9407e-08,  1.0000e+00]]])
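A typical application is the least-squares solution of an overdetermined system; at the minimizer the residual is orthogonal to the column space of A (the normal equations). A minimal sketch:

>>> A = torch.randn(5, 3, dtype=torch.float64)  # overdetermined system
>>> b = torch.randn(5, dtype=torch.float64)
>>> x = torch.pinverse(A) @ b  # least-squares solution of A x ≈ b
>>> residual = A @ x - b
>>> torch.allclose(A.t() @ residual, torch.zeros(3, dtype=torch.float64))
True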
- 
torch.qr(input, some=True, *, out=None)¶ Computes the QR decomposition of a matrix or a batch of matrices
input, and returns a namedtuple (Q, R) of tensors such that \(\text{input} = Q R\) with \(Q\) being an orthogonal matrix or batch of orthogonal matrices and \(R\) being an upper triangular matrix or batch of upper triangular matrices.
If
some is True, then this function returns the thin (reduced) QR factorization. Otherwise, if some is False, this function returns the complete QR factorization.
Warning
torch.qr is deprecated. Please use torch.linalg.qr() instead.
Differences with torch.linalg.qr:
torch.linalg.qr takes a string parameter mode instead of some:
some=True is equivalent to mode='reduced': both are the default
some=False is equivalent to mode='complete'.
Warning
If you plan to backpropagate through QR, note that the current backward implementation is only well-defined when the first \(\min(input.size(-1), input.size(-2))\) columns of
input are linearly independent. This behavior will probably change once QR supports pivoting.
Note
This function uses LAPACK for CPU inputs and MAGMA for CUDA inputs, and may produce different (valid) decompositions on different device types or different platforms.
- Parameters
 input (Tensor) – the input tensor of size \((*, m, n)\) where * is zero or more batch dimensions consisting of matrices of dimension \(m \times n\).
some (bool, optional) –
Set to
True for reduced QR decomposition and False for complete QR decomposition. If k = min(m, n) then:
some=True: returns (Q, R) with dimensions (m, k), (k, n) (default)
some=False: returns (Q, R) with dimensions (m, m), (m, n)
- Keyword Arguments
 out (tuple, optional) – tuple of Q and R tensors. The dimensions of Q and R are detailed in the description of
some above.
Example:
>>> a = torch.tensor([[12., -51, 4], [6, 167, -68], [-4, 24, -41]])
>>> q, r = torch.qr(a)
>>> q
tensor([[-0.8571,  0.3943,  0.3314],
        [-0.4286, -0.9029, -0.0343],
        [ 0.2857, -0.1714,  0.9429]])
>>> r
tensor([[ -14.0000,  -21.0000,   14.0000],
        [   0.0000, -175.0000,   70.0000],
        [   0.0000,    0.0000,  -35.0000]])
>>> torch.mm(q, r).round()
tensor([[  12.,  -51.,    4.],
        [   6.,  167.,  -68.],
        [  -4.,   24.,  -41.]])
>>> torch.mm(q.t(), q).round()
tensor([[ 1.,  0.,  0.],
        [ 0.,  1., -0.],
        [ 0., -0.,  1.]])
>>> a = torch.randn(3, 4, 5)
>>> q, r = torch.qr(a, some=False)
>>> torch.allclose(torch.matmul(q, r), a)
True
>>> torch.allclose(torch.matmul(q.transpose(-2, -1), q), torch.eye(5))
True
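Following the mapping in the warning above, migration to torch.linalg.qr() is mechanical:

>>> a = torch.randn(3, 4, 5)
>>> q, r = torch.linalg.qr(a)  # same as torch.qr(a, some=True)
>>> q, r = torch.linalg.qr(a, mode='complete')  # same as torch.qr(a, some=False)
>>> torch.allclose(torch.matmul(q, r), a)
True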
- 
torch.svd(input, some=True, compute_uv=True, *, out=None)¶ Computes the singular value decomposition of either a matrix or batch of matrices
input. The singular value decomposition is represented as a namedtuple (U, S, V), such that input = U diag(S) Vᴴ, where Vᴴ is the transpose of V for real-valued inputs, or the conjugate transpose of V for complex-valued inputs. If input is a batch of tensors, then U, S, and V are also batched with the same batch dimensions as input.
If some is True (default), the method returns the reduced singular value decomposition, i.e., if the last two dimensions of input are m and n, then the returned U and V matrices will contain only min(n, m) orthonormal columns.
If compute_uv is False, the returned U and V will be zero-filled matrices of shape (m × m) and (n × n) respectively, on the same device as input. The some argument has no effect when compute_uv is False.
Supports input of float, double, cfloat and cdouble data types. The dtypes of U and V are the same as input’s. S will always be real-valued, even if input is complex.
Warning
torch.svd() is deprecated. Please use torch.linalg.svd() instead, which is similar to NumPy’s numpy.linalg.svd.
Note
Differences with
torch.linalg.svd():
some is the opposite of torch.linalg.svd()’s full_matrices. Note that the default value for both is True, so the default behavior is effectively the opposite.
torch.svd() returns V, whereas torch.linalg.svd() returns Vᴴ.
If compute_uv=False, torch.svd() returns zero-filled tensors for U and Vh, whereas torch.linalg.svd() returns empty tensors.
Note
The singular values are returned in descending order. If
input is a batch of matrices, then the singular values of each matrix in the batch are returned in descending order.
Note
The implementation of SVD on CPU uses the LAPACK routine ?gesdd (a divide-and-conquer algorithm) instead of ?gesvd for speed. Analogously, the SVD on GPU uses the cuSOLVER routines gesvdj and gesvdjBatched on CUDA 10.1.243 and later, and uses the MAGMA routine gesdd on earlier versions of CUDA.
Note
The returned matrix U will be transposed, i.e. with strides
U.contiguous().transpose(-2, -1).stride().
Note
Gradients computed using U and V may be unstable if
input is not full rank or has non-unique singular values.
Note
When
some=False, the gradients on U[..., :, min(m, n):] and V[..., :, min(m, n):] will be ignored in backward as those vectors can be arbitrary bases of the subspaces.
Note
The S tensor can only be used to compute gradients if
compute_uv is True.
Note
With complex-valued input, the backward operation works correctly only for gauge-invariant loss functions. Please look at Gauge problem in AD for more details.
Note
Since the U and V of an SVD are not unique, each vector can be multiplied by an arbitrary phase factor \(e^{i \phi}\) while the SVD result is still correct. Different platforms, like NumPy, or inputs on different device types, may produce different U and V tensors.
- Parameters
 input (Tensor) – the input tensor of size (*, m, n) where * is zero or more batch dimensions consisting of (m × n) matrices.
some (bool, optional) – controls whether to compute the reduced or full decomposition, and consequently the shape of returned U and V. Defaults to True.
compute_uv (bool, optional) – controls whether to compute U and V. Defaults to True.
- Keyword Arguments
 out (tuple, optional) – the output tuple of tensors
Example:
>>> a = torch.randn(5, 3)
>>> a
tensor([[ 0.2364, -0.7752,  0.6372],
        [ 1.7201,  0.7394, -0.0504],
        [-0.3371, -1.0584,  0.5296],
        [ 0.3550, -0.4022,  1.5569],
        [ 0.2445, -0.0158,  1.1414]])
>>> u, s, v = torch.svd(a)
>>> u
tensor([[ 0.4027,  0.0287,  0.5434],
        [-0.1946,  0.8833,  0.3679],
        [ 0.4296, -0.2890,  0.5261],
        [ 0.6604,  0.2717, -0.2618],
        [ 0.4234,  0.2481, -0.4733]])
>>> s
tensor([2.3289, 2.0315, 0.7806])
>>> v
tensor([[-0.0199,  0.8766,  0.4809],
        [-0.5080,  0.4054, -0.7600],
        [ 0.8611,  0.2594, -0.4373]])
>>> torch.dist(a, torch.mm(torch.mm(u, torch.diag(s)), v.t()))
tensor(8.6531e-07)
>>> a_big = torch.randn(7, 5, 3)
>>> u, s, v = torch.svd(a_big)
>>> torch.dist(a_big, torch.matmul(torch.matmul(u, torch.diag_embed(s)), v.transpose(-2, -1)))
tensor(2.6503e-06)
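A migration sketch for the differences listed above: torch.linalg.svd() returns Vᴴ rather than V, and its full_matrices flag replaces (and inverts) some:

>>> a = torch.randn(5, 3)
>>> u, s, vh = torch.linalg.svd(a, full_matrices=False)  # reduced, like some=True
>>> torch.allclose(a, u @ torch.diag(s) @ vh, atol=1e-5)  # vh plays the role of v.t()
True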
- 
torch.symeig(input, eigenvectors=False, upper=True, *, out=None)¶ This function returns eigenvalues and eigenvectors of a real symmetric matrix
input or a batch of real symmetric matrices, represented by a namedtuple (eigenvalues, eigenvectors).
This function calculates all eigenvalues (and vectors) of input such that \(\text{input} = V \text{diag}(e) V^T\).
The boolean argument eigenvectors defines computation of both eigenvectors and eigenvalues, or eigenvalues only.
If it is False, only eigenvalues are computed. If it is True, both eigenvalues and eigenvectors are computed.
Since the input matrix input is supposed to be symmetric, only the upper triangular portion is used by default.
If upper is False, then the lower triangular portion is used.
Note
The eigenvalues are returned in ascending order. If
input is a batch of matrices, then the eigenvalues of each matrix in the batch are returned in ascending order.
Note
Irrespective of the original strides, the returned matrix V will be transposed, i.e. with strides V.contiguous().transpose(-1, -2).stride().
Warning
Extra care needs to be taken when backpropagating through outputs. Such an operation is only stable when all eigenvalues are distinct, and becomes less stable the smaller \(\min_{i \neq j} |\lambda_i - \lambda_j|\) is.
- Parameters
 input (Tensor) – the input tensor of size \((*, n, n)\) where * is zero or more batch dimensions consisting of symmetric matrices.
eigenvectors (bool, optional) – controls whether eigenvectors have to be computed
upper (bool, optional) – controls whether to consider upper-triangular or lower-triangular region
- Keyword Arguments
 out (tuple, optional) – the output tuple of (Tensor, Tensor)
- Returns
 A namedtuple (eigenvalues, eigenvectors) containing
eigenvalues (Tensor): Shape \((*, m)\). The eigenvalues in ascending order.
eigenvectors (Tensor): Shape \((*, m, m)\). If
eigenvectors=False, it’s an empty tensor. Otherwise, this tensor contains the orthonormal eigenvectors of the input.
- Return type
 (Tensor, Tensor)
Examples:
>>> a = torch.randn(5, 5)
>>> a = a + a.t()  # To make a symmetric
>>> a
tensor([[-5.7827,  4.4559, -0.2344, -1.7123, -1.8330],
        [ 4.4559,  1.4250, -2.8636, -3.2100, -0.1798],
        [-0.2344, -2.8636,  1.7112, -5.5785,  7.1988],
        [-1.7123, -3.2100, -5.5785, -2.6227,  3.1036],
        [-1.8330, -0.1798,  7.1988,  3.1036, -5.1453]])
>>> e, v = torch.symeig(a, eigenvectors=True)
>>> e
tensor([-13.7012,  -7.7497,  -2.3163,   5.2477,   8.1050])
>>> v
tensor([[ 0.1643,  0.9034, -0.0291,  0.3508,  0.1817],
        [-0.2417, -0.3071, -0.5081,  0.6534,  0.4026],
        [-0.5176,  0.1223, -0.0220,  0.3295, -0.7798],
        [-0.4850,  0.2695, -0.5773, -0.5840,  0.1337],
        [ 0.6415, -0.0447, -0.6381, -0.0193, -0.4230]])
>>> a_big = torch.randn(5, 2, 2)
>>> a_big = a_big + a_big.transpose(-2, -1)  # To make a_big symmetric
>>> e, v = a_big.symeig(eigenvectors=True)
>>> torch.allclose(torch.matmul(v, torch.matmul(e.diag_embed(), v.transpose(-2, -1))), a_big)
True
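In newer releases the recommended replacement is torch.linalg.eigh(), which likewise returns eigenvalues in ascending order; note that it reads the lower triangle by default (UPLO='L'), whereas symeig defaults to the upper one. A minimal sketch on a symmetric input, where both triangles agree:

>>> a = torch.randn(5, 5)
>>> a = a + a.t()  # symmetric, so either triangle gives the same result
>>> e, v = torch.linalg.eigh(a)  # rough equivalent of torch.symeig(a, eigenvectors=True)
>>> torch.allclose(torch.matmul(v * e, v.t()), a, atol=1e-5)
True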