Internals reference

Operation types


OperationType is an @enum for classifying supported operations that can appear in @avx blocks. Type LoopVectorization.OperationType to see the different types.


loopvalue indicates a loop variable (i in for i in ...). These are the "parents" of compute operations that involve the loop variables.
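As a rough illustration of how an @enum can classify the pieces of a loop body, here is a minimal sketch. ToyOperationType and its toy_-prefixed members are hypothetical stand-ins named after the kinds mentioned on this page; they are not the package's own definition, which may contain other members.

```julia
# Hypothetical stand-in enum; NOT LoopVectorization.OperationType itself.
@enum ToyOperationType toy_memload toy_compute toy_memstore toy_loopvalue

# Hand classification of the pieces of `for i in 1:N; y[i] = 2 * x[i]; end`.
classification = [
    ("i",          toy_loopvalue),  # the loop variable
    ("x[i]",       toy_memload),    # read from an array
    ("2 * x[i]",   toy_compute),    # arithmetic on loaded values
    ("y[i] = ...", toy_memstore),   # write to an array
]

# `string` on an enum value returns its name.
toy_names = [string(kind) for (_, kind) in classification]
```

Calling instances(ToyOperationType) lists every member of the toy enum, much as typing the real type's name at the REPL is suggested above.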




Operation is a structure that encodes a particular action occurring inside an @avx block.


  • identifier::Int64

    A unique identifier for this operation. identifier(op::Operation) returns the index of this operation within operations(ls::LoopSet).

  • variable::Symbol

    The name of the variable storing the result of this operation. For a = val this would be :a. For array assignments A[i,j] = val this would be :A.

  • elementbytes::Int64

    Intended to be the size of the result, in bytes. This value is often inaccurate and should not be relied on.

  • instruction::LoopVectorization.Instruction

    The specific operator, e.g., identity or +

  • node_type::LoopVectorization.OperationType

    The OperationType associated with this operation

  • dependencies::Array{Symbol,1}

    The loop variables this operation depends on

  • reduced_deps::Array{Symbol,1}

    Additional loop dependencies that must execute before this operation can be performed successfully (often needed in reductions)

  • parents::Array{LoopVectorization.Operation,1}

    Operations whose result this operation depends on

  • children::Array{LoopVectorization.Operation,1}

    Operations that depend on this result

  • ref::LoopVectorization.ArrayReferenceMeta

    For memload or memstore, encodes the array location

  • mangledvariable::Symbol

    Gensymmed name of the result.

  • reduced_children::Array{Symbol,1}

    Loop variables that consumers of this operation depend on. Often used in reductions to replicate assignment of initializers when unrolling.

  • u₁unrolled::Bool

    Cached value for whether u₁loopsym ∈ loopdependencies(op)

  • u₂unrolled::Bool

    Cached value for whether u₂loopsym ∈ loopdependencies(op)

  • vectorized::Bool

    Cached value for whether vectorized ∈ loopdependencies(op)


julia> using LoopVectorization

julia> AmulBq = :(for m ∈ 1:M, n ∈ 1:N
           C[m,n] = zero(eltype(B))
           for k ∈ 1:K
               C[m,n] += A[m,k] * B[k,n]
           end
       end);

julia> lsAmulB = LoopVectorization.LoopSet(AmulBq);

julia> LoopVectorization.operations(lsAmulB)
6-element Array{LoopVectorization.Operation,1}:
 var"##RHS#253" = var"##zero#254"
 C[m, n] = var"##RHS#253"
 var"##tempload#255" = A[m, k]
 var"##tempload#256" = B[k, n]
 var"##RHS#253" = LoopVectorization.vfmadd_fast(var"##tempload#255", var"##tempload#256", var"##RHS#253")
 var"##RHS#253" = LoopVectorization.identity(var"##RHS#253")

Each one of these lines is a pretty-printed Operation.
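To make the parents and children fields more concrete, the following sketch models a small slice of the matrix-multiply example as a dependency graph. ToyOp and link! are hypothetical names introduced only for illustration; the struct mirrors a few of the fields listed above and is not the package's Operation type.

```julia
# Toy model of a few Operation-like fields (hypothetical, for illustration only).
struct ToyOp
    identifier::Int
    variable::Symbol
    dependencies::Vector{Symbol}   # loop variables this op depends on
    parents::Vector{ToyOp}         # ops whose results this op consumes
    children::Vector{ToyOp}        # ops that consume this op's result
end
ToyOp(id, var, deps) = ToyOp(id, var, deps, ToyOp[], ToyOp[])

# Record the edge in both directions, so each side can find the other.
function link!(child::ToyOp, parent::ToyOp)
    push!(child.parents, parent)
    push!(parent.children, child)
    return child
end

loadA = ToyOp(1, :a, [:m, :k])     # models the load of A[m, k]
loadB = ToyOp(2, :b, [:k, :n])     # models the load of B[k, n]

# In this toy model the multiply-add depends on every loop its parents depend on.
fma = ToyOp(3, :acc, union(loadA.dependencies, loadB.dependencies))
link!(fma, loadA)
link!(fma, loadB)

parent_names = [p.variable for p in fma.parents]
```

Storing both directions of each edge means a traversal can start from a load and walk forward to its consumers, or start from a store and walk back to its inputs.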


Instructions and costs


Instruction represents a function via its module and symbol. It is similar to a GlobalRef and may someday be replaced by GlobalRef.
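For comparison, the sketch below shows how Julia's built-in GlobalRef pairs a module with a symbol and how that pair can be resolved back to a function. It uses only Core and Base functionality and does not touch the Instruction type itself.

```julia
# GlobalRef is a built-in Julia type that pairs a module with a name.
plus_ref = GlobalRef(Base, :+)

# Resolve the reference to the function it names.
plus_fn = getfield(plus_ref.mod, plus_ref.name)

# The resolved function behaves like the original operator.
result = plus_fn(1, 2)
```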


Stores parameters related to performance for individual CPU instructions.

  • scaling::Float64

    A flag indicating how instruction cost scales with vector width (128, 256, or 512 bits)

  • scalar_reciprocal_throughput::Float64

    The number of clock cycles per operation when many of the same operation are repeated in sequence. Think of it as the inverse of the flow rate at steady-state. It is typically ≤ the scalar_latency.

  • scalar_latency::Int64

    The minimum delay, in clock cycles, associated with the instruction. Think of it as the delay from turning on a faucet to when water starts coming out the end of the pipe. See also scalar_reciprocal_throughput.

  • register_pressure::Int64

    Number of floating-point registers used
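The difference between latency and reciprocal throughput can be sketched with a simplified pipeline model. ToyCost and the two helper functions are hypothetical, and the numbers are illustrative rather than measured; the sketch assumes that dependent operations each wait for the previous result, while independent operations overlap fully.

```julia
# Hypothetical cost record holding two of the fields described above.
struct ToyCost
    scalar_reciprocal_throughput::Float64
    scalar_latency::Int
end

# A chain where each operation needs the previous result pays full latency every time.
dependent_cycles(c::ToyCost, n) = n * c.scalar_latency

# Independent operations overlap: pay latency once, then one result per
# reciprocal-throughput interval (the steady-state "flow rate").
independent_cycles(c::ToyCost, n) =
    c.scalar_latency + (n - 1) * c.scalar_reciprocal_throughput

toy_fma = ToyCost(0.5, 4)              # illustrative numbers only

chain    = dependent_cycles(toy_fma, 100)    # 100 * 4 = 400
parallel = independent_cycles(toy_fma, 100)  # 4 + 99 * 0.5 = 53.5
```

Under this model, breaking one long dependency chain into several independent accumulators moves the cost from the latency-bound figure toward the throughput-bound one, which is one reason unrolling reductions can pay off.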


Array references


ArrayReference is a type for encoding an array reference A[i,j] occurring inside an @avx block.


  • array::Symbol

    The array variable

  • indices::Array{Symbol,1}

    The list of indices (e.g., [:i, :j]), or name(op) for computed indices.

  • offsets::Array{Int8,1}

    Index offset, e.g., a[i+7] would store the 7. offsets is also used to help identify opportunities for avoiding reloads. For example, in y[i] = x[i] - x[i-1], the load x[i] from one iteration can be "carried over" and reused as x[i-1] in the next iteration. Only used for small (Int8) offsets.
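The reload-avoidance idea can be written out by hand. In this sketch, diff_reload! and diff_carried! are hypothetical helper names; the second version keeps the previously loaded element in a local variable instead of reading it from the array again. It illustrates the transformation only and is not what the package generates.

```julia
# Straightforward version: two loads from x per iteration.
function diff_reload!(y, x)
    for i in 2:length(x)
        y[i] = x[i] - x[i-1]
    end
    return y
end

# Carried version: one load from x per iteration; the previous value is reused.
function diff_carried!(y, x)
    prev = x[1]
    for i in 2:length(x)
        cur = x[i]          # the only load in this iteration
        y[i] = cur - prev
        prev = cur          # carry over to the next iteration
    end
    return y
end

x = [1.0, 4.0, 9.0, 16.0]
y_reload  = diff_reload!(zeros(4), x)
y_carried = diff_carried!(zeros(4), x)
```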


ArrayReferenceMeta is a type similar to ArrayReference but holding additional information.


  • ref::LoopVectorization.ArrayReference

    The ArrayReference

  • loopedindex::Array{Bool,1}

    A vector of Bools indicating whether each index is a loop variable (false for operation-computed indices)

  • ptr::Symbol

    Variable holding the pointer to the array's underlying storage


Condensed types

These are used when encoding the @avx block as a type parameter for passing through to the @generated function.


A condensed representation of an ArrayReference. It supports array-references with up to 8 indexes, where the data for each consecutive index is packed into corresponding 8-bit fields of index_types (storing the enum IndexType), indices (the id for each index symbol), and offsets (currently unused).
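Packing per-index data into 8-bit fields of a single integer can be sketched as follows. pack8 and unpack8 are hypothetical helpers, and the byte order chosen here (first index in the most significant occupied byte) is an assumption made for the example, not necessarily the layout the package uses.

```julia
# Pack up to 8 small values into one UInt64, 8 bits per value.
function pack8(fields)
    length(fields) <= 8 || throw(ArgumentError("at most 8 fields fit in a UInt64"))
    packed = zero(UInt64)
    for f in fields
        # UInt8(f) throws if f does not fit in 8 bits, so fields cannot overlap.
        packed = (packed << 8) | UInt64(UInt8(f))
    end
    return packed
end

# Recover the n fields that were packed, in their original order.
unpack8(packed::UInt64, n) = [UInt8((packed >> (8 * (n - k))) & 0xff) for k in 1:n]

ids = [1, 2, 3]                    # e.g. ids of three index symbols
packed_ids = pack8(ids)            # 0x0000000000010203
roundtrip  = unpack8(packed_ids, length(ids))
```

Because the packed value is a plain bits type, it can appear as a type parameter, which is what makes this kind of condensed encoding suitable for passing information through to a @generated function.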