
@RChrHill

Currently, the functions TraceColour and TraceSpin, as well as the more generic TraceIndex (Grid/lattice/Lattice_trace.h), are available for tracing over internal indices of lattice objects. However, these functions cannot take an expression template as an argument. They must be called as e.g. TraceIndex<ColourIndex>(closure(a*b)), or the expression must first be assigned to a temporary. Either way, the full expression is evaluated before the trace is performed.

This pull request adds expression-template-compatible traceColour and traceSpin overloads in the same spirit as trace, removing the need for a closure() and reducing the cost of a trace over individual indices. These overloads are currently being used to obtain performance gains in a contraction-heavy workflow that is still in development.
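
To illustrate the difference between tracing a closed expression and tracing an expression template directly, here is a minimal, self-contained sketch. It does not use Grid: MatField, MulExpr, closureToy and traceColourToy are hypothetical toy stand-ins invented for this example, and a 2x2 complex matrix per site stands in for a colour matrix. The point is only that the lazy path evaluates the product site by site inside the trace loop and never stores a full intermediate field.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Toy per-site 2x2 "colour" matrix (NOT a Grid type).
struct Mat2 {
  std::array<Complex, 4> m;  // std::complex default-constructs to zero
  Complex& operator()(int i, int j) { return m[2 * i + j]; }
  const Complex& operator()(int i, int j) const { return m[2 * i + j]; }
};

inline Mat2 mul(const Mat2& a, const Mat2& b) {
  Mat2 c;
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) {
      Complex sum(0.0, 0.0);
      for (int k = 0; k < 2; ++k) sum += a(i, k) * b(k, j);
      c(i, j) = sum;
    }
  return c;
}

inline Complex tr(const Mat2& a) { return a(0, 0) + a(1, 1); }

// Toy lattice field: one Mat2 per site (NOT Grid's Lattice<>).
struct MatField {
  std::vector<Mat2> sites;
  std::size_t size() const { return sites.size(); }
  Mat2 eval(std::size_t s) const {
    assert(s < sites.size());
    return sites[s];
  }
};

// Lazy product node: holds references and computes a site only on demand.
template <class L, class R>
struct MulExpr {
  const L& lhs;
  const R& rhs;
  std::size_t size() const { return lhs.size(); }
  Mat2 eval(std::size_t s) const { return mul(lhs.eval(s), rhs.eval(s)); }
};

inline MulExpr<MatField, MatField> operator*(const MatField& a, const MatField& b) {
  return MulExpr<MatField, MatField>{a, b};
}

// Eager path: materialise the whole expression into a temporary field.
template <class E>
MatField closureToy(const E& e) {
  MatField out;
  out.sites.resize(e.size());
  for (std::size_t s = 0; s < e.size(); ++s) out.sites[s] = e.eval(s);
  return out;
}

// Accepts either a field or an expression node; traces site by site.
template <class E>
std::vector<Complex> traceColourToy(const E& e) {
  std::vector<Complex> out(e.size());
  for (std::size_t s = 0; s < e.size(); ++s) out[s] = tr(e.eval(s));
  return out;
}
```

With these toy definitions, traceColourToy(closureToy(a * b)) and traceColourToy(a * b) give the same per-site result, but only the first allocates a full intermediate MatField. The expression node stores references, so it should be consumed within the same full expression that creates it.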

This is a minimally intrusive implementation that may not be ideal, largely because it replicates the macros from Grid/lattice/Lattice_ET.h. To avoid this duplication, the macro #undefs at the bottom of Grid/lattice/Lattice_ET.h could be removed.
These operators can't live inside Grid/lattice/Lattice_ET.h because SpinIndex and ColourIndex are defined in Grid/qcd/QCD.h. Implementing a generic traceIndex op instead would require additional macro definitions to account for the Index template parameter, though that might be the better way to achieve this.
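
As a rough illustration of the alternative mentioned above, the sketch below shows a single operation parameterised on an index tag, with the named spin and colour traces reduced to thin wrappers. Everything here is a hypothetical toy: ToySpinIndex, ToyColourIndex, SpinColour, TraceIndexOp, traceIndexToy, toyTraceSpin and toyTraceColour are invented for this example and say nothing about how Grid's macros or tensor types are actually organised.

```cpp
#include <array>
#include <cassert>
#include <complex>

using Complex = std::complex<double>;
using Mat2 = std::array<std::array<Complex, 2>, 2>;

// Hypothetical index tags (Grid defines its own SpinIndex / ColourIndex).
constexpr int ToySpinIndex = 0;
constexpr int ToyColourIndex = 1;

// Toy site object with two spin and two colour indices: t[s1][s2][c1][c2].
struct SpinColour {
  Complex t[2][2][2][2];
  Complex& at(int s1, int s2, int c1, int c2) {
    assert(s1 >= 0 && s1 < 2 && s2 >= 0 && s2 < 2);
    assert(c1 >= 0 && c1 < 2 && c2 >= 0 && c2 < 2);
    return t[s1][s2][c1][c2];
  }
};

// One op, templated on the index to trace over.
template <int Index>
struct TraceIndexOp;

template <>
struct TraceIndexOp<ToySpinIndex> {
  // Sum the spin diagonal, leaving a colour matrix.
  static Mat2 apply(const SpinColour& x) {
    Mat2 out{};
    for (int c1 = 0; c1 < 2; ++c1)
      for (int c2 = 0; c2 < 2; ++c2)
        out[c1][c2] = x.t[0][0][c1][c2] + x.t[1][1][c1][c2];
    return out;
  }
};

template <>
struct TraceIndexOp<ToyColourIndex> {
  // Sum the colour diagonal, leaving a spin matrix.
  static Mat2 apply(const SpinColour& x) {
    Mat2 out{};
    for (int s1 = 0; s1 < 2; ++s1)
      for (int s2 = 0; s2 < 2; ++s2)
        out[s1][s2] = x.t[s1][s2][0][0] + x.t[s1][s2][1][1];
    return out;
  }
};

template <int Index>
Mat2 traceIndexToy(const SpinColour& x) {
  return TraceIndexOp<Index>::apply(x);
}

// Named traces become one-line wrappers around the generic op.
inline Mat2 toyTraceSpin(const SpinColour& x) { return traceIndexToy<ToySpinIndex>(x); }
inline Mat2 toyTraceColour(const SpinColour& x) { return traceIndexToy<ToyColourIndex>(x); }
```

In this toy, the index is an ordinary template parameter, so no per-index code is duplicated outside the two specialisations. Whether the same shape fits the macro-generated expression-template nodes is exactly the open question raised above.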

I will update the pull request with production GPU benchmarks in the next few days.
