And there could also be some normal PyTorch functions that shouldn't be allowed to compute at all. For instance, multiplying two feature-sets that contain vectors (as opposed to cross-products, which would be allowed... but what does the cross-product do to the features that are not vector components?!?). By the way, the use-case for features that are not vector components is for all of the other variables that describe a particle, such as isolation, charge, mass, etc. Since this is an input to an ML model, you probably want to throw the whole kitchen sink into it. Here's another normal PyTorch function that shouldn't be allowed, or should be restricted in complex ways:
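For the first restriction above (multiplying two feature-sets that both carry vector components), a `__torch_function__` override could simply refuse the operation. A minimal sketch, with a hypothetical class name:

```python
import torch

class VectorFeatures(torch.Tensor):
    """Hypothetical torch.Tensor subclass whose last axis carries vector
    components (plus any number of non-vector features)."""

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Elementwise product of two vector-bearing feature-sets is not a
        # meaningful vector operation, so refuse it outright.
        if func in (torch.mul, torch.Tensor.mul, torch.Tensor.__mul__):
            if sum(isinstance(arg, cls) for arg in args) == 2:
                raise TypeError(
                    "refusing to elementwise-multiply two vector feature-sets"
                )
        return super().__torch_function__(func, types, args, kwargs)

featsA = torch.rand(5, 4).as_subclass(VectorFeatures)
featsB = torch.rand(5, 4).as_subclass(VectorFeatures)
scaled = featsA * 2.0  # scaling by a constant is still allowed
try:
    featsA * featsB    # two vector feature-sets: raises TypeError
except TypeError as err:
    print(err)
```

Each disallowed or restricted function would need its own branch like this; the hard part is enumerating them.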
I've become familiar with PyTorch recently because of writing https://github.com/hsf-training/deep-learning-intro-for-hep/
I've also been looking at the Vector documentation because I think it needs an overhaul to be more physicist-friendly. Along the way, I noticed that there's no PyTorch backend yet, but it would be really useful to have one. Vector's approach to NumPy arrays is to expect them to be structured arrays, but feature vectors in an ML model are always unstructured. (Note: there's a conversion function: np.lib.recfunctions.structured_to_unstructured.)
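For reference, that conversion is a one-liner; the field names below are just an example layout:

```python
import numpy as np
from numpy.lib import recfunctions as rf

# A structured array like the ones Vector's NumPy backend expects:
structured = np.array(
    [(1.0, 0.5, 1.2, 0.105), (2.0, -0.3, 2.1, 0.105)],
    dtype=[("pt", "f8"), ("eta", "f8"), ("phi", "f8"), ("mass", "f8")],
)

# ...flattened into the unstructured layout that ML feature arrays use:
features = rf.structured_to_unstructured(structured)
print(features.shape)  # (2, 4)
```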
Generally, feature vectors in an ML model will have a few indexes corresponding to vector coordinates and many others that don't. If the first 4 features are $p_T$, $\eta$, $\phi$, and mass, we might want to denote that with `pt_index=0`, `phi_index=2`, `eta_index=1`, `mass_index=3`, in such a way that they can be picked out of a tensor named `features`. It would be nice if the `features` vector was a subclass of `torch.Tensor` that produces the above via properties, and then if someone asks for `pz`, it would compute $p_z$ using the appropriate compute function. With `torch` as the `lib` argument of the `vector._compute` functions, they would all be autodiffed and could be used in an optimization procedure with backpropagation.

The library functions that `vector._compute` needs (see `vector/tests/test_compute_features.py`, lines 357 to 380 at 7cd311d) are all defined in the `torch` module, so they probably don't even need a shim (which SymPy needed).
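A minimal sketch of what such a subclass could look like; the class name and the inlined $p_z = p_T \sinh\eta$ formula are placeholders for what would really be a dispatch into `vector._compute` with `lib=torch`:

```python
import torch

class MomentumFeatures(torch.Tensor):
    """Hypothetical torch.Tensor subclass: the last axis holds
    pt, eta, phi, mass, plus any number of non-vector features."""

    pt_index, eta_index, phi_index, mass_index = 0, 1, 2, 3

    @property
    def pt(self):
        # as_subclass(torch.Tensor) gives a plain-tensor view
        return self.as_subclass(torch.Tensor)[..., self.pt_index]

    @property
    def eta(self):
        return self.as_subclass(torch.Tensor)[..., self.eta_index]

    @property
    def pz(self):
        # Placeholder for the real compute function; because it is all
        # torch arithmetic, it stays differentiable end to end.
        return self.pt * torch.sinh(self.eta)

features = torch.tensor([[10.0, 0.5, 1.2, 0.105]]).as_subclass(MomentumFeatures)
print(features.pz)
```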
Below is the start of an implementation, using https://pytorch.org/docs/stable/notes/extending.html#extending-torch-python-api as a guide. PyTorch defines a
`__torch_function__` method (see this investigation), making it possible to overload without even creating real subclasses of `torch.Tensor`, but I think it's a good idea to make subclasses of `torch.Tensor` because these are mostly-normal feature vectors: they just have a few extra properties and methods.

But then I got to the point where I'd have to wrap all of the functions and remembered that that's where all of the complexity is. Some functions (possibly methods or properties) take 1 input vector and return a non-vector, others return a vector, and some functions take 2 input vectors with both kinds of output. I don't think there are any functions that take more than 2, but there are some functions that don't do anything to the vector properties, like a PyTorch function to move data to and from the GPU or change its dtype. (Possible simplification: maybe all vector components can be forced to be float32?)
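For the "don't do anything to the vector properties" category, the wrapping can be as simple as copying the index metadata through; the other categories each need their own treatment. A sketch, in which the registry and attribute names are made up:

```python
import torch

# Hypothetical classification of torch functions by what they do to the
# vector components along axis=-1 (only the easy category is sketched).
PRESERVES_COMPONENTS = {torch.Tensor.to, torch.Tensor.float, torch.Tensor.clone}

class Features(torch.Tensor):
    """Hypothetical feature-set tensor carrying coordinate-index metadata."""

    pt_index = 0  # example coordinate metadata to carry along

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        out = super().__torch_function__(func, types, args, kwargs)
        if func in PRESERVES_COMPONENTS and isinstance(out, cls):
            # Moving to a device or changing dtype leaves the meaning of
            # every feature index intact, so copy the metadata through.
            out.pt_index = args[0].pt_index
        # Every other function would need its own handling: 1 vector in /
        # non-vector out, 1 vector in / vector out, 2 vectors in, etc.
        return out

f = torch.rand(3, 4).as_subclass(Features)
g = f.float()
print(type(g).__name__, g.pt_index)
```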
Some of the functions will have to shuffle the indexes to make them line up. Say, for instance, that you have `featuresA` with `x_index=0, y_index=1` and `featuresB` with `x_index=4, y_index=2`. When you add `featuresA + featuresB`, you'll need to pass the appropriately selected components into the `vector._compute.planar.add.dispatch` function.

So that's where I left the implementation, as a sketch of the idea of interpreting the `axis=-1` dimension of feature arrays as vector components, passing `torch` as the compute functions' `lib`. Considering that each of the different types of functions has to be handled differently before calling compute functions, this is not as easy as I thought (a one-day project), but it's still not a huge project. I'd also like to find out if there's a "market" for this backend: I had assumed that spatial and momentum vector calculations would be useful as (the first) part of an ML model, but I wonder if anyone has any known use-cases.

Also, I have to say that the ML "vector" and "tensor" terminology is incredibly confusing in this context. When we say that a feature-set has 2D, 3D, or 4D spatial or momentum vector components, we have to be sure not to call that feature-set a "feature vector," since that's a different thing.
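Returning to the index-shuffling example above (`featuresA` with `x_index=0, y_index=1`, `featuresB` with `x_index=4, y_index=2`), the component alignment that would have to happen before the compute call might look like this sketch with plain tensors and a hypothetical layout; the real code would dispatch to `vector._compute.planar.add.dispatch`:

```python
import torch

featuresA = torch.rand(100, 6)   # x at index 0, y at index 1
featuresB = torch.rand(100, 6)   # x at index 4, y at index 2
A_idx = {"x": 0, "y": 1}
B_idx = {"x": 4, "y": 2}

# Pick matching components out of each feature-set...
ax, ay = featuresA[..., A_idx["x"]], featuresA[..., A_idx["y"]]
bx, by = featuresB[..., B_idx["x"]], featuresB[..., B_idx["y"]]

# ...do the planar add, then decide which layout the result inherits
# (here: featuresA's); what happens to the non-vector features of
# featuresB is exactly the kind of open question raised above.
out = featuresA.clone()
out[..., A_idx["x"]] = ax + bx
out[..., A_idx["y"]] = ay + by
```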